We are looking for a highly motivated and talented individual with a passion for oncology and genomics research to join the Data and Computational Science Core at the National Cancer Centre Singapore (NCCS). You will primarily work with the Senior Data Scientist lead and the current research and clinical teams. The selected individual is expected to actively contribute to our core multi-omics research and big data analysis infrastructure, which focuses on using electronic medical records, next-generation sequencing (NGS), radiological imaging and other multimodal datatypes to develop biomarkers predictive of clinical responses in cancer patients. On-site, on-the-job guidance and mentorship will be provided, and there will be ample opportunities for inter-departmental and cross-institution collaborations with oncologists, pathologists, and scientists. For more information, please refer to the laboratory website: www.chualabnccs.com.
Responsibilities:
- Wrangle and optimise large datasets, execute complex parallel computational pipelines, and apply AI techniques to better understand the complexity of cancer progression and treatment resistance across multiple cancer types.
- Build and continuously develop a robust analysis infrastructure to support local research and clinical needs at the NCCS Data and Computational Science Core.
Requirements:
- Bachelor's degree or higher in a relevant STEM discipline.
- Knowledge of production data pipelines, especially in a bioinformatics or clinical healthcare setting.
- Strong programming expertise in at least one major language (e.g. Python, R, C/C++, Rust).
- Experience wrangling big data across multimodal datatypes, using tools such as Polars, Arrow, vector DBs and column-stores.
- Familiarity with Linux or other Unix flavours, preferably as an administrator/superuser.
- A keenness to incrementally design, build and test software components to ensure that primary production pipelines run correctly end to end.
- A keenness to continually learn and integrate new tools and parameters to keep up with industry best practices, adapting them to local needs.
- Ability to independently plan and execute data analysis and ad-hoc projects, in collaboration with teammates and external parties.
- Strong organisational, interpersonal and presentation skills.
Desired:
- A passion for healthcare, precision medicine and scientific research.
- Familiarity with pipeline management systems (Nextflow, Snakemake, CWL, WDL).
- Familiarity with job schedulers (SLURM, PBS, SGE, LSF).
- Familiarity with container/virtualization systems (Docker, Singularity, Podman, Kubernetes).
- Interest in optimising GPU workloads, Large Models and generative AI.
- Interest in front-end and back-end development for data analytics and parallel computation.