Apr 2023 - Present
City of Edinburgh, Scotland, United Kingdom
Developed ML models (Python: scikit-learn, NumPy, Pandas) for bacteria-phage interaction prediction and E. coli host assignment.
Built and optimized Nextflow pipelines, containerized with Docker/Singularity, for large-scale genomic data processing; automated model training, validation, and testing, with Git for version control.
Ran scalable workflows on HPC clusters and cloud platforms (Google Colaboratory, CLIMB).
Applied feature engineering and statistical modelling to high-dimensional biological data to extract insights on E. coli and phage phylogeny and antimicrobial resistance (AMR), and to deconvolute host specificity.
Used R for analysis and visualization, keeping outputs interpretable and actionable.
Developed a Flask-based intranet web app for automated sequence analysis and integrated ML models into a government food surveillance data platform.
Presented research at international conferences (talks and posters) and led industry-focused ML workshops to bridge research with real-world applications.