C3G Interns

Our teams have broad expertise in bioinformatics and significant experience in personalized medicine applications.

Want to be an intern?

Internships at C3G

Through internships, C3G and the Bourque Lab offer students an opportunity to get hands-on experience in applied bioinformatics, software/web development or genomics research.

How to Apply: Available summer internship positions will be posted here around December each year. Outstanding candidates are also welcome to contact us here to apply outside of the regular summer internship program.

Meet our past interns:

Étienne Collette

During my internship at C3G, I had the opportunity to work on multiple hands-on bioinformatics projects. The first project was to search a transgenic mouse’s genome sequencing data for an insertion/deletion event that the usual detection methods had not picked up. Afterwards, I compared the GRCh37 and GRCh38 genome references by running the GenPipes RNA-seq pipeline on previously analyzed data. I made a few interesting discoveries that proved useful to C3G in improving their analyses. Next, I worked on a project with the goal of replicating results from a previous study looking into optimizing the sequencing and detection of cassava mosaic viruses (CMV) present in African crops (TreeLab). The end goal of that project is to be able to detect CMV directly in the fields with a laptop that is not connected to the Internet.

Throughout my time here, I learned a lot about the workings of HPC systems and furthered my knowledge of both R and Bash.

LinkedIn profile >

Summer Intern 2020

Sebastian Ballesteros Ramirez

Throughout my summer internship at C3G, I worked on two main projects. First, I developed web-based data visualization tools for a federated genomics database. These tools helped users visualize genomic variants and the information embedded in them. The second project was to implement a decision tree to guide researchers in protecting the privacy and confidentiality of the human health-related datasets they process. The technologies I used during my internship included JavaScript, Python, and React.

Since I did not have any prior experience with genomics, I learned a lot about bioinformatics and the way the human genome works (especially genomic variants). In addition, I learned about the different laws and regulations that protect personal genomic information.

LinkedIn profile >

Summer Intern 2020

Solomia Yanishevsky

Over the course of my internship with C3G, I was involved with a few key projects. More specifically, I was tasked with generating synthetic FHIR datasets using FHIR APIs. These datasets were created to be ingested into the metadata service and to test the ingestion algorithm. All synthetic datasets that I generated were published on C3G’s public repository, so other developers and researchers will have access to already prepared datasets. A large aspect of my internship involved mapping across data standards. I worked on mapping the GA4GH Phenopackets standard used by the metadata service to the mCODE standard. This mapping will be used in future C3G projects that will benefit from incorporating specific mCODE data elements. Finally, I was tasked with mapping data elements collected by a COVID-19 initiative, CanCOGen, to the Phenopackets schema. Ultimately, the mapping I completed was sufficient for building a synthetic dataset for a prototype to be used for the CanCOGen project.

My time as an intern with C3G has been very hands-on and educational. I learned a lot through my work with APIs and extensive documentation of data standards, as well as through interactions with my colleagues. As I have a clinical background rather than a computer science background, my internship has made me eager to continue strengthening my programming skills.

LinkedIn profile >

Summer Intern 2020

Soulaine Theocharides

I was tasked with organizing massive sets of epigenomic data in accordance with IHEC standards during my internship. When I arrived, data were very loosely organized by consortium, and each consortium had different naming conventions and internal organizational structures. There were three steps to organize the data. First, I wrote a script to scrape information for the IHEC database and store it as a master metadata file. With IHEC IDs and consortium IDs linked, I then wrote a program to move data files into a structure that better followed the international standard. The most interesting part of my internship was writing a search program that allowed users to quickly locate files based on dataset characteristics, such as donor age, tissue type, or experiment type. I quickly learned that even the best file organizational system would not work for everyone, so the search function was key to meeting the goals of my internship.

I loved working with C3G. I had an opportunity for the first time to use powerful computing clusters, became more familiar with working in a UNIX environment, and thoroughly developed my Python skills.

LinkedIn profile >

Summer Intern 2020

Rami Coles

During my internship, I worked on the development of a new pipeline for GenPipes called EpiQC. EpiQC can be used on C3G’s computer clusters. The pipeline’s objective is to assess the quality of a ChIP-Seq signal track (bigwig) dataset and determine whether it should be used. I implemented different metrics to verify the quality of those bigwig files using third-party tools such as BigWigInfo, ChromImpute and EpiGeEC, and also coded multiple Python scripts to facilitate the use of the pipeline. After testing the pipeline on small datasets (~100 files), I ran it on all the ChIP-Seq data available through the IHEC data portal (~2500 files) to see how the pipeline runs on a big dataset. While most of the features worked, I had some issues with ChromImpute. I successfully trained ChromImpute on the whole dataset but could not impute the predicted files.

Since I had no previous experience with computer clusters, I learned a lot about them and about how to handle large datasets. While coding the pipeline and various scripts, I also improved my Python skills.

LinkedIn profile >

Summer Intern 2019

Shereen Elaidi

This summer, I worked on metagenomics analysis. In particular, I installed the whole genome shotgun (WGS) pipeline, MOCAT2, on C3G’s computing clusters. Since it did not work out of the box, we did a lot of debugging and wrote our own wrapper scripts that were tested to work on C3G’s servers (since the ones shipped with MOCAT2 were deprecated). Additionally, we modified some of the MOCAT2 code to bring the run time of one of the pipeline’s steps down from several hours (and sometimes days, depending on the sample size) to only minutes, and reduced RAM usage from 250 to 500 GB down to less than 1 GB. We used MOCAT2 to analyze 156 stool samples to examine the relationship between the gut microbiome and fibromyalgia.

Since I had no bioinformatics experience and minimal computer science experience, I learned a lot during this internship, including genome sequencing and the importance of the gut microbiome, how to debug large pipelines and write my own wrapper scripts, how to process very large datasets, and how to use computing clusters.

LinkedIn profile >

Summer Intern 2019

Nick Zombolas

During my time as a web development intern at C3G, I worked to improve the GenPipes Dashboard for viewing pipeline execution. I got a lot of great experience working with modern frontend and backend frameworks, such as React, Redux, Node.js, and Express. My main focuses throughout the internship were to improve the performance of the website and to add new features that provide a better user experience. Performance enhancements included adding an efficient sample search feature, as well as lazy-loading and virtual scrolling for samples. I also added a graphical view of sample execution, created a page to view project statistics, and provided a general cleanup of the user interface.

Throughout the internship, I got a lot of experience with modern web frameworks and learned best practices for keeping code clean and concise.

LinkedIn profile >

Summer Intern 2019

David Lougheed

As a software development intern at C3G, I worked on three projects: a tool to look at the research output of Genome Canada, a website that displays research connections between professors at McGill, especially those in the McGill initiative in Computational Medicine, and a browser for data produced by the MHcut tool (a collaboration between Dr. Bourque’s lab and the Woltjen lab at Kyoto University). During my internship, I primarily used Python to write scripts and web service backends, and D3 to produce interactive, data-driven JavaScript front-ends.

LinkedIn profile >

Summer Intern 2018