By Dr. Julia Lee-Soety
In 2009, Dr. Christina King Smith and I applied to and were chosen to participate in the Howard Hughes Medical Institute (HHMI)-sponsored SEA-PHAGES (Science Education Alliance – Phage Hunters Advancing Genomics and Evolutionary Science) program. This gave us an opportunity to offer first-year students authentic research experiences and get them excited about doing science. SEA-PHAGES is now in its fifth year, with 70 participating colleges and universities from around the country; Saint Joseph’s University has been a member since the second year.
Since the 2009-2010 academic year, four cohorts, each with 13 to 18 first-year Biology and Chemical-Biology students, have participated in Phage Safari. Students are selected to take part in this two-semester lab experience in lieu of the traditional Cells and Genetics labs. In the fall semester, every student isolates viruses that infect bacteria from a soil sample he or she has collected – from around campus, near home, or at various animal enclosures at the Philadelphia Zoo. These viruses are officially known as mycobacteriophages, or simply phages. Each student takes ownership of his or her phage and even gives it an official name. Near the end of the fall, the class agrees on one or two phages, out of all those the students have isolated, to be fully sequenced.
Over Christmas break, DNA sequencing facilities off campus are hard at work mapping out the blueprints of the phages. Each phage has a unique blueprint, or DNA sequence, that sets it apart from other phages, even those that may appear almost identical. The sequence stores the information that builds the components of the phage, dictates how it will infect a bacterial cell, and determines how it will multiply before leaving its host. When the DNA sequences are returned to us from the sequencing facilities, each is a long string of Gs, Cs, As, and Ts, representing the four nucleotides of DNA. Our job is to annotate them. If a DNA sequence from a phage genome is a continuous string of letters on a piece of paper, then annotating genes is analogous to identifying individual words and meaningful sentences. Just as each sentence has a specific start and stop, so does each gene.
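To make the sentence analogy concrete, here is a minimal Python sketch of how a program might scan a string of letters for stretches that begin with a start signal and end with a stop signal. It is a teaching toy, not the method used by the tools our students work with: it assumes the common bacterial start codons (ATG, GTG, TTG) and the three standard stop codons, reads only one strand, and the function name and the `min_codons` parameter are made up for illustration.

```python
# Toy open-reading-frame finder (illustrative only; real annotation
# software uses statistical models and also reads the opposite strand).
START_CODONS = {"ATG", "GTG", "TTG"}  # assumed: common bacterial starts
STOP_CODONS = {"TAA", "TAG", "TGA"}   # the three standard stop codons


def find_orfs(seq, min_codons=2):
    """Return sorted (start, end) pairs, 0-based and end-exclusive.

    min_codons counts every codon in the stretch, including the start
    and the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):                 # three possible reading frames
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon in START_CODONS:
                start = i                  # first start signal in this stretch
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None               # look for the next "sentence"
    return sorted(orfs)


if __name__ == "__main__":
    example = "CCATGAAATAGCC"              # made-up 13-letter sequence
    for begin, end in find_orfs(example):
        print(begin, end, example[begin:end])
```

In the made-up example, the only candidate found is the stretch ATGAAATAG. Because this sketch keeps only the first start it meets in each stretch, it hides the harder question the students face, which is choosing among several possible starts for the same gene.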
Students have employed various software programs to help them do this. For the first two years, students worked with a web-based workflow containing complex algorithms that identify consensus sequences frequently found at the start of each gene. To further validate each gene, students aligned it with genes of other known phages using BLAST tools; the Basic Local Alignment Search Tool is maintained by the US National Center for Biotechnology Information (NCBI) and has been a staple tool for molecular biologists globally. As freshmen, the Phage students are learning and mastering complex bioinformatics research tools that only a handful of upper-level and graduate students routinely use.
In the most recent years, students have been using DNA Master, a genome annotation and exploration tool designed and written by Dr. Jeffrey Lawrence of the University of Pittsburgh. This program combines the gene identification algorithms and the BLAST tools in a single workspace. When the software identifies multiple potential starts for a gene, the students must sort through each gene and authenticate it based on specific rules governing all phage genes. For example, a gene should not overlap too much with its adjacent genes, and there should not be large gaps between genes. There have been instances in which DNA Master missed a gene that should have been called. In those cases, the students use the BLAST tool within DNA Master to determine whether a specific region of the DNA aligns with genes of other known phages. Working in groups of two or three, students help each other walk through these software programs and discuss their decisions.
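The overlap and gap rules described above can be sketched as a simple check on neighboring gene calls. The short Python example below is only an illustration under stated assumptions: the cutoffs (30 bases of overlap, 50 bases of gap), the function name, and the gene names and coordinates are all hypothetical, and it does not reproduce how DNA Master itself works.

```python
# Illustrative check of neighboring gene calls. The thresholds are
# assumed values for demonstration, not the program's actual rules.
def flag_gene_calls(genes, max_overlap=30, max_gap=50):
    """genes: list of (name, start, end), 1-based inclusive coordinates.

    Returns (gene_a, gene_b, "gap" or "overlap", size_in_bases) for each
    neighboring pair that breaks one of the two rules.
    """
    ordered = sorted(genes, key=lambda g: g[1])    # order genes by start
    flags = []
    for (name_a, _, end_a), (name_b, start_b, _) in zip(ordered, ordered[1:]):
        spacing = start_b - end_a - 1              # negative means overlap
        if spacing > max_gap:
            flags.append((name_a, name_b, "gap", spacing))
        elif -spacing > max_overlap:
            flags.append((name_a, name_b, "overlap", -spacing))
    return flags


if __name__ == "__main__":
    # Hypothetical gene calls, invented for this example
    calls = [("gp1", 1, 300), ("gp2", 297, 600),
             ("gp3", 800, 1000), ("gp4", 950, 1200)]
    for flag in flag_gene_calls(calls):
        print(flag)
```

With these invented coordinates, the 4-base overlap between gp1 and gp2 passes, while the 199-base gap after gp2 and the 51-base overlap between gp3 and gp4 are flagged. A flagged gap is the kind of region a student would then examine with BLAST to see whether a gene was missed.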
At the end of the semester, the students’ work is checked before the final draft is submitted to the University of Pittsburgh Bacteriophage Institute for further quality control. The team at Pittsburgh formally submits all analyzed phage sequences to GenBank, a database of DNA and protein sequences that is curated by NCBI.
The first cohort (2009-2010) of students identified 102 genes in phage Daisy, while the second cohort (2010-2011) identified 97 genes in phage BPBiebs31. The annotated genomes for these two phages are published in GenBank. The 2011-2012 Phage students annotated two phages, Flux and Winky. Using DNA Master, the students overcame several early glitches with the program and were able to map all of Flux’s 89 genes within four weeks. The draft annotations were submitted to Pittsburgh by mid-March. Within five weeks, the students identified 142 genes in phage Winky. Those draft annotations were also completed and forwarded to Pittsburgh for formal GenBank submission. Flux was published in GenBank this past June, but Winky still awaits quality control. This spring, Dr. King Smith is leading the fourth cohort of phage students in annotating phages DTDevon and Oaker, again using DNA Master.