14:00 – 16:00
The Path to Your Genome: Biology, Technology, and Algorithms
The initial sequencing and analysis of the human genome, completed in
2003, was a major biological breakthrough, leading to a better
understanding of the evolution and function of many human genes.
Translating this newly acquired knowledge into medical advances
depends on the availability of the genomes of many individuals, and on
the study of correlations between genomes and diseases. Because the
initial human genome was sequenced over 8 years and at a cost of $3
billion, another technological leap was necessary in order to allow
for the economical sequencing of the genomes of many humans. Today
this leap has been accomplished: Next-Generation Sequencing (NGS)
technologies are able to sequence a human genome in a few weeks, at a
cost of $10,000 to $100,000. Using these technologies, scientists are
hoping to sequence thousands of human genomes in the next few years,
and eventually allow each individual to know his or her personal
genome.
Some of the biggest remaining challenges on the path to the personal
genome are algorithmic. NGS technologies can only read many small
fragments of a genomic sequence. Reconstructing the source genome from
these fragments, and analyzing the differences between the sets of
fragments from various individuals, are difficult computational
problems. Furthermore, the challenges of using
the NGS datasets are exacerbated by the errors and biases in the
underlying sequencing technologies. In this talk I will give an
overview of genome sequencing and NGS technologies, and discuss some
of the computational methods used to address the challenges posed by
NGS datasets.