Sequencing a genome is only the beginning. Several layers of analysis are necessary to convert raw sequence data into an understanding of functional biology. First, error sources in the original raw data from multiple platforms and diverse applications must be accounted for. Then, as computational methods for assembly, alignment and variation detection continue to advance, a broad range of genetic analysis applications can be addressed, including comparative genomics, high-throughput polymorphism detection, analysis of coding and non-coding RNAs and identification of mutant genes in disease pathways. CHI’s Sequencing Data Analysis and Interpretation conference combines unique perspectives from a variety of researchers, engineers, biostatisticians and software developers involved in NGS data analysis.
Day 1 | Day 2
Tuesday, August 20
12:00 pm Main Conference Registration
12:30 Luncheon Technology Workshop: From Reads to Variants: Ten-Fold Reduction in Time and Cost with Improved Accuracy
Rupert Yip, Ph.D., Director, Product Marketing, Bina Technologies
Alignment and variant calling of raw NGS reads have long been burdened by the cost of HPC hardware and of the bioinformatics personnel needed to support and maintain home-grown, open-source secondary analysis solutions. Such solutions can take weeks and cost thousands of dollars per analysis. We present a genomic analysis platform that reduces the time and cost of secondary analysis ten-fold while improving accuracy compared to standard pipelines. Our model delivers this cost reduction while preventing hardware obsolescence.
» Plenary Keynote Session
2:00 Chairperson's Opening Remarks
Toby Bloom, Ph.D., Deputy Scientific Director, Informatics, New York Genome Center
2:10 A Revolution in DNA Sequencing Technologies: Challenges and Opportunities
Jeffery A. Schloss, Ph.D., Director, Division of Genome Sciences, National Human Genome Research Institute, National Institutes of Health
The initial sequencing of the human genome spurred an appetite for much more human sequence information to better understand the contributions of human sequence variation to health and disease. However, despite dramatic reductions during the Human Genome Project, the cost of sequencing was clearly too high to collect the very large numbers of human and numerous other organism genome sequences needed to achieve that understanding. In 2004, NHGRI launched parallel programs to reduce the cost of sequencing a mammalian genome initially by two (in five years), and eventually by four orders of magnitude (in ten years). This presentation will summarize the technologies that are in high-throughput use to produce stunning amounts of sequence and related data and novel biological insights, and will emphasize technologies currently emerging and on the horizon that may provide human genome sequence data with the nature, quality, cost and turnaround time needed for applications in research and medicine.
2:50 RNA is Everywhere: Characterizing the Spectra and Flux of RNA in Mammalian Circulation
David Galas, Ph.D., Principal Scientist, Pacific Northwest Diabetes Research Institute
The discovery of foreign RNA in blood and tissues of humans and mice raises many questions, including its origins, the mechanisms of its transport and stability and what, if any, functions it has. I will discuss what we know about circulating exRNA in human plasma and the use of NGS in the exploration of this new area of investigation in biology and medicine.
3:30 Refreshment Break in the Exhibit Hall with Poster Viewing
4:15 Genomics and the Single Cell
Sherman Weissman, Ph.D., Sterling Professor of Genetics and Medicine, Yale University School of Medicine
Studies of single cells are being approached by widely different methods, principally fluorescence microscopy including super-high resolution methods, cloning and expansion of single cells or, most generally applicable, genomic-scale nucleic acid analyses. The last includes single-cell DNA sequence analysis, gene expression analysis and most recently analyses of telomere length, DNA methylation and potentially closed regions of chromatin. Also, in the near future, it may be possible to combine several analyses of a single cell, including mRNA expression, genomic DNA methylation and protein secretion. These approaches will have major value for diverse fields, including molecular analysis of the early stages of development, the nature and heterogeneity of stem cells and transient repopulating cells in various systems including the hematopoietic system, the nature and extent of heterogeneity of neurons, heterogeneity in neoplasia and in functional subsets of cells of the immune system. A substantial experimental challenge is to distinguish technical variation from stochastic and deterministic events in single cells. Another, broader challenge is to correlate the results of genomic analyses that necessarily involve destruction of the cell with the functional properties and potential of the individual cell being analyzed. These issues will be discussed briefly in the presentation.
4:55 Genome Hacking
Yaniv Erlich, Ph.D., Principal Investigator, Whitehead Fellow, Whitehead Institute for Biomedical Research
Sharing sequencing datasets without identifiers has become a common practice in genomics. We developed a technique that uses entirely free, publicly accessible Internet resources to fully identify individuals in these studies. I will present a quantitative analysis of the probability of identifying U.S. individuals by this technique. In addition, I will demonstrate the power of our approach by tracing back the identities of multiple whole-genome datasets in public sequencing repositories.
Genetic Privacy: Technology and Ethics with Yaniv Erlich
5:35 Short Course Registration