From discovery to technology explosion
• 1868: Discovery of DNA
• 1953: Watson and Crick propose double helix structure
• 1977: Sanger sequencing
• 1985: PCR
• 2000: Working draft human genome announced (Sanger method)
• 2005: 454 sequencer launch (pyrosequencing)
• 2006: Genome Analyzer launched (Solexa sequencing)
• 2007: SOLiD launched (ligation sequencing)
• 2009: Whole human genome no longer merits Nature/Science paper
• 2010: “third-gen” systems
Cost of a human genome: $3 billion, $2-3 million, $250k, $50k, $20k, <$1k
Image: http://www.nature.com/nature/journal/v464/n7289/images/cover_nature.jpg

Oxford Nanopore
• Sensor array chip: many nanopores in parallel
• DNA sequencing, proteins, polymers, small molecules
• Adaptable protein nanopore
• Electronic read-out system

Causes of DNA degradation
• Mechanical damage during tissue homogenization.
• Wrong pH and ionic strength of extraction buffer.
• Incomplete removal of, or contamination with, nucleases.
• Phenol: too old, or inappropriately buffered (should be pH 7.8-8.0); incomplete removal.
• Wrong pH of DNA solvent (acidic water). Recommended: 1:10 TE for short-term storage, or 1x TE for long-term storage.
• Vigorous pipetting (use wide-bore pipette tips).
• Vortexing of DNA at high concentrations.
• Too many freeze-thaw cycles (we tested 5, still OK).
• Debatable: sequence-dependent DNA degradation.

Two strategies
• Whole genome shotgun (bottom-up)
• Clone-by-clone (top-down)

Genome sequencing
Image: http://corelabs.cgrb.oregonstate.edu/sites/default/files/HTS_HISeq2000.png
• Rapid progress in next-generation sequencing technologies promises to provide complete (reference) DNA sequences.
• The bottleneck:
– NOT the sequencing capacity
– BUT the ability to assemble many short reads when repeated DNA (and polyploidy) is prevalent
Sequencing without a limit?
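The assembly bottleneck named above can be made concrete with a toy calculation. The following is a minimal sketch under loud assumptions: the genome string and its repeat are invented for illustration, reads are error-free, and `ambiguous_fraction` is a hypothetical helper, not part of any real assembler. It reports the fraction of read start positions whose read sequence also occurs somewhere else in the genome, meaning the read cannot be placed uniquely.

```python
from collections import Counter


def ambiguous_fraction(genome: str, read_len: int) -> float:
    """Fraction of read start positions whose read also occurs elsewhere in the genome."""
    reads = [genome[i:i + read_len] for i in range(len(genome) - read_len + 1)]
    counts = Counter(reads)
    # A read is ambiguous if the same sequence starts at more than one position.
    return sum(1 for r in reads if counts[r] > 1) / len(reads)


# Toy genome (hypothetical): unique flanks around two copies of a 12-bp repeat.
REPEAT = "ACGTTGCAAGCT"
GENOME = "TTAGGCATCA" + REPEAT + "GGATCCTAGA" + REPEAT + "CCGATATGCA"

for read_len in (6, 12, 16):
    print(read_len, round(ambiguous_fraction(GENOME, read_len), 2))
```

Reads no longer than the repeat can fall entirely inside it and then match both copies; only reads that extend far enough into unique flanking sequence resolve the ambiguity, which is why repeat-rich genomes are hard to assemble from short reads alone.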
Genome sequencing
• GenBank: 1982, Los Alamos Sequence Database (Walter Goad)
Frederick Sanger
• 1958: Nobel Prize (insulin structure)
• 1975: plus and minus sequencing method
• 1977: dideoxy sequencing method; Φ-X174 (5,386 bp) sequence
• 1980: second Nobel Prize
• λ phage sequence by the shotgun method (48,502 bp)

Genome sequencing
• 1986: Leroy Hood: automatic sequencing machine
• 1986: Human Genome Initiative

Genome sequencing
• 1995: John Craig Venter: first bacterial genome
Craig Venter: Global Ocean Sampling Expedition, Synthetic Genomics, Human Longevity Inc
http://www.youtube.com/watch?v=J0rDFbrhjtI

Which applications are labs performing?

2010 Human genome reference

Anne Wojcicki, CEO, wife of Google co-founder Sergey Mikhaylovich Brin: 23andMe (30% GSK)

454 pyrosequencing (2005)
http://www.454.com
Genome Sequencer 20 System
DNA library preparation
• DNA fragmentation
• Adapter ligation
• Capture of DNA molecules
• Denaturation
emPCR
• Emulsion formation (oil)
• emPCR
• Bead capture
• Denaturation
• Sequencing primer
• Dispersal onto the slide
• Microreactor parameters
• Sequencing

SOLiD (Sequencing by Oligonucleotide Ligation and Detection)
• 2-base encoding sequencing (2007)

Solexa (2007)

HELICOS (2008)
• True Single Molecule Sequencing (tSMS)

Single Molecule Real-Time (SMRT), Pacific Biosciences
• 20 zeptoliters

Ion Torrent

Oxford Nanopore

Other technologies
• Microelectrophoresis
• Microarray-based sequencing

CHALLENGES IN GENOME SEQUENCING
De novo genome assemblies using only short-read data from NGS technologies are generally incomplete and highly fragmented due to:
§ Large duplications
§ High proportion of repetitive DNA
§ Large genome size (~17 Gb)
§ Polyploidy (3 subgenomes)
Chromosomal approach, BAC-by-BAC sequencing: a challenge!

Chromosomal approach
BAC-BY-BAC SEQUENCING
BAC clones
§ The physical map is composed of contigs of overlapping BAC clones
§ BAC contigs are anchored to the chromosome through markers contained in the contigs

SOLUTIONS FOR THE REPEATS
§ Long mate-pair reads > 10 kb
§ Long read technologies: PacBio, Oxford Nanopore
§ Optical mapping
§ Single-molecule mapping of genomic DNA hundreds of kilobases to several megabases in size
§ Creates sequence-motif maps, which provide a long-range template for ordering genomic sequences
§ Visualisation of reality: “Seeing is Believing”

Three enzymatic approaches
§ Restriction enzymes: sequence-specific cleavage of DNA immobilized on a surface
§ Nicking enzymes: fluorescent labelling of the nicking site in solution (BioNano Genomics - Irys)
§ Methyltransferase enzymes: labelling with ultra-high density

OPTICAL MAPPING
§ Nicking
§ Strand displacement
§ Incorporation of fluorescent nucleotides

BIONANO GENOME MAPPING ON NANOCHANNEL ARRAYS
1 Sequence-specific labeling: nickase (Nt.BspQI)
2 DNA linearization
3 Fluorescence imaging
4 Map construction
5 Building consensus map
Lam et al., Nat. Biotechnol. 30(8), 2012

Fluorescent dye-conjugated nucleotides (Alexa 546 dUTP) were incorporated at the Nt.BspQI sites by Vent (exo−) polymerase. Next, we stained the labeled DNA molecules with the DNA-intercalating dye YOYO-1, which facilitates visualization of the DNA molecule and measurement of its size. Then, we loaded the DNA onto a nanochannel array chip and applied an electric field, which gradually drives the long, coiled DNA molecules in free suspension through a series of micro- and nanofluidic structures. Once the nanochannels were populated by a set of linearized DNA molecules, we imaged them with automated high-resolution fluorescence microscopy. We determined the size of each DNA molecule by directly measuring its contour length. The histogram peaks represent the location of each sequence motif along the molecules.
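The sequence-motif map described above has an in silico counterpart: predicting where labels should appear along a known sequence so that measured molecules can be compared against it. The following is a minimal sketch, not the published pipeline. It assumes the Nt.BspQI recognition sequence is GCTCTTC (to be checked against the enzyme datasheet), reports motif start positions on either strand, and ignores the exact nick offset, labeling efficiency, and optical resolution. The function names are hypothetical.

```python
import re

# Assumed Nt.BspQI recognition sequence; verify against the enzyme datasheet.
MOTIF = "GCTCTTC"


def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def label_positions(seq: str, motif: str = MOTIF) -> list[int]:
    """Sorted 0-based start positions of the motif on either strand."""
    seq = seq.upper()
    hits = set()
    for pattern in (motif, revcomp(motif)):
        # Zero-width lookahead so overlapping occurrences are also found.
        hits.update(m.start() for m in re.finditer(f"(?={pattern})", seq))
    return sorted(hits)


def label_spacings(positions: list[int]) -> list[int]:
    """Distances in bp between consecutive predicted labels."""
    return [b - a for a, b in zip(positions, positions[1:])]


# Toy sequence (hypothetical): one motif on each strand.
example = "AAAA" + "GCTCTTC" + "TTTTT" + "GAAGAGC" + "CC"
positions = label_positions(example)
print(positions, label_spacings(positions))
```

The list of spacings is the quantity that would be compared with label-to-label distances measured on imaged molecules; a real comparison would also need a sizing-error model and a minimum resolvable distance between neighboring labels.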