Bi7492 DNA Sequence Analysis

Faculty of Science
Autumn 2010 - only for the accreditation
Extent and Intensity
2/1. 3 credit(s) (fasci plus compl plus > 4). Type of Completion: zk (examination).
Teacher(s)
doc. Mgr. Natália Martínková, Ph.D. (lecturer)
Guaranteed by
prof. RNDr. Ladislav Dušek, Ph.D.
RECETOX – Faculty of Science
Prerequisites
Students must have a basic knowledge of molecular biology and genetics, and must understand English well enough to navigate the websites covered in the course. Completing the course Neparametrické metody (Nonparametric Methods, K. Kubošová) is recommended.
Course Enrolment Limitations
The course is offered to students of any study field.
The capacity limit for the course is 30 student(s).
Current registration and enrolment status: enrolled: 0/30, only registered: 0/30, only registered with preference (fields directly associated with the programme): 0/30
Course objectives
The course covers the analysis and evaluation of large DNA sequence datasets obtained from genomic sequencing or from public databases. Students will learn to process bioinformatic data from the initial recognition of a DNA sequence in an image file, through data validation, to sequence annotation. They will also learn approaches and methods for placing their results in an appropriate context, establishing relationships between the investigated units by reconstructing phylogenetic trees, and interpreting the resulting model. The course focuses on teaching students to find solutions to complex problems in intuitive ways.
Syllabus
  • 1. Genomics: (a) Genomes, genome organisation, variability, (b) Genome sequencing – next-gen sequencing, pyrosequencing, sequencing by ligation, ChIP-seq.
  • 2. Genome assembly: (a) De-novo, resequencing, mutation detection, (b) Contig, length and number of contigs, (c) Parameters of contig assembly, (d) Coverage – variability detection.
  • 3. Sequence search: (a) GenBank, EMBL, DDBJ, UniProt, (b) Entrez, SRS, (c) Libraries, cross-referencing.
  • 4. BLAST: (a) Nucleotide and protein BLAST, MegaBLAST, PSI-BLAST, (b) Search principle, (c) Result evaluation, E-value.
  • 5. Genomic sequence annotation: (a) RNA prediction, (b) GC content, (c) Protein prediction – Prokaryota.
  • 6. Genomic sequence annotation: (a) Protein prediction – Eukaryota, (b) Detection of domains.
  • 7. Alignment: (a) Homologous positions, (b) Local, global alignment – assembly options, (c) Dynamic programming.
  • 8. Relationship modelling: (a) Phylogenetic tree – interpretation.
  • 9. Substitution model: (a) Sequence evolution, types of mutations, (b) Model parameters, (c) Problems and solutions.
  • 10. Phylogenetics: (a) Phylogenetic analysis – neighbour-joining, maximum parsimony, (b) Significance of signal, bootstrap.
  • 11. Maximum likelihood: (a) Likelihood function, (b) Randomised axelerated maximum likelihood.
  • 12. Bayesian analysis: (a) Posterior probability, (b) Effect of priors on posterior distribution, (c) Convergence, Metropolis-coupled MCMC.
  • 13. Gene and species evolution: (a) Bayesian estimation of species trees, (b) Supermatrix, (c) Supertrees.
  • 14. Visualisation.
Teaching methods
Lectures, discussion, in-class data analysis.
Assessment methods
Final written test, data analysis.
Language of instruction
Czech
The course is also listed under the following terms: Autumn 2010, Autumn 2011, Autumn 2011 - accreditation, Autumn 2012, Autumn 2013, Autumn 2014, Autumn 2015, Autumn 2016, Spring 2018, Spring 2019, Autumn 2019, Spring 2020, Autumn 2020, Spring 2021, Autumn 2021.