Advanced Methods of Bioinformatics

GRAPH ALGORITHMS AND DATA STRUCTURES 13. 3. 2023

Eizenga et al. (2020) Pangenome Graphs

Annual Review of Genomics and Human Genetics

https://www.annualreviews.org/doi/pdf/10.1146/annurev-genom-120219-080406

=====

de Bruijn graphs (e.g. Velvet)

variation graphs (unbiased pangenome graphs)
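
To make the de Bruijn graph idea concrete, here is a minimal sketch in Python. It assumes the common textbook construction (nodes are (k-1)-mers, every k-mer in a read contributes one edge) and is not how Velvet itself is implemented: it ignores reverse complements, error correction and graph simplification. The function name and the toy reads are made up for illustration.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Build a de Bruijn multigraph: nodes are (k-1)-mers, each k-mer adds one edge."""
    graph = defaultdict(list)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Edge from the k-mer's (k-1)-prefix to its (k-1)-suffix.
            graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


# Two overlapping toy reads (hypothetical data).
reads = ["ACGTC", "CGTCA"]
graph = de_bruijn_graph(reads, k=3)
for node, successors in sorted(graph.items()):
    print(node, "->", successors)
```

Repeated edges (here CG -> GT appears twice) reflect how often a k-mer was seen; an assembler would typically store that as a coverage count rather than as duplicate edges.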

=====

EXERCISE:

wgsim - simulate short reads (150 bp) from E. coli at 50x coverage, with error rates of 0.5% and 5%, both paired (d = 2000 bp) and unpaired
https://github.com/lh3/wgsim
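
The number of reads for a target coverage follows from coverage = reads * read length / genome size. The sketch below works this out for paired reads and prints a possible wgsim command line for each error rate. Assumptions: the genome size is that of E. coli K-12 MG1655 (4,641,652 bp; use the length of your own reference), the file names are placeholders, and the helper functions are hypothetical. The options used (-N pairs, -1/-2 read lengths, -e base error rate, -d outer distance) are from the wgsim usage text; check `wgsim` without arguments on your installation.

```python
import math

GENOME_SIZE = 4_641_652  # assumed: E. coli K-12 MG1655 length in bp
READ_LENGTH = 150
COVERAGE = 50


def pairs_needed(genome_size, coverage, read_length):
    """Number of read pairs giving the requested coverage (two reads per pair)."""
    return math.ceil(genome_size * coverage / (2 * read_length))


def wgsim_command(ref, out1, out2, n_pairs, error_rate, outer_distance=2000):
    """Assemble a wgsim argument list; the file names are placeholders."""
    return [
        "wgsim",
        "-N", str(n_pairs),
        "-1", str(READ_LENGTH),
        "-2", str(READ_LENGTH),
        "-e", str(error_rate),
        "-d", str(outer_distance),
        ref, out1, out2,
    ]


n_pairs = pairs_needed(GENOME_SIZE, COVERAGE, READ_LENGTH)
for error_rate in (0.005, 0.05):
    print(" ".join(wgsim_command("ecoli.fa", "r1.fq", "r2.fq", n_pairs, error_rate)))
```

wgsim always writes two files; for the unpaired data set one option is to use only the first file, in which case the pair count would need to be doubled to keep 50x coverage. wgsim also introduces mutations by default (option -r), which you may want to set to 0 if only sequencing errors should differ from the reference.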

seqtk sample - subsample the FASTQ files (create 5x coverage)
(sampling is also possible with sample from https://github.com/alexpreynolds/sample)
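
Going from 50x to 5x means keeping a fraction of 5/50 = 0.1 of the reads. The sketch below illustrates the idea of seeded subsampling in Python; it is not seqtk's algorithm, and the function names and toy records are hypothetical. It shows why the same seed must be used for both mate files: identical seeds give identical keep/drop decisions, so the pairs stay in sync.

```python
import random


def read_fastq(lines):
    """Group an iterable of lines into 4-line FASTQ records."""
    record = []
    for line in lines:
        record.append(line.rstrip("\n"))
        if len(record) == 4:
            yield tuple(record)
            record = []


def subsample(records, fraction, seed):
    """Keep each record independently with probability `fraction`."""
    rng = random.Random(seed)
    return [rec for rec in records if rng.random() < fraction]


fraction = 5 / 50  # from 50x down to 5x

# Toy mate files (hypothetical data): record i in one list pairs with record i in the other.
mates1 = [(f"@r{i}/1", "ACGT", "+", "IIII") for i in range(1000)]
mates2 = [(f"@r{i}/2", "TGCA", "+", "IIII") for i in range(1000)]
sub1 = subsample(mates1, fraction, seed=11)
sub2 = subsample(mates2, fraction, seed=11)
print(len(sub1), "of", len(mates1), "pairs kept")
```

With seqtk the equivalent would be along the lines of `seqtk sample -s11 r1.fq 0.1 > sub1.fq`, repeated with the same -s value for the second file.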

velvet - assemble reads
https://cw.fel.cvut.cz/b182/_media/courses/bin/assembly_jk_2017_2p.pdf
https://www.cs.jhu.edu/~langmea/resources/lecture_notes/assembly_dbg.pdf

characterize the contigs (try assembly-stats from the Ubuntu package of the same name, or stats.sh from the BBMap package on hedron)
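
One of the statistics these tools report is N50: the largest length L such that contigs of length at least L together cover at least half of the total assembly. As a cross-check, it can be computed by hand; the sketch below assumes the contigs are available as FASTA lines, and the helper names and toy lengths are made up for illustration.

```python
def fasta_lengths(lines):
    """Return the sequence length of every record in an iterable of FASTA lines."""
    lengths = []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            lengths.append(0)  # a header starts a new record
        elif line and lengths:
            lengths[-1] += len(line)
    return lengths


def n50(lengths):
    """Length of the contig at which the running sum (longest first) reaches half the total."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty assembly


contig_lengths = [100, 200, 300, 400, 500]  # toy values
print("contigs:", len(contig_lengths))
print("total length:", sum(contig_lengths))
print("N50:", n50(contig_lengths))
```

For a real assembly the lengths would come from something like `fasta_lengths(open("contigs.fa"))`, where the file name is a placeholder for the assembler's output.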

compare the contigs with the reference
https://www.biostars.org/p/383339/#383346