Week 4 - Genome and transcriptome assembly: state of the art and best practices
Instructor: Mgr. Monika Čechová, Ph.D.
LECTURE
- Best practices in genome and transcriptome assembly
EXERCISE
HOMEWORK
Remember there is no “one right way” to do an analysis. Choose parameters that you think are the most suitable for your goal.
- Create an account at https://usegalaxy.eu/
- Load the following FASTQ files as a Collection (List of Pairs):
- https://zenodo.org/record/3541678/files/A1_left.fq.gz
- https://zenodo.org/record/3541678/files/A1_right.fq.gz
- https://zenodo.org/record/3541678/files/A2_left.fq.gz
- https://zenodo.org/record/3541678/files/A2_right.fq.gz
- https://zenodo.org/record/3541678/files/A3_left.fq.gz
- https://zenodo.org/record/3541678/files/A3_right.fq.gz
- https://zenodo.org/record/3541678/files/B1_left.fq.gz
- https://zenodo.org/record/3541678/files/B1_right.fq.gz
- https://zenodo.org/record/3541678/files/B2_left.fq.gz
- https://zenodo.org/record/3541678/files/B2_right.fq.gz
- https://zenodo.org/record/3541678/files/B3_left.fq.gz
- https://zenodo.org/record/3541678/files/B3_right.fq.gz
- Run FastQC before and after trimming the reads with Trimmomatic. Trim for quality and consider whether adapter removal should also be performed.
- Assemble the trimmed reads with Trinity. Trinity will output both gene and isoform files. Focus on the isoforms.
- Align the trimmed reads to this de novo reference assembly and estimate read abundance per isoform (Align reads and estimate abundance on a de novo assembly of RNA-Seq data). Use salmon as the Abundance estimation method.
- Rename the datasets: A1_raw, A2_raw, A3_raw, B1_raw, B2_raw, B3_raw
- Build the expression matrix for your de novo assembly of RNA-Seq data by Trinity (this is the first step in the differential gene expression pipeline).
- Share your history with the user cechova.biomonika@gmail.com
- Export your history to a file and upload the resulting .tar.gz to the Odevzdávarna by April 13th, 2021.
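For the step that loads the FASTQ files, note that the twelve URLs follow a regular `<sample>_<left|right>.fq.gz` pattern. The Python sketch below regenerates them and groups them into one pair per sample, which may help when checking that the List of Pairs is complete before uploading. Treating `left` as the forward mate and `right` as the reverse mate is an assumption, and `build_urls` and `group_pairs` are hypothetical helper names, not part of Galaxy.

```python
import re
from collections import defaultdict

# Base URL and sample names are copied from the file list in this assignment.
BASE = "https://zenodo.org/record/3541678/files"
SAMPLES = ["A1", "A2", "A3", "B1", "B2", "B3"]


def build_urls(base=BASE, samples=SAMPLES):
    """Return the FASTQ URLs: a left and a right file for every sample."""
    return [f"{base}/{s}_{side}.fq.gz" for s in samples for side in ("left", "right")]


def group_pairs(urls):
    """Group URLs by sample into {"forward": url, "reverse": url} dictionaries."""
    pattern = re.compile(r"/([^/]+)_(left|right)\.fq\.gz$")
    pairs = defaultdict(dict)
    for url in urls:
        match = pattern.search(url)
        if match is None:
            raise ValueError(f"unexpected file name: {url}")
        sample, side = match.groups()
        # Assumption: "left" reads are the forward mates, "right" the reverse mates.
        role = "forward" if side == "left" else "reverse"
        pairs[sample][role] = url
    incomplete = [s for s, p in pairs.items() if set(p) != {"forward", "reverse"}]
    if incomplete:
        raise ValueError(f"samples missing a mate: {incomplete}")
    return dict(pairs)


if __name__ == "__main__":
    for sample, pair in group_pairs(build_urls()).items():
        print(sample, pair["forward"], pair["reverse"], sep="\t")
```

Each printed row corresponds to one element of the paired collection, so six rows are expected.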
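For the trimming step, it may help to see what quality trimming with a sliding window does to a single read. The sketch below is a simplified illustration of the idea and not Trimmomatic's actual implementation: it decodes Phred+33 qualities and cuts the read at the start of the first window whose mean quality falls below a threshold. The window size of 4 and threshold of 20 are illustrative values only; choose your own parameters for the homework.

```python
def phred33(quality_string):
    """Decode a Phred+33 encoded FASTQ quality string into integer scores."""
    return [ord(ch) - 33 for ch in quality_string]


def sliding_window_trim(seq, qual, window=4, min_avg=20):
    """Cut the read where the mean quality of a window first drops below min_avg.

    Simplified sketch: everything from the start of the first failing window
    onward is removed. Reads shorter than the window are judged as one window.
    """
    scores = phred33(qual)
    if len(seq) != len(scores):
        raise ValueError("sequence and quality must have the same length")
    width = min(window, len(scores))
    if width == 0:
        return seq, qual
    for start in range(len(scores) - width + 1):
        if sum(scores[start:start + width]) / width < min_avg:
            return seq[:start], qual[:start]
    return seq, qual


if __name__ == "__main__":
    # 'I' encodes quality 40 and '#' encodes quality 2 in Phred+33.
    print(sliding_window_trim("ACGTACGTAC", "IIIIII####"))
```

Comparing FastQC reports before and after trimming shows the same effect across the whole dataset: the low-quality tails of the per-base quality plot should shrink.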
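For the Trinity step, the relation between genes and isoforms is encoded in the transcript identifiers, which end in `_g<number>_i<number>`: the part up to and including `_g<number>` names the gene, and the `_i<number>` suffix names the isoform. The sketch below groups isoforms by gene from FASTA header lines. The example identifiers are made up for illustration, and `gene_of` and `isoforms_per_gene` are hypothetical helper names.

```python
import re
from collections import defaultdict

# Matches IDs such as TRINITY_DN1000_c115_g5_i1 and captures the gene part.
ISOFORM_RE = re.compile(r"^(?P<gene>.+_g\d+)_i\d+$")


def gene_of(isoform_id):
    """Return the gene identifier that a Trinity isoform identifier belongs to."""
    match = ISOFORM_RE.match(isoform_id)
    if match is None:
        raise ValueError(f"not a Trinity isoform identifier: {isoform_id}")
    return match.group("gene")


def isoforms_per_gene(fasta_lines):
    """Map each gene to its isoforms, reading only the FASTA header lines."""
    genes = defaultdict(list)
    for line in fasta_lines:
        if line.startswith(">"):
            # The identifier is the first whitespace-separated token after '>'.
            isoform_id = line[1:].split()[0]
            genes[gene_of(isoform_id)].append(isoform_id)
    return dict(genes)


if __name__ == "__main__":
    example = [
        ">TRINITY_DN1000_c115_g5_i1 len=247",
        "ACGT",
        ">TRINITY_DN1000_c115_g5_i2 len=312",
        "ACGTAC",
        ">TRINITY_DN2_c0_g1_i1 len=500",
        "ACGTACGT",
    ]
    for gene, isoforms in isoforms_per_gene(example).items():
        print(gene, len(isoforms), ",".join(isoforms), sep="\t")
```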
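For the abundance estimation step, salmon reports for each isoform an estimated number of reads, an effective length, and a TPM (transcripts per million) value. The sketch below shows how TPM relates to the other two quantities: each isoform's reads are divided by its effective length, and the resulting rates are rescaled to sum to one million. It is meant only to make the numbers in the output easier to interpret; salmon computes these values itself, and the function name `tpm` and the example values are assumptions.

```python
def tpm(num_reads, effective_lengths):
    """Compute TPM per isoform from read counts and effective lengths.

    TPM_i = 1e6 * (reads_i / length_i) / sum_j (reads_j / length_j)
    """
    rates = {}
    for name, reads in num_reads.items():
        length = effective_lengths[name]
        if length <= 0:
            raise ValueError(f"effective length must be positive: {name}")
        rates[name] = reads / length
    total = sum(rates.values())
    if total == 0:
        # No reads assigned anywhere: report zero expression for every isoform.
        return {name: 0.0 for name in rates}
    return {name: 1e6 * rate / total for name, rate in rates.items()}


if __name__ == "__main__":
    # Two isoforms with the same read count; the shorter one gets the higher TPM.
    reads = {"iso1": 100, "iso2": 100}
    lengths = {"iso1": 1000.0, "iso2": 500.0}
    for name, value in tpm(reads, lengths).items():
        print(name, round(value, 1), sep="\t")
```

Because TPM values are normalized within a sample, they always sum to one million, which is why they are compared across samples only after further normalization in a differential expression analysis.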
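For the expression matrix step, the idea is to combine the per-sample abundance tables into a single table with one row per isoform and one column per sample. The sketch below illustrates that idea under simple assumptions and is not the implementation used by the Galaxy tool: each sample is represented as a dictionary of isoform to value, and isoforms missing from a sample are filled with 0. The sample names follow the assignment; the isoform names and values are made up.

```python
def build_matrix(per_sample):
    """Combine {sample: {isoform: value}} into a header and a list of rows.

    Rows are sorted by isoform name; columns keep the order of the samples.
    Isoforms absent from a sample get the value 0.0 (an assumption).
    """
    samples = list(per_sample)
    isoforms = sorted({iso for table in per_sample.values() for iso in table})
    header = ["isoform"] + samples
    rows = [[iso] + [per_sample[s].get(iso, 0.0) for s in samples] for iso in isoforms]
    return header, rows


def to_tsv(header, rows):
    """Render the matrix as tab-separated text, one line per isoform."""
    lines = ["\t".join(header)]
    lines.extend("\t".join(str(cell) for cell in row) for row in rows)
    return "\n".join(lines)


if __name__ == "__main__":
    example = {
        "A1_raw": {"iso1": 10.0, "iso2": 5.0},
        "B1_raw": {"iso2": 7.0, "iso3": 1.0},
    }
    print(to_tsv(*build_matrix(example)))
```

With six samples renamed as in the assignment, the resulting matrix would have six value columns, and the column names are what later identify replicates of conditions A and B.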
This exercise is inspired by the following draft tutorial: