by Sébastien Renaut
Next-generation sequencing has promised cheap DNA sequences to the masses. While this may be true, the bottleneck has now shifted from generating data to analyzing it. Here, I will use transcriptome sequencing data (RNA-seq) to quantify gene expression. I will introduce data formats commonly used in genomics (e.g. .fastq, .bam, .sam) and use the R programming language to identify differentially expressed genes (e.g. with the DESeq2 and edgeR packages), cluster samples based on gene expression, detect gene ontology categories that are over- or under-represented (goseq), and present various graphics to illustrate the results.