IMPRS Comparative genomics 2013

(Eva Stukenbrock and Julien Dutheil)

Presentations

  1. Genetic variation. Here we talk about mutations, diversity, selection and demography.
  2. Alignment. Here we talk about homology and its inference.
  3. Phylogeny. Here we demonstrate how to reconstruct the history of sequences.
  4. Positive selection. Here we introduce the methodology for inferring positive selection.
  5. Positive selection 2: codon models. Here we briefly introduce codon models of sequence evolution.

Practical course

Practical session 1: origin and maintenance of genetic variation

Genetic drift: http://darwin.eeb.uconn.edu/simulations/drift.html

Drift + selection: http://darwin.eeb.uconn.edu/simulations/selection-drift.html

The effect of selection: http://darwin.eeb.uconn.edu/simulations/selection.html

(Simulations are from the Holsinger lab).
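
For readers who prefer to experiment offline, the behaviour shown by the drift and selection simulations can be approximated with a minimal Wright-Fisher sketch. This is not the Holsinger lab code: the function name `wright_fisher`, its parameters, and the choice of genic selection (fitness 1 + s for allele A versus 1 for the other allele) are assumptions made for illustration.

```python
import random


def wright_fisher(n_individuals, p0, generations, s=0.0, seed=None):
    """Return the frequency trajectory of allele A in a diploid population.

    Assumed model (illustrative only): constant size, non-overlapping
    generations, genic selection with coefficient s in favour of allele A.
    """
    rng = random.Random(seed)
    n_copies = 2 * n_individuals  # diploid: 2N gene copies
    p = p0
    trajectory = [p]
    for _ in range(generations):
        # Selection: deterministic shift of the expected frequency.
        p_sel = p * (1 + s) / (p * (1 + s) + (1 - p))
        # Drift: binomial sampling of 2N gene copies for the next generation.
        count = sum(1 for _ in range(n_copies) if rng.random() < p_sel)
        p = count / n_copies
        trajectory.append(p)
    return trajectory


if __name__ == "__main__":
    # Pure drift (s = 0) in a small population, then drift plus selection.
    print(wright_fisher(20, 0.5, 50, seed=1)[-1])
    print(wright_fisher(20, 0.5, 50, s=0.1, seed=1)[-1])
```

Running the function several times with different seeds, and with small versus large `n_individuals`, should show the same qualitative pattern as the online simulations: trajectories fluctuate more in small populations, and fixation or loss is eventually reached.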

The UCSC Genome Browser: http://genome.ucsc.edu/cgi-bin/hgGateway

Population size estimates (paste this URL into the UCSC Genome Browser as a custom track): http://kimura.univ-montp2.fr/~jdutheil/Gorilla/UCSCTracks/thetaHCPerAln.track.gz

Practical session 2: homology and alignment

Get SeaView: http://pbil.univ-lyon1.fr/software/seaview.html

Random sequence generator (R script)

A random sequence file (Fasta)
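
If the R script is not at hand, a comparable random Fasta file can be produced with the short sketch below. It is not the course script: the function name `random_fasta`, the `random_N` sequence names, and the uniform nucleotide alphabet are assumptions. Aligning such unrelated sequences is a useful check, since an alignment program will still return an alignment even when there is no homology to recover.

```python
import random


def random_fasta(n_sequences, length, alphabet="ACGT", seed=None):
    """Return a Fasta-formatted string of independent random sequences.

    Each site is drawn uniformly from `alphabet` (an illustrative choice).
    """
    rng = random.Random(seed)
    records = []
    for i in range(1, n_sequences + 1):
        seq = "".join(rng.choice(alphabet) for _ in range(length))
        records.append(f">random_{i}\n{seq}")
    return "\n".join(records) + "\n"


if __name__ == "__main__":
    # Write ten random sequences of 200 sites to a file (hypothetical name).
    with open("random.fasta", "w") as handle:
        handle.write(random_fasta(10, 200, seed=42))
```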

  1. Open the file with SeaView
  2. Align the sequences
  3. Compare Clustal and Muscle output
  4. Assess the quality of the alignment
  5. Filter the alignment using Gblocks
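
To get a feel for what filtering an alignment means, the sketch below removes columns that contain too many gaps. This is a deliberately simple gap-fraction filter, not the Gblocks algorithm, which applies additional conservation and block-length criteria. The function names and the 0.5 threshold are assumptions chosen for illustration.

```python
def gap_fraction_per_column(alignment):
    """Return, for each column, the fraction of sequences with a gap ('-')."""
    n_sequences = len(alignment)
    n_columns = len(alignment[0])
    if any(len(seq) != n_columns for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    return [
        sum(1 for seq in alignment if seq[i] == "-") / n_sequences
        for i in range(n_columns)
    ]


def filter_columns(alignment, max_gap_fraction=0.5):
    """Keep only the columns whose gap fraction is at most max_gap_fraction."""
    fractions = gap_fraction_per_column(alignment)
    keep = [i for i, f in enumerate(fractions) if f <= max_gap_fraction]
    return ["".join(seq[i] for i in keep) for seq in alignment]


if __name__ == "__main__":
    toy = ["AC-GT", "A--GT", "ACTGT"]  # toy alignment, not course data
    print(gap_fraction_per_column(toy))
    print(filter_columns(toy))
```

The per-column gap fractions also give a crude quality indicator: long stretches of gap-rich columns usually mark regions where the alignment is unreliable.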

Some globin sequences (Proteins, Fasta)

Practical session 3: reconstructing the history of sequences

[Hemoglobin data set from session 2]

  1. Using SeaView, build a phylogenetic tree from the previously filtered alignment
  2. Compare Parsimony, Distance and ML methods
  3. Assess the confidence of the tree reconstruction
  4. Identify duplication events and date them
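
Distance methods start from a matrix of pairwise distances computed on the alignment. As a minimal sketch, assuming protein sequences such as the globins, the code below computes the observed proportion of differing sites (p-distance) and its Poisson correction, d = -ln(1 - p), which accounts for multiple substitutions at the same site. The function names are illustrative, and sites with a gap in either sequence are simply skipped, which is one of several possible conventions.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of differing sites, ignoring sites with a gap in either sequence."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned (same length)")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(1 for a, b in pairs if a != b) / len(pairs)


def poisson_distance(seq_a, seq_b):
    """Poisson-corrected distance d = -ln(1 - p); infinite if every site differs."""
    p = p_distance(seq_a, seq_b)
    if p >= 1.0:
        return math.inf
    return -math.log(1.0 - p)


if __name__ == "__main__":
    # Toy peptide fragments, invented for illustration.
    a, b = "MVLSPADK", "MVLSGADK"
    print(p_distance(a, b), poisson_distance(a, b))
```

The corrected distance is always at least as large as the p-distance, and the gap between the two widens as sequences diverge. This matters when comparing the ages of duplication events.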

rRNA sequence alignment (Mase)

Full species names for rRNA alignment

Analyze the data! Which model best describes the data?
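
One common way to answer this question is to compare models with the Akaike information criterion, AIC = 2k - 2 ln L, where k is the number of free parameters and ln L the maximized log-likelihood; the model with the lowest AIC is preferred. The sketch below assumes this criterion (the course may use likelihood ratio tests instead), and the log-likelihood values in it are invented placeholders to be replaced with the values reported by the phylogeny program.

```python
def aic(log_likelihood, n_parameters):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_parameters - 2 * log_likelihood


def rank_models(fits):
    """Sort models by increasing AIC.

    `fits` maps a model name to (log_likelihood, n_free_parameters).
    """
    scores = {name: aic(lnl, k) for name, (lnl, k) in fits.items()}
    return sorted(scores.items(), key=lambda item: item[1])


# Placeholder numbers, NOT results from the course data. Only the
# substitution-model parameters are counted here; branch lengths add the
# same amount to every model on a given tree and do not change the ranking.
example_fits = {
    "JC69": (-5000.0, 0),
    "HKY85": (-4900.0, 4),
    "GTR": (-4898.0, 8),
}

if __name__ == "__main__":
    for name, score in rank_models(example_fits):
        print(name, score)
```

With these placeholder values the richer GTR model fits slightly better than HKY85 in raw likelihood, but not by enough to justify its four extra parameters. This is the trade-off the criterion is meant to capture.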

Practical session 4: inferring positive selection

LysM data, programs (Windows executables), and protocol.