This is the home page for class SDS 348, Computational Biology and Bioinformatics. All relevant course materials will be posted here.

(For the latest version of this class, with updated materials, please check here.)

Syllabus: SDS348_syllabus_spring2017.pdf

Lectures

1. Jan 17, 2017 – Introduction, R Markdown

2. Jan 19, 2017 – R review

3. Jan 24, 2017 – Data visualization with ggplot2

4. Jan 26, 2017 – Data visualization with ggplot2

5. Jan 31, 2017 – Working with tidy data

6. Feb 2, 2017 – Working with tidy data

7. Feb 7, 2017 – Working with tidy data

8. Feb 9, 2017 – Rearranging data tables with tidyr

9. Feb 14, 2017 – Principal Components Analysis (PCA)

10. Feb 16, 2017 – k-means clustering

11. Feb 21, 2017 – Binary prediction/logistic regression

12. Feb 23, 2017 – Sensitivity/Specificity, ROC curves

13. Feb 28, 2017 – Training and test data sets, cross-validation

14. Mar 2, 2017 – Installing and running python, basic data structures

15. Mar 7, 2017 – Control flow in python

16. Mar 9, 2017 – Functions in python

17. Mar 21, 2017 – More on python data structures, classes

18. Mar 23, 2017 – Working with files

19. Mar 28, 2017 – Introduction to Biopython

20. Mar 30, 2017 – Working with gene features and genomes

21. Apr 4, 2017 – Running queries on Entrez

22. Apr 6, 2017 – Regular expressions

23. Apr 11, 2017 – Using regular expressions to analyze data

24. Apr 13, 2017 – Using regular expressions to analyze data

25. Apr 18, 2017 – Aligning sequences

26. Apr 20, 2017 – Global and local alignments, BLAST

27. Apr 25, 2017 – Multiple sequence alignments and phylogenetic trees

28. Apr 27, 2017 – Working with protein structures

29. May 2, 2017 – Protein structure and PyMOL scripting

30. May 4, 2017 – Molecular evolution of proteins

  • Slides: class30.pdf
  • Relevant papers:
    • J. Echave, S. J. Spielman, C. O. Wilke (2016). Causes of evolutionary rate variation among protein sites. Nature Rev. Genet. 17:109–121. doi: 10.1038/nrg.2015.18
    • B. R. Jack, A. G. Meyer, J. Echave, C. O. Wilke (2016). Functional sites induce long-range evolutionary constraints in enzymes. PLOS Biol 14:e1002452. doi: 10.1371/journal.pbio.1002452
    • A. H. Kachroo, J. M. Laurent, C. M. Yellman, A. G. Meyer, C. O. Wilke, E. M. Marcotte (2015). Systematic humanization of yeast genes reveals conserved functions and genetic modularity. Science 348:921–925. doi: 10.1126/science.aaa0769
    • C. D. McWhite, A. G. Meyer, C. O. Wilke (2016). Sequence amplification via cell passaging creates spurious signals of positive adaptation in influenza virus H3N2 hemagglutinin. Virus Evol. 2:vew026. doi: 10.1093/ve/vew026

Homeworks

Each homework is due by 7:00pm on its due date. Homeworks must be submitted as PDF files on Canvas.

  • Homework 1: HW1.Rmd (due Jan 24, 2017)
  • Homework 2: HW2.Rmd (due Jan 31, 2017)
  • Homework 3: HW3.Rmd (due Feb 7, 2017)
  • Homework 4: HW4.Rmd (due Feb 14, 2017)
  • Homework 5: HW5.Rmd (due Feb 28, 2017)
  • Homework 6: HW6.Rmd (due Mar 7, 2017)
  • Homework 7: HW7.ipynb (due Mar 21, 2017)
  • Homework 8: HW8.ipynb (due Apr 4, 2017)
  • Homework 9: HW9.pdf (due Apr 11, 2017)
  • Homework 10: HW10.ipynb (due Apr 18, 2017)
  • Homework 11: HW11.ipynb (due Apr 25, 2017)

Labs

1. Jan. 18, 2017

2. Jan. 25, 2017

3. Feb. 1, 2017

4. Feb. 8, 2017

5. Feb. 15, 2017

6. Feb. 22, 2017

7. Mar. 1, 2017

8. Mar. 8, 2017

9. Mar. 22, 2017

10. Mar. 29, 2017

11. Apr. 5, 2017

12. Apr. 12, 2017

13. Apr. 19, 2017

14. Apr. 26, 2017

15. May 2, 2017

  • Lab worksheet:

Projects

Each project is due by 7:00pm on its due date. Projects must be submitted on Canvas, both as a PDF file and as source code (plus data where needed).