This is the home page for class SDS 348, Computational Biology and Bioinformatics. All relevant course materials will be posted here.

(For the latest version of this class, with updated materials, please check here.)

Syllabus: SDS348_syllabus_spring2016.pdf

Lectures

1. Jan 19, 2016 – Introduction, R Markdown

2. Jan 21, 2016 – R review

3. Jan 26, 2016 – Data visualization with ggplot2

4. Jan 28, 2016 – Data visualization with ggplot2

5. Feb 2, 2016 – Working with tidy data

6. Feb 4, 2016 – Working with tidy data

7. Feb 9, 2016 – Working with tidy data

8. Feb 11, 2016 – Rearranging data tables with tidyr

9. Feb 16, 2016 – Principal Components Analysis (PCA)

10. Feb 18, 2016 – k-means clustering

11. Feb 23, 2016 – Binary prediction/logistic regression

12. Feb 25, 2016 – Guest lecture: Epidemiology

Guest lecture by Steve Bellan

13. Mar 1, 2016 – Sensitivity/Specificity, ROC curves

14. Mar 3, 2016 – Training and test data sets, cross-validation

15. Mar 8, 2016 – Installing and running Python, basic data structures

16. Mar 10, 2016 – Control flow in Python

17. Mar 22, 2016 – Functions in Python

18. Mar 24, 2016 – More on Python data structures, classes

19. Mar 29, 2016 – Working with files

20. Mar 31, 2016 – Introduction to Biopython

21. Apr 5, 2016 – Working with gene features and genomes

22. Apr 7, 2016 – Running queries on Entrez

23. Apr 12, 2016 – Regular expressions

24. Apr 14, 2016 – Using regular expressions to analyze data

25. Apr 19, 2016 – Aligning sequences

26. Apr 21, 2016 – Global and local alignments, BLAST

27. Apr 26, 2016 – Multiple sequence alignments and phylogenetic trees

28. Apr 28, 2016 – Working with protein structures

29. May 3, 2016 – Protein structure and PyMOL scripting

30. May 5, 2016 – Molecular evolution of proteins

  • Slides: class30.pdf
  • Relevant papers:
    • B. R. Jack, A. G. Meyer, J. Echave, C. O. Wilke (2016). Functional sites induce long-range evolutionary constraints in enzymes. PLOS Biol 14:e1002452. doi: 10.1371/journal.pbio.1002452
    • A. H. Kachroo, J. M. Laurent, C. M. Yellman, A. G. Meyer, C. O. Wilke, E. M. Marcotte (2015). Systematic humanization of yeast genes reveals conserved functions and genetic modularity. Science 348:921–925. doi: 10.1126/science.aaa0769
    • S. A. Kerr, E. L. Jackson, O. I. Lungu, A. G. Meyer, A. Demogines, A. D. Ellington, G. Georgiou, C. O. Wilke, S. L. Sawyer (2015). Computational and functional analysis of the virus-receptor interface reveals host range trade-offs in New World arenaviruses. J. Virol. 89:11643–11653. doi: 10.1128/JVI.01408-15

Homeworks

All homeworks are due by 11:59pm on the due date listed below. Homeworks must be submitted as PDF files on Canvas.

  • Homework 1: HW1.Rmd (due Jan 26, 2016)
  • Homework 2: HW2.Rmd (due Feb 2, 2016)
  • Homework 3: HW3.Rmd (due Feb 9, 2016)
    • This homework requires the gapminder package. Install it by running install.packages('gapminder') in the R console.
  • Homework 4: HW4.Rmd (due Feb 16, 2016)
  • Homework 5: HW5.Rmd (due Mar 1, 2016)
  • Homework 6: HW6.Rmd (due Mar 8, 2016)
  • Homework 7: HW7.ipynb (due Mar 22, 2016)
  • Homework 8: HW8.ipynb (due Apr 5, 2016)
  • Homework 9: HW9.pdf (due Apr 12, 2016)
  • Homework 10: HW10.ipynb (due Apr 19, 2016)
  • Homework 11: HW11.ipynb (Problem 2 due Apr 26, 2016, Problem 1 due Apr 27, 2016)

Projects

All projects are due by 11:59pm on their due date. Projects must be submitted on Canvas, both in PDF format and as source code (plus data where needed).

Labs

1. Jan 20, 2016

2. Jan 27, 2016

3. Feb 3, 2016

4. Feb 10, 2016

5. Feb 17, 2016

6. Feb 24, 2016

7. Mar 2, 2016

8. Mar 9, 2016

9. Mar 23, 2016

10. Mar 30, 2016

11. Apr 6, 2016

  • Complete the in-class worksheet from Class 21, including the “If that was easy…” questions.

12. Apr 13, 2016

13. Apr 20, 2016

14. Apr 27, 2016

15. May 4, 2016

  • Lab worksheet:
  • Running a script inside qtconsole: IPython %run
  • PyCharm, a Python integrated development environment (IDE): PyCharm