class: center, middle, title-slide

.title[
# How to succeed in this class
]
.author[
### Claus O. Wilke
]
.date[
### last updated: 2022-11-22
]

---

## Syllabus

--

- Read the syllabus

--

- The syllabus is available on the edX platform

--

- There may be minor changes in the syllabus from semester to semester; in case of doubt, the version posted on edX is binding

---

## Course materials

--

- All slides and worksheets are also available at: https://wilkelab.org/DSC385/

--

- For the curious, the source materials for the slides are here: https://github.com/wilkelab/DSC385

--

- You can keep a local copy of the class site by downloading the `docs` folder and opening `index.html` in your browser: https://github.com/wilkelab/DSC385/tree/main/docs

--

- The class uses extensive materials from my book, available here: https://clauswilke.com/dataviz

---

## Homework and project submissions

--

- Make sure you actually submit your assignment

--

- Before you submit, verify that you have attached the right files

--

- Once submitted, you can no longer resubmit

--

- You will need to peer grade before receiving your own grade

--

- EdX calculates the overall grade as the median of the first *n* grades

---

## Assignment format for submissions

We ask that you submit two files:

--

1. A pdf document representing your submission

--

2. A zip file containing the source files (.Rmd file, data files, etc.) to reproduce the pdf

--

The pdf should be separate from the zip file, not included inside it, as that helps us to quickly look at an assignment.

--

Add some random letters or numbers to the file name: `project1_dlxdle.pdf`

--

**Do not** include your name or EID in the document or file name.
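---

## Assignment format for submissions

One way to generate the random part of the file name is a few lines of R. This is only a sketch: `random_file_name()` is a hypothetical helper, not part of the course materials, and the six-character suffix length is an arbitrary choice.

```r
# Hypothetical helper (not part of the course materials):
# appends a random suffix of lowercase letters and digits to a file stem.
random_file_name <- function(stem, ext = "pdf", n = 6) {
  chars <- c(letters, 0:9)  # pool of characters to draw from
  suffix <- paste(sample(chars, n, replace = TRUE), collapse = "")
  paste0(stem, "_", suffix, ".", ext)
}

# Returns a name of the form "project1_xxxxxx.pdf"
random_file_name("project1")
```

Each call draws a new suffix, so copy the result rather than calling the function again.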
---

## Best practices for code in assignments

--

- Stick to materials and concepts covered in class up to this point

--

- Do not use packages beyond those listed in the syllabus

--

- Your peers cannot be expected to understand materials not covered

--

- Add comments to your code

--

- Format your code according to a style guide: https://style.tidyverse.org/

---

## In case of problems with submission, grading, etc.

--

Please do the following:

- Send email to: <a href="mailto:onlinedataexploration@austin.utexas.edu">onlinedataexploration@austin.utexas.edu</a>

--

- Include your edX user name

--

- Attach any assignment files if applicable

--

<br>

Please **do not:**

- post on Piazza
- send email directly to instructor or TA
- try to message on edX
- try to message on Canvas

---

## Peer grading

--

- Peer grading is an important component of this class

--

- The main purpose is for you to engage with the work of your peers

--

- There are two types of graded assignments: homeworks and projects

---

## Peer grading of homeworks

--

- Homeworks are primarily graded on effort: Did the student make an honest attempt to complete the assignment?

--

- Homeworks **do not** have to match the example solution 1:1

--

- Please provide written comments justifying your grade

--

- Read the grading rubric!

--

- Project 1 is treated as a homework

---

## Peer grading of projects

--

- Projects have more complex rubrics. They differ somewhat among projects!
--

- Projects **do not** have to match the example solution 1:1

--

- Some projects do not have an example solution

--

- Please provide written comments justifying your grade

---

## Grading the reproducibility of a submission

--

- An assignment is reproducible if you can take the source files and reproduce the submitted pdf

--

- The following error is **not** a reproducibility problem:

    `pdflatex: command not found`

    You can resolve it by clicking the button to knit to HTML

--

- Assignments should only rely on the R packages listed as required in the syllabus

---

[//]: # "segment ends here"

---
class: center middle

## Making a reproducible example

---

## Making a reproducible example

--

- For questions on Piazza, please include a reproducible example (reprex)

--

- A reprex must include **all** code necessary to reproduce the problem

--

- A reprex should be as small as possible; delete what isn't needed

--

- It is always possible to create a good reprex, but sometimes it's a bit of work

---

## Include **all** code necessary to reproduce the problem

Example: Why does this code not run?
```r
mtcars %>%
  ggplot(aes(disp, mpg)) +
  geom_point()
```

--

Error obtained:

```
## Error in mtcars %>% ggplot(aes(disp, mpg)) :
##   could not find function "%>%"
```

---

## Include **all** code necessary to reproduce the problem

Complete code as run by student:

```r
library(ggplot2)

mtcars %>%
  ggplot(aes(disp, mpg)) +
  geom_point()
```

```
## Error in mtcars %>% ggplot(aes(disp, mpg)) :
##   could not find function "%>%"
```

---

## Include **all** code necessary to reproduce the problem

Complete code as run by student:

```r
library(ggplot2)

mtcars %>%
  ggplot(aes(disp, mpg)) +
  geom_point()
```

```
## Error in mtcars %>% ggplot(aes(disp, mpg)) :
##   could not find function "%>%"
```

Problem: Missing `library(tidyverse)`

---

## Include **all** code necessary to reproduce the problem

This code works:

```r
library(tidyverse)

mtcars %>%
  ggplot(aes(disp, mpg)) +
  geom_point()
```

.pull-left.width-35[
![](succeed-in-class_files/figure-html/mtcars_working-out-1.svg)<!-- -->
]

--

.pull-right.width-60[
The problem is often not where you are looking for it.
]

---

## Make up data if necessary

A common concern:
<br>
I cannot share my complete code. It would reveal my solution to the homework / assignment / etc.

--

<br>
Solution:
<br>
Prepare a simplified example with made-up data

---

## Make up data if necessary

Example: Why are the data points not colored?
.tiny-font.pull-left[
```r
library(tidyverse)

ggplot(iris) +
  aes(Sepal.Length, Sepal.Width) +
  aes(colr = Species) +
  geom_point(size = 2) +
  ggtitle("Measurements on iris flowers") +
  xlab("Sepal Length") +
  ylab("Sepal Width") +
  theme_bw()
```
]

.pull-right[
![](succeed-in-class_files/figure-html/iris_broken-out-1.svg)<!-- -->
]

---

## Make up data if necessary

Same example with made-up data:

.tiny-font.pull-left[
```r
library(tidyverse)

data <- tibble(
  x = 1:3,
  y = c(2, 1, 3),
  letter = c("A", "B", "C")
)

ggplot(data) +
  aes(x, y) +
  aes(colr = letter) + # problem is here
  geom_point()

# note: irrelevant code has been removed
```
]

.pull-right[
![](succeed-in-class_files/figure-html/madeup_broken-out-1.svg)<!-- -->
]

---

## Make up data if necessary

Same example with made-up data:

.tiny-font.pull-left[
```r
library(tidyverse)

data <- tibble(
  x = 1:3,
  y = c(2, 1, 3),
  letter = c("A", "B", "C")
)

ggplot(data) +
  aes(x, y) +
  aes(color = letter) + # now it's fixed
  geom_point()

# note: irrelevant code has been removed
```
]

.pull-right[
![](succeed-in-class_files/figure-html/madeup_fixed-out-1.svg)<!-- -->
]

---

## Closing thoughts about reproducible examples

--

- It can be a lot of effort to prepare a good, minimal example!

--

- You often end up solving the problem as you try to isolate it

--

- Use the **reprex** package to format your example: https://reprex.tidyverse.org/

--

- **reprex** works great with Piazza

--

- Copy and paste your actual code, not a screenshot; again, the **reprex** package helps

---

[//]: # "segment ends here"

---

## Further reading

- All slides and worksheets: [wilkelab.org/DSC385](https://wilkelab.org/DSC385/)
- GitHub repo of class materials: [https://github.com/wilkelab/DSC385](https://github.com/wilkelab/DSC385)
- **reprex** package: [https://reprex.tidyverse.org/](https://reprex.tidyverse.org/)
- Tidyverse style guide: [https://style.tidyverse.org/](https://style.tidyverse.org/)