class: center, middle, title-slide

.title[
# From Data to Visualization 1
]
.author[
### Claus O. Wilke
]
.date[
### last updated: 2022-06-26
]

---

## Topics covered

--

1. Visualizing amounts

--

2. Visualizing distributions

--

3. Visualizing associations and trends

---
class: middle center

## Visualizing amounts

---

## We often encounter datasets containing simple amounts

---

## We often encounter datasets containing simple amounts

Example: Highest-grossing movies, Dec. 2017

<br>

.center[
<table>
 <thead>
  <tr> <th style="text-align:right;"> rank </th> <th style="text-align:left;"> title </th> <th style="text-align:right;"> amount (million USD) </th> </tr>
 </thead>
 <tbody>
  <tr> <td style="text-align:right;"> 1 </td> <td style="text-align:left;"> Star Wars </td> <td style="text-align:right;"> 71.57 </td> </tr>
  <tr> <td style="text-align:right;"> 2 </td> <td style="text-align:left;"> Jumanji </td> <td style="text-align:right;"> 36.17 </td> </tr>
  <tr> <td style="text-align:right;"> 3 </td> <td style="text-align:left;"> Pitch Perfect 3 </td> <td style="text-align:right;"> 19.93 </td> </tr>
  <tr> <td style="text-align:right;"> 4 </td> <td style="text-align:left;"> Greatest Showman </td> <td style="text-align:right;"> 8.81 </td> </tr>
  <tr> <td style="text-align:right;"> 5 </td> <td style="text-align:left;"> Ferdinand </td> <td style="text-align:right;"> 7.32 </td> </tr>
 </tbody>
</table>
]

.tiny-font.absolute-bottom-right[
Data source: Box Office Mojo
]

---

## We can visualize amounts with bar plots

<br>

.center.move-up-1em[
![](s3://crabby-images/1e7dd/1e7dd94d8488fae7b88543921eacd86121dcf532)<!-- -->
]

---

## Bars can also run horizontally

<br>

.center.move-up-1em[
![](s3://crabby-images/39a65/39a65800d07de41c4c9938f34f75b005bcf49a13)<!-- -->
]

---

## Pay attention to the order of the bars

<br>

.center.move-up-1em[
![](s3://crabby-images/1ddfe/1ddfe710623f5b8fbe7e2bcfcda7179038850c44)<!-- -->
]

---

## Pay attention to the order of the bars

<br>

.center.move-up-1em[
![](s3://crabby-images/39a65/39a65800d07de41c4c9938f34f75b005bcf49a13)
]

---

## We can use dots instead of bars

<br>

.center.move-up-1em[
![](s3://crabby-images/ef530/ef53039fd8895817474136e83947dd85a1e6c7a6)<!-- -->
]

---

## Dots are preferable if we want to truncate the axes

.center.move-up-1em[
![](s3://crabby-images/515c1/515c1e0cc8e6c1784b2b5b92bab2069d68bafb46)<!-- -->
]

---

## Dots are preferable if we want to truncate the axes

.center.move-up-1em[
![](s3://crabby-images/0b66f/0b66f4cf7df8966ecedb48a0616cbe61a0610590)<!-- -->
]

.absolute-bottom-right[
bar lengths do<br>not accurately<br>represent the<br>data values
]

---

## Dots are preferable if we want to truncate the axes

.center.move-up-1em[
![](s3://crabby-images/451ad/451ad6dd0745b8d46c21f84f4c8369247ca6e3d1)<!-- -->
]

.absolute-bottom-right[
key features<br>of the data<br>are obscured
]

---

## Dots are preferable if we want to truncate the axes

.center.move-up-1em[
![](s3://crabby-images/515c1/515c1e0cc8e6c1784b2b5b92bab2069d68bafb46)
]

---

## We use grouped bars for higher-dimensional datasets

--

<br>

.center.move-up-1em[
![](s3://crabby-images/c2c30/c2c30528f419eacedd35d5b22cd661465b6a328a)<!-- -->
]

.absolute-bottom-right[
Data source: United States Census Bureau, 2016
]

---

## We are free to choose by which variable to group

<br>

.center[
![](s3://crabby-images/fc516/fc51638658fa8679a0dc762a18dd49526300a900)<!-- -->
]

.absolute-bottom-right[
Data source: United States Census Bureau, 2016
]

---

## We can also use multiple plot panels (facets)

.center[
![](s3://crabby-images/c9f83/c9f83604d52eb648f34c939eceb4fc7498faff96)<!-- -->
]

.absolute-bottom-right[
Data source: United States Census Bureau, 2016
]

---
class: middle center

## Visualizing distributions

---

## Passengers on the Titanic

.center.small-font[ <table> <thead> <tr> <th style="text-align:right;"> age </th> <th style="text-align:left;"> sex </th> <th style="text-align:left;"> class </th> <th style="text-align:left;"> survived </th> </tr> </thead> <tbody> <tr> <td style="text-align:right;"> 0.17 </td> <td style="text-align:left;"> female </td> <td style="text-align:left;"> 3rd </td> <td style="text-align:left;"> survived </td> </tr> <tr> <td style="text-align:right;"> 0.33 </td> <td style="text-align:left;"> male </td> <td style="text-align:left;"> 3rd </td> <td style="text-align:left;"> died </td> </tr> <tr> <td style="text-align:right;"> 0.80 </td> <td style="text-align:left;"> male </td> <td style="text-align:left;"> 2nd </td> <td style="text-align:left;"> survived </td> </tr> <tr> <td style="text-align:right;"> 0.83 </td> <td style="text-align:left;"> male </td> <td style="text-align:left;"> 2nd </td> <td style="text-align:left;"> survived </td> </tr> <tr> <td style="text-align:right;"> 0.83 </td> <td style="text-align:left;"> male </td> <td style="text-align:left;"> 3rd </td> <td style="text-align:left;"> survived </td> </tr> <tr> <td style="text-align:right;"> 0.92 </td> <td style="text-align:left;"> male </td> <td style="text-align:left;"> 1st </td> <td style="text-align:left;"> survived </td> </tr> <tr> <td style="text-align:right;"> 1.00 </td> <td style="text-align:left;"> female </td> <td style="text-align:left;"> 2nd </td> <td style="text-align:left;"> survived </td> </tr> <tr> <td style="text-align:right;"> 1.00 </td> <td style="text-align:left;"> female </td> <td style="text-align:left;"> 3rd </td> <td style="text-align:left;"> survived </td> </tr> <tr> <td style="text-align:right;"> 1.00 </td> <td style="text-align:left;"> male </td> <td style="text-align:left;"> 2nd </td> <td style="text-align:left;"> survived </td> </tr> <tr> <td style="text-align:right;"> 1.00 </td> <td style="text-align:left;"> male </td> <td style="text-align:left;"> 2nd </td> <td 
style="text-align:left;"> survived </td> </tr> </tbody> </table> <table> <thead> <tr> <th style="text-align:right;"> age </th> <th style="text-align:left;"> sex </th> <th style="text-align:left;"> class </th> <th style="text-align:left;"> survived </th> </tr> </thead> <tbody> <tr> <td style="text-align:right;"> 1.0 </td> <td style="text-align:left;"> male </td> <td style="text-align:left;"> 3rd </td> <td style="text-align:left;"> survived </td> </tr> <tr> <td style="text-align:right;"> 1.5 </td> <td style="text-align:left;"> female </td> <td style="text-align:left;"> 3rd </td> <td style="text-align:left;"> died </td> </tr> <tr> <td style="text-align:right;"> 1.5 </td> <td style="text-align:left;"> female </td> <td style="text-align:left;"> 3rd </td> <td style="text-align:left;"> died </td> </tr> <tr> <td style="text-align:right;"> 2.0 </td> <td style="text-align:left;"> female </td> <td style="text-align:left;"> 1st </td> <td style="text-align:left;"> died </td> </tr> <tr> <td style="text-align:right;"> 2.0 </td> <td style="text-align:left;"> female </td> <td style="text-align:left;"> 2nd </td> <td style="text-align:left;"> survived </td> </tr> <tr> <td style="text-align:right;"> 2.0 </td> <td style="text-align:left;"> female </td> <td style="text-align:left;"> 3rd </td> <td style="text-align:left;"> died </td> </tr> <tr> <td style="text-align:right;"> 2.0 </td> <td style="text-align:left;"> female </td> <td style="text-align:left;"> 3rd </td> <td style="text-align:left;"> died </td> </tr> <tr> <td style="text-align:right;"> 2.0 </td> <td style="text-align:left;"> male </td> <td style="text-align:left;"> 2nd </td> <td style="text-align:left;"> survived </td> </tr> <tr> <td style="text-align:right;"> 2.0 </td> <td style="text-align:left;"> male </td> <td style="text-align:left;"> 2nd </td> <td style="text-align:left;"> survived </td> </tr> <tr> <td style="text-align:right;"> 2.0 </td> <td style="text-align:left;"> male </td> <td style="text-align:left;"> 2nd </td> 
<td style="text-align:left;"> survived </td> </tr> </tbody> </table> <table> <thead> <tr> <th style="text-align:right;"> age </th> <th style="text-align:left;"> sex </th> <th style="text-align:left;"> class </th> <th style="text-align:left;"> survived </th> </tr> </thead> <tbody> <tr> <td style="text-align:right;"> 3 </td> <td style="text-align:left;"> female </td> <td style="text-align:left;"> 2nd </td> <td style="text-align:left;"> survived </td> </tr> <tr> <td style="text-align:right;"> 3 </td> <td style="text-align:left;"> female </td> <td style="text-align:left;"> 3rd </td> <td style="text-align:left;"> survived </td> </tr> <tr> <td style="text-align:right;"> 3 </td> <td style="text-align:left;"> male </td> <td style="text-align:left;"> 2nd </td> <td style="text-align:left;"> survived </td> </tr> <tr> <td style="text-align:right;"> 3 </td> <td style="text-align:left;"> male </td> <td style="text-align:left;"> 2nd </td> <td style="text-align:left;"> survived </td> </tr> <tr> <td style="text-align:right;"> 3 </td> <td style="text-align:left;"> male </td> <td style="text-align:left;"> 3rd </td> <td style="text-align:left;"> survived </td> </tr> <tr> <td style="text-align:right;"> 3 </td> <td style="text-align:left;"> male </td> <td style="text-align:left;"> 3rd </td> <td style="text-align:left;"> survived </td> </tr> <tr> <td style="text-align:right;"> 4 </td> <td style="text-align:left;"> female </td> <td style="text-align:left;"> 2nd </td> <td style="text-align:left;"> survived </td> </tr> <tr> <td style="text-align:right;"> 4 </td> <td style="text-align:left;"> female </td> <td style="text-align:left;"> 2nd </td> <td style="text-align:left;"> survived </td> </tr> <tr> <td style="text-align:right;"> 4 </td> <td style="text-align:left;"> female </td> <td style="text-align:left;"> 3rd </td> <td style="text-align:left;"> survived </td> </tr> <tr> <td style="text-align:right;"> 4 </td> <td style="text-align:left;"> female </td> <td style="text-align:left;"> 3rd 
</td> <td style="text-align:left;"> survived </td> </tr> </tbody> </table>
]

---

## Histogram: Define bins and count cases

.pull-left.small-font[
<table>
 <thead>
  <tr> <th style="text-align:left;"> age range </th> <th style="text-align:right;"> count </th> </tr>
 </thead>
 <tbody>
  <tr> <td style="text-align:left;"> 0–5 </td> <td style="text-align:right;"> 36 </td> </tr>
  <tr> <td style="text-align:left;"> 6–10 </td> <td style="text-align:right;"> 19 </td> </tr>
  <tr> <td style="text-align:left;"> 11–15 </td> <td style="text-align:right;"> 18 </td> </tr>
  <tr> <td style="text-align:left;"> 16–20 </td> <td style="text-align:right;"> 99 </td> </tr>
  <tr> <td style="text-align:left;"> 21–25 </td> <td style="text-align:right;"> 139 </td> </tr>
  <tr> <td style="text-align:left;"> 26–30 </td> <td style="text-align:right;"> 121 </td> </tr>
  <tr> <td style="text-align:left;"> 31–35 </td> <td style="text-align:right;"> 76 </td> </tr>
  <tr> <td style="text-align:left;"> 36–40 </td> <td style="text-align:right;"> 74 </td> </tr>
 </tbody>
</table>

<table>
 <thead>
  <tr> <th style="text-align:left;"> age range </th> <th style="text-align:right;"> count </th> </tr>
 </thead>
 <tbody>
  <tr> <td style="text-align:left;"> 41–45 </td> <td style="text-align:right;"> 54 </td> </tr>
  <tr> <td style="text-align:left;"> 46–50 </td> <td style="text-align:right;"> 50 </td> </tr>
  <tr> <td style="text-align:left;"> 51–55 </td> <td style="text-align:right;"> 26 </td> </tr>
  <tr> <td style="text-align:left;"> 56–60 </td> <td style="text-align:right;"> 22 </td> </tr>
  <tr> <td style="text-align:left;"> 61–65 </td> <td style="text-align:right;"> 16 </td> </tr>
  <tr> <td style="text-align:left;"> 66–70 </td> <td style="text-align:right;"> 3 </td> </tr>
  <tr> <td style="text-align:left;"> 71–75 </td> <td style="text-align:right;"> 3 </td> </tr>
  <tr> <td style="text-align:left;"> 76–80 </td> <td style="text-align:right;"> 0 </td> </tr>
 </tbody>
</table>
]

--

.pull-right[
![](s3://crabby-images/9ea6a/9ea6ac169b6ae75050cfd913eae577b8d555391a)<!-- -->
]

???

Figure redrawn from [Claus O. Wilke. Fundamentals of Data Visualization. O'Reilly, 2019.](https://clauswilke.com/dataviz)

---

## Histograms depend on the chosen bin width

.center[
![](s3://crabby-images/7ed21/7ed214fad7db7a5480cff75e8121167395a0b2ba)<!-- -->
]

???

Figure redrawn from [Claus O. Wilke. Fundamentals of Data Visualization. O'Reilly, 2019.](https://clauswilke.com/dataviz)

---

## Alternative to histogram: Kernel density estimate (KDE)

.pull-left[
![](s3://crabby-images/9cff7/9cff7c54a0a1e8f316ef27ee7ce94ac2da2b5a83)<!-- -->
]

--

.pull-right[
![](s3://crabby-images/88003/880033270744c1464c0b811d03f6d7535d6075b0)<!-- -->
]

--

Histograms show raw counts; KDEs show densities (total area = 1).

???

Figures redrawn from [Claus O. Wilke. Fundamentals of Data Visualization. O'Reilly, 2019.](https://clauswilke.com/dataviz)

---

## KDEs also depend on parameter settings

.center[
![](s3://crabby-images/3525b/3525ba4abe36c2271515a7be4b85ab132095e666)<!-- -->
]

???

Figure redrawn from [Claus O. Wilke. Fundamentals of Data Visualization. O'Reilly, 2019.](https://clauswilke.com/dataviz)

---

## Careful: Are bars stacked or overlapping?

.pull-left[
![](s3://crabby-images/9cf8a/9cf8a2e202c8b73fa51719629af558f7e3bf961e)<!-- -->
]

--

.pull-right[
![](s3://crabby-images/c46e4/c46e4129a3d74c0c718c5275582dcbb4c85cc0c1)<!-- -->
]

--

Stacked or overlapping histograms are rarely a good choice.

???

Figures redrawn from [Claus O. Wilke. Fundamentals of Data Visualization. O'Reilly, 2019.](https://clauswilke.com/dataviz)

---

## Alternatively: Age pyramid

.center[
![](s3://crabby-images/94588/945886a47e68d210efab79532c2501afc73f499f)<!-- -->
]

???

Figure redrawn from [Claus O. Wilke. Fundamentals of Data Visualization.
O'Reilly, 2019.](https://clauswilke.com/dataviz)

---

## What if we want to show more than two distributions?

.pull-left.small-font[
Mean temperatures in Lincoln, NE, in January 2016:

.center[
|date       | mean temp (°F)|
|:----------|--------------:|
|2016-01-01 |             24|
|2016-01-02 |             23|
|2016-01-03 |             23|
|2016-01-04 |             17|
|2016-01-05 |             29|
|2016-01-06 |             33|
|2016-01-07 |             30|
|2016-01-08 |             25|
|2016-01-09 |              9|
|2016-01-10 |             11|
|2016-01-11 |             28|
|2016-01-12 |             24|
|2016-01-13 |             33|
|2016-01-14 |             40|
|2016-01-15 |             29|
|2016-01-16 |             19|
|2016-01-17 |              5|
|2016-01-18 |             11|
|2016-01-19 |             22|
|2016-01-20 |             28|
|2016-01-21 |             25|
|2016-01-22 |             22|
|2016-01-23 |             28|
|2016-01-24 |             30|
|2016-01-25 |             26|
|2016-01-26 |             29|
|2016-01-27 |             33|
|2016-01-28 |             41|
|2016-01-29 |             41|
|2016-01-30 |             39|
|2016-01-31 |             35|
]]

--

.pull-right[
![](s3://crabby-images/98475/98475c03a58fa6a82d13142b28df89c095b0d3b4)<!-- -->
]

---

## What if we want to show more than two distributions?

.pull-left.small-font[
Mean temperatures in Lincoln, NE, in January 2016:

.center[
|date       | mean temp (°F)|
|:----------|--------------:|
|2016-01-01 |             24|
|2016-01-02 |             23|
|2016-01-03 |             23|
|2016-01-04 |             17|
|2016-01-05 |             29|
|2016-01-06 |             33|
|2016-01-07 |             30|
|2016-01-08 |             25|
|2016-01-09 |              9|
|2016-01-10 |             11|
|2016-01-11 |             28|
|2016-01-12 |             24|
|2016-01-13 |             33|
|2016-01-14 |             40|
|2016-01-15 |             29|
|2016-01-16 |             19|
|2016-01-17 |              5|
|2016-01-18 |             11|
|2016-01-19 |             22|
|2016-01-20 |             28|
|2016-01-21 |             25|
|2016-01-22 |             22|
|2016-01-23 |             28|
|2016-01-24 |             30|
|2016-01-25 |             26|
|2016-01-26 |             29|
|2016-01-27 |             33|
|2016-01-28 |             41|
|2016-01-29 |             41|
|2016-01-30 |             39|
|2016-01-31 |             35|
]]

.pull-right[
![](s3://crabby-images/dc9b7/dc9b7ffd4d49669fe4e38a7d7cdbcbff4b65baf0)<!-- -->

How can we compare distributions across months?
]

---

## A bad idea: Many overlapping density plots

.center[
![](s3://crabby-images/80a54/80a54fad4421d1a8830b205055222ef2e3197499)<!-- -->
]

---

## Another bad idea: Stacked density plots

.center[
![](s3://crabby-images/2fe68/2fe6802665f3459bf69ff35bc00685ee8390e4ac)<!-- -->
]

---

## Somewhat better: Small multiples

.center[
![](s3://crabby-images/53aab/53aab1be4cb1a6367adc393857f3d09200070475)<!-- -->
]

---

## Instead: Show values along y, conditions along x

.center[
![](s3://crabby-images/0b61d/0b61df9a0ee098cb69d2dfb91b0da8c33759cbfa)<!-- -->
]

???

Figure redrawn from [Claus O. Wilke. Fundamentals of Data Visualization. O'Reilly, 2019.](https://clauswilke.com/dataviz)

--

A boxplot is a crude way of visualizing a distribution.

---

## How to read a boxplot

.center[
![](s3://crabby-images/f0780/f0780fc57d35690d2900ac9eaa8bf835b775af89)<!-- -->
]

???

Figure redrawn from [Claus O. Wilke. Fundamentals of Data Visualization. O'Reilly, 2019.](https://clauswilke.com/dataviz)

---

## If you like density plots, consider violins

.center[
![](s3://crabby-images/39efa/39efa6c266b21b2c98eb1c6e6b13bef9d8301584)<!-- -->
]

???

Figure redrawn from [Claus O. Wilke. Fundamentals of Data Visualization. O'Reilly, 2019.](https://clauswilke.com/dataviz)

--

A violin plot is a density plot rotated 90 degrees and then mirrored.

---

## How to read a violin plot

.center[
![](s3://crabby-images/eee83/eee83b13a5fa942dfa0f505509689fc8aee88b57)<!-- -->
]

???

Figure redrawn from [Claus O. Wilke. Fundamentals of Data Visualization. O'Reilly, 2019.](https://clauswilke.com/dataviz)

---

## For small datasets, you can also use a strip chart

Advantage: Can see raw data points instead of abstract representation.

.center[
![](s3://crabby-images/afa7b/afa7b8907bb593f154e840cc9ead68a68f089865)<!-- -->
]

???

Figure redrawn from [Claus O. Wilke.
Fundamentals of Data Visualization. O'Reilly, 2019.](https://clauswilke.com/dataviz)

--

Horizontal jittering may be necessary to avoid overlapping points.

---

## For small datasets, you can also use a strip chart

Advantage: Can see raw data points instead of abstract representation.

.center[
![](s3://crabby-images/b9331/b9331d71791f3329aa502f630de555e4436d1083)<!-- -->
]

Horizontal jittering may be necessary to avoid overlapping points.

???

Figure redrawn from [Claus O. Wilke. Fundamentals of Data Visualization. O'Reilly, 2019.](https://clauswilke.com/dataviz)

---

## For small datasets, you can also use a strip chart

Advantage: Can see raw data points instead of abstract representation.

.center[
![](s3://crabby-images/1ad76/1ad7609a5906d716ea49af5d933999ba55190b82)<!-- -->
]

Horizontal jittering may be necessary to avoid overlapping points.

???

Figure redrawn from [Claus O. Wilke. Fundamentals of Data Visualization. O'Reilly, 2019.](https://clauswilke.com/dataviz)

---

## And the final option, a ridgeline plot

.center[
![](s3://crabby-images/00ae7/00ae76d554fd94ae375330f365aecd22b882709e)<!-- -->
]

???

Figure redrawn from [Claus O. Wilke. Fundamentals of Data Visualization. O'Reilly, 2019.](https://clauswilke.com/dataviz)

--

Notice the single fill color. More colors would be distracting.

---
class: middle center

## Visualizing associations and trends

---

## We visualize associations with scatter plots

.center[
![](s3://crabby-images/bee96/bee96f690876aa3f7ceeaf28c41a615fb56ad491)<!-- -->
]

???

Figure redrawn after [Claus O. Wilke. Fundamentals of Data Visualization. O'Reilly, 2019.](https://clauswilke.com/dataviz)

---

## We visualize associations with scatter plots

.center[
![](s3://crabby-images/e233d/e233df3d6a4f307fd91007a8e295fb005e80e029)<!-- -->
]

???

Figure redrawn after [Claus O. Wilke. Fundamentals of Data Visualization.
O'Reilly, 2019.](https://clauswilke.com/dataviz)

---

## Regression lines emphasize the overall trend

.center[
![](s3://crabby-images/00c0c/00c0c83526367132a1764ae28ac8c89b4e013133)<!-- -->
]

???

Figure redrawn after [Claus O. Wilke. Fundamentals of Data Visualization. O'Reilly, 2019.](https://clauswilke.com/dataviz)

---

## Regression lines emphasize the overall trend

.center[
![](s3://crabby-images/c86dc/c86dc113e83924c93d1abd26644c8d10aec8c659)<!-- -->
]

???

Figure redrawn after [Claus O. Wilke. Fundamentals of Data Visualization. O'Reilly, 2019.](https://clauswilke.com/dataviz)

---

## Regression lines emphasize the overall trend

.center[
![](s3://crabby-images/1d636/1d636df35ecdc4e36c5f95253898e3445578c0ad)<!-- -->
]

???

Figure redrawn after [Claus O. Wilke. Fundamentals of Data Visualization. O'Reilly, 2019.](https://clauswilke.com/dataviz)

---

## Regression lines emphasize the overall trend

.center[
![](s3://crabby-images/10d22/10d22c3747b2f67332270d061ab282c891e54a0c)<!-- -->
]

???

Figure redrawn after [Claus O. Wilke. Fundamentals of Data Visualization. O'Reilly, 2019.](https://clauswilke.com/dataviz)

---

## Regression lines emphasize the overall trend

.center[
![](s3://crabby-images/90343/903431536e49a6a4df1127a142ecf32b859e6e47)<!-- -->
]

???

Figure redrawn after [Claus O. Wilke. Fundamentals of Data Visualization. O'Reilly, 2019.](https://clauswilke.com/dataviz)

---

## Regression lines emphasize the overall trend

.center[
![](s3://crabby-images/4f70f/4f70f9841184e17fb52c184924aee5a998361687)<!-- -->
]

???

Figure redrawn after [Claus O. Wilke. Fundamentals of Data Visualization.
O'Reilly, 2019.](https://clauswilke.com/dataviz)

---

## Detrending: Removing the underlying trend

.center[
![](s3://crabby-images/d2084/d2084ede8fbf3c4461678c3e9db0f111cf0e9b26)<!-- -->
]

--

.small-font[
Did housing prices in California decline substantially from 1990 to 1998?
]

--

.small-font[
Did housing prices in West Virginia recover by 2017?
]

???

Figure redrawn after [Claus O. Wilke. Fundamentals of Data Visualization. O'Reilly, 2019.](https://clauswilke.com/dataviz)

---

## Detrending: Removing the underlying trend

.center[
![](s3://crabby-images/a6905/a690511ecae0836a95d31ac9ad95ddf665b96998)<!-- -->
]

.small-font[
Did housing prices in California decline substantially from 1990 to 1998?
]

.small-font[
Did housing prices in West Virginia recover by 2017?
]

???

Figure redrawn after [Claus O. Wilke. Fundamentals of Data Visualization. O'Reilly, 2019.](https://clauswilke.com/dataviz)

---

## Detrending: Removing the underlying trend

.center[
![](s3://crabby-images/cc750/cc7505ea362a924fc4aced1136c8e90719e3ecc9)<!-- -->
]

.small-font[
Did housing prices in California decline substantially from 1990 to 1998? Yes.
]

.small-font[
Did housing prices in West Virginia recover by 2017? No.
]

???

Figure redrawn after [Claus O. Wilke. Fundamentals of Data Visualization.
O'Reilly, 2019.](https://clauswilke.com/dataviz)

---

## Further reading

Relevant chapters from Fundamentals of Data Visualization:

- [Chapter 6: Visualizing amounts](https://clauswilke.com/dataviz/visualizing-amounts.html)
- [Chapter 7: Visualizing distributions: Histograms and density plots](https://clauswilke.com/dataviz/histograms-density-plots.html)
- [Chapter 9: Visualizing many distributions at once](https://clauswilke.com/dataviz/boxplots-violins.html)
- [Chapter 12: Visualizing associations among two or more quantitative variables](https://clauswilke.com/dataviz/visualizing-associations.html)
- [Chapter 14: Visualizing trends](https://clauswilke.com/dataviz/visualizing-trends.html)
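
---

## Appendix: Histogram and KDE steps as code

The two steps behind the histogram and KDE slides, "define bins and count cases" and "smooth the data into a curve with total area 1", can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the code used to draw the figures: the `ages` list is made up (it is not the Titanic data), the names `bin_counts` and `gaussian_kde` are hypothetical helpers, and the bins are half-open intervals such as [0, 5) and [5, 10), which differs slightly from the 0–5, 6–10 labels in the table.

```python
import math

def bin_counts(values, width):
    """Count values per bin; bin k covers the interval [k*width, (k+1)*width)."""
    counts = {}
    for v in values:
        k = int(v // width)
        counts[k] = counts.get(k, 0) + 1
    return dict(sorted(counts.items()))

def gaussian_kde(values, bandwidth):
    """Return a density function: the average of Gaussian bumps, one per value."""
    norm = len(values) * bandwidth * math.sqrt(2 * math.pi)
    def density(x):
        return sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values) / norm
    return density

# Made-up ages, for illustration only (not the Titanic data).
ages = [0.8, 2, 4, 19, 21, 22, 24, 25, 28, 30, 33, 36, 41, 47, 58]

counts = bin_counts(ages, width=5)          # histogram: raw counts per bin
density = gaussian_kde(ages, bandwidth=5)   # KDE: a smooth density curve

# Riemann-sum check over [-50, 150) that the density has total area close to 1.
step = 0.1
area = sum(density(-50 + i * step) * step for i in range(2000))
```

Changing `width` or `bandwidth` changes the resulting picture, which is why both histograms and KDEs depend on parameter settings.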