Considerations in data visualization

Author

Gabriel I. Cook

Published

October 30, 2024

Overview

The focus of this module is to raise considerations for creating data visualizations that tell a story about data. Anyone can code a bar plot, scatterplot, line plot, or any other visualization, whether simple or complex. Out of the box, {ggplot2} allows you to produce some reasonable plots, but the library does not evaluate items like:

  • the appropriateness of your plot to the specific story you are trying to tell,
  • the appropriateness of the color scheme used for a client project, or
  • the appropriateness of the colors for your client’s internal team or their public audience.

Function libraries also do not consider whether your plot takes on perceptual properties that may bias your readership.

Issues arise when you accept the default behavior of functions and produce plots containing elements that are inappropriate for a specific audience or perceptually problematic for human viewers in general. You should not consider a coded plot complete until you have considered its elements thoroughly. This module introduces some of these issues.
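As a rough illustration of the point about defaults, the sketch below builds one scatterplot twice: once with the {ggplot2} default discrete colors and once with colors chosen deliberately. The `mpg` data set ships with {ggplot2}. The object names (`p_default`, `p_revised`, `okabe_ito`) and the axis labels are assumptions made for this example only, and the Okabe-Ito colors are just one commonly used colorblind-friendly option, not a recommendation for any particular client or audience.

```r
library(ggplot2)

# Default plot: ggplot2 chooses the discrete color palette for you
p_default <- ggplot(mpg, aes(x = displ, y = hwy, color = drv)) +
  geom_point()

# Three colors from the Okabe-Ito palette (orange, sky blue, bluish green),
# which was designed to stay distinguishable under common forms of
# color vision deficiency. One color per level of `drv`.
okabe_ito <- c("#E69F00", "#56B4E9", "#009E73")

# Revised plot: same data and geometry, but with explicit choices for
# colors, legend title, axis labels, and theme (all illustrative)
p_revised <- p_default +
  scale_color_manual(values = okabe_ito, name = "Drive train") +
  labs(x = "Engine displacement (L)", y = "Highway miles per gallon") +
  theme_minimal()

# Print each object to compare the two versions
p_default
p_revised
```

Neither version is correct in general. Whether the revised choices suit the story, the client, and the audience is exactly the kind of judgment the library cannot make for you.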

To Do

Prior to class

Complete the brief readings below. There is no coding video corresponding to this module.

Readings

The goal is to familiarize yourself with the material and bring questions to class. The online readings are taken from Fundamentals of Data Visualization, a great book by Claus Wilke, a professor of molecular evolution at The University of Texas at Austin.

In class

Class time will be allocated to discussing elements from the readings and addressing other elements relevant to building a successful project.

Session Info

sessionInfo()
R version 4.4.1 (2024-06-14 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 11 x64 (build 22631)

Matrix products: default


locale:
[1] LC_COLLATE=English_United States.utf8 
[2] LC_CTYPE=English_United States.utf8   
[3] LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.utf8    

time zone: America/Los_Angeles
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] htmltools_0.5.8.1 DT_0.33           vroom_1.6.5       lubridate_1.9.3  
 [5] forcats_1.0.0     stringr_1.5.1     dplyr_1.1.4       purrr_1.0.2      
 [9] readr_2.1.5       tidyr_1.3.1       tibble_3.2.1      ggplot2_3.5.1    
[13] tidyverse_2.0.0  

loaded via a namespace (and not attached):
 [1] bit_4.0.5         gtable_0.3.5      jsonlite_1.8.8    crayon_1.5.3     
 [5] compiler_4.4.1    tidyselect_1.2.1  scales_1.3.0      yaml_2.3.10      
 [9] fastmap_1.2.0     here_1.0.1        R6_2.5.1          generics_0.1.3   
[13] knitr_1.47        htmlwidgets_1.6.4 munsell_0.5.1     rprojroot_2.0.4  
[17] tzdb_0.4.0        pillar_1.9.0      R.utils_2.12.3    rlang_1.1.4      
[21] utf8_1.2.4        stringi_1.8.4     xfun_0.45         bit64_4.0.5      
[25] timechange_0.3.0  cli_3.6.3         withr_3.0.1       magrittr_2.0.3   
[29] digest_0.6.36     grid_4.4.1        rstudioapi_0.16.0 hms_1.1.3        
[33] lifecycle_1.0.4   R.methodsS3_1.8.2 R.oo_1.26.0       vctrs_0.6.5      
[37] evaluate_0.24.0   glue_1.7.0        fansi_1.0.6       colorspace_2.1-0 
[41] rmarkdown_2.27    tools_4.4.1       pkgconfig_2.0.3