Title, Captions, and Tables

Author

Gabriel I. Cook

Published

October 30, 2024

Under construction.

This page is a work in progress and may contain areas that need more detail or that require syntactical, grammatical, and typographical changes. If you find a part that needs editing, please let me know so I can fix it.

Overview

Every data visualization has a purpose: to offer a visual representation of data. The visualization serves as an actor in the overall story of the data. Its effectiveness relies on its ability to convey a finding and communicate some important element or point to an audience. The type of visualization should be chosen purposefully based on ease of processing and interpretation. There are other actors, however. The title, figure caption, annotations, labels, and legend are also important. This messaging plays an important role in communicating the meaning of the data and in reducing misinterpretation.

The title plays a role in dressing up the plot and contextualizing it. By dressing up, or accessorizing, the plot, we do not mean improving the aesthetics of the visualization but rather leaving no ambiguity as to its message and intent. Consider the following analogy. A button-up long-sleeved shirt paired with slacks communicates some information about where a person might be going to or coming from. A tie, coat, belt, and shoes will contextualize the scene quite differently. A bow tie, coat, cummerbund, and black shoes will set yet another group of expectations. There are few places one will go dressed in a tuxedo. The same is true for data visualizations. The data presented in the form of bars, lines, points, tiles, or otherwise should serve to communicate something meaningful. Proper accessorizing with a clear title, subtitle, and relevant legend or caption will contextualize the plot and leave little ambiguity about its intent. This module is about contextualizing plots with accompanying titles, captions, and other annotations that guide the reader to extract particularly relevant information.

To Do

Readings

External Functions

Provided in class:

view_html(): for viewing data frames in html format, from /src/my_functions.R

You can use this in your own workspace, but I am having trouble rendering its output on the website, so I'll default to print() on occasion.

R.utils::sourceDirectory(here::here("src", "functions"))

Libraries

  • {dplyr} 1.1.4: for selecting, filtering, and mutating
  • {ggplot2} 3.5.1: for plotting
  • {geomtextpath} 0.1.4: for annotation of curved paths (also straight)
  • {ggtext} 0.1.2: for text on plots; markdown elements (viz, element_markdown)
  • {forcats} 1.0.0: for reordering factors

Loading Libraries

library(dplyr)
library(magrittr)

Attaching package: 'magrittr'
The following object is masked from 'package:purrr':

    set_names
The following object is masked from 'package:tidyr':

    extract
library(ggplot2)
library(geomtextpath)
library(ggtext)

Titles

Visualizations should contain titles. We have used functions to add words to plots in places where titles would go, but until now we have not addressed the significance of a title or how to create one.

The title is used to precisely and accurately communicate the main point of the visualization. After all, the visualization was created and selected for a particular purpose. What is that purpose? Make that purpose clear in your title. Are you communicating key differences across groups? If so, communicate them. Are you illustrating a trend or association? If so, make that clear. Are you showcasing similarities in trends or in shapes of distributions? If so, highlight them in your title. Are you differentiating trends, associations, or distribution shapes across groups? If so, direct your audience to examine those differences in the title.

Remember, you should not be selecting geoms and creating data visualizations that look aesthetically pleasing without considering whether that visualization is the best one for your communication goals. Whatever key finding you are trying to communicate, you should make it a part of your title. If you are communicating differences in average performance or in the range of data, is a sina plot necessary? If you are communicating differences in shapes of distributions, is a point-range plot or a bar plot a good choice? The geometric choice and the title need to match the goal of the data being visualized.

Now that you have a title, you have to determine where to place it. The location of the title is independent of its intent. Yes, titles are often positioned above the graphical representation of data. Perhaps they are centered or even left or right justified. Data visualizations that appear on webpages, in newspapers or magazines, or serve as isolated infographics (without explanatory text) will typically contain titles in these locations. Titles may also appear inside the plot space or be positioned below the plot in a caption position, as you might find in books, journals, or scientific pieces. The title location does not make one visualization more professional than another; location may simply align with the formatting standards of the medium within which you place the visualization.

For visualizations appearing in reports, papers, journals, etc., for which a figure caption is relevant, place the title at the start of the caption, before the descriptive text. If the title is placed neither above the plot nor in the caption, the visualization will not have a title. And yes, visualizations need titles.

Finally, follow the title with a clearer and more detailed description of the data presented. If the title highlights the key difference, the caption can address less significant but nevertheless relevant nuances. If there are no other meaningful messages, you can be more detailed about the main message.

A plot with no title

Adding the title

By now, you know how to add a title. One way is with labs().

my_title <- "A clear title of the main point"

plot +
  labs(title = my_title)
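The chunks in this module assume an existing ggplot object named plot. As a minimal, self-contained sketch, the following builds a stand-in plot from the built-in mtcars data and adds a title with labs(). The data, the aesthetics, and the title text are assumptions for illustration only; they are not the data shown on this page.

```r
library(ggplot2)

# Hypothetical stand-in for `plot`, built from the built-in mtcars data
plot <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

# State the main point of the visualization in the title
my_title <- "Heavier cars travel fewer miles per gallon"

titled_plot <- plot +
  labs(title = my_title)

titled_plot
```

Storing the title in an object such as my_title keeps the text easy to reuse across several versions of the same plot.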

Making a title font bold

plot +
  labs(title = my_title) +
  # plot.title targets only the plot title; `title` would also restyle axis and legend titles
  theme(plot.title = element_text(face = "bold"))

Wrapping long titles

Inserting a new line using \n

One way to deal with long titles is to break up the text

my_title <- "A very, very, very, very, very, \nvery, very, very, very, very, very, \nvery, very, very, very, very, very, \nvery, long title"

plot +
  labs(title = my_title) +
  # element_text() honors the \n line breaks in the title string
  theme(plot.title = element_text(face = "bold"))

Inserting breaks in HTML using <br>

If you have an HTML formatted title, you will need to use breaks rather than new lines.

my_html_title <- '<span style="color:blue">A very, very, very, very, very, very, <br>very, very, very, very, very, very, very, very, <br>very, very, very, very, <br>long title</span>'

plot +
  labs(title = my_html_title) +
  theme(plot.title = element_markdown(face = "bold"))

Wrapping strings using stringr::str_wrap()

If you prefer not to find places to insert a new line or a break yourself, use str_wrap() from {stringr}. When you set width to a value representing a number of characters, str_wrap() will break up the string into pieces that do not exceed that width. You will want to adjust the width so that it is appropriate for your plot output dimensions. More on saving plots of given dimensions later.

plot +
  labs(title = stringr::str_wrap(my_title, width = 20))
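To see what str_wrap() does to the string before it reaches the plot, you can inspect the result directly. This is a small sketch using a made-up title; the width of 20 is only an example value.

```r
library(stringr)

my_title <- "A very, very, very, very, very, very, very, very, very, long title"

# Break the string at whitespace, targeting lines of about 20 characters
wrapped_title <- str_wrap(my_title, width = 20)

# cat() prints the string with its line breaks rendered
cat(wrapped_title)

# Number of characters on each resulting line
line_lengths <- nchar(strsplit(wrapped_title, "\n", fixed = TRUE)[[1]])
line_lengths
```

Because the breaks are ordinary \n characters, the wrapped string behaves just like a title with manually inserted new lines.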

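Although this approach is fine with element_text(), it seems compromised with element_markdown(), which appears to treat the \n characters inserted by str_wrap() as ordinary whitespace rather than as line breaks. Markdown and HTML text needs <br> for a break instead.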

plot +
  labs(title = stringr::str_wrap(my_title, width = 20)) +
  theme(plot.title = element_markdown(face = "bold"))
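One possible workaround, offered as a sketch rather than a definitive fix, is to wrap the string first and then replace each \n with <br> so that element_markdown() receives breaks it understands. The stand-in plot and title below are assumptions for illustration.

```r
library(ggplot2)
library(ggtext)

# Hypothetical stand-in for `plot`
plot <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

my_title <- "A very, very, very, very, very, very, very, very, very, long title"

# Step 1: wrap at whitespace; Step 2: convert new lines to HTML breaks
wrapped_title <- stringr::str_wrap(my_title, width = 20)
md_title <- gsub("\n", "<br>", wrapped_title, fixed = TRUE)

plot +
  labs(title = md_title) +
  theme(plot.title = element_markdown(face = "bold"))
```

Note that if the string contains HTML tags, str_wrap() counts the tag characters toward the width, so the visible lines may come out shorter than the width suggests.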

And element_markdown() is how you process the HTML code.

my_html_title <- '<span style="color:blue">A very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, long title</span>'

plot +
  labs(title = stringr::str_wrap(my_html_title,  width = 40)) +
  theme(plot.title = element_markdown(face = "bold"))

If you are passing HTML code to the text (e.g., you are using direct labeling of color), you may need to use the

Figure Caption

The figure caption is where you communicate a more detailed description of the data than what the title communicates alone.

my_title <- "Range of Distance is not always Indicative of Maximum Distance"

my_caption <- "Minimum and maximum distances illustrated by line length are complemented with median distance \nrepresented by a point. Within each quartile of performance based on maximum distance thrown, \nathletes with small ranges or with lower median distance are not always those throwing greater distances."

plot +
  labs(title = my_title, 
       caption = my_caption
       )

You can see that, by default, the figure caption is justified to the right.

Adjusting horizontal justification

Right justification

plot +
  labs(title = my_title, caption = my_caption) +   
  theme(plot.caption = element_text(hjust = 1))

Centered

plot +
  labs(title = my_title, caption = my_caption) +   
  theme(plot.caption = element_text(hjust = .5))

Left justification

plot +
  labs(title = my_title, caption = my_caption) +   
  theme(plot.caption = element_text(hjust = 0))

This looks about right for the caption.

Titles in a Figure Caption

my_full_caption <- "Figure x: Range of distance is not always indicative of maximum distance. Minimum and maximum distances \nillustrated by line length are complemented with median distance represented by a point. Within each \nquartile of performance based on maximum distance thrown, athletes with small ranges or with lower \nmedian distance are not always those throwing greater distances."

plot +
  labs(col = "Percentile", 
       title = NULL,                 # make sure to remove title 
       x = NULL,
       y = "Distance (m)",
       caption = my_full_caption
       ) +
  theme(plot.title = element_text(face = "bold"), 
        plot.caption = element_text(hjust = 0)
        )
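If you want the title portion of the caption to stand out from the description, one option is to write the caption as markdown and render it with element_markdown() from {ggtext}. The sketch below is an assumption about how you might do this, not the approach used for the figures on this page; the stand-in plot is hypothetical, and line breaks are written as <br> because the caption is processed as markdown.

```r
library(ggplot2)
library(ggtext)

# Hypothetical stand-in for `plot`
plot <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

# ** ** marks the title portion as bold; <br> sets the line breaks
my_md_caption <- paste0(
  "**Figure x: Range of distance is not always indicative of maximum distance.**<br>",
  "Minimum and maximum distances illustrated by line length are complemented with<br>",
  "median distance represented by a point."
)

plot +
  labs(title = NULL,                 # the title lives in the caption
       caption = my_md_caption
       ) +
  theme(plot.caption = element_markdown(hjust = 0))
```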

This looks good for the title and the caption.

Changing Fonts

Modifying fonts is achieved by changing the family element of element_text() for each plot component. We can do this for the title, subtitle, and axis components.

Fonts with {ragg} device

There are a few ways to manage custom fonts. You can use {showtext} to download custom font binaries and access them. Managing fonts, however, is beyond the scope of this topic.

A good option is to use {ragg}, which provides graphic devices for R based on the AGG (anti-grain geometry) library. Why use the AGG device?

  • You have direct access to all system fonts
  • It supports advanced text rendering, (e.g., right-to-left text, emojis, etc.)
  • It offers high-quality anti-aliasing and rotated text
  • It supports 16-bit output
  • It's system independent, which is great for collaborating with others using Mac, Windows, and Linux operating systems
  • It is fast

Setting the {ragg} graphics device

  • First, install the {ragg} library.

  • In RStudio, navigate to the Tools menu item and then navigate the drop down to Global Options. Click the Graphics tab on the top and set your Graphics Device back end to AGG.

  • Load the library in your R Markdown file

  • You can also set the device in your global option settings for your R Markdown file by adding this before your first code block: knitr::opts_chunk$set(dev = "ragg_png"). You can make changes within a code block too but I don’t know why you would want to use different devices.

Changing fonts in the theme

Within theme(), you can set the fonts for different elements. Here we change the title, subtitle, axis titles, and axis text.
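As a minimal sketch of this idea, the following sets the family argument of element_text() for each of those components. The stand-in plot and labels are assumptions for illustration. The generic families "serif", "sans", and "mono" are used because they should be available on any graphics device; with the AGG device you could substitute the name of a font installed on your system.

```r
library(ggplot2)

# Hypothetical stand-in plot with a title and subtitle
plot <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(title = "Heavier cars travel fewer miles per gallon",
       subtitle = "Motor Trend road tests",
       x = "Weight (1000 lbs)",
       y = "Miles per gallon"
       )

new_plot <- plot +
  theme(plot.title    = element_text(family = "serif", face = "bold"),
        plot.subtitle = element_text(family = "serif"),
        axis.title    = element_text(family = "sans"),
        axis.text     = element_text(family = "mono")
        )

new_plot
```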

Saving your plot

There are different ways to save plots, but using ggsave() may be the easiest. By default, ggsave() chooses a graphics device based on the file extension of filename. You can also specify the device explicitly with the device argument of ggsave().

  • device = ragg::agg_png
ggsave(filename = here::here("figs", "my_plot.png"), 
       plot = new_plot,              # last_plot(),  the default is the last plot 
       device = ragg::agg_png,
       dpi = 320                     # 320 retina, 300 is fine 
       )
Saving 7 x 5 in image
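The "Saving 7 x 5 in image" message appears because no dimensions were supplied, so ggsave() fell back on the size of the current graphics device. A sketch of a more reproducible call sets width, height, and units explicitly. The stand-in plot and the temporary output path below are assumptions for illustration; in a project you would keep the here::here("figs", ...) path.

```r
library(ggplot2)

# Hypothetical stand-in for `new_plot`
new_plot <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

# Temporary location used only so this sketch runs anywhere
out_file <- file.path(tempdir(), "my_plot.png")

ggsave(filename = out_file,
       plot = new_plot,
       device = ragg::agg_png,
       width = 7,                    # explicit size, so the output does not
       height = 5,                   # depend on the current device
       units = "in",
       dpi = 320
       )

file.exists(out_file)
```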

Loading saved plots in R Markdown

If you have saved plots that you wish to call into your R Markdown file, the easiest way will likely be to use knitr::include_graphics(). Using {here} to locate the project root and the /figs directory within it, we can build the full path to the file.

  • knitr::include_graphics(path = here::here("figs", "my_plot.png"))
knitr::include_graphics(path = here::here("figs", "my_plot.png"),
                        dpi = 320
                        )

By default, the alignment will be left. You can change the alignment in the {r} code chunk by setting fig.align to fig.align = 'center' or fig.align = 'right'.

Example:

{r fig.align = 'center'}
knitr::include_graphics(path = here::here("figs", "my_plot.png"),
                        dpi = 320
                        )

To change the size you can change:

  • fig.width = 7.7 or some other number of inches
  • fig.height = 6
  • out.width="50%" or some other percent
  • out.height="50%"

Example:

{r fig.align="center", out.width="90%"}
knitr::include_graphics(path = here::here("figs", "my_plot.png"),
                        dpi = 320
                        )

Example:

{r fig.align="center", fig.width = 7.78}
knitr::include_graphics(path = here::here("figs", "my_plot.png"),
                        dpi = 320
                        )

If you wish to add R code at the top of your R Markdown file, you can set the options for all chunks using knitr::opts_chunk$set(). This way, all plots will take the same settings unless the specific code block in which a plot is rendered defines them otherwise.

For example:

knitr::opts_chunk$set(
 fig.width = 6,
 fig.asp = 0.8,
 out.width = "80%"
)

Plot file formats

You may not understand all of the differences between image file formats, but you are likely familiar with image file extensions like .jpg (Joint Photographic Experts Group), .png (Portable Network Graphics), .pdf (Portable Document Format), or XML-based scalable vector graphics files like .svg.

You can read more in Wilke’s Data Visualization book or in Peng’s Exploratory Data Analysis book.

  • Raster: Constructed by a series of pixels (e.g., JPEG, GIF, and PNG)

  • Vector: Constructed using proportional formulas rather than pixels (e.g., EPS, AI, and PDF); they are great when an image needs to be resized, for example a logo that would appear on both a business card and a billboard

Vector formats like .pdf and .svg may be good for line drawings and solid colors (bars), but they are less familiar to some people, and you may not be able to load them everywhere you want to place a figure.

Raster or bitmap formats like .png are generally good for visualizing many data points and for embedding on web pages. This is likely your go-to. Whatever you do, don't save your plot image files as .jpg. Your best option will be to use .png for its portability. You could use .pdf and convert the files as needed, but that will require other steps. I recommend just saving a .png file using ragg::agg_png.
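To compare the two families of formats yourself, you can save the same plot once as a raster file and once as a vector file. This sketch relies on ggsave() picking the device from the file extension; the stand-in plot and the temporary file paths are assumptions for illustration.

```r
library(ggplot2)

# Hypothetical stand-in for `new_plot`
new_plot <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

png_file <- file.path(tempdir(), "my_plot.png")   # raster
pdf_file <- file.path(tempdir(), "my_plot.pdf")   # vector

# dpi matters for the raster file because it fixes the number of pixels
ggsave(filename = png_file, plot = new_plot, width = 7, height = 5, dpi = 320)

# The vector file stores shapes, so it can be resized without pixelation
ggsave(filename = pdf_file, plot = new_plot, width = 7, height = 5)

file.exists(c(png_file, pdf_file))
```

Zooming in on each file is a quick way to see the difference: the .png eventually shows pixels, whereas the .pdf stays sharp.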

Session Info

R version 4.4.1 (2024-06-14 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 11 x64 (build 22631)

Matrix products: default


locale:
[1] LC_COLLATE=English_United States.utf8 
[2] LC_CTYPE=English_United States.utf8   
[3] LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.utf8    

time zone: America/Los_Angeles
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] ggtext_0.1.2       geomtextpath_0.1.4 magrittr_2.0.3     htmltools_0.5.8.1 
 [5] DT_0.33            vroom_1.6.5        lubridate_1.9.3    forcats_1.0.0     
 [9] stringr_1.5.1      dplyr_1.1.4        purrr_1.0.2        readr_2.1.5       
[13] tidyr_1.3.1        tibble_3.2.1       ggplot2_3.5.1      tidyverse_2.0.0   

loaded via a namespace (and not attached):
 [1] gld_2.6.6         gtable_0.3.5      xfun_0.45         htmlwidgets_1.6.4
 [5] lattice_0.22-6    monochromeR_0.2.0 tzdb_0.4.0        vctrs_0.6.5      
 [9] tools_4.4.1       generics_0.1.3    proxy_0.4-27      fansi_1.0.6      
[13] pkgconfig_2.0.3   R.oo_1.26.0       Matrix_1.7-0      data.table_1.15.4
[17] readxl_1.4.3      rootSolve_1.8.2.4 lifecycle_1.0.4   farver_2.1.2     
[21] compiler_4.4.1    textshaping_0.4.0 Exact_3.2         munsell_0.5.1    
[25] class_7.3-22      DescTools_0.99.54 yaml_2.3.10       pillar_1.9.0     
[29] crayon_1.5.3      MASS_7.3-60.2     R.utils_2.12.3    boot_1.3-30      
[33] commonmark_1.9.1  tidyselect_1.2.1  digest_0.6.36     mvtnorm_1.2-5    
[37] stringi_1.8.4     rprojroot_2.0.4   fastmap_1.2.0     grid_4.4.1       
[41] expm_0.999-9      here_1.0.1        colorspace_2.1-0  lmom_3.0         
[45] cli_3.6.3         utf8_1.2.4        e1071_1.7-14      withr_3.0.1      
[49] scales_1.3.0      bit64_4.0.5       timechange_0.3.0  httr_1.4.7       
[53] rmarkdown_2.27    bit_4.0.5         cellranger_1.1.0  ragg_1.3.2       
[57] png_0.1-8         moments_0.14.1    R.methodsS3_1.8.2 hms_1.1.3        
[61] evaluate_0.24.0   knitr_1.47        markdown_1.13     rlang_1.1.4      
[65] gridtext_0.1.5    Rcpp_1.0.12       glue_1.7.0        xml2_1.3.6       
[69] rstudioapi_0.16.0 jsonlite_1.8.8    R6_2.5.1          systemfonts_1.1.0