R.utils::sourceDirectory(here::here("src", "functions"))
Title, Captions, and Tables
This page is a work in progress and may contain areas that need more detail or that require syntactical, grammatical, and typographical changes. If you find some part requiring editing, please let me know so I can fix it.
Overview
Every data visualization has a purpose: to offer a visual representation of data. The visualization serves as an actor in the overall story of the data. Its effectiveness relies on its ability to convey a finding and communicate some important point to an audience. The type of visualization should be chosen purposefully, based on ease of processing and interpretation. There are other actors, however. The title, figure caption, annotations, labels, and legend are also important. This messaging plays an important role in communicating the meaning of the data and in reducing misinterpretation.
The title will play a role in dressing up the plot and contextualizing it. By dressing up, or accessorizing, the plot, we do not mean improving the aesthetics of the visualization but rather leaving no ambiguity as to its message and intent. Consider the following analogy. A button-up long-sleeved shirt paired with slacks communicates some information about where a person might be going to or coming from. A tie, coat, belt, and shoes will contextualize the scene quite differently. Further still, a bow tie, coat, cummerbund, and black shoes will set a different group of expectations. There are few places one will go dressed in a tuxedo. The same is true for data visualizations. The data presented in the form of bars, lines, points, tiles, or otherwise should serve to communicate something meaningful. Proper accessorizing with a clear title, subtitle, and relevant legend or caption will contextualize the plot and leave little ambiguity about its intent. This module is about contextualizing plots with accompanying titles, captions, and other annotations that guide the reader to the particularly relevant information.
To Do
Readings
External Functions
Provided in class:
view_html(): for viewing data frames in html format, from /src/my_functions.R
You can use this in your own workspace, but I am having a challenge rendering it on the website, so I’ll default to print() on occasion.
Libraries
- {dplyr} 1.1.4: for selecting, filtering, and mutating
- {ggplot2} 3.5.1: for plotting
- {geomtextpath} 0.1.4: for annotation of curved paths (also straight)
- {ggtext} 0.1.2: for text on plots; markdown elements (viz, element_markdown)
- {forcats} 1.0.0: for reordering factors
Loading Libraries
library(dplyr)
library(magrittr)
Attaching package: 'magrittr'
The following object is masked from 'package:purrr':
set_names
The following object is masked from 'package:tidyr':
extract
library(ggplot2)
library(geomtextpath)
library(ggtext)
Titles
Visualizations should contain titles. We have used functions to add words to plots in places where titles would go, but until now we have not addressed the significance of a title or how to create one.
The title is used to precisely and accurately communicate the main point of the visualization. After all, the visualization was created and selected for a particular purpose. What is that purpose? Make that purpose clear in your title. Are you communicating key differences across groups? If so, communicate them. Are you illustrating a trend or association? If so, make that clear. Are you showcasing similarities in trends or in shapes of distributions? If so, highlight them in your title. Are you differentiating trends, associations, or distribution shapes across groups? If so, direct your audience to examine those differences in the title.
Remember, you should not be selecting geoms and creating data visualizations that look aesthetically pleasing without considering whether that visualization is the best for your communication goals. Whatever that key finding is that you are trying to communicate, you should make that a part of your title. If you are communicating differences in average performance or in the range of data, is a sina plot necessary? If you are communicating differences in shapes of distributions, is a point-range plot relevant or bar plot a good choice? The geometric choice and the title need to match the goal of the data being visualized.
Now that you have a title, you have to determine where to place it. The location of the title is independent of its intent. Yes, titles are often positioned above the graphical representation of data. Perhaps they are centered or even left or right justified. Data visualizations that appear on webpages, in newspapers or magazines, or as isolated infographics (without explanatory text) will typically contain titles in these locations. Titles may also appear inside the plot space or even below the plot in a caption position, as you might find in books, journals, or scientific pieces. The title location does not make one visualization more professional than another; location may simply align with the formatting standards of the medium in which you place the visualization.
For visualizations appearing in reports, papers, journals, etc., for which a figure caption is relevant, place the title first, before the rest of the caption. If the title is not placed in the caption, then there will not be a title. And yes, visualizations need titles.
Finally, follow the title with a clearer and more detailed description of the data presented. The title highlights the key difference, and the caption addresses less significant but nevertheless relevant nuances. If there are no other meaningful messages, you can be more detailed about the main message.
A plot with no title
Adding the title
By now, you know how to add a title. One way is with labs().
my_title <- "A clear title of the main point"

plot +
  labs(title = my_title)
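The plot object used in these snippets is created elsewhere. As a minimal, self-contained sketch, assuming the built-in mtcars data as a stand-in (it is not the data used on this page), a title that states the main point could be added like this:

```r
library(ggplot2)

# Stand-in plot: `mtcars` is an assumption for illustration only
plot <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

# State the finding, not just the variables
my_title <- "Heavier cars travel fewer miles per gallon"

titled_plot <- plot +
  labs(title = my_title)

titled_plot
```

The title names the relationship the reader should look for rather than simply restating the axis labels.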
Making a title font bold
plot +
  labs(title = my_title) +
  theme(title = element_markdown(face = "bold"))
Wrapping long titles
Inserting a new line using \n
One way to deal with long titles is to break up the text.
my_title <- "A very, very, very, very, very, \nvery, very, very, very, very, very, \nvery, very, very, very, very, very, \nvery, long title"

plot +
  labs(title = my_title) +
  # element_text() honors \n; markdown elements need <br> instead
  theme(title = element_text(face = "bold"))
Inserting breaks in HTML using <br>
If you have an HTML-formatted title, you will need to use breaks rather than new lines.
my_html_title <- '<span style="color:blue">A very, very, very, very, very, very, <br>very, very, very, very, very, very, very, very, <br>very, very, very, very, <br>long title</span>'

plot +
  labs(title = my_html_title) +
  theme(plot.title = element_markdown(face = "bold"))
Wrapping strings using stringr::str_wrap()
If you prefer not finding a place to insert a new line or a break, use str_wrap() from {stringr}. By setting width to a value representing the number of characters, str_wrap() will break up the string into pieces that do not exceed the width. You will want to adjust the width to be appropriate given your plot output dimensions. More on saving plots of given dimensions later.
plot +
  labs(title = stringr::str_wrap(my_title, width = 20))
Although this approach works with element_text(), it seems to break down with element_markdown(). I have not had time to find a workaround.
plot +
  labs(title = stringr::str_wrap(my_title, width = 20)) +
  theme(plot.title = element_markdown(face = "bold"))
And element_markdown() is how you process the HTML code.
my_html_title <- '<span style="color:blue">A very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, very, long title</span>'

plot +
  labs(title = stringr::str_wrap(my_html_title, width = 40)) +
  theme(plot.title = element_markdown(face = "bold"))
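A possible workaround for the str_wrap() issue with element_markdown(): str_wrap() inserts \n, and my understanding is that markdown rendering treats a single new line as ordinary white space. Swapping each \n for <br> after wrapping should restore the breaks. The helper name wrap_markdown() is hypothetical (it is not part of {stringr} or {ggtext}), and mtcars is only a stand-in data set.

```r
library(ggplot2)
library(ggtext)

# Hypothetical helper: wrap the string, then convert new lines to HTML breaks
wrap_markdown <- function(x, width = 40) {
  wrapped <- stringr::str_wrap(x, width = width)
  gsub("\n", "<br>", wrapped, fixed = TRUE)
}

long_title <- "A very, very, very, very, very, very, very, very, very, long title"

wrapped_plot <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(title = wrap_markdown(long_title, width = 20)) +
  theme(plot.title = element_markdown(face = "bold"))

wrapped_plot
```

Note that str_wrap() counts every character in the string, so for a title that contains HTML tags the tags themselves count toward width, and a break could land inside a tag. Wrapping the plain text first and adding the tags afterward avoids that.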
If you are passing HTML code to the text (e.g., you are using direct labeling of color), you may need to use the
Changing Fonts
Modifying fonts is achieved by changing the family argument of element_text() for each plot component. We can do this for the title, subtitle, and axis components.
Fonts with {ragg} device
There are a few ways to manage custom fonts. You can use {showtext} to download custom font binaries and access them. Managing fonts, however, is beyond the scope of this topic.
A good option is to use {ragg}, which provides graphic devices for R based on the AGG (anti-grain geometry) library. Why use the AGG device?
- You have direct access to all system fonts
- It supports advanced text rendering (e.g., right-to-left text, emojis, etc.)
- It offers high-quality anti-aliasing and rotated text
- It supports 16-bit output
- It’s system independent, which is great for collaborating with others using Mac, Windows, and Linux operating systems
- Speed
Setting the {ragg} graphics device
First, install the {ragg} library.
In RStudio, navigate to the Tools menu item and then navigate the drop down to Global Options. Click the Graphics tab on the top and set your Graphics Device back end to AGG.
Load the library in your R Markdown file.
You can also set the device for every chunk in your R Markdown file by adding knitr::opts_chunk$set(dev = "ragg_png") before your first code block. You can change the device within an individual code block too, but I don’t know why you would want to use different devices.
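Put together, the install, load, and device steps could be sketched as a single setup chunk near the top of the R Markdown file. The install line is commented out on the assumption that {ragg} only needs to be installed once.

```r
# Install once, then leave commented out
# install.packages("ragg")

library(ragg)

# Render every chunk's figures with the AGG PNG device
knitr::opts_chunk$set(dev = "ragg_png")
```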
Changing fonts in the theme
Within theme(), you can set the fonts for different elements. Here we change the title, subtitle, axis titles, and axis text.
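A minimal sketch of such a theme, assuming the generic "serif", "sans", and "mono" families (chosen because they do not depend on a particular installed font) and the built-in mtcars data as a stand-in:

```r
library(ggplot2)

# "serif", "sans", and "mono" are placeholders for illustration;
# substitute the name of any font installed on your system
font_plot <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(
    title = "Heavier cars travel fewer miles per gallon",
    subtitle = "Fuel economy by weight for 32 car models",
    x = "Weight (1,000 lbs)",
    y = "Miles per gallon"
  ) +
  theme(
    plot.title    = element_text(family = "serif", face = "bold"),
    plot.subtitle = element_text(family = "serif"),
    axis.title    = element_text(family = "sans"),
    axis.text     = element_text(family = "mono")
  )

font_plot
```

Setting family on each element separately makes it easy to pair a display font for titles with a plainer font for axes.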
Saving your plot
There are different ways to save plots but using ggsave() may be the easiest. When no device is given, ggsave() chooses the graphics device from the file extension of filename. You can also specify the device explicitly in ggsave() with device = ragg::agg_png.
ggsave(filename = here::here("figs", "my_plot.png"),
       plot = new_plot,         # defaults to last_plot(), the last plot displayed
       device = ragg::agg_png,
       dpi = 320                # 320 is "retina"; 300 is fine
       )
Saving 7 x 5 in image
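The "Saving 7 x 5 in image" message appears when width and height are not supplied, in which case ggsave() uses the size of the current graphics device. A sketch with explicit dimensions follows; the temporary directory and the mtcars stand-in plot are assumptions made so the example runs anywhere.

```r
library(ggplot2)

# Stand-in plot for illustration
new_plot <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(title = "Heavier cars travel fewer miles per gallon")

# Written to a temporary directory here; in a project you might
# use here::here("figs", "my_plot.png") instead
out_file <- file.path(tempdir(), "my_plot.png")

ggsave(
  filename = out_file,
  plot     = new_plot,
  device   = ragg::agg_png,
  width    = 7,      # explicit size, so the result does not depend
  height   = 5,      # on the current graphics device
  units    = "in",
  dpi      = 320
)
```

Fixing the dimensions makes the saved file reproducible across machines and sessions.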
Loading saved plots in R Markdown
If you have saved plots that you wish to call into your R Markdown file, the easiest way will likely be to use knitr::include_graphics(). Using {here} to locate the project root and the /figs directory within it, we can specify the full path.
knitr::include_graphics(path = here::here("figs", "my_plot.png"))
knitr::include_graphics(path = here::here("figs", "my_plot.png"),
                        dpi = 320
                        )
By default, the alignment will be left. You can change the alignment in the {r} code chunk header by setting fig.align = 'center' or fig.align = 'right'.
Example:
{r fig.align = 'center'}
knitr::include_graphics(path = here::here("figs", "my_plot.png"),
dpi = 320
)
To change the size you can change:
- fig.width = 7.7 or some other number of inches
- fig.height = 6
- out.width="50%" or some other percent
- out.height="50%"
Example:
{r fig.align="center", out.width="90%"}
knitr::include_graphics(path = here::here("figs", "my_plot.png"),
                        dpi = 320
                        )
Example:
{r fig.align="center", fig.width = 7.78}
knitr::include_graphics(path = here::here("figs", "my_plot.png"),
                        dpi = 320
                        )
If you add R code at the top of your R Markdown file, you can set the options for all chunks using knitr::opts_chunk$set(). This way, all plots will take the same settings unless the chunk that renders a given plot specifies otherwise.
For example:
knitr::opts_chunk$set(
fig.width = 6,
fig.asp = 0.8,
out.width = "80%"
)
Plot file formats
You may not understand all of the differences between image file formats, but you are likely familiar with image file extensions like .jpg (Joint Photographic Experts Group), .png (Portable Network Graphics), and .pdf (Portable Document Format), or XML-based scalable vector graphics files like .svg.
You can read more in Wilke’s Data Visualization book or in Peng’s Exploratory Data Analysis book.
Raster: Constructed by a series of pixels (e.g., JPEG, GIF, and PNG)
Vector: Constructed using proportional formulas rather than pixels; they are great when they need to be resized, for example a logo that would appear on both a business card and a billboard (e.g., EPS, AI, and PDF)
Vector formats like .pdf and .svg may be good for line drawings and solid colors (bars), but they are less familiar to some people, and not every application or platform will load them.
Raster or bitmap formats like .png (and .jpg, but stay away from it) are generally good for visualizing many data points and for embedding on web pages. This is likely your go-to. Whatever you do, don’t save your plot image files as .jpg. Your best option will be to use .png for its portability. You could use .pdf and convert as needed, but that will require other steps. I recommend just saving as a .png file using ragg::agg_png.
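To compare the two families of formats directly, the same plot can be written once as a raster file and once as a vector file. This is a sketch only: the temporary output directory and the mtcars stand-in plot are assumptions, and the .pdf call relies on ggsave() picking the device from the file extension.

```r
library(ggplot2)

# Stand-in plot for illustration
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

out_dir <- tempdir()
png_file <- file.path(out_dir, "my_plot.png")
pdf_file <- file.path(out_dir, "my_plot.pdf")

# Raster: pixel-based, so resolution (dpi) matters
ggsave(png_file, plot = p, device = ragg::agg_png,
       width = 7, height = 5, units = "in", dpi = 320)

# Vector: resolution-independent, so no dpi is needed
ggsave(pdf_file, plot = p, width = 7, height = 5, units = "in")
```

Opening both files and zooming in shows the difference: the .png eventually reveals its pixels while the .pdf stays sharp.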
Session Info
R version 4.4.1 (2024-06-14 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 11 x64 (build 22631)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.utf8
[2] LC_CTYPE=English_United States.utf8
[3] LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.utf8
time zone: America/Los_Angeles
tzcode source: internal
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] ggtext_0.1.2 geomtextpath_0.1.4 magrittr_2.0.3 htmltools_0.5.8.1
[5] DT_0.33 vroom_1.6.5 lubridate_1.9.3 forcats_1.0.0
[9] stringr_1.5.1 dplyr_1.1.4 purrr_1.0.2 readr_2.1.5
[13] tidyr_1.3.1 tibble_3.2.1 ggplot2_3.5.1 tidyverse_2.0.0
loaded via a namespace (and not attached):
[1] gld_2.6.6 gtable_0.3.5 xfun_0.45 htmlwidgets_1.6.4
[5] lattice_0.22-6 monochromeR_0.2.0 tzdb_0.4.0 vctrs_0.6.5
[9] tools_4.4.1 generics_0.1.3 proxy_0.4-27 fansi_1.0.6
[13] pkgconfig_2.0.3 R.oo_1.26.0 Matrix_1.7-0 data.table_1.15.4
[17] readxl_1.4.3 rootSolve_1.8.2.4 lifecycle_1.0.4 farver_2.1.2
[21] compiler_4.4.1 textshaping_0.4.0 Exact_3.2 munsell_0.5.1
[25] class_7.3-22 DescTools_0.99.54 yaml_2.3.10 pillar_1.9.0
[29] crayon_1.5.3 MASS_7.3-60.2 R.utils_2.12.3 boot_1.3-30
[33] commonmark_1.9.1 tidyselect_1.2.1 digest_0.6.36 mvtnorm_1.2-5
[37] stringi_1.8.4 rprojroot_2.0.4 fastmap_1.2.0 grid_4.4.1
[41] expm_0.999-9 here_1.0.1 colorspace_2.1-0 lmom_3.0
[45] cli_3.6.3 utf8_1.2.4 e1071_1.7-14 withr_3.0.1
[49] scales_1.3.0 bit64_4.0.5 timechange_0.3.0 httr_1.4.7
[53] rmarkdown_2.27 bit_4.0.5 cellranger_1.1.0 ragg_1.3.2
[57] png_0.1-8 moments_0.14.1 R.methodsS3_1.8.2 hms_1.1.3
[61] evaluate_0.24.0 knitr_1.47 markdown_1.13 rlang_1.1.4
[65] gridtext_0.1.5 Rcpp_1.0.12 glue_1.7.0 xml2_1.3.6
[69] rstudioapi_0.16.0 jsonlite_1.8.8 R6_2.5.1 systemfonts_1.1.0