Data Visualization and Exploration

ggplot II

Review

Last time we introduced the ggplot2 package and its philosophy.

Special assignments (1/2)

Identify a “research” topic in an area of personal interest.

  1. Research common visualization types or styles in that area.
  • Find one (or more) examples of interest.
  • As you did recently, create a document that contains your example visualization(s).
  • Attempt an honest discussion of the effectiveness of the visualization(s).
  • As appropriate, do one or both of the following:
    • suggest improvements to the visualization based on your class experiences.
    • identify elements you would like to incorporate in your own work.

Special assignments (2/2)

Considering some research topic in an area of personal interest.

  1. Locate a source or sources of publicly available data.
  • Searching the internet (e.g., https://github.com/datasets, https://datadryad.org, https://kaggle.com), find a total of three datasets that overlap with your interest.
  • Find available documentation to learn about the variables and the history of the data.
  • Soon this will be developed into your “semester project” (i.e., “final project”), but begin searching now.

Goal for Today

We will meet again with our “graph groups” from a few weeks ago.

  • Reflect on, assess, and edit (as necessary) the graphs you produced together.
  • Ensure that all group members are comfortable with the context, content, and code of the graph.
  • Regenerate your graph using ggplot() tools, if possible.
  • Other – consider incorporating new datasets or visiting with other groups.

These will be shared by the end of the week to D2L and used as opening slides next week for a brief, relatively informal, Data Viz “Fashion Show”.

Brief content reminder, a break to work, then “normal class”.

“plot annotation”

The idea of annotation as a layer was another fundamental feature of the “grammar”.

labs() (axes and titles)

One more helpful thing to consider at this point would be to use

... + labs(x = ..., y = ..., 
           title = ..., subtitle = ...)

as a rough analog to title() (or mtext()) from base R.

With this you can do meaningful annotation.
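As a minimal sketch of the labs() pattern above, the following uses the built-in mtcars data; the dataset choice and all label text are assumptions made for illustration, not part of the original slides.

```r
library(ggplot2)

# mtcars ships with base R; wt is weight in 1000 lbs, mpg is miles per gallon.
# The label wording below is illustrative only.
p_labs <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(x = "Weight (1000 lbs)",
       y = "Fuel economy (miles per gallon)",
       title = "Heavier cars tend to travel fewer miles per gallon",
       subtitle = "Built-in mtcars data")

p_labs  # printing the object draws the plot
```

Any argument left out of labs() keeps its default, so you can set only the pieces you need.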

geom_text() (in-plot text)

After that you might need to annotate (using text) within the space of the graph.

... + geom_text(mapping = aes(label = ...), ...)
  • The RHS of label = ... should be a variable name.
  • The optional arguments after could include, among others,
    • hjust = ... with 0 (left-justified), 0.5 (centered), or 1 (right-justified) as options.
    • nudge_x = ... (or _y) as numbers for text placement adjustment.
    • parse = ... with TRUE to interpret text as (hopefully) properly formatted mathematical expressions.

Used in place of (or alongside) geom_text(), geom_label() may make text more readable by drawing a “background” rectangle behind it.

... + geom_label(mapping = aes(label = ...), ...)
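A hedged sketch of both calls follows, again assuming the built-in mtcars data. The model column, the choice to label only the five heaviest cars, and the specific hjust, nudge_x, and size values are all illustrative assumptions.

```r
library(ggplot2)

# Copy the row names into a regular column so they can be mapped to label.
cars <- mtcars
cars$model <- rownames(mtcars)

# Label only the five heaviest cars to keep the plot readable (arbitrary choice).
heavy <- cars[order(-cars$wt)[1:5], ]

# Plain text, right-justified and nudged slightly left of each point.
p_text <- ggplot(cars, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_text(data = heavy, mapping = aes(label = model),
            hjust = 1, nudge_x = -0.05, size = 3)

# Same labels, drawn on a filled background box.
p_label <- ggplot(cars, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_label(data = heavy, mapping = aes(label = model),
             hjust = 1, nudge_x = -0.05, size = 3)

p_text
p_label
```

Passing a smaller data frame through data = ... is one way to label a subset of points; mapping label across the full dataset would label every point.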

annotate() (open-ended)

Alternatively, use annotate() with a variety of options to do many things.

... + annotate("geom", x = ..., y = ..., ...)

Here "geom" could be replaced (literally) by one of,

  • "text", with x, y, and label
  • "rect", with “corners” fully-specified by xmin = ..., xmax = ..., ymin = ..., ymax = ...
  • "segment", with “endpoints” fully-specified by x = ..., xend = ..., y = ..., yend = ...
  • "point", with x, y, and likely color and size
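The four options listed above can be combined on one plot. This is only a sketch on the built-in mtcars data: every coordinate, color, and label string below was picked by eye for illustration and is an assumption, not something from the original slides.

```r
library(ggplot2)

p_annot <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  # "rect": a translucent box around the heaviest cars
  annotate("rect", xmin = 5.1, xmax = 5.6, ymin = 9, ymax = 16,
           alpha = 0.2, fill = "steelblue") +
  # "text": a single label placed at fixed coordinates
  annotate("text", x = 4.4, y = 25, label = "Heaviest cars") +
  # "segment": a straight line from the label toward the box
  annotate("segment", x = 4.4, xend = 5.2, y = 24, yend = 16.5) +
  # "point": one highlighted location
  annotate("point", x = 1.5, y = 30.4, color = "red", size = 4)

p_annot
```

Unlike geom_text(), these calls take fixed values rather than aes() mappings, so each annotate() adds one hand-placed element.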

Other ideas

Even considering just the list of "geom" choices that are compatible with annotate() (e.g., "curve" for curved annotation arrows), there are a lot of special functions for individual tasks.
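As one example of the "curve" option, here is a minimal sketch of a curved arrow pointing at the highest-mpg car in the built-in mtcars data. The dataset, the label text and its position, and the curvature and arrow length values are illustrative assumptions.

```r
library(ggplot2)

# Find the row with the highest mpg so the arrow target comes from the data.
best <- mtcars[which.max(mtcars$mpg), ]

p_curve <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  # Label placed by hand to the right of the target point
  annotate("text", x = 4, y = 30, label = "Highest mpg", hjust = 0) +
  # Curved arrow from just left of the label to the target point
  annotate("curve", x = 3.95, xend = best$wt, y = 30, yend = best$mpg,
           curvature = 0.3,
           arrow = arrow(length = unit(0.1, "inches")))

p_curve
```

The arrow() and unit() helpers come from the grid package and are re-exported by ggplot2, so no extra library() call is needed here.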

Feel free to start browsing online resources like https://ggplot2-book.org/ (if you haven’t been) for examples.