
gghdr: Graphing highest density regions
using grammar of graphics


Sayani Gupta
Sayani07     SayaniGupta07
https://sayanigupta-useR2020.netlify.com/

useR! 2020

1 / 15

Graphing distribution summaries

  • several ways to summarize a distribution
  • R package ggdist for visualizing distributions and uncertainty

2 / 15

Different summaries expose different features

  • boxplots are not suitable for summarizing multi-modal distributions
  • often useful to display several summaries together
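
As a small illustration of the two points above, the following sketch overlays a boxplot on a violin plot of the Old Faithful eruption durations (the built-in `faithful` data used later in these slides). It is not taken from the talk; the choice of geoms and the axis label are assumptions made for this example.

```r
library(ggplot2)

# Two summaries of the same variable in one panel:
# the violin shows the shape of the density, the boxplot shows the quartiles.
p <- ggplot(faithful, aes(x = "", y = eruptions)) +
  geom_violin(fill = "grey90") +
  geom_boxplot(width = 0.1) +
  labs(x = NULL, y = "Eruption duration (minutes)")

print(p)
```

The boxplot alone reduces the data to one box and two whiskers, while the violin outline makes any additional modes visible.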

3 / 15
4 / 15

Highest Density Region (HDR) plots

If $f(x)$ is the density function of a random variable $X$, then the $100(1-\alpha)\%$ HDR is the subset $R(f_\alpha)$ of the sample space of $X$ such that $R(f_\alpha) = \{x : f(x) \ge f_\alpha\}$, where $f_\alpha$ is the largest constant such that $\Pr(X \in R(f_\alpha)) \ge 1-\alpha$.
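
The definition can be approximated numerically. The sketch below uses only base R and is not how hdrcde or gghdr compute HDRs: it assumes a kernel density estimate on a regular grid stands in for $f$, and it picks $f_\alpha$ by accumulating grid mass from the highest density downwards. The variable names (`f_alpha`, `hdr_intervals`) are made up for this example.

```r
# Approximate the 50% HDR of eruption durations from a kernel density estimate.
x <- faithful$eruptions
alpha <- 0.5

d <- density(x, n = 1024)              # density estimate on an equally spaced grid
mass <- d$y / sum(d$y)                 # share of total mass at each grid point
ord <- order(d$y, decreasing = TRUE)   # grid points from highest to lowest density

# Smallest number of highest-density grid points covering at least 1 - alpha
k <- which(cumsum(mass[ord]) >= 1 - alpha)[1]
f_alpha <- d$y[ord][k]                 # approximate density cut-off

# The HDR is every grid point whose density is at least f_alpha;
# consecutive runs of such points form the HDR intervals.
in_hdr <- d$y >= f_alpha
runs <- rle(in_hdr)
ends <- cumsum(runs$lengths)
starts <- ends - runs$lengths + 1
hdr_intervals <- data.frame(
  lower = d$x[starts[runs$values]],
  upper = d$x[ends[runs$values]]
)

print(f_alpha)
print(hdr_intervals)
```

Because the region is defined by a density cut-off and not by quantiles, it can consist of several disjoint intervals when the distribution has more than one mode.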

5 / 15

Graphing HDRs using R package hdrcde

Data set: Waiting time between eruptions and the duration of the eruption for the Old Faithful geyser in Yellowstone National Park, Wyoming, USA.

library(dplyr)  # provides %>% and as_tibble()
faithful %>% as_tibble()
## # A tibble: 272 x 2
## eruptions waiting
## <dbl> <dbl>
## 1 3.6 79
## 2 1.8 54
## 3 3.33 74
## 4 2.28 62
## 5 4.53 85
## 6 2.88 55
## 7 4.7 88
## 8 3.6 85
## 9 1.95 51
## 10 4.35 85
## # … with 262 more rows

6 / 15

Graphing HDRs using R package hdrcde
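
The figure from this slide is not reproduced here. As a rough sketch of how hdrcde can draw HDRs for this data set (not necessarily the calls used in the talk), `hdr.den()` plots univariate HDRs on top of a density estimate and `hdr.boxplot.2d()` draws bivariate HDR contours. The probability levels below are chosen for illustration.

```r
library(hdrcde)

# Univariate: density estimate of eruption duration with 50%, 95% and 99% HDRs
res <- hdr.den(faithful$eruptions, prob = c(50, 95, 99))

# Bivariate: HDR contours for waiting time and eruption duration
# (package default probability levels)
hdr.boxplot.2d(faithful$waiting, faithful$eruptions)
```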

7 / 15

Extending ggplot2

  • HDR plots are a novel way of summarizing a distribution, and plotting them is not implemented in ggplot2
  • ggplot2 creates graphics based on The Grammar of Graphics
  • graphics are built step by step by adding new elements, which allows extensive flexibility and customization of plots
  • hence the need to extend the functionality of ggplot2
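
To make the extension idea concrete, here is a minimal sketch of a custom stat built with ggplot2's `ggproto()` and `layer()`. It is not gghdr's implementation: `StatDensityMode` and `stat_density_mode()` are hypothetical names invented for this example, and the stat only marks the mode of a kernel density estimate.

```r
library(ggplot2)

# Hypothetical stat (not part of ggplot2 or gghdr): reduces each group to
# the location and height of the mode of its kernel density estimate.
StatDensityMode <- ggproto("StatDensityMode", Stat,
  required_aes = "x",
  compute_group = function(data, scales) {
    d <- density(data$x)
    i <- which.max(d$y)
    data.frame(x = d$x[i], y = d$y[i])
  }
)

# Layer constructor so the stat can be added with `+` like any built-in layer
stat_density_mode <- function(mapping = NULL, data = NULL, geom = "point",
                              position = "identity", na.rm = FALSE,
                              show.legend = NA, inherit.aes = TRUE, ...) {
  layer(
    stat = StatDensityMode, data = data, mapping = mapping, geom = geom,
    position = position, show.legend = show.legend, inherit.aes = inherit.aes,
    params = list(na.rm = na.rm, ...)
  )
}

# The new layer is combined with existing ones step by step
p <- ggplot(faithful, aes(x = eruptions)) +
  geom_density() +
  stat_density_mode(colour = "red", size = 3)

print(p)
```

The same pattern, a `ggproto` object plus a thin constructor function, is how new geoms and stats plug into the layered grammar.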
8 / 15

Package gghdr

  • implement HDR plots in ggplot2
  • key elements: geom and stats
  • inspired by the R package hdrcde, developed by Rob Hyndman

9 / 15

HDR boxplots

library(hdrcde)
hdr.boxplot(faithful$eruptions)


library(gghdr)
library(ggplot2)
ggplot(faithful, aes(y = eruptions)) +
  geom_hdr_boxplot()

10 / 15

HDR rug plots and scatter plots

faithful %>%
  ggplot(aes(x = waiting, y = eruptions)) +
  geom_hdr_rug(fill = "blue") +
  geom_point()

faithful %>%
  ggplot(aes(x = waiting, y = eruptions)) +
  geom_point(aes(colour = hdr_bin(x = waiting, y = eruptions)))

11 / 15

Keep combining - HDR box + jitter

faithful %>%
  ggplot(aes(y = eruptions)) +
  geom_hdr_boxplot(fill = "blue") +
  geom_jitter(aes(x = 0))

  • jitter to supplement the insight drawn from the HDR boxplot

12 / 15

Keep combining - HDR scatter + HDR marginal

faithful %>%
  ggplot(aes(x = waiting, y = eruptions)) +
  geom_point(aes(colour = hdr_bin(x = waiting, y = eruptions))) +
  geom_hdr_rug()

  • Both bivariate and marginal HDRs displayed at once
  • Bimodality in both marginal and bivariate distributions

13 / 15

Authors

14 / 15

More Information

Package: https://github.com/ropenscilabs/gghdr

Slides: https://sayanigupta-useR2020.netlify.com/

Materials: https://github.com/Sayani07/useR2020

Slides created with R Markdown, knitr, xaringan, and xaringanthemer

15 / 15
