Census Data with R

Using the R Package RankingProject to Make Simple Visualizations for Comparing Populations

Developed and presented by Jerzy Wieczorek.

Skill level: Advanced

Duration: 1-2 hours

Description

This course introduces the "RankingProject" package in R, which accompanies "A Primer on Visualizations for Comparing Populations, Including the Issue of Overlapping Confidence Intervals" (Wright, Klein, and Wieczorek, 2018). In comparing a collection of K populations, it is common practice to display K confidence intervals (CIs) for the corresponding population parameters on a single graph. For a pair of CIs that do (or do not) overlap, many viewers find it natural to declare that there is not (or there is) a statistically significant difference between the two corresponding parameters, even though it is well known that this interpretation is not strictly correct.
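
The misinterpretation described above can be illustrated with a small numerical sketch. It is written in Python using only the standard library and does not use the RankingProject package. The two estimates and standard errors are made-up values, and the helper names (`ci`, `intervals_overlap`, `difference_p_value`) are hypothetical. The sketch assumes independent, approximately normal estimates.

```python
from math import sqrt
from statistics import NormalDist


def ci(estimate, se, level=0.95):
    """Normal-theory confidence interval for a single estimate."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return (estimate - z * se, estimate + z * se)


def intervals_overlap(a, b):
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def difference_p_value(est1, se1, est2, se2):
    """Two-sided p-value for H0: no difference, assuming independent estimates."""
    z = abs(est1 - est2) / sqrt(se1**2 + se2**2)
    return 2 * (1 - NormalDist().cdf(z))


# Hypothetical estimates for two populations (illustrative values only).
ci_a = ci(10.0, 1.0)
ci_b = ci(13.0, 1.0)

overlap = intervals_overlap(ci_a, ci_b)
p_value = difference_p_value(10.0, 1.0, 13.0, 1.0)

print("95% CIs overlap:", overlap)
print("p-value for the difference:", round(p_value, 4))
```

Here the two 95% CIs overlap, yet the test of the difference is significant at the 5% level, because the standard error of the difference is smaller than the sum of the two individual standard errors.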

We will discuss several alternative visualizations designed to help data users avoid this common misinterpretation. CIs for differences from a baseline make the reference population explicit. "Comparison intervals" show a CI for the reference as well as CIs for its difference with other populations. "Shaded columns plots" show the statistical significance of differences directly. Goldstein-Healy adjusted CIs show a confidence level chosen such that overlap (non-overlap) of CIs does indeed imply non-significance (significance) of differences at an "average significance level" across all possible pairwise comparisons. Two-tiered error bars allow us to show several types of CIs at once.
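
The Goldstein-Healy idea can be sketched numerically as follows. This is a rough Python illustration of the concept as described above, not the RankingProject implementation. It assumes independent, approximately normal estimates, and the function names and standard errors are hypothetical. For a pair with standard errors s_i and s_j, intervals of half-width z times the standard error fail to overlap exactly when the absolute difference exceeds z(s_i + s_j), so the sketch searches for the z whose average non-overlap probability across all pairs equals the target level alpha.

```python
from itertools import combinations
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def pairwise_nonoverlap_prob(z, se_i, se_j):
    """P(intervals est +/- z*se do not overlap) when the true difference is zero."""
    ratio = (se_i + se_j) / sqrt(se_i**2 + se_j**2)
    return 2 * (1 - STD_NORMAL.cdf(z * ratio))


def goldstein_healy_multiplier(ses, alpha=0.05, tol=1e-10):
    """Bisection for z such that the average pairwise error rate equals alpha."""
    pairs = list(combinations(ses, 2))

    def avg_error(z):
        return sum(pairwise_nonoverlap_prob(z, a, b) for a, b in pairs) / len(pairs)

    # avg_error decreases in z: it is 1 at z = 0 and at most alpha at the usual z.
    lo, hi = 0.0, STD_NORMAL.inv_cdf(1 - alpha / 2)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if avg_error(mid) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def implied_confidence_level(z):
    """Confidence level of an interval with half-width z standard errors."""
    return 2 * STD_NORMAL.cdf(z) - 1


# Equal standard errors: the multiplier reduces to 1.96 / sqrt(2).
z_equal = goldstein_healy_multiplier([1.0, 1.0, 1.0])
level_equal = implied_confidence_level(z_equal)

# Unequal (hypothetical) standard errors give a somewhat larger multiplier.
z_mixed = goldstein_healy_multiplier([0.5, 1.0, 2.0])

print(round(z_equal, 4), round(level_equal, 4), round(z_mixed, 4))
```

With equal standard errors the adjusted intervals correspond to roughly an 83% confidence level rather than 95%, which is why unadjusted 95% CIs are too conservative for judging pairwise differences by overlap.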

We will justify and recommend use-cases for each of these plots. Finally, we will demonstrate how to produce them in R with the RankingProject package, illustrating its usage on several U.S. Census Bureau datasets with a variety of population types and demographic variables.

Who Should Take this Course?

Data analysts, data scientists, and developers who want to learn how to use Census data with R to create visualizations.

Instructor

Jerzy Wieczorek is an Assistant Professor of Statistics at Colby College. His research focuses on model selection and assessment, from cross-validation in high-dimensional settings to multiple comparisons-corrected visualization of estimates with uncertainty.

Course Materials

Module 1: Motivations

In this module you will learn about:

  • Motivations
  • Reviewing ranking tables, statistical significance and confidence intervals
  • How to best visualize and analyze ranking tables

Module 2: Visualization

In this module you will learn about:

  • Plotting ranking tables and statistical significance
  • Plotting comparison intervals
  • The Goldstein-Healy Concept
  • Two-Tiered Confidence Intervals (CIs)

Module 3: The R Package RankingProject

In this module you will learn about:

  • Dataset Structure and Formatting
  • Setting up a Table for Plots of CIs for Differences
  • Cleaning up and Modifying Plots
  • Where to Access the R Package RankingProject
