Webinars
Live and recorded classes led by Census Bureau instructors on a variety of topics.

Census Data With R

Using the R Package RankingProject to Make Simple Visualizations for Comparing Populations

Developed and presented by Jerzy Wieczorek.

 

Skill Level: Advanced

Duration: 1-2 hours

Description

This course introduces the "RankingProject" package in R, which accompanies "A Primer on Visualizations for Comparing Populations, Including the Issue of Overlapping Confidence Intervals" (Wright, Klein, and Wieczorek, 2018). In comparing a collection of K populations, it is common practice to display K confidence intervals (CIs) for the corresponding population parameters on a single graph. For a pair of CIs that do (or do not) overlap, many viewers find it natural to declare that there is not (or there is) a statistically significant difference between the two corresponding parameters, even though it is well known that this interpretation is not strictly correct.
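To see why overlap is not a reliable test, consider the following minimal sketch. It is written in Python with only the standard library and does not use the RankingProject package. The two estimates and their standard errors are made-up numbers, and the sketch assumes the estimates are independent and approximately normal.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical estimates and standard errors for two populations
# (illustrative values only, not Census Bureau data).
est_a, se_a = 10.0, 1.0
est_b, se_b = 13.3, 1.0

# Critical value for a two-sided 95% interval or test.
z_crit = NormalDist().inv_cdf(0.975)

# Individual 95% confidence intervals.
ci_a = (est_a - z_crit * se_a, est_a + z_crit * se_a)
ci_b = (est_b - z_crit * se_b, est_b + z_crit * se_b)
cis_overlap = ci_a[1] >= ci_b[0] and ci_b[1] >= ci_a[0]

# Test of the difference, assuming independent estimates:
# the standard error of the difference is sqrt(se_a^2 + se_b^2),
# which is smaller than se_a + se_b.
se_diff = sqrt(se_a**2 + se_b**2)
z_stat = (est_b - est_a) / se_diff
significant = abs(z_stat) > z_crit

print(f"CI A: ({ci_a[0]:.2f}, {ci_a[1]:.2f})")
print(f"CI B: ({ci_b[0]:.2f}, {ci_b[1]:.2f})")
print("CIs overlap:", cis_overlap)
print("Difference significant at the 5% level:", significant)
```

With these numbers the two 95% intervals overlap, yet the difference is significant at the 5% level. Reading "overlap" as "no significant difference" would give the wrong answer here.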

We will discuss several alternative visualizations designed to help data users avoid this common misinterpretation. CIs for differences from a baseline make the reference population explicit. "Comparison intervals" show a CI for the reference as well as CIs for its differences from the other populations. "Shaded columns plots" show the statistical significance of differences directly. Goldstein-Healy adjusted CIs use a confidence level chosen so that overlap (non-overlap) of CIs does imply non-significance (significance) of differences at an "average significance level" across all possible pairwise comparisons. Two-tiered error bars allow us to show several types of CIs at once.
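The Goldstein-Healy idea can be sketched numerically. The Python snippet below is an illustration under stated assumptions, not the RankingProject implementation: it assumes independent, approximately normal estimates, and the helper name `gh_multiplier` is hypothetical. Intervals of the form estimate ± z × SE fail to overlap when the difference exceeds z × (SE1 + SE2), so the sketch searches for the multiplier z at which the implied pairwise significance level, averaged over all pairs, equals the target level.

```python
from itertools import combinations
from math import sqrt
from statistics import NormalDist


def gh_multiplier(ses, alpha=0.05):
    """Find z so that non-overlap of est +/- z*SE intervals corresponds to
    an average pairwise significance level of alpha.
    Hypothetical helper; assumes independent, normal estimates."""
    norm = NormalDist()
    pairs = list(combinations(ses, 2))

    def average_level(z):
        # For one pair, non-overlap means |difference| > z * (a + b).
        # In units of the difference's SE, sqrt(a^2 + b^2), that is a
        # two-sided test with critical value z * (a + b) / sqrt(a^2 + b^2).
        levels = [
            2 * (1 - norm.cdf(z * (a + b) / sqrt(a * a + b * b)))
            for a, b in pairs
        ]
        return sum(levels) / len(levels)

    # average_level decreases as z grows, so bisection finds the root.
    lo, hi = 0.0, 10.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if average_level(mid) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Equal standard errors: every pair gives z = 1.96 / sqrt(2).
z = gh_multiplier([1.0, 1.0, 1.0])
conf_level = 2 * NormalDist().cdf(z) - 1
print(f"multiplier: {z:.3f}, adjusted confidence level: {conf_level:.1%}")
```

When all standard errors are equal, the multiplier is about 1.39 and the adjusted confidence level is roughly 83%, noticeably narrower than the usual 95% intervals. With unequal standard errors the multiplier is a compromise across pairs, which is why the resulting significance level holds only on average.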

We will justify and recommend use-cases for each of these plots. Finally, we will demonstrate how to produce them in R with the RankingProject package, illustrating its usage on several U.S. Census Bureau datasets with a variety of population types and demographic variables.

Who Should Take this Course?

Data analysts, data scientists, and developers who want to learn how to use Census data with R to create visualizations.

Instructor

Jerzy Wieczorek is an Assistant Professor of Statistics at Colby College. His research focuses on model selection and assessment, from cross-validation in high-dimensional settings to multiple comparisons-corrected visualization of estimates with uncertainty.

Course Materials

Beginning of Course

Module 1: Motivations

In this module you will learn about:

  • Motivations
  • Reviewing ranking tables, statistical significance and confidence intervals
  • How to best visualize and analyze ranking tables

Module 2: Visualization

In this module you will learn about:

  • Plotting ranking tables and statistical significance
  • Plotting comparison intervals
  • The Goldstein-Healy Concept
  • Two-Tiered Confidence Intervals (CIs)

Module 3: The R Package RankingProject

In this module you will learn about:

  • Datasets Structure and Formatting
  • Setting up a Table for Plots of CIs for Differences
  • Cleaning up and Modifying Plots
  • Where to Access the R Package RankingProject

Page Last Revised - November 4, 2022