DAS Production Parameters Metrics; Webinar Today

Disclosure Avoidance System Production Parameters Metrics; July 1 Webinar

On June 8, the Census Bureau’s Data Stewardship Executive Policy (DSEP) committee set the parameters for the 2020 Census Disclosure Avoidance System (DAS). The committee set the privacy-loss budget (PLB, represented by “ε,” the Greek letter “epsilon”) for the redistricting data product at ε=19.61, which comprises ε=17.14 for the persons file and ε=2.47 for the housing unit data. Most of the increase in PLB over the levels reflected in the April 2021 demonstration data was allocated to the total population and race-by-ethnicity queries at the block group level and above. Other changes to the parameters for the TopDown Algorithm (TDA) included changes to the implementation of geographic spine optimization.
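As a quick check on the figures above, the sketch below adds the two component budgets and compares the result with the announced total. It assumes the components combine by simple addition, which is consistent with the quoted numbers; the function and constant names are illustrative only and are not part of the DAS code base.

```python
from decimal import Decimal

# Budget figures quoted in the announcement (exact decimals, not floats).
PERSONS_EPSILON = Decimal("17.14")
HOUSING_UNITS_EPSILON = Decimal("2.47")
ANNOUNCED_TOTAL_EPSILON = Decimal("19.61")


def total_privacy_loss(*component_epsilons):
    """Add component budgets together (assumption: simple additive composition)."""
    return sum(component_epsilons, Decimal("0"))


total = total_privacy_loss(PERSONS_EPSILON, HOUSING_UNITS_EPSILON)
print(f"persons + housing units = {total}")
print(f"matches announced total: {total == ANNOUNCED_TOTAL_EPSILON}")
```

Using `Decimal` rather than floating point keeps the comparison exact for two-decimal figures like these.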

In arriving at its decisions about the disclosure avoidance system’s parameters, DSEP considered a broad range of factors, including feedback that stakeholders provided to the Census Bureau in the weeks and months leading up to the decision. Since the decision was made, stakeholders have continued to provide feedback, including requests that the Census Bureau give the public a way to assess how the parameters would affect the accuracy of the data.

Today, the Census Bureau released a set of metrics that will help data users understand how the parameters for the disclosure avoidance system will impact the P.L. 94-171 redistricting data file. In addition to those metrics, which are available here, we have created a small set of graphs to help data users visualize how DSEP’s decisions reflect stakeholder feedback.  

Join us for a webinar this afternoon (2 p.m. ET) to discuss these metrics and the chosen parameters in more detail.

Addressing Tribal Geographies and Other Off-Spine Geographies

Stakeholder feedback on our April 2021 demonstration data identified a regression in the accuracy of data for tribal geographies and other off-spine geographies. The DAS team changed the “optimized spine” to address these concerns; those changes were integrated into the spine that DSEP approved (see the description of the optimized spine on p. 5, “Geographic Hierarchy,” in this fact sheet). These charts show how the production settings for the TDA affect the error rates for tribal geographies; additional charts that show how the TDA affects other off-spine geographies are included below.

1. Federal American Indian Reservation / Off-Reservation Trust Land: Total Population: Mean Absolute Error (MAE)

2. Federal American Indian Reservation / Off-Reservation Trust Land: American Indian Alaska Native Alone Population: Mean Absolute Error (MAE)

3. Federal American Indian Reservation / Off-Reservation Trust Land: Race Alone Population: Mean Absolute Error (MAE)
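The charts in this section report mean absolute error (MAE). For readers unfamiliar with the metric, here is a minimal sketch of how MAE is conventionally computed: the average, across geographies, of the absolute difference between the privacy-protected count and the reference count. The counts below are made up for illustration and are not Census Bureau data, and the function name is hypothetical.

```python
def mean_absolute_error(protected_counts, reference_counts):
    """Average absolute difference between paired counts, one pair per geography."""
    if len(protected_counts) != len(reference_counts):
        raise ValueError("both lists must cover the same geographies")
    if not protected_counts:
        raise ValueError("at least one geography is required")
    total_error = sum(
        abs(protected - reference)
        for protected, reference in zip(protected_counts, reference_counts)
    )
    return total_error / len(protected_counts)


# Hypothetical population counts for four geographies (illustration only).
reference = [1200, 450, 87, 3020]
protected = [1195, 458, 90, 3016]

mae = mean_absolute_error(protected, reference)
print(f"MAE = {mae:.1f} persons")
```

Because MAE is expressed in persons, the same MAE represents a much larger relative error in a small geography than in a large one, which is why other charts in this release report percent error instead.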

Addressing Geographic and Population Count Bias

Stakeholders identified several measures of bias in the April demonstration data summary metrics as areas of concern. They raised concerns about both geographic bias (i.e., the accuracy of population counts differing between larger and smaller geographies) and characteristic bias (counts for racially or ethnically diverse geographies differing from those for more racially or ethnically homogeneous areas). The DAS team made changes to the post-processing system parameters to address these concerns; those changes were integrated into the parameters that were approved by DSEP. The following graphs help illustrate how the production settings for the TopDown Algorithm reflect this stakeholder feedback.

4. County: Total Population: Mean Absolute Percent Error (MAPE)

5. County: Total Population: Mean Absolute Percent Error (MAPE): Least Populous Counties (Population Under 1,000)

6. County: Race Alone Population: Mean Absolute Error (MAE)

7. Place: Total Population - Number Exceeding 5% Error - All Incorporated Places

8. Rural Block: Total Population: Mean Absolute Error (MAE) - Rural Blocks

9. Urban Block: Total Population: Mean Absolute Error (MAE) - Urban Blocks
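Two of the measures charted in this section are relative rather than absolute: mean absolute percent error (MAPE) and the number of geographies whose error exceeds 5%. The sketch below shows one conventional way to compute both. It assumes every reference count is positive (how zero-population geographies are handled in the published metrics is not stated here), and the counts and function names are hypothetical.

```python
def absolute_percent_errors(protected_counts, reference_counts):
    """Per-geography absolute error as a percentage of the reference count."""
    if len(protected_counts) != len(reference_counts):
        raise ValueError("both lists must cover the same geographies")
    if any(reference <= 0 for reference in reference_counts):
        # Assumption of this sketch: percent error is undefined for zero counts.
        raise ValueError("reference counts must be positive")
    return [
        100 * abs(protected - reference) / reference
        for protected, reference in zip(protected_counts, reference_counts)
    ]


def mean_absolute_percent_error(protected_counts, reference_counts):
    """Average of the per-geography absolute percent errors."""
    errors = absolute_percent_errors(protected_counts, reference_counts)
    return sum(errors) / len(errors)


def count_exceeding(protected_counts, reference_counts, threshold_percent=5.0):
    """Number of geographies whose absolute percent error is above the threshold."""
    errors = absolute_percent_errors(protected_counts, reference_counts)
    return sum(1 for error in errors if error > threshold_percent)


# Hypothetical total-population counts for four places (illustration only).
reference = [800, 1000, 12000, 400]
protected = [820, 990, 12060, 430]

mape = mean_absolute_percent_error(protected, reference)
outliers = count_exceeding(protected, reference)
print(f"MAPE = {mape:.3f}%")
print(f"places exceeding 5% error: {outliers}")
```

In this made-up example the smallest place has the largest percent error even though its absolute error is modest, which illustrates why the charts single out the least populous geographies.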

Addressing Race and Ethnicity Bias

Data users identified a need for more accuracy in race and ethnicity statistics at many levels of geography. The DAS team addressed those concerns by allocating additional privacy-loss budget to the race and ethnicity queries at various levels of geography; those changes were integrated into the global privacy-loss budget and privacy-loss budget allocations that were approved by DSEP. The charts below show how the production settings for the TopDown Algorithm affect the error rates for race and ethnicity statistics at various levels of geography. (Charts 1-3 above depict the accuracy of race data for tribal geographies.)

10. Tract: Hispanic x Race Alone Population - Mean Absolute Error (MAE)

11. Tract: Hispanic x Race Alone Population - Mean Absolute Error (MAE) (cont.)

12. Place: American Indian Alaska Native Alone Population: Mean Absolute Percent Error (MAPE) - All Incorporated Places

13. County: Race Alone or in Combination (AOIC) Population: Mean Absolute Error (MAE)

14. County: Hispanic x Race Alone Population: Mean Absolute Error (MAE)

15. County: Hispanic x Race Alone Population: Mean Absolute Error (MAE) (cont.)

16. County: American Indian Alaska Native Alone Population: Mean Absolute Percent Error (MAPE) - All Counties

Addressing Accuracy at the Place, Minor Civil Division, and Tract Levels

Data users identified a need for more accuracy at the place, minor civil division (MCD), and tract levels. The DAS team addressed these concerns both through changes to the optimized geographic spine and through allocation of privacy-loss budget; those changes were integrated into the privacy-loss budget allocations and system parameters that were approved by DSEP. These charts help visualize the improvements in accuracy at the place, MCD, and tract levels.

17. Minor Civil Division: Total Population: Mean Absolute Error (MAE) - All Minor Civil Divisions

18. Place: Total Population: Mean Absolute Percent Error (MAPE) - All Incorporated Places

19. Place: Total Population: Mean Absolute Percent Error (MAPE) - Least Populous Incorporated Places (Population Under 500)

20. Tract: Total Population: Mean Absolute Error (MAE) - Tracts

Addressing Accuracy of Occupancy Rates

Data users identified a need for more accurate statistics on occupancy rates at the block group and higher levels of geography. The DAS team addressed those concerns by allocating additional privacy-loss budget to the housing unit data; that change was integrated into the global privacy-loss budget and privacy-loss budget allocations that were approved by DSEP.  The charts below help visualize how these changes to the global PLB and PLB allocations impacted the accuracy of occupancy rates.

21. Various Geographies: Occupied Units: Mean Absolute Error (MAE)

22. Various Geographies: Occupied Units: Mean Absolute Error (MAE) - Including October 2019 Demonstration Data

23. County: Group Quarters Population by Major Group Quarters Type: Mean Absolute Error (MAE)

24. Tract: Group Quarters Population by Major Group Quarters Type: Mean Absolute Error (MAE)

25. Various Geographies: Population Aged 18 Years and Over: Mean Absolute Error (MAE)

New Video! 

See and share our new video: "Protecting Privacy in Census Bureau Statistics"


2021 Key Dates, Redistricting (P.L. 94-171) Data Product

By August 16:

  • Release 2020 Census P.L. 94-171 data as Legacy Format Summary File*.

September:

  • Census Bureau releases Privacy-Protected Microdata Files (PPMFs) and Detailed Summary Metrics from applying the production version of the DAS to the 2010 Census data.
  • Census Bureau releases production code base for P.L. 94-171 redistricting summary data file and related technical papers.

By September 30:

  • Release 2020 Census P.L. 94-171 data** and Differential Privacy Handbook.

*   Released via Census Bureau FTP site.

** Released via data.census.gov.


About Disclosure Avoidance Modernization

The Census Bureau is protecting 2020 Census data products with a powerful new cryptography-based disclosure avoidance system known as “differential privacy.”  We are committed to producing 2020 Census data products that are of the same high quality you've come to expect while protecting respondent confidentiality from emerging privacy threats in today's digital world. 

 
