With the arrival of the year 2020, it will be time for America’s residents to stand up and be counted.
The census procedure has been undergoing an overhaul designed to protect people’s privacy as much as possible.
To preserve confidentiality, the Census Bureau’s directors have decided that they need to adopt a “formal privacy” approach, one that adds uncertainty to census data before it is published and provides privacy guarantees that can be proved mathematically.
The census has always added some uncertainty to its data, but a key innovation of this new framework, known as “differential privacy,” is a numerical value that describes how much privacy loss a person will experience.
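To make the idea concrete, here is a minimal sketch of one textbook way to add calibrated uncertainty to a published count, the Laplace mechanism. It is an illustration only and is not presented as the bureau's actual procedure. The privacy-loss value is conventionally written as epsilon; the function names `laplace_noise` and `noisy_count` are made up for this example.

```python
import random

def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_count(true_count, epsilon, rng):
    """Return a count with Laplace noise added (illustrative names, not a real API).

    Adding or removing one person changes a count by at most 1, so the
    noise scale is 1 / epsilon. A smaller epsilon means less privacy loss
    and a noisier published number.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return true_count + laplace_noise(1.0 / epsilon, rng)

if __name__ == "__main__":
    rng = random.Random(2020)  # fixed seed so the demo is repeatable
    for epsilon in (0.1, 1.0, 10.0):
        print(epsilon, round(noisy_count(100, epsilon, rng), 1))
```

The trade-off is visible in the single parameter: lowering epsilon strengthens the privacy guarantee but makes each published figure less accurate.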
Mathematical breakthroughs, easy access to more powerful computing, and widespread availability of large and varied public data sets have made the bureau reconsider whether the protection it offers Americans is strong enough.
As the U.S. Supreme Court weighs whether the Trump administration can ask people if they are citizens on the 2020 census, the bureau is seeking comprehensive systems to guard against privacy exposures.
At the root of the census’s privacy problem are the tables of aggregate statistics that the bureau publishes. There are hundreds of tables (sex by age, for example, or ethnicity by race) summarizing the population at many levels of geography, from areas the size of a city block all the way up to a state or the nation.
In 2010, the bureau released tables with nearly eight billion numbers in all. That was about 25 numbers for each person living in the United States, even though residents were asked only 10 questions about themselves.
Two researchers developed a “database reconstruction theorem,” which provides a road map for turning collections of summary tables into approximate records on individuals.
For the census, this is worrisome, particularly if a question about citizenship is added to the 2020 census, as the Trump administration has proposed.
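A toy example shows why summary tables can give individuals away. The sketch below is not the researchers' method. It is a brute-force search over an invented three-person block, and every figure in `PUBLISHED` is made up for illustration. It simply lists every set of (age, sex) records that agrees with all of the published statistics.

```python
from itertools import combinations_with_replacement
from statistics import median

# Hypothetical published statistics for one tiny block (invented numbers).
PUBLISHED = {
    "total": 3,
    "median_age": 30,
    "mean_age": 44,
    "female_count": 2,
    "female_mean_age": 26,
}

def consistent(block, stats):
    """Check whether a candidate set of records matches every published figure."""
    ages = [age for age, _ in block]
    female_ages = [age for age, sex in block if sex == "F"]
    return (
        sum(ages) == stats["mean_age"] * stats["total"]
        and len(female_ages) == stats["female_count"]
        and sum(female_ages) == stats["female_mean_age"] * stats["female_count"]
        and median(ages) == stats["median_age"]
    )

def reconstruct(stats, max_age=99):
    """Enumerate every possible block and keep those consistent with the tables."""
    people = [(age, sex) for age in range(max_age + 1) for sex in ("F", "M")]
    candidates = combinations_with_replacement(people, stats["total"])
    return [block for block in candidates if consistent(block, stats)]

if __name__ == "__main__":
    for block in reconstruct(PUBLISHED):
        print(block)
```

For these invented numbers the search leaves exactly one possibility: two women aged 22 and 30 and a man aged 80. Five harmless-looking statistics pin down all three records. Brute force only works at this toy scale, and an attack on real tables would need far more efficient techniques.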
The Census Bureau has been an early adopter of differential privacy. Still, instituting the framework on such a large scale is not an easy task, and even some of the big technology companies have had difficulties.
One professor wrote in a paper that, with the number of private services offering personal data, a prospective hacker would have little incentive to turn to public data such as the census “in an attempt to uncover uncertain, imprecise and outdated information about a particular individual.”
Americans are urged to cooperate with census takers as they go about their work so that the nation can have a highly accurate portrait of the people who live in the United States.