Grants

Urban Institute

To build synthetic tax datasets for use in social science research

  • Amount: $500,000
  • City: Washington, DC
  • Investigator: Claire Bowen
  • Initiative: Empirical Economic Research Enablers (EERE)
  • Year: 2022
  • Program: Research
  • Sub-program: Economics

While tax data is highly sought after by social scientists, it is costly, sensitive, and difficult to access. The IRS has historically released public-use files (privacy-protected databases of sampled individual income tax returns) but has stopped producing them because of their high cost and high vulnerability to re-identification attacks. This grant provides ongoing support for Claire Bowen at the Urban Institute, who is working with the IRS to develop synthetic versions of individual income tax return data. Synthetic data has mathematical and statistical properties similar to those of the real data but contains almost no private information from the original dataset. Grant funds will allow Bowen to continue developing two synthetic datasets, making substantial methodological improvements and exploring the application of differential privacy methods to assess the privacy attributes of this methodology. In addition, Bowen will make open-source code available on GitHub, document the methodology for use by other agencies, and disseminate the work through a white paper, blog posts, presentations, and journal articles.
