Objectives
- Propose, calculate, and showcase a first-of-its-kind Housing Loss Index to visualize the scale and breadth of housing instability and displacement across the United States. The index ranks U.S. counties by their combined eviction and foreclosure rates, illustrating the severity of housing loss over the three-year period from 2014 to 2016
- Create a reproducible data pipeline for cleaning and consolidating eviction, mortgage foreclosure, tax lien, and U.S. Census data for the nation and for three “Deep Dive” sites, counties where the Future of Property Rights Program maintains connections with on-the-ground experts and local officials (Forsyth, NC; Marion, IN; and Maricopa, AZ)
- Perform detailed exploratory data analysis on the assembled dataset; generate charts, maps, and statistics related to data quality issues; examine relationships between socioeconomic factors and housing loss; explore national housing loss patterns and census-tract-level patterns for the three deep-dive locations
- Create final, processed datasets combining available data to be used to generate compelling maps that will help policymakers understand housing loss in their own communities
Findings
- Compiled housing loss index showcases the proportion by which renters and homeowners are impacted by these issues and can serve as a guideline for policymakers as to where these problems are most severe
- Detected from correlation analysis that counties with predominantly non-white households see higher rates of evictions and overall housing loss than those with predominantly white households. Households below the poverty line have the strongest relationships with eviction rate; and in the three deep-dive locations, census tracts where residents lacked health insurance and census tracts in which more residents took public transit to work had higher rates of housing loss. The next phase of the project will explore these relationships in greater detail
- Detected that some counties, and even entire states, are ‘data deserts’ with little or no data on evictions, mortgage foreclosures, or both. This is significant because it indicates a lack of systematic collection and storage of this type of data, a lack of transparency about the issue, or both, and it means the true extent of housing loss in these areas is underrepresented
Question
Housing insecurity is a looming crisis in the United States: nearly 5 million Americans lose their homes through eviction and foreclosure. The onslaught of COVID-19 and subsequent economic depression is leaving many without a sustainable plan to cover rent or mortgages. Though its ultimate impact will take years to clearly understand, we know that the loss of property and forced displacements are intensely traumatic financially, physically, and emotionally.
The Future of Property Rights Program (FPR) at New America aims to help solve today’s property rights challenges by shrinking the gulf between technologists and policymakers. The Property Loss in America project shines a light on this deep and pervasive issue by communicating the extent of property insecurity and loss across the country and telling the stories of those affected. This project is but one component of FPR’s larger strategy focusing principally on extracting insights from the novel data sources for eviction and mortgage foreclosure coupled with available census data. This overall initiative strives to present a systematic and comprehensive view of the ways in which people lose their homes. Where is forced displacement most acute? Why does housing loss occur? Who is most at risk? And what happens to people after they lose their homes? By explaining the mechanisms of displacement, FPR helps elevate this multifaceted problem and provide data-driven resources to municipal leaders so they may better understand where the pandemic might exacerbate already established patterns of housing loss.
What Happened

Source: Eviction Lab & ATTOM
Led by Data Ambassador Dona Stewart, the DataKind team of data scientists included Alice Feng, Anurag Gandhi, Diana Lam, Shreya Vaidyanathan, and Dominic White. They analyzed four geographic regions: the United States (referred to as the national level) and three county ‘deep dives’: Forsyth County (NC), Marion County (IN), and Maricopa County (AZ). The key difference between the national and deep-dive sites was the granularity and availability of data. At the national level, the team had access to county-level data on evictions and mortgage foreclosures. In the deep-dive sites, the team had access to compiled data at the census tract level and, in some cases, more detailed property information such as the reason for foreclosure (for example, a “tax lien foreclosure”).
Diving into the data, the team adopted a cross-functional and hands-on approach to tackle this project. First, they began with an exploratory data analysis of all the information that was available to understand the data and showcase trends and insights at the National level and in specific deep-dive locations. One initial finding was that data availability varies widely and many locations did not have both foreclosure and eviction data available – in fact, only two-thirds of counties in the United States had data on both types of loss. The lack of data not only makes it difficult to assess the true severity of housing loss in those areas but it leaves out communities in need of aid. One immediate outcome of this work was that the FPR team was able to meet with local leaders from the areas affected by severe housing loss, and consequently, help them allocate CARES funding for COVID-19 related housing disruption.
Next, the team focused on prototyping maps and charts that present a clear picture of these insights. Following DataKind’s design principle of intense co-creation with partners, the team shared all intermediate observations and analysis with FPR to receive timely feedback and posed questions to the domain experts on any quirks or anomalies spotted in the data. They ran correlation analyses to find key trends and relationships across the datasets. After this, they were able to present a holistic view of the three metrics – eviction rate, foreclosure rate, and overall housing loss rate – together in all the geographic areas of interest.
Once they had showcased the trends and aligned on an understanding of the data, the team’s next goal was to develop a first-of-its-kind national housing loss index, a single metric that could be used to compare the severity of an individual county’s housing loss to the national average. This approach first combines evictions and foreclosures into a total housing loss rate, then creates an index by comparing a county’s housing loss rate to the national housing loss rate. It has several benefits: it’s intuitively understandable, it makes counties easy to compare to each other, and it eliminates the need to consider population separately when analyzing housing loss. For example, a county in which 2,000 out of 10,000 families lose their homes will be ranked higher in the index than a county with 20,000 losses out of a population of two million. Ultimately, by identifying and examining which places have traditionally experienced the most acute housing loss, we can predict where future housing loss will occur and who will be impacted, and direct resources to prevent the harm before it proliferates.
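To make the index idea concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the team's actual implementation: it assumes the index is the simple ratio of a county's housing loss rate to a pooled national rate, and it uses a single household count as the denominator for both evictions and foreclosures. The `County` class, the function names, and the eviction/foreclosure split in the sample figures are all hypothetical; only the totals (2,000 of 10,000 and 20,000 of two million) come from the example above.

```python
from dataclasses import dataclass


@dataclass
class County:
    name: str
    households: int    # assumed common denominator (a simplification)
    evictions: int
    foreclosures: int


def housing_loss_rate(county: County) -> float:
    """Combined evictions and foreclosures per household."""
    return (county.evictions + county.foreclosures) / county.households


def housing_loss_index(counties: list[County]) -> dict[str, float]:
    """Ratio of each county's loss rate to the pooled national rate.

    Assumption: the national rate is total losses divided by total
    households across all counties supplied.
    """
    total_losses = sum(c.evictions + c.foreclosures for c in counties)
    total_households = sum(c.households for c in counties)
    national_rate = total_losses / total_households
    return {c.name: housing_loss_rate(c) / national_rate for c in counties}


# Illustrative figures only; the eviction/foreclosure split is made up.
counties = [
    County("County A", households=10_000, evictions=1_500, foreclosures=500),
    County("County B", households=2_000_000, evictions=12_000, foreclosures=8_000),
]
index = housing_loss_index(counties)
ranking = sorted(index, key=index.get, reverse=True)
```

Under these assumptions an index above 1 means a county loses homes at a higher rate than the nation as a whole, which is why County A ranks first despite having far fewer total losses than County B.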
In addition to developing a housing loss index, they performed exploratory correlation analyses for both the national data and the deep-dive sites in order to understand which demographic or socioeconomic variables from the American Community Survey 5-year estimates might exhibit strong relationships with housing loss. Counties or tracts with high proportions of non-white residents, high proportions of residents without health insurance, and high rent burden tended to be associated with increased evictions and foreclosures. Households below the poverty line have the strongest relationships with eviction rate; and in the three deep-dive locations, census tracts where residents lacked health insurance and census tracts in which more residents took public transit to work had higher rates of housing loss. The next phase of the project will explore these relationships in greater detail.
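As a rough illustration of this kind of exploratory screening, the sketch below ranks a few variables by the strength of their Pearson correlation with a housing loss rate. It assumes Pearson correlation was the measure used, which the write-up does not specify. The variable names and every number are made up for demonstration and are not drawn from the project's data.

```python
import math


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def rank_by_correlation(features: dict[str, list[float]],
                        target: list[float]) -> list[tuple[str, float]]:
    """Sort variables by absolute correlation with the target, strongest first."""
    scores = {name: pearson(values, target) for name, values in features.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical tract-level values, one entry per census tract.
tract_features = {
    "pct_uninsured": [5.0, 10.0, 15.0, 20.0],
    "pct_public_transit": [2.0, 3.0, 9.0, 4.0],
}
tract_loss_rate = [0.01, 0.02, 0.03, 0.04]

ranked = rank_by_correlation(tract_features, tract_loss_rate)
```

A ranking like this only flags associations worth a closer look; it says nothing about cause, which is consistent with the plan to explore these relationships in more detail in the next phase.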
Finally, they set up a robust and consistent pipeline for ingesting the data and producing the metrics that FPR required. Data engineering work such as this enables FPR to augment this analysis with new data in the future, and continue to expand their important work.
The team also documented this work in a user guide and final report so that end users can run the pipeline and regenerate the processed datasets and metrics once new raw datasets become available.

Source: New America
What’s Next
- In June 2020, FPR presented key results from the Marion, IN analysis to the Deputy Mayor of Indianapolis, who used the findings to help allocate housing assistance from the city’s CARES relief fund
- In September 2020, FPR also presented some local findings to Corey Woods, the mayor of Tempe, AZ – a city experiencing larger than average housing loss – in an effort to help distribute relief to residents in need.
- FPR presented their findings to the Office of Phoenix Mayor Kate Gallego and Office of the Arizona Governor Doug Ducey.
- In September 2020, FPR released its Displaced in America report, which includes maps and analysis of housing loss—via evictions and mortgage foreclosures—across the United States
DataKind and FPR intend to build toward a national dataset and public good that produces census-tract-level housing loss maps in as many counties as possible, and to create an accessible tool that would allow counties across the U.S. to generate similar insights.
Read More
New America Weekly: Displaced in America
Displaced in America: In the News
‘Groundbreaking’ U.S. housing data hailed as new tool to target COVID-19 aid
There’s a Looming Eviction Crisis, and We Have No Idea How Bad It Will Be



