Photo Credit: Alison Bert for Elsevier
When a natural disaster strikes, knowing where people and buildings are is of the utmost importance for saving lives. The World Bank Global Facility for Disaster Reduction and Recovery (GFDRR) and DataKind teamed up to understand how satellite imagery and machine learning might aid disaster risk management and improve resilience for vulnerable communities. DataKind’s DataCorps team of pro bono data scientists completed an assessment of current literature on how satellite imagery could improve disaster relief efforts using image analysis and convolutional neural networks, and then made recommendations for how the GFDRR could scale efforts using their imagery in Sri Lanka to build and implement detection models.
The GFDRR, established in 2006, is a multi-donor partnership and grant-making financing mechanism. The team supports on-the-ground technical assistance to help developing countries integrate disaster risk management (DRM) and climate change adaptation into development strategies, policies and investment programs, including post-disaster recovery and reconstruction. Broadly, the GFDRR is tasked with rapidly responding to natural disasters and protecting vulnerable populations from harm. As you can imagine, they have a slew of obstacles to helping folks in the uncertain moments that surround a natural event; information flows in from numerous sources, the landscape changes unexpectedly, and all needs are suddenly urgent.
The GFDRR wondered if machine learning and artificial intelligence could help speed up their work, make it more effective, and thus save more lives. The big “ask” was “is it possible to identify where people are at risk in a natural disaster, as measured through the locations and types of buildings in various countries?” If possible, this approach would assist the team in understanding where buildings exist and, potentially, the building type. Knowing building location and type provides GFDRR with a population estimate so they know where to distribute resources in the case of a disaster. For example, if a typhoon strikes during the day, additional resources would be deployed to schools and business centers.
The first step in solving this problem was to find relevant and available data, and the team was excited to find numerous non-proprietary data sets that could be used in initial models to detect buildings. A common tool is the USGS EarthExplorer. Within EarthExplorer, forty years of LANDSAT imagery (30m resolution) are available. The Copernicus mission images (10m) are also available from the same source. The team was able to identify additional resources that are either open access or free for researchers/NGOs, shown in Table 1. Additional labeled data resources identified were the SpaceNet Challenge data, the Urban Environment dataset, and the Urban Atlas project. All are viable resources for model training data. With these data sets in hand, the DataCorps team was able to dig into possible models.
The delivered literature review document provides an example application of these data sets, with code examples developed using the Keras deep learning library. The document also provides an overview of data science applications of satellite imagery in academia. It showcases the ideas behind models that can tell us what is in an image and where in the image an object is, and offers context for understanding the vast array of available blog posts, tutorials and MOOCs on image detection.
The team next developed recommendations explaining how the GFDRR could build a detection model for Sri Lanka using summer 2018 proprietary imagery provided by the World Bank. This document was created so that a project manager could source a team with the appropriate skills, pre-process the data, execute the detector model, tabulate building counts by geography, and calculate the accuracy of the model. The World Bank now has a roadmap to build a detection model to improve their disaster management efforts.
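To make the last two roadmap steps concrete, here is a minimal sketch of tabulating building counts by geography and scoring a detector. It is not taken from the delivered reports: the function names, the region labels, and the sample numbers are all hypothetical, and precision and recall are assumed here as one reasonable way to express "accuracy" for a detection model.

```python
from collections import Counter


def count_by_region(detections):
    """Count detected buildings per region.

    Each detection is a (region, building_type) pair; this layout is an
    assumption made for illustration only.
    """
    return Counter(region for region, _ in detections)


def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) from detection outcome counts."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


# Made-up detector output, not real project data.
detections = [
    ("Colombo", "school"),
    ("Colombo", "residential"),
    ("Galle", "residential"),
]
counts = count_by_region(detections)

# Made-up evaluation counts against hand-labeled buildings.
precision, recall = precision_recall(
    true_positives=8, false_positives=2, false_negatives=2
)

for region, n in sorted(counts.items()):
    print(f"{region}: {n} buildings")
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Reporting precision and recall separately shows whether the model tends to miss buildings or to invent them, which matters when the counts feed a population estimate.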
A key highlight is the data processing step, which is essential for any machine learning project but often overlooked. The team highlighted a transformation process to standardize file size, name, and spatial extent for decision-making. An example of this is shown in Figure 1 below. The overall process could be applied to any data set, allowing the World Bank to follow a template for model development across geographies rather than customizing the approach for each individual risk assessment team. An efficiency win!
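As a rough illustration of standardizing spatial extent and file names, the sketch below splits a bounding box into fixed-size square tiles and gives each a predictable name. The function name, the naming pattern, and the coordinates are hypothetical, not the team's actual process, and a real pipeline would also need a geospatial library to cut and write the imagery itself.

```python
import math


def tile_grid(min_x, min_y, max_x, max_y, tile_size):
    """Return (file_name, bounds) for square tiles covering an extent.

    Bounds are (min_x, min_y, max_x, max_y) in the same units as the
    inputs. Tiles on the far edges may extend past the original extent
    so that every tile has the same size.
    """
    n_cols = math.ceil((max_x - min_x) / tile_size)
    n_rows = math.ceil((max_y - min_y) / tile_size)
    tiles = []
    for row in range(n_rows):
        for col in range(n_cols):
            x0 = min_x + col * tile_size
            y0 = min_y + row * tile_size
            # Hypothetical naming pattern: zero-padded row and column index.
            name = f"tile_r{row:03d}_c{col:03d}.tif"
            tiles.append((name, (x0, y0, x0 + tile_size, y0 + tile_size)))
    return tiles


# Illustrative extent in arbitrary units: 1000 wide, 500 tall, 250-unit tiles.
tiles = tile_grid(0, 0, 1000, 500, 250)
for name, bounds in tiles:
    print(name, bounds)
```

Because every tile has the same size and a name derived from its grid position, the same downstream code can be pointed at imagery from a different geography without changes.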
The DataCorps team collected and delivered open source datasets, developed a roadmap to create a detection model, and proved that building detection for risk assessment is, in fact, possible! Additionally, the broad exploration that the team engaged in shows the value of transfer learning across geographies, as well as the possibility of applying it directly to a dataset and geography of interest in Sri Lanka.
As the first machine learning exploration for GFDRR, the outcomes have encouraged the partner team to continue to develop the prototype models and continue to expand their knowledge of the possibilities of how satellite and drone imagery could help them save lives.
The reports are accessible here. Please note they are living documents and will undergo minor changes. All version histories are tracked within GitHub.
Thank you to our partner the Elsevier Foundation for providing the funding to make this critical work possible!