Cambridge, Massachusetts - Projects that target aid toward villages and rural areas in the developing world often face time-consuming challenges, even at the most basic level: figuring out the most appropriate sites for pilot programs or for deploying new systems, such as solar power for regions that have no access to electricity. Often, even the sizes and locations of villages are poorly mapped, so lengthy field studies are needed to locate suitable sites.

Now, a team of graduate students at MIT and a social-service group of data scientists have come up with a way of automating parts of that evaluation process. They have developed software that can identify houses, and even types of houses, from readily available satellite imagery, potentially saving considerable time that would otherwise be spent sending teams from village to village. Their findings have now been published in the journal Big Data.

The multidisciplinary team came together in the course of discussions at MIT’s Sidney Pacific graduate dormitory, explains team member Brian Spatocco: “We started talking about this problem, and we realized we all had skills that were relevant.” The team’s proposal gained them a $10,000 prize last year from the MIT IDEAS Global Challenge, which helped get the project rolling and enabled team members to visit rural areas in India last summer to test their image-processing system against conditions on the ground.

The group, which grew to include an MIT alumnus at New York-based DataKind DataCorps and another researcher there, focused on two initial projects, in India and Africa, though they stress that their software solutions could be applied to many other kinds of projects and other regions.

Selecting villages for aid

The first project was to select villages in sub-Saharan Africa for a program of unrestricted cash grants to help people in low-income rural areas improve their standard of living by enabling them to buy equipment, livestock, or whatever they felt best met their needs. The system adopted by the grant-giving agency was to target the poorest villages, selected by calculating the percentage of houses with thatched roofs compared with those topped by more expensive metal roofs. Until now, that task had been carried out by fieldworkers on the ground.

The second project was selecting villages in rural parts of India for installation of microgrids to supply electricity from solar panels and battery-storage systems, and then figuring out the optimum sites for those panels and the most efficient network configuration for distributing that power.

In both cases, the key first element is automating the task of figuring out where the buildings are within a satellite image. For this research, the team used two kinds of satellite imagery: Google Earth imagery, which has three color “channels” (red, green, and blue), and commercial satellite imagery that also includes a near-infrared channel, which provides additional information for detecting vegetation and other features.
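One widely used way to turn a near-infrared channel into a vegetation signal is the normalized difference vegetation index (NDVI), which compares near-infrared and red reflectance. The article does not say whether the team used NDVI, so the following is only an illustrative sketch; the pixel values and the threshold are made up for the example.

```python
def ndvi(red, nir):
    """Normalized difference vegetation index for one pixel.

    red and nir are reflectance values in the same units.
    The result lies in [-1, 1]; higher values suggest denser vegetation.
    """
    total = nir + red
    if total == 0:
        return 0.0  # avoid division by zero on fully dark pixels
    return (nir - red) / total


# Hypothetical pixels as (red, green, blue, near_infrared) reflectances.
pixels = {
    "crop field": (0.05, 0.08, 0.04, 0.50),
    "metal roof": (0.60, 0.60, 0.62, 0.58),
    "bare soil": (0.30, 0.25, 0.20, 0.35),
}

VEGETATION_THRESHOLD = 0.4  # illustrative cutoff, not taken from the paper

# True where the pixel looks like vegetation under this assumed threshold.
labels = {}
for name, (red, _green, _blue, nir) in pixels.items():
    labels[name] = ndvi(red, nir) > VEGETATION_THRESHOLD
```

With only red, green, and blue channels, this index cannot be computed, which is one reason the extra channel can help separate vegetation from rooftops and soil.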

Identifying structures

The process begins by having people examine the satellite images visually and pick out the houses. These manually selected examples are then entered as training data for a machine-learning system that attempts to generalize the criteria for determining what is a house and what isn’t, and then “can try to predict, in a new image,” where the houses are, says George Chen, a co-author of the paper. One of the challenges, he explains, is that it’s not always clear whether a given structure is two houses close together or two parts of the same building. In other cases, “the house color is similar to the ground color,” though that’s less common, he says.
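The train-then-predict workflow described above can be illustrated with a very simple classifier. The article does not name the team's learning method, so this sketch swaps in a nearest-centroid classifier purely for illustration; the feature choice (mean red, green, and blue per image patch) and all values are hypothetical.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return tuple(sum(v[i] for v in vectors) / n for i in range(len(vectors[0])))


def train(examples):
    """Compute one centroid per label from (features, label) pairs."""
    by_label = {}
    for features, label in examples:
        by_label.setdefault(label, []).append(features)
    return {label: centroid(vs) for label, vs in by_label.items()}


def predict(model, features):
    """Return the label whose centroid is closest to the feature vector."""
    return min(model, key=lambda label: math.dist(model[label], features))


# Hypothetical hand-labeled patches: mean (red, green, blue) per patch.
training = [
    ((0.70, 0.70, 0.72), "house"),
    ((0.65, 0.66, 0.70), "house"),
    ((0.20, 0.45, 0.15), "not_house"),
    ((0.35, 0.30, 0.22), "not_house"),
]

model = train(training)

# Classify a patch from a new image that was not in the training set.
new_patch_label = predict(model, (0.68, 0.69, 0.70))
```

A classifier this simple would struggle with the hard cases Chen mentions, such as houses whose color resembles the ground, which is part of why more labeled examples and richer features matter.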

But as more examples get processed by the system, “over time, the computers can learn from the hand-picked set” and get better at figuring out where the houses are, Spatocco says. Then, in the case of the African aid project, which is currently making unconditional cash transfers in villages in Kenya and Uganda, an additional step is used to distinguish houses that have metal roofs, which are much more reflective than thatched ones.
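Because metal roofs reflect much more light than thatched ones, a brightness cutoff is one plausible way to separate the two and then rank villages by their share of thatched roofs. The sketch below assumes that approach; the threshold, village names, and brightness values are hypothetical and are not taken from the paper.

```python
def roof_type(mean_brightness, threshold=0.5):
    """Label a roof 'metal' if it is brighter than the threshold, else 'thatch'."""
    return "metal" if mean_brightness > threshold else "thatch"


def thatch_share(roof_brightnesses, threshold=0.5):
    """Fraction of detected roofs in a village classified as thatch."""
    if not roof_brightnesses:
        return 0.0
    labels = [roof_type(b, threshold) for b in roof_brightnesses]
    return labels.count("thatch") / len(labels)


# Hypothetical villages: mean brightness (0 to 1) of each detected roof.
villages = {
    "village_a": [0.82, 0.30, 0.25, 0.28],
    "village_b": [0.75, 0.80, 0.35, 0.90],
    "village_c": [0.20, 0.22, 0.31, 0.27],
}

# Rank villages from highest to lowest thatch share.
ranking = sorted(villages, key=lambda v: thatch_share(villages[v]), reverse=True)
```

Under these made-up numbers, the village with the highest thatch share comes first in `ranking`, mirroring the agency's rule of targeting villages with the most thatched roofs.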

In the project for installing microgrids in India, once the locations of houses are determined, the computer runs thousands of different variations of where solar panels, battery packs, and distribution wires could be located. This allows the team to pick the configurations that can provide power to the greatest number of houses with the least wiring needed, to minimize the costs. The program can also select configurations based on other local criteria, such as a village that specifically wants its solar panels in a particular location.
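The idea of trying many layouts and keeping the one that serves the most houses with the least wire can be sketched as a small search. This is a simplified illustration, not the team's method: it assumes a single panel site, direct wire runs from that site to each house, a hypothetical maximum run length, and made-up coordinates, and it ignores batteries and network topology.

```python
import math
from itertools import product


def evaluate(site, houses, max_wire):
    """Return (houses_served, total_wire) for direct runs from site to houses.

    A house counts as served only if its run to the site is within max_wire.
    """
    runs = [math.dist(site, h) for h in houses]
    served = [r for r in runs if r <= max_wire]
    return len(served), sum(served)


def best_site(houses, candidates, max_wire):
    """Pick the candidate serving the most houses, breaking ties by least wire."""
    def score(site):
        served, wire = evaluate(site, houses, max_wire)
        return (-served, wire)
    return min(candidates, key=score)


# Hypothetical house coordinates in meters.
houses = [(10, 10), (20, 10), (10, 20), (20, 20), (90, 90)]

# Candidate panel sites on a 10 m grid over a 100 m x 100 m area.
candidates = list(product(range(0, 101, 10), repeat=2))

site = best_site(houses, candidates, max_wire=30)
served, wire = evaluate(site, houses, max_wire=30)
```

A local preference, such as a village wanting its panels in a particular spot, could be modeled here by restricting `candidates` to the acceptable locations before running the search.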

The team says that the general algorithms they’ve developed could have many other uses beyond the two specific projects they initially tested. For example, there is little data on demographic changes in India, in terms of which areas have gained or lost population and by how much, and Spatocco says “this could be an extremely powerful tool” for analyzing those population shifts by automated tracking of where houses are and how that changes over time. “It could answer deep questions about these demographic dynamics,” he adds.

As the project continues, four villages will be selected in India for the next phase of testing: Two will have solar microgrids installed using existing methods, and two will have them installed using the patterns selected by the software. These villages, selected to be as closely matched as possible, will then be compared over time for the actual costs and performance of the systems, to determine exactly how much benefit can be gained from the new approach.

“We're hoping that public agencies eventually see the wisdom of mapping 100 million rural households in developing countries,” says Stewart Craine, chair of the UN Foundation’s mapping group and head of, a company that offers satellite-based mapping services for development organizations, but was not associated with this project. “Preliminary mapping can reduce wasting expensive field-time mapping households, and spend more on village discussions and fine-tuning of the preliminary desktop design,” he explains. Overall, he says, this paper is “an excellent contribution” to the field.

The team also included MIT graduate students Kendall Nowocin, Vivek Sakhrani, and Ling Xu, as well as Kush Varshney SM ’06, PhD ’10, and Brian Abelson, both of DataKind DataCorps. The team worked with Bangalore-based nonprofit SELCO Foundation, which is carrying out the microgrid installations in India, and with GiveDirectly for the cash transfer program in Africa. The research received mentorship and support from the Tata Center for Technology and Design at MIT, which is part of the MIT Energy Initiative.