It sounds like a monumental task: take the 164 million photos of America’s roads and neighborhoods captured for Google’s Street View and identify in each picture environmental characteristics such as the types of buildings, roads, and sidewalks.
It is certainly impossible to do by hand, but not for a computer. So it was the task of University of Utah electrical and computer engineering professor Tolga Tasdizen and his team to develop a computer vision and machine learning model to perform the job automatically. They created the model a couple of years ago to help understand which areas of the U.S. have a prevalence of issues such as obesity and depression. Now it has been used in a new project to identify the correlation between COVID-19 infection rates and the environmental attributes of an area.
The results of a study that examined the infection rates based on the environmental makeup of each zip code were published this month in the International Journal of Environmental Research and Public Health. The study was authored by University of Maryland epidemiology and biostatistics assistant professor Quynh Nguyen and Tasdizen.
While Tasdizen and Nguyen are careful to note that this study does not show causation between coronavirus infection rates and the makeup of an area, it does reveal interesting correlations. For example, a zip code with more sidewalks had 40% more COVID-19 cases, according to the study. Zip codes with more multi-family housing structures had 21% more virus cases. Meanwhile, areas with more single-lane roads had fewer COVID-19 cases, as did areas that were greener and had more trees. The model also identified zip codes with more dilapidated structures and more telephone and power lines, which indicate more densely populated areas. Those regions had a slightly higher number of virus cases.
“We found that indicators like greater urban development (mixture of residential and commercial buildings, multiple lanes of traffic), walkability (which may increase contact), and greater physical disorder were related to more coronavirus cases,” the paper stated. “Our study results can help inform population-based strategies to mitigate COVID-19 risk. A higher level of caution can be recommended for the reopening of communities with a heightened level of risk due to their neighborhood design.”
So how did Tasdizen get computers to do in a fraction of the time what would have taken humans years to do by hand? Unfortunately, it did involve some manual data entry.
A team of graduate students took a sample of 18,000 Google Street View images from cities such as Chicago and Salt Lake City and identified the number of sidewalks, types of housing structures, roads and other indicators in each photo, a laborious job that took the students a couple of months. Google Street View images are a series of ground-level photos taken along just about every road in the United States that are digitally stitched together so users can see a 360-degree view from any spot on a road. The feature can be viewed on Google Maps.
From that initial data set, Tasdizen’s team formed a “training set” to develop the computer vision and machine learning model that could be used to automatically identify the same environmental traits in any Google Street View picture. The team then used another part of the data called the “validation set” to confirm the results of the model.
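The training and validation steps described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not the team's actual code: the function names, the 80/20 split, the made-up records, and the placeholder `predict` function are all assumptions introduced here.

```python
import random


def split_labeled(records, validation_fraction=0.2, seed=0):
    """Shuffle hand-labeled records and split them into training and validation lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * validation_fraction)
    return shuffled[n_val:], shuffled[:n_val]


def accuracy(predict, validation):
    """Fraction of validation records whose predicted label matches the human label."""
    correct = sum(1 for features, label in validation if predict(features) == label)
    return correct / len(validation)


# Made-up stand-in data: (feature, has_sidewalk) pairs, NOT real Street View labels.
records = [(i, i % 2 == 0) for i in range(100)]
train, validation = split_labeled(records)


# Placeholder "model" (hypothetical). A real system would fit a computer
# vision model on `train` and never look at `validation` while fitting.
def predict(feature):
    return feature % 2 == 0


val_accuracy = accuracy(predict, validation)
```

The point of holding back the validation records is that the model is scored only on images it did not see during training, which gives a fairer estimate of how it might behave on new photos.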
“Once we were comfortable with that, then we applied this to the 164 million Google Street View images sampled from the entire United States,” Tasdizen said. The resulting table of new U.S.-wide data was analyzed with COVID-19 infection rates obtained from state and county governments.
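As a rough sketch of how per-image predictions might be rolled up by zip code and compared with case rates, consider the Python below. It is not the study's statistical method: the zip code labels, the image flags, and the case rates are invented toy values, and a plain Pearson correlation stands in for whatever analysis the researchers actually ran.

```python
from collections import defaultdict
from math import sqrt


def share_by_zip(image_predictions):
    """Map each zip code to the fraction of its images flagged with a feature."""
    counts = defaultdict(lambda: [0, 0])  # zip code -> [flagged images, total images]
    for zip_code, flagged in image_predictions:
        counts[zip_code][0] += int(flagged)
        counts[zip_code][1] += 1
    return {z: flagged / total for z, (flagged, total) in counts.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Invented toy data: (zip code placeholder, model says "sidewalk present").
image_predictions = [
    ("A", True), ("A", True), ("A", False), ("A", True),
    ("B", True), ("B", False),
    ("C", False), ("C", False),
]
# Invented case rates per zip code placeholder, for illustration only.
case_rates = {"A": 30.0, "B": 20.0, "C": 10.0}

shares = share_by_zip(image_predictions)
zips = sorted(shares)
r = pearson([shares[z] for z in zips], [case_rates[z] for z in zips])
```

A coefficient like `r` describes only how two quantities move together across zip codes. As the researchers themselves stress, it says nothing about whether one causes the other.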
The model was created two years ago when Nguyen, an epidemiologist who used to work with the University of Utah, asked Tasdizen to create it to help identify areas with a higher incidence of obesity, depression, and other conditions. Earlier this year, she asked him to apply the model to COVID-19 cases for this new project. The research was funded by the National Library of Medicine of the National Institutes of Health.
Tasdizen hopes his machine learning and computer vision model can be used to examine other issues plaguing neighborhoods, such as socioeconomic-related problems and other viral outbreaks.