Google is giving the world a clearer glimpse of exactly how much it knows about people everywhere -- using the coronavirus crisis as an opportunity to repackage its persistent tracking of where users go and what they do as a public good in the midst of a pandemic.
In a blog post today, the tech giant announced the publication of what it's branding COVID-19 Community Mobility Reports. These are an in-house analysis of the much more granular location data it maps and tracks to fuel its ad-targeting, product development and wider commercial strategy, repackaged to showcase aggregated changes in population movements around the world.
The coronavirus pandemic has generated a worldwide scramble for tools and data to inform government responses. In the EU, for example, the European Commission has been leaning on telcos to hand over anonymized and aggregated location data to model the spread of COVID-19.
Google's data dump looks intended to dangle a similar idea of public policy utility while providing an eyeball-grabbing public snapshot of mobility shifts via data pulled from its global user base.
In terms of actual utility for policymakers, Google's suggestions are pretty vague. The reports could help government and public health officials "understand changes in essential trips that can shape recommendations on business hours or inform delivery service offerings," it writes.
"Similarly, persistent visits to transportation hubs might indicate the need to add additional buses or trains in order to allow people who need to travel room to spread out for social distancing," it goes on. "Ultimately, understanding not only whether people are traveling, but also trends in destinations, can help officials design guidance to protect public health and essential needs of communities."
The location data Google is making public is similarly fuzzy -- to avoid inviting a privacy storm -- with the company writing that it's using "the same world-class anonymization technology that we use in our products every day."
"For these reports, we use differential privacy, which adds artificial noise to our datasets enabling high quality results without identifying any individual person," Google writes. "The insights are created with aggregated, anonymized sets of data from users who have turned on the Location History setting, which is off by default."
"In Google Maps, we use aggregated, anonymized data showing how busy certain types of places are—helping identify when a local business tends to be the most crowded. We have heard from public health officials that this same type of aggregated, anonymized data could be helpful as they make critical decisions to combat COVID-19," it adds, tacitly linking an existing offering in Google Maps to a coronavirus-busting cause.
The reports consist of per country, or per state, downloads (with 131 countries covered initially), further broken down into regions/counties -- with Google offering an analysis of how community mobility has changed vs a baseline average before COVID-19 arrived to change everything.
So, for example, a March 29 report for the whole of the U.S. shows a 47% drop in retail and recreation activity vs the pre-coronavirus baseline period; a 22% drop in grocery & pharmacy; and a 19% drop in visits to parks and beaches, per Google's data.
The same-date report for California, meanwhile, shows a considerably greater drop in the latter (down 38% compared to the regional baseline), and slightly bigger decreases in both retail and recreation activity (down 50%) and grocery & pharmacy (down 24%).
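To make the arithmetic behind figures like these concrete, here is a minimal Python sketch of a percentage change against a baseline. It assumes the baseline is a median of earlier values for the same day of the week, which is not something the reports quoted here spell out, and the visit counts are hypothetical since Google does not publish absolute numbers.

```python
from statistics import median

def percent_change(current: float, baseline_values: list[float]) -> float:
    """Return the percentage change of `current` relative to the median baseline."""
    baseline = median(baseline_values)
    return 100.0 * (current - baseline) / baseline

# Hypothetical visit counts for the same weekday in the weeks before the pandemic.
baseline_sundays = [980, 1010, 1000, 1020, 990]
# Hypothetical visit count for the day being reported on.
current_sunday = 530

print(round(percent_change(current_sunday, baseline_sundays)))  # -47
```

A reading of -47 here means activity is 47% below the baseline, not that 47% of people stayed home, which is worth bearing in mind when comparing regions.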
Google says it's using "aggregated, anonymized data to chart movement trends over time by geography, across different high-level categories of places such as retail and recreation, groceries and pharmacies, parks, transit stations, workplaces, and residential." The trends are displayed over several weeks, with the most recent information representing 48-to-72 hours prior, it adds.
The company says it's not publishing the "absolute number of visits" as a privacy step, adding: "To protect people’s privacy, no personally identifiable information, like an individual’s location, contacts or movement, is made available at any point."
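For readers wondering what adding "artificial noise" to a dataset looks like in practice, below is a rough Python sketch of the Laplace mechanism, a textbook way of making a count differentially private. It is an illustration under stated assumptions, not Google's implementation: the function names, the `epsilon` values and the visit count are all hypothetical, and it assumes any one person can change a count by at most one.

```python
import random

def laplace_noise(scale: float, rng: random.Random) -> float:
    # The difference of two independent exponential draws with mean `scale`
    # follows a Laplace distribution centred on zero with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    # Assumption: one person changes the count by at most 1 (sensitivity of 1),
    # so the noise scale is 1 / epsilon. A smaller epsilon means more noise.
    return true_count + laplace_noise(1.0 / epsilon, rng)

rng = random.Random(0)  # seeded only so the sketch is repeatable
visits = 1200  # hypothetical number of visits to one category of place

for epsilon in (0.1, 1.0):
    print(epsilon, round(noisy_count(visits, epsilon, rng), 1))
```

With counts in the thousands the noise barely moves the result, but for small counts it can swamp the signal, so how much noise is added, and to what, matters a great deal for both privacy and accuracy.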
Google's location mobility report for Italy, which remains the European country hardest hit by the virus, illustrates the extent of the change from lockdown measures applied to the population -- with retail & recreation dropping 94% vs Google's baseline; grocery & pharmacy down 85%; and a 90% drop in trips to parks and beaches.
The same report shows an 87% drop in activity at transit stations; a 63% drop in activity at workplaces; and an increase of almost a quarter (24%) in activity at residential locations -- as many Italians stay at home instead of commuting to work.
It's a similar story in Spain -- another country hard-hit by COVID-19. Google's data for France, though, suggests instructions to stay at home may not be quite as keenly observed by its users there, with only an 18% increase in activity at residential locations and a 56% drop in activity at workplaces. (Perhaps because the pandemic has so far had a less severe impact on France, although numbers of confirmed cases and deaths continue to rise across the region.)
While policymakers have been scrambling for data and tools to inform their responses to COVID-19, privacy experts and civil liberties campaigners have rushed to voice concerns about the impacts of such data-fueled efforts on individual rights, while also querying the wider utility of some of this tracking.
And yes, the disclaimer is very broad. I'd say, this is largely a PR move.
Apart from this, Google must be held accountable for its many other secondary data uses. And Google/Alphabet is far too powerful, which must be addressed at several levels, soon. https://t.co/oksJgQAPAY
— Wolfie Christl (@WolfieChristl) April 3, 2020
Contact tracing is another area where apps are fast being touted as a potential solution to get the West out of economically crushing population lockdowns -- opening up the possibility of people's mobile devices becoming a tool to enforce lockdowns, as has happened in China.
"Large-scale collection of personal data can quickly lead to mass surveillance," is the succinct warning of a trio of academics from London's Imperial College's Computational Privacy Group, who have compiled their privacy concerns vis-a-vis COVID-19 contact-tracing apps into a set of eight questions app developers should be asking.
Discussing Google's release of mobile location data for a COVID-19 cause, the head of the group, Yves-Alexandre de Montjoye, gave a general thumbs up to the steps it's taken to shrink privacy risks, although he also called for Google to provide more detail about the technical processes it's using so that external researchers can better assess the robustness of the claimed privacy protections. Such scrutiny is of pressing importance with so much coronavirus-related data grabbing going on right now, he argues.
"It is all aggregated; they normalize to a specific set of dates; they threshold when there are too few people and on top of this they add noise to make -- according to them -- the data differentially private. So from a pure anonymization perspective it's good work," de Montjoye told TechCrunch, discussing the technical side of Google's release of location data. "Those are three of the big 'levers' that you can use to limit risk. And I think it's well done."
"But -- especially in times like this when there's a lot of people using data -- I think what we would have liked is more details. There's a lot of assumptions on thresholding, on how do you apply differential privacy, right?... What kind of assumptions are you making?" he added, querying how much noise Google is adding to the data, for example. "It would be good to have a bit more detail on how they applied [differential privacy]... Especially in times like this it is good to be... overly transparent."
While Google's mobility data release might appear to overlap in purpose with the Commission's call for EU telco metadata for COVID-19 tracking, de Montjoye points out there are likely to be key differences based on the different data sources.
"It's always a trade off between the two," he says. "It's basically telco data would probably be less fine-grained, because GPS is much more precise spatially and you might have more data points per person per day with GPS than what you get with mobile phone but on the other hand the carrier/telco data is much more representative -- it's not only smartphone, and it's not only people who have latitude on, it's everyone in the country, including non smartphone."
There may be country specific questions that could be better addressed by working with a local carrier, he also suggested. (The Commission has said it's intending to have one carrier per EU Member State providing anonymized and aggregated metadata.)
On the topical question of whether location data can ever be truly anonymized, de Montjoye -- an expert in data reidentification -- gave a "yes and no" response, arguing that original location data is "probably really, really hard to anonymize".
"Can you process this data and make the aggregate results anonymous? Probably, probably, probably yes -- it always depends. But then it also means that the original data exists... Then it's mostly a question of the controls you have in place to ensure the process that leads to generating those aggregates does not contain privacy risks," he added.
Perhaps a bigger question related to Google's location data dump is around the issue of legal consent to be tracking people in the first place.
While the tech giant claims the data is based on opt-ins to location tracking, the company was fined $57M by France's data watchdog last year for a lack of transparency over how it uses people's data.
Then, earlier this year, the Irish Data Protection Commission (DPC) -- now the lead privacy regulator for Google in Europe -- confirmed a formal probe of the company's location tracking activity, following a 2018 complaint by EU consumer groups which accuses Google of using manipulative tactics in order to keep tracking web users’ locations for ad-targeting purposes.
"The issues raised within the concerns relate to the legality of Google's processing of location data and the transparency surrounding that processing," said the DPC in a statement in February, announcing the investigation.
The legal questions hanging over Google's consent to track people likely explain the repeated references in its blog post to people choosing to opt in and having the ability to clear their Location History via settings. ("Users who have Location History turned on can choose to turn the setting off at any time from their Google Account, and can always delete Location History data directly from their Timeline," it writes in one example.)
In addition to offering up coronavirus mobility porn reports -- which Google specifies it will continue to do throughout the crisis -- the company says it's collaborating with "select epidemiologists working on COVID-19 with updates to an existing aggregate, anonymized dataset that can be used to better understand and forecast the pandemic."
"Data of this type has helped researchers look into predicting epidemics, plan urban and transit infrastructure, and understand people’s mobility and responses to conflict and natural disasters," it adds.