Android's Messages, Dialer apps quietly sent text, call info to Google
Hashed text, phone call logs collected without opt-out or specific notice
Updated Google's Messages and Dialer apps for Android devices have been collecting and sending data to Google without specific notice and consent, and without offering the opportunity to opt out, potentially in violation of Europe's data protection law.
According to a research paper, "What Data Do The Google Dialer and Messages Apps On Android Send to Google?" [PDF], by Trinity College Dublin computer science professor Douglas Leith, Google Messages (for text messaging) and Google Dialer (for phone calls) have been sending data about user communications to the Google Play Services Clearcut logger service and to Google's Firebase Analytics service.
"The data sent by Google Messages includes a hash of the message text, allowing linking of sender and receiver in a message exchange," the paper says. "The data sent by Google Dialer includes the call time and duration, again allowing linking of the two handsets engaged in a phone call. Phone numbers are also sent to Google."
The timing and duration of other user interactions with these apps have also been transmitted to Google. And Google offers no way to opt out of this data collection.
Google Messages (com.google.android.apps.messaging) is installed on over a billion Android handsets. It's offered by AT&T and T-Mobile on Android phones in the US and comes preloaded on recent handsets from Huawei, Samsung, and Xiaomi. Google Dialer (also known as Phone by Google, com.google.android.dialer) has the same reach.
The pre-installed versions of both apps, the paper observes, lack app-specific privacy policies that explain what data gets collected – something Google requires from third-party developers. And when a request was made through Google Takeout for the Google Account data associated with the apps used for testing, the data Google provided did not include the telemetry data observed.
From the Messages app, Google takes the message content and a timestamp and generates a SHA256 hash, the output of an algorithm that maps the human-readable content to a fixed-length alphanumeric digest. It then transmits a portion of that hash, specifically a truncated 128-bit value, to Google's Clearcut logger and Firebase Analytics.
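To make the mechanism concrete, here is a minimal Python sketch of a truncated, timestamped message hash. The paper as reported here does not spell out how the message and timestamp are combined, so the `"<hour>:<message>"` layout, the hourly rounding, and the function name `truncated_message_hash` are all assumptions for illustration, not Google's actual scheme.

```python
import hashlib
from datetime import datetime, timezone


def truncated_message_hash(message: str, sent_at: datetime) -> str:
    """Return the first 128 bits of SHA-256 over a timestamp and message.

    ASSUMPTION: the timestamp is reduced to whole hours since the Unix
    epoch and joined to the text as "<hour>:<message>". The real input
    layout used by the Messages app is not described in the article.
    """
    hour_bucket = int(sent_at.timestamp()) // 3600
    payload = f"{hour_bucket}:{message}".encode("utf-8")
    digest = hashlib.sha256(payload).digest()  # 32 bytes (256 bits)
    return digest[:16].hex()                   # keep 16 bytes (128 bits)


if __name__ == "__main__":
    when = datetime(2022, 3, 21, 10, 5, tzinfo=timezone.utc)
    # Sender and receiver handsets hashing the same text in the same hour
    # would log the same value, which is what makes the two linkable.
    print(truncated_message_hash("See you at 8", when))
```

Under these assumptions the hash is deterministic: any two handsets that process the same text within the same hour produce an identical value, which is the linking property the paper describes.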
Hashes are designed to be difficult to reverse, but in the case of short messages, Leith said he believes some of these could be undone to recover some of the message content.
"I’m told by colleagues that yes, in principle this is likely to be possible," Leith said in an email to The Register today. "The hash includes a hourly timestamp, so it would involve generating hashes for all combinations of timestamps and target messages and comparing these against the observed hash for a match – feasible I think for short messages given modern compute power."
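The attack Leith outlines is essentially a dictionary search: hash every candidate message against every plausible hourly timestamp and look for a match. The Python sketch below illustrates that idea only. The input layout (`"<hour>:<message>"`), the helper names `hash_candidate` and `recover_message`, and the tiny candidate list are hypothetical; a real attempt would need the app's exact hash input format and a far larger set of candidate messages.

```python
import hashlib
from typing import Iterable, Optional, Tuple


def hash_candidate(message: str, hour_bucket: int) -> str:
    """First 128 bits of SHA-256 over an ASSUMED "<hour>:<message>" layout."""
    payload = f"{hour_bucket}:{message}".encode("utf-8")
    return hashlib.sha256(payload).digest()[:16].hex()


def recover_message(
    observed: str,
    candidates: Iterable[str],
    first_hour: int,
    last_hour: int,
) -> Optional[Tuple[str, int]]:
    """Try every (hour, candidate) pair and return the first match, if any."""
    candidate_list = list(candidates)  # allow re-iteration for each hour
    for hour_bucket in range(first_hour, last_hour + 1):
        for message in candidate_list:
            if hash_candidate(message, hour_bucket) == observed:
                return message, hour_bucket
    return None


if __name__ == "__main__":
    # Pretend this value was seen in a telemetry log.
    observed = hash_candidate("ok", 450000)
    guesses = ["yes", "no", "ok", "on my way"]
    print(recover_message(observed, guesses, 449990, 450010))
```

The cost grows with the number of candidate messages multiplied by the number of hours searched, which is why the approach is plausible for short, common messages and impractical for long or unusual ones.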
The Dialer app likewise logs incoming and outgoing calls, along with the time and the call duration.
As the paper states, Google Play Services discloses that some data gets collected for security and fraud prevention, to maintain Google Play Services APIs and core services, and to provide Google services like bookmark and contact syncing. It does not, however, detail or explain its collection of message content or of callers and call recipients. As the paper put it, "few details are given as to the actual data collected."
"I was surprised to see this data being collected by these Google apps," said Leith.
Leith disclosed his findings to Google last November and said he has had several conversations with Google's engineering director for Google Messages about suggested changes.
The paper describes nine recommendations made by Leith and six changes Google has already made or plans to make to address the concerns raised in the paper. The changes Google has agreed to include:
- Halting the collection of the sender phone number by the CARRIER_SERVICES log source, of the SIM ICCID, and of a hash of sent/received message text by Google Messages.
- Halting the logging of call-related events in Firebase Analytics from both Google Dialer and Messages.
- Shifting more telemetry data collection to use the least long-lived identifier available where possible, rather than linking it to a user's persistent Android ID.
- Making it clear when caller ID and spam protection is turned on and how it can be disabled, while also looking at ways to use less information or fuzzed information for safety functions.
Google confirmed to The Register on Monday that the paper's representations about its interactions with Leith are accurate. "We welcome partnerships – and feedback – from academics and researchers, including those at Trinity College," a Google spokesperson said. "We've worked constructively with that team to address their comments, and will continue to do so."
The paper raises questions about whether Google's apps comply with GDPR but cautions that legal conclusions are out of scope for what is a technical analysis. We asked Google whether it believes its apps meet GDPR obligations but we received no reply.
Leith said it's not clear whether Google's commitments fully address the concerns he has raised.
"In particular, they say they will introduce a toggle within the Messages app to allow users to opt out of data collection but that this opt out will not cover data that Google considers to be 'essential' i.e. they will continue to collect some data even when users opt out," he said. "In my tests I had already opted out of Google data collection by disabling the Google 'Usage and diagnostics' option in the handset Settings, and so the data I reported on was already judged to be somehow essential by Google. I think we’ll have to wait and see."
Leith said there are two larger matters related to Google Play Services, which is installed on almost all Android phones outside of China.
"The first is that the logging data sent by Google Play Services is tagged with the Google Android ID which can often be linked to a person’s real identity – so the data is not anonymous," he said. "The second is that we know very little about what data is being sent by Google Play Services, and for what purpose(s). This study is the first to cast some light on that, but it's very much just the tip of the iceberg." ®
Updated to add
In a follow-up comment two days after this story was published, a Google spokesperson said the data was collected for diagnostic purposes:
We're committed to compliance with Europe's privacy laws and apply strict privacy protections to data collected via our Dialer and Messages apps.
Both Dialer and Messages use limited amounts of data for highly specific purposes that allow us to diagnose and resolve product functionality issues and ensure message delivery is consistently reliable.
These technical logs are not – and were never – used for targeting ads and were protected by strict internal access controls. Phone numbers and hashed SMS related data within Messages were only used in technical logs to debug app service issues. Phone numbers that were not saved in a user's contact list are only used by Dialer to guard users against unwanted spam calls.