Your smart speaker data is used in ways that you might not expect
Umar Iqbal and collaborators developed a framework to measure data collection, sharing and use by smart speaker platforms
“Hey, Alexa, play the latest Taylor Swift album.” Smart speakers offer amazing convenience – from playing your favorite tunes to re-ordering toilet paper – with only a simple voice command. But that convenience can come with a steep cost in privacy that many consumers aren’t even aware they’re paying.
We’ve all had the uncanny experience of searching for something on the internet and then suddenly ads for that very thing are popping up everywhere we look online. It’s no coincidence, says Umar Iqbal, assistant professor of computer science & engineering in the McKelvey School of Engineering at Washington University in St. Louis.
“My collaborators and I uncovered that Amazon uses smart speaker interaction data to infer user interests and then uses those interests to target personalized ads to the user,” Iqbal said. “That’s something that Amazon was not upfront about before our research.”
The team presented its work Oct. 26, 2023, at the ACM Internet Measurement Conference in Montreal, where it received the best paper award. The researchers aim to show what information smart speakers capture, how it is shared with other parties and how those parties use it, so that consumers can better understand the privacy risks of these devices and the impact of data sharing on their online experiences.
To crack open the black box around smart devices and the data they capture, the research team built an auditing framework to measure the collection, usage and sharing of Amazon Echo interaction data. First, they created several personas with interests in specific categories, plus one control persona. Each persona interacted with a different Echo device. The researchers then measured data collection by intercepting network traffic and inferred data usage by observing the ads targeted to each persona on the web and on Echo devices.
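The persona-versus-control comparison can be illustrated with a minimal Python sketch. This is not the team's actual framework: the function name `ad_lift`, the ad category labels and the sample data are all hypothetical, and the sketch assumes the ads shown to each persona have already been collected and labeled by category. It only shows the basic idea of comparing how often an interest persona sees a category of ad with how often the control persona does.

```python
from collections import Counter


def ad_lift(persona_ads, control_ads):
    """For each ad category the persona saw, return the persona's share of
    ads in that category minus the control persona's share.

    Both arguments are lists of category labels, one entry per observed ad.
    A clearly positive value suggests ads in that category were shown to the
    persona more often than to the control (hypothetical metric).
    """
    persona_counts = Counter(persona_ads)
    control_counts = Counter(control_ads)
    persona_total = len(persona_ads) or 1  # avoid division by zero
    control_total = len(control_ads) or 1

    lift = {}
    for category, count in persona_counts.items():
        persona_share = count / persona_total
        control_share = control_counts.get(category, 0) / control_total
        lift[category] = persona_share - control_share
    return lift


# Hypothetical observations: a "fashion" persona and the control persona.
fashion_persona_ads = ["fashion", "fashion", "fashion", "travel"]
control_persona_ads = ["travel", "travel", "fashion", "insurance"]

lift = ad_lift(fashion_persona_ads, control_persona_ads)
for category, value in sorted(lift.items()):
    print(f"{category}: {value:+.2f}")
```

In this made-up example the fashion persona sees a much larger share of fashion ads than the control persona does, which is the kind of signal that would point to interest-based targeting. A real audit would also need enough observations and a statistical test to rule out chance.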
The team reported that as many as 41 advertisers sync or share their cookies – which are typically linked to personal information – with Amazon, and then those advertisers further sync their cookies with 247 other third parties, including advertising services.
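The two-step spread of cookie syncing described above can be pictured as a small graph problem. The Python sketch below is an illustration only, not the method used in the paper: the function `sync_reach`, the party names and the sample edges are hypothetical, and it assumes each observed cookie-sync event has already been reduced to a pair of parties. It counts the parties that sync directly with a platform and the further third parties those direct partners sync with.

```python
def sync_reach(sync_edges, platform):
    """Given observed cookie-sync pairs, return two sets:
    the parties that sync directly with `platform`, and the additional
    third parties that those direct partners sync with.

    sync_edges is an iterable of (party_a, party_b) tuples; the direction
    of the sync is ignored in this simplified sketch.
    """
    edges = list(sync_edges)

    # First hop: anyone observed syncing with the platform itself.
    direct = set()
    for a, b in edges:
        if a == platform:
            direct.add(b)
        elif b == platform:
            direct.add(a)

    # Second hop: parties that sync with a direct partner but are neither
    # the platform nor a direct partner themselves.
    downstream = set()
    for a, b in edges:
        if a in direct and b != platform and b not in direct:
            downstream.add(b)
        elif b in direct and a != platform and a not in direct:
            downstream.add(a)
    return direct, downstream


# Hypothetical sync events (names are placeholders, not real findings).
observed = [
    ("advertiser1", "amazon"),
    ("amazon", "advertiser2"),
    ("advertiser1", "thirdparty1"),
    ("thirdparty2", "advertiser2"),
    ("advertiser1", "advertiser2"),
    ("thirdparty3", "thirdparty1"),
]

direct, downstream = sync_reach(observed, "amazon")
print(len(direct), "direct partners;", len(downstream), "downstream third parties")
```

Applied to the study's figures, the first set would correspond to the 41 advertisers and the second to the 247 other third parties, although extracting the sync events from real network traffic is the hard part and is not shown here.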
They also found that Amazon did not clearly disclose that users’ smart speaker interactions are used for profiling them for the purposes of ad targeting. Specifically, Amazon's general privacy policy and Alexa-specific privacy disclosures did not mention that smart speaker interactions are used for ad targeting. However, after their work’s preprint was released and Amazon was made aware, Amazon updated the Alexa Privacy Hub and Alexa Device FAQs to include that Alexa Echo interaction data is used for ad targeting.
“Unfortunately, surveillance is the business model of the internet,” Iqbal said. “The issues we identified in our study seem to be part of the design of the smart speaker ecosystem, and the purpose of our study is to bring public transparency. In fact, after our work's preprint was released, Amazon updated its disclosure to include that it uses smart speaker interaction data for ad targeting.”
“Consumer protection government agencies, such as the Federal Trade Commission in the U.S. and the European Consumer Organization in the EU, have also shown significant interest in our findings,” Iqbal added.
Whether interventions by lawmakers or consumer-protection agencies, including recent lawsuits against Amazon by the FTC and consumers themselves, will be successful remains to be seen. Regardless, Iqbal says it’s important for consumers to be aware of just how much data they’re giving away when they invite smart devices into their homes and how that information might be used.
Iqbal U, Bahrami PN, Trimananda R, Cui H, Gamero-Garrido A, Dubois D, Choffnes D, Markopoulou A, Roesner F, and Shafiq Z. Tracking, profiling, and ad targeting in the Amazon smart speaker ecosystem. ACM Internet Measurement Conference (IMC), Oct. 24-26, 2023. https://umariqbal.com/papers/alexa-echos-imc2023.pdf
This work is supported in part by the National Science Foundation (CNS-1956393, CNS-1955227, CNS-2103439, CNS-2114230, CNS-1909020), the Computing Research Association for the CIFellows 2021 Project (CNS-2127309), the Northeastern University Future Faculty Fellowship (2021), and the Consumer Reports Digital Lab.