The Privacy Battle to Save Google From Itself

By Lily Hay Newman | Thu, 01 Nov 2018

Over two days during the summer of 2009, experts from inside and outside Google met to forge a roadmap for how the company would approach user privacy. At the time, Google was under fire for its data collection practices and user tracking. The summit was designed to codify ways that users could feel more in control.

Engineer Amanda Walker, then in her third year at Google and now the company’s software engineering manager of privacy infrastructure, jotted down notes on a paper worksheet during one of the summit’s sessions. “HMW: Mitigate Impact of bad Gov’t + 3rd party requests,” she wrote, using shorthand for “how might we.” A few suggestions followed: “Discourage abusive requests. Make privacy measurable/surface rising threats. Industry wide.” It was the seed of what would eventually become Google’s suite of transparency reports that, among other things, disclose government requests for data.

It also was just one of several features the group brainstormed that summer that became a reality. An idea called “Persona management” became Chrome and Android profiles. “Universal preferences” became My Account and My Activity. And “Private search” turned into controls that let users see, pause, and delete search queries and other activity.

Longtime Google employees remember the 2009 privacy summit as a turning point. “A lot of these were a lot more work than we anticipated at the time, but it’s reassuring to me that I think we got the big things right,” Walker says.

And yet, nearly a decade later, privacy controversies continue to plague Google. Just in recent months, the Associated Press revealed that Google continued to store location data from Android and iOS users even when they had paused collection in a privacy setting called Location History. At the end of September, Chrome had to walk back a change to user logins meant to improve privacy on shared devices after the revision prompted a different set of concerns. Google then shuttered Google+ in October, after The Wall Street Journal reported on a previously undisclosed data exposure that left personal information from as many as 500,000 of the social network’s users out in the open. And Google is once again building censored services for China.

In this seemingly unshakeable cycle of improvements and gaffes, it's nearly impossible to make a full accounting of Google's user privacy impacts and protections. But it's critical to understand how the people on the front lines of that fight think about their jobs, and how that work fits in with the fundamental truth of how Google makes money.

Google’s privacy apparatus—which spans the globe and includes dedicated standalone teams, groups within other teams, and an extensive leadership structure—comprises thousands of employees and billions of dollars in cumulative investment. More than a dozen Google employees who work on privacy at all levels talked with WIRED in recent weeks about the massive scale and scope of these efforts. Every employee—from research scientists to engineers, program managers, and executives—described a single shared goal: to respect Google users and help them understand and control their data as they generate it in real time on Google’s services.

But Google is not a consumer software company, or even a search company. It’s an ad company. It collects exhaustive data about its users in the service of brokering ad sales around the web. To do so, Google requires an extensive understanding of the backgrounds, browsing habits, preferences, purchases, and lives of as many web users as possible, gleaned through massive data aggregation and analysis. In third quarter earnings announced last week, Google’s parent company Alphabet reported $33.7 billion in revenue. About 86 percent of that came from Google’s ad business.

“Google does a good job of protecting your data from hackers, protecting you from phishing, making it easier to zero out your search history or go incognito,” says Douglas Schmidt, a computer science researcher at Vanderbilt University who has studied Google's user data collection and retention policies. “But their business model is to collect as much data about you as possible and cross-correlate it so they can try to link your online persona with your offline persona. This tracking is just absolutely essential to their business. Surveillance capitalism is a perfect phrase for it.”

"We saw and had to tackle these challenges years and years before most other people."

Lea Kissner, Google

And yet Google has also played a major role in creating the superstructure of what corporate user data protections and transparency mechanisms look like today. Transparency reports have become a staple among tech giants, as have other user security and privacy features Google offered early, like tailored settings walkthroughs. And while Apple only recently introduced an option to download data—prompted by Europe’s GDPR omnibus privacy law—Google launched its first such tool, known as Takeout, in 2011. The company also continues to improve and refine its options for user privacy controls. One recent move involves surfacing information about user data flow and settings options directly in the main screens of search results, so users are actively prompted to consider these issues all the time.

“We saw and had to tackle these challenges years and years before most other people,” says Lea Kissner, Google’s global lead of privacy technology, who has been at the company for more than 11 years and oversees the NightWatch privacy audit program. “When I look back at where we were and how much we know now and how much we’ve built, I’m really proud of what we did, but you’re never going to be done.”

Google’s privacy-focused employees say they see no conflict between their work and the cash-generating side of the business, and that they don’t feel pressure to pull punches.

“We do a pretty good job of firewalling the ads business from the products we build,” says Ben Smith, a Google fellow and vice president of engineering. “But ads do fund a whole lot of free services. When we talk about building for everyone we want to build for the people who can’t afford an expensive phone and can’t afford a $20 per month subscription. And I think that democratization of access to data is a good thing for society and the world.”

Google can afford to develop top-quality consumer products—complete with expansive user security and abuse protections—and offer them at no monetary cost to anyone who wants to use them worldwide. Not many companies can. Google also funds efforts to improve web performance, stability, and security that raise the bar for the internet at large. But whether all of this is “free” is subject to debate. Google users pay for the services, in a very real sense, with their personal data.

“I think the big problem is that we give much more data to Google than it needs,” says Guillaume Chaslot, a former Google engineer who worked on YouTube’s recommendations algorithm and now runs the watchdog group AlgoTransparency. “When something is free, we behave irrationally, and that’s how users behave with Google. It makes no sense that Google keeps our data forever.”

But from a business perspective, it makes plenty of sense. “When you depend on insight from data, well, you need the data,” says Lukasz Olejnik, a security and privacy researcher and member of the W3C Technical Architecture Group.

Both current and former Google privacy employees insist that there is no internal pressure to water down privacy protections.

“One of the things that was really persistent at Google, and which was really hard to explain to outsiders, was just how committed everyone was to privacy,” says Yonatan Zunger, a former senior privacy engineer at Google who left in mid-2017 to work on privacy engineering and data protection at the workplace behavior startup Humu. “I pretty much never had to convince anyone of its importance.”

Google has also increasingly prioritized building in privacy protections for new services and features early in the development process. Led by Kissner, the effort has helped avoid the tensions that arise when developers try to bolt on protections as a deadline looms. Just how soon those privacy considerations kick in, though, is unclear. In a September Congressional hearing about a potential censored search engine for China, known as Project Dragonfly, Keith Enright, Google’s chief privacy officer, testified that his team was not yet involved in the project.

"It makes no sense that Google keeps our data forever."

Guillaume Chaslot, Former Google Engineer

Meanwhile, Google has also devoted significant resources to developing its Security and Privacy Checkup tools, which walk users through a sort of explanatory checklist of how Google’s data controls work and what options are available. The project has a special emphasis on developing privacy language that is actually understandable, and doing so for more than 15 languages that Google supports, so nothing is lost in translation. “Users are not the experts in privacy and security, it’s actually Google,” says Guemmy Kim, product management lead for Google Account security. “Google should be telling users what’s wrong, we should point out the anomalies, and guide users through their settings.”

And Google is often on the front lines of rigorous artificial intelligence, computer science, and digital privacy research, thanks to a deep bullpen of former academics who continue to publish under Google’s auspices. Privacy research coming from inside Google potentially poses conflicts of interest—you wouldn’t hire a lion to research antelope safety. But academics, including those who have investigated privacy behaviors in Google services, say its research is well-regarded.

“I think their academic work on privacy is solid,” says Gunes Acar, a postdoctoral researcher at Princeton, who studies digital data flow and overreach. “Privacy-related papers from Google researchers and engineers are published at top venues and are of top quality.”

In the past few years, for instance, Google researchers have helped develop machine learning techniques that can build models off of disparate data sets, so there never needs to be one centralized repository of the information. The mechanism, known as federated learning, allows Google (or anyone) to train predictive models locally on users’ devices without ever removing the underlying data from those devices. This means that the models can train and mature on a collective data set contributed by millions of devices without sending the information to an entity’s servers somewhere else.
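As a rough illustration of the idea, the sketch below (in Python, with made-up data, a simple linear model, and hypothetical helper names like local_train and federated_round; it is not Google’s code) mimics one federated round: each simulated device trains on its own private data, and only the resulting model weights are averaged by a central coordinator.

```python
# Minimal sketch of federated averaging: each "client" trains a simple
# linear model on its own data, and only the learned weights -- never the
# raw examples -- are sent back and averaged. Illustrative only.
import numpy as np

def local_train(weights, X, y, lr=0.1, epochs=5):
    """Run a few epochs of gradient descent on one client's private data."""
    w = weights.copy()
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)   # mean-squared-error gradient
        w -= lr * grad
    return w

def federated_round(global_weights, client_datasets):
    """One round: clients train locally; the server averages their updates,
    weighting by how many examples each client holds."""
    updates, sizes = [], []
    for X, y in client_datasets:
        updates.append(local_train(global_weights, X, y))
        sizes.append(len(y))
    return np.average(updates, axis=0, weights=np.array(sizes, dtype=float))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    true_w = np.array([2.0, -3.0])
    # Three clients, each with private data that never leaves "the device".
    clients = []
    for _ in range(3):
        X = rng.normal(size=(100, 2))
        y = X @ true_w + rng.normal(scale=0.1, size=100)
        clients.append((X, y))

    w = np.zeros(2)                      # shared global model
    for _ in range(50):
        w = federated_round(w, clients)
    print("learned weights:", w)         # converges toward [2, -3]
```

The raw examples stay inside local_train, which stands in for computation on a user’s device; only the learned parameters travel to the coordinator.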

The technique dovetails in many ways with the concept of differential privacy, the statistical process of analyzing data from a population without learning about individuals in it. Both are next-generation techniques that reduce the amount of personal user data an entity like Google holds, which has the added benefit of improving privacy defenses against criminal hackers, intelligence agencies, or other government intrusions.
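The core intuition behind differential privacy can be shown with the classic Laplace mechanism: add noise, calibrated to how much any one person could change an aggregate answer, before releasing it. The toy sketch below is purely illustrative, with an invented private_count helper, and is not drawn from any Google library.

```python
# Minimal sketch of the Laplace mechanism, a textbook building block of
# differential privacy: noise an aggregate query so that one person's
# presence or absence barely changes the released answer. Illustrative only.
import numpy as np

def private_count(values, predicate, epsilon):
    """Release a differentially private count of records matching `predicate`.

    A counting query has sensitivity 1 (one person changes the count by at
    most 1), so Laplace noise with scale 1/epsilon gives epsilon-DP.
    """
    true_count = sum(1 for v in values if predicate(v))
    noise = np.random.laplace(loc=0.0, scale=1.0 / epsilon)
    return true_count + noise

if __name__ == "__main__":
    ages = [23, 35, 41, 29, 62, 55, 31, 47]        # toy "population"
    # Smaller epsilon = more noise = stronger privacy, less accuracy.
    for eps in (0.1, 1.0, 10.0):
        answer = private_count(ages, lambda a: a >= 40, epsilon=eps)
        print(f"epsilon={eps}: ~{answer:.1f} people aged 40+ (true answer: 4)")
```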

“I was hired in the big buildup of security at Google about nine years ago with the explicit mandate of looking at new things that push the envelope,” says Úlfar Erlingsson, a senior staff research scientist who heads work on improving machine learning algorithm protections. “Having worked in security and privacy for 25 years I know that there’s usually not a good solution—usually there’s a bad solution and then we struggle a lot to make it work. But with machine learning we can train these machines in such a way that they truly don’t capture any details about people.”

Google has also led the industry on transparency reports, and it has steadily expanded that work. The project has grown from an annual report on government requests launched in 2010 into an array of analyses and data sets for users to track over time on a range of issues like content removals due to copyright, YouTube community guidelines enforcement, search entry removals under European privacy law, and even a report about political advertising on Google. Michee Smith, the lead project manager for transparency reports, oversees a team of 10 to 15 engineers, product people, policy experts, and lawyers who work together to keep the reports coming and collaborate with various teams around Google to get the right data. The group prioritizes making its reports as easy as possible for people to understand and dig through.

“As a company we’re getting big, but we’re not trying to get evil just because we’re getting big,” she says. “With these really important topics, we’re putting data out there, so if you see a trend or you notice something you can hold us accountable. The average user is not aware of all the laws and policies that can impact the flow of information online, but we are. So my ultimate goal is for users to feel like we have your back.”

And yet Google regularly stumbles. Some of the company’s issues fit in with broader revelations over the past couple of years that massive user platforms like Facebook have underestimated, or failed to consider, the fundamental impact their services—and business priorities—could have on world societies.

“Google is strong on having people with remarkable security and privacy expertise, but reconciling privacy guarantees with business needs is a challenging topic anywhere,” independent researcher Olejnik says. “A potential issue is underestimating the possible misuse of high-impact technologies like Google’s Real-Time Bidding ads platform. I would argue that the risks could have been foreseen.” Over the past few years, Google has been criticized, and even boycotted, for allowing inappropriate or problematic content on its ad networks.

In spite of more than a decade of industry-leading work on privacy from Google, some see the carousel of errors as proof of a sort of Google privacy Groundhog Day. But in many cases the company has also created the technology that solves those same problems, not just for itself but for the whole industry.

Many of Google’s critics also believe it is possible—at least from a technological perspective—to develop ad-funded user services that still silo and control data tightly enough to balance user privacy with business interests.

“It’s entirely possible for a company like Google to make good, usable products that strike a balance between privacy and profit,” Johns Hopkins cryptographer Matthew Green wrote at the beginning of October after publicly railing against a problematic change to Chrome. “It’s just that without some countervailing pressure forcing Google to hold up their end of the bargain, it’s going to be increasingly hard for Google executives to justify it.”

"They have the ability to change the trajectory here, but they don’t allow for any idea that things could be a bit different."

Jason Kint, Digital Content Next

Nearly everyone WIRED spoke to at Google for this story attributed the company’s privacy mistakes and failures to Google’s unique position at the forefront of encountering and dealing with unprecedented data flow challenges. “Google, by virtue of what we do and the velocity that we do it at, we are necessarily the petri dish that privacy engineering is being cultivated in,” says Google's Enright. “Most of our fumbles and missteps in my experience can be tracked to us leaning so far into our own optimism that we failed to benefit from the wisdom of others.”

The other option, though, would simply be to move a bit more slowly. Google’s critics say the company could do a better job of considering privacy and developing safeguards before its business innovations create problems.

“There’s no doubt that there are some of the smartest minds in both privacy, data protection, law, and engineering inside these companies—Google especially,” says Jason Kint, CEO of the digital publishing trade organization Digital Content Next. (WIRED parent company Condé Nast is a member.) “They pride themselves on moonshots, they’ve got just immense amounts of wealth and profitable business margin and growth. But they say, ‘well, this is our business model and if we don’t have this business model then we’re going to have to charge for access.’ It’s just a very binary view. They have the ability to change the trajectory here, but they don’t allow for any idea that things could be a bit different.”

Zunger, the former Google privacy engineer, points out that a big challenge for the company is that research and surveys consistently show that many people don’t really understand their own privacy-related concerns, beyond vague awareness that some kind of danger exists. As a result, he says, people level criticisms against and requests of Google that aren’t necessarily constructive or actionable in themselves.

But Zunger notes an even more subtle reason that people working at Google may not see the same contradictions embedded in the company that some outsiders view as inherent.

“There's one aspect which is always going to be hard to address at a company like Google, which is when people have concerns that the mere existence of a single large pile of data is itself dangerous,” Zunger says. “People who feel this way generally aren't going to come work at Google, and so this kind of concern is generally not represented very well. When Googlers address it, they do so by asking the more concrete question of ‘OK, what risks could the existence of this data create?’ They don’t try to ask the meta-question of ‘well, what if the data didn't exist at all?’”

In thinking about Google’s extensive efforts to safeguard user privacy and the struggles it has faced in trying to do so, this question articulates a radical alternate paradigm—one that Google seems unlikely to convene a summit over. What if the data didn’t exist at all?
