Facebook, Cambridge Analytica, Privacy, and Informed Consent

There has been a significant amount of coverage and commentary on the new revelations about Cambridge Analytica and Facebook, and on how Facebook's default settings were exploited to allow personal information about 50 million people to be exfiltrated from the platform.

There are a lot of details to this story - if I ever have the time (unlikely), I'd love to write about many of them at greater length. I discussed a few of them in this thread over on Twitter. But as we digest the story, we need to move past the focus on the Trump campaign and Brexit. It has implications for privacy and for our political systems moving forward, and we need to understand it in this broader context.

For this post, though, I want to focus on two things that are easy to overlook: informed consent, and how small design decisions that don't respect user privacy allow large numbers of people -- and the systems we rely on -- to be exploited en masse.

The following quote is from a NY Times article - the added emphasis is mine:

Dr. Kogan built his own app and in June 2014 began harvesting data for Cambridge Analytica. The business covered the costs — more than $800,000 — and allowed him to keep a copy for his own research, according to company emails and financial records.

All he divulged to Facebook, and to users in fine print, was that he was collecting information for academic purposes, the social network said. It did not verify his claim. Dr. Kogan declined to provide details of what happened, citing nondisclosure agreements with Facebook and Cambridge Analytica, though he maintained that his program was “a very standard vanilla Facebook app.”

He ultimately provided over 50 million raw profiles to the firm, Mr. Wylie said, a number confirmed by a company email and a former colleague. Of those, roughly 30 million — a number previously reported by The Intercept — contained enough information, including places of residence, that the company could match users to other records and build psychographic profiles. Only about 270,000 users — those who participated in the survey — had consented to having their data harvested.

The first highlighted passage (that all Kogan divulged to Facebook, and to users in fine print, was that he was collecting information for academic purposes) gets at what passes for informed consent. In this case, for people to give informed consent, they had to understand two things, neither of which was obvious or accessible. First, they had to read the terms of service for the app and understand how their information could be used and shared. Second -- and more importantly -- the people who took the quiz needed to understand that by taking it, they were also sharing the personal information of all their "friends" on Facebook, as permitted and described in Facebook's terms. This was a clearly documented feature available to app developers, and it wasn't changed until 2015. I wrote about this privacy flaw in 2009 (as did many other people over the years). But this was definitely insider knowledge, and the expectation that a person getting paid three dollars to take an online quiz (for the Cambridge Analytica research) would read two sets of dense legalese as part of informed consent is unrealistic.
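To make that design decision concrete, here is a rough sketch, in Python, of the request pattern the pre-2015 Graph API (v1.0) made possible: one user's access token, granted through fine-print consent, could be used to pull profile data about every one of that user's friends. The endpoint, field names, permission names, and the function itself are approximations offered purely as illustration, not a faithful reproduction of the old API (and this capability no longer exists).

```python
# Illustrative sketch only: approximates the pre-2015 Facebook Graph API (v1.0)
# pattern in which one user's access token could pull data about their friends.
# Endpoint, fields, and permission names are historical approximations.
import requests

GRAPH = "https://graph.facebook.com"  # v1.0-era base URL (illustrative)


def harvest_friend_data(user_token: str) -> list[dict]:
    """Given ONE consenting user's token, fetch profile fields for ALL of
    their friends -- none of whom were asked for consent."""
    profiles = []
    url = f"{GRAPH}/me/friends"
    params = {
        "access_token": user_token,
        # friends_* permissions (e.g. friends_location, friends_likes) let an
        # app expand these fields on people who never installed it.
        "fields": "id,name,location,likes",
        "limit": 500,
    }
    while url:
        resp = requests.get(url, params=params, timeout=30)
        resp.raise_for_status()
        data = resp.json()
        profiles.extend(data.get("data", []))
        # Follow pagination until the friend list is exhausted.
        url = data.get("paging", {}).get("next")
        params = {}  # the "next" URL already carries the query string
    return profiles
```

The specifics matter less than the shape of the design: a single consent decision stood in for hundreds of people who were never asked.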

As reported in the NYT and quoted above, only 270,000 people took the quiz for Cambridge Analytica - yet those 270,000 people exposed 50,000,000 people via their "friends" settings, an average of roughly 185 other people per quiz-taker. This is what happens when we fail to design for privacy protections. To state it another way, this is what happens when we design systems to support harvesting information for companies, rather than protecting information for users.

Facebook worked as designed here, and this design allowed the uninformed decisions of 270,000 people to create a dataset that potentially undermined our democracy.