Facebook, Cambridge Analytica, Privacy, and Informed Consent

4 min read

There has been a significant amount of coverage and commentary on the new revelations about Cambridge Analytica and Facebook, and how Facebook's default settings were exploited to allow personal information about 50 million people to be exfiltrated from Facebook.

There are a lot of details to this story - if I ever have the time (unlikely), I'd love to write about many of them in more detail. I discussed a few of them in this thread over on Twitter. But as we digest this story, we need to move past the focus on the Trump campaign and Brexit. This story has implications for privacy and our political systems moving forward, and we need to understand them in this broader context.

But for this post, I want to focus on two things that are easy to overlook in this story: informed consent, and how small design decisions that don't respect user privacy allow large numbers of people -- and the systems we rely on -- to be exploited en masse.

The following quote is from a NY Times article - the added emphasis is mine:

Dr. Kogan built his own app and in June 2014 began harvesting data for Cambridge Analytica. The business covered the costs — more than $800,000 — and allowed him to keep a copy for his own research, according to company emails and financial records.

All he divulged to Facebook, and to users in fine print, was that he was collecting information for academic purposes, the social network said. It did not verify his claim. Dr. Kogan declined to provide details of what happened, citing nondisclosure agreements with Facebook and Cambridge Analytica, though he maintained that his program was “a very standard vanilla Facebook app.”

He ultimately provided over 50 million raw profiles to the firm, Mr. Wylie said, a number confirmed by a company email and a former colleague. Of those, roughly 30 million — a number previously reported by The Intercept — contained enough information, including places of residence, that the company could match users to other records and build psychographic profiles. Only about 270,000 users — those who participated in the survey — had consented to having their data harvested.

The first highlighted quotation gets at what passes for informed consent. In this case, for people to give informed consent, they had to understand two things, neither of which was obvious or accessible. First, they had to read the terms of service for the app and understand how their information could be used and shared. Second -- and more importantly -- the people who took the quiz needed to understand that by taking it, they were also sharing the personal information of all their "friends" on Facebook, as permitted and described in Facebook's terms. This was a clearly documented feature available to app developers, and it wasn't changed until 2015. I wrote about this privacy flaw in 2009 (as did many other people over the years). But this was definitely insider knowledge, and the expectation that a person getting paid three dollars to take an online quiz (for the Cambridge Analytica research) would read two sets of dense legalese as part of informed consent is unrealistic.

As reported in the NYT and quoted above, only 270,000 people took the quiz for Cambridge Analytica - yet these 270,000 people exposed 50,000,000 people via their "friends" settings. This is what happens when we fail to design for privacy protections. To state this another way, this is what happens when we design systems to support harvesting information for companies, as opposed to protecting information for users.

Facebook worked as designed here, and this design allowed the uninformed decisions of 270,000 people to create a dataset that potentially undermined our democracy.

Spectre, Meltdown, Adtech, Malware, and Encryption

2 min read

This week has seen a lot of attention paid to Spectre and Meltdown, and justifiably so. Get the technical details here: https://spectreattack.com/

These issues are potentially catastrophic for cloud providers (see the details in the articles linked above) but they can also affect regular users on the web. While there are updates available for browsers that mitigate the attack, and updates available for most major operating systems, updates only work when they are applied, which means that we will almost certainly see vulnerable systems for the foreseeable future.

I was very happy to see both Nicholas Weaver and Zeynep Tufekci addressing the connection between these vulnerabilities and adtech. 

Adtech leaves all internet users exposed to malware - it has for a while, and, in its current form, adtech exposes us to unneeded risk (as well as compromising our privacy). This risk is increased because many commonly used adtech providers do not support or require encryption.

To examine traffic over the web, use an open source tool like OWASP ZAP. If you are running a Linux machine, routing traffic through your computer and OWASP ZAP is pretty straightforward once you set your computer up to act as a wireless access point.
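For a quick spot-check without a full proxy setup, a few lines of Python can show whether ad-serving hosts will even answer over plain, unencrypted HTTP. This is an illustrative sketch rather than the ZAP workflow described above; the hostnames are placeholders, so substitute whatever domains show up in your own capture.

```python
# Spot-check which ad-serving hosts respond over plain HTTP, and whether they
# force an upgrade to HTTPS. The hostnames below are placeholders.
import requests

AD_HOSTS = [
    "ads.example-network.com",   # placeholder hostname
    "tracker.example-cdn.net",   # placeholder hostname
]

for host in AD_HOSTS:
    url = f"http://{host}/"
    try:
        # allow_redirects=False so we can see whether the host redirects to HTTPS
        resp = requests.get(url, timeout=5, allow_redirects=False)
        upgraded = resp.is_redirect and resp.headers.get("Location", "").startswith("https://")
        print(f"{host}: HTTP {resp.status_code}, redirects to HTTPS: {upgraded}")
    except requests.RequestException as exc:
        print(f"{host}: request failed ({exc})")
```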

But, using these basic tools, it's simple to see how widespread the issue of unencrypted adtech actually is, in both web sites and mobile applications (on a related note, some mobile apps actually get their content via an unencrypted zip file. You read that correctly - the expected payload is an unencrypted zip file. That's a topic for a different post, and I'm not naming names, but the fact that this is accepted behavior within app stores in 2018 should raise some serious questions).

The unencrypted adtech includes javascript sent to the browser or the device. Because this javascript is sent unencrypted over the network, intercepting it and modifying it would be pretty straightforward, which exposes people to increased risk. 

The next time you are in a coffee shop and see a kid playing a game on their parent's device while the parent talks with a friend, ask yourself: is that kid playing an online game, or downloading malware, or both? Because so much adtech is sent unencrypted, anything is possible.

AdTech, the New York Times, and Normalizing Fascism

4 min read

Today, the New York Times published a piece that normalized and humanized Nazis. I'm not linking to it; feel free to find it on your own, but I'm not giving it any additional traffic.

As was noted on Twitter, Hitler also received flattering press. 

However, because the NYT puts ads on its online content, they make money even when they put out dangerously irresponsible journalism. Due to the perversities of human nature, they probably make more money on the dangerously irresponsible pieces.

But the NYT isn't the only one profiting: so does the adtech industry the NYT uses to target ads. The adtech industry collects and combines information from everyone who visits the page. Adtech companies aren't as visible as the NY Times, because they operate in the background, but they are happy to profit while the rest of us suffer.

Here are some of the companies that had ads placed alongside a disgusting story that normalizes Nazis, fascism, racism, and everything that is wrong with how we are currently failing to address hate in our country. To state what should be obvious, none of these brands chose to advertise on this specific story. However, their presence alongside this story gives an explicit stamp of approval to prose that normalizes fascism. We'll return to this at the end of the piece.

Citi wants us all to know that, provided you use a Citi card, they are okay with Nazis.

Fascist literature and firepits

You know what goes great with fascist literature? A firepit from ultimatepatio.com

Sieg Heil and grills

If your ex-band mate whitewashes fascism as being "proud of their heritage" why not invite them over for a cookout? BBQGuys.com thinks grilling with Nazis is great.

Hitler and Mussolini and toys

Better make room for some gifts from Fat Brain Toys alongside your Hitler and Mussolini literature.

T-Mobile, your network for Nazis

And if you're calling your Nazi friends, T-Mobile wants you to do it on their network.

Nazis and firepits

And, Starfire Direct wants that special Nazi to Reignite Your Life.

As I said earlier, none of these companies chose to advertise on this page. But, because of the lazy mechanism that is modern adtech, "mistakes" like this happen all the time. These "mistakes" rely on an industry that operates in the shadows, makes vague promises to do better, and is predicated on constant monitoring of the web sites we visit, the places we shop (including using facial recognition), and connecting online and offline behavior. Despite industry promises of how this near-constant tracking doesn't impact our privacy, a recent study highlighted that, using AdTech as it was designed, it takes about $1,000 US to identify an individual using mobile ads.

But returning to the NY Times piece that normalizes fascism in the United States and the advertising technology used by the NY Times and just about every other publisher out there: sloppy journalism and lazy advertising both take shortcuts that we can't afford.

Modern adtech is appealing because, for brands, it offers ease of access alongside the promise of precisely targeting users. Sloppy journalism is "appealing" because it looks suspiciously like the real thing, and -- due to how adtech works -- it can ring up revenue in the short term.

But, given where we are, we need to stop looking for shortcuts. Doing things well feels like more work as we get started, but we are currently experiencing what happens in our information and political systems when we take shortcuts.

End Note: The article had tracking calls from over 50 different adtech companies, which is actually average to low compared with other mainstream news sites. The adtech companies used by the NY Times include most of the usual suspects: Facebook, Google/Doubleclick, LinkedIn, Moat, Twitter, Amazon, AppNexus, Media.net, Bluekai, and AddThis.

Misinformation Equilibrium

2 min read

Given the state of political discourse (or what passes for political discourse) in the US, and the resounding and ongoing success of misinformation spread via online media, the people spreading misinformation have a decided advantage.

What I've been seeing (although this is anecdotal and based solely on non-scientific observation, so take it with a grain of salt): whenever there is a topic that is remotely controversial, bots and trolls get active in amplifying the extreme positions. They don't need to make a point, or even to have a specific position "win." All they need to do is stoke distrust, which in turn makes it more difficult for actual inhabitants of these United States to talk with one another.

Because we are already primed to mistrust one another, we make the work of people spreading misinformation that much easier.

So, my question: what are the concrete steps we need to take to repair trust?

(and when I say "repair trust" I'm not saying accept hatred, racism, neo-fascists, or anything like that. Denying the humanity of people is a hard red line that has no place, and deserves no quarter. But for the rest of us - how do we set aside mistrust and find something that looks like common ground? How do we avoid getting derailed by the forces that thrive on our divisions?)

Media Literacy When the Platforms Are Complicit

3 min read

In June of 2016, Twitter tried to upsell RT -- a propaganda arm of the Russian government -- to increase RT's visibility on Twitter during the US elections. The image below is included in the full article on Buzzfeed:

Email from Twitter Sales to RT

As you can see from the email, Twitter's offer to a propaganda arm of a foreign government included an elections specialist, and early access to new features.

Let's just pause here: for a million dollars, a US tech company was willing to provide consulting support to a propaganda arm of a foreign government advertising in our elections.

This is how our democracy is sold. And given the amount of money spent on both politics and tech, the price isn't even very high. RT appeared not to take Twitter up on their offer, probably because it's cheaper to staff convincing troll accounts to manipulate Twitter users, thanks in large part to Twitter's pathetic efforts at addressing bots and trolls on their site.

And, of course, Twitter is far from alone in pursuing advertising revenue from questionable sources. Facebook and Google eagerly accepted money from and provided consulting support to racist campaigns. This is how data collection in the pursuit of targeted advertising works. They will collect data on as many people as possible, and sell to anyone who can pay. The process of online advertising is so opaque and technical that it allows companies to evade scrutiny.

Here is my question to adults working with K12 and college students on information literacy: how do you make students aware that corporate social media platforms and search engines are part of the structure that makes misinformation thrive?

How do we reconcile that when we use Google to search for verification of a story, we are providing Google with data about us that can, in turn, be used to serve us misinformation -- and that if the client pays enough, Google will provide a consultant to help the misinformation merchants do it better?

How do we work with students (and let's be honest here - other adults) to deconstruct that when Facebook or Twitter or other services "recommend" something to us, the recommendation is getting filtered through who has paid to access our attention?

As adults who care about helping people understand our information environment, what steps do we take to ask meaningful questions about the information we read, and believe, and share?

A lot of conversations about media literacy focus on the need to teach youth the skills to disentangle truth -- or a reliable version of it -- from misinformation. While this is important, it is incomplete. Misinformation is an over-18 problem. In the US, the vast majority of students in K12 didn't vote in the 2016 election. Adults need this training as much as -- if not more than -- kids. We can't teach this well without at least a rudimentary understanding of the subject ourselves.

So: how are we teaching ourselves, and our peers, to do better?

What does it mean to do informal professional development, in the form of Twitter chats, on a platform that actively tried to sell our attention to a foreign government?

More importantly, how do we reconcile or explain this conflict?

I don't have solid or satisfying answers to any of these questions, but the answers start with acknowledging the depth of the problem, and the shallowness of our understanding.

Daily Post, October 24, 2017

5 min read

It's been a busy few days, but here are some of the things I've been reading. Enjoy!

Open Source Code from ProPublica to Detect Political Ads

While the lawyers at major tech companies complain that it's too hard to find political ads, ProPublica released code showing how easy it is to identify political ads.

We're asking our readers to use this extension when they are browsing Facebook. While they are on Facebook a background script runs to collect ads they see. The extension shows those ads to users and asks them to decide whether or not a particular ad is political. Serverside, we use those ratings to train a naive bayes classifier that then automatically rates the other ads we've collected. The extension also asks the server for the most recent ads that the classifier thinks are political so that users can see political ads they haven't seen. We're careful to protect our user's privacy by not sending identifying information to our backend server.
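For readers curious what that server-side step might look like, here is a minimal sketch of a naive Bayes classifier over ad text, using scikit-learn and a tiny invented set of labeled examples. It illustrates the technique the quote describes; it is not ProPublica's actual code, and their real pipeline and features will differ.

```python
# Train a naive Bayes classifier on user-labeled ad text, then score new ads.
# The inline dataset is made up for illustration only.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

labeled_ads = [
    ("Vote yes on proposition 12 this November", 1),    # political
    ("Tell Congress to protect our healthcare", 1),     # political
    ("New fall jackets, 40% off this weekend only", 0), # not political
    ("Stream every game live with our sports package", 0),
]
texts, labels = zip(*labeled_ads)

# Bag-of-words features feeding a multinomial naive Bayes model
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(texts, labels)

new_ads = ["Stand with Senator Smith on election day", "Best deals on patio furniture"]
for ad, prob in zip(new_ads, model.predict_proba(new_ads)[:, 1]):
    print(f"{prob:.2f} probability political: {ad}")
```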

Adtech won't fix this problem. They have a financial interest in not fixing this problem. Every day that passes without a fix for this problem is another day they make money from undermining our democracy. I also doubt the ability of our current crop of lawmakers to understand the problem, or understand a good solution.

BlockBear

BlockBear is an ad blocker for iOS, made by the same folks who make TunnelBear VPN.

A really simple, often adorable adblocker for your iPhone or iPad.

  • Blocks ads and invasive online tracking
  • Load many websites 3-5 times faster
  • Whitelist your favorite websites
  • Has bears

You could download another adblocker, but then you wouldn't have a bear!

While I haven't used this, it looks interesting.

Obfuscation Workshop Report

The report from the International Workshop on Obfuscation is now released and available for download.

We have asked our panelists to each provide a brief essay summarizing their project, concept, application—with an emphasis on the questions, challenges, and discussions raised during the weekend. As with the workshop itself, this report is a starting point rather than an end point.

I haven't read this yet, so I have little to say about the contents, but obfuscation is one of many tools we have to protect our privacy and make the data collected about us less useful.

China's "Social Credit" System

China is rolling out a system that publicly measures every citizen. Thought experiment: how much more data would a country need besides what Facebook or Google already collect to create a similar system?

Imagine a world where many of your daily activities were constantly monitored and evaluated: what you buy at the shops and online; where you are at any given time; who your friends are and how you interact with them; how many hours you spend watching content or playing video games; and what bills and taxes you pay (or not). It's not hard to picture, because most of that already happens, thanks to all those data-collecting behemoths like Google, Facebook and Instagram or health-tracking apps such as Fitbit. But now imagine a system where all these behaviours are rated as either positive or negative and distilled into a single number, according to rules set by the government. That would create your Citizen Score and it would tell everyone whether or not you were trustworthy. Plus, your rating would be publicly ranked against that of the entire population and used to determine your eligibility for a mortgage or a job, where your children can go to school - or even just your chances of getting a date.

This is what data does, very well. Data supports systems that rate, rank, sort, all day long. This is not a neutral activity. Anyone who claims otherwise is not adequately informed.

Can We All Just Encrypt Our Stuff Already?

Troy Hunt lays out a clear roadmap for implementing encryption on a web site.

Well, it can be more difficult but it can also be fundamentally simple. In this post I want to detail the 6-step "Happy Path", that is the fastest, easiest way you can get HTTPS up and running right.

This change is coming, so please, just do this. Now. Please.
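Once you've walked through a setup like the one Troy Hunt describes, a quick sanity check is to confirm that plain HTTP redirects to HTTPS and that an HSTS header is being sent. The small Python check below is my own illustrative addition, not part of Hunt's post; the domain is a placeholder for your own site.

```python
# Verify HTTP-to-HTTPS redirection and the presence of an HSTS header.
import requests

SITE = "example.com"  # placeholder: replace with your own domain

plain = requests.get(f"http://{SITE}/", timeout=5, allow_redirects=False)
print("HTTP redirects to HTTPS:",
      plain.is_redirect and plain.headers.get("Location", "").startswith("https://"))

secure = requests.get(f"https://{SITE}/", timeout=5)
print("HSTS header present:", "Strict-Transport-Security" in secure.headers)
```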

For $1000 You Can Track Someone Via Adtech

The research in this paper shows how the core features of an ad network can be used to track an individual.

There is a fundamental tension at work in the online advertising ecosystem: the precision targeting features we used for these attacks have been developed for legitimate business purposes. Advertisers are incentivized to provide more highly targeted ads, but each increase in targeting precision inherently increases ADINT capabilities.

This is how data tracking works. Data allows us to ask questions. The researchers in this study didn't exploit a bug. They used the advertising systems exactly as they were designed. This technique would almost certainly work to target children.

Facebook Tests Gouging Publishers

Facebook can spin this effort to gouge publishers in a few ways, but their move to pull all non-sponsored posts from users' feeds would force publishers to pay Facebook in order to reach people.

A new system being trialled in six countries including Slovakia, Serbia and Sri Lanka sees almost all non-promoted posts shifted over to a secondary feed, leaving the main feed focused entirely on original content from friends, and adverts.

Facebook might even try and spin this as an effort to combat misinformation, but this move really demonstrates what the "meritocracy" looks like in Silicon Valley: if you want access, pay the people who control it. For any publishers who had any illusions about how Facebook views them, this move should dispel all doubts. It's also worth noting where Facebook rolled this test out: smaller countries with, presumably, a userbase with fewer connections.

Daily Post - October 18, 2017

4 min read

Some of the articles and news that crossed my desk on October 18, 2017. Enjoy!

Facebook and Google Worked with Racist Campaigns, at Home and Abroad

Both Facebook and Google worked closely with an ad agency running blatantly racist ads during the 2016 campaign. Both companies worked on targeting more precisely, and provided a range of technical support.

Facebook advertising salespeople, creative advisers and technical experts competed with sales staff from Alphabet Inc.’s Google for millions in ad dollars from Secure America Now, the conservative, nonprofit advocacy group whose campaign included a mix of anti-Hillary Clinton and anti-Islam messages, the people said.

Facebook also worked with at least one campaign putting racist ads in Germany to target German voters. This is what the "neutrality" of tech looks like: racism with money behind it is always welcome. The data collection and subsequent profiling of people is a central element of how racism is spread, and how data brokers and advertising companies work together to profit.

Russia Recruited Activists to Stage Protests

The people who were recruited didn't know they were working with Russians. But this is an odd corner of Russian attempts to create noise and conflict around issues related to race.

Russia’s most infamous troll farm recruited US activists to help stage protests and organize self-defense classes in black communities as part of an effort to sow divisions in US society ahead of the 2016 election and well into 2017.

As always, research your funders and contacts.

US Government Wants the Right to Access Any Data Stored Anywhere

The US Supreme Court will hear a case that looks at whether a legal court order can compel a company to hand over information, even if that information is stored outside the US.

In its appeal to the high court, meanwhile, the US government said that the US tech sector should turn over any information requested with a valid court warrant. It doesn't matter where the data is hosted, the government argues. What matters, the authorities maintain, is whether the data can be accessed from within the United States.

This has the potential to open the floodgates for personal data to be accessed regardless of where it is stored. This would also gut privacy laws outside the US (or create a legal mess that will take years to untangle, and make lawyers very rich). It would also kill the tech economy and isolate the US, because who outside the US would want to connect to a mess like that?

For $1000 US, You Can Use AdTech to Track and Identify an Individual

A research team spent $1000 with an ad network, and used that to track an individual's location via targeted ads.

An advertising-savvy spy, they've shown, can spend just a grand to track a target's location with disturbing precision, learn details about them like their demographics and what apps they have installed on their phone, or correlate that information to make even more sensitive discoveries—say, that a certain twentysomething man has a gay dating app installed on his phone and lives at a certain address, that someone sitting next to the spy at a Starbucks took a certain route after leaving the coffee shop, or that a spy's spouse has visited a particular friend's home or business.

The researchers didn't exploit any bugs in mobile ad networks. They used them as designed. So, aspiring stalkers, abusers, blackmailers, home invaders, or nosy creeps: rest easy. If you have $1000 US, AdTech has your back.

Watches Designed for Helicopter Parents Have Multiple Security and Privacy Issues. Cue Surprise

In what should surprise absolutely no one, it looks like spyware designed for the hypervigilant and short-sighted parent has multiple security flaws that expose kids to focused risk.

Together with the security firm Mnemonic, the Norwegian Consumer Council tested several smartwatches for children. Our findings are alarming. We discovered significant security flaws, unreliable safety features and a lack of consumer protection.

Surveillance isn't caring. I completely understand that raising a kid can be petrifying, but when we substitute technology for communication, we create both unintended consequences and multiple other points of potential failure.

Daily Post - October 17, 2017

6 min read

I've been thinking and rethinking how I use Twitter. I've been on the service for a while, but I am increasingly uncomfortable with the service and the company. Between Twitter's blatant failures at curbing abuse, curbing the spread of misinformation, and the general privacy issues that plague corporate social media, I will be leaving Twitter at some point in the future.

However, I still have interesting conversations on Twitter. I still learn things. I still meet people I wouldn't meet otherwise. So, while I am staying on the site for now, I am also looking at things I can change to make leaving Twitter easier - which brings us to this post.

I use Twitter as a way of storing links I will read later. I'm going to change that, and store information in a space I control, in a format that works for me. I'm hoping that this will also make me a better reader and sharer - rather than skimming and being superficial, I will spend a little more time selecting what I want to retain. For now, I'm thinking I'll keep a running list of information I encounter during the day, and rather than spin it out on Twitter over the course of the day, I'll collect it into a single post with short commentary.

This isn't revolutionary - really, it's what a whole bunch of people did before Twitter, back in Ye Olde Days of the Blogge. I see myself putting out posts like this every few days. Over time, we'll see what develops.

Collection of data in the UK

In the UK, there appears to be widespread collection of data from social media accounts:

It remains unclear exactly what aspects of our communications they hold and what other types of information the government agencies are collecting, beyond the broad unspecific categories previously identified such as “biographical details”, “commercial and financial activities”, “communications”, “travel data”, and “legally privileged communications”.

It's unclear if this information is collected via publicly available information, or via some type of access granted by the company.

Old, but always timely: How to Write a Tom Friedman Article

From 2004, but, unfortunately, timeless. How to Write a Tom Friedman Article.

What’s important, however, is that we focus on what these events mean [on the ground/in the street/to the citizens themselves]. The [media/current administration] seems too caught up in [worrying about/dissecting/spinning] the macro-level situation to pay attention to the important effects on daily life. Just call it missing the [desert for the sand/fields for the wheat/battle for the bullets].

You too can write intellectually lazy hot takes. Because we need more of those.

InfoSec Pros Among Worst Offenders of Employer Snooping

Who knew? Information security professionals often access information they shouldn't.

And it turns out that IT security executives were the worst offenders of this snooping behavior, compared to the rest of their team, according to the Dimensional Research survey commissioned by One Identity.

Executives are more likely to engage in unethical behavior than lower level employees. Shocking.

More on Harvey Weinstein

We will be hearing about Harvey Weinstein for a good long time, I suspect. The latest is that he fired a director and recast the lead in a movie because the director's choice "wasn't 'fuckable'".

“I was furious after being kicked off my film and I told them all about what happened, I told them about the harassment claims and I said here is your quote: ‘I don’t cast films according to Harvey Weinstein’s erection,’ and they just laughed,” Caton-Jones said.

And, of course, the press knew, and other people knew, and no one did anything. We shouldn't kid ourselves that the attention on Harvey Weinstein is fixing the root of the problem. Weinstein deserves everything he gets, but if you think Weinstein is unique, or that Hollywood is unique, think again. Harassment is pervasive. When women speak, we need to believe them.

More on Insecure IoT Devices

Many IoT devices use Bluetooth Low Energy to connect. Sex toys are no exception, including the occasional butt plug.

This is the final result. I paired to the BLE butt plug device without authentication or PIN from my laptop and sent the vibrate command.

I hope that we can look past the butt plug (figuratively) to see how many standard IoT implementations are hopelessly insecure.
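To make the general pattern concrete, here is a rough sketch (using the bleak BLE library) of the kind of unauthenticated connect-and-write interaction the write-up describes. The device address, characteristic UUID, and command payload are placeholders I've invented for illustration, not values from the article; a device with proper pairing or authentication would reject this.

```python
# Connect to a nearby BLE device and write to a control characteristic with no
# pairing or PIN. All identifiers below are placeholders.
import asyncio
from bleak import BleakClient

DEVICE_ADDRESS = "AA:BB:CC:DD:EE:FF"                   # placeholder BLE address
CONTROL_CHAR = "00001001-0000-1000-8000-00805f9b34fb"  # placeholder characteristic UUID
COMMAND = b"Vibrate;20;"                               # placeholder command payload

async def main():
    # A device that enforced authentication would refuse this connect/write.
    async with BleakClient(DEVICE_ADDRESS) as client:
        await client.write_gatt_char(CONTROL_CHAR, COMMAND)
        print("Command written without any authentication step")

asyncio.run(main())
```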

No One Reads Terms and Conditions

From 2016, but still relevant.

What we did is we went to the extreme, and we included this - a firstborn clause suggesting that if you agreed to these policies that as a form of payment, you'd be giving up a first-born child. And 98 percent of the participants that took the study didn't even notice this particular clause.

I know parenting is hard, people, but seriously -- pay attention.

OpEd by a Student on Navigating White Educators

The author is a black student who has been taught by predominantly white teachers.

(s)tudents of color make up 85 percent of the population... Our teaching staff is proportionally opposite: more than 85 percent white. That racial disparity between students and staff is a problem. There are subliminal and subconscious micro aggressions, uncomfortable questions about black hair, attempts to invalidate students' experiences of racism and constant assumptions about their backgrounds.

We need to listen to students, even if it makes us uncomfortable -- or especially when it makes us feel uncomfortable.

Privacy and Tracking on State Department of Education Web Sites

Doug Levin has started what looks to be a great series on State Departments of Education and how they respect (or don't) the privacy of people who visit their web sites.

(t)he web is not—nor will ever be—static. New technologies, tools, and services routinely offer up innovative new capabilities and personalized experiences. And, with every new digital experience that may amaze and delight website visitors, potential new threats can be introduced. While not frequently on the cutting edge of technology, school websites and information technology systems are not immune to these larger trends

This work will be coming out over the next few days/weeks - I look forward to seeing where it leads.

Google Serves Fake News Ads on Fact Checking Sites

You can't make this stuff up. Google AdWords was used to spread misinformation on sites dedicated to debunking misinformation. As usual, Google provided no information about how their system was exploited, or how much money they made from ads placed by these fraudulent sites.

Google declined to explain the specifics of how the fake news ads appeared on the fact-checking sites.

As I and others have written about, Google is complicit in this, and Google and other adtech vendors profit from misinformation.

Filter Bubbles and Privacy, and the Myth of the Privacy Setting

6 min read

When discussing information literacy, we often ignore the role of pervasive online tracking. In this post, we will lay out the connections between accessing accurate information, tracking, and privacy. We will use Twitter as an explicit example. However, while Twitter provides a convenient example, the general principles we lay out here are applicable across the web.

Major online platforms "personalize" the content we see on them. Everything from Amazon's shopping recommendations to Facebook's News Feed to our timelines on Twitter is controlled by algorithms. This "personalization" uses information that these companies have collected about us to present us with an experience that is designed to have us behave in a way that aligns with the company's interests. And we need to be clear on this: personalization is often sold as "showing people more relevant information," but that definition is incomplete. Personalization isn't done for the people using a product; it's done to further the needs of the company offering the product. To the extent that personalization shows people "more relevant information," this information furthers the goals of the company first, and the needs of users second.

Personalization requires that companies collect, store, and analyze information about us. Personalization also requires that we are compared against other people. This process begins with data collection about us -- what we read, what we click on, what we hover over, what we share, what we "like", sites we visit, our location, who we connect with, who we converse with, what we buy, what we search for, the devices we use, etc. This information is collected in many ways, but some of the more visible methods companies use to get this information are cookies set by ad networks and social share icons. Of course, every social network (Facebook, Instagram, Twitter, Pinterest, Musical.ly, etc) collects this information from you directly when you spend time on their sites.

The web, flipping us the bird

When you see social sharing icons -- when a site flips you the bird -- know that your browsing information is being widely shared with these companies and other ad brokers.

This core information collected by sites can be combined with information from other sources. Many companies explicitly claim this right in their terms of service. For example, Voxer's terms claim this right using this language:

Information We May Receive From Third Parties. We may collect information about you from other Product users, such as when a friend provides friend details or contact information, or indicates a relationship with you. If you authorize the activity, Facebook may share with us certain approved data, which may include your profile information, your image and your list of friends, their profile information and their images.

By combining information from other sources, companies can have information about us that includes our educational background, employment history, where we live, voting records, any criminal justice information from parking tickets to arrests to felonies, in addition to our browsing histories. With these datasets, companies can sort us into multiple demographics, which they can then use to compare us against other people pulled from other demographics.

In very general terms, this is how targeted advertising, content recommendation, shopping recommendation, and other forms of personalization all work. Collect a data set, then mine it for patterns and for the probability that those patterns are significant and meaningful. Computers make math cheap, so this process can be repeated and refined as needed.
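As a toy illustration of that collect-compare-recommend loop, the sketch below builds a tiny user-by-item interaction matrix, finds the most similar user with cosine similarity, and recommends what that user engaged with. The data and names are invented; real systems use far more signals and far larger datasets, but the basic mechanics are the same.

```python
# A toy "personalization" loop: compare users by their interaction patterns,
# then recommend items the most similar user engaged with. Data is invented.
import numpy as np

users = ["alice", "bob", "carol"]
items = ["article_a", "article_b", "article_c", "article_d"]

# Rows are users, columns are items; 1 means the user clicked/liked the item.
interactions = np.array([
    [1, 1, 0, 0],  # alice
    [1, 1, 1, 0],  # bob
    [0, 0, 1, 1],  # carol
])

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

target = 0  # recommend for alice
similarities = [(cosine(interactions[target], interactions[i]), i)
                for i in range(len(users)) if i != target]
_, most_similar = max(similarities)

# Recommend items the most similar user engaged with that the target hasn't seen
recommended = [items[j] for j in range(len(items))
               if interactions[most_similar][j] and not interactions[target][j]]
print(f"Recommend to {users[target]}: {recommended}")
```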

However, while the algorithms can churn nearly indefinitely, they need data and interaction to continue to have relevance. In this way, algorithms can be compared to the annoying office mate with pointless gossip and an incessant need to publicly overshare: they derive value from their audience.

And we are the audience.

Twitter's "Personalization and Data" settings provice a great example of how this works. As we state earlier, while Twitter provides this example, they are not unique. The settings shown in the screenshot below highlight some of the data that is collected, and how this information is used. The screenshot also highlights how, on social media, there is no such thing as a privacy setting. What they give us is a visibility setting -- while we have minimal control over what we might see, nothing is private from the company that offers the service.

Twitter's personalization settings

From looking at this page, we can see that Twitter can collect a broad range of information that has nothing to do with the core functionality of Twitter, and everything to do with creating profiles about us. For example, why would Twitter need to know the other apps on our devices to allow us to share 140 character text snippets?

Twitter is also clear that regardless of what we see here, they will personalize information for us. If we use Twitter, we only have the option to play by their rules (to the extent that they enforce them, of course):

Twitter always uses some information, like where you signed up and your current location, to help show you more relevant content.

What this explanation leaves out, of course, is for whom the content is most relevant: the person reading it, or Twitter. Remember: their platform, their business, their needs.

But when we look at the options on this page, we also need to realize that the data they collect in the name of personalization is where our filter bubbles begin. A best-case definition of "relevant content" is "information they think we are most interested in." However, a key goal of many corporate social sites is to make it more difficult to leave. In design, dark patterns are used to get people to act against their best interest. Creating feeds of "relevant content" -- or more accurately, suppressing information according to the dictates of an algorithm -- can be understood as a dark information pattern. "Relevant content" might be what is most likely to keep us on a site, but it probably won't have much overlap with information that challenges our bias, breaks our assumptions, or broadens our world.

The fact that our personal information is used to narrow the information we encounter only adds insult to injury.

We can counter this, but it takes work. Some easier steps include:

  • Use ad blockers and javascript blockers (uBlock Origin and Privacy Badger are highly recommended as ad blockers; for javascript blockers, try ScriptSafe for Chrome and NoScript for Firefox).
  • Clear your browser cookies regularly.
  • When searching or doing other research, use Tor and/or a VPN.

These steps will help minimize the amount of data that companies can collect and use, but they don't eliminate the problem. The root of the problem lies in information asymmetry: companies know more about us than we know about them, and this gap increases over time. Ultimately, privacy and information literacy are directly related issues. The more we safeguard our personal information, the more freedom we have from filter bubbles.


Bearistotle

4 min read

In January 2017, Mattel and Microsoft announced the launch of Aristotle, a digital assistant explicitly focused on very young children. The device was marketed by Mattel, and used Microsoft's AI technology. The device was literally intended to work with children from the first weeks of their lives. 

Crying, for example, can trigger Aristotle to play a lullaby or a recorded message from the parent. Conversely, a child’s crying can also trigger nothing at all, to let the kid settle down on his own. Parents will be able to configure these behaviors via the app.

To state the obvious: the developmental risks to a newborn of receiving a recorded message in lieu of parental attention are not clear, and I don't think we are at a place where we want to "disrupt" parenting.

Concerns about Aristotle mounted after the initial announcement. Many of these concerns were privacy-related, but many had nothing to do with privacy and focused on the blatant irresponsibility and lack of humanity involved in outsourcing care for a child to a plastic gadget that collected data and shuffled it off to remote storage. As recently as six days ago, Mattel talked about the product as if it was going to be released.

The following quotation cites Alex Clark, a Mattel spokesperson, in an article from September 29th.

Aristotle wasn’t designed to store or record audio or video, Clark said. No third parties will have access to any personally-identifiable information, and any data shared is entirely anonymous and fully encrypted, he said.

A few key points jump out from this fantastic piece of doublespeak.

  • First, as of six days ago, the company was defending Aristotle. This suggests that they were still considering releasing this device.
  • Second, the definition of "store" needs to be clarified. Are they saying that the device has no local storage, and that it just transmits everything it captures? This statement is empty. A statement with actual use would define what the device transmits, what it stores, and who can access it. But, of course, he is just a spokesperson. Truth costs extra.
  • Third, the last sentence makes two astounding claims: third parties can't access personally identifiable information, and any data shared is "entirely anonymous and fully encrypted." To start, it's refreshing to hear explicit confirmation that Mattel was planning on sharing data with third parties. However, their claims about not sharing personal information are a red herring. Without clarity on how they are anonymizing information, what the prohibitions are on attempts to re-identify the data set, why they are sharing data, and with whom they are sharing data, they aren't offering anything reassuring here. Finally, claiming that data are "fully encrypted" is meaningless: encrypted in transit? At rest? Is encryption in place between storage devices inside their network? While strong encryption is a necessary starting point, encryption isn't a blanket. There are multiple layers to using encryption to protect information, and a robust security program focuses on human and technical steps. Encryption is a piece of this, but only a piece.

Yesterday, Mattel announced that they were cancelling Aristotle. This is the right decision, but we shouldn't confuse this with good news. It was only two years ago that Mattel brought Spy Barbie -- complete with multiple security issues -- into the world.

People of all ages are all currently exposed via devices that have sub-par privacy and security practices, and privacy policies that do not respect the people buying and using the products. Everything from Amazon's Echo and Alexa products, to Google Home and Family products, to Siri, to Cortana, to Google's voice search on phones, to "Smart" TVs, to connected toys, to online baby monitors -- all of these devices have potential security issues, and opaque privacy terms. In most cases, people using these products have no idea about what information is collected, when it is collected, how long it is stored, who can access it, and/or how it can be used over time. When adults use these devices around kids, we send the clear message that this invisible and constant surveillance should not be questioned because it provides a convenience.

The mistake Mattel made this time was introducing a utilitarian object. If they had wrapped Aristotle in a toy, they'd be home free.

My prediction: in 2018, Bearistotle will be the must-have toy of the season -- the friendliest, most helpful bear any child will ever need. It will retail for the bargain price of $499.99, and if you enable geotagging it will create a digital portfolio of childhood highlights to use in preschool applications.