For real-time ad bidding, in which ads are purchased in the milliseconds after a user arrives at a Web page, ad services "match cookies" so that both sides know who a user is. While that information may not be stored by both companies (that is, it's not added to a user's persistent file), the matching means that the walls between online data selves are falling away quickly. Everyone can know who you are, even if they call you by a different number.
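To make the mechanics concrete, here is a minimal sketch in Python of what a cookie match amounts to. It is an illustration under stated assumptions, not any ad company's actual system: the names `MatchTable`, `record_sync`, and `bid_request`, along with the two cookie values, are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class MatchTable:
    """Hypothetical lookup kept by one company: its own cookie ID -> a partner's ID."""

    own_to_partner: Dict[str, str] = field(default_factory=dict)

    def record_sync(self, own_id: str, partner_id: str) -> None:
        # Called once both companies have seen the same browser and swapped IDs.
        self.own_to_partner[own_id] = partner_id

    def partner_id_for(self, own_id: str) -> Optional[str]:
        # Returns None when the two companies have never matched this browser.
        return self.own_to_partner.get(own_id)


def bid_request(exchange_id: str, table: MatchTable) -> Dict[str, Optional[str]]:
    # The exchange attaches the bidder's own number for the user, if it is known,
    # so the bidder can recognize the browser within the auction's time budget.
    return {
        "exchange_id": exchange_id,
        "bidder_id": table.partner_id_for(exchange_id),
    }


# The same browser, known to each company by a different (made-up) number.
exchange_cookie = "ex-3211466"
bidder_cookie = "bid-90210"

table = MatchTable()
table.record_sync(exchange_cookie, bidder_cookie)

request = bid_request(exchange_cookie, table)
```

Note that no name or address appears anywhere in the sketch. The match only says that two numbers refer to the same browser, which is all either company needs to recognize you.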
Furthermore, many companies exist simply to collect data and sell it to other companies. Anyone can combine multiple databases into a fully fleshed-out digital portrait. As a Wall Street Journal investigation put it, data companies are "transforming the Internet into a place where people are becoming anonymous in name only." Joe Turow, who recently published a book on online privacy, had even stronger words:
If a company can follow your behavior in the digital environment -- an environment that potentially includes your mobile phone and television set -- its claim that you are "anonymous" is meaningless. That is particularly true when firms intermittently add off-line information such as shopping patterns and the value of your house to their online data and then simply strip the name and address to make it "anonymous." It matters little if your name is John Smith, Yesh Mispar, or 3211466. The persistence of information about you will lead firms to act based on what they know, share, and care about you, whether you know it is happening or not.
Militating against this collapse of privacy is a protection embedded in the very nature of the online advertising system. No person could ever actually look over the world's Web tracks. It would be too expensive, and even if you had all the human laborers in the world, they couldn't do the math fast enough to constantly recalculate Web surfers' value to advertisers. So machines are the ones that do all of the work.
When new technologies come up against our expectations of privacy, I think it's helpful to make a real-world analogy. But we just do not have an adequate understanding of anonymity in a world where machines can parse all of our behavior without human oversight. Most obviously, with the machine, you have more privacy than if a person were watching your clickstreams, picking up collateral knowledge. A human could easily apply analytical reasoning skills to figure out who you were. And any human could use this data for unauthorized purposes. With our data-driven advertising world, we are relying on machines' current dumbness and inability to "know too much."
This is a double-edged sword. The current levels of machine intelligence insulate us from privacy catastrophe, so we let data be collected about us. But we know that this data is not going away and that machine intelligence is growing rapidly. The results of this process are ineluctable. Left to their own devices, ad tracking firms will eventually be able to connect your various data selves. And then they will break down the name wall, if they are allowed to.
Your visit to this story probably generated data for 13 companies through our website. The great downside to this beautiful, free Web that we have is that you have to sell your digital self in order to access it. If you'd like to stop data collection, take a look at Do Not Track Plus. It goes beyond Collusion and browser-based controls in blocking data collection outright.
But I am ultimately unsure what I think about using these tools. Rhetorically, they imply that there will be technological solutions to these data collection problems. Undoubtedly, tech elites will use them. The problem is that the vast majority of Internet users will never know what's churning beneath their browsers. And the advertising lobby is explicitly opposed to setting browser defaults for higher levels of "Do Not Track" privacy. There will be nothing to protect those users from unwittingly giving away vast amounts of data about who they are.
On the other hand, these are the tools that allow websites to eke out a tiny bit more money than they otherwise would. I am all too aware of how difficult it is for media businesses to survive in this new environment. Sure, we could all throw up paywalls and try to make a lot more money from a lot fewer readers. But that would destroy what makes the Web the unique resource in human history that it is. I want to keep the Internet healthy, which really does mean keeping money flowing from advertising.
I wish there were more obvious villains in this story. The saving grace may end up being that as companies move to more obtrusive, higher-production-value ads, targeting may become ineffective. Avi Goldfarb of the Rotman School of Management and Catherine Tucker of MIT's Sloan School found last year that the big, obtrusive ads that marketers love do not work better with targeting, but worse.
"Ads that match both website content and are obtrusive do worse at increasing purchase intent than ads that do only one or the other," they wrote in a 2011 Marketing Science journal paper. "This failure appears to be related to privacy concerns: The negative effect of combining targeting with obtrusiveness is strongest for people who refuse to give their income and for categories where privacy matters most."
Perhaps there are natural limits to what data targeting can do for advertisers. When we look back in 10 years at why data collection practices changed, it may not be because of regulation or self-regulation or a user uprising. No, it will be because the best ads could not be targeted. It will be because the whole idea did not work, and the best minds of the next generation turned their attention to something else.