Big data has become all the rage. The Internet and other technologies of data collection and surveillance have produced massive amounts of information. Big business and the scientific community now want to put that information in the hands of government to use to solve all sorts of policy problems. As the TechAmerica Foundation highlights in their new report, Demystifying Big Data, computational technologies now offer governments the capability to understand and manage social and economic processes on scales unimaginable even a few years ago.
I’ll be clear up front: I am not a fan of big data. Much of the criticism of big data is lodged in the problem of privacy. My concerns are different, however. The real problem is the scope of government power. We often talk about big government in terms of the size of government budgets and personnel, but those are just proxies for a more important problem, the exercise of its influence in our lives. Big data will lead government to have far greater influence in our lives, arguably in far more insidious and less visible ways. It is, ultimately, a tool for social control and engineering, which is what makes it appealing to both technocrats in the center left and the law and order crowd in the center right. Community organizers and libertarians, alike, watch out. The state is about to become enormous.
Privacy protection can’t fix this problem. Privacy is about what the government, or a company, knows about you as an individual. If someone were able to access that data and use it for nefarious purposes, like denying insurance or rejecting a job application, they could potentially do significant damage to you and to your family. That’s bad. And it’s why TechAmerica’s report highlights that big data can be fashioned in ways that protect against privacy intrusions. Big data, they suggest, is really about aggregate data anyway. Nothing should be able to link identifying data to you. If identifying data must remain, access to that information should be severely restricted to prevent abuse.
But privacy protection falls short of addressing several major problems with big data. For example, there are some especially crucial applications of big data that not only require identifying information but that require its constant and pervasive use. Two illustrations. Political parties in the United States have amassed massive databases about individuals designed to allow them to target individual voters with tailored messaging in order to mobilize votes for their candidates and suppress votes for other candidates. Likewise, retail stores are amassing massive databases about individuals in order to be able to tailor marketing and advertising efforts to your individual preferences and tastes. In both cases, identifying data is absolutely essential in order both to construct the profile in the first place (the database must link a large set of purchasing and other decisions to one another and to you) and to make the link between the data profile and you as a target of individually tailored political or economic persuasion. In these cases, protecting the privacy of data (i.e., restricting access to data) is not the primary concern; rather, protecting against inappropriate political and economic manipulation is.
The second and subtler problem is that de-identified, aggregated data is in reality worse from the standpoint of government intrusion into our lives. Here’s why. Data about you as an individual can be used to manipulate, threaten, or sanction you, as an individual. But aggregate data about large groups can be used to manipulate entire populations. The US Constitution, which protects individuals from the arbitrary and capricious exercise of state power, says nothing about social manipulation on grand scales.
Economists and behavioral psychologists, for example, including senior officials in the Obama Administration, have become enamored of the theory of nudge as an instrument of public policy. Nudge hypothesizes that if you structure a decision choice in a way that takes advantage of people’s emotional or psychological inclinations, you can still give them a free choice while generating better public policy outcomes (read that as societies that comply better with how the government believes they should behave). But let’s look at one of their most common examples. If you structure employee choices to contribute to a 401(k) so that the default is to contribute, rather than to not contribute, many more people will actually contribute to their voluntary retirement accounts. So, technically, those people all had a choice. They could have opted out. But if the result of government rules is that 50% more people contribute to 401(k) plans, then the government has, as a matter of fact and very real societal outcome, manipulated the US public into behaving the way the government decided was best.
This is precisely the point of most big data applications for the business of government. Big data wants, for example, to measure health care outcomes comparatively in order to make probabilistic models of which treatments would best work for you, as an individual, and then to usurp the judgment of you and your doctor in favor of their model by refusing to pay for treatment unless the models approve. Similarly, big data wants to shape where and how people drive, how children are educated, how poor mothers take care of themselves during pregnancy, etc. The applications, as TechAmerica suggests, are pervasive across all levels and domains of government concern.
The fact that all of this is designed to structure society more deeply and thoroughly in ways determined by government experts isn’t discussed at all in the TechAmerica report. How could it be? It’s the point. In the spirit of Taylorism and Fordism, big data threatens to use the promise of efficiency and optimization to turn us all into gears in the giant machines that are today’s complex technological societies. It’s not hard to see why. 21st Century societies face significant problems that will require enormous creativity and collective innovation to solve. Big data offers the illusion of rational solutions to these problems through socio-technological optimization, but it is just an illusion. To paraphrase James Scott’s Seeing Like a State, the hubris of those who have believed they can use big data to know and govern society has given rise to some of humanity’s worst technological failures.
The problems with hyper-rationalist social engineering are legion. Numbers can capture patterns of behavior, but they cannot capture the realities of meaning, identity, value, spirit, drive, ambition, love, and hatred that underpin and shape behavior. Communities are collectives, not aggregates of individuals. Local vibrancy overwhelms simplified models. Data—even big data—is inevitably biased toward metrics that are easily and inexpensively measured, or toward metrics that someone has decided to pay to collect. Data is value-laden. But the biggest problem of all is that big data is an instrument of big organizations, whether big government or big companies. It is an instrument of power that threatens not only the freedom of individuals but also the freedom of societies.
For more on big data and its impact, read “Campaigns Mine Personal Lives to Get Out Vote” from the New York Times.
Clark Miller is the associate director at the Consortium for Science, Policy and Outcomes at Arizona State University and is an associate professor of science policy and political science.