Wednesday, February 27, 2013

Reinventing Society in the Wake of Big Data

Recently I seem to have become MIT's Big Data guy, with people like Tim O'Reilly and "Forbes" calling me one of the seven most powerful data scientists in the world. I'm not sure what all of that means, but I have a distinctive view about Big Data, so maybe it is something that people want to hear.

I believe that the power of Big Data is that it is information about people's behavior instead of information about their beliefs. It's about the behavior of customers, employees, and prospects for your new business. It's not about the things you post on Facebook, and it's not about your searches on Google, which is what most people think about, and it's not data from internal company processes and RFIDs. This sort of Big Data comes from things like location data off of your cell phone or credit card, it's the little data breadcrumbs that you leave behind you as you move around in the world.

What those breadcrumbs tell is the story of your life. It tells what you've chosen to do. That's very different than what you put on Facebook. What you put on Facebook is what you would like to tell people, edited according to the standards of the day. Who you actually are is determined by where you spend time, and which things you buy. Big data is increasingly about real behavior, and by analyzing this sort of data, scientists can tell an enormous amount about you. They can tell whether you are the sort of person who will pay back loans. They can tell you if you're likely to get diabetes.

They can do this because the sort of person you are is largely determined by your social context, so if I can see some of your behaviors, I can infer the rest, just by comparing you to the people in your crowd. You can tell all sorts of things about a person, even though it's not explicitly in the data, because people are so enmeshed in the surrounding social fabric that it determines the sorts of things that they think are normal, and what behaviors they will learn from each other.

As a consequence analysis of Big Data is increasingly about finding connections, connections with the people around you, and connections between people's behavior and outcomes. You can see this in all sorts of places. For instance, one type of Big Data and connection analysis concerns financial data. Not just the flash crash or the Great Recession, but also all the other sorts of bubbles that occur. What these are is these are systems of people, communications, and decisions that go badly awry. Big Data shows us the connections that cause these events. Big data gives us the possibility of understanding how these systems of people and machines work, and whether they're stable.

The notion that it is connections between people that is really important is key, because researchers have mostly been trying to understand things like financial bubbles using what is called Complexity Science or Web Science. But these older ways of thinking about Big Data leaves the humans out of the equation. What actually matters is how the people are connected together by the machines and how, as a whole, they create a financial market, a government, a company, and other social structures.

Because it is so important to understand these connections Asu Ozdaglar and I have recently created the MIT Center for Connection Science and Engineering, which spans all of the different MIT departments and schools. It's one of the very first MIT-wide Centers, because people from all sorts of specialties are coming to understand that it is the connections between people that is actually the core problem in making transportation systems work well, in making energy grids work efficiently, and in making financial systems stable. Markets are not just about rules or algorithms; they're about people and algorithms together.

Understanding these human-machine systems is what's going to make our future social systems stable and safe. We are getting beyond complexity, data science and web science, because we are including people as a key part of these systems. That's the promise of Big Data, to really understand the systems that make our technological society. As you begin to understand them, then you can build systems that are better. The promise is for financial systems that don't melt down, governments that don't get mired in inaction, health systems that actually work, and so on, and so forth.

The barriers to better societal systems are not about the size or speed of data. They're not about most of the things that people are focusing on when they talk about Big Data. Instead, the challenge is to figure out how to analyze the connections in this deluge of data and come to a new way of building systems based on understanding these connections.

by Sandy Pentland, Edge |  Read more:
Photo: uncredited