Posted a deck at the wonderful Haiku Deck. A talk I gave at the recent Arup get-together on Big Data in NYC, although I stuck pretty close to social data and the intersection of social trends in work: the near-term emergence of Watson-grade AI and data analysis in combination with the fall of postmodern shallow organizational culture and the rise of the postnormal deep business culture. Check it out.
It raises more questions than it answers.
“Drawing workshop - B.A.C. of 14th district of Paris” by Julien Prévieux, who asked cops in Paris to draw Voronoi diagrams based on maps showing recent crimes… as a way to make them understand each step of crime prediction algorithms:
A month and a half ago, I set up a drawing workshop with four police officers from the 14th arrondissement in Paris: Benjamin Ferran, Gerald Fidalgo, Blaise Thomas and Mickael Malvaud. The goal of the workshop was to learn to draw “Vorfonoi diagrams” [sic: Voronoi] manually from maps identifying recent crimes. These diagrams, used in the U.S. but not in France yet, are among the mapping analysis tools used to visualize crimes in real time to deploy patrols accordingly. Usually they are made by computer, but I offered the French policemen a chance to draw them by hand, taking the time to execute one by one the different steps of the algorithm. The exercise is slow and laborious and requires a precision that is difficult to obtain. With this technique of traditional drawing, the optimization tool is stripped of its primary function by invariably producing the results too late. But what you lose in efficiency is certainly gained in other areas: intensive drawing practice at weekends and holidays, in-depth exploration of the technical division of surfaces into polygons, discussions about Police processing systems(?) and introduction of new management methods, and production of a series of very successful abstract drawings.
The framed example above is “theft from cars” (November 2004, Paris 75017. Drawing executed by Gérald Fidalgo, unique piece, series of 7 + 7 drafts.)
The tessellation of a Voronoi map is actually straightforward to conceive but difficult to execute:
This technique is faster if you successively approximate: manually draw guesstimated regions around each spot, and after a pass, return to each, making the blobs non-overlapping. Software is good at this; people, so-so.
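To make the software side concrete, here is a minimal Python sketch, not the method the workshop or any police department used. It swaps the hand-drawn successive approximation for the bluntest possible approach: label every cell of a grid with whichever site is nearest. The coordinates are made up for illustration, and the function names are my own.

```python
import math

def nearest_site(point, sites):
    """Return the index of the site closest to point (Euclidean distance)."""
    return min(range(len(sites)), key=lambda i: math.dist(point, sites[i]))

def voronoi_grid(sites, width, height):
    """Label every cell of a width x height grid with its nearest site."""
    return [[nearest_site((x, y), sites) for x in range(width)]
            for y in range(height)]

def region_areas(grid, n_sites):
    """Count how many grid cells fall into each site's region."""
    counts = [0] * n_sites
    for row in grid:
        for label in row:
            counts[label] += 1
    return counts

# Hypothetical incident locations on a 12 x 6 map (invented coordinates).
sites = [(2, 1), (9, 2), (5, 4)]
grid = voronoi_grid(sites, 12, 6)

# Print the map: each digit is the index of the nearest site.
for row in grid:
    print("".join(str(label) for label in row))
print(region_areas(grid, len(sites)))
```

The boundaries between runs of digits are a coarse version of the polygon edges the officers drew by hand. A real tool would compute exact polygons rather than grid cells, but the nearest-site rule is the same.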
In the case of crime deterrence, the outcome is to rethink police cruising patterns, to concentrate on Voronoi regions of greatest density of crimes, but to cover all areas in a way that matches crime density and shortest path algorithms.
The big data slant: imagine superimposing other non-obviously correlated data sets with a map like this. Social networks of known criminals. Street light maps and luminosity indices. Walking patterns, and locations of public transport. Analyses of how long cars are parked in each region.
People can’t crunch all these variables to determine which are correlated and to what degree, but big data analysis can. And soon, it will cost nothing to peer into the fog of this snarl of data and winnow out the facts. It will be like turning on a flashlight on a dark street.
This is not the same as prediction, but it is so close that there is no difference. You’ll know that a cloudy Sunday night when there is a football game at the high school means that in a district a mile away thefts from cars will spike. You could stop that by redirecting two more police cruisers to that neighborhood.
And the deep data machines could be cranking that for all sorts of crimes — not just thefts from cars — and optimize to distribute cruisers and foot patrols based on the goals of minimizing violent crimes first, felonies second, and so on. So cops won’t be hassling kids for smoking on school grounds when they should be patrolling the next district to counter armed robberies and burglaries.
Kate Crawford on “Why Big Data is Not Truth”
Brian is one of my closest friends, and I owe him huge for letting me live on his boat for a few months a few years back. The boat was based in the marina behind Willie Mays field in San Francisco, and my office was only a block away, at that time.
Since then, Brian has become a highly influential and well-known author and speaker, with works like The End of Business as Usual, Engage! and What’s the Future of Business (WTF). I am honored to have him consider me a friend and influence, as he is to me.
Brian Solis is principal at Altimeter Group, a research firm focused on disruptive technology. A digital analyst, sociologist, and futurist, Solis has studied and influenced the effects of emerging technology on business, marketing, and culture. Solis is also globally recognized as one of the most prominent thought leaders and published authors in new media. His new book, What’s the Future of Business (WTF), explores the landscape of connected consumerism and how business and customer relationships unfold and flourish in four distinct moments of truth.
Stowe Boyd: I’m glad we could get together and chat. I wonder what your take is on big data. There is so much buzz, and over-the-top metaphors. It’s the new oil, it’s the new internet! I know you have some reservations, so I’ll ask you, how do you see big data playing in the transformations we know are needed in the future of business?
Brian Solis: I’ll start with a caveat. My day job is as a digital analyst. I study tech disruption, and I try and make sense of it for business. And what I hear quite a bit is that big data is big, it is the savior of the future of business, because it finally puts business in alignment with customer activity and customer expectations, so that businesses can move from the rigid format of today to a miraculous state of prediction, or being able to look into the crystal ball to make really great decisions. And I hear a lot these days that business professionals and executives are making decisions based on experience or gut, and with big data they will have the answers they need to do the right thing at the right time…
SB: Sounds like you really think the hype about peering into the future through big data, or based on past management practices, is a bit suspect. What kind of mistake is latent in that thinking?
BS: It’s a technology-first decision. People are reacting to technology. People are assuming that the data coming in is going to have a tremendous amount of insight baked in. I think the future of business lies at the intersection of data science, digital anthropology, sociology, ethnography, and psychology. Because without someone on the human side to track behavior we can’t necessarily map the journey. We can’t identify new touch points. We can’t really surface new expectations or opportunities without understanding human beings.
SB: Well, I completely concur. As William James said, ‘you judge a man’s intelligence by how well he agrees with you.’ By that measure, you are a wise man indeed. I’m going to raise a second question, which is almost unrelated, but comes back to another debunking of the Santa Claus, Tooth Fairy perspective of big data that people seem to have. And that is the fundamental limits of certainty in a world that has grown so massively complex that certainty fails. There is a great deal of science that supports the theory that large complex systems can’t be peered into. They are too chaotic, and the mathematics suggests that even if we could understand them, we don’t know the initial state of the system, and so it’s unknowable. So that suggests there is a fundamental limit on what we can project, no matter how much data we collect.
BS: I think part of this is the hope or the optimism that there has to be an answer to fix business, and I think people are looking to any kind of technology or business practice that they can. There’s that old saying about history repeating itself…
SB: ‘History doesn’t repeat itself, but it rhymes.’ Mark Twain.
BS: Exactly! Thank you. If you think about it, there’s a big disconnect between where technology comes into the organization and where decision making takes place in the organization. We’ve learned that from past mistakes. The people making the decisions about what to do and where to head the company are disconnected from technology. A lot of decision-makers don’t even read their own email. And what ends up happening is then on one hand you have one person reporting to shareholders and stakeholders, making decisions based on spreadsheets and powerpoints. Then on the other, you have people who have real access to data or are in the real world. You almost have an Undercover Boss moment, where the decision makers need to go back in and think about what it’s like to be the employee or that customer. You have a little bit of empathy that comes back into the mix. Big data is that representation. It’s either coming through the marketing department or some silo because it’s not getting anywhere it needs to because, number one, there’s no one on the inside to translate those insights into actionable insights and tie it to tangible benefits to the organization. And the second thing is, no one on the organizational level is listening. That’s the problem. And no matter how smart we get with predictive algorithms it doesn’t matter, because without understanding social science, without aligning with a bigger mission or vision with what we are trying to do — something that is going to matter to people — we are just managing businesses the way we always have. We are not moving in any new direction.
SB: So, if it’s going to turn out that big data is going to do something, but it won’t be a panacea, what should the most prescient of CEOs be considering? What should they be spending their time on? How are they going to get that extra productivity, or competitiveness, or intensity that they are reporting they are after? And they don’t think they can get that extra boost with the techniques used before, like making people work long hours or automating business processes. Those seem to be tapped out.
BS: What’s that Rita Mae Brown quote? The definition of insanity is doing the same thing over and over but expecting different results. And you look at how we are making decisions about this new technology, and how we are looking at the other technologies that are producing these disruptions like social, mobile, and real time. We have access to incredible data, but we put it into the same processes, and use the same methodology and philosophy to determine what we do with these technologies. And then we expect different results. That’s just not going to happen. As an analyst, one of the things I have to answer in my research is ‘what are the questions that are going to be answered based on the research?’ The problem I have is that in an era of uncertainty people aren’t asking the right questions. Those questions might be like ‘What are the early warning signals I need to pay attention to that determine that my business is off the rails?’ or ‘How do I convince shareholders that we need to do things differently even though we are incredibly profitable right now?’ These are questions that leaders or visionaries ask. Not necessarily incredible managers or CEOs. And the problem is that someone is going to have to become that champion within the organization. And I do think big data is an opportunity, but it takes someone on the outside to say ‘This is so telling! Did you realize forty-seven thousand people have had the same problem and we haven’t done anything about it?’ By the way the title of my new book is What Is The Future of Business, which stands for WTF for a reason.
SB: You sneak.
BS: I wanted to show I can talk the talk and walk the walk, so I spent a good couple of years as an aspiring digital anthropologist and ethnographer, and studied — using big data — the decision making cycles using different kinds of customers and different types of industries, and I mapped it. And I showed how you can distill your business into four moments of truth. What you can do in those four moments of truth to grow your business to increase awareness for a much more distracted and connected customer than ever before. You take the data, you take the social science, you take just a bit of empathy, and you can do a lot of amazing things. And that to me is where we have to start thinking. And by the way, I’m doing what I’m telling everyone else to do. You have to fight the fight. Revolutionaries are revolutionaries for a reason. You have to speak the language of the C suite, if you will, and you have to show how this matters to them and how this is going to help them.
SB: You’ve made a great example of the sorts of things that I think are necessary, the things that underlie this series, Socialogy — the theory and practice of social business based on scientific principles — looking to outside disciplines, outside of the conventional skill sets of business people, looking to fields like cognitive science and network analysis. A great example is Paul Kedrosky’s Ladder Index. Paul learned that the state of California was publishing data on what sorts of debris was being recovered on the state’s roads, stuff that was falling off of trucks and cars. He tracked this data for a few months and one of the things he discovered was that ladders had an interesting trajectory, first rising and then starting to fall. He then started to plot that against housing prices, and it turned out to be a perfect predictor of housing prices. Because, after all, as housing starts fall, the number of carpenters driving to job sites with ladders on their pickups starts to fall, and so too will the number of ladders falling off those trucks. So it’s an example of social data, the behavior of people aggregated neatly. But it wasn’t very big, actually, only a few hundred or so ladders per month, and only a few months of data. And he applied the insight about who are the owners of these ladders falling onto the highway? Why construction workers, mostly. And he correlated that with housing data, and discovered the Ladder Index of Housing. Obviously, the state of California was not providing that analysis along with the raw data in the spreadsheet.
BS: I love that example, and you bring it home to everyone. When we look on the horizon about what big data can do and what businesses need, you realize that just as much as the data you need people like that, to make the inference, and to check it out. And that kind of sense is not common.
SB: Kedrosky is a world-class economist.
BS: Businesses are going to have to pay attention, because as soon as two or three of your competitors figure it out, they are going to have a tremendous competitive advantage over you. Why react to that? That’s when digital Darwinism starts to happen: when society and technology change faster than your ability to adapt. Get it in place now.
SB: Brian, next time we have to do this face to face, with champagne, so I can toast you to say thanks.
BS: Thanks for having me.
So the critical factor isn’t just amassing more data about people, but deepening our understanding about people, people connected in complex social systems, like markets, cities, and companies. I recall the quote by Robert Heilbroner,
The human psyche can tolerate a great deal of prospective misery, but it cannot bear the thought that the future is beyond all power of anticipation.
People want to believe we can see around the corner to the future, in order to make positive changes. But we can’t let our desires cloud our thinking, and give in to the dream of big data solving all of the big questions. I agree with Brian: we need a dollop of empathy to make sense of the human world, not just number crunching.
This post was written as part of the IBM for Midsize Business program, which provides midsize businesses with the tools, expertise and solutions they need to become engines of a smarter planet. I’ve been compensated to contribute to this program, but the opinions expressed in this post are my own and don’t necessarily represent IBM’s positions, strategies or opinions.
Geoffrey West takes a look at the unknowability of the innards of complex systems, and wonders if big data — and a new big theory to corral big data — could act like a flashlight, revealing the inner workings of financial markets, urban communities, and ecosystems under stress.
Geoffrey West, Big Data Needs a Big Theory to Go with It
Complexity comes into play when there are many parts that can interact in many different ways so that the whole takes on a life of its own: it adapts and evolves in response to changing conditions. It can be prone to sudden and seemingly unpredictable changes—a market crash is the classic example. One or more trends can reinforce other trends in a “positive feedback loop” until things swiftly spiral out of control and cross a tipping point, beyond which behavior changes radically.
What makes a “complex system” so vexing is that its collective characteristics cannot easily be predicted from underlying components: the whole is greater than, and often significantly different from, the sum of its parts. A city is much more than its buildings and people. Our bodies are more than the totality of our cells. This quality, called emergent behavior, is characteristic of economies, financial markets, urban communities, companies, organisms, the Internet, galaxies and the health care system.
The digital revolution is driving much of the increasing complexity and pace of life we are now seeing, but this technology also presents an opportunity. The ubiquity of cell phones and electronic transactions, the increasing use of personal medical probes, and the concept of the electronically wired “smart city” are already providing us with enormous amounts of data. With new computational tools and techniques to digest vast, interrelated databases, researchers and practitioners in science, technology, business and government have begun to bring large-scale simulations and models to bear on questions formerly out of reach of quantitative analysis, such as how cooperation emerges in society, what conditions promote innovation, and how conflicts spread and grow.
The trouble is, we don’t have a unified, conceptual framework for addressing questions of complexity. We don’t know what kind of data we need, nor how much, or what critical questions we should be asking. “Big data” without a “big theory” to go with it loses much of its potency and usefulness, potentially generating new unintended consequences.
When the industrial age focused society’s attention on energy in its many manifestations—steam, chemical, mechanical, and so on—the universal laws of thermodynamics came as a response. We now need to ask if our age can produce universal laws of complexity that integrate energy with information. What are the underlying principles that transcend the extraordinary diversity and historical contingency and interconnectivity of financial markets, populations, ecosystems, war and conflict, pandemics and cancer? An overarching predictive, mathematical framework for complex systems would, in principle, incorporate the dynamics and organization of any complex system in a quantitative, computable framework.
Complexity isn’t a clock. You can’t open one up and see its innards. There are no gears and cogs. If there was a way to ‘look inside’, all you’d find would be more complex systems. And those complex systems aren’t connected in a purely physical way, made up of computable inputs and outputs: they are united by emergent behaviors: the system manifests its character by acting in ways that are inherently unpredictable, and incalculable. These behaviors arise from the interactions between the components, but reside in none of them. The emergent is everywhere and nowhere.
We have no math for this.
West might as well be saying ‘We need to be able to see into the future. It would be helpful.’ But that doesn’t mean we have a way to do it, or that it is doable at all.
The tempo of modern life has sped up to the point that the future feels closer, and since it’s only a heartbeat away it seems reasonable to imagine being able to glance around that corner and know what is about to transpire. But that’s just a feeling.
The future is actually farther away than ever, because we have constructed a world that is the most multi-faceted astrolabe, the most incestuous interconnection of global economic interdependencies, the deepest ingraining of contingent political scenarios, and the widest pending cascade of possible ecological side-effects. The more we have wired everything into everything else, the less we can know about what will happen tomorrow.
In essence, West hopes we can create a math that can pile up all the big data and crunch it, in a Borgesian infinity. A machinery as complex as the world it hopes to fathom, allowing us — or at least it — to know everything about everything.
I suspect we will have to settle for something less.
We could start by intentionally decoupling complexity that poses threats. Derivatives trading and credit default swaps are good examples. Efforts by banks and brokerages to diffuse risks by sharing them with other finance companies lead to increased risk, systemically. When there is a big downturn the risks are amplified, and the cascade leads to huge ‘unintended’ results. The solution to this is not predicting when and how it will happen, but stopping the increased complexity inherent in derivatives and credit default swaps. The only cure for increased complexity is decoupling components of the larger system.
Not too long ago, I expressed some disbelief about the techno-utopianism that seems to surround discussions of big data.
Stowe Boyd, Re: The Future Impact Of Big Data
Unconstrained and dynamic complex systems — like our society, the economic system of Europe, or the Earth’s weather — are fundamentally unknowable: their progression from one state to another cannot be predicted consistently, even if you have a relatively good understanding of both the starting state and the present state, because the behavior of the system as a whole is an emergent property of the interconnections between the parts. And the parts are themselves made up of interconnected parts, and so on.
Yes, weather forecasting and other scientific domains have been benefited by better models and more data, and more data and bigger analysis approaches will increase the level of consistency for weather, but only to a certain extent. There are rounding errors that grow from the imprecision of measures and oversimplifications in our models, so that even something as potentially opaque as the weather — where no one is intentionally hiding data, or degrading it — cannot be predicted completely. In everyday life, this is why the weather forecast for the next few hours is several orders of magnitude better than the forecast for 10 days ahead. Big data — as currently conceived — may allow us to improve weather prediction for the next 10 days dramatically, but the inverse square law of predictability means that predictions about the weather 10 months ahead are unlikely to dramatically improve.
So, consider it this way: Big data is unlikely to increase the certainty about what is going to happen in anything but the nearest of near futures — in weather, politics, and buying behavior — because uncertainty and volatility grow along with the interconnectedness of human activities and institutions across the world. Big data is itself a factor in the increased interconnectedness of the world: as companies, governments, and individuals take advantage of insights gleaned from big data, we are making the world more tightly interconnected, and as a result (perhaps unintuitively) less predictable.
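The point about rounding errors growing can be illustrated with a toy, though it is only a toy. The sketch below uses the logistic map, a standard textbook example of chaotic behavior, as a stand-in; it is not a weather model, and it says nothing about the exact rate at which real forecasts degrade. Two runs start from values that differ by one part in ten billion, and the gap between them is tracked step by step. The starting value and step count are arbitrary choices of mine.

```python
def logistic_trajectory(x0, steps, r=4.0):
    """Iterate x -> r * x * (1 - x) and return every value visited."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

# Two starting states that differ by a tiny "measurement error".
a = logistic_trajectory(0.2, 60)
b = logistic_trajectory(0.2 + 1e-10, 60)

# Absolute gap between the two runs at each step.
gaps = [abs(x - y) for x, y in zip(a, b)]

for step in range(0, 61, 10):
    print(step, gaps[step])
```

Early on the two runs are indistinguishable; later the gap becomes as large as the values themselves, which is the sense in which a small imprecision in the starting state limits how far ahead a forecast can reach.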
I saw a post today from Nassim Taleb, that echoes my points, from a more mathematical basis:
Nassim Taleb, Beware the Big Errors of ‘Big Data’
We’re more fooled by noise than ever before, and it’s because of a nasty phenomenon called “big data.” With big data, researchers have brought cherry-picking to an industrial level.
Modernity provides too many variables, but too little data per variable. So the spurious relationships grow much, much faster than real information.
In other words: Big data may mean more information, but it also means more false information.
Just like bankers who own a free option — where they make the profits and transfer losses to others – researchers have the ability to pick whatever statistics confirm their beliefs (or show good results) … and then ditch the rest.
Big-data researchers have the option to stop doing their research once they have the right result. In options language: The researcher gets the “upside” and truth gets the “downside.” It makes him antifragile, that is, capable of benefiting from complexity and uncertainty — and at the expense of others.
But beyond that, big data means anyone can find fake statistical relationships, since the spurious rises to the surface. This is because in large data sets, large deviations are vastly more attributable to variance (or noise) than to information (or signal). It’s a property of sampling: In real life there is no cherry-picking, but on the researcher’s computer, there is. Large deviations are likely to be bogus.
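Taleb's arithmetic about many variables and little data per variable can be sketched in a few lines of Python. This is my own illustration, not his code, and the sizes and the 0.6 threshold are arbitrary assumptions. Every variable is pure random noise, independent of all the others, so any strong correlation the script finds is bogus by construction.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def spurious_pairs(n_vars, n_obs, threshold, seed=0):
    """Count pairs of independent noise variables with |r| above threshold."""
    rng = random.Random(seed)
    data = [[rng.gauss(0, 1) for _ in range(n_obs)] for _ in range(n_vars)]
    hits = pairs = 0
    for i in range(n_vars):
        for j in range(i + 1, n_vars):
            pairs += 1
            if abs(pearson(data[i], data[j])) > threshold:
                hits += 1
    return hits, pairs

# Same small sample per variable, more and more variables.
for n_vars in (10, 50, 200):
    hits, pairs = spurious_pairs(n_vars, n_obs=10, threshold=0.6)
    print(n_vars, "variables,", pairs, "pairs,", hits, "look correlated")
```

The number of pairs to compare grows roughly with the square of the number of variables, so with a fixed, small sample per variable the count of impressive-looking but meaningless relationships grows with it.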
As data sets grow — especially data about complex, adaptive systems, like world economics, or human interactions — the proportion of noise grows as a function of the complexity.
Big data is not going to be some Hubble telescope peering into the heart of the universe. The world, alas, is not becoming more knowable.
I wrote a guest post at #WETHEDATA, ‘Data As Commons’, where I suggest we need to think about collective solutions to personal data, not individualist approaches. Take a look.
Lara Lee, cited by Stephanie Clifford in Social Media Are Giving a Voice to Taste Buds via NYTimes.com
In a piece about the fad flavors for corn chips and cosmetics colors is buried a bit of deep insight by Lara Lee, chief innovation and operating officer at the design consultancy Continuum, which helped design the Swiffer and the One Laptop per Child project.
Our knowledge is constrained by the fabric of the post-normal. The notion that there is a deterministic future ahead of us, rolling out like a yellow brick road, is an illusion. Next year emerges out of an opaque sea of trillions of semi-independent decisions made in the present by billions of individuals and groups, cascading into each other and impacting each other in literally unknowable ways. When systems become as complex as the modern world there are no tools that can see more than a very short distance into the future.
Yes, taste makers can concoct a spicy chip that sells well this season in southern California, or what beer will be popular in NYC for Labor Day, but we can’t predict, for example, the invention of alternatives to antibiotics in a world where bugs are growing antibiotic-resistant. There are limits to our knowledge:
Big data is unlikely to increase the certainty about what is going to happen in anything but the nearest of near futures — in weather, politics, and buying behavior — because uncertainty and volatility grow along with the interconnectedness of human activities and institutions across the world. Big data is itself a factor in the increased interconnectedness of the world: as companies, governments, and individuals take advantage of insights gleaned from big data, we are making the world more tightly interconnected, and as a result (perhaps unintuitively) less predictable.
I am once again a Pew Expert, featured in a big data survey:
Imagining the Internet, Elon University, The 2012 Survey: What is the potential future influence of Big Data by 2020?
A number of respondents articulated a view that could be summarized as: Humans seem to think they know more than they actually know. Still, despite all of our flaws, this new way of looking at the big picture could help.
One version of this kind of summary thought was written by Stowe Boyd […]
Overall, the growth of the ‘Internet of Things’ and ‘Big Data’ will feed the development of new capabilities in sensing, understanding, and manipulating the world. However, the underlying analytic machinery (like Bruce Sterling’s Engines of Meaning) will still require human cognition and curation to connect dots and see the big picture.
And there will be dark episodes, too, since the brightest light casts the darkest shadow. There are opportunities for terrible applications, like the growth of the surveillance society, where the authorities watch everything and analyze our actions, behavior, and movements looking for patterns of illegality, something like a real-time Minority Report.
On the other side, access to more large data can also be a blessing, so social advocacy groups may be able to amass information at a low- or zero-cost that would be unaffordable today. For example, consider the bottom-up creation of an alternative food system, outside the control of multinational agribusiness, and connecting local and regional food producers and consumers. Such a system, what I and others call Food Tech, might come together based on open data about people’s consumption, farmers’ production plans, and regional, cooperative logistics tools. So it will be a mixed bag, like most human technological advances.
Others had smart things to say, like Jerry Michalski, Jeff Jarvis, David Weinberger, danah boyd, and Janna Anderson. Go read the whole thing.
There was a recent hoo-ha at a scientific conference in France: Bernardo Huberman was furious when researchers from Google and a contributing university, presenting results of social data analysis, declined to share the data.
John Markoff, Big Data Troves Stay Forbidden to Social Scientists via NYTimes.com
The issue came to a boil last month at a scientific conference in Lyon, France, when three scientists from Google and the University of Cambridge declined to release data they had compiled for a paper on the popularity of YouTube videos in different countries.
The chairman of the conference panel — Bernardo A. Huberman, a physicist who directs the social computing group at HP Labs here — responded angrily. In the future, he said, the conference should not accept papers from authors who did not make their data public. He was greeted by applause from the audience.
In February, Dr. Huberman had published a letter in the journal Nature warning that privately held data was threatening the very basis of scientific research. “If another set of data does not validate results obtained with private data,” he asked, “how do we know if it is because they are not universal or the authors made a mistake?”
He added that corporate control of data could give preferential access to an elite group of scientists at the largest corporations. “If this trend continues,” he wrote, “we’ll see a small group of scientists with access to private data repositories enjoy an unfair amount of attention in the community at the expense of equally talented researchers whose only flaw is the lack of right ‘connections’ to private data.”
Facebook and Microsoft declined to comment on the issue. Hal Varian, Google’s chief economist, said he sympathized with the idea of open data but added that the privacy issues were significant.
“This is one of the reasons the general pattern at Google is to try to release data to everyone or no one,” he said. “I have been working to get companies to release more data about their industries. The idea is that you can provide proprietary data aggregated in a way that poses no threats to privacy.”
The debate will only intensify as large companies with deep pockets do more research about their users. “In the Internet era,” said Andreas Weigend, a physicist and former chief scientist at Amazon, “research has moved out of the universities to the Googles, Amazons and Facebooks of the world.”
And of course, big data is worth big money — leaving aside the privacy concerns — and controlling access to that data is central to the aspirations of companies like Google, Facebook, and others.
Research is just the first place where the latent data inequality of the post normal world will come to light. We will each of us — as individuals — be divided from the inherent value of information about our activities and the inferences that can be made about them. As a society, we will find corporations that do not have our interests at heart working to exploit the potential value of our aggregated data exhaust. We are an exploitable resource — like the oceans of fish or the oil beneath the ground — and these companies plan to harvest all the value without our involvement.
We will find that we don’t own the information about ourselves any more than we own our DNA. (Yes, others can patent your genes: see The Tissue-Industrial Complex.)