Elsewhere

Small Talk Is Big Again

Posted a deck at the wonderful Haiku Deck. A talk I gave at the recent Arup get-together on Big Data in NYC, although I stuck pretty close to social data and the intersection of social trends in work: the near-term emergence of Watson-grade AI and data analysis in combination with the fall of postmodern shallow organizational culture and the rise of the postnormal deep business culture. Check it out. 

It raises more questions than it answers.

betaknowledge:

“Drawing workshop - B.A.C. of the 14th district of Paris” by Julien Prévieux, who asked cops in Paris to draw Voronoi diagrams based on maps showing recent crimes… as a way to make them understand each step of crime prediction algorithms:

A month and a half ago, I set up a drawing workshop with four police officers from the 14th arrondissement in Paris: Benjamin Ferran, Gerald Fidalgo, Blaise Thomas and Mickael Malvaud. The goal of the workshop was to learn to draw “Vorfonoi diagrams” [sic: Voronoi] manually from maps identifying recent crimes. These diagrams, used in the U.S. but not in France yet, are among the mapping analysis tools used to visualize crimes in real time to deploy patrols accordingly. Usually they are made by computer, but I offered the French policemen a chance to draw them by hand, taking the time to execute one by one the different steps of the algorithm. The exercise is slow and laborious and requires a precision that is difficult to obtain. With this technique of traditional drawing, the optimization tool is stripped of its primary function by invariably producing the results too late. But what you lose in efficiency is certainly gained in other areas: intensive drawing practice at weekends and holidays, in-depth exploration of the technical division of surfaces into polygons, discussions about Police processing systems(?) and introduction of new management methods, and production of a series of very successful abstract drawings.

The framed example above is “theft from cars” (November 2004, Paris 75017. Drawing executed by Gérald Fidalgo; unique piece, series of 7 + 7 drafts.)

The tessellation of a Voronoi map is actually straightforward to conceive but difficult to execute:

  1. Put the set of geographic spots (the thefts from cars, in this case) on the geographic map.
  2. For each crime spot, create a polygon (or blob) where all the points within the blob are closer to that crime spot than any other crime spot.

This technique is faster if you successively approximate: manually draw guesstimated regions around each spot, and after a pass, return to each, making the blobs non-overlapping. Software is good at this; people, so-so.
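To make the rule in step 2 concrete, here is a minimal sketch in Python (the crime coordinates are invented for illustration, and a real tool such as scipy.spatial.Voronoi would compute the exact polygon boundaries): every cell of a coarse grid is labeled with its nearest crime spot, and the patches sharing a label are the pixelated Voronoi regions.

```python
from math import hypot

# Hypothetical theft-from-cars locations on a simple x/y map (invented for illustration).
crime_spots = [(2.0, 3.0), (7.5, 1.0), (5.0, 8.0), (9.0, 6.5)]

def nearest_spot(x, y, spots):
    """Index of the crime spot closest to map point (x, y) -- the rule in step 2."""
    return min(range(len(spots)), key=lambda i: hypot(x - spots[i][0], y - spots[i][1]))

# Label every cell of a coarse 11 x 11 grid with its nearest spot. The runs of equal
# digits that appear are (pixelated) Voronoi regions.
for y in range(10, -1, -1):
    print("".join(str(nearest_spot(x, y, crime_spots)) for x in range(11)))
```

The hand-drawn version the officers produced follows the same rule; the exact region boundaries are segments of the perpendicular bisectors between neighboring crime spots.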

In the case of crime deterrence, the outcome is to rethink police cruising patterns: concentrate on the Voronoi regions with the greatest density of crimes, but cover all areas in a way that balances crime density against shortest-path routing.

The big data slant: imagine superimposing other non-obviously correlated data sets with a map like this. Social networks of known criminals. Street light maps and luminosity indices. Walking patterns, and locations of public transport. Analyses of how long cars are parked in each region. 

People can’t crunch all these variables to determine which are correlated and to what degree, but big data analysis can. And soon, it will cost nothing to peer into the fog of this snarl of data and winnow out the facts. It will be like turning on a flashlight on a dark street.
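As a hedged sketch of the simplest version of that crunching, suppose each superimposed layer has already been aggregated to one value per Voronoi region; every column name and number below is invented for illustration.

```python
import pandas as pd

# Sketch only: one row per Voronoi region, one column per superimposed data layer.
# All names and values are hypothetical.
regions = pd.DataFrame({
    "thefts_from_cars":    [12, 3, 7, 15, 4, 9],
    "broken_streetlights": [ 5, 0, 2,  6, 1, 3],
    "avg_hours_parked":    [ 8, 2, 5,  9, 3, 6],
    "transit_stops":       [ 1, 4, 2,  0, 3, 1],
})

# How strongly does each layer track theft counts across regions?
correlations = regions.corr()["thefts_from_cars"].drop("thefts_from_cars")
print(correlations.sort_values(ascending=False))
```

Correlation at this level is only a pointer, not a cause; it tells the analyst which layers are worth a closer look.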

This is not the same as prediction, but it is so close that there is no difference. You’ll know that a cloudy Sunday night when there is a football game at the high school means that in a district a mile away thefts from cars will spike. You could stop that by redirecting two more police cruisers to that neighborhood.

And the deep data machines could be cranking that for all sorts of crimes — not just thefts from cars — and optimize to distribute cruisers and foot patrols based on the goals of minimizing violent crimes first, felonies second, and so on. So cops won’t be hassling kids for smoking on school grounds when they should be patrolling the next district to counter armed robberies and burglaries.

Socialogy: Interview with Brian Solis

Geoffrey West’s Fever Dream

Geoffrey West takes a look at the unknowability of the innards of complex systems, and wonders if big data — and a new big theory to corral big data — could act like a flashlight, revealing the inner workings of financial markets, urban communities, and ecosystems under stress.

Geoffrey West, Big Data Needs a Big Theory to Go with It

Complexity comes into play when there are many parts that can interact in many different ways so that the whole takes on a life of its own: it adapts and evolves in response to changing conditions. It can be prone to sudden and seemingly unpredictable changes—a market crash is the classic example. One or more trends can reinforce other trends in a “positive feedback loop” until things swiftly spiral out of control and cross a tipping point, beyond which behavior changes radically.

What makes a “complex system” so vexing is that its collective characteristics cannot easily be predicted from underlying components: the whole is greater than, and often significantly different from, the sum of its parts. A city is much more than its buildings and people. Our bodies are more than the totality of our cells. This quality, called emergent behavior, is characteristic of economies, financial markets, urban communities, companies, organisms, the Internet, galaxies and the health care system.

The digital revolution is driving much of the increasing complexity and pace of life we are now seeing, but this technology also presents an opportunity. The ubiquity of cell phones and electronic transactions, the increasing use of personal medical probes, and the concept of the electronically wired “smart city” are already providing us with enormous amounts of data. With new computational tools and techniques to digest vast, interrelated databases, researchers and practitioners in science, technology, business and government have begun to bring large-scale simulations and models to bear on questions formerly out of reach of quantitative analysis, such as how cooperation emerges in society, what conditions promote innovation, and how conflicts spread and grow.

The trouble is, we don’t have a unified, conceptual framework for addressing questions of complexity. We don’t know what kind of data we need, nor how much, or what critical questions we should be asking. “Big data” without a “big theory” to go with it loses much of its potency and usefulness, potentially generating new unintended consequences.

When the industrial age focused society’s attention on energy in its many manifestations—steam, chemical, mechanical, and so on—the universal laws of thermodynamics came as a response. We now need to ask if our age can produce universal laws of complexity that integrate energy with information. What are the underlying principles that transcend the extraordinary diversity and historical contingency and interconnectivity of financial markets, populations, ecosystems, war and conflict, pandemics and cancer? An overarching predictive, mathematical framework for complex systems would, in principle, incorporate the dynamics and organization of any complex system in a quantitative, computable framework.

'The emergent is everywhere and nowhere.'

Complexity isn’t a clock. You can’t open one up and see its innards. There are no gears and cogs. If there were a way to ‘look inside’, all you’d find would be more complex systems. And those complex systems aren’t connected in a purely physical way, made up of computable inputs and outputs; they are united by emergent behaviors: the system manifests its character by acting in ways that are inherently unpredictable and incalculable. These behaviors arise from the interactions between the components, but reside in none of them. The emergent is everywhere and nowhere.

We have no math for this.

West might as well be saying ‘We need to be able to see into the future. It would be helpful.’ But that doesn’t mean we have a way to do it, or that it is doable at all.

The tempo of modern life has sped up to the point that the future feels closer, and since it’s only a heartbeat away it seems reasonable to imagine being able to glance around that corner and know what is about to transpire. But that’s just a feeling.

'The more we have wired everything into everything else, the less we can know about what will happen tomorrow.'

The future is actually farther away than ever, because we have constructed a world that is the most multi-faceted astrolabe, the most incestuous interconnection of global economic interdependencies, the deepest ingraining of contingent political scenarios, and the widest pending cascade of possible ecological side-effects. The more we have wired everything into everything else, the less we can know about what will happen tomorrow.

In essence, West hopes we can create a math that can pile up all the big data and crunch it, in a Borgesian infinity. A machinery as complex as the world it hopes to fathom, allowing us — or at least it — to know everything about everything.

I suspect we will have to settle for something less.

We could start by intentionally decoupling complexity that poses threats. Derivative trading and credit default swaps are a good example. Efforts by banks and brokerages to diffuse risk by sharing it with other finance companies increase risk, systemically. When there is a big downturn the risks are amplified, and the cascade leads to huge ‘unintended’ results. The solution is not predicting when and how that will happen, but stopping the increased complexity inherent in derivatives and credit default swaps. The only cure for increased complexity is decoupling components of the larger system.

Big Data Is Not Going To Lead To Big Understanding

Not too long ago, I expressed some disbelief about the techno-utopianism that seems to surround discussions of big data.

Stowe Boyd, Re: The Future Impact Of Big Data

Unconstrained and dynamic complex systems — like our society, the economic system of Europe, or the Earth’s weather — are fundamentally unknowable: their progression from one state to another cannot be predicted consistently, even if you have a relatively good understanding of both the starting state and the present state, because the behavior of the system as a whole is an emergent property of the interconnections between the parts. And the parts are themselves made up of interconnected parts, and so on.

Yes, weather forecasting and other scientific domains have benefited from better models and more data, and more data and bigger analysis approaches will increase the level of consistency for weather, but only to a certain extent. Rounding errors grow from the imprecision of our measures and from oversimplifications in our models, so that even a domain as open as the weather — where no one is intentionally hiding data, or degrading it — cannot be predicted completely. In everyday life, this is why the weather forecast for the next few hours is several orders of magnitude better than the forecast for 10 days ahead. Big data — as currently conceived — may allow us to improve weather prediction for the next 10 days dramatically, but the inverse square law of predictability means that predictions about the weather 10 months ahead are unlikely to dramatically improve.

So, consider it this way: Big data is unlikely to increase the certainty about what is going to happen in anything but the nearest of near futures — in weather, politics, and buying behavior — because uncertainty and volatility grow along with the interconnectedness of human activities and institutions across the world. Big data is itself a factor in the increased interconnectedness of the world: as companies, governments, and individuals take advantage of insights gleaned from big data, we are making the world more tightly interconnected, and as a result (perhaps unintuitively) less predictable.
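The weather passage above leans on the classic sensitivity-to-initial-conditions argument. Here is a minimal illustration using the Lorenz system, a standard toy model for chaotic dynamics (not anything from the quoted piece): two runs whose starting points differ by one part in a million track each other for a while and then diverge completely, which is why forecast skill collapses beyond a short horizon no matter how much data you have.

```python
import numpy as np

def lorenz_step(state, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """One crude Euler step of the Lorenz system."""
    x, y, z = state
    return state + dt * np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

a = np.array([1.0, 1.0, 1.0])
b = a + np.array([1e-6, 0.0, 0.0])  # a tiny "measurement error" in the starting state

for step in range(1, 3001):
    a, b = lorenz_step(a), lorenz_step(b)
    if step % 500 == 0:
        print(f"t = {step * 0.01:5.1f}   separation = {np.linalg.norm(a - b):.6f}")
```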

I saw a post today from Nassim Taleb that echoes my points, from a more mathematical basis:

Nassim Taleb, Beware the Big Errors of ‘Big Data’

We’re more fooled by noise than ever before, and it’s because of a nasty phenomenon called “big data.” With big data, researchers have brought cherry-picking to an industrial level.

Modernity provides too many variables, but too little data per variable. So the spurious relationships grow much, much faster than real information.

In other words: Big data may mean more information, but it also means more false information.

Just like bankers who own a free option — where they make the profits and transfer losses to others – researchers have the ability to pick whatever statistics confirm their beliefs (or show good results) … and then ditch the rest.

Big-data researchers have the option to stop doing their research once they have the right result. In options language: The researcher gets the “upside” and truth gets the “downside.” It makes him antifragile, that is, capable of benefiting from complexity and uncertainty — and at the expense of others.

But beyond that, big data means anyone can find fake statistical relationships, since the spurious rises to the surface. This is because in large data sets, large deviations are vastly more attributable to variance (or noise) than to information (or signal). It’s a property of sampling: In real life there is no cherry-picking, but on the researcher’s computer, there is. Large deviations are likely to be bogus.

As data sets grow — especially data about complex, adaptive systems, like world economics, or human interactions — the proportion of noise grows as a function of the complexity. 
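Taleb’s cherry-picking point is easy to reproduce. In the sketch below, everything is synthetic noise: with enough random candidate variables, some will correlate strongly with any target purely by chance, and a researcher who reports only the best one has ‘found’ a signal that is not there.

```python
import numpy as np

rng = np.random.default_rng(0)
n_observations, n_variables = 50, 10_000

# A random "target" and thousands of random candidate "explanations": all pure noise.
target = rng.standard_normal(n_observations)
candidates = rng.standard_normal((n_variables, n_observations))

# Correlation of every noise variable with the noise target.
corrs = np.array([np.corrcoef(v, target)[0, 1] for v in candidates])
print(f"strongest |correlation| among {n_variables} noise variables: {np.abs(corrs).max():.2f}")
# Report only that one variable and you have "discovered" a relationship in pure noise.
```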

Big data is not going to be some Hubble telescope peering into the heart of the universe. The world, alas, is not becoming more knowable.

Data can’t tell you where the world is headed.

Lara Lee, cited by Stephanie Clifford in Social Media Are Giving a Voice to Taste Buds via NYTimes.com

Buried in a piece about fad flavors for corn chips and cosmetics colors is a bit of deep insight from Lara Lee, chief innovation and operating officer at the design consultancy Continuum, which helped design the Swiffer and the One Laptop per Child project.

Our knowledge is constrained by the fabric of the post-normal. The notion that there is a deterministic future ahead of us, rolling out like a yellow brick road, is an illusion. Next year emerges out of an opaque sea of trillions of semi-independent decisions made in the present by billions of individuals and groups, cascading into each other and impacting each other in literally unknowable ways. When systems become as complex as the modern world there are no tools that can see more than a very short distance into the future.

Yes, taste makers can concoct a spicy chip that sells well this season in southern California, or predict what beer will be popular in NYC for Labor Day, but we can’t predict, for example, the invention of alternatives to antibiotics in a world where bugs are growing antibiotic-resistant. There are limits to our knowledge:

Stowe Boyd, Re: The Future Impact Of Big Data In 2020 and The Limits Of Our Knowledge

Big data is unlikely to increase the certainty about what is going to happen in anything but the nearest of near futures — in weather, politics, and buying behavior — because uncertainty and volatility grow along with the interconnectedness of human activities and institutions across the world. Big data is itself a factor in the increased interconnectedness of the world: as companies, governments, and individuals take advantage of insights gleaned from big data, we are making the world more tightly interconnected, and as a result (perhaps unintuitively) less predictable.

The Future Impact Of Big Data In 2020?

I am once again a Pew Expert, featured in a big data survey:

Imagining the Internet, Elon University, The 2012 Survey: What is the potential future influence of Big Data by 2020?

A number of respondents articulated a view that could be summarized as: Humans seem to think they know more than they actually know. Still, despite all of our flaws, this new way of looking at the big picture could help.

One version of this kind of summary thought was written by Stowe Boyd […]

Overall, the growth of the ‘Internet of Things’ and ‘Big Data’ will feed the development of new capabilities in sensing, understanding, and manipulating the world. However, the underlying analytic machinery (like Bruce Sterling’s Engines of Meaning) will still require human cognition and curation to connect dots and see the big picture.

And there will be dark episodes, too, since the brightest light casts the darkest shadow. There are opportunities for terrible applications, like the growth of the surveillance society, where the authorities watch everything and analyze our actions, behavior, and movements looking for patterns of illegality, something like a real-time Minority Report.

On the other side, access to more large data can also be a blessing, so social advocacy groups may be able to amass information at low or zero cost that would be unaffordable today. For example, consider the bottom-up creation of an alternative food system, outside the control of multinational agribusiness, and connecting local and regional food producers and consumers. Such a system, what I and others call Food Tech, might come together based on open data about people’s consumption, farmers’ production plans, and regional, cooperative logistics tools. So it will be a mixed bag, like most human technological advances.

Others had smart things to say, like Jerry Michalski, Jeff Jarvis, David Weinberger, danah boyd, and Janna Anderson. Go read the whole thing.

Big Data and Data Inequality: Research Is Just The Beginning

There was a recent hoo-ha at a scientific conference in France: Bernardo Huberman was furious when researchers from Google and a contributing university, presenting results of social data analysis, declined to share the data.

John Markoff, Big Data Troves Stay Forbidden to Social Scientists via NYTimes.com

The issue came to a boil last month at a scientific conference in Lyon, France, when three scientists from Google and the University of Cambridge declined to release data they had compiled for a paper on the popularity of YouTube videos in different countries.

The chairman of the conference panel — Bernardo A. Huberman, a physicist who directs the social computing group at HP Labs here — responded angrily. In the future, he said, the conference should not accept papers from authors who did not make their data public. He was greeted by applause from the audience.

In February, Dr. Huberman had published a letter in the journal Nature warning that privately held data was threatening the very basis of scientific research. “If another set of data does not validate results obtained with private data,” he asked, “how do we know if it is because they are not universal or the authors made a mistake?”

He added that corporate control of data could give preferential access to an elite group of scientists at the largest corporations. “If this trend continues,” he wrote, “we’ll see a small group of scientists with access to private data repositories enjoy an unfair amount of attention in the community at the expense of equally talented researchers whose only flaw is the lack of right ‘connections’ to private data.”

Facebook and Microsoft declined to comment on the issue. Hal Varian, Google’s chief economist, said he sympathized with the idea of open data but added that the privacy issues were significant.

“This is one of the reasons the general pattern at Google is to try to release data to everyone or no one,” he said. “I have been working to get companies to release more data about their industries. The idea is that you can provide proprietary data aggregated in a way that poses no threats to privacy.”

The debate will only intensify as large companies with deep pockets do more research about their users. “In the Internet era,” said Andreas Weigend, a physicist and former chief scientist at Amazon, “research has moved out of the universities to the Googles, Amazons and Facebooks of the world.”

And of course, big data is worth big money — leaving aside the privacy concerns — and controlling access to that data is central to the aspirations of companies like Google, Facebook, and others.

Research is just the first place where the latent data inequality of the post normal world will come to light. We will each of us — as individuals — be divided from the inherent value of information about our activities and the inferences that can be made about them. As a society, we will find corporations that do not have our interests at heart working to exploit the potential value of our aggregated data exhaust. We are an exploitable resource — like the oceans of fish or the oil beneath the ground — and these companies plan to harvest all the value without our involvement.

We will find that we don’t own the information about ourselves any more than we own our DNA. (Yes, others can patent your genes: see The Tissue-Industrial Complex.)
