
Posts tagged with ‘big data’

The advent of Big Data has resurrected the fantasy of a social physics, promising a new data-driven technique for ratifying social facts with sheer algorithmic processing power.

Nathan Jurgenson, View From Nowhere

Shoshana Zuboff contemplating Big Data

brucesterling:

Hmmm, there are some rather familiar sentiments here.

http://www.faz.net/aktuell/feuilleton/debatten/the-digital-debate/shoshan-zuboff-on-big-data-as-surveillance-capitalism-13152525-p2.html

(…)

"IV. ”BIG DATA” IS BIG BUSINESS

"Let’s see if we can use these ideas to understand some things about „big data.” The analysis of massive data sets began as a way to reduce uncertainty by discovering the probabilities of future patterns in the behavior of people and systems. Now the focus has quietly shifted to the commercial  monetization of knowledge about current behavior as well as influencing and shaping emerging behavior for future revenue streams. The opportunity is to analyze, predict, and shape, while profiting from each point in the value chain.

"There are many sources from which these new flows are generated: sensors, sur-veillance cameras, phones, satellites, street view, corporate and government databases (from banks, credit card, credit rating, and telecom companies) are just a few.

"The most significant component is what some call “data exhaust.” This is user-generated data harvested from the haphazard ephemera of everyday life, especially the tiniest details of our online engagements— captured, datafied ( translated into machine-readable code), abstracted, aggregated, packaged, sold, and analyzed. This includes eve-rything from Facebook likes and Google searches to tweets, emails, texts, photos, songs, and videos, location and movement, purchases, every click, misspelled word, every page view, and more.

"The largest and most successful „big data“ company is Google, because it is the most visited website and therefore has the largest data exhaust. AdWords, Google’s algo-rithmic method for targeting online advertising, gets its edge from access to the most data exhaust.  Google gives away products like “search” in order to increase the amount of data exhaust it has available to harvest for its customers— its advertisers and other data buyers.  To quote a popular 2013 book on „big data“, “every action a user performs is considered a signal to be analyzed and fed back into the system.”  Facebook,Linked In, Yahoo, Twitter, and thousands of companies and apps  do something similar. On the strength of these capabilities, Google’s ad revenues were $21 billion in 2008 and climbed to over $50 billion in 2013. By February 2014, Google’s $400 billion dollar market value had edged out Exxon for the #2 spot in market capitalization.

"V. “BIG DATA” IS BIG CONTRABAND

"What can an understanding of declarations reveal about “big data?” I begin by suggesting that „big data“ is a big euphemism. As Orwell  once observed, euphemisms are used in politics, war, and business “to make lies sound truthful and murder respectable”. Euphemisms like “enhanced interrogation methods” or “ethnic cleansing” distract us from the ugly truth behind the words.

"The ugly truth here is that much of „big data“ is plucked from our lives without our knowledge or informed consent. It is the fruit of a rich array of surveillance practices designed to be invisible and undetectable as we make our way across the virtual and real worlds.  The pace of these developments is accelerating: drones, Google Glass, wearable technologies, the Internet of Everything  (which is perhaps the biggest euphemism of all).

"These surveillance practices represent profound harms—material, psychological, social, and political— that we are only beginning to understand and codify, largely because of the secret nature of these operations and how long it’s taken for us to understand them. As the recent outcry over the British National Health Service’s plan to sell patient data to insurance companies underscored, one person’s „big data“ is another person’s stolen goods.  The neutral technocratic euphemism, „big data“, can  more accurately be labeled “big contraband” or “big pirate booty.”  My interest here is less in  the details of these surveillance operations than in how they have been allowed to stand and what can be done about it.

"VI.  THE INTERNET COMPANIES DECLARE THE FUTURE

"The answer to how these practices have been allowed to stand is straightforward: Declaration.  We never said they could take these things from us. They simply declared them to be theirs for the taking—- by taking them. All sorts of institutional facts were established with the words and deeds of this declaration.

"Users were constituted as an unpaid workforce, whether slaves or volunteers is something for reasonable people to debate.  Our output was asserted as “exhaust” — waste without value—that it might be expropriated without resistance.  A wasteland is easily claimed and colonized. Who would protest the transformation of rubbish into value?  Because the new data assets were produced through surveillance, they constitute a new asset class that I call “surveillance assets.”  Surveillance assets, as we’ve seen, attract significant capital and investment that I suggest we call “surveillance capital.”  The declaration thus established a radically disembedded and extractive variant of information capitalism that can I label  “surveillance capitalism.”

"This new market form entails wholly new moral and social complexities along with new risks. For example, if the declarations that established surveillance capitalism are challenged, we might discover that „big data“ are larded with illicit surveillance assets who’s ownership is subject to legal contest and liability.  In an  alternative social and legal regime, surveillance assets could  become toxic assets strewn through the world’s data flows in much the same way that bad mortgage debt was baked into financial instruments that abruptly lost value when their status function was challenged by new facts.

"What’s key to understand here is that this logic of “accumulation by surveillance” is a wholly new breed.  In the past, populations were the source of employees and consumers. Under surveillance capitalism, populations are not to be employed and served.  Instead, they are to be harvested for behavioral data…."

Small Talk Is Big Again

Posted a deck at the wonderful Haiku Deck. A talk I gave at the recent Arup get-together on Big Data in NYC, although I stuck pretty close to social data and the intersection of social trends in work: the near-term emergence of Watson-grade AI and data analysis in combination with the fall of postmodern shallow organizational culture and the rise of the postnormal deep business culture. Check it out. 

It raises more questions than it answers.

betaknowledge:

“Drawing workshop - B.A.C. of 14th district of Paris” by Julien Prévieux, who asked cops in Paris to draw Voronoi diagrams based on maps showing recent crimes… as a way to make them understand each step of crime prediction algorithms:

A month and a half ago, I set up a drawing workshop with four police officers from the 14th arrondissement in Paris: Benjamin Ferran, Gerald Fidalgo, Blaise Thomas and Mickael Malvaud. The goal of the workshop was to learn to draw “Vorfonoi diagrams” [sic: Voronoi] manually from maps identifying recent crimes. These diagrams, used in the U.S. but not in France yet, are among the mapping analysis tools used to visualize crimes in real time to deploy patrols accordingly. Usually they are made by computer, but I offered the French policemen a chance to draw them by hand, taking the time to execute one by one the different steps of the algorithm. The exercise is slow and laborious and requires a precision that is difficult to obtain. With this technique of traditional drawing, the optimization tool is stripped of its primary function by invariably producing the results too late. But what you lose in efficiency is certainly gained in other areas: intensive drawing practice at weekends and holidays, in-depth exploration of the technical division of surfaces into polygons, discussions about Police processing systems(?) and introduction of new management methods, and production of a series of very successful abstract drawings.

The framed example above is “theft from cars” (November 2004, Paris 75017). Drawing executed by Gérald Fidalgo; unique piece, series of 7 + 7 drafts.

The tessellation of a Voronoi map is actually straightforward to conceive but difficult to execute:

  1. Put the set of geographic spots (the thefts from cars, in this case) on the geographic map.
  2. For each crime spot, create a polygon (or blob) where all the points within the blob are closer to that crime spot than to any other crime spot.

This technique is faster if you successively approximate: manually draw guesstimated regions around each spot, and after a pass, return to each, making the blobs non-overlapping. Software is good at this; people, so-so.
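For the curious, here is a minimal sketch of the computed version of those two steps: a brute-force nearest-spot assignment over a rasterized map. The crime coordinates, map size, and grid resolution below are invented for illustration.

```python
# Minimal brute-force Voronoi assignment: every map point is labeled with the
# nearest crime spot. Crime coordinates and map size are hypothetical.
import math

crime_spots = [(2.0, 3.0), (7.0, 8.0), (5.0, 1.0), (9.0, 4.0)]  # e.g. thefts from cars

def nearest_spot(x, y):
    """Return the index of the crime spot closest to the point (x, y)."""
    return min(range(len(crime_spots)),
               key=lambda i: math.dist((x, y), crime_spots[i]))

# Rasterize a 10x10 map: each cell gets the index of its Voronoi region.
grid = [[nearest_spot(x + 0.5, y + 0.5) for x in range(10)] for y in range(10)]

for row in grid:
    print("".join(str(label) for label in row))
```

Libraries such as scipy.spatial.Voronoi compute the exact polygon boundaries rather than this rasterized approximation.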

In the case of crime deterrence, the outcome is to rethink police cruising patterns: concentrate on the Voronoi regions with the greatest density of crimes, while still covering all areas in a way that balances crime density against shortest-path routing.

The big data slant: imagine superimposing other non-obviously correlated data sets with a map like this. Social networks of known criminals. Street light maps and luminosity indices. Walking patterns, and locations of public transport. Analyses of how long cars are parked in each region. 

People can’t crunch all these variables to determine which are correlated and to what degree, but big data analysis can. And soon, it will cost nothing to peer into the fog of this snarl of data and winnow out the facts. It will be like turning on a flashlight on a dark street.
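As a rough sketch of what that cross-correlation pass might look like (the data layers, column names, and numbers below are all invented for illustration), the analysis amounts to correlating every layer against the crime counts per Voronoi cell and ranking the relationships:

```python
# Hypothetical per-cell data layers; in practice these would come from city
# databases, sensors, transit feeds, and so on.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n_cells = 200  # one row per Voronoi cell

layers = pd.DataFrame({
    "thefts_from_cars": rng.poisson(4, n_cells),
    "streetlight_lux": rng.normal(30, 10, n_cells),
    "avg_hours_parked": rng.normal(5, 2, n_cells),
    "transit_stops": rng.poisson(2, n_cells),
})

# Correlate every layer against the crime counts and rank the relationships.
print(layers.corr()["thefts_from_cars"].sort_values(ascending=False))
```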

This is not the same as prediction, but it is so close that there is no difference. You’ll know that a cloudy Sunday night with a football game at the high school means that thefts from cars will spike in a district a mile away. You could stop that by redirecting two more police cruisers to that neighborhood.

And the deep data machines could be cranking that for all sorts of crimes — not just thefts from cars — and optimize to distribute cruisers and foot patrols based on the goals of minimizing violent crimes first, felonies second, and so on. So cops won’t be hassling kids for smoking on school grounds when they should be patrolling the next district to counter armed robberies and burglaries.
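A toy sketch of that kind of prioritized allocation, with hypothetical districts, predicted incident counts, and severity weights (none of these numbers come from any real department):

```python
# Toy severity-weighted patrol allocation. Districts, predicted incident counts,
# and severity weights are all hypothetical.
predicted = {
    "district_a": {"violent": 3, "felony": 5, "misdemeanor": 12},
    "district_b": {"violent": 1, "felony": 9, "misdemeanor": 4},
    "district_c": {"violent": 0, "felony": 2, "misdemeanor": 20},
}
severity = {"violent": 10.0, "felony": 3.0, "misdemeanor": 1.0}

def risk_score(counts):
    """Weight predicted incidents by severity so violent crime dominates the score."""
    return sum(severity[kind] * n for kind, n in counts.items())

scores = {d: risk_score(c) for d, c in predicted.items()}
total = sum(scores.values())

patrol_units = 10
allocation = {d: round(patrol_units * s / total) for d, s in scores.items()}
print(allocation)  # -> {'district_a': 5, 'district_b': 3, 'district_c': 2}
```

A real deployment would also fold in travel times and coverage constraints; the point is only that the weighting encodes the priorities.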

(Source: betaknowledge)

Data is something we create, but it’s also something we imagine.

Socialogy: Interview with Brian Solis

Geoffrey West’s Fever Dream

Geoffrey West takes a look at the unknowability of the innards of complex systems, and wonders if big data — and a new big theory to corral big data — could act like a flashlight, revealing the inner workings of financial markets, urban communities, and ecosystems under stress.

Geoffrey West, Big Data Needs a Big Theory to Go with It

Complexity comes into play when there are many parts that can interact in many different ways so that the whole takes on a life of its own: it adapts and evolves in response to changing conditions. It can be prone to sudden and seemingly unpredictable changes—a market crash is the classic example. One or more trends can reinforce other trends in a “positive feedback loop” until things swiftly spiral out of control and cross a tipping point, beyond which behavior changes radically.

What makes a “complex system” so vexing is that its collective characteristics cannot easily be predicted from underlying components: the whole is greater than, and often significantly different from, the sum of its parts. A city is much more than its buildings and people. Our bodies are more than the totality of our cells. This quality, called emergent behavior, is characteristic of economies, financial markets, urban communities, companies, organisms, the Internet, galaxies and the health care system.

The digital revolution is driving much of the increasing complexity and pace of life we are now seeing, but this technology also presents an opportunity. The ubiquity of cell phones and electronic transactions, the increasing use of personal medical probes, and the concept of the electronically wired “smart city” are already providing us with enormous amounts of data. With new computational tools and techniques to digest vast, interrelated databases, researchers and practitioners in science, technology, business and government have begun to bring large-scale simulations and models to bear on questions formerly out of reach of quantitative analysis, such as how cooperation emerges in society, what conditions promote innovation, and how conflicts spread and grow.

The trouble is, we don’t have a unified, conceptual framework for addressing questions of complexity. We don’t know what kind of data we need, nor how much, or what critical questions we should be asking. “Big data” without a “big theory” to go with it loses much of its potency and usefulness, potentially generating new unintended consequences.

When the industrial age focused society’s attention on energy in its many manifestations—steam, chemical, mechanical, and so on—the universal laws of thermodynamics came as a response. We now need to ask if our age can produce universal laws of complexity that integrate energy with information. What are the underlying principles that transcend the extraordinary diversity and historical contingency and interconnectivity of financial markets, populations, ecosystems, war and conflict, pandemics and cancer? An overarching predictive, mathematical framework for complex systems would, in principle, incorporate the dynamics and organization of any complex system in a quantitative, computable framework.

'The emergent is everywhere and nowhere.'

Complexity isn’t a clock. You can’t open one up and see its innards. There are no gears and cogs. If there were a way to ‘look inside’, all you’d find would be more complex systems. And those complex systems aren’t connected in a purely physical way, made up of computable inputs and outputs: they are united by emergent behaviors. The system manifests its character by acting in ways that are inherently unpredictable and incalculable. These behaviors arise from the interactions between the components, but reside in none of them. The emergent is everywhere and nowhere.

We have no math for this.

West might as well be saying ‘We need to be able to see into the future. It would be helpful.’ But that doesn’t mean we have a way to do it, or that it is doable at all.

The tempo of modern life has sped up to the point that the future feels closer, and since it’s only a heartbeat away it seems reasonable to imagine being able to glance around that corner and know what is about to transpire. But that’s just a feeling.

'The more we have wired everything into everything else, the less we can know about what will happen tomorrow.'

The future is actually farther away than ever, because we have constructed a world that is the most multi-faceted astrolabe, the most incestuous interconnection of global economic interdependencies, the deepest ingraining of contingent political scenarios, and the widest pending cascade of possible ecological side-effects. The more we have wired everything into everything else, the less we can know about what will happen tomorrow.

In essence, West hopes we can create a math that can pile up all the big data and crunch it, in a Borgesian infinity. A machinery as complex as the world it hopes to fathom, allowing us — or at least it — to know everything about everything.

I suspect we will have to settle for something less.

We could start by intentionally decoupling complexity that poses threats. Derivatives trading and credit default swaps are a good example. Efforts by banks and brokerages to diffuse risk by sharing it with other finance companies lead to increased risk, systemically. When there is a big downturn the risks are amplified, and the cascade leads to huge ‘unintended’ results. The solution to this is not predicting when and how it will happen, but stopping the increased complexity inherent in derivatives and credit default swaps. The only cure for increased complexity is decoupling components of the larger system.

Big Data Is Not Going To Lead To Big Understanding

Not too long ago, I expressed some disbelief about the techno-utopianism that seems to surround discussions of big data.

Stowe Boyd, Re: The Future Impact Of Big Data

Unconstrained and dynamic complex systems — like our society, the economic system of Europe, or the Earth’s weather — are fundamentally unknowable: their progression from one state to another cannot be predicted consistently, even if you have a relatively good understanding of both the starting state and the present state, because the behavior of the system as a whole is an emergent property of the interconnections between the parts. And the parts are themselves made up of interconnected parts, and so on.

Yes, weather forecasting and other scientific domains have benefited from better models and more data, and more data and bigger analysis approaches will increase the level of consistency for weather, but only to a certain extent. There are rounding errors that grow from the imprecision of measures and oversimplifications in our models, so that even something as potentially transparent as the weather — where no one is intentionally hiding data, or degrading it — cannot be predicted completely. In everyday life, this is why the weather forecast for the next few hours is several orders of magnitude better than the forecast for 10 days ahead. Big data — as currently conceived — may allow us to improve weather prediction for the next 10 days dramatically, but the rapid compounding of those errors with lead time means that predictions about the weather 10 months ahead are unlikely to dramatically improve.
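A minimal illustration of why those rounding errors dominate at longer lead times, using the logistic map (a standard toy chaotic system, not a weather model): two trajectories that start one part in a million apart diverge completely within a few dozen steps.

```python
# Sensitivity to initial conditions in the logistic map: a one-in-a-million
# difference in the starting value grows to order one within a few dozen steps.
def logistic(x, r=4.0):
    return r * x * (1.0 - x)

a, b = 0.200000, 0.200001  # "true" state vs. state with a tiny rounding error
for step in range(1, 61):
    a, b = logistic(a), logistic(b)
    if step % 10 == 0:
        print(f"step {step:2d}: |difference| = {abs(a - b):.6f}")
```

The near-term forecast (the first few steps) tracks closely; the long-range one is worthless, no matter how much data went into the starting estimate.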

So, consider it this way: Big data is unlikely to increase the certainty about what is going to happen in anything but the nearest of near futures — in weather, politics, and buying behavior — because uncertainty and volatility grow along with the interconnectedness of human activities and institutions across the world. Big data is itself a factor in the increased interconnectedness of the world: as companies, governments, and individuals take advantage of insights gleaned from big data, we are making the world more tightly interconnected, and as a result (perhaps unintuitively) less predictable.

I saw a post today from Nassim Taleb that echoes my points from a more mathematical basis:

Nassim Taleb, Beware the Big Errors of ‘Big Data’

We’re more fooled by noise than ever before, and it’s because of a nasty phenomenon called “big data.” With big data, researchers have brought cherry-picking to an industrial level.

Modernity provides too many variables, but too little data per variable. So the spurious relationships grow much, much faster than real information.

In other words: Big data may mean more information, but it also means more false information.

Just like bankers who own a free option — where they make the profits and transfer losses to others — researchers have the ability to pick whatever statistics confirm their beliefs (or show good results) … and then ditch the rest.

Big-data researchers have the option to stop doing their research once they have the right result. In options language: The researcher gets the “upside” and truth gets the “downside.” It makes him antifragile, that is, capable of benefiting from complexity and uncertainty — and at the expense of others.

But beyond that, big data means anyone can find fake statistical relationships, since the spurious rises to the surface. This is because in large data sets, large deviations are vastly more attributable to variance (or noise) than to information (or signal). It’s a property of sampling: In real life there is no cherry-picking, but on the researcher’s computer, there is. Large deviations are likely to be bogus.

As data sets grow — especially data about complex, adaptive systems, like world economics, or human interactions — the proportion of noise grows as a function of the complexity. 
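A small sketch of the effect Taleb describes, assuming nothing but random noise: with many candidate variables and few samples, some variable will look strongly correlated with the target purely by chance.

```python
# Pure-noise data: every candidate variable and the target are independent,
# yet the best-looking correlation can appear impressively strong.
import numpy as np

rng = np.random.default_rng(42)
n_samples, n_variables = 50, 2000

target = rng.normal(size=n_samples)
candidates = rng.normal(size=(n_variables, n_samples))

# Correlation of each noise variable with the noise target.
correlations = np.array([np.corrcoef(target, row)[0, 1] for row in candidates])

best = np.abs(correlations).max()
print(f"strongest |correlation| among {n_variables} noise variables: {best:.2f}")
# Typically around 0.5 here: signal-free data that looks like a finding.
```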

Big data is not going to be some Hubble telescope peering into the heart of the universe. The world, alas, is not becoming more knowable.

Data As Commons →

I wrote a guest post at #WETHEDATA, ‘Data As Commons’, where I suggest we need to think about collective solutions to personal data, not individualist approaches. Take a look.

Data can’t tell you where the world is headed.

Lara Lee, cited by Stephanie Clifford in Social Media Are Giving a Voice to Taste Buds via NY Times.com

Buried in a piece about fad flavors for corn chips and cosmetics colors is a bit of deep insight from Lara Lee, chief innovation and operating officer at the design consultancy Continuum, which helped design the Swiffer and the One Laptop per Child project.

Our knowledge is constrained by the fabric of the post-normal. The notion that there is a deterministic future ahead of us, rolling out like a yellow brick road, is an illusion. Next year emerges out of an opaque sea of trillions of semi-independent decisions made in the present by billions of individuals and groups, cascading into each other and impacting each other in literally unknowable ways. When systems become as complex as the modern world there are no tools that can see more than a very short distance into the future.

Yes, taste makers can concoct a spicy chip that sells well this season in southern California, or guess which beer will be popular in NYC for Labor Day, but we can’t predict, for example, the invention of alternatives to antibiotics in a world where bugs are growing antibiotic-resistant. There are limits to our knowledge:

Stowe Boyd, Re: The Future Impact Of Big Data In 2020 and The Limits Of Our Knowledge

Big data is unlikely to increase the certainty about what is going to happen in anything but the nearest of near futures — in weather, politics, and buying behavior — because uncertainty and volatility grow along with the interconnectedness of human activities and institutions across the world. Big data is itself a factor in the increased interconnectedness of the world: as companies, governments, and individuals take advantage of insights gleaned from big data, we are making the world more tightly interconnected, and as a result (perhaps unintuitively) less predictable.

