Post(s) tagged with "big data"

Big Data Is Not Going To Lead To Big Understanding

Not too long ago, I expressed some disbelief about the techno-utopianism that seems to surround discussions of big data.

Stowe Boyd, Re: The Future Impact Of Big Data

Unconstrained and dynamic complex systems — like our society, the economic system of Europe, or the Earth’s weather — are fundamentally unknowable: their progression from one state to another cannot be predicted consistently, even if you have a relatively good understanding of both the starting state and the present state, because the behavior of the system as a whole is an emergent property of the interconnections between the parts. And the parts are themselves made up of interconnected parts, and so on.

Yes, weather forecasting and other scientific domains have benefited from better models and more data, and more data and bigger analysis approaches will increase the level of consistency for weather, but only to a certain extent. There are rounding errors that grow from the imprecision of measures and oversimplifications in our models, so that even something as potentially opaque as the weather (where no one is intentionally hiding data, or degrading it) cannot be predicted completely. In everyday life, this is why the weather forecast for the next few hours is several orders of magnitude better than the forecast for 10 days ahead. Big data, as currently conceived, may allow us to improve weather prediction for the next 10 days dramatically, but the inverse square law of predictability means that predictions about the weather 10 months ahead are unlikely to dramatically improve.

So, consider it this way: Big data is unlikely to increase the certainty about what is going to happen in anything but the nearest of near futures — in weather, politics, and buying behavior — because uncertainty and volatility grow along with the interconnectedness of human activities and institutions across the world. Big data is itself a factor in the increased interconnectedness of the world: as companies, governments, and individuals take advantage of insights gleaned from big data, we are making the world more tightly interconnected, and as a result (perhaps unintuitively) less predictable.
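The point about small measurement errors growing over the forecast horizon can be illustrated with a toy chaotic system. The sketch below uses the logistic map as a stand-in (it is not a weather model), and the starting state, the size of the measurement error, and the function names are all assumptions chosen for illustration.

```python
def logistic_trajectory(x0, steps, r=4.0):
    """Iterate the logistic map x -> r * x * (1 - x) and return every state."""
    states = [x0]
    for _ in range(steps):
        states.append(r * states[-1] * (1.0 - states[-1]))
    return states


def forecast_gap(x0=0.2, measurement_error=1e-10, steps=80):
    """Absolute gap, step by step, between the 'true' system and a forecast
    that starts from a slightly mismeasured initial state (assumed values)."""
    truth = logistic_trajectory(x0, steps)
    forecast = logistic_trajectory(x0 + measurement_error, steps)
    return [abs(t - f) for t, f in zip(truth, forecast)]


gaps = forecast_gap()
for step in (1, 5, 10, 20, 40, 80):
    print(f"step {step:2d}: forecast error {gaps[step]:.3e}")
```

In this toy setting the forecast stays close to the truth for the first few steps, and the gap then grows until it is comparable to the quantity being forecast. Shrinking the initial error buys only a few extra steps of useful prediction, which is the sense in which more data helps the near term far more than the long term.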

I saw a post today from Nassim Taleb that echoes my points on a more mathematical basis:

Nassim Taleb, Beware the Big Errors of ‘Big Data’

We’re more fooled by noise than ever before, and it’s because of a nasty phenomenon called “big data.” With big data, researchers have brought cherry-picking to an industrial level.

Modernity provides too many variables, but too little data per variable. So the spurious relationships grow much, much faster than real information.

In other words: Big data may mean more information, but it also means more false information.

Just like bankers who own a free option — where they make the profits and transfer losses to others – researchers have the ability to pick whatever statistics confirm their beliefs (or show good results) … and then ditch the rest.

Big-data researchers have the option to stop doing their research once they have the right result. In options language: The researcher gets the “upside” and truth gets the “downside.” It makes him antifragile, that is, capable of benefiting from complexity and uncertainty — and at the expense of others.

But beyond that, big data means anyone can find fake statistical relationships, since the spurious rises to the surface. This is because in large data sets, large deviations are vastly more attributable to variance (or noise) than to information (or signal). It’s a property of sampling: In real life there is no cherry-picking, but on the researcher’s computer, there is. Large deviations are likely to be bogus.
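Taleb's claim that spurious relationships multiply faster than the data can be sketched with a small simulation. Everything below is an illustrative assumption rather than Taleb's own method: the variables are pure independent noise, there are 30 observations per variable, and the 0.36 cutoff is roughly the correlation that would pass a conventional 5% significance test at that sample size.

```python
import itertools
import math
import random


def standardize(values):
    """Center a series and scale it to unit length, so the dot product of two
    standardized series equals their Pearson correlation."""
    mean = sum(values) / len(values)
    centered = [v - mean for v in values]
    norm = math.sqrt(sum(c * c for c in centered))
    return [c / norm for c in centered]


def count_spurious_pairs(n_variables, n_observations=30, threshold=0.36, seed=0):
    """Count pairs of independent noise series whose |r| exceeds the threshold.
    Every hit is spurious by construction: no series is related to any other."""
    rng = random.Random(seed)
    columns = [
        standardize([rng.gauss(0.0, 1.0) for _ in range(n_observations)])
        for _ in range(n_variables)
    ]
    hits = 0
    for a, b in itertools.combinations(columns, 2):
        r = sum(x * y for x, y in zip(a, b))
        if abs(r) > threshold:
            hits += 1
    return hits


for k in (20, 200):
    pairs = k * (k - 1) // 2
    print(f"{k} variables, {pairs} pairs, {count_spurious_pairs(k)} 'strong' correlations")
```

The number of pairs to compare grows with the square of the number of variables, so adding variables without adding observations per variable hands the researcher a rapidly growing pile of apparently strong relationships to pick from, none of which is real in this setup.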

As data sets grow — especially data about complex, adaptive systems, like world economics, or human interactions — the proportion of noise grows as a function of the complexity. 

Big data is not going to be some Hubble telescope peering into the heart of the universe. The world, alas, is not becoming more knowable.

Wired

Data As Commons ⇢

I wrote a guest post at #WETHEDATA, ‘Data As Commons’, where I suggest we need to think about collective solutions to personal data, not individualist approaches. Take a look.

Data can’t tell you where the world is headed.

Lara Lee, cited by Stephanie Clifford in Social Media Are Giving a Voice to Taste Buds via NYTimes.com

Buried in a piece about fad flavors for corn chips and cosmetics colors is a bit of deep insight from Lara Lee, chief innovation and operating officer at the design consultancy Continuum, which helped design the Swiffer and the One Laptop per Child project.

Our knowledge is constrained by the fabric of the post-normal. The notion that there is a deterministic future ahead of us, rolling out like a yellow brick road, is an illusion. Next year emerges out of an opaque sea of trillions of semi-independent decisions made in the present by billions of individuals and groups, cascading into each other and impacting each other in literally unknowable ways. When systems become as complex as the modern world there are no tools that can see more than a very short distance into the future.

Yes, taste makers can concoct a spicy chip that sells well this season in southern California, or predict what beer will be popular in NYC for Labor Day, but we can’t predict, for example, the invention of alternatives to antibiotics in a world where bugs are growing antibiotic-resistant. There are limits to our knowledge:

Stowe Boyd, Re: The Future Impact Of Big Data In 2020 and The Limits Of Our Knowledge

Big data is unlikely to increase the certainty about what is going to happen in anything but the nearest of near futures — in weather, politics, and buying behavior — because uncertainty and volatility grow along with the interconnectedness of human activities and institutions across the world. Big data is itself a factor in the increased interconnectedness of the world: as companies, governments, and individuals take advantage of insights gleaned from big data, we are making the world more tightly interconnected, and as a result (perhaps unintuitively) less predictable.

The New York Times

The Future Impact Of Big Data In 2020?

I am once again a Pew Expert, featured in a big data survey:

Imagining the Internet, Elon University, The 2012 Survey: What is the potential future influence of Big Data by 2020?

A number of respondents articulated a view that could be summarized as: Humans seem to think they know more than they actually know. Still, despite all of our flaws, this new way of looking at the big picture could help.

One version of this kind of summary thought was written by Stowe Boyd […]

Overall, the growth of the ‘Internet of Things’ and ‘Big Data’ will feed the development of new capabilities in sensing, understanding, and manipulating the world. However, the underlying analytic machinery (like Bruce Sterling’s Engines of Meaning) will still require human cognition and curation to connect dots and see the big picture.

And there will be dark episodes, too, since the brightest light casts the darkest shadow. There are opportunities for terrible applications, like the growth of the surveillance society, where the authorities watch everything and analyze our actions, behavior, and movements looking for patterns of illegality, something like a real-time Minority Report.

On the other side, access to more large data can also be a blessing, so social advocacy groups may be able to amass information at a low- or zero-cost that would be unaffordable today. For example, consider the bottom-up creation of an alternative food system, outside the control of multinational agribusiness, and connecting local and regional food producers and consumers. Such a system, what I and others call Food Tech, might come together based on open data about people’s consumption, farmers’ production plans, and regional, cooperative logistics tools. So it will be a mixed bag, like most human technological advances.

Others had smart things to say, like Jerry Michalski, Jeff Jarvis, David Weinberger, danah boyd, and Janna Anderson. Go read the whole thing.

Big Data and Data Inequality: Research Is Just The Beginning

There was a recent hoo-ha at a scientific conference in France: Bernardo Huberman was furious when researchers from Google and a contributing university, presenting results of social data analysis, declined to share the data.

John Markoff, Big Data Troves Stay Forbidden to Social Scientists via NYTimes.com

The issue came to a boil last month at a scientific conference in Lyon, France, when three scientists from Google and the University of Cambridge declined to release data they had compiled for a paper on the popularity of YouTube videos in different countries.

The chairman of the conference panel — Bernardo A. Huberman, a physicist who directs the social computing group at HP Labs here — responded angrily. In the future, he said, the conference should not accept papers from authors who did not make their data public. He was greeted by applause from the audience.

In February, Dr. Huberman had published a letter in the journal Nature warning that privately held data was threatening the very basis of scientific research. “If another set of data does not validate results obtained with private data,” he asked, “how do we know if it is because they are not universal or the authors made a mistake?”

He added that corporate control of data could give preferential access to an elite group of scientists at the largest corporations. “If this trend continues,” he wrote, “we’ll see a small group of scientists with access to private data repositories enjoy an unfair amount of attention in the community at the expense of equally talented researchers whose only flaw is the lack of right ‘connections’ to private data.”

Facebook and Microsoft declined to comment on the issue. Hal Varian, Google’s chief economist, said he sympathized with the idea of open data but added that the privacy issues were significant.

“This is one of the reasons the general pattern at Google is to try to release data to everyone or no one,” he said. “I have been working to get companies to release more data about their industries. The idea is that you can provide proprietary data aggregated in a way that poses no threats to privacy.”

The debate will only intensify as large companies with deep pockets do more research about their users. “In the Internet era,” said Andreas Weigend, a physicist and former chief scientist at Amazon, “research has moved out of the universities to the Googles, Amazons and Facebooks of the world.”

And of course, big data is worth big money — leaving aside the privacy concerns — and controlling access to that data is central to the aspirations of companies like Google, Facebook, and others.

Research is just the first place where the latent data inequality of the post-normal world will come to light. Each of us, as individuals, will be divided from the inherent value of information about our activities and the inferences that can be made about them. As a society, we will find corporations that do not have our interests at heart working to exploit the potential value of our aggregated data exhaust. We are an exploitable resource, like the oceans of fish or the oil beneath the ground, and these companies plan to harvest all the value without our involvement.

We will find that we don’t own the information about ourselves any more than we own our DNA. (Yes, others can patent your genes: see The Tissue-Industrial Complex.)

The New York Times

Is Thompson Moving The Deck Chairs Around, Or Pointing Yahoo In A New Direction?

Scott Thompson has reorganized the company around three ‘groups’: consumer, regions, and technology. But his long-term plan is totally unclear, despite his having taken three months to get set. My sense is that he’s moving the deck chairs around on the Titanic, rather than addressing the gaping hole in the side of the boat.

However, trying to centralize the business on capturing user information exhaust does at least line up with what others — Facebook and Google, for example — are planning, so at least he’s looking in the right direction.

Yahoo C.E.O. Hints at a Strategy - Nicole Perlroth via NYTimes.com

With 700 million visitors, Yahoo still maintains one of the largest audiences on the Web, but has been unable to increase revenue. The company continues to cede advertising market share to competitors, notably Facebook and Google, and has frustrated shareholders with its reliance on cost-cutting rather than new areas for innovation and growth.

Based on the restructuring, it appears Mr. Thompson plans to hedge much of Yahoo’s future on the media and content properties it hopes will tether visitors to its site and lure back advertisers, as well as on the data it has on its users.

Mr. Thompson has yet to elaborate on how Yahoo plans to use that data. Sources inside the company, who declined to be named because they were not authorized to speak, said that it was still unclear how, or even whether, the company could leverage the information to its advantage.

There is certainly room in the marketplace for a large media player to innovate in media based on mining big data from social exhaust. We’ll have to see if Thompson is trying for that, since he’s been fairly silent on strategy, but it looks like a viable option for Yahoo, at least.

The New York Times

Doing a presentation next week in San Francisco, Data Is The New Oil: The Journey From Privacy To Publicy. I will be sharing the podium with Gerd Leonhard, Andreas Weigend, and Jamais Cascio.

I am likely to use some of the slides in the deck above, Big And Small Data.

I’ve heard we are going to have a packed house, so if you want to attend you should sign up right away.

This is the year things get weird.

- Bryce Roberts, Web 2.0 Ends With Data Monopolies

Bryce is referring to the profound changes in the technological infrastructure our culture sits on, as we start to create nearly unimaginable amounts of data — both personalized and anonymized — through the streams of our existence.

The proximate cause of this insight is the new demo video of Google glasses (or goggles), and Bryce’s realization that Google might be in a position to track literally everything we see (at least with the glasses on). Every product you reject and put back on the shelf, every plate of food, every friend (and stranger) you pass in the street, every store you enter, every breath you take, every move you make, they’ll be watching you.

But I still want them. Which is weird.

The guy with the most data wins.

Tim O’Reilly, interviewed by Jon Bruner in Tim O’Reilly on the Future of Location: “The Guy with the Most Data Wins” via Forbes

Source: forbes.com

Data is the New Oil: The Journey from Privacy to Publicy — swissnex ⇢

I will be speaking with Gerd Leonhard, Andreas Weigend, and (hopefully) Jamais Cascio at an event in San Francisco, 10 April 2012, sponsored by swissnex.

The theme is Data is the New Oil: The Journey from Privacy to Publicy. As every web page we visit is logged, and every comment and tweet analyzed for sentiment and intention, more data is being logged weekly than existed on earth a few years ago, prior to the rise of the social web. We will explore the connections between our connected world and the complexities and challenges of a data economy.

If you are interested in attending, please register quickly, since there are only 150 or so seats.

About

Web anthropologist, futurist, author. My focus is the future, and the tectonic forces pushing business, media, and society into an unclear and accelerating future.

GigaOM Research analyst and curator.

Also writing beaconstreets.com.
