The Natural Side of Big Data

Originally published on Island Press’ “Field Notes” blog, August 21, 2012:

“Big Data” is getting big coverage; this recent column in the New York Times, for example, captures the emergence of Big Data as a cultural meme. Usually, people take a primarily technophilic view of Big Data. The Times article, for example, describes Big Data as “applying the tools of artificial intelligence, like machine learning, to vast new troves of data beyond that captured in standard databases. The new data sources include Web-browsing data trails, social network communications, sensor data and surveillance data.”

But Big Data is also being shaped by the natural world, and it is affecting how we understand and interact with the natural world. It is shaped by the natural world because biology offers such an enormous database on so many levels: from the nearly infinite bytes of data emerging from genetic studies that are already taxing our biggest digital depositories (a dilemma well captured in this article by science philosopher Mark Sagoff on data deluge), to the scores of specimens held in natural history museums and photographic archives that provide a worldwide view of what used to live where, to masses of data now emerging from “citizen science” databases, such as the National Phenology Network (a great overview article was recently published in a special issue of Frontiers in Ecology and the Environment on citizen science), to large-scale aggregates of biogeochemical interactions, such as coastal “dead zones” that essentially compile the interactions of industrial nitrogen conversion, human agricultural practices, primary productivity, and biological respiration (an interactive map of these zones has been created by WRI).

Big Data is affecting how we understand the world because it erodes what we have been told over the past 50 years or so is the bedrock of scientific understanding: falsifiable hypotheses tested in controlled experiments under a “strong inference” framework. The notion that science must be falsifiable comes from Karl Popper, and his ideas got a big boost from John Platt’s propaganda-like rant for a standardized method for conducting biological science in his well-cited 1964 “Strong Inference” paper (if you loved this paper the first time you read it, as I did, I urge you to read it again with a more critical ear – it’s a little like getting enthralled with Ayn Rand in high school, and then trying to square her ideas with reality as an adult). These philosophies have led to institutionalized rules that “correlation does not imply causation”, that “patterns cannot reveal mechanisms” and that science that proceeds without falsifying pre-determined hypotheses is just a “fishing expedition”.

Big Data makes these stalwart arguments seem a bit quaint. They are all still valuable, sometimes, but the reflexive way in which they are used by scientists and non-scientists alike (see my comments on it here and in our book, Observation and Ecology) needs to be reevaluated. Big Data approaches allow life scientists to find very robust patterns within much larger chaotic cycles, and if they don’t allow us to ascribe mechanistic causes with 100% certitude (no approach will), they at times get us about as close as possible. At the same time, a caution about Big Data: it will never fully substitute for Little People, that is, individuals who take the time to observe nature (“with the brain in gear” as Geerat Vermeij says in an excellent contribution to Observation and Ecology) and understand the Little pieces of it that make the Big whole of it.


About Rafe

Rafe Sagarin is a marine ecologist and environmental policy analyst at the University of Arizona. In both his science and policy work, Sagarin connects basic observations of nature to issues of broad societal interest, including conservation biology, protecting public trust resources, and making responses to terrorism and other security threats more adaptable. Dr. Sagarin is a recipient of a 2011 Guggenheim Fellowship and has recently published two books, Learning from the Octopus (Basic Books, March 2012) and Observation and Ecology (Island Press, July 2012), which show how nature observation, when extended across large scales and enhanced with both new technologies and greater deference to traditional knowledge sources, is revealing profound new insights about our dynamic social and ecological world. He was a Geological Society of America Congressional Science Fellow in the office of U.S. Representative (and later U.S. Secretary of Labor) Hilda Solis. He has taught ecology and environmental policy at Duke University, California State University Monterey Bay, Stanford University, University of California, Los Angeles, and the University of Arizona. His research has appeared in Science, Nature, Conservation Biology, Ecological Monographs, Trends in Ecology and Evolution, Foreign Policy, Homeland Security Affairs and other leading journals, magazines, and newspapers. He is the editor, with Terence Taylor, of the volume Natural Security: A Darwinian Approach to a Dangerous World (2008, University of California Press).