Modeling, Data, and Open Science for Long Term Human Ecology
Authors
Michael Barton
Arizona State University
Abstract
Human ecology involves dynamic interplay between humans and their environment that varies across space and through time. But the archaeological and paleoecological records of human socioecological systems comprise fragmentary and discontinuous static residues of human action and biophysical constituents of the environment. As a result, archaeologists have long been relegated to inferring imaginative narratives of human ecology from this proxy record. And often even imaginative scenarios have been unable to portray the tightly and complexly coupled nature of these socio-natural systems.
In most cases, inferred scenarios of past human ecology were not tested in any systematic, scientific way. In fact, for the most part, they could not be tested due to the very different natures of archaeological narratives and the proxy records. Narratives inferred from the archaeological and paleoecological record usually portray daily life of ancient social interactions, resource acquisition, mobility, and environmental modification--none of which are directly observable in the proxy records we study today. At the same time, the coarse temporal resolution and sparse and opportunistic spatial distribution of proxy records made them largely incompatible with the fine temporal and spatial resolution of human ecology narratives. Finally, while proxies for human activities and paleoenvironmental conditions sometimes co-occur (e.g., in some buried archaeological sites), very often these different proxies are found in distinct locations, making correlations between the human and non-human components of human ecosystems difficult.
While these limitations of the archaeological and paleoecological records persist, archaeologists have begun to make use of advances in computational technologies over the past several decades to mitigate these issues in important ways. In particular, some archaeologists have begun to generate quantitative, computational models of past human ecosystems that have a number of advantages over traditional narratives. They can represent in greater detail and with more clarity the complex couplings and feedbacks between human decisions, actions, and biophysical processes. Also, they do so explicitly, transparently, and reproducibly, without recourse to fuzzy concepts like "influence" that are common in natural language narratives. Likewise, they do not depend solely on archaeological "expert" knowledge for their results, but can incorporate diverse sources of insight about human-environmental relationships into algorithmic rule sets. Finally, and importantly, these kinds of formal models have a greater potential for testability because they represent explicitly described processes and generate concrete, quantitative results.
Some researchers have noted that model-building also can serve for theory-building, for developing robust and generalizable frameworks to account for human-environmental interactions. And certainly mechanistic, algorithmic models can help to 'operationalize' theory and knowledge about socio-ecological dynamics and explore their viability and implications in very useful ways. However, if such models are not tested (i.e., validated) against empirical data, they provide limited improvement over traditional narratives for giving us insight into long-term human ecology, in spite of quantitative or computational sophistication. Likewise, the potential for computational models to be more rigorously validated than narratives can only be realized if data for doing so are available in useful formats. Even more so than for testing narratives, validating computational models requires systematically sampled, quantitative, well-structured data sets.
While the use of computational modeling is spreading in archaeology, the availability of data sufficient to validate such models remains rare. Most archaeologists have long since started using digital data management tools, even if just spreadsheets. But many archaeologists do not seem to understand how to structure their data to make them amenable to analysis and use by others, or the importance of metadata to make data useable at all. Also, data collection protocols, necessary to evaluate the reliability of data for model validation, are often insufficiently described. And when archaeologists do create digital data sets, they rarely publish them or make them available in other ways. Mostly they publish inferential results of subjective or quantitative analyses, perhaps supported by illustrations or summary tables--in paper or PDF format. But the complete original data are not made available to others in useable ways, or at all.
Technological and conceptual advances are beginning to shift archaeology from intuitive interpretation of the archaeological record toward explicit models of the dynamic interplay of humans in their ecological contexts. For this practice to be successful, however, we also need to change how we manage the data we collect from the archaeological record. Archaeologists have acquired enormous amounts of data over the past couple of centuries. This should be a rich and valuable database to meet the needs of model validation.
I offer examples of these important interconnections between computational models and archaeological data, drawing from research on long-term human ecology in the western Mediterranean. I further offer suggestions for ways in which changes in ethics and standards for archaeological practice can improve our ability to understand the interactions between people and their environments, to involve a wider diversity of participants in generating insights about these interactions, and to better meet our scientific responsibility to share what we learn with others.