Open licenses for archaeological data matter: the case of AustArch

Stefano Costa - July 29, 2014 in Discussions

A few days ago Internet Archaeology published a superb data paper featuring AustArch, a database of more than 5000 radiocarbon and other archaeometric dates from Australia. To my surprise, while the data paper is published as open access under a CC-BY license, the dataset itself is available from the Archaeology Data Service under their custom license, which is not open.

I want to take this example as an opportunity to revisit the ADS terms of use. I hope to show that, both in this specific case and in general, standard open licenses offer significant advantages. The ADS is a cornerstone of data sharing in worldwide archaeology, both directly as a repository and indirectly as a leading organisation for other, newer data archives (such as tDAR, DANS and MOD). Their role for the entire community is so important that I feel we’re really missing a lot by having antiquated (no pun intended) terms of use in place.

I will start by quoting some relevant excerpts from the data paper: Williams, A.N., Ulm, S., Smith, M. and Reid J. (2014). AustArch: A Database of 14C and Non-14C Ages from Archaeological Sites in Australia – Composition, Compilation and Review (Data Paper). Internet Archaeology, (36). http://dx.doi.org/10.11141/ia.36.6

The dataset includes all radiocarbon and non-radiocarbon ages associated with archaeological deposits published in the last 60 years of research (Figure 3). The dataset also includes extensive, but not comprehensive, unpublished/grey literature data, mainly from New South Wales and Queensland. Some unpublished/grey literature from Victoria and Western Australia is also included through personal communication and/or other databases

The data sources for AustArch:

Overall, information has been obtained from 1,067 publications in the development of the dataset, with several hundred more being examined but failing to contain pertinent data. Of these publications, 583 (55%) were journal articles; 51 (5%) were books; 159 (15%) were book chapters; 100 (9%) were unpublished undergraduate or postgraduate theses; 164 (15%) were unpublished consulting/commercial reports; and 10 (1%) came from other sources.

Please note that 24% comes from unpublished literature.
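As a quick check of that figure, here is a minimal Python sketch that recomputes the share from the counts quoted above. Treating the two "unpublished" categories (theses and consulting/commercial reports) as the unpublished literature is my own reading of the quote, and the variable and function names are only illustrative. The 24% above is the sum of the rounded 9% and 15%; computed from the raw counts the share comes out slightly higher.

```python
# Publication counts by source type, as quoted from the data paper above.
sources = {
    "journal articles": 583,
    "books": 51,
    "book chapters": 159,
    "unpublished theses": 100,
    "unpublished consulting/commercial reports": 164,
    "other sources": 10,
}

# Assumption: these two categories are what "unpublished literature" means here.
UNPUBLISHED = ("unpublished theses", "unpublished consulting/commercial reports")


def unpublished_share(counts, unpublished=UNPUBLISHED):
    """Return the fraction of publications in the unpublished categories."""
    total = sum(counts.values())
    return sum(counts[name] for name in unpublished) / total


if __name__ == "__main__":
    print(f"total publications: {sum(sources.values())}")          # 1067
    print(f"unpublished share: {unpublished_share(sources):.1%}")  # 24.7%
```

Either way, roughly a quarter of the sources behind the dataset were never formally published.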

What use has been made of the dataset so far?

Since the development and release of various parts of the dataset, it has proved a well-used resource for a range of research and consulting/commercial works, however its main application has been in the development of time-series or summed probability analyses.

The dataset could be improved by incorporating more data from the commercial sector:

In the short-term, the dataset can be significantly improved by the incorporation of all unpublished data, particularly produced in the commercial/consulting sector. The data are not readily available, often contained in State or local repositories and/or by individual companies. Commercial/consulting work has been extensive in the last decade, most notably in Victoria and Western Australia, and the incorporation of data from these States would provide a significant increase in ages for both arid and temperate regions.

Now take all the above and put it together with the non-commercial license used at the Archaeology Data Service: what is wrong? I see two main issues here: standardisation and ethics.

Creative Commons open licenses (that is, CC-BY, CC-BY-SA and CC0) are well known and internationally recognised as marking content that is readily usable for any purpose. They’re immediately recognisable by users, who know the Creative Commons brand. They’re machine readable, allowing automated tools to retrieve metadata about permissions and restrictions (e.g. share-alike). My hero Colleen Morgan says that archaeologists should use Creative Commons (open) licenses for everything. Using anything else, even for content or data that is available for downloading, is not unlike putting it on display without allowing bystanders any actual interaction. While it may seem “more permissive”, I can’t see how requiring signed [...] Students’ Undertaking Form is in any way better than a boring, boilerplate Creative Commons deed.
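To illustrate what "machine readable" buys you, here is a minimal Python sketch that decides whether a license URL points to one of the open Creative Commons licenses named above. It assumes the usual Creative Commons URL layout (`/licenses/<elements>/<version>/` and `/publicdomain/zero/<version>/`); the function name and the `example.org` URL are hypothetical, and a real tool would more likely read the license metadata embedded in a page than parse URLs by hand.

```python
from urllib.parse import urlparse

# License elements that make a Creative Commons license non-open:
# "nc" (non-commercial) and "nd" (no derivatives).
NON_OPEN_ELEMENTS = {"nc", "nd"}


def is_open_cc_license(url):
    """Return True if `url` points to CC-BY, CC-BY-SA or CC0, False otherwise."""
    parsed = urlparse(url)
    if parsed.netloc.lower() not in ("creativecommons.org", "www.creativecommons.org"):
        return False  # custom terms of use end up here
    parts = [p for p in parsed.path.lower().split("/") if p]
    # CC0 lives under /publicdomain/zero/<version>/
    if parts[:2] == ["publicdomain", "zero"]:
        return True
    # Licenses live under /licenses/<elements>/<version>/, e.g. /licenses/by-sa/4.0/
    if len(parts) >= 2 and parts[0] == "licenses":
        elements = set(parts[1].split("-"))
        return "by" in elements and not (elements & NON_OPEN_ELEMENTS)
    return False


if __name__ == "__main__":
    for url in (
        "https://creativecommons.org/licenses/by/4.0/",
        "https://creativecommons.org/licenses/by-nc/4.0/",
        "https://creativecommons.org/publicdomain/zero/1.0/",
        "https://example.org/terms-of-use",  # hypothetical custom license page
    ):
        print(url, "->", "open" if is_open_cc_license(url) else "not open")
```

With a custom license there is nothing comparable to check: a human has to read the terms every time.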

In this respect, the ADS terms of use are slightly worse than a (standard) non-commercial license: instead of restricting reuse and distribution, they only allow you

1. To use and to make personal copies of any part of the Data Collections only for the purposes of non-commercial research or teaching, as specified in the accompanying application.

which from an ethics point of view is a very slippery slope: it basically prohibits any professional archaeologist from even downloading the data (“make personal copies”). And don’t forget that teaching is not necessarily non-commercial (even when done by universities). I remember having this discussion with the ADS staff years ago, at least as early as 2010, and I’m sorry we haven’t moved forward much. What is more embarrassing, however, is that data papers as a new form of academic literature were introduced for the purpose of encouraging open data, not as a mere academic exercise in multiplying the number of (open access) journals. Of course there are more “insane” requirements in the following points but that’s not really the point. Instead, can we move on to Creative Commons open licenses, please?

As Colleen Morgan succinctly put it:

What about the professional archaeologists among us? They need media [and data] too.

For the record, and to put my own observations about AustArch in a wider perspective, a short and incomplete list of databases of radiocarbon dates follows. None of the databases below is available as a download at the moment I am writing this.

  • http://www.archeometrie.mom.fr/banadora/
  • http://www.crt.state.la.us/archaeology/radiocarbonDB.aspx
  • http://www.canadianarchaeology.ca/
  • http://www.waikato.ac.nz/cgi-bin/nzcd/search.pl
  • http://pidba.utk.edu/dating.htm
  • http://www.museumwales.ac.uk/en/radiocarbon/database/
  • http://context-database.uni-koeln.de/
  • http://www.gla.ac.uk/centres/nercrcl/results.htm
  • https://sites.google.com/site/matthewboulanger/research/vermont-radiocarbon-database
  • http://ees.kuleuven.be/geography/projects/14c-palaeolithic/
  • http://c14.kikirpa.be/
  • https://www.radiocarbon.org/Info/index.html#databases

Unleashing the potential of collaboration

Stefano Costa - May 6, 2013 in events

Ant Beck is giving a talk on 10th May 2013 at the ArcLand Conference “k2 > u2 – From Known Knowns to Unknown Unknowns: Remotely Detecting the Past” in Dublin.

The presentation is mainly advocating open approaches and inclusive engagement with the public. The slides he and David Stott prepared are already available online.

The AIA and Open Access: An Open Letter

Jessica Ogden - April 30, 2012 in Discussions

The following is a joint letter to the AIA from several members of the Open Archaeology Working Group.

A recent editorial by the President of the Archaeological Institute of America, Elizabeth Bartman, made the claim that “the [AIA], along with our colleagues at the American Anthropological Association and other learned societies, have taken a stand against open access.”  As might be expected, this has led to a degree of consternation among many of its members. After all, access to information is one of the most significant issues of our age and those who aim to restrict it should expect some opposition. Bartman is not objecting to freedom of speech, however, but ‘free as in beer’, in particular a proposed piece of federal legislation that would make archaeological scholarship ‘available to the public, on the Internet, for no charge’. This is not a simple issue and, as practicing archaeologists from the international community, we respect the AIA’s right to express such views. Despite this, it is our opinion that this proclamation has done both the AIA’s membership, as well as other academics and the general public, a grave disservice.

The editorial constitutes the public version of an official response (PDF) to the US Office of Science and Technology Policy’s Request For Information with regard to their policy on open access in general. In this formal submission the AIA makes clear that the current journal subscription and pay-per-article paradigm is entirely satisfactory in its eyes, and that “Access to [archaeological] information currently already exists and no additional federal government intervention is necessary.” In making such a claim it has urged the US Government to keep the dissemination of publicly funded knowledge in the hands of private commercial interests. It is notable that the AAA, whom the AIA quote at length in justifying their position, have since changed their position following an outcry from their own membership, and now  advocate a move to facilitating open access to anthropological research.

All this being said, we choose to see this situation as an opportunity, rather than a disaster for public access to archaeology. In making her case so boldly, Bartman has brought the issues surrounding open access front and center. Indeed, were her editorial itself not open access, it is doubtful that many would have noticed that the AIA had an opinion on the matter at all. It is also a positive step that Bartman has responded constructively to calls for a reassessment of the AIA’s position. We would urge the AIA to go much further, however, using its own open platforms, most notably its blog, to host a range of voices on this issue so that the genuine opinions of its membership can be heard. Of course, it is quite possible that many AIA members do believe that the terms under which their work can be read by others should be dictated by commercial publishers, but we suspect that this is far from being a consensus. Secondly, Bartman’s argument that the costs of dissemination are not covered by research grants is not an argument against open access. It simply acknowledges the fact that there is a cost involved. The real question, and one on which we feel there is a worthwhile debate to be had, is how that cost should be borne.

We therefore respectfully urge the AIA to:

A. Retract their objection to open access, just as the AAA have done. Of course, they may wish to abstain from advocating for open access, but actively opposing those who wish to see public research freely disseminated among the American (and indeed global) public seems unworthy of an institution whose mission is to promote “archaeological inquiry and public understanding of the material record of the human past worldwide.”

B. Use their community services, including their blog and mailing list, as a forum to engage in a debate that has become one of the most important public questions of our time.

Furthermore, we encourage the AIA’s members to speak out on behalf of open access for the following five reasons:

  1. Archaeology, unlike the physical sciences, typically destroys the very subject of its study in the process of investigation. Chemistry experiments can be repeated but excavations cannot. Archaeological interventions should only be sanctioned on the basis that the record – which includes the interpretations of those present, as well as the ‘raw data’ – is made available to those from whom it has been irretrievably taken: the public. Selling that information for profit is little better than selling archaeological finds for profit.

  2. The cost of publishing in closed-access journals is not only high, but it actively hinders the spread of knowledge within academia and the wider society. Almost all academics will have experienced the inability to access journals to which their university does not subscribe. This is insignificant, however, in contrast to the lack of access for the general public. The problem is not simply the accumulative cost – it is the fact that it is impossible to know whether an article is truly helpful until one has read it. A pay-per-view TV show that charged $20 per episode would soon go off air, yet such sums are considered reasonable for an individual journal article. To make matters worse, libraries are usually required to pay high costs for cross-disciplinary ‘packages’ of journals, the majority of which are irrelevant to their readership. The overall cost per article is therefore extraordinarily inefficient from the perspective of the library, and thus of those who fund it.

  3. It is often suggested that journal publishers make efforts to reduce costs for students or those residing in developing countries. We contend that it should never be the role of a commercial company to arbitrarily decide the conditions under which publicly funded scholarship can be seen. Such decisions should clearly be made by those who paid for it, i.e. the taxpayer or their representatives.

  4. The massive digitisation programmes of recent years have made articles more discoverable and portable than ever before. But this is just the beginning. Digital methods for analysis and visualisation are evolving just as fast and yet they have had almost no impact on archaeology for the simple reason that the data is not accessible. Instead, researchers are applying twenty-first century methods to nineteenth century journal articles because they are freely available online without restrictions from publishers. Meanwhile, twenty-first century articles are still restricted to methods practised in the first century.

  5. Through its editorial and submission to the US Government, the AIA has made itself a poster child for private publication of public scholarship. We agree that sometimes it is important to take a stand on an issue, but we think it wise to ask which side of history one is likely to end up on. The evidence from politics, the media, social practices and the commercial sector suggests that simply ‘taking a stand against open access’ is inadvisable. We do not underestimate the risks involved in ‘going open’, but swimming with the tide may be more advisable than commanding it to turn back.

These are just five reasons to support open access. There are many more, and there are also some important arguments against open access which we hope others will air. The US has long pioneered the notion that public funds should be spent for the benefit of those that pay them, not academic or commercially vested interests. It is our sincere hope that the AIA will in time also lead from the front when it comes to making archaeological scholarship available in the Information Age.

 

Cordially

Dr Leif Isaksen, Department of Archaeology, University of Southampton

Dr Anthony Beck, School of Computing, University of Leeds

Jessica Ogden, L – P : Archaeology

Stefano Costa, Dipartimento di Archeologia, Università degli Studi di Siena

Colleen Morgan, University of California, Berkeley

Doug Rocks-Macqueen, University of Edinburgh

Dr Andrew Bevan, UCL Institute of Archaeology

Dr Eric Kansa, The Alexandria Archive Institute

Paul Cripps, Archaeological Computing Research Group, University of Southampton

Mediterranean archaeology and the Not-So-Open Sea

Andrew Bevan - May 31, 2011 in Discussions

Just a quick contribution, following up on an email discussion that Stefano Costa and I had about current attitudes towards, and provisions for, open access in Mediterranean archaeology (and reminded by some conversation at a fun conference a couple of weeks ago in London).

I think some interesting discussion could be had about the intended detail of data-sharing amongst those involved in Mediterranean archaeology, or indeed archaeology more generally. For example, there are already some very useful online summaries of Mediterranean excavations and surveys in the form of such things as AIAC and L-P Archaeology’s FastiOnline or Tel Aviv/USC/UCLA and IPAWG’s West Bank and East Jerusalem Archaeological Database, both of which I think are the kinds of ambitious but effective cross-border or cross-party efforts that hopefully will be taken up more widely. A great example of a Mediterranean typological resource is the Southampton Roman Amphora project. Another useful resource on Mediterranean surface surveys is CGMA’s MAGIS site, albeit with restrictions on usage and no immediate way to download raw data or results of searches at present (please correct me if I am wrong).

In any case, this is of course far from an exhaustive list and reflects my own research biases. Such Mediterranean projects have a variety of valid priorities and working constraints which lead to the particular dissemination approaches they have chosen. Not everyone will of course agree with all of them and there is, I’d argue, still some further championing of direct-access-to-simple-text-and-image-files to be done. More to the point, the acknowledged focus of these initiatives is on coarse locations (e.g. site centroids), typologies (e.g. of representative artefacts only), and metadata (general context of research, index level bibliography). Such standardising, integrative efforts are clearly crucial, but it would be a pity if that was the only sense in which Mediterranean or any other regional archaeology sought to share data.

In contrast, full, raw and open datasets (for example of excavation and survey archives at the scale of individual recovery units, original finds databases, etc.) are made public only to a very limited degree, even many years after collection. The best ones I have come across so far are typically those driven by the UK Arts and Humanities Research Council’s requirement that archaeological projects it funds archive with the UK Archaeology Data Service (ADS; e.g. search for “Mediterranean” here). The Open Context initiative also looks promising, although it is still in its early stages. Even so, the degree to which these project archives really do provide usable, primary datasets varies a lot to my knowledge, and the ADS still has a click-to-agree-with-our-terms protocol which tends to get a bit in the way (any comment on the formal rationale behind this?).

In any case, whether such raw data provisions are achieved via integrative mega-projects, official repositories, open access journals with dataset DOIs, or other arrangements (one of the latter I and some colleagues are working on as a companion to an imminent ADS deposition is here, and the Çatalhöyük database is certainly worth a look) and different kinds of funder or permit requirements is less important (though on the mega side of things, there is clearly a case for well-funded, discipline-agnostic national or trans-national depositories: e.g. building on the idea of Public Data Corporations) than the fact that there currently is such a huge missed opportunity. It is particularly strange that the Mediterranean, despite being one of the most data-dense archaeological records in the world, provides so little raw data! The straightforward combination of one or more simple ascii data files, a short description explaining some archaeological background and an open license has a beautiful simplicity to it; we should start with that, rather than getting too worried about either semantically-rich-and-structured stuff or a full theoretical barrage about ontologies etc.
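To make that "simple combination" concrete, here is a minimal Python sketch that writes one plain ASCII CSV file and a short README carrying a description and a license statement. Everything specific in it is an assumption for illustration: the file names, the column names and the single data row are made up and do not come from any real archive, and CC0 is just one of the open licenses one could pick.

```python
import csv
import tempfile
from pathlib import Path


def write_dataset(directory, rows, fieldnames, description, license_url):
    """Write a plain-text data file plus a README stating description and license."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)

    # One simple ASCII data file with a header row.
    data_path = directory / "data.csv"
    with data_path.open("w", newline="", encoding="ascii") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)

    # A short description and an explicit open license statement.
    readme_path = directory / "README.txt"
    readme_path.write_text(
        f"{description}\n\nLicense: {license_url}\n", encoding="ascii"
    )
    return data_path, readme_path


if __name__ == "__main__":
    # Hypothetical example record, not real data.
    rows = [{"unit": "U001", "find_type": "sherd", "count": 12}]
    with tempfile.TemporaryDirectory() as tmp:
        data_path, readme_path = write_dataset(
            tmp,
            rows,
            fieldnames=["unit", "find_type", "count"],
            description="Finds counts per survey unit (illustrative example).",
            license_url="https://creativecommons.org/publicdomain/zero/1.0/",
        )
        print(data_path.read_text(encoding="ascii"))
        print(readme_path.read_text(encoding="ascii"))
```

Nothing here needs a database, a schema or an ontology, which is rather the point.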

While I am at it, one final issue with regard to Mediterranean spatial datasets and full data-sharing is, of course, the risk of promoting some kind of spatially-enhanced looting (and for the issue in archaeology more generally, Ant Beck’s blog is great). This gets especially tricky of course when our open data efforts cross modern political borders (there is a good blog and journal special issue on this). Such risks are probably less relevant for excavation archives and finds databases of known sites (“unencumbered data” as Stefano puts it), but are certainly a theoretical worry for many who deal in site-level spatial data across whole landscapes and in known areas of looting. In any case, my feeling is that a) this is a fear that, while very plausible in theory, is rarely backed up with sufficient proof in terms of much documented looting behaviour that has been enabled by academic digital publications of spatial coordinates (unless I have missed something – perhaps shipwreck ‘salvage’?), b) such an issue is being rapidly taken out of specialist hands (and therefore over the next decade will effectively become less of an issue, whatever our misgivings) by the fact that non-specialists and informal contributors (site visitors with cameras, GPS etc., locals, metal detectorists, project participants, etc.) can now contribute fairly precise locations of cultural heritage finds and sites to Google Earth etc., and already promote spatially-precise exploratory activities through (what are usually ecologically progressive) hobbies such as geo-caching. Access to Mediterranean datasets with intentionally-degraded spatial information and/or limited to those with a state or academic institutional affiliation are perhaps the two most often mentioned spatial firewalls in sensitive cases (e.g. the different levels of coordinate access in the UK’s Portable Antiquities Scheme – which is worth a look more generally, if you have not seen it before, as it is simply such an important initiative), but neither is as open as we might otherwise wish. Beyond simply saying that we agree that open-spatial-data-enabling-looters is an important issue and that it is part of an ongoing discussion, how programmatic do we need to be?
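For what "intentionally-degraded spatial information" might look like in practice, here is a minimal Python sketch that rounds coordinates to a precision depending on who is asking. The access levels, their precisions and the example coordinates are all hypothetical; this is not how the Portable Antiquities Scheme or any other service actually implements it, only one simple way to express the idea. As a rough guide, one decimal place of latitude is about 11 km on the ground.

```python
# Hypothetical access levels mapped to the number of decimal places released.
PRECISION_BY_ACCESS = {
    "public": 1,      # roughly 11 km in latitude
    "registered": 3,  # roughly 110 m in latitude
    "researcher": 5,  # roughly 1 m in latitude
}


def degrade_location(lat, lon, access="public"):
    """Return (lat, lon) rounded to the precision allowed for `access`."""
    try:
        decimals = PRECISION_BY_ACCESS[access]
    except KeyError:
        raise ValueError(f"unknown access level: {access!r}") from None
    return round(lat, decimals), round(lon, decimals)


if __name__ == "__main__":
    # Made-up coordinates, not a real site.
    site = (37.98381, 23.72754)
    for level in PRECISION_BY_ACCESS:
        print(level, degrade_location(*site, access=level))
```

Rounding is the bluntest option; snapping to grid cells or adding random offsets are alternatives, and each loses analytical value in a different way.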

Anyway, thoughts and corrections on the above are very welcome!

 

The purpose of the working group

Stefano Costa - February 6, 2010 in Working Group

The purpose of the Working Group on Open Data in Archaeology is to:
  1. Act as a central point of reference and support for people interested in open archaeological data
  2. Identify relevant projects and practices. Promote best practices as well as legal and technical standards for making data open (such as the Open Knowledge Definition).
  3. Act as a hub for the development and maintenance of low cost, community driven projects related to open data and archaeology.