Good practice in data management is one of the core areas of research integrity, or the responsible conduct of research. It enables verification of research outcomes, supports future research and enables sharing and reuse of research data.

The Collaborative Research Centre (CRC; ‘Sonderforschungsbereich’ or SFB) is designed to capture the complex nature of the chronology, regional structure, and climatic, environmental and socio-cultural contexts of major intercontinental and transcontinental events of dispersal of Modern Man from Africa to Western Eurasia, and particularly to Europe (quoted from the introductory text on www.sfb806.de).

The CRC806-Database is the central data infrastructure of the Collaborative Research Centre 806 (CRC806). The platform has two main goals. The first is to provide a secure and sustainable long-term data archive and publication platform for research results (primary data) produced by CRC806 researchers and projects. The second is to provide an integrated data base that facilitates research within the CRC806.

Following the German Research Foundation (DFG) proposals for safeguarding good scientific practice (DFG 1998), the CRC806 endeavours to provide a sustainable long-term data archive for the project data. The project is planned for an overall period of twelve years, subdivided into three four-year project terms, each of which is evaluated at its end. Because the DFG requires that research results remain accessible for at least 10 years after the end of the project, the CRC must be able to provide access to the data for up to 22 years. This long-term availability, and the corresponding maintenance of the CRC806-Database infrastructure after the end of the research project itself, is secured by an agreement between the CRC806 and the Regional Computing Centre of the University of Cologne (RRZK).
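The 22-year figure follows directly from the project schedule and the DFG retention period. The short sketch below only restates that arithmetic; the function and constant names are illustrative and do not come from the CRC806-Database itself.

```python
# Illustrative arithmetic only; names are hypothetical, values are from the text above.
PROJECT_TERMS = 3               # three project terms
YEARS_PER_TERM = 4              # each term lasts four years
RETENTION_YEARS_AFTER_END = 10  # DFG: access for at least 10 years after project end


def required_access_years(terms=PROJECT_TERMS,
                          years_per_term=YEARS_PER_TERM,
                          retention=RETENTION_YEARS_AFTER_END):
    """Longest span for which data must stay accessible (data from project start)."""
    project_years = terms * years_per_term
    return project_years + retention


def access_years_from_term_start(term,
                                 terms=PROJECT_TERMS,
                                 years_per_term=YEARS_PER_TERM,
                                 retention=RETENTION_YEARS_AFTER_END):
    """Access span for data produced at the start of a given term (1-based)."""
    if not 1 <= term <= terms:
        raise ValueError("term must be between 1 and the number of terms")
    remaining_project_years = (terms - term + 1) * years_per_term
    return remaining_project_years + retention


if __name__ == "__main__":
    print(required_access_years())
    for t in range(1, PROJECT_TERMS + 1):
        print(t, access_years_from_term_start(t))
```

Data produced at the very start of the first term sets the upper bound; data produced in later terms needs correspondingly fewer years of guaranteed access.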

During the design, development and implementation of the CRC806-Database, the complex requirements for sound data management in a large interdisciplinary research project were considered both theoretically and practically. The infrastructure design presented here is based mainly on the DFG's requirements for research data management in CRCs, chiefly the secure storage of primary research data for at least ten years, and on the DFG's further recommendations for information infrastructure, which concern supporting and improving research and facilitating Web-based collaboration (Willmes 2016).

The CRC806-Database semantic e-Science infrastructure consists of three main components:

i.) the CRC806-RDM component, which implements the research data management, including a data catalog and a publication database,
ii.) the CRC806-SDI component, which provides a Spatial Data Infrastructure (SDI) for Web-based management of spatial data, and
iii.) the CRC806-KB component, which implements a collaborative virtual research environment and knowledge base.

From a technical perspective, the infrastructure is based on existing Open Source Software (OSS) solutions that were customized, where necessary, to meet the specific requirements. The main OSS products used for the development of the CRC806-Database are Typo3, CKAN, GeoNode and Semantic MediaWiki. The concept of Semantic e-Science was implemented as the integrative technical and theoretical basis of the infrastructure. The term e-Science refers to a scientific paradigm that describes computationally intensive science carried out in networked environments. The prefix "Semantic" extends this concept with the application of Semantic Web technologies. A further conceptual basis for the development of the CRC806-Database is "Open Science", which includes the concepts of "Open Access", "Open Data" and "Open Methodology" (Willmes 2016).
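As an illustration of how a CKAN-based data catalog can be queried over the Web, the sketch below builds a request URL for CKAN's standard Action API dataset search (`package_search`, with its `q` and `rows` parameters). The base URL and the search term are placeholders, and whether the CRC806-Database exposes this endpoint publicly is an assumption not stated in the text.

```python
from urllib.parse import urlencode, urljoin


def ckan_search_url(base_url, query, rows=10):
    """Build a CKAN Action API dataset-search URL (no request is sent)."""
    endpoint = urljoin(base_url.rstrip("/") + "/", "api/3/action/package_search")
    params = urlencode({"q": query, "rows": rows})
    return endpoint + "?" + params


if __name__ == "__main__":
    # "https://ckan.example.org" is a placeholder, not the CRC806 catalog address.
    print(ckan_search_url("https://ckan.example.org", "loess", rows=5))
```

The function only constructs the URL, so it can be tried without network access; fetching and parsing the JSON response would be a separate step against a real CKAN instance.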

The CRC806-Database infrastructure is developed by a team of the CRC806 Z2 Project, based at the Institute of Geography of the University of Cologne. The platform is implemented using mostly free and open source software.


DFG (1998): Proposals for Safeguarding Good Scientific Practice.

Willmes, C. (2016): CRC806-Database: A semantic e-Science infrastructure for an interdisciplinary research centre. PhD Thesis, University of Cologne.