17.05.2013 - 07:28

Overview of the technical components of the CRC806 data management


This short write-up identifies the main technical components behind the CRC806 database and explains how they fit together.

The metadata of the CRC806 database is stored in and retrieved from a CKAN instance through its Action API. We use CKAN version 1.8, whose Action API exposes the main functionality of CKAN in a Remote Procedure Call (RPC) style. The underlying information model for the persisted metadata is subdivided into core metadata and an unlimited number of additional metadata fields. The core metadata schema, which partly overlaps with the Dublin Core Metadata Element Set, can be viewed here.
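To make the RPC style concrete, the following minimal sketch shows how one Action API call could be assembled in Python. It is only an illustration: the host name and dataset id are hypothetical, the helper names `build_action_request` and `unwrap_result` are invented for this example, and the exact headers a CKAN 1.8 instance expects may differ. The real extension talks to CKAN from PHP.

```python
import json
import urllib.request

# Hypothetical base URL; the post does not name the CKAN host.
CKAN_URL = "http://ckan.example.org"


def build_action_request(base_url, action, params, api_key=None):
    """Build (but do not send) a POST request for one Action API call."""
    url = "{}/api/action/{}".format(base_url.rstrip("/"), action)
    # The parameters of the action travel as a JSON object in the body.
    body = json.dumps(params).encode("utf-8")
    headers = {}
    if api_key:
        # Write actions need an API key; read actions on public data do not.
        headers["Authorization"] = api_key
    return urllib.request.Request(url, data=body, headers=headers)


def unwrap_result(response_text):
    """Return the "result" member of an Action API response, or raise."""
    envelope = json.loads(response_text)
    if not envelope.get("success"):
        raise RuntimeError(envelope.get("error"))
    return envelope["result"]


# "example-dataset" is a placeholder id, not a real CRC806 dataset.
request = build_action_request(CKAN_URL, "package_show", {"id": "example-dataset"})

# Sending is left out so the sketch runs offline:
# with urllib.request.urlopen(request) as response:
#     dataset = unwrap_result(response.read().decode("utf-8"))
```

Each action name maps to one URL below `/api/action/`, and the response is a JSON envelope whose `success` flag tells the caller whether `result` or `error` is populated.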

In addition to describing data with the core metadata, we let users validate their data against ISO 19115 for datasets and dataset series and ISO 19119 for services. The ISO elements are considerably more extensive than the core metadata. The elements currently supported by the database can be viewed in this UML diagram. The ISO-compliant metadata is stored in a pycsw instance. The implementation does not depend on pycsw, however: the system works with any fully transactional CSW server implementation. We chose pycsw because it is certified as OGC compliant and is an official OGC Reference Implementation.

Communication with CKAN and pycsw is handled by a TYPO3 PHP extension that follows the Model-View-Controller pattern. The view component is a self-developed frontend that relies heavily on JavaScript and Google's popular AngularJS framework. CKAN can be regarded as the model component of the extension, whereas the controller is the part that glues view and model together.

The overall architecture can be viewed in this diagram.

The aforementioned UML model is translated into an extensible ISO Domain Model within the SFB806 extension. The individual objects of the Domain Model are combined via object composition, so various compositions are possible, e.g. a compound object for a dataset or for a service. Every object that is part of the ISO Domain Model inherits from an abstract base class, which provides the functionality to translate the object into an array or into an ISO-compliant XML representation. Furthermore, the abstract base class lets inheriting compound objects construct themselves from an ISO XML representation or an array representation. The workflow is as follows:
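The idea of a shared base class with array and XML translation can be sketched as follows. This is a simplified Python illustration, not the extension's PHP code: the class names `DomainObject`, `Contact` and `Dataset`, the field names, and the plain element names are all hypothetical, and the XML produced here is not ISO compliant.

```python
import xml.etree.ElementTree as ET


class DomainObject:
    """Base class: every domain object can turn itself into an array or XML."""

    tag = "object"   # element name used in the XML output (hypothetical)
    fields = ()      # names of the values this object carries
    children = {}    # field name -> class, for fields that are nested objects

    def __init__(self, **values):
        for name in self.fields:
            setattr(self, name, values.get(name))

    def to_array(self):
        """Translate the object (and nested objects) into a plain dict."""
        result = {}
        for name in self.fields:
            value = getattr(self, name)
            result[name] = value.to_array() if isinstance(value, DomainObject) else value
        return result

    def to_xml(self):
        """Translate the object (and nested objects) into an XML element."""
        element = ET.Element(self.tag)
        for name in self.fields:
            value = getattr(self, name)
            if isinstance(value, DomainObject):
                element.append(value.to_xml())
            elif value is not None:
                ET.SubElement(element, name).text = str(value)
        return element

    @classmethod
    def from_array(cls, data):
        """Rebuild the object from the dict produced by to_array()."""
        values = {}
        for name in cls.fields:
            raw = data.get(name)
            child_cls = cls.children.get(name)
            if child_cls is not None and raw is not None:
                raw = child_cls.from_array(raw)
            values[name] = raw
        return cls(**values)


class Contact(DomainObject):
    tag = "contact"
    fields = ("name", "email")


class Dataset(DomainObject):
    """Compound object: composed of simple values and a nested Contact."""
    tag = "dataset"
    fields = ("title", "abstract", "contact")
    children = {"contact": Contact}


form_data = {
    "title": "Example dataset",
    "abstract": "Placeholder text",
    "contact": {"name": "Jane Doe", "email": "jane@example.org"},
}
dataset = Dataset.from_array(form_data)
xml_text = ET.tostring(dataset.to_xml(), encoding="unicode")
```

Because the translation logic lives only in the base class, adding a new kind of compound object in this sketch means declaring its fields and nested classes, nothing more.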

A user requests to add additional metadata for a dataset or service. The SFB806 extension instantiates a compound object of the ISO Domain Model that fits the ISO description for datasets or services. The compound object translates itself into an array, which is the basis for the form that is delivered to the frontend for the user to complete. After the user has submitted the form, the compound object is rebuilt from the array and translates itself into an ISO-compliant XML representation, which is then sent to the Manager Interface of the pycsw server. The update process of a dataset can be viewed in this sequence diagram.
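As a rough illustration of the last step, a transactional CSW server accepts metadata changes wrapped in a CSW 2.0.2 `Transaction` document. The sketch below builds such an envelope for a full-record update in Python. The helper name `build_update_transaction` and the placeholder record are hypothetical; a real request would carry the ISO XML produced by the compound object and would be sent by HTTP POST to the server's endpoint, which is not named in this post.

```python
import xml.etree.ElementTree as ET

# Namespace of the OGC Catalogue Service 2.0.2 schema.
CSW_NS = "http://www.opengis.net/cat/csw/2.0.2"
ET.register_namespace("csw", CSW_NS)


def build_update_transaction(record):
    """Wrap a metadata record element in a CSW Transaction/Update envelope."""
    transaction = ET.Element(
        "{%s}Transaction" % CSW_NS, {"service": "CSW", "version": "2.0.2"}
    )
    update = ET.SubElement(transaction, "{%s}Update" % CSW_NS)
    # A full-record update replaces the stored record with this element.
    update.append(record)
    return ET.tostring(transaction, encoding="unicode")


# Placeholder record standing in for the ISO XML of a dataset.
record = ET.Element("record")
ET.SubElement(record, "title").text = "Example dataset"

payload = build_update_transaction(record)
```

Adding a new record would use a `csw:Insert` element in place of `csw:Update`; the surrounding envelope stays the same.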

The current ISO Domain Model can easily be extended with application profiles, as provided for in the OGC Catalogue Service Implementation Specification 2.0.2.

All figures are taken from:

Kürner, D., 2013. Implementation des Metadatenmanagement von Geodaten der SFB806-Datenbank. Diplomarbeit, Geographisches Institut der Universität zu Köln, Köln.

Post by: Daniel Kürner