CAT4KIT

A cross-institutional data catalog framework for the FAIRification of environmental research data

Get Started with Cat4KIT: https://cat4kit.atmohub.kit.edu

Explore Our Documentation: https://cat4kit.readthedocs.io

  • C

    A User Management Interface (UMI) for harvesting (meta)data and ingesting it with DS2STAC into the Cat4KIT infrastructure.

    Please have a look at https://ca4kit-umi.readthedocs.io for the documentation.

  • C

    This GitLab subgroup is dedicated to managing environment configurations and Nginx setup files for the Cat4KIT project. To maintain confidentiality, the production vault is stored in a private repository. Conversely, the development vault is kept in a public repository to allow for broader access and collaboration.

  • D

    A Python package for harvesting and ingesting (meta)data into STAC-based catalog infrastructures

    Please have a look at https://ds2stac.readthedocs.io for the documentation.

  • B

    The purpose of this repository is to gather the concerns and problems identified on the Feedback page of the Cat4KIT-UMI platform.

  • C

    This repository is the official GitLab profile for the Cat4KIT group, dedicated to presenting a live roadmap that outlines ongoing and future developments.

  • C

    This repository provides a Docker Compose setup for automatically deploying and running the entire Cat4KIT service on any server.

  • C

    The Cat4KIT documentation is built with Sphinx and is available online at https://cat4kit.readthedocs.io or https://cat4kit.pages.hzdr.de/cat4kit-documentation/.

  • C

    This repository handles the installation and maintenance of software packages for the Cat4KIT project. It monitors and verifies the status of pipelines, manages versioning, and publishes updated pipeline roadmap charts.

  • P

    This repository archives data from the Django REST Framework (DRF)-based production (scheduler) backend: it stores object lists in YAML format and regularly backs up harvested metadata. It provides an off-server backup from which metadata can be restored in an emergency.

  • S

    This repository archives SQL backups of two databases: the pgSTAC production database and the Django REST Framework (DRF) PostgreSQL database. Each backup is compressed as a tar.gz archive and tracked with Git Large File Storage (LFS). The process is automated: the server periodically pushes the latest backup to the repository, so the repository always holds the most recent database snapshots.
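
The backup repository in the last item compresses each database dump into a tar.gz archive before it is pushed. The compression step could be sketched in Python as follows, assuming a dump file already exists on disk. The function name `archive_dump`, the file naming scheme, and the timestamp format are illustrative assumptions, not the project's actual scripts; creating the dump and pushing it through Git LFS are not shown.

```python
import tarfile
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def archive_dump(dump_path: Path, out_dir: Path, stamp: Optional[str] = None) -> Path:
    """Compress one SQL dump file into <stem>_<stamp>.tar.gz inside out_dir.

    Hypothetical helper: the naming scheme is an assumption for illustration.
    """
    if stamp is None:
        # Default to a UTC timestamp so that archive names sort chronologically.
        stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    out_dir.mkdir(parents=True, exist_ok=True)
    archive_path = out_dir / f"{dump_path.stem}_{stamp}.tar.gz"
    with tarfile.open(archive_path, "w:gz") as tar:
        # Store only the file name, not the absolute path on the server.
        tar.add(dump_path, arcname=dump_path.name)
    return archive_path
```

Calling `archive_dump(Path("pgstac.sql"), Path("backups"), stamp="20240101T000000Z")` would write `backups/pgstac_20240101T000000Z.tar.gz`, which a scheduled job could then commit and push.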

Cat4KIT: A cross-institutional data catalog framework for the FAIRification of environmental research data


CI Docs Code style: black Imports: isort

The Cat4KIT project aims to create a cross-institutional catalog and research data management system that improves the Findability, Accessibility, Interoperability, and Reusability (FAIR) of environmental research data. The framework comprises four modules, each with a dedicated task:

(1) Making data on storage systems available through clearly defined, standardized interfaces.

(2) Harvesting (meta)data and transforming it into uniform, standardized representations.

(3) Making (meta)data publicly accessible through well-defined, standardized catalog services and interfaces.

(4) Allowing users to efficiently search, filter, and navigate data from decentralized research data infrastructures.

Each module is developed and operated in a cross-institutional collaboration of scientists, software developers, and potential end users. This approach keeps the framework versatile, so it can be applied to many types of research data, including multi-dimensional climate model outputs and high-frequency in-situ measurements.
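
To make modules (2) and (4) more concrete, here is a minimal Python sketch of a standardized STAC representation and of a search request against a STAC API. It uses only the standard library and follows the general shape of a STAC Item and of a STAC API item search body. The helper names, the item ID, and the example.org URL are hypothetical placeholders; this is not how DS2STAC itself builds its records.

```python
import json
from datetime import datetime, timezone


def _rfc3339(moment: datetime) -> str:
    """Format a timezone-aware datetime as an RFC 3339 UTC timestamp."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def make_stac_item(item_id, bbox, when, data_href):
    """Build a minimal STAC Item (a GeoJSON Feature) as a plain dict."""
    west, south, east, north = bbox
    return {
        "type": "Feature",
        "stac_version": "1.0.0",
        "id": item_id,
        "bbox": [west, south, east, north],
        "geometry": {
            "type": "Polygon",
            # Closed ring: the first and last coordinates are identical.
            "coordinates": [[
                [west, south], [east, south], [east, north],
                [west, north], [west, south],
            ]],
        },
        "properties": {"datetime": _rfc3339(when)},
        "links": [],
        "assets": {"data": {"href": data_href}},
    }


def make_search_body(bbox, start, end, collections=None, limit=10):
    """Build the JSON body for a STAC API item search (POST /search)."""
    body = {
        "bbox": list(bbox),
        # Closed time interval, written as start/end.
        "datetime": f"{_rfc3339(start)}/{_rfc3339(end)}",
        "limit": limit,
    }
    if collections:
        body["collections"] = list(collections)
    return body


if __name__ == "__main__":
    # All values below are placeholders for illustration only.
    item = make_stac_item(
        "example-item",
        (8.0, 48.5, 9.0, 49.5),
        datetime(2023, 1, 1, tzinfo=timezone.utc),
        "https://example.org/data/example.nc",
    )
    print(json.dumps(item, indent=2))
```

The search body would be sent to the `/search` endpoint of whichever STAC API serves the catalog; the endpoint URL is deployment-specific and is deliberately left out here.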

Cat4KIT pipeline roadmap

%%{init: { 'theme':'forest'}}%%
gantt 
dateFormat YYYY-MM-DD 
title Cat4KIT development roadmap
section Cat4KIT dev
TDS2STAC V1(100%):done,tds2stac1,2022-10-01,2022-11-01
INTAKE2STAC V1(100%):done,intake2stac,2022-11-01,2022-11-15
STA2STAC V1(100%):done,sta2stac,2022-11-16,2022-11-30
DS2STAC-UI V1(100%):done,ds2stac-ui1,2022-12-01,2022-12-10
TDS2STAC V2(100.0%):done, 1964, 2023-06-12, 2023-07-05
Cat4KIT-UI V1(100.0%):done, 1965, 2023-07-06, 2023-08-20
Thumbnails(100.0%):done, 1973, 2023-09-20, 2023-11-30
DS2STAC V2(100.0%):done, 1970, 2023-08-21, 2023-11-30
Pipeline(93.3%):active, 2107, 2023-07-01, 2024-01-01
API Authentication(0.0%):active, 1975, 2023-12-10, 2024-01-10
Cat4KIT-UI V2(72.0%):active, 1972, 2023-12-07, 2024-01-15
Int. and Ext. harvester(100.0%):done, 1974, 2024-01-15, 2024-01-20
section Cat4KIT-docker
+pgSTAC(100%):done,pgstacdocekr,2022-12-11,2022-12-25
+STAC-FastAPI(100%):done,stacapdocker,2022-12-11,2022-12-25
+STAC-Browser(100%):done,stacbrowserdocker,2022-12-11,2022-12-25
+DS2STAC(100%):done,ds2stacdocker,2022-12-11,2022-12-25
Dockerizing(100.0%):done, 1985, 2023-07-05, 2024-01-01
section Research
Testify datasets(100.0%):done, 1979, 2023-08-21, 2023-11-15
Comparison research(0.0%):active, 1978, 2024-01-15, 2024-01-25
section Docs and Publications
DS2STAC Docs(100%):done,tds2stacdoc,2022-10-01,2022-11-30
Datahub(100%):done,datahub,2023-01-19,2023-01-20
E-Science-Tage23(100%):done,esciencetage23,2023-03-01,2023-03-03
EGU(100%):done,egu,2023-04-23,2023-04-28
DSS8(100%):done,dss8,2023-06-08,2023-06-09
DS2STAC doc(100.0%):done, 1981, 2023-11-01, 2023-11-30
Cat4KIT doc(14.3%):active, 1982, 2024-01-17, 2024-02-01
Cat4KIT article(0.0%):active, 1983, 2024-01-17, 2024-02-17

Cat4KIT schematic diagram (Will be updated)
