Welcome to archive-drs’s documentation!
This software adds data whose directory structure follows the Data Reference Syntax (DRS) to various data object stores or archiving systems. Currently the code supports the following archiving systems:
The intended usage is either via a command line interface (CLI) or the drs_archive Python package, which lets you archive and retrieve data. Additional metadata is stored along with the data. This metadata consists of a hash that represents the archived data files and the modification dates of the data. The idea is that data that has already been archived and hasn’t changed since then will not be archived again. This allows for automated archival, for example in cron jobs.
Furthermore, a string representation of the dataset is stored in the metadata if the archiving system doesn’t allow for a direct representation of metadata. This allows for fast metadata inspection without downloading the data.
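The idea of skipping unchanged data can be illustrated with a minimal sketch. This is not the package's actual implementation: the function names `directory_fingerprint` and `needs_archiving`, the choice of SHA-256, and the layout of the metadata dictionary are all assumptions made for illustration only.

```python
import hashlib
from pathlib import Path


def directory_fingerprint(directory):
    """Return a content hash and the newest modification time of a directory.

    Hypothetical helper: only regular files directly inside ``directory``
    are considered, in sorted order so the hash is reproducible.
    """
    digest = hashlib.sha256()
    newest_mtime = 0.0
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        # Include the file name so that renames also change the hash.
        digest.update(path.name.encode("utf-8"))
        digest.update(path.read_bytes())
        newest_mtime = max(newest_mtime, path.stat().st_mtime)
    return {"hash": digest.hexdigest(), "mtime": newest_mtime}


def needs_archiving(directory, stored_metadata):
    """Return True if the directory was never archived or has changed since."""
    if stored_metadata is None:
        return True
    return directory_fingerprint(directory) != stored_metadata
```

A scheduled job could call `needs_archiving` for each directory and upload only those for which it returns `True`. Reading whole files into memory keeps the sketch short; large data files would call for chunked reads.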
Use Case
The code expects the data that is to be archived to be organised in a directory and file name structure following the DRS conventions <https://pcmdi.llnl.gov/mips/cmip5/docs/cmip5_data_reference_syntax.pdf>. Archive operations are directory based, not single-file based, meaning that all files in a given directory will be archived.
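To make the expected layout more concrete, the following sketch splits a directory path into DRS facets. It assumes the CMIP5 directory layout from the linked DRS document; other DRS flavours use different facets, and the helper `parse_drs_directory`, the facet names and the example path are hypothetical and not part of the archive-drs API.

```python
from pathlib import PurePosixPath

# Directory levels of the CMIP5 DRS, from the activity down to the variable.
CMIP5_FACETS = (
    "activity",
    "product",
    "institute",
    "model",
    "experiment",
    "frequency",
    "modeling_realm",
    "mip_table",
    "ensemble_member",
    "version",
    "variable",
)


def parse_drs_directory(directory, root):
    """Map each level of ``directory`` below ``root`` to a DRS facet name."""
    parts = PurePosixPath(directory).relative_to(root).parts
    if len(parts) != len(CMIP5_FACETS):
        raise ValueError(
            f"expected {len(CMIP5_FACETS)} directory levels, got {len(parts)}"
        )
    return dict(zip(CMIP5_FACETS, parts))


facets = parse_drs_directory(
    "/data/CMIP5/output1/MPI-M/MPI-ESM-LR/historical/mon/atmos/Amon/r1i1p1/v20120315/tas",
    root="/data",
)
```

Here `facets["model"]` is `"MPI-ESM-LR"` and `facets["variable"]` is `"tas"`. Because archiving is directory based, every file inside that final directory would belong to the same dataset.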
Installation
The code can be installed using pip:
python3 -m pip install archive-drs --extra-index-url gitlab.dkrz.de/api/v4/projects/139393/packages/pypi/simple
See also
- StrongLink HSM: User guide to the DKRZ HSM system
- OpenStack Swift: User guide to the DKRZ swift cloud store
- python-swiftclient: User guide of the python swift client library