Tools and Documentation

Here you can find IT solutions for digital preservation tasks, such as tools, components, and models, along with documentation, user guidelines, demos, and code.

Use the categories on the left to filter the Preserveware material. If you have a specific keyword in mind, you can search for it directly from the search box.

Showing 10 out of 24 results

Scrapy

An open source and collaborative framework for extracting the data you need from websites in a fast, simple, yet extensible way.

Scrapy developers
Open Source

AVCC Cataloging Toolkit

AVCC is an open source web application developed to enable collaborative, efficient item-level cataloging of audiovisual collections.

Open Source

Tesseract

Tesseract is an open source Optical Character Recognition (OCR) engine. It can be used directly as standalone software or through an API to extract typed, handwritten, or printed text from images. It supports a wide variety of languages.

Open Source

Invenio

Invenio is a free software suite enabling you to run your own digital library or document repository on the web.


FFmpeg

A complete, cross-platform solution to record, convert, and stream audio and video. ffmpeg is a very fast video and audio converter that can also grab from a live audio/video source. It can also convert between arbitrary sample rates and resize video on the fly with a high-quality polyphase filter.

Open Source
Web Scraper Plus+

Web Scraper Plus+ is a complete web extraction and automation suite. It has a simple wizard-driven interface for common tasks, and its vendor claims more advanced functionality than all competing products combined.

LRM Ontology

The Linked Resource Model (LRM) aims at describing digital resources and their dependencies for preservation purposes.

Jean-Yves Vion-Dury, Nikolaos Lagos (Xerox Research Centre Europe)
Open Source

ePADD

ePADD was developed by Stanford University to support the appraisal, ingest, processing, discovery, and delivery processes of email archives.


JHOVE

JHOVE is a file format identification, validation, and characterisation tool. It is implemented as a Java application and is usable on any Unix, Windows, or OS X platform with an appropriate Java installation.
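
To illustrate what format validation means in practice, the following is a minimal Python sketch that checks the outer structure of a PNG file: the signature, the chunk layout, and each chunk's CRC. It is not JHOVE's implementation and does not reflect its API. The function names `check_png_structure` and `make_chunk` are hypothetical, and the check is far shallower than what a real validator performs (for example, it does not inspect image data).

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type: bytes, body: bytes) -> bytes:
    """Build one PNG chunk: length, type, body, CRC over type + body."""
    crc = zlib.crc32(chunk_type + body)
    return struct.pack(">I", len(body)) + chunk_type + body + struct.pack(">I", crc)


def check_png_structure(data: bytes) -> list:
    """Return a list of structural problems; an empty list means none were found."""
    if not data.startswith(PNG_SIGNATURE):
        return ["missing PNG signature"]

    problems = []
    chunk_types = []
    offset = len(PNG_SIGNATURE)

    while offset < len(data):
        if offset + 8 > len(data):
            problems.append("truncated chunk header")
            break
        length, chunk_type = struct.unpack(">I4s", data[offset:offset + 8])
        end = offset + 8 + length + 4  # header + body + CRC
        if end > len(data):
            problems.append(f"truncated chunk {chunk_type!r}")
            break
        body = data[offset + 8:offset + 8 + length]
        (stored_crc,) = struct.unpack(">I", data[end - 4:end])
        if zlib.crc32(chunk_type + body) != stored_crc:
            problems.append(f"CRC mismatch in chunk {chunk_type!r}")
        chunk_types.append(chunk_type)
        offset = end

    # Every PNG must begin with IHDR and end with IEND.
    if not chunk_types or chunk_types[0] != b"IHDR":
        problems.append("first chunk is not IHDR")
    if not chunk_types or chunk_types[-1] != b"IEND":
        problems.append("last chunk is not IEND")
    return problems


if __name__ == "__main__":
    # A skeletal file with only IHDR and IEND, enough to exercise the checks.
    header = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    sample = PNG_SIGNATURE + make_chunk(b"IHDR", header) + make_chunk(b"IEND", b"")
    print(check_png_structure(sample))
```

A file that passes these checks is only structurally plausible. Full validation also has to confirm that the required chunks are present and that their contents are consistent with each other.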


DROID

DROID is designed to identify the precise format of all stored digital objects, and to link that identification to a central registry of technical information about that format and its dependencies.
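
The general idea of signature-based format identification can be sketched in a few lines of Python. This is an illustration only, not DROID's implementation: the `SIGNATURES` table, its `example/...` registry keys, and the functions `identify` and `identify_file` are all hypothetical, and a real tool relies on a far larger and more precise signature registry.

```python
from typing import Optional, Tuple

# Hypothetical mini-registry: (magic bytes, byte offset, registry key, format name).
# The "example/..." keys are placeholders, not identifiers from any real registry.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", 0, "example/png", "PNG image"),
    (b"%PDF-", 0, "example/pdf", "PDF document"),
    (b"GIF87a", 0, "example/gif87a", "GIF image (87a)"),
    (b"GIF89a", 0, "example/gif89a", "GIF image (89a)"),
    (b"PK\x03\x04", 0, "example/zip", "ZIP container"),
]


def identify(data: bytes) -> Optional[Tuple[str, str]]:
    """Return (registry key, format name) for the first matching signature, else None."""
    for magic, offset, key, name in SIGNATURES:
        if data[offset:offset + len(magic)] == magic:
            return key, name
    return None


def identify_file(path: str, probe_size: int = 64) -> Optional[Tuple[str, str]]:
    """Read only the first bytes of a file and match them against the registry."""
    with open(path, "rb") as handle:
        return identify(handle.read(probe_size))


if __name__ == "__main__":
    print(identify(b"%PDF-1.4\n"))
    print(identify(b"plain text with no known signature"))
```

Returning a registry key alongside the format name mirrors the idea in the description above: the identification result is a pointer into a registry where fuller technical information about the format can be looked up.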