Hawarp is a set of tools for processing web archive data using the Hadoop framework. The tools are available as command-line applications, each with its own purpose, documentation, and usage.

Entity Registry Model Repository (ERMR)

ERMR is an open-source middleware component that provides a long-term data preservation environment for managing large collections of scientific data. These collections are replicated across different research projects and may form the basis of international collaborations.