Digital Library Consulting: Making digital libraries easy

Archive for August, 2009

[Article] What’s METS/ALTO and should you care?

Monday, August 17th, 2009

As technologists we always get excited by digital library standards like METS/ALTO. What is it and does it matter to library professionals?

Over the past ten years Digital Library Consulting has built lots of digital collections, and those collections have been built from a wide range of digital objects. For example, we’ve built collections from PDF files, Microsoft Word documents, digital images, text files, HTML files, from digital images with associated text/HTML/PDF files, and from objects using various standards like TEI. In addition, many of the collections we’ve built have had existing metadata in Excel spreadsheets, XML and binary MARC formats, Microsoft Word files, and a whole range of different database formats.

It’s obviously much more complex and time-consuming to build a digital library from objects in formats we’ve not used before than it is to do so from some kind of “standard” format that we already have software to support. In many cases this can’t be helped – the digital files already exist, and the digital library software used to present them simply must be adapted to suit.

In the case of “new” digitization projects, where physical documents are being turned into new digital objects, any number of different digital formats can be selected. We regularly work with those planning new digitization projects to help them select the most appropriate format, and for textual documents we most often recommend METS/ALTO.

The Metadata Encoding and Transmission Standard (METS) has been around for some time, and is a standard with which many library professionals will be familiar. It’s a standard for encoding descriptive, administrative, and structural metadata regarding objects within a digital library, using XML. The METS standard is maintained by the Library of Congress, and is developed as an initiative of the Digital Library Federation.
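
To make the three kinds of metadata a little more concrete, here is a minimal sketch in Python of how a METS file ties them together. The sample document, its file names, and the pages_with_files helper are all hypothetical and invented for illustration; a real METS file carries far more detail. The sketch assumes the standard METS namespace and reads the structural map to list the files attached to each page.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"

# Hypothetical, heavily trimmed METS document for a one-page issue.
# dmdSec = descriptive metadata, amdSec = administrative metadata,
# fileSec = inventory of files, structMap = structural metadata.
SAMPLE_METS = """<mets xmlns="http://www.loc.gov/METS/"
      xmlns:xlink="http://www.w3.org/1999/xlink">
  <dmdSec ID="DMD1">
    <mdWrap MDTYPE="DC">
      <xmlData>
        <title xmlns="http://purl.org/dc/elements/1.1/">Example Gazette</title>
      </xmlData>
    </mdWrap>
  </dmdSec>
  <amdSec ID="AMD1"/>
  <fileSec>
    <fileGrp USE="image">
      <file ID="IMG1" MIMETYPE="image/tiff">
        <FLocat LOCTYPE="URL" xlink:href="page1.tif"/>
      </file>
    </fileGrp>
    <fileGrp USE="alto">
      <file ID="ALTO1" MIMETYPE="text/xml">
        <FLocat LOCTYPE="URL" xlink:href="page1.alto.xml"/>
      </file>
    </fileGrp>
  </fileSec>
  <structMap TYPE="PHYSICAL">
    <div TYPE="issue" DMDID="DMD1">
      <div TYPE="page" ORDER="1">
        <fptr FILEID="IMG1"/>
        <fptr FILEID="ALTO1"/>
      </div>
    </div>
  </structMap>
</mets>
"""


def pages_with_files(mets_xml):
    """Return (page order, [file locations]) for each page in the structMap."""
    root = ET.fromstring(mets_xml)
    ns = {"m": METS_NS}

    # Map every file ID in the file section to its recorded location.
    locations = {}
    for file_el in root.iterfind(".//m:fileSec//m:file", ns):
        flocat = file_el.find("m:FLocat", ns)
        locations[file_el.get("ID")] = flocat.get(f"{{{XLINK_NS}}}href")

    # Walk the page divisions and resolve their file pointers.
    pages = []
    for div in root.iterfind(".//m:structMap//m:div[@TYPE='page']", ns):
        files = [locations[ptr.get("FILEID")] for ptr in div.findall("m:fptr", ns)]
        pages.append((int(div.get("ORDER")), files))
    return pages


if __name__ == "__main__":
    for order, files in pages_with_files(SAMPLE_METS):
        print(order, files)
```

The point of the sketch is the indirection: the structural map never names a file directly, it points at IDs in the file section, which is what lets one METS document describe images, text and other derivatives of the same page side by side.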

While METS is great at describing the structure of a digital object, it’s missing the ability to describe the content and layout of each piece of the digital object. For that we need an extension to METS called ALTO (Analyzed Layout and Text Object). This combination of METS and ALTO was originally developed by the METAe project, and was later adopted by the Library of Congress for their large-scale National Digital Newspaper Program (NDNP). Since then METS/ALTO has been used for many large national newspaper digitization projects, as well as a number of projects digitizing books and journals.

METS/ALTO provides extremely rich digital objects, which in turn allow rich digital library interfaces to be built. For example, a typical METS/ALTO object encodes not only the complete logical and physical structure of a document (i.e. chapters, sections, articles, pages, etc., and their associated metadata), but also the full-text content of each section of the document and even the physical coordinates of every word in the document! The impact of this on the user’s search experience can be quite significant: for instance, an interface can highlight each search hit at its exact position on the page image. Additionally, it doesn’t typically cost any more to digitize materials to METS/ALTO than to formats like HTML, which contain much less information.
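
As a rough illustration of what word coordinates make possible, the Python sketch below reads a tiny ALTO fragment and returns both the plain text of the page and the bounding box of every word matching a search term. The sample page and the two helper functions are hypothetical, and the sketch assumes the ALTO version 2 namespace; files produced under other ALTO versions declare a different namespace, and the units of the coordinates depend on what each file declares.

```python
import xml.etree.ElementTree as ET

# Assumption: ALTO v2 namespace. Other ALTO versions use a different URI.
ALTO_NS = "http://www.loc.gov/standards/alto/ns-v2#"

# Hypothetical two-line page. Each String element is one word, with its
# position (HPOS, VPOS) and size (WIDTH, HEIGHT) on the page.
SAMPLE_ALTO = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout>
    <Page ID="P1" WIDTH="2000" HEIGHT="3000">
      <PrintSpace>
        <TextBlock ID="TB1">
          <TextLine>
            <String CONTENT="Digital" HPOS="100" VPOS="200" WIDTH="180" HEIGHT="40"/>
            <SP/>
            <String CONTENT="library" HPOS="300" VPOS="200" WIDTH="170" HEIGHT="40"/>
          </TextLine>
          <TextLine>
            <String CONTENT="Library" HPOS="100" VPOS="260" WIDTH="175" HEIGHT="40"/>
            <SP/>
            <String CONTENT="standards" HPOS="295" VPOS="260" WIDTH="230" HEIGHT="40"/>
          </TextLine>
        </TextBlock>
      </PrintSpace>
    </Page>
  </Layout>
</alto>
"""


def page_text(alto_xml):
    """Rebuild the plain text of a page, one output line per TextLine."""
    root = ET.fromstring(alto_xml)
    lines = []
    for line in root.iter(f"{{{ALTO_NS}}}TextLine"):
        words = [s.get("CONTENT") for s in line.iter(f"{{{ALTO_NS}}}String")]
        lines.append(" ".join(words))
    return "\n".join(lines)


def find_word_boxes(alto_xml, term):
    """Return (hpos, vpos, width, height) for each word equal to term,
    ignoring case. A viewer could draw these boxes over the page image."""
    root = ET.fromstring(alto_xml)
    wanted = term.casefold()
    boxes = []
    for s in root.iter(f"{{{ALTO_NS}}}String"):
        if s.get("CONTENT", "").casefold() == wanted:
            boxes.append(
                tuple(float(s.get(k)) for k in ("HPOS", "VPOS", "WIDTH", "HEIGHT"))
            )
    return boxes


if __name__ == "__main__":
    print(page_text(SAMPLE_ALTO))
    print(find_word_boxes(SAMPLE_ALTO, "library"))
```

Exact, case-insensitive matching keeps the sketch short; a production search interface would normally find hits through a full-text index and use the ALTO coordinates only to place the highlights.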

Digital Library Consulting has completed several projects using the METS/ALTO standard. One example is the National Library of New Zealand’s Papers Past project, which contains approximately 1.2 million newspaper pages, or around 7 million individually searchable and viewable newspaper articles. We’ve also completed a project based on the standard for Cornell University Library, and are working on a major project with the National Library of Singapore.

If you would like to discuss the implications of using METS/ALTO in your digital collection projects please contact us at contact@dlconsulting.com.


DIGITAL LIBRARY INNOVATORS ANNOUNCE ALLIANCE

Thursday, August 13th, 2009

New Zealand-based Digital Library Consulting joins forces with leading German company CCS.

August 2009. Hamilton, New Zealand. – The world’s biggest libraries and other institutions will find it easier to open their collections to the world thanks to an agreement announced today between New Zealand’s Digital Library Consulting and Content Conversion Specialists (CCS) of Germany.

“CCS is a world leader in the field of digitizing library and other collections so this is a very exciting development for Digital Library Consulting,” said the company’s founder and managing director Stefan Boddie.

“Having worked with us on a number of international projects, with organizations like the National Library Board of Singapore and New Zealand’s National Library, CCS have seen real value in our Veridian software product which provides the online interface for users to search and view items in a digital collection.”

Digital Library Consulting’s Veridian software has been used for collections including those at the National Libraries of Singapore, Luxembourg, and New Zealand, and libraries at Princeton and Cornell Universities.

Veridian provides the interface used to search and view the National Library of New Zealand’s popular online newspaper archive Papers Past http://paperspast.natlib.govt.nz.

Mr Boddie said the agreement would see CCS using Digital Library Consulting’s Veridian product in major digitization projects in Europe, Asia and America.

“Given CCS have worked with other companies in this field, their decision to use Veridian is evidence we have a product that is very competitive on a world stage,” said Mr Boddie.

There is growing worldwide demand for digitization services as libraries and other institutions seek to preserve valuable collections digitally, said Mr Boddie. “Of course these collections are of little value unless people can access, search and view them easily – which is what software does, enabling these collections to be opened to the world.”

Richard Helle, Managing Director at CCS, said his company were looking forward to collaborating with Digital Library Consulting. “Our clients digitize their stock with precision and great commitment. Veridian helps them to make this effort visible and the collections usable.”

“We are delighted that this technology – proven successful in large scale projects – is now available to a much larger market.”

About Digital Library Consulting

Based in Hamilton, New Zealand, Digital Library Consulting are experts in the field of building digital collections. Their Veridian software product enables the collection, creation and distribution of digital libraries. They have worked with a number of leading tertiary institutions and libraries throughout the US, New Zealand, Africa and the Pacific.

Established in 2002, Digital Library Consulting has eight staff and is privately held.

www.dlconsulting.com

About CCS

CCS is a pioneer and business leader in making information available through digitization. Founded in 1976, CCS connects the digitization elements (capture, conversion, presentation and storage) into a smooth, automated, quality-secured and economic production process. CCS (Content Conversion Specialists) is a privately owned company headquartered in Hamburg, Germany.

www.content-conversion.com

For information contact:

Stefan Boddie, Managing Director, Digital Library Consulting

Telephone +64 7 857 0830

E-mail stefan@dlconsulting.com

