DLConsulting recently received a grant from the Foundation for Research, Science & Technology to further improve Greenstone’s performance when scaled up to very large collections. Of particular interest are scalability issues caused by large collections of uncorrected OCRed text (e.g. digitized newspaper collections). As part of this research we are testing and benchmarking the performance of a number of different search engine and metadata database options, as well as improving Greenstone’s ability to distribute a collection across multiple servers. The research grant runs until April 2008.