Daily Archives: October 20, 2011

Peeling back the layer of OCR


When the National Library of New Zealand first put its collection of digitized newspapers online, it offered them as images alone, without full-text search. The site was popular, but not perfect. To improve it, NLNZ ran the collection through OCR, which attempts to extract the letters, words, and sentences from the images so […]

Posted in Crowdsourcing