Peeling back the layer of OCR
When the National Library of New Zealand (NLNZ) first put its collection of digitized newspapers online, it did so as images alone, without full-text search. The site was popular, but not perfect. To improve it, NLNZ ran its collection through optical character recognition (OCR), which attempts to extract the letters, words, and sentences from the images so [...]