Digitization Workflows: Scanning, OCR, and Audio Transcription

Converting documents, text, images, and sound files to digital or machine-readable formats is a prerequisite for many digital humanities projects. Digitization is the process of capturing analog materials as digital images. Optical Character Recognition (OCR) programs “read” these images and convert them to text documents that can be easily searched, copied, edited, or used for…