Digitization Workflows: Scanning, OCR, and Audio Transcription

Converting documents, text, images, and sound files to digital or machine-readable formats is a prerequisite for many digital humanities projects. Digitization is the process of capturing analog materials as digital images. Optical Character Recognition (OCR) programs “read” these images and convert them to text documents, which can then be searched, copied, edited, or used for computational text analysis. Transcription is the process of converting the spoken content of audio or video files into text. Explore more tools for these tasks in the ‘Capture’ category on the DiRT Directory.
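As a rough illustration of what "searchable" text involves in practice, OCR output often needs light cleanup before a search works reliably, since words split across line breaks will not match. The following is a minimal sketch using only the Python standard library. It assumes the OCR text is already available as a string; the function names and the sample text are hypothetical and do not come from any particular OCR tool.

```python
import re


def clean_ocr_text(raw: str) -> str:
    """Rejoin words hyphenated across line breaks and collapse whitespace."""
    # "Digi-\ntization" -> "Digitization". Assumption: a hyphen at a line end
    # marks a split word, so genuine hyphenated compounds that happen to break
    # at a line end will also be merged.
    joined = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", raw)
    # Collapse remaining runs of whitespace (including newlines) to one space.
    return re.sub(r"\s+", " ", joined).strip()


def find_term(text: str, term: str) -> list[int]:
    """Return the character offset of every case-insensitive match of term."""
    pattern = re.escape(term)
    return [m.start() for m in re.finditer(pattern, text, flags=re.IGNORECASE)]


# Hypothetical OCR output with a word split across a line break.
sample = "Digi-\ntization captures analog\nmaterials as digital images."
cleaned = clean_ocr_text(sample)
offsets = find_term(cleaned, "digital")
```

Here `cleaned` holds the text as a single line with the split word rejoined, and `offsets` lists where the search term occurs. Real OCR output usually has further problems, such as misrecognized characters, that this sketch does not address.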

http://digitalhumanities.berkeley.edu/resources/digitization-workflows-scanning-ocr-and-audio-transcription
