Distributed proofreaders

Thursday 13 May 2004

In one of the comments to my entry about Read Print from Tuesday, “Blues” suggested trying out PG Distributed Proofreaders, and I did. It’s a fascinating web artefact. They’ve solved the problem of how to accomplish the labor-intensive job of proofing and correcting the OCR scans of books.

The site is a web application for handing out units of work and getting back results. They have over 11,700 people signed up to proof pages, and they are proofing 6,200 pages a day. You sign up on the site, then log in to proof pages. You are presented with a scan of a page and the text as produced by the OCR software. Your job is simply to compare the two and make corrections. Mostly it seems to come down to re-joining hyphenated words (why can’t OCR software do that itself?). All they ask is that you proof one page a day.
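As for why OCR software can't simply re-join hyphenated words itself, here is a minimal sketch of my guess at the difficulty. It is not how Distributed Proofreaders or any OCR package actually works. The function name `rejoin_hyphenated` and the `known_words` set are hypothetical, made up for illustration. The idea: a hyphen at the end of a line might be a line-break artifact ("proof-" / "reading") or a real hyphen ("well-" / "known"), and the only way to tell is to check whether the joined form is a word you recognize.

```python
import re

# A word fragment ending in a hyphen at the end of a line, followed by
# the continuation at the start of the next line.
LINE_BREAK_HYPHEN = re.compile(r"(\w+)-\n(\w+)")


def rejoin_hyphenated(text, known_words):
    """Join words split across a line break, but only when the joined
    form appears in known_words (a hypothetical dictionary supplied by
    the caller). Anything else is left untouched for a human to check.
    Note that joining also removes the line break at that point.
    """
    def fix(match):
        joined = match.group(1) + match.group(2)
        if joined.lower() in known_words:
            return joined
        # Possibly a genuine hyphen ("well-known"): leave it alone.
        return match.group(0)

    return LINE_BREAK_HYPHEN.sub(fix, text)


if __name__ == "__main__":
    words = {"proofreading"}
    print(rejoin_hyphenated("proof-\nreading pages", words))
    print(rejoin_hyphenated("a well-\nknown book", words))
```

With a small word list, the first call joins the word and the second leaves the text as it was, which is roughly the judgment call a human proofer makes on each page.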

It’s a cool way to provide a little bit of labor for a noble cause: the dissemination of public domain information electronically.

Comments

Read Print also does the same thing... they have a bunch of volunteers who proof and format all the works. I have asked them earlier about this... they said they are currently working on a system whereby visitors would also be able to contribute.
