How to Convert Scanned Documents to Word - The Happy Android

If you need to digitize a book in text format, you may have several questions. Can it be done? Is the quality any good? Not only can it be done, but there are also several ways to convert a scanned document to Word. Let's see:

  • Scanning the document in PDF format and then editing it with Adobe Acrobat XI Pro to save it in Word format. The Pro version of Acrobat is paid, but a free 30-day trial is available.
  • Using the OnlineOCR.net website. This web application converts PDF, JPG, TIFF and GIF documents to Word, Excel and plain text. The free version converts up to 15 pages per hour, and each document can be no more than one page long.
  • Scanning the document with the scanner's own OCR option and saving it as text. We can then open the file in Word and edit it or save it in .doc format.
  • Using an optical character recognition program:
    • VueScan (for Windows, Mac OS X, and Linux)
    • Kooka (for Linux)
    • Office Lens (for Android and iOS)
    • CamScanner (for Android and iOS)

Arguably the most efficient option is Adobe Acrobat Pro, but only if the scan is very clean and of high quality. Optical character recognition applications have come a long way, but they still have trouble with formatting such as bold or italics, and depending on the typeface of the original document some words may be transcribed incorrectly.

(Image: try scanning a document like this one and converting it to Word to see what happens.)

From the scanner itself

Some scanners include an Optical Character Recognition (OCR) feature in their own scanning software. To scan a document to text, look in the output format settings for an option that mentions OCR or something similar (the exact name depends on the brand and model of the scanner).
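Text saved straight from a scanner's OCR option usually keeps the line breaks of the printed page, which makes it awkward to edit in Word. The following is a minimal sketch of one way to tidy such a file before opening it. It assumes the OCR output is plain text with blank lines between paragraphs; the function name `tidy_ocr_text` is hypothetical and not part of any scanner software.

```python
import re


def tidy_ocr_text(raw: str) -> str:
    """Reflow raw OCR text into one line per paragraph."""
    # Normalize Windows and old Mac line endings to "\n".
    text = raw.replace("\r\n", "\n").replace("\r", "\n")

    # Rejoin words that were hyphenated across a line break,
    # e.g. "recog-\nnition" becomes "recognition".
    # Caveat: this also joins genuine compounds split at the hyphen
    # ("well-\nknown" becomes "wellknown"), so review the result.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)

    # Treat one or more blank lines as a paragraph boundary.
    paragraphs = re.split(r"\n\s*\n", text)

    cleaned = []
    for paragraph in paragraphs:
        # Collapse remaining line breaks and repeated spaces.
        flat = " ".join(paragraph.split())
        if flat:
            cleaned.append(flat)

    return "\n\n".join(cleaned)


if __name__ == "__main__":
    sample = "Optical character recog-\nnition has come\na long way.\n\n\nSecond   paragraph."
    print(tidy_ocr_text(sample))
```

The result can be pasted into Word or saved as a .txt file and opened from there. Formatting such as bold or italics is not recovered by this step.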

Go from PDF to Word with Adobe Acrobat XI Pro

Once Adobe Acrobat XI Pro is installed (a 30-day free trial is available), open the scanned PDF and go to “Tools -> Text recognition -> In this file”.

In the “Recognize text” window, click “Edit” and choose the text language, output style and resolution.

Next, go back to “Tools -> Content editing -> Edit text and images” and correct any words that were misrecognized. To finish, click “File -> Save as” and save the file in Word format.

OnlineOCR

OnlineOCR is a web application that converts images or PDFs to Word, and it is very easy to use. Here is how it works: go to //www.onlineocr.net/ and click “Select file”. Select the scanned document, then choose the language and the output format from the two drop-down menus in the center of the screen.

To finish, click “Convert”. A plain text preview will appear just below, which you can edit if you need to correct any words. Finally, click “Download Output File” to download the file in Word format. Here is an example of a PDF converted to Word with OnlineOCR:

  • Original PDF:

  • Converted document:

If this web application does not satisfy you, you can try other similar alternatives such as FreeOCR or Free-Online-OCR.
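Before committing to the free tier for a whole book, it helps to estimate how long the hourly quota will stretch the job. The sketch below only does the arithmetic for the limit quoted earlier in this article (15 pages per hour, one page per document). The function name `hourly_batches` is hypothetical, and the service's limits may have changed since this was written.

```python
import math


def hourly_batches(total_pages: int, pages_per_hour: int = 15) -> int:
    """Number of one-hour quota windows needed to convert every page."""
    if total_pages < 0:
        raise ValueError("total_pages cannot be negative")
    if pages_per_hour <= 0:
        raise ValueError("pages_per_hour must be positive")
    # Round up: a partly used hour still counts as a window.
    return math.ceil(total_pages / pages_per_hour)


if __name__ == "__main__":
    for pages in (10, 15, 16, 300):
        print(pages, "pages ->", hourly_batches(pages), "hourly batch(es)")
```

Under these assumptions a 300-page book needs 20 hourly batches, and because each document is limited to one page it also means 300 separate uploads, which is a good argument for a desktop tool on longer jobs.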

Optical Character Recognition (OCR) Programs

If you would rather not process your documents online and need a desktop application, you can use a program such as VueScan (which is available for Mac and Linux as well as the ubiquitous Windows).

Another possibility is to use your Android or iOS device to scan the document and convert it to text directly. Applications such as Office Lens (for Android and iOS) and CamScanner (for Android and iOS) carry out the entire process within the app. In these cases it is advisable to clean up the image before converting it to text. This method is covered in more detail in a separate post.
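One common meaning of "cleaning" an image before OCR is binarization: turning a grayish photo into pure black text on a pure white background. The sketch below illustrates the idea with a simple global threshold on a grid of grayscale values (0 = black, 255 = white). It is an assumption about what cleaning might involve, not a description of what Office Lens or CamScanner do internally, and the function name `binarize` is hypothetical.

```python
def binarize(gray, threshold=None):
    """Map each grayscale pixel (0-255) to pure black (0) or pure white (255).

    gray is a list of rows, each row a list of integers.
    If no threshold is given, the mean brightness of the image is used.
    """
    pixels = [p for row in gray for p in row]
    if not pixels:
        # Nothing to process: return empty rows of the same count.
        return [[] for _ in gray]

    if threshold is None:
        threshold = sum(pixels) / len(pixels)

    # Pixels brighter than the threshold become paper, the rest become ink.
    return [[255 if p > threshold else 0 for p in row] for row in gray]


if __name__ == "__main__":
    photo = [
        [200, 210, 60],
        [190, 50, 220],
    ]
    for row in binarize(photo):
        print(row)
```

A single global threshold works on evenly lit pages. Photos with shadows usually need a threshold that adapts to each region, which is what dedicated scanning apps tend to offer.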

In my opinion, optical character recognition, although it has improved a lot in recent years, is still light years away from being perfect. There are still plenty of flaws: many words are "transcribed" with wrong letters, and stray symbols litter the text. It still lacks the extra intelligence needed to see that «t &! $ olog1a» cannot be a valid reading of any word in a text. I still do not see reading comprehension, only a simple visual recognition of individual letters that form words without relating them to the rest of the text. However, I am convinced that the moment when we make that last great leap is getting closer every day.
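Until OCR engines make that leap, a small script can at least point out the fragments that are obviously not words, so they are easier to find and fix by hand. This is a rough sketch of such a check, assuming that a valid token is either letters only (with inner apostrophes or hyphens) or a plain number. The function name `suspicious_tokens` and the rules themselves are illustrative, and they will flag some legitimate text such as abbreviations with inner periods.

```python
import re
import string

# Letters only, optionally joined by an apostrophe or hyphen ("don't", "well-known").
WORD = re.compile(r"[^\W\d_]+(?:['’-][^\W\d_]+)*")
# Plain numbers, optionally with separators or a percent sign ("2019", "3.5", "12%").
NUMBER = re.compile(r"\d+(?:[.,:/-]\d+)*%?")
# Punctuation that may legitimately surround a word.
EDGE_CHARS = string.punctuation + "«»“”‘’…"


def suspicious_tokens(text: str) -> list:
    """Return whitespace-separated tokens that do not look like words or numbers."""
    flagged = []
    for token in text.split():
        core = token.strip(EDGE_CHARS)
        if not core:
            # Pure punctuation: a single mark or a repeated one ("...") is fine,
            # a mix of different symbols ("&!") is probably OCR noise.
            if len(set(token)) > 1:
                flagged.append(token)
            continue
        if WORD.fullmatch(core) or NUMBER.fullmatch(core):
            continue
        flagged.append(token)
    return flagged


if __name__ == "__main__":
    print(suspicious_tokens("the t &! $ olog1a of 2019."))
```

A check like this only highlights candidates. It has no idea what the word should have been, which is exactly the missing comprehension described above.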
