Converting scanned documents into "Word" documents?

I just searched Google for how to convert a scanned document (a typescript) into a document with recognized, editable characters, just like any other Word document. But of course I forgot that I am using Ubuntu and not Windows. Is it still possible to do the same on Ubuntu? I would really appreciate any help.

Thank you.

2 Answers

Tesseract is one option that worked great for me!

I used it as follows:

Install it, if you don't already have it, with:

 sudo apt-get install tesseract-ocr

Then:

Convert the scanned .JPG file to .tif (this is the format Tesseract requires). This is done with ImageMagick as follows:

 convert foo.JPG foo.tif

Now simply let Tesseract do its magic (this will save the output to foo.txt):

 tesseract foo.tif foo
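If you do this often, the two steps can be wrapped in a small shell function. This is only a sketch: the name ocr_image is made up for illustration, and it assumes both convert (ImageMagick) and tesseract are on your PATH.

```shell
#!/bin/bash
# ocr_image: hypothetical helper, not part of Tesseract or ImageMagick.
# Converts one scanned image to TIFF, then runs Tesseract on the TIFF.
ocr_image() {
    local src="$1"
    local base="${src%.*}"                      # strip the extension: foo.JPG -> foo
    convert "$src" "$base.tif" || return 1      # step 1: image -> TIFF
    tesseract "$base.tif" "$base" || return 1   # step 2: TIFF -> $base.txt
}
```

Calling `ocr_image foo.JPG` should leave foo.tif and foo.txt next to the original scan.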

I recently had to convert an old 36-page manual to something digital. I whipped up a Bash script to do it.

Code here:

#!/bin/bash
# makeDoc.sh
# Turn a set of scanned JPG pages into a single document file.
# Requires the ImageMagick and Tesseract packages.
# Author: Fred Fury
echo "makeDoc.sh"
echo "Convert a set of scanned JPG pages into a single document file."
echo "Starting up..."
for i in {01..36}
do
    echo "converting $i.JPG to $i.tif..."
    convert "$i.JPG" "$i.tif"            # Convert the file to a Tesseract-usable format
    tesseract "$i.tif" "$i" &>/dev/null  # Convert the tif to txt (writes $i.txt), hiding Tesseract's messages
done
echo "Merging files into Output.doc"
cat *.txt > Output.doc # Merge all the generated txt files into a single file
echo "Done."
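
The script above is hard-coded for 36 pages named 01.JPG to 36.JPG. A variation that picks up whatever .JPG files are in the current directory might look like the sketch below. The function name make_doc and the default output name are my own choices for illustration, and it assumes the page filenames sort in reading order (for example 01.JPG, 02.JPG, ...).

```shell
#!/bin/bash
# make_doc: hypothetical generalisation of makeDoc.sh (names are illustrative).
# OCRs every .JPG in the current directory, in sorted filename order,
# and appends each page's text to a single output file.
make_doc() {
    local out="${1:-Output.txt}"
    local img base
    : > "$out"                                  # start with an empty output file
    for img in *.JPG; do
        [ -e "$img" ] || continue               # no .JPG files: skip the literal pattern
        base="${img%.*}"                        # 01.JPG -> 01
        convert "$img" "$base.tif" || return 1
        tesseract "$base.tif" "$base" >/dev/null 2>&1 || return 1
        cat "$base.txt" >> "$out"               # append this page's text
    done
}
```

Run it as `make_doc manual.txt` from the directory holding the scans. Note that the merged file is plain text whatever extension you give it.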

Also check out this page for some other solutions: "What's the best, simplest OCR solution?" This is where I found Tesseract.

Hope that helps!

I had a similar problem to this a while ago. Try uploading the file to online-convert.com. It will take a while, but the webapp can handle just about any format. Good luck!
