suitable OCR
Thread poster: telefpro
telefpro
Local time: 15:24
Portuguese > English
Sep 18, 2008

I have a document that was typewritten in 1957 and has since been converted to PDF. The printouts cannot be read. Is there any way to make this document more readable, using some OCR software? It is in French. Please advise.

 
Martin Skara, PhD.
Slovakia
Local time: 10:54
French > Slovak
Only ABBYY FineReader Sep 18, 2008

is the best solution for PDF to DOC/RTF conversions.

https://abbyy.asknet.com/cgi-bin/dlreg/ml=EN?ID=FRP9DEMOM


Good luck
Martin


 
mediamatrix (X)
Local time: 06:54
Spanish > English
The world's top OCR system ... Sep 18, 2008

telefpro wrote:

I have a document that was typewritten in 1957 and has since been converted to PDF. The printouts cannot be read. Is there any way to make this document more readable, using some OCR software? It is in French. Please advise.


... is the human eye.

If you can't read the document, then no OCR software will be able to either.

MediaMatrix


 
esperantisto
Local time: 12:54
Member (2006)
English > Russian
Precisely! Sep 18, 2008

mediamatrix wrote:
If you can't read the document, then no OCR software will be able to either.


Absolutely right!

In theory, though, one could convert the PDF into a set of image files such as TIFF and experiment with gamma/color correction. But that's alchemy, not an exact science.
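
For what it's worth, the gamma step itself is simple arithmetic. A minimal sketch, assuming the PDF page has already been exported to an 8-bit grayscale image and its pixels loaded as a plain list of integers (0 = black, 255 = white). The function name `gamma_correct` is made up for illustration, and the export/load step is not shown:

```python
def gamma_correct(pixels, gamma):
    """Apply gamma correction to 8-bit grayscale values (0 = black, 255 = white).

    gamma > 1 darkens the mid-tones, which can make faded ink stand out
    against the paper; gamma < 1 lightens them.
    """
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    # Precompute the mapping once for all 256 possible gray levels.
    table = [round(255 * (v / 255) ** gamma) for v in range(256)]
    return [table[p] for p in pixels]


if __name__ == "__main__":
    # Hypothetical sample: black, faded ink, white paper.
    print(gamma_correct([0, 180, 255], 2.0))
```

With gamma = 2.0, faded ink at gray level 180 drops to 127 while white paper stays at 255, so the gap between ink and paper widens. Which gamma value suits a given scan is still trial and error.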


 
Jack Doughty
United Kingdom
Local time: 09:54
Russian > English
In memoriam
Zoom it? Sep 18, 2008

With Adobe Acrobat, or even just Adobe Acrobat Reader, you can zoom in on the page to show the details a lot larger. If this doesn't help, I have no idea what else would.

 
Anna Villegas
Mexico
Local time: 03:54
English > Spanish
This should do the trick Sep 18, 2008

Though it's in Spanish, you'll certainly be able to understand if you have Office 2003. Click on the link below.

You have your own OCR


 
Viktoria Gimbe
Canada
Local time: 04:54
English > French
I beg to differ Sep 18, 2008

mediamatrix wrote:

If you can't read the document, then no OCR software will be able to either.


I have successfully displayed on screen text that didn't even seem to be there. My OCR software is OmniPage, and it has a built-in function to enhance images before running recognition on them (if you know what you are doing, you can also do this with graphics programs like Photoshop).

There are documents that cannot be read by the human eye that can be made readable with the help of software. If I were in telefpro's situation, I would try it.
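
The kind of enhancement meant here can be as simple as stretching the contrast before recognition. A rough sketch, not what OmniPage or Photoshop actually do internally: it assumes the page is available as a list of 8-bit grayscale values, and `stretch_contrast` and its percentile parameters are hypothetical names chosen for illustration.

```python
def stretch_contrast(pixels, low_pct=2, high_pct=98):
    """Linearly stretch 8-bit grayscale values.

    The gray levels found at the low and high percentiles are mapped to
    0 and 255; everything outside that range is clipped.
    """
    if not pixels:
        return []
    ordered = sorted(pixels)
    last = len(ordered) - 1
    lo = ordered[last * low_pct // 100]
    hi = ordered[last * high_pct // 100]
    if hi == lo:
        return list(pixels)  # flat image, nothing to stretch
    scale = 255 / (hi - lo)
    return [min(255, max(0, round((p - lo) * scale))) for p in pixels]


if __name__ == "__main__":
    # Hypothetical faint gray-on-gray sample, stretched over the full range.
    print(stretch_contrast([100, 110, 140], low_pct=0, high_pct=100))
```

Faint text whose gray levels sit between 100 and 140, for example, gets spread across the full 0 to 255 range, which is often enough to make it visible on screen.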

[Edited at 2008-09-19 15:22]


 
mediamatrix (X)
Local time: 06:54
Spanish > English
QA? Sep 18, 2008

Viktoria Gimbe wrote:

There are documents that cannot be read by the human eye that can be made readable with the help of software. If I were in telefpro's situation, I would try it.


And how do you propose that telefpro should go about validating the OCR output? Even with texts that are easily human-readable, no OCR software is ever 100% accurate. If telefpro can't read the source text, is it reasonable to assume that the output from image-enhanced OCR will, by some miracle, be 100% reliable on this particular occasion?

Telefpro is up against a fundamental law of entropy here.

MediaMatrix


 
Viktoria Gimbe
Canada
Local time: 04:54
English > French
That's the part his/her intelligence is needed for Sep 18, 2008

mediamatrix wrote:

And how do you propose that telefpro should go about validating the OCR output?


Well, s/he can read, right? Once s/he gets the OCR output, s/he can read through it to decide whether it is good enough to be processed further. Isn't that what we should be doing even with texts that are already in an editable format?

The point of telefpro's question is to make the text readable by the human eye, not to turn it into editable text (although s/he may ultimately be interested in that as well). What other method do you propose? I don't know of any other solution. It's a matter of using technology to enhance what is humanly feasible.

In some cases, like the present one, technology can go much farther than the human brain - although this is usually not the case.


 

