Problems with OCR and small text
Thread poster: James Greenfield
James Greenfield
United Kingdom
Member (2013)
French to English
Nov 29, 2015

Hi,

I am currently translating a dead PDF. I managed to OCR the document, and the results were fine apart from the bibliography section at the end, which is in very small print. The results for this section make no sense. Using ABBYY FineReader, I tried to increase the resolution, but the results were just as bad. Does anyone have any advice? This is the first time I have had this problem. Perhaps someone with ABBYY FineReader could guide me on how to increase the resolution properly. When I try to do this, the image size automatically becomes smaller and the program is still unable to recognise the text. Many thanks for any advice.
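One possible reading of "the image size automatically becomes smaller": if only the dpi value attached to the image changes and the pixels are not resampled, the same pixel grid simply maps to a smaller physical size, and the OCR engine gets no extra detail. A minimal sketch of that arithmetic follows. The page dimensions and the 6 pt font size are hypothetical, and nothing here reflects how FineReader itself handles the setting.

```python
def printed_size_inches(width_px, height_px, dpi):
    """Physical size implied by a fixed pixel grid at a given dpi value."""
    return width_px / dpi, height_px / dpi


def glyph_height_px(font_size_pt, dpi):
    """Approximate pixel height of text of a given point size (1 pt = 1/72 inch)."""
    return font_size_pt * dpi / 72


if __name__ == "__main__":
    # Hypothetical scan: 1200 x 1800 pixels, originally tagged as 150 dpi.
    print(printed_size_inches(1200, 1800, 150))
    # Same pixels re-tagged as 300 dpi: half the physical size, no new detail.
    print(printed_size_inches(1200, 1800, 300))
    # Hypothetical 6 pt bibliography text: pixels available per line of type.
    print(glyph_height_px(6, 150), glyph_height_px(6, 300))
```

The second function shows why the original scan resolution matters: the number of pixels per character is fixed when the page is scanned, not when the dpi tag is edited afterwards.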


 
Sergei Leshchinsky
Ukraine
Member (2008)
English to Russian
can you Nov 29, 2015

send me the file?

Also, if it is a raster image, then what you have is all you have.


 
James Greenfield
United Kingdom
Member (2013)
French to English
TOPIC STARTER
email Nov 29, 2015

Sergei Leshchinsky wrote:

send me the file?

Also, if it is a raster image, then what you have is all you have.


Thanks, I've just sent you an email.


 
James Greenfield
United Kingdom
Member (2013)
French to English
TOPIC STARTER
could anyone help? Nov 29, 2015

I don't suppose anyone with really powerful OCR software would be prepared to do me a massive favour? I can't manage to OCR the bibliography, which is in small text, and typing out the 64 entries by hand is going to take me a long time. Thanks very much.

 
Melissa McMahon
Australia
French to English
Not sure if post-facto solutions will help Nov 29, 2015

Hi James,

I'm not an expert, but I think if the scan of the original document was not a high enough resolution, then attempts to increase the resolution of the scan won't help, because the "raw material" is inadequate. If I take a blurry photo of something, no amount of fiddling with the sharpness or resolution of the photo will give me a clear photo. I think the only alternative to typing out the text is to get a better scan.

Good luck!
Melissa


 
James Greenfield
United Kingdom
Member (2013)
French to English
TOPIC STARTER
Thanks Nov 29, 2015

Hi Melissa,

Yes, I think that's right. This section is in English anyway, so I have decided not to include it. I thought about including it, as it is the bibliography and the French text refers to these English journals, but as you say, there is no way of increasing the resolution, and typing it out by hand would take me an awfully long time.

James


 
Anton Konashenok
Czech Republic
French to English
Do you really need to type it? Nov 30, 2015

If the list of references is already in the target language anyway, it makes sense to ask the client if they'd accept it as a pasted image instead of text. If so, you can just copy it using the Snapshot tool of Adobe Reader, then paste it into your target document.

 
esperantisto
Member (2006)
English to Russian
SITE LOCALIZER
Convert to black and white Nov 30, 2015

In my experience, increasing the resolution above 300 dpi has no noticeable effect on recognition results, even for small print. However, there is one setting (off by default) that can be useful: Tools → Options → General → More options… → Convert color/gray-scale images to black and white. (I am translating these menu items from the Russian UI of FR 8.0, so they may differ in your version.) Try turning it on.

Also, if the sections in question are in French only, do select French only as the recognition language and (re)recognize.
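For anyone curious what a "convert to black and white" step does to a scan, here is a minimal sketch using Otsu's method, a common way of picking a global threshold. This is an assumption for illustration only: the thread does not say which method FineReader uses, and the function names and the tiny sample image are hypothetical. The image is represented as a plain list of rows of gray levels (0 to 255).

```python
def otsu_threshold(pixels):
    """Pick the gray level (0-255) that best separates dark from light pixels."""
    hist = [0] * 256
    for row in pixels:
        for value in row:
            hist[value] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels at or below the candidate threshold
    sum0 = 0  # sum of their gray levels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Between-class variance (up to a constant factor).
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(pixels, threshold=None):
    """Map every pixel to pure black (0) or pure white (255)."""
    if threshold is None:
        threshold = otsu_threshold(pixels)
    return [[0 if value <= threshold else 255 for value in row] for row in pixels]


if __name__ == "__main__":
    # Hypothetical 2 x 4 patch: dark ink (40-60) on a light background (200-230).
    patch = [[40, 60, 200, 220],
             [50, 210, 230, 45]]
    print(binarize(patch))
```

The idea is that faint gray smudges around small characters are forced to either ink or paper, which can give the recognizer cleaner shapes to work with.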


 
Tom in London
United Kingdom
Member (2008)
Italian to English
No problem Nov 30, 2015

James Greenfield wrote:

Hi,

I am currently translating a dead PDF. I managed to OCR the document, and the results were fine apart from the bibliography section at the end, which is in very small print. The results for this section make no sense. Using ABBYY FineReader, I tried to increase the resolution, but the results were just as bad. Does anyone have any advice? This is the first time I have had this problem. Perhaps someone with ABBYY FineReader could guide me on how to increase the resolution properly. When I try to do this, the image size automatically becomes smaller and the program is still unable to recognise the text. Many thanks for any advice.


I don't know about you, James, but my ABBYY FineReader for macOS outputs to plain text. The resulting file can then be opened in Word and saved as a .doc file. Then you can alter the text any way you want. I do this all the time.

[Edited at 2015-11-30 07:51 GMT]


 
Rolf Keller
Germany
English to German
Enlarge the picture externally Dec 1, 2015

esperantisto wrote:

In my experience, increasing the resolution above 300 dpi has no noticeable effect on recognition results even for small print.


Ack.

Convert color/gray-scale images to black and white


Ack.

Plus plan C:
Enlarge the picture beforehand.

If need be, go to a copy shop, make an enlarged copy, try different contrast settings, etc., then scan/export the result onto a USB stick. The shop staff will help you with this.

Back in your office, OCR the file on the stick.
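If a trip to the copy shop is not possible, a rough software stand-in for "enlarge the picture beforehand" is to upscale the image before feeding it to the OCR program. The sketch below uses plain bilinear interpolation on a grayscale image stored as a list of rows; the function name and the sample data are hypothetical. Bear in mind Melissa's caveat earlier in the thread: unlike an enlarged copy made from the paper original, interpolation adds no new detail. It only gives the recognizer larger glyphs, which may or may not help.

```python
def upscale_bilinear(pixels, factor):
    """Enlarge a grayscale image (rows of 0-255 values) by an integer factor."""
    src_h, src_w = len(pixels), len(pixels[0])
    out = []
    for y in range(src_h * factor):
        # Map the centre of the output pixel back into source coordinates,
        # clamped so that edge pixels are not extrapolated.
        sy = min(max((y + 0.5) / factor - 0.5, 0.0), src_h - 1)
        y0 = int(sy)
        y1 = min(y0 + 1, src_h - 1)
        fy = sy - y0
        row = []
        for x in range(src_w * factor):
            sx = min(max((x + 0.5) / factor - 0.5, 0.0), src_w - 1)
            x0 = int(sx)
            x1 = min(x0 + 1, src_w - 1)
            fx = sx - x0
            # Blend horizontally on the two neighbouring rows, then vertically.
            top = pixels[y0][x0] * (1 - fx) + pixels[y0][x1] * fx
            bottom = pixels[y1][x0] * (1 - fx) + pixels[y1][x1] * fx
            row.append(round(top * (1 - fy) + bottom * fy))
        out.append(row)
    return out


if __name__ == "__main__":
    # Hypothetical 2 x 2 checker patch enlarged to 4 x 4.
    for line in upscale_bilinear([[0, 255], [255, 0]], 2):
        print(line)
```

Upscaling and then converting to black and white can be combined, but the order and the factor are worth experimenting with rather than taking as given.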


 

