Scanned PDF Files
Thread poster: Trevor Chichester
Trevor Chichester
United States
Local time: 08:13
Member (2012)
German to English
May 17, 2012

Good afternoon, all!

So... I was wondering, what percentage of your work each year comes from scanned PDFs?

Strangely, more and more of my translations have been from dead PDFs. Right now, I'm working on 13K worth of dead PDFs and, to be honest, it is QUITE the headache to deal with this file format.

How do you guys combat this? Do you retype the PDF? Or do you have an OCR converter?

I personally have a great OCR converter but that doesn't mean I don't have to wade through the entire file looking for errors before putting it into Trados.
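
For that error hunt, a rough filter can at least point at the likeliest trouble spots before the text goes into Trados. The sketch below is only an illustration and not a feature of any particular OCR converter: it assumes the OCR output has been saved as plain text, and the function name `flag_suspects` is made up for this example. It flags tokens that mix letters and digits, since OCR commonly confuses pairs such as l/1 and O/0.

```python
import re

# Heuristic: a token that mixes ASCII letters and digits (e.g. "c1ient")
# is often an OCR misread. Legitimate codes such as "13K" will match too,
# so this only narrows down where to look; it does not replace proofreading.
MIXED = re.compile(r"\b(?=\w*[A-Za-z])(?=\w*\d)\w+\b")


def flag_suspects(text):
    """Return (line_number, token) pairs worth a manual check."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in MIXED.finditer(line):
            hits.append((lineno, match.group()))
    return hits


if __name__ == "__main__":
    sample = "The c1ient signed in 2012.\nPage l0 of 12"
    for lineno, token in flag_suspects(sample):
        print(f"line {lineno}: {token}")
```

Pure numbers and pure words pass through untouched, so the list stays short on clean scans. Misreads that produce a real word (for example "modem" for "modern") are not caught this way.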

How do you guys deal with these files?



Cheers,

Trev


 
Paulo Eduardo - Pro Knowledge
Brazil
Local time: 09:13
Portuguese to English
have fun! May 17, 2012

www.freepdfconvert.com/

www.pdfonline.com/

www.freepdfconvert.com/pdf_converter_desktop.asp


 
Giles Watson
Italy
Local time: 13:13
Italian to English
In memoriam
Money talks May 17, 2012

Trevor Chichester wrote:

How do you guys deal with these files?



By quoting a hefty (at least 30%) premium for working with them.

In practice, though, I don't do any. The client either comes up with a viable file format or goes elsewhere. I know plenty of translators who are quite happy to deal with scanned images but I'm not one of them.


 
Nikita Kobrin
Lithuania
Local time: 14:13
Member (2010)
English to Russian
* May 17, 2012

Trevor Chichester wrote:
How do you guys deal with these files?


1) I ask the client to convert the PDF file into an editable format (MS Word) and send it to me for translation (I accept only converted files that are 100% identical to the PDF files from which they were converted).

2) If the client is not able to produce a 100% identical conversion himself, I ask my DTP operator to do the conversion. To pay for his work, I charge the client extra. It's not cheap: in difficult cases the cost of conversion may be equal to the cost of translation.

Nikita Kobrin

[Edited at 2012-05-17 20:26 GMT]


 
Anton Konashenok
Czech Republic
Local time: 13:13
French to English
Just OCR it, but do it properly May 17, 2012

Nikita, your DTP operator seems to be overcharging you by a huge factor. In my own experience, OCRing a scanned text of decent quality (maybe even a good fax) has never taken me more than 10% of the time needed for translation, and I consider it good customer relations to offer it free of charge if a steady client sends me an occasional scanned document.
There is, however, an important point to remember: never run your OCR in fully automatic mode, nor allow it to format the paragraphs for you. I'm using FineReader, defining the recognition areas by hand (selecting text or table as appropriate) and saving the results as plain text. For very clear originals, I may decide to save as formatted text instead, but delete all paragraph styles created by FineReader before doing any further work - this way, I only keep character-level formatting (font size and bold/italic/underline). Recreating the necessary paragraph format by hand takes a small fraction of the time needed to straighten out the automatically generated formatting.
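
One part of the plain-text route that lends itself to a small script is rejoining lines that the scan broke mid-paragraph. The following is a minimal sketch under stated assumptions, not part of FineReader: it assumes the saved text has one line per scanned line and blank lines between paragraphs, and the function name `rejoin_paragraphs` is hypothetical.

```python
import re


def rejoin_paragraphs(ocr_text):
    """Merge hard-wrapped OCR lines into one line per paragraph.

    Assumes blank lines separate paragraphs. That is an assumption about
    the saved file, not a documented guarantee of any OCR program.
    """
    paragraphs = []
    for block in re.split(r"\n\s*\n", ocr_text.strip()):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        merged = ""
        for ln in lines:
            if merged.endswith("-") and ln[:1].islower():
                # Likely a word hyphenated across a line break: glue it back.
                merged = merged[:-1] + ln
            elif merged:
                merged += " " + ln
            else:
                merged = ln
        paragraphs.append(merged)
    return "\n\n".join(paragraphs)


if __name__ == "__main__":
    raw = "This is a trans-\nlation test\nline two.\n\nSecond para-\ngraph."
    print(rejoin_paragraphs(raw))
```

The hyphen rule is deliberately naive: a genuine hyphenated compound that happens to break at the line end will be glued together too, so the result still needs a read-through.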


 
Nadiia and Vatslav Yehurnovy
Ukraine
Local time: 14:13
Member (2008)
English to Russian
Pricing is often NOT meant to do OCRing May 18, 2012

We also have a friend who sometimes helps with OCRing and deep DTP wizardry, but we completely agree with Nikita about charging extra per hour. Once we do, originals in Word or other editable, not pre-OCRed formats really start to appear like magic.

Well, sometimes miracles do not happen, and so the client pays per hour for re-creating the document versions from a scanned all-tables PDF with several consecutive changes of numbers in the cells.

Anton, how about a scanned 15-page document with numerous barely legible handwritten notes with arrows etc., full of tables and block diagrams?

We just gave a quote for OCRing, drawing and typing, and received a great Word file back with everything intact, in just 3 hours.


 
Rolf Keller
Germany
Local time: 13:13
English to German
Online services vs. confidentiality May 18, 2012



Using such online services might compromise confidentiality.


 

