Hidden/invisible characters in French translation
Thread poster: nana5658
nana5658
China
Aug 12, 2014

Dear All,

One of my colleagues recently finished a translation from EN into FR. After delivery, our client reported that our deliverables contain broken characters.

We compared the English file with the FR file in Beyond Compare and noticed that some hidden characters had been added in front of some of the translated strings. They were hidden/invisible in the TTX files, however.

What my client did was convert the translated TTX into XLIFF and then produce TXT files to compare and validate our translation.

According to my colleague, he didn't do any additional file conversion and didn't add any extra characters; he simply translated the file as usual, saved it and delivered it. The strange thing is that we have worked on the same file for 2 other languages, and only French had this issue.

Has anyone encountered the same issue before? Does anyone know what the cause is?

Thanks!!


 
Dominique Pivard
Finnish to French
additional information on workflow and tools being used? Aug 12, 2014

It would help if you'd mention: 1) the format of the original document, 2) the tool being used to translate (you mention TTX, which is the format of TagEditor, but TTX files can be translated with many different tools), 3) the tool being used by your client (you mention XLIFF, which could mean SDLXLIFF, i.e. the format of SDL Trados Studio, but many other tools also use some kind of XLIFF).

 
nana5658
China
TOPIC STARTER
Hidden/invisible characters in French translation Aug 12, 2014

Dominique Pivard wrote:

It would help if you'd mention: 1) the format of the original document, 2) the tool being used to translate (you mention TTX, which is the format of TagEditor, but TTX files can be translated with many different tools), 3) the tool being used by your client (you mention XLIFF, which could mean SDLXLIFF, i.e. the format of SDL Trados Studio, but many other tools also use some kind of XLIFF).

Thanks for the reply.

1) The source file we received is an XML file.
2) TagEditor
3) Our client validates our deliverables using their online system, which works more or less the way Beyond Compare does. We are required to deliver the translation in TXT format first so that the client can validate the format, characters, etc. The issue was detected during this TXT review stage, when comparing the English TXT with the French TXT: there were extra characters in front of some of the French translations.

If we copy the translation into DOC files, we can see the extra characters. The question is: where do these characters come from? Our team didn't add any manually, and they are not visible in Trados TagEditor. They can be detected by an xBench check, though; we ignored those alerts and treated them as false alarms because the characters were hidden/invisible.
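For anyone wanting to check the exported TXT outside a CAT tool, here is a minimal Python sketch that lists invisible "format" characters line by line. It is only an illustration: the function name `find_invisible` and the sample string are made up, and it assumes the text has already been read into a Python string with the correct encoding. It flags every character in Unicode category Cf, which also includes legitimate characters such as the soft hyphen, so the hits still need a human look.

```python
import unicodedata


def find_invisible(text):
    """Return (line, column, code point, name) for each Unicode format character (category Cf)."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if unicodedata.category(ch) == "Cf":
                code = f"U+{ord(ch):04X}"
                name = unicodedata.name(ch, "UNKNOWN")
                hits.append((line_no, col, code, name))
    return hits


# Hypothetical sample: the second line starts with an invisible zero width space.
sample = "Hello\n\u200bBonjour\n"
for hit in find_invisible(sample):
    print(hit)
```

Running this on both the English and the French TXT and comparing the two lists would show which strings carry extra characters and what those characters are.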

Thanks


[Edited at 2014-08-13 02:04 GMT]


 
Dominique Pivard
Finnish to French
Try a TagEditor alternative Aug 14, 2014

I guess your using TagEditor, a very old (no longer supported by SDL AFAIK) and clunky (it has the most user-unfriendly interface of all tools I've ever encountered) tool, is the reason why you're not getting replies to your questions. There are many modern, supported tools that can handle XLIFF files pretty well. My suggestion would be to use one of these instead of TagEditor. At least retranslate your already translated document (shouldn't take long with the TM) with one of these tools and see if you're getting the extra characters that are causing trouble on your client's side.


 
nana5658
China
TOPIC STARTER
It seems to be the byte order mark (BOM) Aug 21, 2014

We figured out that the invisible/hidden characters are most likely the byte order mark (BOM).

It is a Unicode character used to signal the byte order of a text file or stream. Its code point is U+FEFF.
Its Unicode character name is "ZERO WIDTH NO-BREAK SPACE".

It normally appears only as the first character of a Unicode file, and it cannot be typed directly on a keyboard.

But the puzzle is that we still don't know how it was added during the translation.
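As a rough illustration of the point above, the following Python sketch shows what U+FEFF looks like at byte level in UTF-8 and one way to remove stray copies from a string before delivery. The helper name `strip_stray_boms` and the sample strings are hypothetical, and this does not explain how the characters got into the translation; it only cleans them up.

```python
BOM = "\ufeff"  # U+FEFF, ZERO WIDTH NO-BREAK SPACE / byte order mark


def strip_stray_boms(text):
    """Remove every U+FEFF from the text and return (cleaned text, number removed)."""
    count = text.count(BOM)
    return text.replace(BOM, ""), count


# In UTF-8, U+FEFF is stored as the three bytes EF BB BF.
raw = (BOM + "Bonjour").encode("utf-8")
print(raw[:3].hex())  # efbbbf

# The "utf-8-sig" codec drops one BOM at the very start of the data only,
# so a BOM in front of a later string survives decoding.
decoded = ("Hello\n" + BOM + "Bonjour").encode("utf-8").decode("utf-8-sig")
print(BOM in decoded)  # True

cleaned, removed = strip_stray_boms(decoded)
print(removed)  # 1
```

Because a decoder only treats the first character of a file as a BOM, any U+FEFF that ends up in front of an individual string inside the file is passed through as ordinary text, which fits the symptom of extra characters in front of some translations.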


 
nana5658
China
TOPIC STARTER
It seems to be the byte order mark (BOM) Aug 21, 2014

Dominique Pivard wrote:

I guess your using TagEditor, a very old (no longer supported by SDL AFAIK) and clunky (it has the most user-unfriendly interface of all tools I've ever encountered) tool, is the reason why you're not getting replies to your questions. There are many modern, supported tools that can handle XLIFF files pretty well. My suggestion would be to use one of these instead of TagEditor. At least retranslate your already translated document (shouldn't take long with the TM) with one of these tools and see if you're getting the extra characters that are causing trouble on your client's side.


Thanks for the suggestion.


 

