Comprehensive list of translation memory (TM) file formats
Thread poster: ..... (X)
..... (X) Local time: 22:05
Hi all,

I was recently reading through this thread and there were mentions of "millions" and "hundreds" of formats for translation memory files. While those might have been exaggerations, I became curious to learn more about some of the more obscure formats I don't know (or possibly even mainstream formats that I am not aware of). So I decided to start this thread and see if we can come up with a comprehensive list from the community (please keep it to translation memory files only, I'll start a separate thread for terminology/termbase file formats). I'll start it off. If we get a lot of good responses from this thread I'll make a resource page with all of the formats.

Please try to match the following format:

Name:
File extension:
Type: (open or proprietary)
Link(s):

NB: for the purposes of this thread I define 'open' to mean that the specifications are published on the Internet and 'proprietary' to mean that the specifications are not published.
--------------
Name: Translation Memory eXchange
File extension: .tmx
Type: open
Link(s): Wikipedia | TMX 1.4b Specification

Name: XLIFF (XML Localisation Interchange File Format)
File extension: .xlf (.xliff also found in the wild but not compliant with the spec)
Type: open
Link(s): Wikipedia | XLIFF Version 1.2 Specification | XLIFF Version 2.0

Name: SDLXLIFF
File extension: .sdlxliff
Type: proprietary
Link(s): SDL Product Help Description

Name: SDLTM
File extension: .sdltm
Type: proprietary
Link(s): Wikipedia

Name: Wordfast Translation Memory
File extension: .txt
Type: proprietary
Note: Wordfast claims in their documentation that the format is open. It is a tab-delimited text file, but I have yet to find a specification with any more detail than the link below. If someone can point me to a true specification for the format I will change this to open.
Link(s): Wordfast Support Specifications

Name: WordFast TXML
File extension: .txml
Type: proprietary
Link(s): OmegaT Documentation Description

Name: Trados TTX
File extension: .ttx
Type: proprietary
Link(s): What's a TTX file? (ProZ forums)

Name: Trados TMW
File extension: .tmw (also includes .mdf, .mtf, .mwf, .iix as neural network files)
Type: proprietary
Link(s): Wikipedia

EDIT: Update XLIFF file extension
[Edited at 2015-10-07 10:57 GMT]

Bilingual files, project files, translation memories, termbases... | Oct 7, 2015
...it seems you want to include them all. Not that I have any problems with it, but you will indeed end up with "hundreds" of file formats. DejaVu uses Access databases for project files, translation memories (segments), and termbases. Very consistent (though it also uses a project-specific .txt file for terms). I still think that's the way to go, although MS Access may not be the best choice. On the other hand, how many viable choices did they have, late last century?

Cheers,
Hans

Jorge Payan Colombia Local time: 08:05 Member (2002) German to Spanish + ...
Not all of them are translation memory file formats ... | Oct 7, 2015
Kevin Dias wrote:
... (please keep it to translation memory files only, I'll start a separate thread for terminology/termbase file formats)...

As far as I know, TTX, XLIFF, SDLXLIFF, and TXML are formats intended for file interchange and not for translation memory (TM). Maybe you would like to open a different thread for file interchange formats ...

..... (X) Local time: 22:05 TOPIC STARTER
I intend to include those here | Oct 7, 2015
For all intents and purposes what I am referring to when I say "translation memory file formats" are bilingual file formats that contain both the source document text and the translation text (e.g. translation memory data). In other words, practically speaking: which file formats do CAT tools use to store and pass translation memory data (aligned source and target text data)? Whether intended or not, I think TTX, XLIFF, SDLXLIFF, and TXML all fit this category.
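To make "aligned source and target text data" concrete, here is a minimal sketch of pulling segment pairs out of a TMX-style document with Python's standard library. The sample content, the language codes, and the function name `aligned_pairs` are made up for illustration; real TMX files usually carry more metadata and inline markup than this handles.

```python
import xml.etree.ElementTree as ET

# Illustrative TMX-style sample; the segment text is invented for this sketch.
SAMPLE_TMX = """<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1.0" segtype="sentence"
          o-tmf="example" adminlang="en" srclang="en" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>Hello world</seg></tuv>
      <tuv xml:lang="fr"><seg>Bonjour le monde</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="en"><seg>Save file</seg></tuv>
      <tuv xml:lang="fr"><seg>Enregistrer le fichier</seg></tuv>
    </tu>
  </body>
</tmx>
"""

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def aligned_pairs(tmx_text, source_lang, target_lang):
    """Return (source, target) text pairs for each translation unit.

    Hypothetical helper: language codes are compared in lowercase,
    and units missing either language are skipped.
    """
    root = ET.fromstring(tmx_text)
    pairs = []
    for tu in root.iter("tu"):
        segs = {}
        for tuv in tu.findall("tuv"):
            lang = (tuv.get(XML_LANG) or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() flattens any inline elements inside <seg>.
                segs[lang] = "".join(seg.itertext())
        if source_lang in segs and target_lang in segs:
            pairs.append((segs[source_lang], segs[target_lang]))
    return pairs


for source, target in aligned_pairs(SAMPLE_TMX, "en", "fr"):
    print(source, "=>", target)
```

The same idea (one unit, two aligned segments) is what puts TTX, XLIFF, SDLXLIFF, and TXML in the category described above, even though their element names differ.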
Stepan Konev Russian Federation Local time: 16:05 English to Russian
For the beginning... | Oct 7, 2015
1. Across & Across personal edition (freeware)
2. Alchemy Catalyst
3. Anaphraseus (open source - based on OpenOffice macro set, so you require OpenOffice)
4. AnyMem
5. Cafetran
6. CatsCradle (for web pages)
7. Deja Vu
8. Ecco
9. Fluency Translation Suite
10. Fortis Translation Suite
11. GlobalSight
12. Glossy
13. Google Translator Kit (freeware)
14. gtranslator
15. Heartsome Translation Studio
16. IBM Translation Manager
17. Idiom
18. Logoport
19. Lokalize
20. MateCat
21. memoQ (free & pro versions)
22. Memsource
23. MetaTexis
24. MultiTrans
25. Oddjobs
26. OmegaT (freeware)
27. SDL Trados Passolo
28. SDL Trados Studio
29. SDLX
30. Similis (Freeware)
31. Smartcat
32. Snowball
33. Swordfish Translation Editor
34. Trados Workbench
35. Transit
36. WebBudget
37. Wordbee translator
38. Wordfast (free and paid versions)
39. Wordfisher (freeware)
40. Xliff editor
41. XTM

..... (X) Local time: 22:05 TOPIC STARTER
File formats - not CAT tools | Oct 7, 2015
Hi Stepan,

Thanks for your reply. It seems like that is more a list of CAT tools though - not file formats. Yes, some CAT tools will have their own proprietary format(s), but I don't think all of them in that list do. For example - does MateCat have a unique proprietary file format for translation memories?

Samuel Murray Netherlands Local time: 14:05 Member (2006) English to Afrikaans + ...
Kevin Dias wrote:
Type: proprietary
Note: Wordfast claims in their documentation that the format is open. It is a tab-delimited text file, but I have yet to find a specification with any more detail than the link below. If someone can point me to a true specification for the format I will change this to open

The guys from Virtaal also ran into this problem when they created a WF2PO filter: what's written in the "specifications" is not the whole story. Perhaps you can get hold of their filter (it's Python) to see what adjustments they made. You may have to look at old repositories, or use e-mail.

Name: WordFast TXML

Don't forget Wordfast TXLF.

Oh, and then there are the various dialects of PO, and the variations of LNG/INI type files (key=value files). Gettext PO does have an official specification, but various programs that work with PO/POT files deviate from it.

==

Kevin Dias wrote:
For all intents and purposes what I am referring to when I say "translation memory file formats" are bilingual file formats that contain both the source document text and the translation text (e.g. translation memory data).

In that case, the dialects of "uncleaned RTF" should also be included, right? I know of the Trados 2007 dialect, the Wordfast dialect and the Anaphraseus dialect (they all share nearly identical marking delimiters). Then some CAT tools have similar-looking uncleaned formats that aren't really dialects of the original uncleaned RTF format, e.g. MetaTexis, I think (it uses different marking delimiters).

Wrong extension for XLIFF | Oct 7, 2015
Kevin Dias wrote:
Name: XLIFF (XML Localisation Interchange File Format)
File extension: .xliff or .xlif

The only extension you can use with XLIFF files is ".xlf". Anything else, including ".xliff", makes the file not compliant with the XLIFF standard.

Regards,
Rodolfo
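As a rough illustration of the extension point, the sketch below flags file names that do not end in ".xlf" and reads the source/target pairs from an XLIFF 1.2 document. The sample document and the helper names `has_spec_extension` and `trans_units` are hypothetical, and treating the extension check as case-insensitive is an assumption of this sketch rather than something taken from the specification.

```python
import xml.etree.ElementTree as ET
from pathlib import PurePath

# Namespace used by XLIFF 1.2 documents.
XLIFF_12_NS = "urn:oasis:names:tc:xliff:document:1.2"

# Illustrative XLIFF 1.2 sample; the content is invented for this sketch.
SAMPLE_XLIFF = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="demo.txt" source-language="en" target-language="fr" datatype="plaintext">
    <body>
      <trans-unit id="1">
        <source>Hello world</source>
        <target>Bonjour le monde</target>
      </trans-unit>
    </body>
  </file>
</xliff>
"""


def has_spec_extension(filename):
    """True if the name ends in .xlf (case-insensitivity is an assumption)."""
    return PurePath(filename).suffix.lower() == ".xlf"


def trans_units(xliff_text):
    """Return (id, source, target) tuples for each XLIFF 1.2 trans-unit."""
    ns = {"x": XLIFF_12_NS}
    root = ET.fromstring(xliff_text)
    units = []
    for unit in root.iterfind(".//x:trans-unit", ns):
        source = unit.find("x:source", ns)
        target = unit.find("x:target", ns)
        units.append((
            unit.get("id"),
            "".join(source.itertext()) if source is not None else "",
            "".join(target.itertext()) if target is not None else "",
        ))
    return units


for name in ("job.xlf", "job.xliff"):
    print(name, "uses the .xlf extension:", has_spec_extension(name))
print(trans_units(SAMPLE_XLIFF))
```

A file named "job.xliff" can still parse perfectly well; the check only reports whether the name follows the ".xlf" convention.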
..... (X) Local time: 22:05 TOPIC STARTER
Hi Rodolfo,

Thanks for the info. ".xlif" was a typo on my part, I'll fix that. ".xliff" does exist out in the wild though. I see you are one of the editors of the XLIFF specification. Any reason that naming guideline has been removed from the XLIFF 2.0 Specification? In the 1.2 Specification it is clear:

D.4. XLIFF File Extension
XLIFF documents use the .xlf extension. No other extension is recommended by the specification.

However, I can't find that in the 2.0 Specification.

..... (X) Local time: 22:05 TOPIC STARTER
Thanks! Great stuff!

Samuel Murray wrote:
In that case, the dialects of "uncleaned RTF" should also be included, right?

Yes, I think so. They seem to fit in the "bilingual file that holds aligned source and target data" category.

DorothyX (X) France Local time: 14:05
.xlz, which is a bilingual file in XLIFF format specific to Idiom and the Translation Workspace XLIFF Editor. I don't know if it is directly compatible with .xlf or can be opened by Trados Studio, but in any case it can easily be converted into other formats.

As for Stepan's list, Logoport has not existed for years now, but the software can still be used in some cases by those who have access to Translation Workspace (for Word files only).

As for the Wordfast TMs, they are genuine tab-delimited .txt files which can be read by any office application, especially Excel and Word.

.po files are not always compatible between applications. There are clearly several flavours. I managed to open several POEdit files only with Pootling.
[Edited at 2015-10-07 11:48 GMT]
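Because a tab-delimited TM opens in any spreadsheet, it can also be read with a few lines of Python. The sketch below is only that: the sample rows, the "%" header prefix, and the column positions passed to the hypothetical `read_tab_delimited_tm` function are assumptions for illustration, not the Wordfast specification (which, as noted earlier in the thread, may not tell the whole story).

```python
import csv
import io

# Made-up sample in a Wordfast-like layout. The header line and the
# column order are assumptions for this sketch, not a documented format.
SAMPLE_TM = (
    "%20151007~100000\t%User\t%TU=2\t%EN-US\t%Sample TM\t%FR-FR\t%---\n"
    "20151007~100500\tKD\t0\tEN-US\tHello world\tFR-FR\tBonjour le monde\n"
    "20151007~100600\tKD\t0\tEN-US\tSave file\tFR-FR\tEnregistrer le fichier\n"
)


def read_tab_delimited_tm(text, source_col=4, target_col=6, header_prefix="%"):
    """Return (source, target) pairs from tab-delimited TM text.

    source_col and target_col are zero-based column indexes chosen by the
    caller; rows starting with header_prefix or too short are skipped.
    """
    # QUOTE_NONE keeps quotation marks inside segments as literal text.
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    pairs = []
    for row in reader:
        if not row or row[0].startswith(header_prefix):
            continue
        if len(row) <= max(source_col, target_col):
            continue
        pairs.append((row[source_col], row[target_col]))
    return pairs


for source, target in read_tab_delimited_tm(SAMPLE_TM):
    print(source, "=>", target)
```

Keeping the column indexes as parameters means the reader can be pointed at whatever layout a given file turns out to use, instead of hard-coding one guess.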