Best route for 25K-word alignment project
Thread poster: Jacques DP
LF Aligner (http://sourceforge.net/projects/aligner/), a free open-source tool, is much better than SDL Trados Studio's built-in aligner, although not as good as AlignFactory Light.

Description
LF Aligner helps translators create translation memories from texts and their translations. It relies on Hunalign for automatic sentence pairing. Input: txt, doc, docx, rtf, pdf, html. Output: tab delimited txt, TMX and xls. With web features.

Features
- autoalign txt, doc, docx, rtf, html, pdf and other formats
- output: tmx, tabbed txt and xls
- supports windows, mac and linux
- graphical user interface (on Windows)
- integrated graphical interface for alignment review/editing
- capable of aligning texts in up to 100 languages simultaneously
- full UTF-8 workflow
- uses hunalign for accurate autoalignment
- built-in dictionary data further improves autoalignment in 800+ language combinations
- download and align webpages
- download and align EU legislation automatically
- suitable for large-scale automated corpus building with unattended batch mode
- basic support for some oriental languages, enhanced support for most European languages
- built-in customizable sentence segmenter borrowed from the europarl corpus project
- the grab bag contains various TM, termbase and data conversion and filtering tools

Roy Oestensen Denmark Local time: 22:52 Member (2010) English to Norwegian (Bokmal) + ...
As CAD or USD 420.00 is no small amount, I wonder how much alignment one needs to do for this to be worth it, and how much time one saves compared to other alignment tools. I suspect most translators are like me and do alignment jobs only occasionally. Even for a 25k job I would be very hesitant to make such a huge investment. It sounds as if LF_aligner, being free, would be a much better choice for most translators, provided it does a fairly decent job.
Roy Oestensen wrote:
As CAD or USD 420.00 is no small amount, I wonder how much alignment one needs to do for this to be worth it, and how much time one saves compared to other alignment tools. I suspect most translators are like me and do alignment jobs only occasionally. Even for a 25k job I would be very hesitant to make such a huge investment. It sounds as if LF_aligner, being free, would be a much better choice for most translators, provided it does a fairly decent job.

By now, I have imported approx. 850,000 TUs from various documents (in English and Russian, in my case) publicly available in the UN Official Document System at http://documents.un.org/default.asp

In my most recent UN-related project alone, this alignment effort earned me about US$650 before I even opened the first of three files in SDL Trados Studio's "Editor" window. This should explain the underlying logic behind my previous statement.

------- added comment

N.B. Compared to LF Aligner, AlignFactory is much easier to use, generates better-aligned output files and offers valuable additional functionality. However, for colleagues with occasional alignment needs, LF Aligner is definitely a very good option.
[Edited at 2015-05-03 08:26 GMT]

Jacques DP Switzerland Local time: 22:52 English to French TOPIC STARTER Done (with WinAlign) | May 3, 2015
I was a bit surprised to read that Andras, who had advised against WinAlign and said I should probably use Studio's tool instead, has never used Studio's tool. For the record I tried to use LF Aligner as well, but after reading its 8K-word ReadMe (which is necessary for this kind of software), it failed on me saying it couldn't find some Perl package. Andras provided three possible remedies, but since I had not yet used it I thought I would choose another route.

So, I did it with WinAlign, first converting all files to DOC format (since this was a known format when WinAlign was made). It worked great for the project I had, despite a few minor bugs, and I would recommend it for similar projects.
Susan Welsh United States Local time: 16:52 Russian to English + ... Is WinAlign still available? | May 3, 2015

I can't find a place to download it on the SDL site.

@Susan: Try YouAlign free online alignment tool | May 3, 2015
http://www.youalign.com/AlignDocs.aspx

Supported formats: Microsoft Word, Excel and PowerPoint, Adobe PDF, HTML, XML, Corel WordPerfect, RTF, Lotus WordPro and plain text.

Limitations of the free version: Maximum file size for each file is limited to 1MB. For larger files, use AlignFactory. Also, you are limited to 5 alignment jobs per day.

TIP: Merge several documents in each of the two languages into one before proceeding with alignment.
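A rough sketch of the merge tip above, for plain-text files only (Word, PDF and the like would need converting first). The function name `merge_text_files` and the file names in the usage note are made up for illustration, and `MAX_BYTES` is only one reading of the 1MB limit quoted above. The source-language and target-language files need to be merged in the same order, otherwise the two merged files will not line up.

```python
from pathlib import Path

# Assumed reading of the "1MB" limit quoted above; the exact threshold may differ.
MAX_BYTES = 1_000_000


def merge_text_files(paths, out_path, separator="\n\n"):
    """Concatenate UTF-8 text files in the given order.

    Returns the size of the merged file in bytes so the caller can
    compare it against a size limit before uploading.
    """
    parts = [Path(p).read_text(encoding="utf-8").strip() for p in paths]
    data = (separator.join(parts) + "\n").encode("utf-8")
    Path(out_path).write_bytes(data)
    return len(data)
```

Hypothetical usage: call `merge_text_files(["ch1_en.txt", "ch2_en.txt"], "all_en.txt")` and then `merge_text_files(["ch1_ru.txt", "ch2_ru.txt"], "all_ru.txt")`, and compare each returned size with `MAX_BYTES` before uploading.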
Jacques DP wrote:
For the record I tried to use LF Aligner as well, but after reading its 8K-word ReadMe (which is necessary for this kind of software), it failed on me saying it couldn't find some Perl package.

Rather than reading the 8K-word readme, you could have watched this video. On a normal Windows setup, LF Aligner should work out of the box if you unzip its content to a proper destination (e.g. your Documents, or any other folder under c:\users).

As a "smart" aligner, LF Aligner is so much better than "dumb" aligners like WinAlign.
Susan Welsh wrote:
I can't find a place to download it on the SDL site.

WinAlign was part of SDL Trados 2007 and older. I think it was also included with older versions of Studio, until Studio started to have its own aligner. It was never available as a free, separate download on the SDL site AFAIK, only as part of the aforementioned packages. Anyway, you're not missing much, since there are freely available aligners that are way better.
Roy Oestensen wrote:
Even for a 25k job I would be very hesitant to make such a huge investment.

I seem to remember that a trial version, fully functional for 15 days, is available, so next time you have a 25k alignment need, apply for the trial version and you'll be able to test it without making the upfront investment.

AlignFactory is good, but so is LF Aligner. Both would qualify as "smart" aligners, i.e. they do not rely solely on segmentation, and they have a clever algorithm that allows them to land back on their feet if/when the alignment "derails". The user interface of LF Aligner is a bit "spartan", but once you get over it, the aligner is a real powerhouse.

Trans Suite 2000 Align | May 4, 2015
Roy Oestensen wrote:
Trans Suite 2000 Align, which probably would be Trans Suite 2014 or something today. In the past Trans Suite got an excellent review, but I don't know whether it's still available.

Fat chance: Cypresoft (the company behind Trans Suite) went bust in 2004, so any copy you would find would be the original 2000 version, and you would also need an activation key. The aligner was kind of OK back then, when the main alternative was WinAlign, but there are far better solutions nowadays, including free ones.

Susan Welsh United States Local time: 16:52 Russian to English + ...
I tried this "out of the box," without bothering to read any of the instructions, and it worked well, certainly much better than Studio 2014's aligner. But it did require quite a lot of manual intervention, such as when the source text put two sentences into one segment and the target text put only one, or the ubiquitous problem of people's initials, which Russian academic texts are full of (I mean a segment after each period ["full stop"] in the initials--L. S. Vygotsky). Perhaps there's a way to change the segmentation rules. Probably I will have to read the instructions(!)

In my test document, however, I had already manually put non-breaking spaces after each initial--L.[non-breaking space]S.[non-breaking space]Vygotsky--but it still didn't come out well. But for a large document, that is a pain.

Jacques DP Switzerland Local time: 22:52 English to French TOPIC STARTER Yes, you will have to read the instructions | May 4, 2015
From comments below Dominique's you-just-need-to-click-with-the-mouse-on-the-buttons video:

From Jennyxxx777: Great tool .. just wondering how I can get it to segment when it reaches a colon (:) ?? I need this for my CAT to function properly .....Are there a list of segment breakers I can change ? Thankx in advance

From Safe Tex: Hello The answer to your question is no doubt in the 'sentence splitter' folder of AF aligner and then start with the 'read me' file. But this is perhaps where the software needs to be improved. You have to understand how to add your needs and express them using regular expressions if I'm not mistaken (a configuration file) Good luck

[Edited at 2015-05-04 12:03 GMT]
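For anyone wondering what "segment at a colon, expressed as a regular expression" might look like, here is a minimal Python sketch. It is not LF Aligner's own configuration syntax (the readme is the place to check that); the names `BREAK_AFTER` and `split_segments` are invented for illustration. It only shows the general idea of treating a colon followed by whitespace as a segment break, in the same way as a full stop.

```python
import re

# Break after ".", "!", "?" or ":" when whitespace follows.
# Illustrative pattern only; this is NOT LF Aligner's rule format.
BREAK_AFTER = re.compile(r"(?<=[.!?:])\s+")


def split_segments(text):
    """Split text into segments at the break points defined above."""
    return [seg for seg in BREAK_AFTER.split(text.strip()) if seg]
```

With this pattern, `split_segments("Note: this is a test. Second sentence!")` gives three segments, while a colon inside a time such as "10:30" is left alone because no whitespace follows it.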
Susan Welsh wrote:
I tried this "out of the box," without bothering to read any of the instructions, and it worked well, certainly much better than Studio 2014's aligner. But it did require quite a lot of manual intervention, such as when the source text put two sentences into one segment and the target text put only one, or the ubiquitous problem of people's initials, which Russian academic texts are full of (I mean a segment after each period ["full stop"] in the initials--L. S. Vygotsky). Perhaps there's a way to change the segmentation rules. Probably I will have to read the instructions(!) In my test document, however, I had already manually put non-breaking spaces after each initial--L.[non-breaking space]S.[non-breaking space]Vygotsky--but it still didn't come out well. But for a large document, that is a pain.

There are certainly ways to improve segmentation, and it looks like it needs to be improved. It should handle names like this correctly. Probably the easiest solution is to edit the nonbreaking prefix file, in this case aligner\scripts\sentence_splitter\nonbreaking_prefixes\nonbreaking_prefix.ru

You just need to add each uppercase letter to the file, each on a new line. Interestingly, the file starts with "TBD: Russian uppercase alphabet [А-Я]", so it looks like somebody was going to do this but didn't. I have no idea why; it should take two minutes. So try that, and if it works, please send the edited file to me so that I can update it for the next release.

General note: obviously, one doesn't need to read the whole readme before using the program. The sensible thing to do is to open the readme, read the list of contents, read the short intro, try to use the program and go back to the readme if there is a problem. If the list of contents doesn't tell you where to look for what you need, try Ctrl-F. In any case, LF Aligner is probably not the right tool for those who can't figure out on their own the procedure to follow regarding readmes.
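Typing 33 letters by hand is error-prone, so here is a small Python sketch that could do the edit described above. It assumes the prefix file is UTF-8 text with one prefix per line, as the post suggests; the function name `add_prefixes` is made up for illustration, and it is worth backing up the original file before running anything against it.

```python
from pathlib import Path

# U+0410..U+042F covers А..Я; Ё (U+0401) sits outside that range, so add it.
RUSSIAN_UPPERCASE = [chr(cp) for cp in range(0x0410, 0x0430)] + ["\u0401"]


def add_prefixes(prefix_file, prefixes=RUSSIAN_UPPERCASE):
    """Append any prefixes not already listed, one per line.

    Assumes a UTF-8 file with one prefix per line (an assumption,
    not checked against the actual LF Aligner file). Returns the
    list of prefixes that were added.
    """
    path = Path(prefix_file)
    text = path.read_text(encoding="utf-8")
    existing = {line.strip() for line in text.splitlines()}
    missing = [p for p in prefixes if p not in existing]
    if missing:
        # Make sure the new entries start on a fresh line.
        if text and not text.endswith("\n"):
            text += "\n"
        path.write_text(text + "\n".join(missing) + "\n", encoding="utf-8")
    return missing
```

It would be called with the path from the post, e.g. `add_prefixes(r"aligner\scripts\sentence_splitter\nonbreaking_prefixes\nonbreaking_prefix.ru")`. Running it twice is harmless, since letters already in the file are skipped.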
Perl error vs. actual use | May 4, 2015
Jacques DP wrote:
Yes, you will have to read the instructions

You originally complained you weren't able to even start LF Aligner, because you got some obscure Perl error. I said LF Aligner should run right out of the box if uncompressed to a proper destination. You don't have to read the instructions for that, just watch the video (at 1:05).

Now if you want to use advanced features (e.g. for fine-tuning the segmentation rules), then you may have to read the instructions. But you can get a good alignment by just going with the flow. I personally don't waste my valuable time trying to reach 100% alignments: if I can choose between a 97% correct alignment in two minutes (which is typically the raw output of LF Aligner) and a 100% correct one in four hours, I'll always go for the former.

Susan Welsh United States Local time: 16:52 Russian to English + ... @Farkas - segmentation on LF Aligner | May 5, 2015
I read the instructions (:-) and fixed the Russian capital letter problem very nicely (I'll send you the file). But I just spent an hour manually editing the alignment of a 30K word document, so this is not good. The problems:

1. Many segments do not break after the full stop, so a segment ends up containing 2 sentences. However, this is not consistent between the source and target cells.
2. Solving the problem by "splitting" the segments that contain two sentences only works if there is an empty cell below the active cell. Otherwise, the part that is split off joins the text in the segment below, and you have to keep splitting ad infinitum, or until you reach an empty cell.
3. The only solution I found was, rather than splitting the cell that had too many sentences, to "merge" the cell that had only one sentence with the one below it. This worked, but was labor intensive.
4. After there was a problem of the above nature in the file, everything after that was screwed up -- i.e., the program did not correct itself and align the rest of the document properly. Instead, everything was one segment "off."
5. The "realign everything after the active segment" button in "Edit" didn't do anything.
Susan