Differences between versions of Lucene tokenizers
Thread poster: CafeTran Training (X)
CafeTran Training (X)
Netherlands
Jul 10, 2016

When I create a new project in OmegaT 3.6, it uses version 3.0 of the Lucene tokenizer for the source language (German) and version 3.6 for the target language (Dutch).

Since I can also manually select version 3.6 of the Lucene tokenizer for the source language, I'd like to know what the differences are between versions 3.0 and 3.6 of the Lucene tokenizer for German.


 
Didier Briel
France
English to French
3.0 provides better stemming
Jul 10, 2016

CafeTran Training wrote:
When I create a new project in OmegaT 3.6, it uses version 3.0 of the Lucene tokenizer for the source language (German) and version 3.6 for the target language (Dutch).

Since I can also manually select version 3.6 of the Lucene tokenizer for the source language, I'd like to know what the differences are between versions 3.0 and 3.6 of the Lucene tokenizer for German.

According to translators working from German, 3.0 uses a better stemming algorithm than 3.1 and later.
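
To illustrate why the choice of stemmer matters for glossary and match lookups, here is a toy sketch in Python. It is an assumption-laden illustration only: the two stemmers below are made up for this example and are not the algorithms used by any Lucene version, and the names `light_stem`, `aggressive_stem` and `matches` are hypothetical.

```python
# Toy illustration only: both stemmers are hypothetical and are NOT the
# algorithms shipped with any Lucene tokenizer version.

SUFFIXES = ("en", "er", "es", "e", "n", "s")
MIN_STEM_LENGTH = 4  # never shorten a word below this many characters


def strip_one_suffix(word: str) -> str:
    """Remove the first matching inflectional ending, if the stem stays long enough."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LENGTH:
            return word[: -len(suffix)]
    return word


def light_stem(word: str) -> str:
    """Lowercase the word and strip at most one ending; umlauts are kept."""
    return strip_one_suffix(word.lower())


def aggressive_stem(word: str) -> str:
    """Lowercase, fold umlauts to plain vowels, then strip endings until none is left."""
    w = word.lower()
    for umlaut, plain in (("ä", "a"), ("ö", "o"), ("ü", "u")):
        w = w.replace(umlaut, plain)
    while True:
        shorter = strip_one_suffix(w)
        if shorter == w:
            return w
        w = shorter


def matches(stemmer, glossary_term: str, text_word: str) -> bool:
    """Two words are treated as the same term when their stems are equal."""
    return stemmer(glossary_term) == stemmer(text_word)


if __name__ == "__main__":
    for term, word in (("Haus", "Häuser"), ("schon", "schön")):
        print(term, word,
              "light:", matches(light_stem, term, word),
              "aggressive:", matches(aggressive_stem, term, word))
```

With these toy rules, the aggressive stemmer finds "Häuser" for the glossary entry "Haus" where the light one does not, but it also conflates "schon" and "schön", which are unrelated words. Real stemmers make the same kind of trade-off, which is why two tokenizer versions can give noticeably different matches for the same German text.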

You can read the thread that prompted the need to make this behaviour configurable here:
https://groups.yahoo.com/neo/groups/OmegaT/conversations/topics/28375

In OmegaT 4.0, selecting the behaviour won't be necessary. All the tokenizers perform correctly except the German one, for which we found a way of replicating the behaviour of tokenizer 3.0.

Didier


 

