How to extract acronyms from source text?
Thread poster: Erik Freitag
Erik Freitag  Identity Verified
Germany
Local time: 13:13
Member (2006)
Dutch to German
+ ...
Feb 22, 2018

Dear colleagues,

This may not be the best forum for my question, but here goes:

I'm looking for a convenient way to extract all acronyms/abbreviations from the source text, by which I basically (as a working definition) mean words that are not found in standard monolingual dictionaries and are written in capitals.

Ideally, I'd like to have them exported as a list, possibly with the whole sentence they appear in for context.

If anyone knows a way to achieve this with SDL Trados Studio 2017, TermExtract, or third-party software, I'd be grateful for a hint.

Many thanks in advance,
kind regards,
Erik


 
Adam Łobatiuk  Identity Verified
Poland
Local time: 13:13
Member (2009)
English to Polish
+ ...
With Word Feb 22, 2018

For a rough list of acronyms in capital letters, you can paste the text into MS Word, search with wildcards for <[A-Z]{2;}> (see note below) and replace with just bold formatting (no replacement text). Then search (without wildcards) for non-bold formatting and replace it with ^p. That should leave you with only the ALL-CAPS words of two or more letters, separated by line breaks.

In the wildcard pattern, you may need to use {2,} instead of {2;}, depending on the list separator in your system settings.

[Edited at 2018-02-22 20:00 GMT]
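
If you would rather get the list together with the sentence each acronym appears in, the same idea can be scripted outside Word. The following is only a rough Python sketch under a few assumptions: the source text is available as plain text, an "acronym" is any whole word of two or more ASCII capitals (so umlauts and other accented capitals are not matched), and sentences end with ., ! or ? followed by whitespace. The function name extract_acronyms and the sample text are made up for illustration, and there is no dictionary check, so ordinary words typed in ALL CAPS will also end up in the list.

```python
import re

# Two or more consecutive ASCII capitals as a whole word,
# roughly what <[A-Z]{2,}> finds in a Word wildcard search.
ACRONYM = re.compile(r"\b[A-Z]{2,}\b")
# Naive sentence splitter: break after ., ! or ? followed by whitespace.
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def extract_acronyms(text):
    """Map each all-caps word to the list of sentences it occurs in."""
    found = {}
    for sentence in SENTENCE_END.split(text):
        sentence = sentence.strip()
        for match in ACRONYM.finditer(sentence):
            contexts = found.setdefault(match.group(), [])
            # Store each sentence only once per acronym.
            if sentence not in contexts:
                contexts.append(sentence)
    return found


if __name__ == "__main__":
    # Hypothetical sample text, for illustration only.
    sample = "The EU and NATO met in Berlin. A PDF was sent to the EU office."
    # One tab-separated line per acronym/sentence pair, ready for a spreadsheet.
    for acronym, sentences in sorted(extract_acronyms(sample).items()):
        for sentence in sentences:
            print(f"{acronym}\t{sentence}")
```

The tab-separated output can be pasted into Excel or a glossary tool. The sentence splitting is deliberately crude and will break on abbreviations such as "e.g.", so treat the context column as a hint rather than an exact segment.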


 

