Estonian to Slovenian IATE terminology package
The pricing of this glossary applies if you are an individual translator (or rather, if the file sets you order are used just by you and nobody else). If you are a translation agency or service provider and want to give more translators access to the glossary, please contact Henk Sanderson (the seller) to discuss a fair price scheme.
Bilingual files extracted from the European Union’s IATE Termbase and formatted for various CAT-tools.
Packaged for your productivity
Why purchase this package when the data is freely available for download on the IATE site? Your time is valuable; don't spend endless hours trying to get what you need from the raw IATE download. For a small price, this package offers you many benefits as a translator:
- Files ready for import into the main CAT programs, such as SDL Trados Studio, DVX, CafeTran, memoQ, WordFast Classic/Pro and others
- No need to deal with the daunting and time-consuming task of handling and extracting language pairs from the full IATE database
New in December 2016 version
- Counting termEntry ids alone (each entry containing the translations of just one term), there are about 25,000 changes (deletions and additions) between the previous version used (dating from January 2015) and the current version, which is based on IATE_export_27072016.tbx.
- For each entry, the Domain to which the term belongs is now given in the Target Language (it used to be in English only), whenever the Domain (or Subdomain) is output at all. This may be especially useful for users of the package who do not understand English.
- Added instructions for import into Wordfast.
New in May 2017 version
- IATE has again carried out a large clean-up of terms. This version is based on IATE_export_16032017.tbx, which has about 50,000 fewer terms than the previous version I used. As IATE is certainly also adding new terms, this means that probably about 70,000 obsolete or unreliable terms have been deleted. I have therefore created a new extraction.
- The naming of Domains/Subdomains complies with the new system adopted by IATE.
New in March 2018 version
The pruning and renewal process in IATE continues. While the total number of terms has decreased by about 13,000 since March 2017, additions and deletions of complete termentries are as follows:
23,645 termentries deleted, 79,826 termentries added.
Note: these numbers say nothing about terms modified, added or removed within an already existing termentry.
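The counts above compare complete termentries between two exports. A minimal sketch of such a comparison, assuming the termEntry ids of both exports have already been collected into lists; the function name `diff_entry_ids` and the sample ids are hypothetical and do not reflect the seller's actual tooling:

```python
def diff_entry_ids(old_ids, new_ids):
    """Return (deleted, added) termEntry ids between two exports."""
    old_set, new_set = set(old_ids), set(new_ids)
    deleted = old_set - new_set   # present before, gone now
    added = new_set - old_set     # absent before, present now
    return deleted, added


# Hypothetical ids, for illustration only.
old_ids = ["IATE-1", "IATE-2", "IATE-3"]
new_ids = ["IATE-2", "IATE-3", "IATE-4", "IATE-5"]

deleted, added = diff_entry_ids(old_ids, new_ids)
print(len(deleted), "deleted,", len(added), "added")
```

As the note says, an entry whose id survives in both exports is invisible to this kind of diff, even if the terms inside it have changed.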
Data that has been meticulously cleaned and formatted for your benefit
The author of these files has spent considerable effort cleaning up issues and problem areas found in the raw IATE download, including:
- Handling of synonyms – sometimes synonyms are strung together within one text record, separated by a semicolon, sometimes they get separate text records
- Context notes are sometimes inserted in the text record between square brackets
- The termbase lists subjects as numerical codes, which can only be resolved by consulting the code definitions on the IATE website; however, the codes in the file do not correspond with the codes on the website, and the website does not define all codes found in the .tbx file
- The termbase contains text entries varying from a single word or expression up to complete sentences, including remarks and explanations; these longer text entries serve no purpose in a termbase
- The IATE file contains numerous non-UTF-8 characters
- Many of the first users complained about the large number of 1-, 2- and even 3-letter words; others did not want ACRONYMS and abbreviations in their termbase, or at least wanted the option of creating separate termbases for these terms
- Since many entries in the IATE file have multiple (sub)domains assigned, merging separately extracted files from multiple domains into one database can lead to a substantial overlap caused by duplicate entries
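Some of these clean-up steps can be illustrated in a few lines. The following is only a sketch under stated assumptions, not the author's actual procedure: it assumes synonyms are separated by a semicolon, context notes appear in square brackets, and duplicates are exact repeats of a (source, target) pair. The function names and sample strings are hypothetical.

```python
import re

# Matches a context note in square brackets, plus any whitespace before it.
BRACKET_NOTE = re.compile(r"\s*\[([^\]]*)\]")


def split_record(text):
    """Split one raw text record into (terms, context_notes)."""
    notes = [n.strip() for n in BRACKET_NOTE.findall(text)]
    without_notes = BRACKET_NOTE.sub("", text)
    terms = [t.strip() for t in without_notes.split(";") if t.strip()]
    return terms, notes


def deduplicate(pairs):
    """Drop repeated (source, target) pairs, keeping first-seen order."""
    seen = set()
    unique = []
    for pair in pairs:
        if pair not in seen:
            seen.add(pair)
            unique.append(pair)
    return unique


# Placeholder strings, not real IATE data.
terms, notes = split_record("term a; term b [context note]")
merged = deduplicate([("a", "b"), ("c", "d"), ("a", "b")])
```

Deduplicating on the whole pair rather than on the source term alone keeps legitimate alternative translations of the same source term.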
This package includes the following:
(all .csv, .tmx, .xml and .tbx files are encoded in UTF-8, except for the WordFast *TB.txt files, which are encoded in UTF-16LE)
- *1ch*.*, *2ch*.*, *3ch*.*, *abb*.* files, containing 1-character, 2-character, 3-character and abbreviation/acronym terms only; note that terms added to the *abb* file may also be found in one of the other files. The reason for adding terms to these separate files is the possibility to add them into separate termbases in order to avoid overloading the main termbases with uninteresting short words.
- For DVX2/3 and other CAT tools not specifically targeted
  - *TB.csv files for importing short strings (fewer than 4 words) into a Termbase
  - *TM.csv files for importing long strings (4 or more words) into a translation memory
  - *.dvtdt file containing the termbase definition for creating a DVX2 or DVX3 Termbase
- For WordFast
  - *TB.txt files for importing short strings into a Termbase (encoded in UTF-16LE)
  - *TM.tmx files for importing long strings into a translation memory
- For SDL Trados
  - .xdt file containing the termbase definition for the language pair; works with SDL Multiterm Desktop, versions 2011 and upwards
  - *TB*.xml files for importing short strings directly into a Studio Termbase, versions 2011 and upwards
  - *TM.tmx file for importing long strings directly into a Studio Translation Memory, versions 2011 and upwards
- For CafeTran
  - In CafeTran .csv files, synonyms within one Concept-ID of the IATE termbase are output as one entry, separated by ‘;’.
  - The file IATE_CafeTranTM*.csv contains strings with 4 or more words.
  - The other CafeTran .csv files contain the shorter strings.
- For memoQ
  - memoQ Custom field import scheme for the Translation memory: memoQ*TM-scheme.xml
  - memoQ*TM.tmx for filling the Translation memory with long strings
  - memoQ termbases can be filled with the IATE_[lang1]_[lang2]_TB.csv files
- For packages where the IATE termbase contains at least 100,000 terms in each of the languages, a subset of files for the following domains:
  - Agriculture, Agri-foodstuffs and Environment
  - Employment And Working Conditions, Business And Competition, International Relations, Politics, European Union and International Organisations
  - Production, Energy and Industry
  - Law, Finance and Science
  - Social Questions and Education and Communications
  - Economics, Trade, Geography and Transport
- A Word document with detailed instructions on how to import the files into various CAT tools
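The split into short-string, long-string and 1-, 2- and 3-character files described in the list above can be sketched as follows. This is an illustration only: the labels returned, the function name `classify`, and the assumption that very short terms are kept out of the main termbase files are hypothetical, and abbreviation detection for the *abb* files is not covered.

```python
def classify(term):
    """Return a hypothetical file-group label for one term.

    "1ch"/"2ch"/"3ch": terms of 1, 2 or 3 characters
    "TB": short strings (fewer than 4 words), for a termbase
    "TM": long strings (4 or more words), for a translation memory
    """
    text = term.strip()
    if not text:
        raise ValueError("empty term")
    if len(text) <= 3:
        return f"{len(text)}ch"
    if len(text.split()) < 4:
        return "TB"
    return "TM"


# Placeholder strings, not real IATE data.
for sample in ["a", "EU", "one two three", "one two three four"]:
    print(repr(sample), "->", classify(sample))
```

Checking the character length before the word count means a 3-character term that contains a space is still routed to the 3-character group; the real files may treat such cases differently.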
If you would like to purchase this file using PayPal or using a SEPA bank transfer, please visit SanTrans and purchase the file from there.
The following specifies the right of SanTrans to use the IATE DATA: Note on copyright with respect to the data included in the files (from http://iate.europa.eu/tbxPageDownload.do):
Conditions for use
You are allowed to reproduce the data (provided on this page) for your personal needs, to distribute it for non-commercial and commercial purposes, and to make and distribute derivative works, provided the source is acknowledged as follows: Download IATE, European Union, .
Conditions for end use by the buyer of the bilingual file sets
The buyer of the bilingual file sets created by SanTrans by extraction from IATE DATA is entitled to use them for his own translation work. Redistribution (reselling, publicizing or any other form of multiplication) is expressly forbidden, unless agreed upon between buyer and SanTrans.