Hungarian to English IATE terminology package
The pricing of this glossary applies if you are an individual translator (or rather, if the file sets you order are used just by you and nobody else). If you are a translation agency or service provider and want to give more translators access to the glossary, please contact Henk Sanderson (the seller) to discuss a fair price scheme.
Bilingual files extracted from the European Union’s IATE Termbase and formatted for various CAT-tools.
Packaged for your productivity
Why purchase this package when the data is freely available for download on the IATE site? Your time is valuable, so don't waste endless hours trying to get what you need from the raw IATE download. For a small price, this package offers you many benefits as a translator:
- Files ready for import into Termbases and Translation memories of the main CAT programs like SDL Trados Studio, DVX, CafeTran, memoQ, WordFast Classic/Pro and others
- No need to deal with the daunting and time consuming task of handling and extracting language pairs from the full IATE database
New in December 2016 version
- On the basis of termEntry ids alone (containing translations of just one term), there are about 25,000 changes (deletions and additions) between the previous version used (dating from January, 2015) and the current version, based on IATE_export_27072016.tbx.
- For each entry, the Domain the term belongs to is now given in the Target Language (it used to be in English only), whenever the Domain (or Subdomain) is output at all. This may be especially interesting for users of the package who do not understand English.
- Added instructions for import into Wordfast.
New in May 2017 version
- IATE has again carried out a large cleaning operation of terms. This version is based on IATE_export_16032017.tbx, which has about 50,000 fewer terms than the previous version I used. As IATE is certainly also adding new terms, this means that probably about 70,000 obsolete or unreliable terms have been deleted. Therefore I have created a new extraction.
- The naming of Domains/Subdomains complies with the new system adopted by IATE.
New in March 2018 version
The pruning and renewing process in IATE continues. While the total number of terms has decreased by about 13,000 since March 2017, additions and deletions of complete termentries are as follows:
23,645 termentries deleted, 79,826 termentries added.
Note: these numbers give no info about modified / added / removed terms within an already existing termentry.
New in March 2019 version
- Latin and Multilingual terms, introduced earlier in IATE, are now added to both source and target terms of the relevant entries. This is especially important for those working in botany, chemistry, law, trade, food…
- Although the IATE linguists have added numerous new entries, the total number of entries has again decreased, sometimes by up to 20%, because IATE has been cleaning up the database by removing duplicates, removing entries with low reliability, etc.
New in June 2020 version
- As of May 22nd, 2020, IATE has again opened up the possibility of downloading their Termbase. The current release of my Language Pairs is based on the state of the Termbase on that date.
- Two years ago, only 121 language pairs (out of 552) had more than 100,000 entries. Now, all language pairs have at least 100,000 entries, and 173 have more than 200,000.
- Although the IATE linguists have again added numerous new entries, the total number of entries keeps decreasing (by about 50,000 compared with the previous release of Feb. 2019) because IATE continues to clean up the database by removing duplicates, removing entries with low reliability, etc., thereby increasing the usefulness of the information from IATE.
Data that has been meticulously cleaned and formatted for your benefit
The author of these files has spent a considerable amount of effort to clean up issues and problem areas found in the raw IATE download including:
- Handling of synonyms – sometimes synonyms are strung together within one text record, separated by a semicolon or pipe symbol (|), sometimes they get separate text records
- Context notes are sometimes inserted in the text record between square brackets, making recognition of the terms impossible
- The termbase as downloaded from IATE lists the Domain the subjects belong to in English only. These Domain names are now shown in the Target language of the Language pair. This may be especially interesting for users of the package who do not understand English.
- The IATE termbase contains text entries varying from just one word or expression up to complete sentences, including remarks and explanations; these longer text entries serve no purpose in a termbase for a CAT tool, but are available for import into a Translation Memory.
- Many of the first users complained about the large number of 1-, 2- and even 3-letter words; others did not want ACRONYMS and abbreviations in their termbase, or at least wanted the option of creating separate termbases for these terms.
- Latin and Multilingual terms, introduced earlier in IATE, have been added to both source- and target terms of the relevant entries. This is especially important for those working in botany, chemistry, law, trade, food…
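To illustrate the first two clean-up points, here is a minimal Python sketch of how one raw text record might be split into separate terms. It is not the author's actual extraction code. The function name `split_record` and the sample record are hypothetical, and the sketch assumes that synonyms are separated only by `;` or `|` and that context notes appear only in square brackets, as described in the list above.

```python
import re

# Assumption: context notes are enclosed in square brackets, and
# synonyms are separated by a semicolon or a pipe symbol.
_NOTE = re.compile(r"\[[^\]]*\]")
_SEPARATOR = re.compile(r"[;|]")


def split_record(record: str) -> list[str]:
    """Return the individual terms found in one raw text record."""
    # Remove bracketed context notes first so they do not end up in a term.
    without_notes = _NOTE.sub(" ", record)
    parts = _SEPARATOR.split(without_notes)
    # Collapse runs of whitespace and drop fragments that are now empty.
    terms = [" ".join(part.split()) for part in parts]
    return [term for term in terms if term]


if __name__ == "__main__":
    # Hypothetical record, made up for this example.
    print(split_record("adó [pénzügy]; illeték | díj"))
```

A real extraction would also have to deal with records where a semicolon or bracket is part of the term itself; the sketch ignores those cases.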
This package includes the following:
(all .csv, .tmx, .xml and .tbx files are encoded in UTF-8, except for the WordFast *TB.txt files, which are encoded in UTF-16LE)
- *1ch*.*, *2ch*.*, *3ch*.*, *abb*.* files, containing 1-character, 2-character, 3-character and abbreviation/acronym terms only; note that terms added to the *abb* file may also be found in one of the other files. These terms are kept in separate files so that you can import them into separate termbases and avoid overloading the main termbases with uninteresting short words.
- For DVX2/3 and other CAT tools that are not specifically targeted
  - *TB.csv files for importing short strings (fewer than 4 words) into a Termbase
  - *TM.csv files for importing long strings (4 or more words) into a translation memory
  - *.dvtdt file containing the termbase definition for creating a DVX2 or DVX3 Termbase
- For WordFast
  - *TB.txt files for importing short strings into a Termbase (encoded in UTF-16LE)
  - *TM.tmx files for importing long strings into a translation memory
- For SDL Trados
  - .xdt file containing the termbase definition for the language pair; works with SDL Multiterm Desktop, versions 2011 and upwards
  - *TB*.xml files for importing short strings directly into a Studio Termbase, versions 2011 and upwards
  - *TM.tmx file for importing long strings directly into a Studio Translation Memory, versions 2011 and upwards
- For CafeTran
  - In CafeTran .csv files, synonyms within one Concept-ID of the IATE termbase are output into one entry, separated by ‘;’.
  - The file IATE_CafeTranTM*.csv contains strings with 4 or more words.
  - The other CafeTran .csv files contain the shorter strings.
- For memoQ
  - memoQ Custom field import scheme for the Translation memory: memoQ*TM-scheme.xml
  - memoQ*TM.tmx for filling the Translation memory with long strings
  - memoQ termbases can be filled with the IATE_[lang1]_[lang2]_TB.csv files
- For packages where the IATE termbase contains at least 100,000 terms in each of the languages, a subset of files for the following domains:
  - Agriculture, Agri-foodstuffs and Environment
  - Employment And Working Conditions, Business And Competition, International Relations, Politics, European Union and International Organisations
  - Production, Energy and Industry
  - Law, Finance and Science
  - Social Questions and Education and Communications
  - Economics, Trade, Geography and Transport
- A Word document with detailed instructions on how to import the files into various CAT tools
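The way terms are distributed over the files listed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the author's implementation: the function names `target_file` and `encoding_for` and the bucket labels are hypothetical, words are assumed to be separated by whitespace, and the *abb* files are left out because this page does not say how abbreviations and acronyms are detected.

```python
def target_file(term: str) -> str:
    """Pick a (hypothetical) output bucket for one term."""
    text = term.strip()
    if not text:
        raise ValueError("empty term")
    # Long strings (4 or more words) go to the translation memory files.
    if len(text.split()) >= 4:
        return "TM"
    # Very short terms go to the separate 1ch/2ch/3ch files.
    if len(text) <= 3:
        return f"{len(text)}ch"
    # Everything else is a short string for the termbase files.
    return "TB"


def encoding_for(filename: str) -> str:
    """Return the text encoding stated on this page for a package file."""
    # Only the WordFast *TB.txt files are UTF-16LE; the rest are UTF-8.
    return "utf-16-le" if filename.endswith("TB.txt") else "utf-8"


if __name__ == "__main__":
    for sample in ["a", "EU", "European Union", "one two three four"]:
        print(sample, "->", target_file(sample))
    # Hypothetical file name following the IATE_[lang1]_[lang2]_TB pattern.
    print(encoding_for("IATE_hu_en_TB.txt"))
```

When reading one of the files yourself, for example with Python's built-in `open(path, encoding=encoding_for(path))`, choosing the right encoding avoids garbled accented characters.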
If you would like to purchase this file using PayPal or using a SEPA bank transfer, please visit SanTrans and fill in the Contact form, including your wishes in the Message field.
The following specifies the right of SanTrans to use the IATE DATA: Note on copyright with respect to the data included in the files (from http://iate.europa.eu/tbxPageDownload.do):
Conditions for use
You are allowed to reproduce the data (provided on this page) for your personal needs, to distribute it for non-commercial and commercial purposes, and to make and distribute derivative works, provided the source is acknowledged as follows: Download IATE, European Union, .
Conditions for end use by the buyer of the bilingual file sets
The buyer of the bilingual file sets created by SanTrans by extraction from IATE DATA is entitled to use them for his or her own translation work. Redistribution (reselling, publicizing or any other form of multiplication) is expressly forbidden, unless agreed upon between the buyer and SanTrans.