English to Hungarian IATE terminology package
The pricing of this glossary applies if you are an individual translator (or rather, if the file sets you order are used just by you and nobody else). If you are a translation agency or service provider and want to give more translators access to the glossary, please contact Henk Sanderson (the seller) to discuss a fair price scheme.
Bilingual files extracted from the European Union’s IATE Termbase and formatted for various CAT-tools.
Packaged for your productivity
Why purchase this package when the data is freely available for download on the IATE site? Your time is valuable; don't waste it spending endless hours trying to get what you need from the raw IATE download. For a small price this package offers you many benefits as a translator:
- Files ready for import into Termbases and Translation memories of the main CAT programs like Trados Studio, DVX, CafeTran, memoQ, WordFast Classic/Pro and others
- No need to deal with the daunting and time-consuming task of handling and extracting language pairs from the full IATE database
New in March 2019 version:
- Latin and Multilingual terms, introduced earlier in IATE, are now added to both source- and target terms of the relevant entries. This is especially important for those working in botany, chemistry, law, trade, food…
New in June 2020 version:
- As of May 22nd, 2020, IATE has again opened up the possibility of downloading their Termbase. The current release of my Language Pairs is based upon the status of that date.
- Two years ago, only 121 of the 552 language pairs had more than 100,000 entries. Now all language pairs have at least 100,000 entries, and 173 have more than 200,000.
- Although the IATE linguists have again added numerous new entries, the total number of entries keeps decreasing (by about 50,000 compared to the previous release of Feb. 2019). This is the result of IATE's continued cleanup of the database, such as removing duplicates and entries with low reliability, which increases the usefulness of the information from IATE.
New in May 2021 version:
- The update process at IATE is accelerating: since my previous release in September 2020, more than 5% changes (additions, deletions and modifications) were carried out, and the overall quality of terms is increasing, according to IATE. My current release is based upon an IATE extraction of May 1st, 2021.
- Apart from an update of the terms themselves, a few bugs in the extraction have been corrected: Domain names were missing in sets with English as target language, and Domain names for sets with Hungarian as target language were sometimes in Croatian.
Data that has been meticulously cleaned and formatted for your benefit
The author of these files has spent a considerable amount of effort to clean up issues and problem areas found in the raw IATE download including:
- Handling of synonyms – sometimes synonyms are strung together within one text record, separated by a semicolon or pipe symbol (|), sometimes they get separate text records
- Context notes are sometimes inserted in the text record between square brackets, making recognition of the terms impossible
- The termbase as downloaded from IATE lists the Domain the subjects belong to in English only. These Domain names are now shown in the Target language of the Language pair. This may be especially interesting for users of the package who do not understand English.
- The IATE termbase contains text entries varying from just one word or expression up to complete sentences, including remarks and explanations; these longer text entries have no purpose in a termbase for a CAT tool, but are available for import into a Translation Memory.
- Many of the first users complained about the occurrence of a lot of 1-, 2- and even 3-letter words; others did not want ACRONYMS and abbreviations in their termbase, or at least wanted the possibility to create separate termbases for these terms.
- Latin and Multilingual terms, introduced earlier in IATE, have been added to both source- and target terms of the relevant entries. This is especially important for those working in botany, chemistry, law, trade, food…
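To give an idea of what the synonym and context-note cleanup involves, here is a minimal Python sketch. It is not the extraction code used for this package; the function name `split_term_record` and the exact rules (drop anything between square brackets, then split on ';' or '|') are assumptions made for illustration only.

```python
import re

# Assumption: a context note is any run of text between square brackets.
CONTEXT_NOTE = re.compile(r"\s*\[[^\]]*\]")
# Assumption: synonyms inside one text record are separated by ';' or '|'.
SYNONYM_SEPARATOR = re.compile(r"[;|]")


def split_term_record(record: str) -> list[str]:
    """Return the individual terms found in one raw text record (hypothetical helper)."""
    # Remove the notes first, so a separator inside a note does not split anything.
    without_notes = CONTEXT_NOTE.sub("", record)
    parts = SYNONYM_SEPARATOR.split(without_notes)
    # Trim whitespace and drop empty fragments left by trailing separators.
    return [part.strip() for part in parts if part.strip()]


print(split_term_record("motor vehicle [transport]; automobile | car"))
```

A real cleanup has to deal with more cases than this, for example square brackets that are part of a chemical name rather than a note.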
This package includes the following:
(all .csv, .tmx, .xml and .tbx files are encoded in UTF-8, except for the WordFast *TB.txt files, which are encoded in UTF-16LE)
- *1ch*.*, *2ch*.*, *3ch*.*, *abb*.* files, containing 1-character, 2-character, 3-character and abbreviation/acronym terms only; note that terms added to the *abb* file may also be found in one of the other files. The reason for adding terms to these separate files is the possibility to add them into separate termbases in order to avoid overloading the main termbases with uninteresting short words.
- For DVX2/3 and other CAT tools that are not specifically targeted
- *TB.csv files for importing short strings (less than 4 words) into a Termbase
- *TM.csv files for importing long strings (4 or more words) into a translation memory
- *.dvtdt file containing the termbase definition for creating a DVX2 or DVX3 Termbase
- For WordFast
- *TB.txt files for importing short strings into a Termbase (encoded in UTF-16LE)
- *TM.tmx files for importing long strings into a translation memory
- For Trados
- .xdt file containing the termbase definition for the language pair; works with SDL Multiterm Desktop, versions 2011 and upwards
- *TB*.xml files for importing short strings directly into a Studio Termbase, versions 2011 and upwards
- *TM.tmx file for importing long strings directly into a Studio Translation Memory, versions 2011 and upwards
- For CafeTran
- In CafeTran .csv files, synonyms within one Concept-ID of the IATE termbase will be output into one entry, separated by ‘;’.
- The file IATE_CafeTranTM*.csv contains strings with 4 or more words.
- The other CafeTran .csv-files contain the shorter strings.
- For memoQ
- memoQ Custom field import scheme for the Translation memory: memoQ*TM-scheme.xml
- memoQ*TM.tmx for filling the Translation memory with long strings
- memoQ termbases can be filled with the IATE_[lang1]_[lang2]_TB.csv files
- For packages where the IATE termbase contains at least 100,000 terms in each of the languages, a subset of files for the following domains:
- Agriculture, Agri-foodstuffs and Environment
- Employment And Working Conditions, Business And Competition, International Relations, Politics, European Union and International Organisations
- Production, Energy and Industry
- Law, Finance and Science
- Social Questions and Education and Communications
- Economics, Trade, Geography and Transport
- A Word document with detailed instructions on how to import the files into various CAT tools
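The way terms are divided over the files listed above can be sketched in a few lines of Python. This is an illustration only, not the actual extraction code: the function name `target_files` and the `is_abbreviation` flag are hypothetical, and how abbreviations and acronyms are recognised is not described here, so the caller has to supply that information.

```python
def target_files(term: str, is_abbreviation: bool = False) -> list[str]:
    """Return the file labels a term would go to (hypothetical helper).

    Labels mirror the file name patterns above: 'TM' for long strings,
    '1ch'/'2ch'/'3ch' for very short terms, 'TB' for all other terms,
    plus 'abb' for abbreviations and acronyms.
    """
    text = term.strip()
    if len(text.split()) >= 4:
        # Long strings (4 or more words) are meant for a translation memory.
        labels = ["TM"]
    elif 1 <= len(text) <= 3:
        # 1-, 2- and 3-character terms get their own files.
        labels = [f"{len(text)}ch"]
    else:
        # Short strings (less than 4 words) go to the termbase files.
        labels = ["TB"]
    if is_abbreviation:
        # Terms in the abb file may also be found in one of the other files.
        labels.append("abb")
    return labels


print(target_files("free movement of goods"))
print(target_files("EU", is_abbreviation=True))
```

Returning a list rather than a single label reflects the note above that a term in the *abb* file may also appear in one of the other files.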
Don’t have a credit card?
If you would like to purchase this file using PayPal, a SEPA bank transfer, iDeal (Dutch bank accounts), Bancontact (Belgian bank accounts) or SOFORT (German bank accounts), please visit SanTrans.net/shop and order from there. In the case of PayPal, please visit SanTrans.net/contact and fill in the Contact form, including your wishes in the Message field.
Special wishes like a Custom extraction (selected Domain Groups)?
Please visit SanTrans and fill in the Contact form, including your wishes in the Message field.
The following specifies the right of SanTrans to use the IATE DATA: Note on copyright with respect to the data included in the files (from http://iate.europa.eu/tbxPageDownload.do):
Conditions for use
You are allowed to reproduce the data (provided on this page) for your personal needs, to distribute it for non-commercial and commercial purposes, and to make and distribute derivative works, provided the source is acknowledged as follows: Download IATE, European Union, .
Conditions for end use by the buyer of the bilingual file sets
The buyer of the bilingual file sets created by SanTrans by extraction from IATE DATA is entitled to use them for his own translation work. Redistribution (reselling, publicizing or any other form of multiplication) is expressly forbidden, unless agreed upon between buyer and SanTrans.