Alignment the TM-Town Way

TM-Town's translation enablement platform is designed to help translators get the most out of their linguistic assets. TM-Town's system can help you unlock the value of your prior translations in several ways: finding potential new clients, leveraging your Translation Memory (TM), and extracting valuable analytical business data from your linguistic assets.

In future posts I will go into all of these areas in detail; however, today I will explore one innovative feature of TM-Town that will help you quickly and easily get your previous translations into a form where you can benefit from their value. This feature is TM-Town's free alignment tool.

When I interviewed professional translators, one of the most pressing issues they raised was being unable to access previous work. Sometimes your documents would be in PDF form, other times in Word. Many of you said there were tons of Word documents and PDFs scattered throughout your hard drives, but trying to remember which document had that phrase you need for the current translation... well, that can be time consuming to track down. Frustration sets in because you know that these are valuable assets that you could be using to help you in present or future translation jobs.

Translation Memory and Computer Assisted Translation (CAT) tools were created to take care of this problem. These tools perform many functions and one is to allow you to leverage your previous translations so if a similar or identical sentence comes up again, you don’t have to waste your time translating the same thing again. The more your translation memory (TM) grows, the more linguistic assets you have to leverage and the better chance of getting a match. This means that over time your previous work will help you to work more and more efficiently.
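To make the idea of "leveraging" a TM concrete, here is a minimal sketch of a fuzzy lookup using Python's standard `difflib` module. The `tm` dictionary, the `best_match` function and the 0.75 threshold are all hypothetical choices for illustration; real CAT tools use their own matching and scoring methods.

```python
from difflib import SequenceMatcher

# Hypothetical in-memory translation memory: source segment -> target segment.
tm = {
    "Bienvenido a Miami.": "Welcome to Miami.",
    "Miami es una ciudad estadounidense.": "Miami is a US city.",
}

def best_match(query, memory, threshold=0.75):
    """Return (score, source, target) for the most similar stored segment,
    or None if no stored segment reaches the threshold."""
    best = None
    for source, target in memory.items():
        # ratio() is 1.0 for identical strings and falls toward 0.0 as they differ.
        score = SequenceMatcher(None, query.lower(), source.lower()).ratio()
        if best is None or score > best[0]:
            best = (score, source, target)
    if best is not None and best[0] >= threshold:
        return best
    return None

# An identical sentence and a near-identical one both find the stored translation.
print(best_match("Bienvenido a Miami.", tm))
print(best_match("Bienvenidos a Miami.", tm))
# A sentence with nothing similar in the memory returns None.
print(best_match("Gracias.", tm))
```

The more segments the memory holds, the more often a new sentence will land above the threshold.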

This is where alignment tools come in.

Since your files are scattered throughout your hard drive as Word files, PDFs, text files, etc., you are probably too busy and don’t have time to figure out how to get these documents into a TM file format so that you can utilize a CAT tool. Many translators’ previous translations are not in TM file formats (such as TMX or XLIFF).

Alignment software segments a source document and its translation (target document) and then matches the corresponding segments together into translation units. It then creates industry standard TM files such as TMX or XLIFF.

All you do is upload the source document and target document and the alignment tool will automatically create a new TM file for you.

Alignment? I'm still not sure I understand.

Sometimes it helps to see things visually. Let's go through an example to better understand the alignment process.

Imagine that you are a Spanish to English translator. Your client gave you a document that needs to be translated. This is called the source document. To keep this example simple, let's pretend this is the document you were given to translate:

Source Document

Bienvenido a Miami. Miami es una ciudad estadounidense ubicada en la parte sureste de Florida alrededor del río Miami, entre los Everglades y el océano Atlántico.

You work hard to translate the document into English. The translated document is called the target document.

Target Document

Welcome to Miami. Miami is a US city located in the southeastern part of Florida around the Miami River, between the Everglades and the Atlantic Ocean.

The first step in the alignment process is to segment each document (both the source and target). Segmentation is the process of breaking the text into segments (typically a segment is roughly equivalent to a sentence).

Source Document Segmented

Segment #1: Bienvenido a Miami.
Segment #2: Miami es una ciudad estadounidense ubicada en la parte sureste de Florida alrededor del río Miami, entre los Everglades y el océano Atlántico.

Target Document Segmented

Segment #1: Welcome to Miami.
Segment #2: Miami is a US city located in the southeastern part of Florida around the Miami River, between the Everglades and the Atlantic Ocean.
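The segmentation step can be sketched in a few lines of Python. This is a deliberately naive splitter that breaks after sentence-ending punctuation followed by whitespace; the `segment` function is a hypothetical helper, not TM-Town's segmentation engine, and it would wrongly split after abbreviations such as "Dr." or "U.S.".

```python
import re

def segment(text):
    """Split text after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]

source_document = (
    "Bienvenido a Miami. Miami es una ciudad estadounidense ubicada en la "
    "parte sureste de Florida alrededor del río Miami, entre los Everglades "
    "y el océano Atlántico."
)

for number, seg in enumerate(segment(source_document), start=1):
    print(f"Segment #{number}: {seg}")
```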

After each document has been segmented, the next step in the alignment process is to match each segment from the source document to its corresponding segment in the target document. In this example it is easy: segment #1 from the source document matches segment #1 from the target document, and segment #2 from the source document matches segment #2 from the target document.
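For a simple case like this one, matching can be sketched as pairing segments by position. The `pair_segments` function below is a hypothetical helper and only works when both documents have the same number of segments; real alignment tools also have to handle sentences that were merged, split or dropped during translation.

```python
def pair_segments(source_segments, target_segments):
    """Pair segments one-to-one by position; refuse if the counts differ."""
    if len(source_segments) != len(target_segments):
        raise ValueError("segment counts differ; positional pairing is not safe")
    return list(zip(source_segments, target_segments))

pairs = pair_segments(
    ["Bienvenido a Miami.", "Miami es una ciudad estadounidense."],
    ["Welcome to Miami.", "Miami is a US city."],
)
for source, target in pairs:
    print(source, "->", target)
```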

After all of the segments have been successfully matched, the final step is to create a Translation Memory file. A Translation Memory file stores the aligned document in a special format so that when the file is read it is obvious which segments match together.

Example Aligned Document (TMX document)

<?xml version="1.0"?>
<tmx version="1.4">
  <header creationtool="TM-Town"></header>
  <body>
    <tu>
      <tuv xml:lang="ES">
        <seg>Bienvenido a Miami.</seg>
      </tuv>
      <tuv xml:lang="EN">
        <seg>Welcome to Miami.</seg>
      </tuv>
    </tu>
    <tu>
      <tuv xml:lang="ES">
        <seg>Miami es una ciudad estadounidense ubicada en la parte sureste de Florida alrededor del río Miami, entre los Everglades y el océano Atlántico.</seg>
      </tuv>
      <tuv xml:lang="EN">
        <seg>Miami is a US city located in the southeastern part of Florida around the Miami River, between the Everglades and the Atlantic Ocean.</seg>
      </tuv>
    </tu>
  </body>
</tmx>
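A file with this shape can be produced from a list of aligned pairs with Python's standard `xml.etree.ElementTree` module. This is a minimal sketch that mirrors the simplified example above; the `build_tmx` function and its parameters are hypothetical, and a production TMX file would likely need the additional header attributes the TMX specification calls for.

```python
import xml.etree.ElementTree as ET

# ElementTree writes this qualified name as the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def build_tmx(pairs, source_lang="ES", target_lang="EN"):
    """Return a TMX-style XML string with one <tu> per (source, target) pair."""
    tmx = ET.Element("tmx", {"version": "1.4"})
    ET.SubElement(tmx, "header", {"creationtool": "TM-Town"})
    body = ET.SubElement(tmx, "body")
    for source, target in pairs:
        tu = ET.SubElement(body, "tu")
        for lang, text in ((source_lang, source), (target_lang, target)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")

print(build_tmx([("Bienvenido a Miami.", "Welcome to Miami.")]))
```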

Here’s the really good news!

TM-Town offers an alignment tool that is not only free but is far superior to most other alignment software. There are some open source alignment tools on the web, but none of them are particularly user friendly, to say the least. There is no need for an IT certificate to use TM-Town’s alignment tool. It is simple: you upload the source document, upload the target document, and TM-Town’s system does the rest, creating a new aligned file that you can download in several formats (.tmx, .xliff, .xls, .csv).

TM-Town Alignment Benefits

  • Free
  • Easy to use
  • Safe and secure - your uploaded work is private.
  • Your new TM file is available in multiple industry formats (.tmx, .xliff, .csv, .xls).
  • TM-Town’s system will automatically extract terms from your documents, helping you to easily build term bases and glossaries.
  • Your uploaded work can potentially help you get new work through TM-Town's innovative job matching system.

If you have never used an alignment tool, try out TM-Town and see how easy it is. With TM-Town your previous work is just a click away, which will save you time and effort. TM-Town has some other fantastic free features which I will get into in upcoming posts. In the meantime, please send me your feedback or comment below. I enjoy getting to know the community and it helps me to improve TM-Town.

About the Author

Kevin Dias
TM-Town Developer

Comments (4)

michaelbeijer Michael J.W. Beijer
United Kingdom
Posted over 1 year ago.

Hi Kevin,

I was wondering what alignment engine you are using in the background? For example, are you using Hunalign, or some other open source engine?

diasks2 Kevin Dias
Japan
Posted over 1 year ago.

Hi Michael,

Thanks for your question. The current alignment engine is based on the Gale-Church algorithm, so somewhat similar to Hunalign.
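As a rough illustration of length-based alignment, here is a simplified dynamic programming sketch in the spirit of Gale-Church. It is not TM-Town's engine and not the full probabilistic model from the original algorithm: the `align_by_length` function, the relative-length cost and the penalty values are all assumptions chosen for readability.

```python
def align_by_length(source, target):
    """Align two lists of segments using character lengths only.

    Allowed groupings (source count, target count, fixed penalty):
    1-1, 1-2, 2-1, 1-0 and 0-1. Penalty values are illustrative.
    """
    beads = [(1, 1, 0.0), (1, 2, 0.3), (2, 1, 0.3), (1, 0, 1.0), (0, 1, 1.0)]
    n, m = len(source), len(target)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            for ds, dt, penalty in beads:
                ni, nj = i + ds, j + dt
                if ni > n or nj > m:
                    continue
                len_s = sum(len(s) for s in source[i:ni])
                len_t = sum(len(t) for t in target[j:nj])
                # Relative length difference: 0.0 for equal lengths, up to 1.0.
                mismatch = abs(len_s - len_t) / max(len_s, len_t, 1)
                candidate = cost[i][j] + mismatch + penalty
                if candidate < cost[ni][nj]:
                    cost[ni][nj] = candidate
                    back[ni][nj] = (i, j)
    # Walk the back-pointers from the end to recover the cheapest path.
    alignment = []
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        alignment.append((source[pi:i], target[pj:j]))
        i, j = pi, pj
    alignment.reverse()
    return alignment

# Two short source sentences were merged into one target sentence.
src = ["Bienvenido.", "Esto es Miami.", "Miami es una ciudad estadounidense."]
tgt = ["Welcome, this is Miami.", "Miami is a US city in America."]
for source_group, target_group in align_by_length(src, tgt):
    print(source_group, "<->", target_group)
```

Because the cost depends only on lengths and order, no dictionary or machine translation is needed, which is also why poor segmentation hurts this kind of method so much.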

Actually I have developed my own alignment method which I talk about at the end of this presentation. It is based on 3 main heuristics:
1. Machine translate A -> B and B -> A
2. Relative sentence length
3. The segment's order/position in the document

In my tests it is much more accurate. The reason I don't use it for TM-Town is that it would require sending data to a 3rd party (such as Microsoft Translate) to get the machine translation results, which I can't/won't do (as any documents translators upload to TM-Town are strictly private). I plan to open source it at some point; I just haven't gotten around to it, as there are too many other things to focus on with TM-Town.
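To give a feel for how heuristics 2 and 3 could be combined, here is a small sketch that scores a candidate pair by relative length and relative position. It is not the unreleased method described above: the `pair_score` function, the equal weights and the formulas are hypothetical, and the machine translation heuristic is left out entirely.

```python
def pair_score(src, tgt, src_index, tgt_index, src_count, tgt_count,
               length_weight=0.5, position_weight=0.5):
    """Return a score in [0, 1]; higher means the pair looks more plausible."""
    # Relative sentence length: 1.0 when both segments have the same length.
    length_similarity = min(len(src), len(tgt)) / max(len(src), len(tgt), 1)
    # Relative position: where each segment sits in its document (0.0 to 1.0).
    src_pos = src_index / max(src_count - 1, 1)
    tgt_pos = tgt_index / max(tgt_count - 1, 1)
    position_similarity = 1.0 - abs(src_pos - tgt_pos)
    return length_weight * length_similarity + position_weight * position_similarity

source = ["Bienvenido a Miami.", "Miami es una ciudad estadounidense."]
target = ["Welcome to Miami.", "Miami is a US city."]

# Compare the first source segment against each target segment.
for j, tgt in enumerate(target):
    print(j, round(pair_score(source[0], tgt, 0, j, len(source), len(target)), 3))
```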

Regardless of the alignment method though, I have found that the #1 reason for misalignment is poor segmentation. Therefore I have spent a lot of time on TM-Town's segmentation engine (which is open source).

If you are interested in alignment (or segmentation) I would check out TM-Town's Natural Language Processing page, it has links to research papers in those areas.

hans Hans van den Broek
Indonesia
Posted over 1 year ago.

Excellent! I tried two well-known other alignment tools and one CAT tool, and they failed where TM-T produced a decent result (had to delete only 4 segments, or 2%).

diasks2 Kevin Dias
Japan
Posted over 1 year ago.

Thanks for the comment Hans. Glad to hear you had success with the alignment tool.
