Search Your Translation Memory

You can now search your personal translation memory and terminology files on TM-Town. This new feature searches every document you have uploaded to TM-Town and breaks the results out into relevant segments and relevant terms.

TM-Town's new search feature can be accessed directly from the Your Work page or from the new Search Your Translation Memory page.

Advanced Search

In most cases, simply typing a search query returns the most relevant results. If you need more control, you can take advantage of the advanced features built into TM-Town's search.

Exact match search

For an exact match search, enclose your search query in dollar signs.

$computer$

Exact match search is case-sensitive. This search is useful when you are looking for an exact term in one of your terminology files.

Regular expression search

Regular expression search allows you to search your translation memory and terminology files using a regular expression, a powerful type of "pattern matching".

To search using a regular expression, enclose the expression in forward slashes. Note that the regular expression must match the entire string being searched, not just a part of it.

/.+computer[^s].+/
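To make the query forms concrete, here is a minimal Python sketch of how a query could be classified by its delimiters. The function name `classify_query` and the mode labels are hypothetical and only illustrate the syntax described above; this is not TM-Town's actual code.

```python
from typing import Tuple


def classify_query(query: str) -> Tuple[str, str]:
    """Return (mode, body) for a raw search query.

    Hypothetical helper: the modes mirror the syntax described in this
    post ($...$ for exact match, /.../ for regular expressions).
    """
    if len(query) >= 2 and query[0] == "$" and query[-1] == "$":
        return "exact", query[1:-1]
    if len(query) >= 2 and query[0] == "/" and query[-1] == "/":
        return "regex", query[1:-1]
    # Anything else is treated as an ordinary search query.
    return "plain", query


print(classify_query("$computer$"))          # ('exact', 'computer')
print(classify_query("/.+computer[^s].+/"))  # ('regex', '.+computer[^s].+')
print(classify_query("computer"))            # ('plain', 'computer')
```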

Anchoring

Most regular expression engines allow you to match any part of a string. If you want the regular expression pattern to start at the beginning of the string or finish at the end of the string, then you have to anchor it specifically, using ^ to indicate the beginning or $ to indicate the end.

TM-Town's searches are always anchored. The pattern provided must match the entire string. For string "abcde":

ab.* # match
abcd # no match
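If you want to try the difference yourself, the following Python sketch contrasts unanchored and anchored matching. It uses Python's `re` module as a stand-in; its syntax is not identical to the syntax TM-Town's search uses, but `re.fullmatch` behaves like an always-anchored search for simple patterns such as these.

```python
import re

text = "abcde"

# re.search is unanchored: the pattern may match any part of the string.
unanchored = re.search(r"abcd", text) is not None

# re.fullmatch is anchored at both ends, like the searches described above.
anchored_prefix = re.fullmatch(r"abcd", text) is not None
anchored_whole = re.fullmatch(r"ab.*", text) is not None

print(unanchored, anchored_prefix, anchored_whole)  # True False True
```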

Allowed characters

Any Unicode characters may be used in the pattern, but certain characters are reserved and must be escaped with a backslash. The standard reserved characters are:

. ? + * | { } [ ] ( ) " \
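If your search text contains any of these characters literally, each one needs a backslash in front of it. Below is a small Python sketch of a helper that does this; `escape_reserved` is a hypothetical name, and the character set is simply the list above.

```python
# The reserved characters listed above: . ? + * | { } [ ] ( ) " \
RESERVED = set('.?+*|{}[]()"\\')


def escape_reserved(text: str) -> str:
    """Prefix every reserved character in text with a backslash."""
    return "".join("\\" + ch if ch in RESERVED else ch for ch in text)


print(escape_reserved("C++ (v2.0)"))  # C\+\+ \(v2\.0\)
```

The escaped result can then be placed between forward slashes, optionally surrounded by `.*` on each side so that the pattern still covers the entire string.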

Match any character

The period "." can be used to represent any character. For string "abcde":

ab... # match
a.c.e # match

One-or-more

The plus sign "+" can be used to repeat the preceding shortest pattern one or more times. For string "aaabbb":

a+b+ # match
aa+bb+ # match
a+.+ # match
aa+bbb+ # match

Zero-or-more

The asterisk "*" can be used to match the preceding shortest pattern zero-or-more times. For string "aaabbb":

a*b* # match
a*b*c* # match
.*bbb.* # match
aaa*bbb* # match

Zero-or-one

The question mark "?" makes the preceding shortest pattern optional. It matches zero or one times. For string "aaabbb":

aaa?bbb? # match
aaaa?bbbb? # match
.....?.? # match
aa?bb? # no match
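A selection of the "+", "*" and "?" examples for the string "aaabbb" can be checked with a short Python script. This is only a sketch that uses Python's `re.fullmatch` as an approximation of an anchored search; the helper name `matches` is made up for illustration.

```python
import re

# (pattern, expected result) pairs copied from the examples above.
CASES = [
    ("a+b+", True),
    ("aa+bbb+", True),
    ("a*b*c*", True),
    (".*bbb.*", True),
    ("aaaa?bbbb?", True),
    (".....?.?", True),
    ("aa?bb?", False),
]


def matches(pattern: str, text: str) -> bool:
    """True if pattern matches the whole of text (anchored at both ends)."""
    return re.fullmatch(pattern, text) is not None


for pattern, expected in CASES:
    assert matches(pattern, "aaabbb") == expected, pattern
print("all quantifier examples behave as listed")
```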

Min-to-max

Curly brackets "{}" can be used to specify a minimum and (optionally) a maximum number of times the preceding shortest pattern can repeat. The allowed forms are:

{5} # repeat exactly 5 times
{2,5} # repeat at least twice and at most 5 times
{2,} # repeat at least twice

For string "aaabbb":

a{3}b{3} # match
a{2,4}b{2,4} # match
a{2,}b{2,} # match
.{3}.{3} # match
a{4}b{4} # no match
a{4,6}b{4,6} # no match
a{4,}b{4,} # no match
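One way to get a feel for the three forms is to list which repeat counts each one accepts. The Python sketch below does this for runs of the letter "a" up to length 6; `accepted_lengths` is a hypothetical helper, and Python's `re.fullmatch` stands in for an anchored search.

```python
import re


def accepted_lengths(pattern: str, char: str = "a", limit: int = 6) -> list:
    """Return every n in 0..limit for which pattern matches char repeated n times."""
    return [n for n in range(limit + 1) if re.fullmatch(pattern, char * n)]


print(accepted_lengths("a{3}"))    # [3]
print(accepted_lengths("a{2,4}"))  # [2, 3, 4]
print(accepted_lengths("a{2,}"))   # [2, 3, 4, 5, 6]
print(accepted_lengths("aaa?a?"))  # [2, 3, 4], the same as a{2,4}
```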

Grouping

Parentheses "()" can be used to form sub-patterns. The quantity operators listed above operate on the shortest previous pattern, which can be a group. For string "ababab":

(ab)+ # match
ab(ab)+ # match
(..)+ # match
(...)+ # match
(ab)* # match
abab(ab)? # match
ab(ab)? # no match
(ab){3} # match
(ab){1,2} # no match
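The effect of grouping on a quantity operator is easiest to see by comparing a pattern with and without parentheses. The following Python sketch does so with `re.fullmatch` as an approximation of an anchored search; the helper name `matches` is only illustrative.

```python
import re


def matches(pattern: str, text: str) -> bool:
    """True if pattern matches the whole of text (anchored at both ends)."""
    return re.fullmatch(pattern, text) is not None


# Without a group, "+" repeats only the single character "b".
print(matches("ab+", "abbb"))    # True
print(matches("ab+", "abab"))    # False

# With a group, "+" repeats the whole sub-pattern "ab".
print(matches("(ab)+", "abab"))  # True
print(matches("(ab)+", "abbb"))  # False
```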

Alternation

The pipe symbol "|" acts as an OR operator. The match will succeed if the pattern on either the left-hand side OR the right-hand side matches. The alternation applies to the longest pattern, not the shortest. For string "aabb":

aabb|bbaa # match
aacc|bb # no match
aa(cc|bb) # match
a+|b+ # no match
a+b+|b+a+ # match
a+(b|c)+ # match
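Because alternation applies to the longest pattern on each side, parentheses are often needed to limit its scope. This Python sketch, again using `re.fullmatch` as a stand-in for an anchored search and a made-up helper `matches`, shows the difference for the string "aabb".

```python
import re


def matches(pattern: str, text: str) -> bool:
    """True if pattern matches the whole of text (anchored at both ends)."""
    return re.fullmatch(pattern, text) is not None


# "aacc|bb" reads as "aacc" OR "bb"; neither side covers all of "aabb".
print(matches("aacc|bb", "aabb"))    # False
# Parentheses restrict the alternation to the part after "aa".
print(matches("aa(cc|bb)", "aabb"))  # True

# "a+|b+" means only a's OR only b's; "(a|b)+" allows any mix of the two.
print(matches("a+|b+", "aabb"))      # False
print(matches("(a|b)+", "aabb"))     # True
```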

Character classes

Ranges of potential characters may be represented as character classes by enclosing them in square brackets "[]". A leading ^ negates the character class. The allowed forms are:

[abc] # 'a' or 'b' or 'c'
[a-c] # 'a' or 'b' or 'c'
[-abc] # '-' or 'a' or 'b' or 'c'
[abc\-] # '-' or 'a' or 'b' or 'c'
[^abc] # any character except 'a' or 'b' or 'c'
[^a-c] # any character except 'a' or 'b' or 'c'
[^-abc] # any character except '-' or 'a' or 'b' or 'c'
[^abc\-] # any character except '-' or 'a' or 'b' or 'c'

Note that the dash "-" indicates a range of characters unless it is the first character in the class or is escaped with a backslash.
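The dash rule can be tried out with the short Python sketch below. It relies on Python's `re` module, which treats these simple character classes the same way, and on a hypothetical helper `matches` that requires the whole string to match.

```python
import re


def matches(pattern: str, text: str) -> bool:
    """True if pattern matches the whole of text (anchored at both ends)."""
    return re.fullmatch(pattern, text) is not None


# Between two characters, the dash defines a range: a, b or c.
print(matches("[a-c]+", "abc"))    # True
print(matches("[a-c]+", "a-c"))    # False, "-" is not in the range

# As the first character, or escaped, the dash is a literal dash.
print(matches("[-ac]+", "a-c"))    # True
print(matches(r"[ac\-]+", "a-c"))  # True

# A leading ^ negates the class.
print(matches("[^a-c]+", "xyz"))   # True
print(matches("[^a-c]+", "xbz"))   # False
```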

For string "abcd":

ab[cd]+ # match
[a-d]+ # match
[^a-d]+ # no match

The regular expression pattern examples above are taken from the Elasticsearch documentation.

Getting the most out of your TMs

Remember, any files you upload to TM-Town are automatically private and secure. No one will be able to view or search your files except you.

TM-Town's new search feature is a small step toward helping you leverage your translation memory assets. In the future, TM-Town plans tighter integration with your favorite CAT tool(s) so that you can easily and securely access your private TMs from within those tools.

Get started on TM-Town today and start getting the most out of your previous translations.

Kevin Dias
TM-Town Developer
