Translation memory is a type of software application that allows human translators to reuse existing translations. At its core, translation memory, or TM for short, is simply a database of source sentences and their corresponding translations. When a translator starts on a new sentence, the TM is searched to determine whether that sentence has been translated before. If a match is found, it is presented to the translator so that it can be "dropped in" to the document, which avoids re-translating the sentence from scratch. Even if an exact match cannot be found, similar sentences may be found in the TM and used by the translator. These so-called fuzzy matches give the translator a starting point from which to edit, again eliminating the need to translate from scratch.
Translation memory works at the sentence level. When using TM, the source document is broken down into its component sentences or segments. The term segment is used because in some cases, a chunk of text may not be a complete sentence, for example, in the case of headings. The segment is the smallest unit of text that can be reused when working with TM. Smaller units of text, such as individual words, are not used, because they may occur in different contexts and thus require different translations, and word-for-word translation generally does not produce usable results.
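The segmentation step described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular TM tool works: the function name `segment` and the rule used here (one segment per line, then a split after sentence-ending punctuation) are assumptions, and real tools apply more careful rules for abbreviations, numbers, and markup.

```python
import re


def segment(text: str) -> list[str]:
    """Split text into segments: first by line, then by sentence.

    Hypothetical, simplified rule: a sentence ends at '.', '!' or '?'
    followed by whitespace. A heading on its own line has no closing
    punctuation, so it is kept as a single segment.
    """
    segments = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split after sentence-ending punctuation that is followed by whitespace.
        for part in re.split(r"(?<=[.!?])\s+", line):
            if part:
                segments.append(part)
    return segments


if __name__ == "__main__":
    sample = "Getting Started\nOpen the file. Click Save!"
    for s in segment(sample):
        print(s)
```

The heading "Getting Started" comes out as its own segment even though it is not a complete sentence, which is why the term segment is preferred over sentence.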
As the translator works, each segment for translation is compared to what is stored in the TM, and matches are presented to the translator automatically. A segment in the TM that is identical to the segment for translation is considered a 100% match. At some point in the past, this exact segment was encountered and a translation was provided and stored in the TM. In theory, it can be used exactly as is. If there is no exact match, but there are segments in the TM that are similar to the one being translated, then these are presented as fuzzy matches. Each is ranked by a percentage ranging from 0% to 99%, where higher-percentage matches are closer in content to the segment being translated. A 99% match might differ only in a single letter or punctuation mark, whereas a 75% match might have several different words. Generally, matches below the 70% mark are not useful.
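To make the idea of match percentages concrete, here is a rough sketch of a TM lookup. It assumes the TM is a plain dictionary from source segment to translation, and it uses the character-based similarity ratio from Python's standard `difflib` module as a stand-in for a real scoring method; commercial tools use their own, usually word-aware, algorithms. The names `match_percent` and `lookup` and the sample TM entries are hypothetical.

```python
from difflib import SequenceMatcher


def match_percent(new_segment: str, tm_segment: str) -> int:
    """Return 100 for identical segments, otherwise a fuzzy score from 0 to 99."""
    if new_segment == tm_segment:
        return 100
    ratio = SequenceMatcher(None, new_segment, tm_segment).ratio()
    # Anything that is not identical is capped at 99%.
    return min(99, int(ratio * 100))


def lookup(new_segment: str, tm: dict[str, str], threshold: int = 70):
    """Return (score, source, translation) tuples, best match first.

    Matches scoring below the threshold are dropped; 70 follows the
    rule of thumb in the text.
    """
    matches = []
    for source, translation in tm.items():
        score = match_percent(new_segment, source)
        if score >= threshold:
            matches.append((score, source, translation))
    matches.sort(key=lambda m: m[0], reverse=True)
    return matches


if __name__ == "__main__":
    # Hypothetical English-to-French TM entries.
    tm = {
        "Click the Save button.": "Cliquez sur le bouton Enregistrer.",
        "Close the document.": "Fermez le document.",
    }
    print(lookup("Click the Save button.", tm))  # exact match
    print(lookup("Click the Save button!", tm))  # differs in one character
```

A linear scan like this is fine for illustration but would be slow for a large TM; production systems index the stored segments so that candidates can be found quickly.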
When a document contains several identical segments that are not currently in the TM, these segments are known as repetitions. Most translation memory tools can identify potential repetitions before translation begins. As the translator works, each newly translated segment is added to the TM, so it can become a 100% match or a fuzzy match for later segments in the document. The advantage of repetitions is that once the first occurrence is translated, every later occurrence becomes a 100% match.
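Identifying repetitions before translation starts amounts to counting identical segments that the TM does not already contain. The sketch below assumes the document has already been split into segments and that the TM's source segments are available as a set; the function name `find_repetitions` is hypothetical.

```python
from collections import Counter


def find_repetitions(segments: list[str], tm_sources: set[str]) -> dict[str, int]:
    """Map each repeated segment that is not yet in the TM to its occurrence count."""
    # Segments already in the TM are 100% matches, not repetitions.
    counts = Counter(s for s in segments if s not in tm_sources)
    return {seg: n for seg, n in counts.items() if n > 1}


if __name__ == "__main__":
    document = [
        "Click OK.",
        "Save the file.",
        "Click OK.",
        "Close the window.",
        "Click OK.",
    ]
    existing_tm = {"Save the file."}
    print(find_repetitions(document, existing_tm))
```

In this example "Click OK." occurs three times and is not in the TM, so only its first occurrence needs to be translated from scratch.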
Before translation begins, the file will be analyzed against the TM. This produces statistics about the file: the total word count and the number of words that make up repetitions, 100% matches, and fuzzy matches. This is often referred to as a TM breakdown, as the total word count of the file is broken down into the various TM match categories. For example, a common breakdown used in the industry is:
Each category may have its own price or discount. Repetitions and 100% matches are generally the least expensive, as they require little additional effort by the translator to use, while lower-percentage matches require more editing. Finally, anything below 75% is usually charged at the full per-word rate.
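The effect of per-category pricing can be shown with a small calculation. Everything specific in this sketch is an assumption made for illustration: the category names, the word counts, the per-word rate, and the fraction of the full rate charged for each category are hypothetical and vary between vendors.

```python
def weighted_cost(word_counts: dict[str, int],
                  rate_per_word: float,
                  pay_factors: dict[str, float]) -> float:
    """Sum words * rate * pay factor over all TM match categories."""
    total = 0.0
    for category, words in word_counts.items():
        total += words * rate_per_word * pay_factors[category]
    return round(total, 2)


if __name__ == "__main__":
    # Hypothetical TM breakdown of a 2,000-word file.
    word_counts = {
        "repetitions": 200,
        "100% matches": 300,
        "fuzzy matches (75-99%)": 500,
        "no match (below 75%)": 1000,
    }
    # Hypothetical fraction of the full rate charged per category.
    pay_factors = {
        "repetitions": 0.25,
        "100% matches": 0.25,
        "fuzzy matches (75-99%)": 0.50,
        "no match (below 75%)": 1.00,
    }
    rate = 0.20  # hypothetical full rate per word

    discounted = weighted_cost(word_counts, rate, pay_factors)
    full = round(sum(word_counts.values()) * rate, 2)
    print(f"With TM discounts: {discounted:.2f}")
    print(f"Without TM:        {full:.2f}")
```

With these made-up numbers, half of the words fall into discounted categories, so the total comes out well below the undiscounted price.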
The benefits of using translation memory are reduced translation costs, quicker turnaround times, and increased translation consistency. Since repetitions, 100% matches, and potentially fuzzy matches have a discounted price, the cost of translating a document using a TM with numerous matches is lower. Furthermore, since less work is required of the translator, using TM means the translator can finish that work sooner. Finally, since the translation memory automatically suggests matches to the translator while they work, the translator is more likely to use terminology and phrases consistent with previous translations, which increases quality.