Localization System Overview

Localization of a project involves a few key concepts:

[Figure: OverviewDiagram.png]

Text

Text is the basic unit of localization. Each text is defined by a namespace, a key, a source string, and a display string. Together, the namespace and key form a unique identity by which a text can be referred to. The namespace allows homographs (identical strings with different meanings) to be translated differently by giving each a distinct identity. The key provides specific context regarding the text. The source string is the string in its native, untranslated form. The display string is the string that will be shown, typically a translated form of the source string.

For example, a dialog box may appear in English or Spanish. The dialog box may have a message, an "Ok" button, and a "Cancel" button. All three pieces of text may use the namespace "MyProject". The message text may use a key of "MyMessage", the "Ok" text may use a key of "DialogBox.AffirmativeButtonLabel", and the "Cancel" text may use a key of "DialogBox.NegatoryButtonLabel". Based on the namespace and key, each piece of text can be uniquely identified and translated.
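The identity scheme in the dialog box example can be sketched in a few lines of Python. This is only an illustration, not the engine's data model: the names TextId and TextEntry, the "Save changes?" message, and the fallback to the source string when no display string is set are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class TextId:
    """Hypothetical identity of a piece of text: namespace plus key."""
    namespace: str
    key: str


@dataclass
class TextEntry:
    source: str                    # untranslated, native-form string
    display: Optional[str] = None  # translated string, if one exists

    def shown(self) -> str:
        # Assumption: fall back to the source string when untranslated.
        return self.display if self.display is not None else self.source


# Illustrative table for the dialog box; the message wording is made up.
texts: Dict[TextId, TextEntry] = {
    TextId("MyProject", "MyMessage"): TextEntry("Save changes?"),
    TextId("MyProject", "DialogBox.AffirmativeButtonLabel"): TextEntry("Ok"),
    TextId("MyProject", "DialogBox.NegatoryButtonLabel"): TextEntry("Cancel"),
}

# Supplying a Spanish display string leaves the identity untouched.
texts[TextId("MyProject", "DialogBox.NegatoryButtonLabel")].display = "Cancelar"
```

Because the identity is the (namespace, key) pair rather than the source string, two texts that both read "Cancel" can still receive different translations if they live under different namespaces.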

Targets

Targets are named, self-contained modules of localization data. A target's text is gathered from a specified set of sources, stored in a manifest file, translated in culture-specific archive files, and compiled into culture-specific localization resource files, which the system then loads for display.

A project can have a single target for simplicity or multiple targets in order to break up the project's localization data into separable sections. The Unreal Editor has a separate target from the rest of the Unreal Engine so that the editor can be localized yet the editor's localization data can be withheld from distribution with games. Typically, a game will have one target for all of the base game's localization data and additional targets for expansions.

Cultures

Cultures, also known as locales, define details such as language, script, and region. A culture is identified by a formatted string consisting of a required language code (ISO 639), an optional script code (ISO 15924), and an optional region code (ISO 3166), delimited by dashes or underscores.

Examples of culture codes include "en" (English language), "es-MX" (Spanish language, Mexican region), and "zh-Hans-CN" (Chinese language, Simplified script, Chinese region).
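As a rough sketch of how such codes break down, the following Python splits a culture code into its language, script, and region parts. The Culture type and parse_culture function are hypothetical names, and the pattern assumes the common code lengths (2 or 3 letters for language, 4 letters for script, 2 letters or 3 digits for region). The casing applied on output is a widespread convention, not something the text above requires.

```python
import re
from typing import NamedTuple, Optional


class Culture(NamedTuple):
    language: str
    script: Optional[str]
    region: Optional[str]


# language: 2-3 letters; script: 4 letters; region: 2 letters or 3 digits.
_CULTURE_RE = re.compile(
    r"^(?P<language>[A-Za-z]{2,3})"
    r"(?:[-_](?P<script>[A-Za-z]{4}))?"
    r"(?:[-_](?P<region>[A-Za-z]{2}|[0-9]{3}))?$"
)


def parse_culture(code: str) -> Culture:
    """Split a culture code on dashes or underscores into its parts."""
    match = _CULTURE_RE.match(code)
    if match is None:
        raise ValueError(f"not a recognized culture code: {code!r}")
    script = match.group("script")
    region = match.group("region")
    return Culture(
        language=match.group("language").lower(),
        script=script.title() if script else None,
        region=region.upper() if region else None,
    )
```

With this sketch, parse_culture("zh-Hans-CN") and parse_culture("zh_hans_cn") yield the same three parts, since both delimiters are accepted and casing is normalized.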

Manifests

Manifests store gathered text as source strings, mapped by namespace and key, in human-readable JSON format. Manifests are generated by a commandlet using text gathered by special text-gathering commandlets. Manifests are truncated and created from scratch each time they are generated, so they should not be edited manually.
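To make the mapping concrete, here is a minimal Python sketch that builds a manifest-like structure from gathered text and serializes it as JSON. The nested layout (namespace, then key, then source string) and the build_manifest name are assumptions chosen for illustration; the actual manifest file schema is not described here and will differ.

```python
import json
from typing import Dict, Iterable, Tuple

GatheredText = Tuple[str, str, str]  # (namespace, key, source string)


def build_manifest(gathered: Iterable[GatheredText]) -> Dict[str, Dict[str, str]]:
    """Map namespace -> key -> source string, rebuilt from scratch each call."""
    manifest: Dict[str, Dict[str, str]] = {}
    for namespace, key, source in gathered:
        manifest.setdefault(namespace, {})[key] = source
    return manifest


gathered = [
    ("MyProject", "DialogBox.AffirmativeButtonLabel", "Ok"),
    ("MyProject", "DialogBox.NegatoryButtonLabel", "Cancel"),
]

# Nothing from a previous manifest is reused; the output is always regenerated.
manifest_json = json.dumps(build_manifest(gathered), indent=2, sort_keys=True)
```

Since every run starts from an empty dictionary, any manual edit to an earlier output would be lost, which mirrors why manifests should not be updated by hand.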

Archives

Archives store source strings and their translations, mapped by namespace, in human-readable JSON format. Archives are generated by a commandlet that stubs out all entries from a specified manifest. Because entries in archives do not have keys, all entries from a manifest that share the same source within a namespace are collapsed into a single archive entry; if texts differ only by key, they are assumed to be superficially identical and use the same translation. Archives are updated if they already exist, not truncated. Archives should be provided, as-is or converted into other formats, to translators for processing and returned with translations in place of the empty stub entries.
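The collapsing and update behavior described above can be sketched as follows. The update_archive function and the in-memory shapes are hypothetical, and the SaveDialog and QuitDialog keys are made up for the example. The text does not say what happens to archive entries whose source no longer appears in the manifest; this sketch simply keeps them, which is an assumption.

```python
from typing import Dict, Tuple

Manifest = Dict[str, Dict[str, str]]  # namespace -> key -> source
Archive = Dict[Tuple[str, str], str]  # (namespace, source) -> translation


def update_archive(manifest: Manifest, archive: Archive) -> Archive:
    """Add empty stubs for new manifest entries; keep existing translations."""
    updated = dict(archive)
    for namespace, entries in manifest.items():
        for source in entries.values():
            # Keys are dropped, so identical sources in a namespace collapse.
            updated.setdefault((namespace, source), "")
    return updated


manifest = {
    "MyProject": {
        "SaveDialog.NegatoryButtonLabel": "Cancel",
        "QuitDialog.NegatoryButtonLabel": "Cancel",
        "SaveDialog.AffirmativeButtonLabel": "Ok",
    }
}
existing = {("MyProject", "Cancel"): "Cancelar"}
archive = update_archive(manifest, existing)
```

Here the two "Cancel" texts share one archive entry and keep the translation that was already present, while "Ok" appears as an empty stub awaiting translation.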

Localization Resources (LocRes)

LocRes store translated text in a binary format to be loaded by the system. LocRes are generated by a commandlet that compiles a specified manifest and some number of specified archives.

The system loads LocRes files based on the project settings and the current culture. Localized text from the LocRes of the current culture is used in addition to localized text from the LocRes of all of the current culture's parent cultures. This allows general translations to be made for a language alongside more specific translations for a region. As a basic example, for a single target containing the text "Color" and supporting the cultures "en" (English) and "en-GB" (English for the United Kingdom), the "en" LocRes may have "Color" localized as "Color" while the "en-GB" LocRes may have it localized as "Colour". If the user switches to "en-CA" (English for Canada), but the LocRes for "en-CA" lacks a localization for "Color", the LocRes for "en" will provide "Color".
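The parent-culture fallback can be illustrated with a short Python sketch. It assumes that a culture's parents are found by dropping trailing parts of its code, which is a simplification; the real parent chain is determined by the system and may include other cultures. The names culture_chain and localize are hypothetical, text identities are reduced to a single string, and the United Kingdom is written with its ISO 3166 region code, GB.

```python
from typing import Dict, List, Optional

LocRes = Dict[str, str]  # text identity (simplified to one string) -> translation


def culture_chain(code: str) -> List[str]:
    """Return the culture followed by its parents, most specific first."""
    parts = code.replace("_", "-").split("-")
    return ["-".join(parts[:n]) for n in range(len(parts), 0, -1)]


def localize(text_id: str, culture: str, locres: Dict[str, LocRes]) -> Optional[str]:
    """Look up text in the current culture, then in each parent culture."""
    for candidate in culture_chain(culture):
        translation = locres.get(candidate, {}).get(text_id)
        if translation is not None:
            return translation
    return None  # no culture in the chain provides this text


locres = {
    "en": {"Color": "Color"},
    "en-GB": {"Color": "Colour"},
    "en-CA": {},  # present, but lacks a localization for "Color"
}
```

Looking up "Color" for "en-GB" finds the regional entry first, while the same lookup for "en-CA" falls through to the "en" entry, matching the example above.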