When uploading a structured or unstructured document, an ADMIN or KHUB_ADMIN user must configure the content language in one of the following ways:
- In the content authoring software.
- In a control file.
All language codes in Fluid Topics combine an ISO 639-1 language designator in lowercase and an ISO 3166-1 region designator in uppercase, separated by a hyphen, for example: en-US, fr-CA, and zh-TW. The International Organization for Standardization (ISO) maintains the lists of official designators.
Configuring a content language via a control file is not possible with the Author-it or Paligo connectors because they do not support the Fluid Topics control file.
Example
To define the language of an unstructured document, set the xml:lang attribute in the corresponding control file. xml:lang is an optional attribute of the ft:resource parameter that defines the language of the resource, as shown in the following example:
<ft:resources>
  <ft:resource xml:lang="$LANG_CODE_OF_THE_RESOURCE">
    <ft:file>...</ft:file>
    ...
  </ft:resource>
</ft:resources>
In the above example, xml:lang expects an ISO 639-1 language code in lowercase and an ISO 3166-1 region code in uppercase, separated by a hyphen.
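For instance, a resource written in Canadian French could be declared as follows. This is a minimal sketch: fr-CA is only a sample value, and the file entry and the rest of the control file stay elided, as in the example above.

```xml
<ft:resources>
  <!-- fr-CA: lowercase ISO 639-1 language code, hyphen, uppercase ISO 3166-1 region code -->
  <ft:resource xml:lang="fr-CA">
    <ft:file>...</ft:file>
    ...
  </ft:resource>
</ft:resources>
```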
If ft:resource@xml:lang is not set, Fluid Topics tries to detect the language of the unstructured document automatically. If detection fails, the language is set to en-US.
It is also possible to configure this attribute for the enclosing ft:resources parameter. In this case, the same language is assigned to all unstructured documents in the archive.
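As an illustration of an archive-wide language, the sketch below assumes the attribute is placed on the enclosing ft:resources parameter so that it covers every resource in the archive. That placement is an assumption, zh-TW is only a sample value, and the elided parts are left as in the example above.

```xml
<!-- Assumption: xml:lang on ft:resources applies to every ft:resource below -->
<ft:resources xml:lang="zh-TW">
  <ft:resource>
    <ft:file>...</ft:file>
    ...
  </ft:resource>
  <ft:resource>
    <ft:file>...</ft:file>
    ...
  </ft:resource>
</ft:resources>
```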