originId of unstructured documents - Fluid Topics - Latest

Unstructured Documents Connector Reference Guide


The <originId> element of a control file is a document's identifier.

Fluid Topics uses the <originId> value of a document to determine whether the document is new or is an existing document to update. This means that the value of <originId> must remain consistent over time.

If users do not define the <originId> value of a document, Fluid Topics generates it based on the document's path and filename in the archive. This means that all document filenames must be unique, and the package structure must remain consistent over time.

Do not modify the <originId> value of an existing document, and do not modify its location in archives.
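
As a minimal sketch of how a publishing pipeline might guard against accidental moves or renames, the following Python snippet compares the file paths of two archives. It assumes the archives are zip files. The function names archive_paths and compare_archives are hypothetical helpers and not part of Fluid Topics. The check does not reproduce how Fluid Topics generates <originId> values. It only flags paths that changed between two uploads.

```python
import zipfile


def archive_paths(archive):
    """Return the set of file paths (directories excluded) in a zip archive.

    `archive` may be a filesystem path or a binary file-like object.
    """
    with zipfile.ZipFile(archive) as zf:
        return {info.filename for info in zf.infolist() if not info.is_dir()}


def compare_archives(previous, current):
    """Report paths that disappeared from, or appeared in, the new archive.

    A document that shows up under both "missing" and "added" with a
    different path was probably moved or renamed, which would change a
    generated originId (assumption: the id depends on path and filename,
    as described above).
    """
    old, new = archive_paths(previous), archive_paths(current)
    return {"missing": sorted(old - new), "added": sorted(new - old)}
```

For example, calling compare_archives("previous.zip", "current.zip") before an upload lists any document whose location changed, so that the move can be reverted or an explicit <originId> can be defined instead.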

With unstructured documents, the <originId> value configured in the control file and the <originId> metadata in Fluid Topics are not an exact match. For example, an <originId> configured as myid in the control file becomes documents/myid.document in Fluid Topics.
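
The mapping in the example above can be sketched in Python, which may help when searching Fluid Topics metadata for a document whose control file value is known. The pattern is inferred only from the single myid example. The function names are hypothetical, and the actual rule applied by Fluid Topics may differ in other cases.

```python
PREFIX = "documents/"
SUFFIX = ".document"


def to_ft_origin_id(control_file_origin_id: str) -> str:
    """Build the originId metadata value expected in Fluid Topics.

    Assumption: the "documents/<id>.document" pattern is inferred from
    the single documented example and is not a confirmed general rule.
    """
    return f"{PREFIX}{control_file_origin_id}{SUFFIX}"


def to_control_file_origin_id(ft_origin_id: str) -> str:
    """Recover the control file value from the Fluid Topics metadata value."""
    if not (ft_origin_id.startswith(PREFIX) and ft_origin_id.endswith(SUFFIX)):
        raise ValueError(f"unexpected originId format: {ft_origin_id!r}")
    return ft_origin_id[len(PREFIX):-len(SUFFIX)]


print(to_ft_origin_id("myid"))
```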