When an administrator enables crawl for a portal, Fluid Topics adds the following API path to the root of the portal's URL:
/sitemap.xml
The following example shows a portal's URL with the API path added:
my_fluid_topics_portal.net/sitemap.xml
Fluid Topics also adds the following paths for different categories of content:
- /home.xml for the portal's homepage
- /pages.xml
- /unstructured/<page>.xml
- /structured/<page>.xml
Fluid Topics paginates structured and unstructured documents. Each page contains up to 10,000 documents.
If crawl is disabled, Fluid Topics does not generate a sitemap and replies with an HTTP 404 error if a bot or user tries to access one.
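The pagination described above can be illustrated with a short Python sketch. It lists the sitemap paths a crawler could expect for given document counts. The function name `sitemap_paths`, its parameters, and the assumption that `<page>` is an integer starting at 1 are hypothetical and are not confirmed by this documentation.

```python
import math

PAGE_SIZE = 10_000  # documents per sitemap page, as stated in this section


def sitemap_paths(structured_count, unstructured_count, first_page=1):
    """Return the sitemap paths expected for the given document counts.

    Assumption: <page> is an integer and numbering starts at `first_page`.
    The actual numbering scheme is not specified in this documentation.
    """
    paths = ["/sitemap.xml", "/home.xml", "/pages.xml"]
    categories = (
        ("unstructured", unstructured_count),
        ("structured", structured_count),
    )
    for category, count in categories:
        # A category with no documents produces no pages.
        page_total = math.ceil(count / PAGE_SIZE)
        for page in range(first_page, first_page + page_total):
            paths.append(f"/{category}/{page}.xml")
    return paths


if __name__ == "__main__":
    for path in sitemap_paths(structured_count=25_000, unstructured_count=10_000):
        print(path)
```

With 25,000 structured documents and 10,000 unstructured documents, this sketch yields three structured pages and one unstructured page.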
Even if an administrator restricts access to authenticated users, crawlers are still able to access the tenant's sitemap.xml file with an HTML view of public content. Only content protected by content access rules is absent from the sitemap.xml file.
If a pretty URL exists for a given document, Fluid Topics asks bots to prefer it over the canonical URL when adding a URL to the sitemap.
Customize the sitemap.xml file
It is possible to replace the sitemap.xml file with a custom file. This custom file must be named sitemap.xml and located in the public tenant configuration directory.
When a custom file is deployed, the Fluid Topics server no longer generates or updates the default version of the file.
Antidot cannot be held responsible for any issue caused by the use of a custom sitemap.xml file.
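As an illustration of what a custom sitemap.xml file might contain, the following Python sketch builds a minimal file that follows the public sitemaps.org protocol. The function name `build_sitemap` and the example URL are hypothetical. Fluid Topics may expect additional elements that this documentation does not describe.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a minimal sitemap document as UTF-8 encoded bytes.

    Each URL becomes one <url><loc>...</loc></url> entry. Optional
    elements such as <lastmod> are omitted to keep the sketch short.
    """
    # Serialize with a default namespace instead of an "ns0:" prefix.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for url in urls:
        entry = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = url
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    # Hypothetical URL, for illustration only.
    content = build_sitemap(["https://my_fluid_topics_portal.net/example-page"])
    # The file must be named sitemap.xml, as required above.
    with open("sitemap.xml", "wb") as handle:
        handle.write(content)
```

The resulting file would then be placed in the public tenant configuration directory.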