Get an unstructured document's content - Fluid Topics - Latest

Integrate the Fluid Topics API

Category: Technical Notes
Audience: public
Version: Latest

The Get the content of a document web service retrieves the content of an unstructured document.

/api/khub/documents/$DOC_ID/content

Where:

  • /documents lists all the unstructured documents stored on the Fluid Topics server.
  • /$DOC_ID returns the metadata of a specific unstructured document. The document ID is obtained from the /documents listing.
  • /content retrieves the content of the unstructured document.
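The way these three segments compose can be sketched in a few lines of Python. This is only an illustration: the helper names `document_endpoint` and `content_endpoint` are hypothetical and not part of the Fluid Topics API, and percent-encoding the ID is a precaution assumed here rather than a documented requirement.

```python
from urllib.parse import quote

DOCUMENTS_ENDPOINT = "/api/khub/documents"


def document_endpoint(doc_id: str) -> str:
    """Path that returns the metadata of one unstructured document."""
    # Percent-encode the ID so that reserved characters cannot alter the path.
    return f"{DOCUMENTS_ENDPOINT}/{quote(doc_id, safe='')}"


def content_endpoint(doc_id: str) -> str:
    """Path that returns the content of one unstructured document."""
    return f"{document_endpoint(doc_id)}/content"


print(content_endpoint("n4cQEkKM8SM1f3zRPe_Bfg"))
# prints /api/khub/documents/n4cQEkKM8SM1f3zRPe_Bfg/content
```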

Clustering is not available for this web service.

The following lines show an example of a Python implementation of the Get the content of a document web service:

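import requests

# Base URL of the Fluid Topics portal; replace the placeholders with real values.
# No trailing slash: DOCUMENTS_ENDPOINT already starts with one.
FT_SERVER_URL = 'https://<host>/<serviceId>/<status>'

DOCUMENTS_ENDPOINT = '/api/khub/documents'

HEADERS = {'FT-Authorization': 'Basic ...'}

def crawl_documents():
    # List all unstructured documents.
    url = FT_SERVER_URL + DOCUMENTS_ENDPOINT
    ...

def crawl_document(document_preview):
    # Get the metadata of one document from the listing.
    url = FT_SERVER_URL + document_preview['documentApiEndpoint']
    ...

def crawl_document_content(document_content_preview):
    # Get the content of one document.
    url = FT_SERVER_URL + document_content_preview['contentApiEndpoint']
    ...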
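The function bodies are elided in the example above. One way the three calls might chain together is sketched below, under stated assumptions: the /documents listing is taken to be a JSON array whose entries carry 'documentApiEndpoint', the document metadata is taken to carry 'contentApiEndpoint', and both values are taken to be paths starting with '/'. The names `http_get` and `crawl_contents` are hypothetical. The sketch uses the standard library's urllib instead of requests so that it has no third-party dependency, and the fetch function is injectable so the flow can be exercised without a server.

```python
import json
from typing import Callable, Iterator, Tuple
from urllib.request import Request, urlopen

FT_SERVER_URL = "https://<host>/<serviceId>/<status>"  # placeholder, to be replaced
DOCUMENTS_ENDPOINT = "/api/khub/documents"
HEADERS = {"FT-Authorization": "Basic ..."}  # credentials elided on purpose

Fetch = Callable[[str], bytes]


def http_get(url: str) -> bytes:
    """Default fetcher: GET the URL with the auth header and return the raw body."""
    with urlopen(Request(url, headers=HEADERS)) as response:
        return response.read()


def crawl_contents(
    fetch: Fetch = http_get, base_url: str = FT_SERVER_URL
) -> Iterator[Tuple[str, bytes]]:
    """Yield (content endpoint, raw content) for every listed document."""
    listing = json.loads(fetch(base_url + DOCUMENTS_ENDPOINT))
    for preview in listing:
        # Assumption: each listing entry carries 'documentApiEndpoint'.
        document = json.loads(fetch(base_url + preview["documentApiEndpoint"]))
        # Assumption: the document metadata carries 'contentApiEndpoint'.
        endpoint = document["contentApiEndpoint"]
        yield endpoint, fetch(base_url + endpoint)
```

Passing a stub such as `crawl_contents(fetch=canned_responses.__getitem__, base_url="http://ft")`, where `canned_responses` is a dict mapping URLs to bytes, runs the same flow offline. The content is returned as bytes because an unstructured document may be binary, such as an image or a PDF.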

For example, calling /api/khub/documents/n4cQEkKM8SM1f3zRPe_Bfg/content returns the image Standard Time Zones of the World.jpeg as follows:

![Retrieved image](./../images/time-zones-world.png)
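Because the returned content can be binary, as with the image above, a client will often want to write it to disk unchanged. The following is a minimal sketch, assuming the caller knows the MIME type of the content (for instance from the response's Content-Type header, which this note does not describe). The function name `save_content` and the `.bin` fallback are hypothetical choices.

```python
import mimetypes
from pathlib import Path


def save_content(content: bytes, content_type: str, stem: str, directory: Path) -> Path:
    """Write raw document content to directory/stem.<ext> and return the path."""
    # Drop parameters such as "; charset=utf-8" before the lookup.
    mime_type = content_type.split(";", 1)[0].strip().lower()
    # Fall back to a neutral extension when the MIME type is unknown.
    extension = mimetypes.guess_extension(mime_type) or ".bin"
    target = directory / f"{stem}{extension}"
    target.write_bytes(content)
    return target
```

For example, `save_content(body, "application/pdf", "manual", Path("."))` would write the bytes of `body` to `manual.pdf` in the current directory.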