Retrieve Document Content Buffer

async AsyncCogniteClient.documents.retrieve_content_buffer( buffer: BinaryIO, id: int | None = None, external_id: str | None = None, instance_id: NodeId | None = None, ) → None

Retrieve document content into buffer.

Returns extracted textual information for the given document.

The document pipeline extracts up to 1 MiB of textual information from each processed document. The search and list endpoints truncate the textual content of each document to reduce the size of the returned payload. To get the whole extracted text for a document, use this endpoint.

Parameters:

buffer (BinaryIO) – The document content is streamed directly into the buffer. This is useful for retrieving large documents.
id (int | None) – The server-generated ID for the document you want to retrieve the content of.
external_id (str | None) – External ID of the document.
instance_id (NodeId | None) – Instance ID of the document.

Examples

Retrieve the content of a document with id 123 into local file “my_file.txt”:

>>> from cognite.client import CogniteClient
>>> from pathlib import Path
>>> client = CogniteClient()
>>> # async_client = AsyncCogniteClient()  # another option
>>> with Path("my_file.txt").open("wb") as buffer:
...     client.documents.retrieve_content_buffer(buffer, id=123)

Retrieve the content of a document with external_id “my_document” into local file “my_file.txt”:

>>> with Path("my_file.txt").open("wb") as buffer:
...     client.documents.retrieve_content_buffer(buffer, external_id="my_document")
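
Because buffer only needs to be a BinaryIO, the content can also be collected in memory instead of in a file. The following is a minimal sketch, not part of the SDK: read_document_text is a hypothetical helper, it assumes a synchronous client as in the examples above, and it assumes the extracted text is UTF-8 encoded, which this page does not state.

```python
from io import BytesIO
from typing import Any


def read_document_text(client: Any, document_id: int, encoding: str = "utf-8") -> str:
    """Hypothetical helper: return a document's extracted text as a str.

    `client` is assumed to be a synchronous CogniteClient (or any object
    exposing documents.retrieve_content_buffer with the same signature).
    The default encoding is an assumption, not something the API guarantees.
    """
    buffer = BytesIO()
    # The method returns None; the content is written into the buffer.
    client.documents.retrieve_content_buffer(buffer, id=document_id)
    return buffer.getvalue().decode(encoding)
```

Since the pipeline extracts at most 1 MiB of text per document, holding the result in a BytesIO should be cheap; a file buffer, as in the examples above, remains the simpler choice when the text is only being saved to disk.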