Retrieve pandas dataframe

async AsyncCogniteClient.raw.rows.retrieve_dataframe(
db_name: str,
table_name: str,
min_last_updated_time: int | None = None,
max_last_updated_time: int | None = None,
columns: list[str] | None = None,
limit: int | None = 25,
partitions: int | None = None,
last_updated_time_in_index: bool = False,
infer_dtypes: bool = True,
) → pd.DataFrame

Retrieve rows in a table as a pandas dataframe.

Row keys are used as the index.

Parameters:
  • db_name (str) – Name of the database.

  • table_name (str) – Name of the table.

  • min_last_updated_time (int | None) – Rows must have been last updated after this time. Milliseconds since epoch.

  • max_last_updated_time (int | None) – Rows must have been last updated before this time. Milliseconds since epoch.

  • columns (list[str] | None) – List of column keys. Set to None to retrieve all columns, or pass an empty list, [], to retrieve only row keys.

  • limit (int | None) – The number of rows to retrieve. Defaults to 25. Set to -1, float("inf") or None to return all items.

  • partitions (int | None) – Retrieve rows in parallel using this number of workers. Can be used together with a (large) finite limit. When partitions is not passed, it defaults to 1 (no concurrency) for a finite limit, and to global_config.concurrency_settings.raw.read for an unlimited query (partitions is capped at this value). To avoid unexpected problems and maximize read throughput, see the concurrency limits in the API documentation.

  • last_updated_time_in_index (bool) – Use a MultiIndex with row keys and last_updated_time as index.

  • infer_dtypes (bool) – If True, pandas will try to infer dtypes of the columns. Defaults to True.

Returns:

The requested rows in a pandas dataframe.

Return type:

pd.DataFrame

Examples

Get dataframe:

>>> from cognite.client import CogniteClient, AsyncCogniteClient
>>> client = CogniteClient()
>>> # async_client = AsyncCogniteClient()  # another option
>>> df = client.raw.rows.retrieve_dataframe("db1", "t1", limit=5)
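
Build millisecond timestamps for the time filters:

The min_last_updated_time and max_last_updated_time parameters expect integers in milliseconds since epoch. The sketch below shows one way to derive such values from timezone-aware datetimes using only the standard library. The helper name to_ms_since_epoch and the chosen dates are illustrative assumptions, not part of the SDK, and the final call is left as a comment because it needs a configured client.

```python
from datetime import datetime, timedelta, timezone

# Reference point for "milliseconds since epoch" (Unix epoch, UTC).
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_ms_since_epoch(dt: datetime) -> int:
    """Hypothetical helper: whole milliseconds between the Unix epoch and dt."""
    if dt.tzinfo is None:
        # Naive datetimes are ambiguous, so this sketch rejects them.
        raise ValueError("expected a timezone-aware datetime")
    # Integer division of timedeltas avoids floating-point rounding.
    return (dt - EPOCH) // timedelta(milliseconds=1)


# Illustrative window: January 2024 in UTC.
start_ms = to_ms_since_epoch(datetime(2024, 1, 1, tzinfo=timezone.utc))
end_ms = to_ms_since_epoch(datetime(2024, 2, 1, tzinfo=timezone.utc))

# With a configured client, the values can be passed as the time filters:
# df = client.raw.rows.retrieve_dataframe(
#     "db1",
#     "t1",
#     min_last_updated_time=start_ms,
#     max_last_updated_time=end_ms,
#     limit=None,
# )
```

Using integer timedelta division rather than multiplying dt.timestamp() by 1000 keeps the result an exact int, which matches the int | None type of both parameters.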