Retrieve pandas dataframe

async AsyncCogniteClient.raw.rows.retrieve_dataframe(db_name: str, table_name: str, min_last_updated_time: int | None = None, max_last_updated_time: int | None = None, columns: list[str] | None = None, limit: int | None = 25, partitions: int | None = None, last_updated_time_in_index: bool = False, infer_dtypes: bool = True) → pd.DataFrame

Retrieve rows in a table as a pandas dataframe.

Row keys are used as the index.

Parameters:

db_name (str) – Name of the database.
table_name (str) – Name of the table.
min_last_updated_time (int | None) – Rows must have been last updated after this time. Milliseconds since epoch.
max_last_updated_time (int | None) – Rows must have been last updated before this time. Milliseconds since epoch.
columns (list[str] | None) – List of column keys. Set to None to retrieve all columns; use an empty list, [], to retrieve only row keys.
limit (int | None) – The number of rows to retrieve. Defaults to 25. Set to -1, float("inf") or None to return all items.
partitions (int | None) – Retrieve rows in parallel using this number of workers. Can be used together with a (large) finite limit. When partitions is not passed, it defaults to 1 (no concurrency) for a finite limit, and to global_config.concurrency_settings.raw.read for an unlimited query. An explicitly passed value is capped at that same setting. To prevent unexpected problems and maximize read throughput, check the concurrency limits in the API documentation.
last_updated_time_in_index (bool) – Use a MultiIndex with row keys and last_updated_time as index.
infer_dtypes (bool) – If True, pandas will try to infer dtypes of the columns. Defaults to True.

Returns:

The requested rows in a pandas dataframe.

Return type:

pd.DataFrame

Examples

Get dataframe:

>>> from cognite.client import CogniteClient, AsyncCogniteClient
>>> client = CogniteClient()
>>> # async_client = AsyncCogniteClient()  # another option
>>> df = client.raw.rows.retrieve_dataframe("db1", "t1", limit=5)
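
Build a time window for min_last_updated_time and max_last_updated_time:

Both parameters expect integer milliseconds since epoch, so a datetime has to be converted first. The following is a minimal sketch using only the standard library; the helper name to_epoch_ms and the chosen dates are hypothetical and not part of the SDK. The final call is left as a comment because it needs a configured client.

```python
from datetime import datetime, timedelta, timezone

# Reference point for "milliseconds since epoch" (UTC).
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(dt: datetime) -> int:
    """Hypothetical helper: timezone-aware datetime -> integer ms since epoch.

    Uses integer timedelta division to avoid float rounding.
    Raises TypeError if dt is naive (no tzinfo).
    """
    return (dt - EPOCH) // timedelta(milliseconds=1)


# Example window: the 24 hours before 2024-01-02 00:00 UTC (arbitrary dates).
window_end = datetime(2024, 1, 2, tzinfo=timezone.utc)
window_start = window_end - timedelta(days=1)

min_ms = to_epoch_ms(window_start)
max_ms = to_epoch_ms(window_end)

# Keyword arguments matching the documented parameter names.
query_kwargs = {
    "min_last_updated_time": min_ms,
    "max_last_updated_time": max_ms,
    "limit": None,  # None returns all rows in the window
}

# With a configured client, these would be passed as:
# df = client.raw.rows.retrieve_dataframe("db1", "t1", **query_kwargs)
```

Using timezone-aware datetimes keeps the window independent of the local timezone of the machine running the query.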