Retrieve pandas dataframe
- async AsyncCogniteClient.raw.rows.retrieve_dataframe(
- db_name: str,
- table_name: str,
- min_last_updated_time: int | None = None,
- max_last_updated_time: int | None = None,
- columns: list[str] | None = None,
- limit: int | None = 25,
- partitions: int | None = None,
- last_updated_time_in_index: bool = False,
- infer_dtypes: bool = True,
- ) → pd.DataFrame
Retrieve rows in a table as a pandas dataframe.
Rowkeys are used as the index.
- Parameters:
db_name (str) – Name of the database.
table_name (str) – Name of the table.
min_last_updated_time (int | None) – Rows must have been last updated after this time. Milliseconds since epoch.
max_last_updated_time (int | None) – Rows must have been last updated before this time. Milliseconds since epoch.
columns (list[str] | None) – List of column keys. Set to None to retrieve all columns, or use an empty list, [], to retrieve only row keys.
limit (int | None) – The number of rows to retrieve. Defaults to 25. Set to -1, float("inf") or None to return all items.
partitions (int | None) – Retrieve rows in parallel using this number of workers. Can be used together with a (large) finite limit. When partitions is not passed, it defaults to 1, i.e. no concurrency, for a finite limit and to global_config.concurrency_settings.raw.read for an unlimited query (will be capped at this value). To prevent unexpected problems and maximize read throughput, check out the concurrency limits in the API documentation.
last_updated_time_in_index (bool) – Use a MultiIndex with row keys and last_updated_time as index.
infer_dtypes (bool) – If True, pandas will try to infer dtypes of the columns. Defaults to True.
- Returns:
The requested rows in a pandas dataframe.
- Return type:
pd.DataFrame
Examples
Get dataframe:
>>> from cognite.client import CogniteClient, AsyncCogniteClient
>>> client = CogniteClient()
>>> # async_client = AsyncCogniteClient()  # another option
>>> df = client.raw.rows.retrieve_dataframe("db1", "t1", limit=5)
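Filter by last updated time:

The time filters expect milliseconds since epoch, so a datetime has to be converted first. The following is a minimal sketch, assuming timezone-aware datetimes. The helpers `to_epoch_ms` and `time_window_kwargs` are hypothetical names used for illustration and are not part of the SDK. The actual call is left as a comment because it needs a configured client.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(dt: datetime) -> int:
    """Convert a timezone-aware datetime to whole milliseconds since epoch."""
    if dt.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    # Integer division of timedeltas avoids floating-point rounding.
    return (dt - EPOCH) // timedelta(milliseconds=1)


def time_window_kwargs(start: datetime, end: datetime) -> dict:
    """Build keyword arguments for retrieve_dataframe (hypothetical helper)."""
    return {
        "min_last_updated_time": to_epoch_ms(start),
        "max_last_updated_time": to_epoch_ms(end),
        "columns": [],  # empty list: retrieve only row keys
        "limit": None,  # None: return all matching rows
    }


kwargs = time_window_kwargs(
    datetime(2024, 1, 1, tzinfo=timezone.utc),
    datetime(2024, 2, 1, tzinfo=timezone.utc),
)

# With a configured client (assumed to exist, not created here):
# df = client.raw.rows.retrieve_dataframe("db1", "t1", **kwargs)
```

Because `limit` is None here, the query is unlimited, so the default for `partitions` described above applies unless a value is passed explicitly.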