Data Ingestion

Raw

Databases

List databases

RawDatabasesAPI.list(limit: int | None = 25) DatabaseList

List databases

Parameters

limit (int | None) – Maximum number of databases to return. Defaults to 25. Set to -1, float(“inf”) or None to return all items.

Returns

List of requested databases.

Return type

DatabaseList

Examples

List the first 5 databases:

>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> db_list = client.raw.databases.list(limit=5)

Iterate over databases:

>>> for db in client.raw.databases:
...     db # do something with the db

Iterate over chunks of databases to reduce memory load:

>>> for db_list in client.raw.databases(chunk_size=2500):
...     db_list # do something with the dbs

Create new databases

RawDatabasesAPI.create(name: str) Database
RawDatabasesAPI.create(name: list[str]) DatabaseList

Create one or more databases.

Parameters

name (str | list[str]) – A db name or list of db names to create.

Returns

Database or list of databases that has been created.

Return type

Database | DatabaseList

Examples

Create a new database:

>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> res = client.raw.databases.create("db1")

Delete databases

RawDatabasesAPI.delete(name: Union[str, SequenceNotStr[str]], recursive: bool = False) None

Delete one or more databases.

Parameters
  • name (str | SequenceNotStr[str]) – A db name or list of db names to delete.

  • recursive (bool) – Recursively delete all tables in the database(s).

Examples

Delete a list of databases:

>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> client.raw.databases.delete(["db1", "db2"])

Tables

List tables in a database

RawTablesAPI.list(db_name: str, limit: int | None = 25) TableList

List tables

Parameters
  • db_name (str) – The database to list tables from.

  • limit (int | None) – Maximum number of tables to return. Defaults to 25. Set to -1, float(“inf”) or None to return all items.

Returns

List of requested tables.

Return type

raw.TableList

Examples

List the first 5 tables:

>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> table_list = client.raw.tables.list("db1", limit=5)

Iterate over tables:

>>> for table in client.raw.tables(db_name="db1"):
...     table # do something with the table

Iterate over chunks of tables to reduce memory load:

>>> for table_list in client.raw.tables(db_name="db1", chunk_size=2500):
...     table_list # do something with the tables

Create new tables in a database

RawTablesAPI.create(db_name: str, name: str) Table
RawTablesAPI.create(db_name: str, name: list[str]) TableList

Create one or more tables.

Parameters
  • db_name (str) – Database to create the tables in.

  • name (str | list[str]) – A table name or list of table names to create.

Returns

raw.Table or list of tables that has been created.

Return type

raw.Table | raw.TableList

Examples

Create a new table in a database:

>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> res = client.raw.tables.create("db1", "table1")

Delete tables from a database

RawTablesAPI.delete(db_name: str, name: Union[str, SequenceNotStr[str]]) None

Delete one or more tables.

Parameters
  • db_name (str) – Database to delete tables from.

  • name (str | SequenceNotStr[str]) – A table name or list of table names to delete.

Examples

Delete a list of tables:

>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> res = client.raw.tables.delete("db1", ["table1", "table2"])

Rows

Get a row from a table

RawRowsAPI.retrieve(db_name: str, table_name: str, key: str) cognite.client.data_classes.raw.Row | None

Retrieve a single row by key.

Parameters
  • db_name (str) – Name of the database.

  • table_name (str) – Name of the table.

  • key (str) – The key of the row to retrieve.

Returns

The requested row.

Return type

Row | None

Examples

Retrieve a row with key ‘k1’ from table ‘t1’ in database ‘db1’:

>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> row = client.raw.rows.retrieve("db1", "t1", "k1")

You may access the data directly on the row (like a dict), or use ‘.get’ when keys can be missing:

>>> val1 = row["col1"]
>>> val2 = row.get("col2")

List rows in a table

RawRowsAPI.list(db_name: str, table_name: str, min_last_updated_time: Optional[int] = None, max_last_updated_time: Optional[int] = None, columns: Optional[list[str]] = None, limit: int | None = 25, partitions: Optional[int] = None) RowList

List rows in a table.

Parameters
  • db_name (str) – Name of the database.

  • table_name (str) – Name of the table.

  • min_last_updated_time (int | None) – Rows must have been last updated after this time (exclusive). ms since epoch.

  • max_last_updated_time (int | None) – Rows must have been last updated before this time (inclusive). ms since epoch.

  • columns (list[str] | None) – List of column keys. Set to None for retrieving all, use [] to retrieve only row keys.

  • limit (int | None) – The number of rows to retrieve. Can be used with partitions. Defaults to 25. Set to -1, float(“inf”) or None to return all items.

  • partitions (int | None) – Retrieve rows in parallel using this number of workers. Can be used together with a (large) finite limit. When partitions is not passed, it defaults to 1, i.e. no concurrency for a finite limit and global_config.max_workers for an unlimited query (will be capped at this value). To prevent unexpected problems and maximize read throughput, check out concurrency limits in the API documentation.

Returns

The requested rows.

Return type

RowList

Examples

List a few rows:

>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> row_list = client.raw.rows.list("db1", "tbl1", limit=5)

Read an entire table efficiently by using concurrency (default behavior when limit=None):

>>> row_list = client.raw.rows.list("db1", "tbl1", limit=None)

Iterate through all rows one-by-one to reduce memory load (no concurrency used):

>>> for row in client.raw.rows("db1", "t1", columns=["col1","col2"]):
...     val1 = row["col1"]  # You may access the data directly
...     val2 = row.get("col2")  # ...or use '.get' when keys can be missing

Iterate through all rows, one chunk at a time, to reduce memory load (no concurrency used):

>>> for row_list in client.raw.rows("db1", "t1", chunk_size=2500):
...     row_list  # Do something with the rows

Iterate through a massive table to reduce memory load while using concurrency for high throughput. Note: partitions must be specified for concurrency to be used (this is different from list() to keep backward compatibility). Supplying a finite limit does not affect concurrency settings (except for very small values).

>>> rows_iterator = client.raw.rows(
...     db_name="db1", table_name="t1", partitions=5, chunk_size=5000, limit=1_000_000
... )
>>> for row_list in rows_iterator:
...     row_list  # Do something with the rows

Insert rows into a table

RawRowsAPI.insert(db_name: str, table_name: str, row: collections.abc.Sequence[cognite.client.data_classes.raw.Row] | collections.abc.Sequence[cognite.client.data_classes.raw.RowWrite] | cognite.client.data_classes.raw.Row | cognite.client.data_classes.raw.RowWrite | dict, ensure_parent: bool = False) None

Insert one or more rows into a table.

Parameters
  • db_name (str) – Name of the database.

  • table_name (str) – Name of the table.

  • row (Sequence[Row] | Sequence[RowWrite] | Row | RowWrite | dict) – The row(s) to insert

  • ensure_parent (bool) – Create database/table if they don’t already exist.

Examples

Insert new rows into a table:

>>> from cognite.client import CogniteClient
>>> from cognite.client.data_classes import RowWrite
>>> client = CogniteClient()
>>> rows = [RowWrite(key="r1", columns={"col1": "val1", "col2": "val1"}),
...         RowWrite(key="r2", columns={"col1": "val2", "col2": "val2"})]
>>> client.raw.rows.insert("db1", "table1", rows)

You may also insert a dictionary directly:

>>> rows = {
...     "key-1": {"col1": 1, "col2": 2},
...     "key-2": {"col1": 3, "col2": 4, "col3": "high five"},
... }
>>> client.raw.rows.insert("db1", "table1", rows)

Delete rows from a table

RawRowsAPI.delete(db_name: str, table_name: str, key: Union[str, SequenceNotStr[str]]) None

Delete rows from a table.

Parameters
  • db_name (str) – Name of the database.

  • table_name (str) – Name of the table.

  • key (str | SequenceNotStr[str]) – The key(s) of the row(s) to delete.

Examples

Delete rows from table:

>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> keys_to_delete = ["k1", "k2", "k3"]
>>> client.raw.rows.delete("db1", "table1", keys_to_delete)

Retrieve pandas dataframe

RawRowsAPI.retrieve_dataframe(db_name: str, table_name: str, min_last_updated_time: int | None = None, max_last_updated_time: int | None = None, columns: list[str] | None = None, limit: int | None = 25, partitions: int | None = None, last_updated_time_in_index: bool = False) pd.DataFrame

Retrieve rows in a table as a pandas dataframe.

Rowkeys are used as the index.

Parameters
  • db_name (str) – Name of the database.

  • table_name (str) – Name of the table.

  • min_last_updated_time (int | None) – Rows must have been last updated after this time. ms since epoch.

  • max_last_updated_time (int | None) – Rows must have been last updated before this time. ms since epoch.

  • columns (list[str] | None) – List of column keys. Set to None for retrieving all, use [] to retrieve only row keys.

  • limit (int | None) – The number of rows to retrieve. Defaults to 25. Set to -1, float(“inf”) or None to return all items.

  • partitions (int | None) –

    Retrieve rows in parallel using this number of workers. Can be used together with a (large) finite limit. When partitions is not passed, it defaults to 1, i.e. no concurrency for a finite limit and global_config.max_workers for an unlimited query (will be capped at this value). To prevent unexpected problems and maximize read throughput, check out concurrency limits in the API documentation.

  • last_updated_time_in_index (bool) – Use a MultiIndex with row keys and last_updated_time as index.

Returns

The requested rows in a pandas dataframe.

Return type

pd.DataFrame

Examples

Get dataframe:

>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> df = client.raw.rows.retrieve_dataframe("db1", "t1", limit=5)

Insert pandas dataframe

RawRowsAPI.insert_dataframe(db_name: str, table_name: str, dataframe: pd.DataFrame, ensure_parent: bool = False, dropna: bool = True) None

Insert pandas dataframe into a table

Uses index for row keys.

Parameters
  • db_name (str) – Name of the database.

  • table_name (str) – Name of the table.

  • dataframe (pd.DataFrame) – The dataframe to insert. Index will be used as row keys.

  • ensure_parent (bool) – Create database/table if they don’t already exist.

  • dropna (bool) – Remove NaNs (but keep None’s in dtype=object columns) before inserting. Done individually per column. Default: True

Examples

Insert new rows into a table:

>>> import pandas as pd
>>> from cognite.client import CogniteClient
>>>
>>> client = CogniteClient()
>>> df = pd.DataFrame(
...     {"col-a": [1, 3, None], "col-b": [2, -1, 9]},
...     index=["r1", "r2", "r3"])
>>> res = client.raw.rows.insert_dataframe(
...     "db1", "table1", df, dropna=True)

RAW Data classes

class cognite.client.data_classes.raw.Database(name: str | None = None, created_time: int | None = None, cognite_client: CogniteClient | None = None)

Bases: DatabaseCore

A NoSQL database to store customer data.

Parameters
  • name (str | None) – Unique name of a database.

  • created_time (int | None) – Time the database was created.

  • cognite_client (CogniteClient | None) – The client to associate with this object.

as_write() DatabaseWrite

Returns this Database as a DatabaseWrite

tables(limit: Optional[int] = None) TableList

Get the tables in this database.

Parameters

limit (int | None) – The number of tables to return.

Returns

List of tables in this database.

Return type

TableList

class cognite.client.data_classes.raw.DatabaseCore(name: Optional[str] = None)

Bases: WriteableCogniteResource[DatabaseWrite], ABC

A NoSQL database to store customer data.

Parameters

name (str | None) – Unique name of a database.

class cognite.client.data_classes.raw.DatabaseList(resources: Iterable[Any], cognite_client: CogniteClient | None = None)

Bases: WriteableCogniteResourceList[DatabaseWrite, Database], NameTransformerMixin

as_write() DatabaseWriteList

Returns this DatabaseList as a DatabaseWriteList

class cognite.client.data_classes.raw.DatabaseWrite(name: str)

Bases: DatabaseCore

A NoSQL database to store customer data.

Parameters

name (str) – Unique name of a database.

as_write() DatabaseWrite

Returns this DatabaseWrite instance.

class cognite.client.data_classes.raw.DatabaseWriteList(resources: Iterable[Any], cognite_client: CogniteClient | None = None)

Bases: CogniteResourceList[DatabaseWrite], NameTransformerMixin

class cognite.client.data_classes.raw.Row(key: str | None = None, columns: dict[str, Any] | None = None, last_updated_time: int | None = None, cognite_client: CogniteClient | None = None)

Bases: RowCore

This represents a row in a NO-SQL table. This is the reading version of the Row class, which is used when retrieving a row.

Parameters
  • key (str | None) – Unique row key

  • columns (dict[str, Any] | None) – Row data stored as a JSON object.

  • last_updated_time (int | None) – The number of milliseconds since 00:00:00 Thursday, 1 January 1970, Coordinated Universal Time (UTC), minus leap seconds.

  • cognite_client (CogniteClient | None) – The client to associate with this object.

as_write() RowWrite

Returns this Row as a RowWrite

class cognite.client.data_classes.raw.RowCore(key: Optional[str] = None, columns: Optional[dict[str, Any]] = None)

Bases: WriteableCogniteResource[RowWrite], ABC

No description.

Parameters
  • key (str | None) – Unique row key

  • columns (dict[str, Any] | None) – Row data stored as a JSON object.

to_pandas() pandas.DataFrame

Convert the instance into a pandas DataFrame.

Returns

The pandas DataFrame representing this instance.

Return type

pandas.DataFrame

class cognite.client.data_classes.raw.RowList(resources: Iterable[Any], cognite_client: CogniteClient | None = None)

Bases: RowListCore[Row]

as_write() RowWriteList

Returns this RowList as a RowWriteList

class cognite.client.data_classes.raw.RowListCore(resources: Iterable[Any], cognite_client: CogniteClient | None = None)

Bases: WriteableCogniteResourceList[RowWrite, T_Row], ABC

to_pandas() pandas.DataFrame

Convert the instance into a pandas DataFrame.

Returns

The pandas DataFrame representing this instance.

Return type

pandas.DataFrame

class cognite.client.data_classes.raw.RowWrite(key: str, columns: dict[str, Any])

Bases: RowCore

This represents a row in a NO-SQL table. This is the writing version of the Row class, which is used when creating a row.

Parameters
  • key (str) – Unique row key

  • columns (dict[str, Any]) – Row data stored as a JSON object.

as_write() RowWrite

Returns this RowWrite instance.

class cognite.client.data_classes.raw.RowWriteList(resources: Iterable[Any], cognite_client: CogniteClient | None = None)

Bases: RowListCore[RowWrite]

class cognite.client.data_classes.raw.Table(name: str | None = None, created_time: int | None = None, cognite_client: CogniteClient | None = None)

Bases: TableCore

A NoSQL database table to store customer data. This is the reading version of the Table class, which is used when retrieving a table.

Parameters
  • name (str | None) – Unique name of the table

  • created_time (int | None) – Time the table was created.

  • cognite_client (CogniteClient | None) – The client to associate with this object.

as_write() TableWrite

Returns this Table as a TableWrite

rows(key: str, limit: int | None = None) cognite.client.data_classes.raw.Row | None
rows(key: None = None, limit: int | None = None) RowList

Get the rows in this table.

Parameters
  • key (str | None) – Specify a key to return only that row.

  • limit (int | None) – The number of rows to return.

Returns

List of tables in this database.

Return type

Row | RowList | None

class cognite.client.data_classes.raw.TableCore(name: Optional[str] = None)

Bases: WriteableCogniteResource[TableWrite]

A NoSQL database table to store customer data

Parameters

name (str | None) – Unique name of the table

class cognite.client.data_classes.raw.TableList(resources: Iterable[Any], cognite_client: CogniteClient | None = None)

Bases: WriteableCogniteResourceList[TableWrite, Table], NameTransformerMixin

as_write() TableWriteList

Returns this TableList as a TableWriteList

class cognite.client.data_classes.raw.TableWrite(name: str)

Bases: TableCore

A NoSQL database table to store customer data This is the writing version of the Table class, which is used when creating a table.

Parameters

name (str) – Unique name of the table

as_write() TableWrite

Returns this TableWrite instance.

class cognite.client.data_classes.raw.TableWriteList(resources: Iterable[Any], cognite_client: CogniteClient | None = None)

Bases: CogniteResourceList[TableWrite], NameTransformerMixin

Extraction pipelines

List extraction pipelines

ExtractionPipelinesAPI.list(limit: int | None = 25) ExtractionPipelineList

List extraction pipelines

Parameters

limit (int | None) – Maximum number of ExtractionPipelines to return. Defaults to 25. Set to -1, float(“inf”) or None to return all items.

Returns

List of requested ExtractionPipelines

Return type

ExtractionPipelineList

Examples

List ExtractionPipelines:

>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> ep_list = client.extraction_pipelines.list(limit=5)

Create extraction pipeline

ExtractionPipelinesAPI.create(extraction_pipeline: cognite.client.data_classes.extractionpipelines.ExtractionPipeline | cognite.client.data_classes.extractionpipelines.ExtractionPipelineWrite) ExtractionPipeline
ExtractionPipelinesAPI.create(extraction_pipeline: collections.abc.Sequence[cognite.client.data_classes.extractionpipelines.ExtractionPipeline] | collections.abc.Sequence[cognite.client.data_classes.extractionpipelines.ExtractionPipelineWrite]) ExtractionPipelineList

Create one or more extraction pipelines.

You can create an arbitrary number of extraction pipelines, and the SDK will split the request into multiple requests if necessary.

Parameters

extraction_pipeline (ExtractionPipeline | ExtractionPipelineWrite | Sequence[ExtractionPipeline] | Sequence[ExtractionPipelineWrite]) – Extraction pipeline or list of extraction pipelines to create.

Returns

Created extraction pipeline(s)

Return type

ExtractionPipeline | ExtractionPipelineList

Examples

Create new extraction pipeline:

>>> from cognite.client import CogniteClient
>>> from cognite.client.data_classes import ExtractionPipelineWrite
>>> client = CogniteClient()
>>> extpipes = [ExtractionPipelineWrite(name="extPipe1",...), ExtractionPipelineWrite(name="extPipe2",...)]
>>> res = client.extraction_pipelines.create(extpipes)

Retrieve an extraction pipeline by ID

ExtractionPipelinesAPI.retrieve(id: Optional[int] = None, external_id: Optional[str] = None) cognite.client.data_classes.extractionpipelines.ExtractionPipeline | None

Retrieve a single extraction pipeline by id.

Parameters
  • id (int | None) – ID

  • external_id (str | None) – External ID

Returns

Requested extraction pipeline or None if it does not exist.

Return type

ExtractionPipeline | None

Examples

Get extraction pipeline by id:

>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> res = client.extraction_pipelines.retrieve(id=1)

Get extraction pipeline by external id:

>>> res = client.extraction_pipelines.retrieve(external_id="1")

Retrieve multiple extraction pipelines by ID

ExtractionPipelinesAPI.retrieve_multiple(ids: Optional[Sequence[int]] = None, external_ids: Optional[SequenceNotStr[str]] = None, ignore_unknown_ids: bool = False) ExtractionPipelineList

Retrieve multiple extraction pipelines by ids and external ids.

Parameters
  • ids (Sequence[int] | None) – IDs

  • external_ids (SequenceNotStr[str] | None) – External IDs

  • ignore_unknown_ids (bool) – Ignore IDs and external IDs that are not found rather than throw an exception.

Returns

The requested ExtractionPipelines.

Return type

ExtractionPipelineList

Examples

Get ExtractionPipelines by id:

>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> res = client.extraction_pipelines.retrieve_multiple(ids=[1, 2, 3])

Get assets by external id:

>>> res = client.extraction_pipelines.retrieve_multiple(external_ids=["abc", "def"], ignore_unknown_ids=True)

Update extraction pipelines

ExtractionPipelinesAPI.update(item: cognite.client.data_classes.extractionpipelines.ExtractionPipeline | cognite.client.data_classes.extractionpipelines.ExtractionPipelineWrite | cognite.client.data_classes.extractionpipelines.ExtractionPipelineUpdate) ExtractionPipeline
ExtractionPipelinesAPI.update(item: Sequence[cognite.client.data_classes.extractionpipelines.ExtractionPipeline | cognite.client.data_classes.extractionpipelines.ExtractionPipelineWrite | cognite.client.data_classes.extractionpipelines.ExtractionPipelineUpdate]) ExtractionPipelineList

Update one or more extraction pipelines

Parameters
Returns

Updated extraction pipeline(s)

Return type

ExtractionPipeline | ExtractionPipelineList

Examples

Update an extraction pipeline that you have fetched. This will perform a full update of the extraction pipeline:

>>> from cognite.client import CogniteClient
>>> from cognite.client.data_classes import ExtractionPipelineUpdate
>>> client = CogniteClient()
>>> update = ExtractionPipelineUpdate(id=1)
>>> update.description.set("Another new extpipe")
>>> res = client.extraction_pipelines.update(update)

Delete extraction pipelines

ExtractionPipelinesAPI.delete(id: Optional[Union[int, Sequence[int]]] = None, external_id: Optional[Union[str, SequenceNotStr[str]]] = None) None

Delete one or more extraction pipelines

Parameters
  • id (int | Sequence[int] | None) – Id or list of ids

  • external_id (str | SequenceNotStr[str] | None) – External ID or list of external ids

Examples

Delete extraction pipelines by id or external id:

>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> client.extraction_pipelines.delete(id=[1,2,3], external_id="3")

Extraction pipeline runs

List runs for an extraction pipeline

ExtractionPipelineRunsAPI.list(external_id: str, statuses: Optional[Union[Literal['success', 'failure', 'seen'], Sequence[Literal['success', 'failure', 'seen']], SequenceNotStr[str]]] = None, message_substring: Optional[str] = None, created_time: Optional[Union[dict[str, Any], TimestampRange, str]] = None, limit: int | None = 25) ExtractionPipelineRunList

List runs for an extraction pipeline with given external_id

Parameters
  • external_id (str) – Extraction pipeline external Id.

  • statuses (RunStatus | Sequence[RunStatus] | SequenceNotStr[str] | None) – One or more among “success” / “failure” / “seen”.

  • message_substring (str | None) – Failure message part.

  • created_time (dict[str, Any] | TimestampRange | str | None) – Range between two timestamps. Possible keys are min and max, with values given as timestamps in ms. If a string is passed, it is assumed to be the minimum value.

  • limit (int | None) – Maximum number of ExtractionPipelines to return. Defaults to 25. Set to -1, float(“inf”) or None to return all items.

Returns

List of requested extraction pipeline runs

Return type

ExtractionPipelineRunList

Tip

The created_time parameter can also be passed as a string, to support the most typical usage pattern of fetching the most recent runs, meaning it is implicitly assumed to be the minimum created time. The format is “N[timeunit]-ago”, where timeunit is w,d,h,m (week, day, hour, minute), e.g. “12d-ago”.

Examples

List extraction pipeline runs:

>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> runsList = client.extraction_pipelines.runs.list(external_id="test ext id", limit=5)

Filter extraction pipeline runs on a given status:

>>> runs_list = client.extraction_pipelines.runs.list(external_id="test ext id", statuses=["seen"], limit=5)

Get all failed pipeline runs in the last 24 hours for pipeline ‘extId’:

>>> from cognite.client.data_classes import ExtractionPipelineRun
>>> res = client.extraction_pipelines.runs.list(external_id="extId", statuses="failure", created_time="24h-ago")

Report new runs

ExtractionPipelineRunsAPI.create(run: cognite.client.data_classes.extractionpipelines.ExtractionPipelineRun | cognite.client.data_classes.extractionpipelines.ExtractionPipelineRunWrite) ExtractionPipelineRun
ExtractionPipelineRunsAPI.create(run: collections.abc.Sequence[cognite.client.data_classes.extractionpipelines.ExtractionPipelineRun] | collections.abc.Sequence[cognite.client.data_classes.extractionpipelines.ExtractionPipelineRunWrite]) ExtractionPipelineRunList

Create one or more extraction pipeline runs.

You can create an arbitrary number of extraction pipeline runs, and the SDK will split the request into multiple requests.

Parameters

run (ExtractionPipelineRun | ExtractionPipelineRunWrite | Sequence[ExtractionPipelineRun] | Sequence[ExtractionPipelineRunWrite]) – ExtractionPipelineRun| ExtractionPipelineRunWrite | Sequence[ExtractionPipelineRun] | Sequence[ExtractionPipelineRunWrite]): Extraction pipeline or list of extraction pipeline runs to create.

Returns

Created extraction pipeline run(s)

Return type

ExtractionPipelineRun | ExtractionPipelineRunList

Examples

Report a new extraction pipeline run:

>>> from cognite.client import CogniteClient
>>> from cognite.client.data_classes import ExtractionPipelineRunWrite
>>> client = CogniteClient()
>>> res = client.extraction_pipelines.runs.create(
...     ExtractionPipelineRunWrite(status="success", extpipe_external_id="extId"))

Extraction pipeline configs

Get the latest or a specific config revision

ExtractionPipelineConfigsAPI.retrieve(external_id: str, revision: Optional[int] = None, active_at_time: Optional[int] = None) ExtractionPipelineConfig

Retrieve a specific configuration revision, or the latest by default <https://developer.cognite.com/api#tag/Extraction-Pipelines-Config/operation/getExtPipeConfigRevision>

By default the latest configuration revision is retrieved, or you can specify a timestamp or a revision number.

Parameters
  • external_id (str) – External id of the extraction pipeline to retrieve config from.

  • revision (int | None) – Optionally specify a revision number to retrieve.

  • active_at_time (int | None) – Optionally specify a timestamp the configuration revision should be active.

Returns

Retrieved extraction pipeline configuration revision

Return type

ExtractionPipelineConfig

Examples

Retrieve latest config revision:

>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> res = client.extraction_pipelines.config.retrieve("extId")

List configuration revisions

ExtractionPipelineConfigsAPI.list(external_id: str) ExtractionPipelineConfigRevisionList

Retrieve all configuration revisions from an extraction pipeline <https://developer.cognite.com/api#tag/Extraction-Pipelines-Config/operation/listExtPipeConfigRevisions>

Parameters

external_id (str) – External id of the extraction pipeline to retrieve config from.

Returns

Retrieved extraction pipeline configuration revisions

Return type

ExtractionPipelineConfigRevisionList

Examples

Retrieve a list of config revisions:

>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> res = client.extraction_pipelines.config.list("extId")

Create a config revision

ExtractionPipelineConfigsAPI.create(config: cognite.client.data_classes.extractionpipelines.ExtractionPipelineConfig | cognite.client.data_classes.extractionpipelines.ExtractionPipelineConfigWrite) ExtractionPipelineConfig

Create a new configuration revision <https://developer.cognite.com/api#tag/Extraction-Pipelines-Config/operation/createExtPipeConfig>

Parameters

config (ExtractionPipelineConfig | ExtractionPipelineConfigWrite) – Configuration revision to create.

Returns

Created extraction pipeline configuration revision

Return type

ExtractionPipelineConfig

Examples

Create a config revision:

>>> from cognite.client import CogniteClient
>>> from cognite.client.data_classes import ExtractionPipelineConfigWrite
>>> client = CogniteClient()
>>> res = client.extraction_pipelines.config.create(ExtractionPipelineConfigWrite(external_id="extId", config="my config contents"))

Revert to an earlier config revision

ExtractionPipelineConfigsAPI.revert(external_id: str, revision: int) ExtractionPipelineConfig

Revert to a previous configuration revision <https://developer.cognite.com/api#tag/Extraction-Pipelines-Config/operation/revertExtPipeConfigRevision>

Parameters
  • external_id (str) – External id of the extraction pipeline to revert revision for.

  • revision (int) – Revision to revert to.

Returns

New latest extraction pipeline configuration revision.

Return type

ExtractionPipelineConfig

Examples

Revert a config revision:

>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> res = client.extraction_pipelines.config.revert("extId", 5)

Extractor Config Data classes

class cognite.client.data_classes.extractionpipelines.ExtractionPipeline(id: int | None = None, external_id: str | None = None, name: str | None = None, description: str | None = None, data_set_id: int | None = None, raw_tables: list[dict[str, str]] | None = None, last_success: int | None = None, last_failure: int | None = None, last_message: str | None = None, last_seen: int | None = None, schedule: str | None = None, contacts: list[ExtractionPipelineContact] | None = None, metadata: dict[str, str] | None = None, source: str | None = None, documentation: str | None = None, notification_config: ExtractionPipelineNotificationConfiguration | None = None, created_time: int | None = None, last_updated_time: int | None = None, created_by: str | None = None, cognite_client: CogniteClient | None = None)

Bases: ExtractionPipelineCore

An extraction pipeline is a representation of a process writing data to CDF, such as an extractor or an ETL tool. This is the reading version of the ExtractionPipeline class, which is used when retrieving extraction pipelines.

Parameters
  • id (int | None) – A server-generated ID for the object.

  • external_id (str | None) – The external ID provided by the client. Must be unique for the resource type.

  • name (str | None) – The name of the extraction pipeline.

  • description (str | None) – The description of the extraction pipeline.

  • data_set_id (int | None) – The id of the dataset this extraction pipeline related with.

  • raw_tables (list[dict[str, str]] | None) – list of raw tables in list format: [{“dbName”: “value”, “tableName” : “value”}].

  • last_success (int | None) – Milliseconds value of last success status.

  • last_failure (int | None) – Milliseconds value of last failure status.

  • last_message (str | None) – Message of last failure.

  • last_seen (int | None) – Milliseconds value of last seen status.

  • schedule (str | None) – One of None/On trigger/Continuous/cron regex.

  • contacts (list[ExtractionPipelineContact] | None) – list of contacts

  • metadata (dict[str, str] | None) – Custom, application specific metadata. String key -> String value. Limits: Maximum length of key is 128 bytes, value 10240 bytes, up to 256 key-value pairs, of total size at most 10240.

  • source (str | None) – Source text value for extraction pipeline.

  • documentation (str | None) – Documentation text value for extraction pipeline.

  • notification_config (ExtractionPipelineNotificationConfiguration | None) – Notification configuration for the extraction pipeline.

  • created_time (int | None) – The number of milliseconds since 00:00:00 Thursday, 1 January 1970, Coordinated Universal Time (UTC), minus leap seconds.

  • last_updated_time (int | None) – The number of milliseconds since 00:00:00 Thursday, 1 January 1970, Coordinated Universal Time (UTC), minus leap seconds.

  • created_by (str | None) – Extraction pipeline creator, usually an email.

  • cognite_client (CogniteClient | None) – The client to associate with this object.

as_write() ExtractionPipelineWrite

Returns this ExtractionPipeline as a ExtractionPipelineWrite

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineConfig(external_id: str | None = None, config: str | None = None, revision: int | None = None, description: str | None = None, created_time: int | None = None, cognite_client: CogniteClient | None = None)

Bases: ExtractionPipelineConfigCore

An extraction pipeline config

Parameters
  • external_id (str | None) – The external ID of the associated extraction pipeline.

  • config (str | None) – Contents of this configuration revision.

  • revision (int | None) – The revision number of this config as a positive integer.

  • description (str | None) – Short description of this configuration revision.

  • created_time (int | None) – The number of milliseconds since 00:00:00 Thursday, 1 January 1970, Coordinated Universal Time (UTC), minus leap seconds.

  • cognite_client (CogniteClient | None) – The client to associate with this object.

as_write() ExtractionPipelineConfigWrite

Returns this ExtractionPipelineConfig as a ExtractionPipelineConfigWrite

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineConfigCore(external_id: Optional[str] = None, config: Optional[str] = None, description: Optional[str] = None)

Bases: WriteableCogniteResource[ExtractionPipelineConfigWrite], ABC

An extraction pipeline config

Parameters
  • external_id (str | None) – The external ID of the associated extraction pipeline.

  • config (str | None) – Contents of this configuration revision.

  • description (str | None) – Short description of this configuration revision.

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineConfigList(resources: Iterable[Any], cognite_client: CogniteClient | None = None)

Bases: WriteableCogniteResourceList[ExtractionPipelineConfigWrite, ExtractionPipelineConfig], ExternalIDTransformerMixin

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineConfigRevision(external_id: str | None = None, revision: int | None = None, description: str | None = None, created_time: int | None = None, cognite_client: CogniteClient | None = None)

Bases: CogniteResource

An extraction pipeline config revision

Parameters
  • external_id (str | None) – The external ID of the associated extraction pipeline.

  • revision (int | None) – The revision number of this config as a positive integer.

  • description (str | None) – Short description of this configuration revision.

  • created_time (int | None) – The number of milliseconds since 00:00:00 Thursday, 1 January 1970, Coordinated Universal Time (UTC), minus leap seconds.

  • cognite_client (CogniteClient | None) – The client to associate with this object.

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineConfigRevisionList(resources: Iterable[Any], cognite_client: CogniteClient | None = None)

Bases: CogniteResourceList[ExtractionPipelineConfigRevision], ExternalIDTransformerMixin

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineConfigWrite(external_id: str, config: Optional[str] = None, description: Optional[str] = None)

Bases: ExtractionPipelineConfigCore

An extraction pipeline config

Parameters
  • external_id (str) – The external ID of the associated extraction pipeline.

  • config (str | None) – Contents of this configuration revision.

  • description (str | None) – Short description of this configuration revision.

as_write() ExtractionPipelineConfigWrite

Returns this ExtractionPipelineConfigWrite instance.

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineConfigWriteList(resources: Iterable[Any], cognite_client: CogniteClient | None = None)

Bases: CogniteResourceList[ExtractionPipelineConfigWrite], ExternalIDTransformerMixin

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineContact(name: Optional[str] = None, email: Optional[str] = None, role: Optional[str] = None, send_notification: Optional[bool] = None)

Bases: CogniteObject

A contact for an extraction pipeline

Parameters
  • name (str | None) – Name of contact

  • email (str | None) – Email address of contact

  • role (str | None) – Role of contact, such as Owner, Maintainer, etc.

  • send_notification (bool | None) – Whether to send notifications to this contact or not

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineCore(external_id: Optional[str] = None, name: Optional[str] = None, description: Optional[str] = None, data_set_id: Optional[int] = None, raw_tables: Optional[list[dict[str, str]]] = None, schedule: Optional[str] = None, contacts: Optional[list[cognite.client.data_classes.extractionpipelines.ExtractionPipelineContact]] = None, metadata: Optional[dict[str, str]] = None, source: Optional[str] = None, documentation: Optional[str] = None, notification_config: Optional[ExtractionPipelineNotificationConfiguration] = None, created_by: Optional[str] = None)

Bases: WriteableCogniteResource[ExtractionPipelineWrite], ABC

An extraction pipeline is a representation of a process writing data to CDF, such as an extractor or an ETL tool.

Parameters
  • external_id (str | None) – The external ID provided by the client. Must be unique for the resource type.

  • name (str | None) – The name of the extraction pipeline.

  • description (str | None) – The description of the extraction pipeline.

  • data_set_id (int | None) – The id of the dataset this extraction pipeline related with.

  • raw_tables (list[dict[str, str]] | None) – list of raw tables in list format: [{“dbName”: “value”, “tableName” : “value”}].

  • schedule (str | None) – One of None/On trigger/Continuous/cron regex.

  • contacts (list[ExtractionPipelineContact] | None) – list of contacts

  • metadata (dict[str, str] | None) – Custom, application specific metadata. String key -> String value. Limits: Maximum length of key is 128 bytes, value 10240 bytes, up to 256 key-value pairs, of total size at most 10240.

  • source (str | None) – Source text value for extraction pipeline.

  • documentation (str | None) – Documentation text value for extraction pipeline.

  • notification_config (ExtractionPipelineNotificationConfiguration | None) – Notification configuration for the extraction pipeline.

  • created_by (str | None) – Extraction pipeline creator, usually an email.

dump(camel_case: bool = True) dict[str, Any]

Dump the instance into a json serializable Python data type.

Parameters

camel_case (bool) – Use camelCase for attribute names. Defaults to True.

Returns

A dictionary representation of the instance.

Return type

dict[str, Any]

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineList(resources: Iterable[Any], cognite_client: CogniteClient | None = None)

Bases: WriteableCogniteResourceList[ExtractionPipelineWrite, ExtractionPipeline], IdTransformerMixin

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineNotificationConfiguration(allowed_not_seen_range_in_minutes: Optional[int] = None)

Bases: CogniteObject

Extraction pipeline notification configuration

Parameters

allowed_not_seen_range_in_minutes (int | None) – Time in minutes to pass without any Run. Null if extraction pipeline is not checked.

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineRun(extpipe_external_id: str | None = None, status: str | None = None, message: str | None = None, created_time: int | None = None, cognite_client: CogniteClient | None = None, id: int | None = None)

Bases: ExtractionPipelineRunCore

A representation of an extraction pipeline run.

Parameters
  • extpipe_external_id (str | None) – The external ID of the extraction pipeline.

  • status (str | None) – success/failure/seen.

  • message (str | None) – Optional status message.

  • created_time (int | None) – The number of milliseconds since 00:00:00 Thursday, 1 January 1970, Coordinated Universal Time (UTC), minus leap seconds.

  • cognite_client (CogniteClient | None) – The client to associate with this object.

  • id (int | None) – A server-generated ID for the object.

as_write() ExtractionPipelineRunWrite

Returns this ExtractionPipelineRun as a ExtractionPipelineRunWrite

dump(camel_case: bool = True) dict[str, Any]

Dump the instance into a json serializable Python data type.

Parameters

camel_case (bool) – Use camelCase for attribute names. Defaults to True.

Returns

A dictionary representation of the instance.

Return type

dict[str, Any]

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineRunCore(status: Optional[str] = None, message: Optional[str] = None, created_time: Optional[int] = None)

Bases: WriteableCogniteResource[ExtractionPipelineRunWrite], ABC

A representation of an extraction pipeline run.

Parameters
  • status (str | None) – success/failure/seen.

  • message (str | None) – Optional status message.

  • created_time (int | None) – The number of milliseconds since 00:00:00 Thursday, 1 January 1970, Coordinated Universal Time (UTC), minus leap seconds.

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineRunFilter(external_id: Optional[str] = None, statuses: Optional[SequenceNotStr[str]] = None, message: Optional[StringFilter] = None, created_time: Optional[Union[dict[str, Any], TimestampRange]] = None)

Bases: CogniteFilter

Filter runs with exact matching

Parameters
  • external_id (str | None) – The external ID of related ExtractionPipeline provided by the client. Must be unique for the resource type.

  • statuses (SequenceNotStr[str] | None) – success/failure/seen.

  • message (StringFilter | None) – message filter.

  • created_time (dict[str, Any] | TimestampRange | None) – Range between two timestamps.

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineRunList(resources: Iterable[Any], cognite_client: CogniteClient | None = None)

Bases: WriteableCogniteResourceList[ExtractionPipelineRunWrite, ExtractionPipelineRun], IdTransformerMixin

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineRunWrite(extpipe_external_id: str, status: Literal['success', 'failure', 'seen'], message: Optional[str] = None, created_time: Optional[int] = None)

Bases: ExtractionPipelineRunCore

A representation of an extraction pipeline run. This is the writing version of the ExtractionPipelineRun class, which is used when creating extraction pipeline runs.

Parameters
  • extpipe_external_id (str) – The external ID of the extraction pipeline.

  • status (Literal['success', 'failure', 'seen']) – success/failure/seen.

  • message (str | None) – Optional status message.

  • created_time (int | None) – The number of milliseconds since 00:00:00 Thursday, 1 January 1970, Coordinated Universal Time (UTC), minus leap seconds.

as_write() ExtractionPipelineRunWrite

Returns this ExtractionPipelineRunWrite instance.

dump(camel_case: bool = True) dict[str, Any]

Dump the instance into a json serializable Python data type.

Parameters

camel_case (bool) – Use camelCase for attribute names. Defaults to True.

Returns

A dictionary representation of the instance.

Return type

dict[str, Any]

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineRunWriteList(resources: Iterable[Any], cognite_client: CogniteClient | None = None)

Bases: CogniteResourceList[ExtractionPipelineRunWrite]

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineUpdate(id: Optional[int] = None, external_id: Optional[str] = None)

Bases: CogniteUpdate

Changes applied to an extraction pipeline

Parameters
  • id (int) – A server-generated ID for the object.

  • external_id (str) – The external ID provided by the client. Must be unique for the resource type.

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineWrite(external_id: str, name: str, data_set_id: int, description: Optional[str] = None, raw_tables: Optional[list[dict[str, str]]] = None, schedule: Optional[str] = None, contacts: Optional[list[cognite.client.data_classes.extractionpipelines.ExtractionPipelineContact]] = None, metadata: Optional[dict[str, str]] = None, source: Optional[str] = None, documentation: Optional[str] = None, notification_config: Optional[ExtractionPipelineNotificationConfiguration] = None, created_by: Optional[str] = None)

Bases: ExtractionPipelineCore

An extraction pipeline is a representation of a process writing data to CDF, such as an extractor or an ETL tool. This is the writing version of the ExtractionPipeline class, which is used when creating extraction pipelines.

Parameters
  • external_id (str) – The external ID provided by the client. Must be unique for the resource type.

  • name (str) – The name of the extraction pipeline.

  • data_set_id (int) – The id of the dataset this extraction pipeline related with.

  • description (str | None) – The description of the extraction pipeline.

  • raw_tables (list[dict[str, str]] | None) – list of raw tables in list format: [{“dbName”: “value”, “tableName” : “value”}].

  • schedule (str | None) – One of None/On trigger/Continuous/cron regex.

  • contacts (list[ExtractionPipelineContact] | None) – list of contacts

  • metadata (dict[str, str] | None) – Custom, application specific metadata. String key -> String value. Limits: Maximum length of key is 128 bytes, value 10240 bytes, up to 256 key-value pairs, of total size at most 10240.

  • source (str | None) – Source text value for extraction pipeline.

  • documentation (str | None) – Documentation text value for extraction pipeline.

  • notification_config (ExtractionPipelineNotificationConfiguration | None) – Notification configuration for the extraction pipeline.

  • created_by (str | None) – Extraction pipeline creator, usually an email.

as_write() ExtractionPipelineWrite

Returns this ExtractionPipelineWrite instance.

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineWriteList(resources: Iterable[Any], cognite_client: CogniteClient | None = None)

Bases: CogniteResourceList[ExtractionPipelineWrite], ExternalIDTransformerMixin

class cognite.client.data_classes.extractionpipelines.StringFilter(substring: Optional[str] = None)

Bases: CogniteFilter

Filter runs on substrings of the message

Parameters

substring (str | None) – Part of message