Data Ingestion

Raw

Databases

List databases

async RawDatabasesAPI.list( limit: int | None = 25, ) → DatabaseList

Parameters:: limit (int | None) – Maximum number of databases to return. Defaults to 25. Set to -1, float(“inf”) or None to return all items.
Returns:: List of requested databases.
Return type:: DatabaseList

Examples

List the first 5 databases:

>>> from cognite.client import CogniteClient, AsyncCogniteClient
>>> client = CogniteClient()
>>> # async_client = AsyncCogniteClient()  # another option
>>> db_list = client.raw.databases.list(limit=5)

Iterate over databases, one-by-one:

>>> for db in client.raw.databases():
...     db  # do something with the db

Iterate over chunks of databases to reduce memory load:

>>> for db_list in client.raw.databases(chunk_size=2500):
...     db_list # do something with the dbs

Create new databases

async RawDatabasesAPI.create( name: str | list[str], ) → Database | DatabaseList

Create one or more databases.

Parameters:: name (str | list[str]) – A db name or list of db names to create.
Returns:: Database or list of databases that has been created.
Return type:: Database | DatabaseList

Examples

Create a new database:

>>> from cognite.client import CogniteClient, AsyncCogniteClient
>>> client = CogniteClient()
>>> # async_client = AsyncCogniteClient()  # another option
>>> res = client.raw.databases.create("db1")

Delete databases

async RawDatabasesAPI.delete( name: str | SequenceNotStr[str], recursive: bool = False, ) → None

Delete one or more databases.

Parameters:

name (str | SequenceNotStr[str]) – A db name or list of db names to delete.
recursive (bool) – Recursively delete all tables in the database(s).

Examples

Delete a list of databases:

>>> from cognite.client import CogniteClient, AsyncCogniteClient
>>> client = CogniteClient()
>>> # async_client = AsyncCogniteClient()  # another option
>>> client.raw.databases.delete(["db1", "db2"])

Tables

List tables in a database

async RawTablesAPI.list( db_name: str, limit: int | None = 25, ) → TableList

List tables

Parameters:

db_name (str) – The database to list tables from.
limit (int | None) – Maximum number of tables to return. Defaults to 25. Set to -1, float(“inf”) or None to return all items.

Returns:

List of requested tables.

Return type:

raw.TableList

Examples

List the first 5 tables:

>>> from cognite.client import CogniteClient, AsyncCogniteClient
>>> client = CogniteClient()
>>> # async_client = AsyncCogniteClient()  # another option
>>> table_list = client.raw.tables.list("db1", limit=5)

Iterate over tables, one-by-one:

>>> for table in client.raw.tables(db_name="db1"):
...     table  # do something with the table

Iterate over chunks of tables to reduce memory load:

>>> for table_list in client.raw.tables(db_name="db1", chunk_size=25):
...     table_list # do something with the tables

Create new tables in a database

async RawTablesAPI.create( db_name: str, name: str | list[str], ) → Table | TableList

Create one or more tables.

Parameters:

db_name (str) – Database to create the tables in.
name (str | list[str]) – A table name or list of table names to create.

Returns:

raw.Table or list of tables that has been created.

Return type:

raw.Table | raw.TableList

Examples

Create a new table in a database:

>>> from cognite.client import CogniteClient, AsyncCogniteClient
>>> client = CogniteClient()
>>> # async_client = AsyncCogniteClient()  # another option
>>> res = client.raw.tables.create("db1", "table1")

Delete tables from a database

async RawTablesAPI.delete( db_name: str, name: str | SequenceNotStr[str], ) → None

Delete one or more tables.

Parameters:

db_name (str) – Database to delete tables from.
name (str | SequenceNotStr[str]) – A table name or list of table names to delete.

Examples

Delete a list of tables:

>>> from cognite.client import CogniteClient, AsyncCogniteClient
>>> client = CogniteClient()
>>> # async_client = AsyncCogniteClient()  # another option
>>> res = client.raw.tables.delete("db1", ["table1", "table2"])

Rows

Get a row from a table

async RawRowsAPI.retrieve( db_name: str, table_name: str, key: str, ) → Row | None

Retrieve a single row by key.

Parameters:

db_name (str) – Name of the database.
table_name (str) – Name of the table.
key (str) – The key of the row to retrieve.

Returns:

The requested row.

Return type:

Row | None

Examples

Retrieve a row with key ‘k1’ from table ‘t1’ in database ‘db1’:

>>> from cognite.client import CogniteClient, AsyncCogniteClient
>>> client = CogniteClient()
>>> # async_client = AsyncCogniteClient()  # another option
>>> row = client.raw.rows.retrieve("db1", "t1", "k1")

You may access the data directly on the row (like a dict), or use ‘.get’ when keys can be missing:

>>> val1 = row["col1"]
>>> val2 = row.get("col2")

List rows in a table

List rows in a table.

Parameters:

db_name (str) – Name of the database.
table_name (str) – Name of the table.
min_last_updated_time (int | None) – Rows must have been last updated after this time (exclusive). Milliseconds since epoch.
max_last_updated_time (int | None) – Rows must have been last updated before this time (inclusive). Milliseconds since epoch.
columns (list[str] | None) – List of column keys. Set to None to retrieving all, use empty list, [], to retrieve only row keys.
limit (int | None) – The number of rows to retrieve. Can be used with partitions. Defaults to 25. Set to -1, float(“inf”) or None to return all items.
partitions (int | None) – Retrieve rows in parallel using this number of workers. Can be used together with a (large) finite limit. When partitions is not passed, it defaults to 1, i.e. no concurrency for a finite limit and global_config.concurrency_settings.raw.read for an unlimited query (will be capped at this value). To prevent unexpected problems and maximize read throughput, check out concurrency limits in the API documentation.

Returns:

The requested rows.

Return type:

RowList

Examples

List a few rows:

>>> from cognite.client import CogniteClient, AsyncCogniteClient
>>> client = CogniteClient()
>>> # async_client = AsyncCogniteClient()  # another option
>>> row_list = client.raw.rows.list("db1", "tbl1", limit=5)

Read an entire table efficiently by using concurrency (default behavior when limit=None):

>>> row_list = client.raw.rows.list("db1", "tbl1", limit=None)

Iterate through all rows one-by-one to reduce memory load (no concurrency used):

>>> for row in client.raw.rows("db1", "t1", columns=["col1","col2"]):
...     val1 = row["col1"]  # You may access the data directly
...     val2 = row.get("col2")  # ...or use '.get' when keys can be missing

Iterate through all rows, one chunk at a time, to reduce memory load (no concurrency used):

>>> for row_list in client.raw.rows("db1", "t1", chunk_size=2500):
...     row_list  # Do something with the rows

Iterate through a massive table to reduce memory load while using concurrency for high throughput. Note: partitions must be specified for concurrency to be used (this is different from list() to keep backward compatibility). Supplying a finite limit does not affect concurrency settings (except for very small values).

>>> rows_iterator = client.raw.rows(
...     db_name="db1", table_name="t1", partitions=5, chunk_size=5000, limit=1_000_000
... )
>>> for row_list in rows_iterator:
...     row_list  # Do something with the rows

Insert rows into a table

async RawRowsAPI.insert( db_name: str, table_name: str, row: Sequence[Row] | Sequence[RowWrite] | Row | RowWrite | dict, ensure_parent: bool = False, ) → None

Insert one or more rows into a table.

Parameters:

db_name (str) – Name of the database.
table_name (str) – Name of the table.
row (Sequence[Row] | Sequence[RowWrite] | Row | RowWrite | dict) – The row(s) to insert
ensure_parent (bool) – Create database/table if they don’t already exist.

Examples

Insert new rows into a table:

>>> from cognite.client import CogniteClient
>>> from cognite.client.data_classes import RowWrite
>>> client = CogniteClient()
>>> # async_client = AsyncCogniteClient()  # another option
>>> rows = [RowWrite(key="r1", columns={"col1": "val1", "col2": "val1"}),
...         RowWrite(key="r2", columns={"col1": "val2", "col2": "val2"})]
>>> client.raw.rows.insert("db1", "table1", rows)

You may also insert a dictionary directly:

>>> rows = {
...     "key-1": {"col1": 1, "col2": 2},
...     "key-2": {"col1": 3, "col2": 4, "col3": "high five"},
... }
>>> client.raw.rows.insert("db1", "table1", rows)

Delete rows from a table

async RawRowsAPI.delete( db_name: str, table_name: str, key: str | SequenceNotStr[str], ) → None

Delete rows from a table.

Parameters:

db_name (str) – Name of the database.
table_name (str) – Name of the table.
key (str | SequenceNotStr[str]) – The key(s) of the row(s) to delete.

Examples

Delete rows from table:

>>> from cognite.client import CogniteClient, AsyncCogniteClient
>>> client = CogniteClient()
>>> # async_client = AsyncCogniteClient()  # another option
>>> keys_to_delete = ["k1", "k2", "k3"]
>>> client.raw.rows.delete("db1", "table1", keys_to_delete)

Retrieve pandas dataframe

async RawRowsAPI.retrieve_dataframe( db_name: str, table_name: str, min_last_updated_time: int | None = None, max_last_updated_time: int | None = None, columns: list[str] | None = None, limit: int | None = 25, partitions: int | None = None, last_updated_time_in_index: bool = False, infer_dtypes: bool = True, ) → pd.DataFrame

Retrieve rows in a table as a pandas dataframe.

Rowkeys are used as the index.

Parameters:

db_name (str) – Name of the database.
table_name (str) – Name of the table.
min_last_updated_time (int | None) – Rows must have been last updated after this time. Milliseconds since epoch.
max_last_updated_time (int | None) – Rows must have been last updated before this time. Milliseconds since epoch.
columns (list[str] | None) – List of column keys. Set to None to retrieving all, use empty list, [], to retrieve only row keys.
limit (int | None) – The number of rows to retrieve. Defaults to 25. Set to -1, float(“inf”) or None to return all items.
partitions (int | None) –
Retrieve rows in parallel using this number of workers. Can be used together with a (large) finite limit. When partitions is not passed, it defaults to 1, i.e. no concurrency for a finite limit and global_config.concurrency_settings.raw.read for an unlimited query (will be capped at this value). To prevent unexpected problems and maximize read throughput, check out concurrency limits in the API documentation.
last_updated_time_in_index (bool) – Use a MultiIndex with row keys and last_updated_time as index.
infer_dtypes (bool) – If True, pandas will try to infer dtypes of the columns. Defaults to True.

Returns:

The requested rows in a pandas dataframe.

Return type:

pd.DataFrame

Examples

Get dataframe:

>>> from cognite.client import CogniteClient, AsyncCogniteClient
>>> client = CogniteClient()
>>> # async_client = AsyncCogniteClient()  # another option
>>> df = client.raw.rows.retrieve_dataframe("db1", "t1", limit=5)

Insert pandas dataframe

async RawRowsAPI.insert_dataframe( db_name: str, table_name: str, dataframe: pd.DataFrame, ensure_parent: bool = False, dropna: bool = True, ) → None

Insert pandas dataframe into a table

Uses index for row keys.

Parameters:

db_name (str) – Name of the database.
table_name (str) – Name of the table.
dataframe (pd.DataFrame) – The dataframe to insert. Index will be used as row keys.
ensure_parent (bool) – Create database/table if they don’t already exist.
dropna (bool) – Remove NaNs (but keep None’s in dtype=object columns) before inserting. Done individually per column. Default: True

Examples

Insert new rows into a table:

>>> import pandas as pd
>>> from cognite.client import CogniteClient
>>>
>>> client = CogniteClient()
>>> # async_client = AsyncCogniteClient()  # another option
>>> df = pd.DataFrame(
...     {"col-a": [1, 3, None], "col-b": [2, -1, 9]},
...     index=["r1", "r2", "r3"])
>>> res = client.raw.rows.insert_dataframe(
...     "db1", "table1", df, dropna=True)

RAW Data classes

class cognite.client.data_classes.raw.Database(name: str, created_time: int | None)

Bases: WriteableCogniteResourceWithClientRef[DatabaseWrite]

A NoSQL database to store customer data.

Parameters:

name (str) – Unique name of a database.
created_time (int | None) – Time the database was created.

as_write() → DatabaseWrite: Returns this Database as a DatabaseWrite

tables( limit: int | None = None, ) → TableList

Get the tables in this database.

Parameters:: limit (int | None) – The number of tables to return.
Returns:: List of tables in this database.
Return type:: TableList

async tables_async( limit: int | None = None, ) → TableList

Get the tables in this database.

Parameters:: limit (int | None) – The number of tables to return.
Returns:: List of tables in this database.
Return type:: TableList

class cognite.client.data_classes.raw.DatabaseList( resources: Sequence[T_CogniteResource], )

Bases: WriteableCogniteResourceList[DatabaseWrite, Database], NameTransformerMixin

as_write() → DatabaseWriteList: Returns this DatabaseList as a DatabaseWriteList

class cognite.client.data_classes.raw.DatabaseWrite(name: str)

Bases: WriteableCogniteResource[DatabaseWrite]

A NoSQL database to store customer data.

Parameters:: name (str) – Unique name of a database.

as_write() → DatabaseWrite: Returns this DatabaseWrite instance.

class cognite.client.data_classes.raw.DatabaseWriteList( resources: Sequence[T_CogniteResource], ): Bases: CogniteResourceList[DatabaseWrite], NameTransformerMixin

class cognite.client.data_classes.raw.Row(key: str, columns: dict[str, Any], last_updated_time: int)

Bases: RowCore

This represents a row in a NO-SQL table. This is the read version of the Row class, which is used when retrieving a row.

Parameters:

key (str) – Unique row key
columns (dict[str, Any]) – Row data stored as a JSON object.
last_updated_time (int) – The number of milliseconds since 00:00:00 Thursday, 1 January 1970, Coordinated Universal Time (UTC), minus leap seconds.

as_write() → RowWrite: Returns this Row as a RowWrite

class cognite.client.data_classes.raw.RowCore(key: str, columns: dict[str, Any])

Bases: WriteableCogniteResource[RowWrite], ABC

No description.

Parameters:

key (str) – Unique row key
columns (dict[str, Any]) – Row data stored as a JSON object.

to_pandas() → pandas.DataFrame

Convert the instance into a pandas DataFrame.

Returns:: The pandas DataFrame representing this instance.
Return type:: pandas.DataFrame

class cognite.client.data_classes.raw.RowList( resources: Sequence[T_CogniteResource], )

Bases: RowListCore[Row]

as_write() → RowWriteList: Returns this RowList as a RowWriteList

class cognite.client.data_classes.raw.RowListCore( resources: Sequence[T_CogniteResource], )

Bases: WriteableCogniteResourceList[RowWrite, T_Row], ABC

to_pandas() → pandas.DataFrame

Convert the instance into a pandas DataFrame.

Returns:: The pandas DataFrame representing this instance.
Return type:: pandas.DataFrame

class cognite.client.data_classes.raw.RowWrite(key: str, columns: dict[str, Any])

Bases: RowCore

This represents a row in a NO-SQL table. This is the write version of the Row class, which is used when creating a row.

Parameters:

key (str) – Unique row key
columns (dict[str, Any]) – Row data stored as a JSON object.

as_write() → RowWrite: Returns this RowWrite instance.

class cognite.client.data_classes.raw.RowWriteList( resources: Sequence[T_CogniteResource], ): Bases: RowListCore[RowWrite]

class cognite.client.data_classes.raw.Table(name: str, created_time: int | None)

Bases: WriteableCogniteResourceWithClientRef[TableWrite]

A NoSQL database table to store customer data. This is the read version of the Table class, which is used when retrieving a table.

Parameters:

name (str) – Unique name of the table
created_time (int | None) – Time the table was created.

as_write() → TableWrite: Returns this Table as a TableWrite

rows( key: str | None = None, limit: int | None = None, ) → Row | RowList | None

Get the rows in this table.

Parameters:

key (str | None) – Specify a key to return only that row.
limit (int | None) – The number of rows to return.

Returns:

List of tables in this database.

Return type:

Row | RowList | None

async rows_async( key: str | None = None, limit: int | None = None, ) → Row | RowList | None

Get the rows in this table.

Parameters:

key (str | None) – Specify a key to return only that row.
limit (int | None) – The number of rows to return.

Returns:

List of tables in this database.

Return type:

Row | RowList | None

class cognite.client.data_classes.raw.TableList( resources: Sequence[T_CogniteResource], )

Bases: WriteableCogniteResourceList[TableWrite, Table], NameTransformerMixin

as_write() → TableWriteList: Returns this TableList as a TableWriteList

class cognite.client.data_classes.raw.TableWrite(name: str)

Bases: WriteableCogniteResource[TableWrite]

A NoSQL database table to store customer data This is the write version of the Table class, which is used when creating a table.

Parameters:: name (str) – Unique name of the table

as_write() → TableWrite: Returns this TableWrite instance.

class cognite.client.data_classes.raw.TableWriteList( resources: Sequence[T_CogniteResource], ): Bases: CogniteResourceList[TableWrite], NameTransformerMixin

Extraction pipelines

List extraction pipelines

async ExtractionPipelinesAPI.list( limit: int | None = 25, ) → ExtractionPipelineList

List extraction pipelines

Parameters:: limit (int | None) – Maximum number of ExtractionPipelines to return. Defaults to 25. Set to -1, float(“inf”) or None to return all items.
Returns:: List of requested ExtractionPipelines
Return type:: ExtractionPipelineList

Examples

List ExtractionPipelines:

>>> from cognite.client import CogniteClient, AsyncCogniteClient
>>> client = CogniteClient()
>>> # async_client = AsyncCogniteClient()  # another option
>>> ep_list = client.extraction_pipelines.list(limit=5)

Create extraction pipeline

async ExtractionPipelinesAPI.create( extraction_pipeline: ExtractionPipeline | ExtractionPipelineWrite | Sequence[ExtractionPipeline] | Sequence[ExtractionPipelineWrite], ) → ExtractionPipeline | ExtractionPipelineList

Create one or more extraction pipelines.

You can create an arbitrary number of extraction pipelines, and the SDK will split the request into multiple requests if necessary.

Parameters:: extraction_pipeline (ExtractionPipeline | ExtractionPipelineWrite | Sequence[ExtractionPipeline] | Sequence[ExtractionPipelineWrite]) – Extraction pipeline or list of extraction pipelines to create.
Returns:: Created extraction pipeline(s)
Return type:: ExtractionPipeline | ExtractionPipelineList

Examples

Create new extraction pipeline:

>>> from cognite.client import CogniteClient
>>> from cognite.client.data_classes import ExtractionPipelineWrite
>>> client = CogniteClient()
>>> # async_client = AsyncCogniteClient()  # another option
>>> extpipes = [ExtractionPipelineWrite(name="extPipe1",...), ExtractionPipelineWrite(name="extPipe2",...)]
>>> res = client.extraction_pipelines.create(extpipes)

Retrieve an extraction pipeline by ID

async ExtractionPipelinesAPI.retrieve( id: int | None = None, external_id: str | None = None, ) → ExtractionPipeline | None

Retrieve a single extraction pipeline by id.

Parameters:

id (int | None) – ID
external_id (str | None) – External ID

Returns:

Requested extraction pipeline or None if it does not exist.

Return type:

ExtractionPipeline | None

Examples

Get extraction pipeline by id:

>>> from cognite.client import CogniteClient, AsyncCogniteClient
>>> client = CogniteClient()
>>> # async_client = AsyncCogniteClient()  # another option
>>> res = client.extraction_pipelines.retrieve(id=1)

Get extraction pipeline by external id:

>>> res = client.extraction_pipelines.retrieve(external_id="1")

Retrieve multiple extraction pipelines by ID

async ExtractionPipelinesAPI.retrieve_multiple( ids: Sequence[int] | None = None, external_ids: SequenceNotStr[str] | None = None, ignore_unknown_ids: bool = False, ) → ExtractionPipelineList

Retrieve multiple extraction pipelines by ids and external ids.

Parameters:

ids (Sequence[int] | None) – IDs
external_ids (SequenceNotStr[str] | None) – External IDs
ignore_unknown_ids (bool) – Ignore IDs and external IDs that are not found rather than throw an exception.

Returns:

The requested ExtractionPipelines.

Return type:

ExtractionPipelineList

Examples

Get ExtractionPipelines by id:

>>> from cognite.client import CogniteClient, AsyncCogniteClient
>>> client = CogniteClient()
>>> # async_client = AsyncCogniteClient()  # another option
>>> res = client.extraction_pipelines.retrieve_multiple(ids=[1, 2, 3])

Get assets by external id:

>>> res = client.extraction_pipelines.retrieve_multiple(external_ids=["abc", "def"], ignore_unknown_ids=True)

Update extraction pipelines

async ExtractionPipelinesAPI.update( item: ExtractionPipeline | ExtractionPipelineWrite | ExtractionPipelineUpdate | Sequence[ExtractionPipeline | ExtractionPipelineWrite | ExtractionPipelineUpdate], mode: Literal['replace_ignore_null', 'patch', 'replace'] = 'replace_ignore_null', ) → ExtractionPipeline | ExtractionPipelineList

Update one or more extraction pipelines

Parameters:

item (ExtractionPipeline | ExtractionPipelineWrite | ExtractionPipelineUpdate | Sequence[ExtractionPipeline | ExtractionPipelineWrite | ExtractionPipelineUpdate]) – Extraction pipeline(s) to update
mode (Literal['replace_ignore_null', 'patch', 'replace']) – How to update data when a non-update object is given (ExtractionPipeline or -Write). If you use ‘replace_ignore_null’, only the fields you have set will be used to replace existing (default). Using ‘replace’ will additionally clear all the fields that are not specified by you. Last option, ‘patch’, will update only the fields you have set and for container-like fields such as metadata or labels, add the values to the existing. For more details, see Update and Upsert Mode Parameter.

Returns:

Updated extraction pipeline(s)

Return type:

ExtractionPipeline | ExtractionPipelineList

Examples

Update an extraction pipeline that you have fetched. This will perform a full update of the extraction pipeline:

>>> from cognite.client import CogniteClient
>>> from cognite.client.data_classes import ExtractionPipelineUpdate
>>> client = CogniteClient()
>>> # async_client = AsyncCogniteClient()  # another option
>>> update = ExtractionPipelineUpdate(id=1)
>>> update.description.set("Another new extpipe")
>>> res = client.extraction_pipelines.update(update)

Delete extraction pipelines

async ExtractionPipelinesAPI.delete( id: int | Sequence[int] | None = None, external_id: str | SequenceNotStr[str] | None = None, ) → None

Delete one or more extraction pipelines

Parameters:

id (int | Sequence[int] | None) – Id or list of ids
external_id (str | SequenceNotStr[str] | None) – External ID or list of external ids

Examples

Delete extraction pipelines by id or external id:

>>> from cognite.client import CogniteClient, AsyncCogniteClient
>>> client = CogniteClient()
>>> # async_client = AsyncCogniteClient()  # another option
>>> client.extraction_pipelines.delete(id=[1,2,3], external_id="3")

Extraction pipeline runs

List runs for an extraction pipeline

List runs for an extraction pipeline with given external_id

Parameters:

external_id (str) – Extraction pipeline external Id.
statuses (RunStatus | Sequence[RunStatus] | SequenceNotStr[str] | None) – One or more among “success” / “failure” / “seen”.
message_substring (str | None) – Failure message part.
created_time (dict[str, Any] | TimestampRange | str | None) – Range between two timestamps. Possible keys are min and max, with values given as timestamps in ms. If a string is passed, it is assumed to be the minimum value.
limit (int | None) – Maximum number of ExtractionPipelines to return. Defaults to 25. Set to -1, float(“inf”) or None to return all items.

Returns:

List of requested extraction pipeline runs

Return type:

ExtractionPipelineRunList

Tip

The created_time parameter can also be passed as a string, to support the most typical usage pattern of fetching the most recent runs, meaning it is implicitly assumed to be the minimum created time. The format is “N[timeunit]-ago”, where timeunit is w,d,h,m (week, day, hour, minute), e.g. “12d-ago”.

Examples

List extraction pipeline runs:

>>> from cognite.client import CogniteClient, AsyncCogniteClient
>>> client = CogniteClient()
>>> # async_client = AsyncCogniteClient()  # another option
>>> runsList = client.extraction_pipelines.runs.list(external_id="test ext id", limit=5)

Filter extraction pipeline runs on a given status:

>>> runs_list = client.extraction_pipelines.runs.list(external_id="test ext id", statuses=["seen"], limit=5)

Get all failed pipeline runs in the last 24 hours for pipeline ‘extId’:

>>> from cognite.client.data_classes import ExtractionPipelineRun
>>> res = client.extraction_pipelines.runs.list(external_id="extId", statuses="failure", created_time="24h-ago")

Report new runs

async ExtractionPipelineRunsAPI.create( run: ExtractionPipelineRun | ExtractionPipelineRunWrite | Sequence[ExtractionPipelineRun] | Sequence[ExtractionPipelineRunWrite], ) → ExtractionPipelineRun | ExtractionPipelineRunList

Create one or more extraction pipeline runs.

You can create an arbitrary number of extraction pipeline runs, and the SDK will split the request into multiple requests.

Parameters:: run (ExtractionPipelineRun | ExtractionPipelineRunWrite | Sequence[ExtractionPipelineRun] | Sequence[ExtractionPipelineRunWrite]) – ExtractionPipelineRun| ExtractionPipelineRunWrite | Sequence[ExtractionPipelineRun] | Sequence[ExtractionPipelineRunWrite]): Extraction pipeline or list of extraction pipeline runs to create.
Returns:: Created extraction pipeline run(s)
Return type:: ExtractionPipelineRun | ExtractionPipelineRunList

Examples

Report a new extraction pipeline run:

>>> from cognite.client import CogniteClient
>>> from cognite.client.data_classes import ExtractionPipelineRunWrite
>>> client = CogniteClient()
>>> res = client.extraction_pipelines.runs.create(
...     ExtractionPipelineRunWrite(status="success", extpipe_external_id="extId"))

Extraction pipeline configs

Get the latest or a specific config revision

async ExtractionPipelineConfigsAPI.retrieve( external_id: str, revision: int | None = None, active_at_time: int | None = None, ) → ExtractionPipelineConfig

Retrieve a specific configuration revision, or the latest by default <https://developer.cognite.com/api#tag/Extraction-Pipelines-Config/operation/getExtPipeConfigRevision>

By default the latest configuration revision is retrieved, or you can specify a timestamp or a revision number.

Parameters:

external_id (str) – External id of the extraction pipeline to retrieve config from.
revision (int | None) – Optionally specify a revision number to retrieve.
active_at_time (int | None) – Optionally specify a timestamp the configuration revision should be active.

Returns:

Retrieved extraction pipeline configuration revision

Return type:

ExtractionPipelineConfig

Examples

Retrieve latest config revision:

>>> from cognite.client import CogniteClient, AsyncCogniteClient
>>> client = CogniteClient()
>>> # async_client = AsyncCogniteClient()  # another option
>>> res = client.extraction_pipelines.config.retrieve("extId")

List configuration revisions

async ExtractionPipelineConfigsAPI.list( external_id: str, ) → ExtractionPipelineConfigRevisionList

Retrieve all configuration revisions from an extraction pipeline <https://developer.cognite.com/api#tag/Extraction-Pipelines-Config/operation/listExtPipeConfigRevisions>

Parameters:: external_id (str) – External id of the extraction pipeline to retrieve config from.
Returns:: Retrieved extraction pipeline configuration revisions
Return type:: ExtractionPipelineConfigRevisionList

Examples

Retrieve a list of config revisions:

>>> from cognite.client import CogniteClient, AsyncCogniteClient
>>> client = CogniteClient()
>>> # async_client = AsyncCogniteClient()  # another option
>>> res = client.extraction_pipelines.config.list("extId")

Create a config revision

async ExtractionPipelineConfigsAPI.create( config: ExtractionPipelineConfig | ExtractionPipelineConfigWrite, ) → ExtractionPipelineConfig

Create a new configuration revision <https://developer.cognite.com/api#tag/Extraction-Pipelines-Config/operation/createExtPipeConfig>

Parameters:: config (ExtractionPipelineConfig | ExtractionPipelineConfigWrite) – Configuration revision to create.
Returns:: Created extraction pipeline configuration revision
Return type:: ExtractionPipelineConfig

Examples

Create a config revision:

>>> from cognite.client import CogniteClient
>>> from cognite.client.data_classes import ExtractionPipelineConfigWrite
>>> client = CogniteClient()
>>> res = client.extraction_pipelines.config.create(ExtractionPipelineConfigWrite(external_id="extId", config="my config contents"))

Revert to an earlier config revision

async ExtractionPipelineConfigsAPI.revert( external_id: str, revision: int, ) → ExtractionPipelineConfig

Revert to a previous configuration revision <https://developer.cognite.com/api#tag/Extraction-Pipelines-Config/operation/revertExtPipeConfigRevision>

Parameters:

external_id (str) – External id of the extraction pipeline to revert revision for.
revision (int) – Revision to revert to.

Returns:

New latest extraction pipeline configuration revision.

Return type:

ExtractionPipelineConfig

Examples

Revert a config revision:

>>> from cognite.client import CogniteClient, AsyncCogniteClient
>>> client = CogniteClient()
>>> # async_client = AsyncCogniteClient()  # another option
>>> res = client.extraction_pipelines.config.revert("extId", 5)

Extractor Config Data classes

class cognite.client.data_classes.extractionpipelines.ExtractionPipeline( id: int, external_id: str, name: str, description: str | None, data_set_id: int, raw_tables: list[dict[str, str]] | None, last_success: int | None, last_failure: int | None, last_message: str | None, last_seen: int | None, schedule: str | None, contacts: list[ExtractionPipelineContact] | None, metadata: dict[str, str] | None, source: str | None, documentation: str | None, notification_config: ExtractionPipelineNotificationConfiguration | None, created_time: int, last_updated_time: int, created_by: str | None, )

Bases: ExtractionPipelineCore

An extraction pipeline is a representation of a process writing data to CDF, such as an extractor or an ETL tool. This is the read version of the ExtractionPipeline class, which is used when retrieving extraction pipelines.

Parameters:

id (int) – A server-generated ID for the object.
external_id (str) – The external ID provided by the client. Must be unique for the resource type.
name (str) – The name of the extraction pipeline.
description (str | None) – The description of the extraction pipeline.
data_set_id (int) – The id of the dataset this extraction pipeline related with.
raw_tables (list[dict[str, str]] | None) – list of raw tables in list format: [{“dbName”: “value”, “tableName” : “value”}].
last_success (int | None) – Milliseconds value of last success status.
last_failure (int | None) – Milliseconds value of last failure status.
last_message (str | None) – Message of last failure.
last_seen (int | None) – Milliseconds value of last seen status.
schedule (str | None) – One of None/On trigger/Continuous/cron regex.
contacts (list[ExtractionPipelineContact] | None) – list of contacts
metadata (dict[str, str] | None) – Custom, application specific metadata. String key -> String value. Limits: Maximum length of key is 128 bytes, value 10240 bytes, up to 256 key-value pairs, of total size at most 10240.
source (str | None) – Source text value for extraction pipeline.
documentation (str | None) – Documentation text value for extraction pipeline.
notification_config (ExtractionPipelineNotificationConfiguration | None) – Notification configuration for the extraction pipeline.
created_time (int) – The number of milliseconds since 00:00:00 Thursday, 1 January 1970, Coordinated Universal Time (UTC), minus leap seconds.
last_updated_time (int) – The number of milliseconds since 00:00:00 Thursday, 1 January 1970, Coordinated Universal Time (UTC), minus leap seconds.
created_by (str | None) – Extraction pipeline creator, usually an email.

as_write() → ExtractionPipelineWrite: Returns this ExtractionPipeline as a ExtractionPipelineWrite

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineConfig( external_id: str, config: str | None, revision: int, description: str | None, created_time: int, )

Bases: ExtractionPipelineConfigCore

An extraction pipeline config

Parameters:

external_id (str) – The external ID of the associated extraction pipeline.
config (str | None) – Contents of this configuration revision.
revision (int) – The revision number of this config as a positive integer.
description (str | None) – Short description of this configuration revision.
created_time (int) – The number of milliseconds since 00:00:00 Thursday, 1 January 1970, Coordinated Universal Time (UTC), minus leap seconds.

as_write() → ExtractionPipelineConfigWrite: Returns this ExtractionPipelineConfig as a ExtractionPipelineConfigWrite

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineConfigCore( external_id: str | None = None, config: str | None = None, description: str | None = None, )

Bases: WriteableCogniteResource[ExtractionPipelineConfigWrite], ABC

An extraction pipeline config

Parameters:

external_id (str | None) – The external ID of the associated extraction pipeline.
config (str | None) – Contents of this configuration revision.
description (str | None) – Short description of this configuration revision.

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineConfigList( resources: Sequence[T_CogniteResource], ): Bases: WriteableCogniteResourceList[ExtractionPipelineConfigWrite, ExtractionPipelineConfig], ExternalIDTransformerMixin

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineConfigRevision( external_id: str, revision: int, description: str | None, created_time: int, )

Bases: CogniteResource

An extraction pipeline config revision

Parameters:

external_id (str) – The external ID of the associated extraction pipeline.
revision (int) – The revision number of this config as a positive integer.
description (str | None) – Short description of this configuration revision.
created_time (int) – The number of milliseconds since 00:00:00 Thursday, 1 January 1970, Coordinated Universal Time (UTC), minus leap seconds.

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineConfigRevisionList( resources: Sequence[T_CogniteResource], ): Bases: CogniteResourceList[ExtractionPipelineConfigRevision], ExternalIDTransformerMixin

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineConfigWrite( external_id: str, config: str | None = None, description: str | None = None, )

Bases: ExtractionPipelineConfigCore

An extraction pipeline config

Parameters:

external_id (str) – The external ID of the associated extraction pipeline.
config (str | None) – Contents of this configuration revision.
description (str | None) – Short description of this configuration revision.

as_write() → ExtractionPipelineConfigWrite: Returns this ExtractionPipelineConfigWrite instance.

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineConfigWriteList( resources: Sequence[T_CogniteResource], ): Bases: CogniteResourceList[ExtractionPipelineConfigWrite], ExternalIDTransformerMixin

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineContact( name: str | None = None, email: str | None = None, role: str | None = None, send_notification: bool | None = None, )

Bases: CogniteResource

A contact for an extraction pipeline

Parameters:

name (str | None) – Name of contact
email (str | None) – Email address of contact
role (str | None) – Role of contact, such as Owner, Maintainer, etc.
send_notification (bool | None) – Whether to send notifications to this contact or not

Bases: WriteableCogniteResource[ExtractionPipelineWrite], ABC

An extraction pipeline is a representation of a process writing data to CDF, such as an extractor or an ETL tool.

Parameters:

external_id (str) – The external ID provided by the client. Must be unique for the resource type.
name (str | None) – The name of the extraction pipeline.
description (str | None) – The description of the extraction pipeline.
data_set_id (int | None) – The id of the dataset this extraction pipeline related with.
raw_tables (list[dict[str, str]] | None) – list of raw tables in list format: [{“dbName”: “value”, “tableName” : “value”}].
schedule (str | None) – One of None/On trigger/Continuous/cron regex.
contacts (list[ExtractionPipelineContact] | None) – list of contacts
metadata (dict[str, str] | None) – Custom, application specific metadata. String key -> String value. Limits: Maximum length of key is 128 bytes, value 10240 bytes, up to 256 key-value pairs, of total size at most 10240.
source (str | None) – Source text value for extraction pipeline.
documentation (str | None) – Documentation text value for extraction pipeline.
notification_config (ExtractionPipelineNotificationConfiguration | None) – Notification configuration for the extraction pipeline.
created_by (str | None) – Extraction pipeline creator, usually an email.

dump(camel_case: bool = True) → dict[str, Any]

Dump the instance into a json serializable Python data type.

Parameters:: camel_case (bool) – Use camelCase for attribute names. Defaults to True.
Returns:: A dictionary representation of the instance.
Return type:: dict[str, Any]

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineList( resources: Sequence[T_CogniteResource], ): Bases: WriteableCogniteResourceList[ExtractionPipelineWrite, ExtractionPipeline], IdTransformerMixin

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineNotificationConfiguration( allowed_not_seen_range_in_minutes: int | None = None, )

Bases: CogniteResource

Extraction pipeline notification configuration

Parameters:: allowed_not_seen_range_in_minutes (int | None) – Time in minutes to pass without any Run. Null if extraction pipeline is not checked.

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineRun( id: int, extpipe_external_id: str | None, status: str, message: str | None, created_time: int | None, )

Bases: ExtractionPipelineRunCore

A representation of an extraction pipeline run.

Parameters:

id (int) – A server-generated ID for the object.
extpipe_external_id (str | None) – The external ID of the extraction pipeline.
status (str) – success/failure/seen.
message (str | None) – Optional status message.
created_time (int | None) – The number of milliseconds since 00:00:00 Thursday, 1 January 1970, Coordinated Universal Time (UTC), minus leap seconds.

as_write() → ExtractionPipelineRunWrite: Returns this ExtractionPipelineRun as a ExtractionPipelineRunWrite

dump(camel_case: bool = True) → dict[str, Any]

Dump the instance into a json serializable Python data type.

Parameters:: camel_case (bool) – Use camelCase for attribute names. Defaults to True.
Returns:: A dictionary representation of the instance.
Return type:: dict[str, Any]

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineRunCore( status: str, message: str | None, created_time: int | None, )

Bases: WriteableCogniteResource[ExtractionPipelineRunWrite], ABC

A representation of an extraction pipeline run.

Parameters:

status (str) – success/failure/seen.
message (str | None) – Optional status message.
created_time (int | None) – The number of milliseconds since 00:00:00 Thursday, 1 January 1970, Coordinated Universal Time (UTC), minus leap seconds.

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineRunFilter( external_id: str | None = None, statuses: SequenceNotStr[str] | None = None, message: StringFilter | None = None, created_time: dict[str, Any] | TimestampRange | None = None, )

Bases: CogniteFilter

Filter runs with exact matching

Parameters:

external_id (str | None) – The external ID of related ExtractionPipeline provided by the client. Must be unique for the resource type.
statuses (SequenceNotStr[str] | None) – success/failure/seen.
message (StringFilter | None) – message filter.
created_time (dict[str, Any] | TimestampRange | None) – Range between two timestamps.

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineRunList( resources: Sequence[T_CogniteResource], ): Bases: WriteableCogniteResourceList[ExtractionPipelineRunWrite, ExtractionPipelineRun], IdTransformerMixin

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineRunWrite( extpipe_external_id: str, status: Literal['success', 'failure', 'seen'], message: str | None = None, created_time: int | None = None, )

Bases: ExtractionPipelineRunCore

A representation of an extraction pipeline run. This is the write version of the ExtractionPipelineRun class, which is used when creating extraction pipeline runs.

Parameters:

extpipe_external_id (str) – The external ID of the extraction pipeline.
status (Literal['success', 'failure', 'seen']) – success/failure/seen.
message (str | None) – Optional status message.
created_time (int | None) – The number of milliseconds since 00:00:00 Thursday, 1 January 1970, Coordinated Universal Time (UTC), minus leap seconds.

as_write() → ExtractionPipelineRunWrite: Returns this ExtractionPipelineRunWrite instance.

dump( camel_case: bool = True, ) → dict[str, Any]

Dump the instance into a json serializable Python data type.

Parameters:: camel_case (bool) – Use camelCase for attribute names. Defaults to True.
Returns:: A dictionary representation of the instance.
Return type:: dict[str, Any]

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineRunWriteList( resources: Sequence[T_CogniteResource], ): Bases: CogniteResourceList[ExtractionPipelineRunWrite]

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineUpdate(id: int | None = None, external_id: str | None = None)

Bases: CogniteUpdate

Changes applied to an extraction pipeline

Parameters:

id (int) – A server-generated ID for the object.
external_id (str) – The external ID provided by the client. Must be unique for the resource type.

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineWrite( external_id: str, name: str, data_set_id: int, description: str | None = None, raw_tables: list[dict[str, str]] | None = None, schedule: str | None = None, contacts: list[ExtractionPipelineContact] | None = None, metadata: dict[str, str] | None = None, source: str | None = None, documentation: str | None = None, notification_config: ExtractionPipelineNotificationConfiguration | None = None, created_by: str | None = None, )

Bases: ExtractionPipelineCore

An extraction pipeline is a representation of a process writing data to CDF, such as an extractor or an ETL tool. This is the write version of the ExtractionPipeline class, which is used when creating extraction pipelines.

Parameters:

external_id (str) – The external ID provided by the client. Must be unique for the resource type.
name (str) – The name of the extraction pipeline.
data_set_id (int) – The id of the dataset this extraction pipeline related with.
description (str | None) – The description of the extraction pipeline.
raw_tables (list[dict[str, str]] | None) – list of raw tables in list format: [{“dbName”: “value”, “tableName” : “value”}].
schedule (str | None) – One of None/On trigger/Continuous/cron regex.
contacts (list[ExtractionPipelineContact] | None) – list of contacts
metadata (dict[str, str] | None) – Custom, application specific metadata. String key -> String value. Limits: Maximum length of key is 128 bytes, value 10240 bytes, up to 256 key-value pairs, of total size at most 10240.
source (str | None) – Source text value for extraction pipeline.
documentation (str | None) – Documentation text value for extraction pipeline.
notification_config (ExtractionPipelineNotificationConfiguration | None) – Notification configuration for the extraction pipeline.
created_by (str | None) – Extraction pipeline creator, usually an email.

as_write() → ExtractionPipelineWrite: Returns this ExtractionPipelineWrite instance.

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineWriteList( resources: Sequence[T_CogniteResource], ): Bases: CogniteResourceList[ExtractionPipelineWrite], ExternalIDTransformerMixin

class cognite.client.data_classes.extractionpipelines.StringFilter(substring: str | None = None)

Bases: CogniteFilter

Filter runs on substrings of the message

Parameters:: substring (str | None) – Part of message