Data Ingestion

Raw

Databases

List databases

RawDatabasesAPI.list(limit: int | None = 25) → DatabaseList

Parameters: limit (int | None) – Maximum number of databases to return. Defaults to 25. Set to -1, float("inf") or None to return all items.
Returns: List of requested databases.
Return type: DatabaseList

Examples

List the first 5 databases:

>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> db_list = client.raw.databases.list(limit=5)

Iterate over databases:

>>> for db in client.raw.databases:
...     db # do something with the db

Iterate over chunks of databases to reduce memory load:

>>> for db_list in client.raw.databases(chunk_size=2500):
...     db_list # do something with the dbs

Create new databases

RawDatabasesAPI.create(name: str) → Database

RawDatabasesAPI.create(name: list[str]) → DatabaseList

Create one or more databases.

Parameters: name (str | list[str]) – A db name or list of db names to create.
Returns: Database or list of databases that have been created.
Return type: Database | DatabaseList

Examples

Create a new database:

>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> res = client.raw.databases.create("db1")

Delete databases

RawDatabasesAPI.delete(name: Union[str, SequenceNotStr[str]], recursive: bool = False) → None

Delete one or more databases.

Parameters

name (str | SequenceNotStr[str]) – A db name or list of db names to delete.
recursive (bool) – Recursively delete all tables in the database(s).

Examples

Delete a list of databases:

>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> client.raw.databases.delete(["db1", "db2"])
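As a hedged sketch of how list and delete might be combined, the helper below removes every database whose name starts with a given prefix. The function name delete_databases_with_prefix and the prefix convention are assumptions made for illustration; client is expected to be a configured CogniteClient, and only the list and delete calls documented above are used.

```python
def delete_databases_with_prefix(client, prefix, recursive=False):
    """Delete all RAW databases whose name starts with `prefix`.

    Hypothetical helper, not part of the SDK. `client` is assumed to be a
    configured CogniteClient. Returns the names that were passed to delete().
    """
    # limit=None asks for all databases instead of the default 25.
    names = [
        db.name
        for db in client.raw.databases.list(limit=None)
        if db.name.startswith(prefix)
    ]
    if names:
        # recursive=True also deletes all tables in the database(s).
        client.raw.databases.delete(names, recursive=recursive)
    return names
```

Calling delete_databases_with_prefix(client, "tmp_", recursive=True) would then remove the matching databases together with their tables.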

Tables

List tables in a database

RawTablesAPI.list(db_name: str, limit: int | None = 25) → TableList

List tables

Parameters

db_name (str) – The database to list tables from.
limit (int | None) – Maximum number of tables to return. Defaults to 25. Set to -1, float("inf") or None to return all items.

Returns

List of requested tables.

Return type

raw.TableList

Examples

List the first 5 tables:

>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> table_list = client.raw.tables.list("db1", limit=5)

Iterate over tables:

>>> for table in client.raw.tables(db_name="db1"):
...     table # do something with the table

Iterate over chunks of tables to reduce memory load:

>>> for table_list in client.raw.tables(db_name="db1", chunk_size=2500):
...     table_list # do something with the tables
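The database and table listings can be combined to get an overview of everything stored in RAW. The following is a minimal sketch, assuming client is a configured CogniteClient; the helper name list_all_table_names is hypothetical.

```python
def list_all_table_names(client):
    """Map each RAW database name to the names of its tables.

    Hypothetical helper, not part of the SDK.
    """
    tables_by_db = {}
    # limit=None returns all items rather than the default 25.
    for db in client.raw.databases.list(limit=None):
        tables = client.raw.tables.list(db.name, limit=None)
        tables_by_db[db.name] = [table.name for table in tables]
    return tables_by_db
```

For a project with many databases this issues one request series per database, so the chunked iteration shown above may be preferable when memory is a concern.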

Create new tables in a database

RawTablesAPI.create(db_name: str, name: str) → Table

RawTablesAPI.create(db_name: str, name: list[str]) → TableList

Create one or more tables.

Parameters

db_name (str) – Database to create the tables in.
name (str | list[str]) – A table name or list of table names to create.

Returns

raw.Table or list of tables that have been created.

Return type

raw.Table | raw.TableList

Examples

Create a new table in a database:

>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> res = client.raw.tables.create("db1", "table1")

Delete tables from a database

RawTablesAPI.delete(db_name: str, name: Union[str, SequenceNotStr[str]]) → None

Delete one or more tables.

Parameters

db_name (str) – Database to delete tables from.
name (str | SequenceNotStr[str]) – A table name or list of table names to delete.

Examples

Delete a list of tables:

>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> client.raw.tables.delete("db1", ["table1", "table2"])

Rows

Get a row from a table

RawRowsAPI.retrieve(db_name: str, table_name: str, key: str) → cognite.client.data_classes.raw.Row | None

Retrieve a single row by key.

Parameters

db_name (str) – Name of the database.
table_name (str) – Name of the table.
key (str) – The key of the row to retrieve.

Returns

The requested row.

Return type

Row | None

Examples

Retrieve a row with key ‘k1’ from table ‘t1’ in database ‘db1’:

>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> row = client.raw.rows.retrieve("db1", "t1", "k1")

You may access the data directly on the row (like a dict), or use ‘.get’ when keys can be missing:

>>> val1 = row["col1"]
>>> val2 = row.get("col2")
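Because retrieve is typed to return Row | None, calling code may want to guard against a missing row before reading columns. A minimal sketch, assuming client is a configured CogniteClient; the helper name get_column is hypothetical:

```python
def get_column(client, db_name, table_name, key, column, default=None):
    """Return one column value from a RAW row, or `default` when unavailable.

    Hypothetical helper, not part of the SDK.
    """
    row = client.raw.rows.retrieve(db_name, table_name, key)
    if row is None:
        # No row was returned for this key.
        return default
    value = row.get(column)
    # Note: a column that is present but holds None is also mapped to `default`.
    return default if value is None else value
```

For example, get_column(client, "db1", "t1", "k1", "col2", default="n/a") yields "n/a" both when the row is absent and when the column is missing.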

List rows in a table

RawRowsAPI.list(db_name: str, table_name: str, min_last_updated_time: Optional[int] = None, max_last_updated_time: Optional[int] = None, columns: Optional[list[str]] = None, limit: int | None = 25, partitions: Optional[int] = None) → RowList

List rows in a table.

Parameters

db_name (str) – Name of the database.
table_name (str) – Name of the table.
min_last_updated_time (int | None) – Rows must have been last updated after this time (exclusive). ms since epoch.
max_last_updated_time (int | None) – Rows must have been last updated before this time (inclusive). ms since epoch.
columns (list[str] | None) – List of column keys. Set to None for retrieving all, use [] to retrieve only row keys.
limit (int | None) – The number of rows to retrieve. Can be used with partitions. Defaults to 25. Set to -1, float("inf") or None to return all items.
partitions (int | None) – Retrieve rows in parallel using this number of workers. Can be used together with a (large) finite limit. When partitions is not passed, it defaults to 1, i.e. no concurrency for a finite limit and global_config.max_workers for an unlimited query (will be capped at this value). To prevent unexpected problems and maximize read throughput, check out concurrency limits in the API documentation.

Returns

The requested rows.

Return type

RowList

Examples

List a few rows:

>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> row_list = client.raw.rows.list("db1", "tbl1", limit=5)

Read an entire table efficiently by using concurrency (default behavior when limit=None):

>>> row_list = client.raw.rows.list("db1", "tbl1", limit=None)

Iterate through all rows one-by-one to reduce memory load (no concurrency used):

>>> for row in client.raw.rows("db1", "t1", columns=["col1","col2"]):
...     val1 = row["col1"]  # You may access the data directly
...     val2 = row.get("col2")  # ...or use '.get' when keys can be missing

Iterate through all rows, one chunk at a time, to reduce memory load (no concurrency used):

>>> for row_list in client.raw.rows("db1", "t1", chunk_size=2500):
...     row_list  # Do something with the rows

Iterate through a massive table to reduce memory load while using concurrency for high throughput. Note: partitions must be specified for concurrency to be used (this is different from list() to keep backward compatibility). Supplying a finite limit does not affect concurrency settings (except for very small values).

>>> rows_iterator = client.raw.rows(
...     db_name="db1", table_name="t1", partitions=5, chunk_size=5000, limit=1_000_000
... )
>>> for row_list in rows_iterator:
...     row_list  # Do something with the rows
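The min_last_updated_time and max_last_updated_time parameters expect integer milliseconds since the epoch. The sketch below shows one way to derive them from timezone-aware datetimes using only the standard library; the helper names to_ms_since_epoch and updated_window are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def to_ms_since_epoch(moment):
    """Convert a timezone-aware datetime to integer ms since the Unix epoch."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    # Dividing two timedeltas with // gives an exact integer count.
    return (moment - epoch) // timedelta(milliseconds=1)


def updated_window(end, hours):
    """Return (min_ms, max_ms) covering the `hours` leading up to `end`."""
    return to_ms_since_epoch(end - timedelta(hours=hours)), to_ms_since_epoch(end)


# Possible usage with a configured client (not executed here):
# min_ms, max_ms = updated_window(datetime.now(timezone.utc), hours=24)
# rows = client.raw.rows.list(
#     "db1", "tbl1",
#     min_last_updated_time=min_ms, max_last_updated_time=max_ms, limit=None)
```

Keep in mind that, as documented above, the lower bound is exclusive and the upper bound is inclusive.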

Insert rows into a table

RawRowsAPI.insert(db_name: str, table_name: str, row: collections.abc.Sequence[cognite.client.data_classes.raw.Row] | collections.abc.Sequence[cognite.client.data_classes.raw.RowWrite] | cognite.client.data_classes.raw.Row | cognite.client.data_classes.raw.RowWrite | dict, ensure_parent: bool = False) → None

Insert one or more rows into a table.

Parameters

db_name (str) – Name of the database.
table_name (str) – Name of the table.
row (Sequence[Row] | Sequence[RowWrite] | Row | RowWrite | dict) – The row(s) to insert
ensure_parent (bool) – Create database/table if they don’t already exist.

Examples

Insert new rows into a table:

>>> from cognite.client import CogniteClient
>>> from cognite.client.data_classes import RowWrite
>>> client = CogniteClient()
>>> rows = [RowWrite(key="r1", columns={"col1": "val1", "col2": "val1"}),
...         RowWrite(key="r2", columns={"col1": "val2", "col2": "val2"})]
>>> client.raw.rows.insert("db1", "table1", rows)

You may also insert a dictionary directly:

>>> rows = {
...     "key-1": {"col1": 1, "col2": 2},
...     "key-2": {"col1": 3, "col2": 4, "col3": "high five"},
... }
>>> client.raw.rows.insert("db1", "table1", rows)
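When the source data is a list of records rather than a mapping keyed by row key, the dictionary form shown above can be built first. This is a minimal sketch; the helper name rows_from_records and the idea of taking the row key from one of the record fields are assumptions for illustration.

```python
def rows_from_records(records, key_field):
    """Build a {row_key: columns} mapping from a list of dict records.

    Hypothetical helper, not part of the SDK. The value of `key_field` becomes
    the row key (converted to str); the remaining fields become the columns.
    """
    rows = {}
    for record in records:
        key = str(record[key_field])
        if key in rows:
            raise ValueError(f"duplicate row key: {key!r}")
        rows[key] = {
            name: value for name, value in record.items() if name != key_field
        }
    return rows


# Possible usage with a configured client (not executed here):
# rows = rows_from_records([{"id": "key-1", "col1": 1, "col2": 2}], "id")
# client.raw.rows.insert("db1", "table1", rows)
```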

Delete rows from a table

RawRowsAPI.delete(db_name: str, table_name: str, key: Union[str, SequenceNotStr[str]]) → None

Delete rows from a table.

Parameters

db_name (str) – Name of the database.
table_name (str) – Name of the table.
key (str | SequenceNotStr[str]) – The key(s) of the row(s) to delete.

Examples

Delete rows from table:

>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> keys_to_delete = ["k1", "k2", "k3"]
>>> client.raw.rows.delete("db1", "table1", keys_to_delete)

Retrieve pandas dataframe

RawRowsAPI.retrieve_dataframe(db_name: str, table_name: str, min_last_updated_time: int | None = None, max_last_updated_time: int | None = None, columns: list[str] | None = None, limit: int | None = 25, partitions: int | None = None, last_updated_time_in_index: bool = False, infer_dtypes: bool = True) → pd.DataFrame

Retrieve rows in a table as a pandas dataframe.

Rowkeys are used as the index.

Parameters

db_name (str) – Name of the database.
table_name (str) – Name of the table.
min_last_updated_time (int | None) – Rows must have been last updated after this time. ms since epoch.
max_last_updated_time (int | None) – Rows must have been last updated before this time. ms since epoch.
columns (list[str] | None) – List of column keys. Set to None for retrieving all, use [] to retrieve only row keys.
limit (int | None) – The number of rows to retrieve. Defaults to 25. Set to -1, float("inf") or None to return all items.
partitions (int | None) – Retrieve rows in parallel using this number of workers. Can be used together with a (large) finite limit. When partitions is not passed, it defaults to 1, i.e. no concurrency for a finite limit and global_config.max_workers for an unlimited query (will be capped at this value). To prevent unexpected problems and maximize read throughput, check out concurrency limits in the API documentation.
last_updated_time_in_index (bool) – Use a MultiIndex with row keys and last_updated_time as index.
infer_dtypes (bool) – If True, pandas will try to infer dtypes of the columns. Defaults to True.

Returns

The requested rows in a pandas dataframe.

Return type

pd.DataFrame

Examples

Get dataframe:

>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> df = client.raw.rows.retrieve_dataframe("db1", "t1", limit=5)

Insert pandas dataframe

RawRowsAPI.insert_dataframe(db_name: str, table_name: str, dataframe: pd.DataFrame, ensure_parent: bool = False, dropna: bool = True) → None

Insert a pandas dataframe into a table.

Uses index for row keys.

Parameters

db_name (str) – Name of the database.
table_name (str) – Name of the table.
dataframe (pd.DataFrame) – The dataframe to insert. Index will be used as row keys.
ensure_parent (bool) – Create database/table if they don’t already exist.
dropna (bool) – Remove NaNs (but keep None’s in dtype=object columns) before inserting. Done individually per column. Default: True

Examples

Insert new rows into a table:

>>> import pandas as pd
>>> from cognite.client import CogniteClient
>>>
>>> client = CogniteClient()
>>> df = pd.DataFrame(
...     {"col-a": [1, 3, None], "col-b": [2, -1, 9]},
...     index=["r1", "r2", "r3"])
>>> client.raw.rows.insert_dataframe(
...     "db1", "table1", df, dropna=True)
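To picture what dropna=True means for the example above: in a numeric column pandas stores the missing value as NaN, so row "r3" would be inserted without "col-a". The sketch below models the described behaviour on plain dictionaries (NaN values dropped, None values kept). It is an illustration of the parameter description only, not the SDK's implementation, and the helper name drop_nan_values is hypothetical.

```python
import math


def drop_nan_values(rows):
    """Remove NaN-valued columns from each row while keeping None values.

    Hypothetical model of the documented `dropna` behaviour, written against
    a {row_key: columns} mapping instead of a DataFrame.
    """
    cleaned = {}
    for key, columns in rows.items():
        cleaned[key] = {
            name: value
            for name, value in columns.items()
            # Only float NaN is dropped; None is left in place.
            if not (isinstance(value, float) and math.isnan(value))
        }
    return cleaned
```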

RAW Data classes

class cognite.client.data_classes.raw.Database(name: str | None = None, created_time: int | None = None, cognite_client: CogniteClient | None = None)

Bases: DatabaseCore

A NoSQL database to store customer data.

Parameters

name (str | None) – Unique name of a database.
created_time (int | None) – Time the database was created.
cognite_client (CogniteClient | None) – The client to associate with this object.

as_write() → DatabaseWrite: Returns this Database as a DatabaseWrite

tables(limit: Optional[int] = None) → TableList

Get the tables in this database.

Parameters: limit (int | None) – The number of tables to return.
Returns: List of tables in this database.
Return type: TableList

class cognite.client.data_classes.raw.DatabaseCore(name: Optional[str] = None)

Bases: WriteableCogniteResource[DatabaseWrite], ABC

A NoSQL database to store customer data.

Parameters: name (str | None) – Unique name of a database.

class cognite.client.data_classes.raw.DatabaseList(resources: Iterable[Any], cognite_client: CogniteClient | None = None)

Bases: WriteableCogniteResourceList[DatabaseWrite, Database], NameTransformerMixin

as_write() → DatabaseWriteList: Returns this DatabaseList as a DatabaseWriteList

class cognite.client.data_classes.raw.DatabaseWrite(name: str)

Bases: DatabaseCore

A NoSQL database to store customer data.

Parameters: name (str) – Unique name of a database.

as_write() → DatabaseWrite: Returns this DatabaseWrite instance.

class cognite.client.data_classes.raw.DatabaseWriteList(resources: Iterable[Any], cognite_client: CogniteClient | None = None): Bases: CogniteResourceList[DatabaseWrite], NameTransformerMixin

class cognite.client.data_classes.raw.Row(key: str | None = None, columns: dict[str, Any] | None = None, last_updated_time: int | None = None, cognite_client: CogniteClient | None = None)

Bases: RowCore

This represents a row in a NoSQL table. This is the reading version of the Row class, which is used when retrieving a row.

Parameters

key (str | None) – Unique row key
columns (dict[str, Any] | None) – Row data stored as a JSON object.
last_updated_time (int | None) – The number of milliseconds since 00:00:00 Thursday, 1 January 1970, Coordinated Universal Time (UTC), minus leap seconds.
cognite_client (CogniteClient | None) – The client to associate with this object.

as_write() → RowWrite: Returns this Row as a RowWrite

class cognite.client.data_classes.raw.RowCore(key: Optional[str] = None, columns: Optional[dict[str, Any]] = None)

Bases: WriteableCogniteResource[RowWrite], ABC

No description.

Parameters

key (str | None) – Unique row key
columns (dict[str, Any] | None) – Row data stored as a JSON object.

to_pandas() → pandas.DataFrame

Convert the instance into a pandas DataFrame.

Returns: The pandas DataFrame representing this instance.
Return type: pandas.DataFrame

class cognite.client.data_classes.raw.RowList(resources: Iterable[Any], cognite_client: CogniteClient | None = None)

Bases: RowListCore[Row]

as_write() → RowWriteList: Returns this RowList as a RowWriteList

class cognite.client.data_classes.raw.RowListCore(resources: Iterable[Any], cognite_client: CogniteClient | None = None)

Bases: WriteableCogniteResourceList[RowWrite, T_Row], ABC

to_pandas() → pandas.DataFrame

Convert the instance into a pandas DataFrame.

Returns: The pandas DataFrame representing this instance.
Return type: pandas.DataFrame

class cognite.client.data_classes.raw.RowWrite(key: str, columns: dict[str, Any])

Bases: RowCore

This represents a row in a NoSQL table. This is the writing version of the Row class, which is used when creating a row.

Parameters

key (str) – Unique row key
columns (dict[str, Any]) – Row data stored as a JSON object.

as_write() → RowWrite: Returns this RowWrite instance.

class cognite.client.data_classes.raw.RowWriteList(resources: Iterable[Any], cognite_client: CogniteClient | None = None): Bases: RowListCore[RowWrite]

class cognite.client.data_classes.raw.Table(name: str | None = None, created_time: int | None = None, cognite_client: CogniteClient | None = None)

Bases: TableCore

A NoSQL database table to store customer data. This is the reading version of the Table class, which is used when retrieving a table.

Parameters

name (str | None) – Unique name of the table
created_time (int | None) – Time the table was created.
cognite_client (CogniteClient | None) – The client to associate with this object.

as_write() → TableWrite: Returns this Table as a TableWrite

rows(key: str, limit: int | None = None) → cognite.client.data_classes.raw.Row | None

rows(key: None = None, limit: int | None = None) → RowList

Get the rows in this table.

Parameters

key (str | None) – Specify a key to return only that row.
limit (int | None) – The number of rows to return.

Returns

The requested row, or a list of the rows in this table.

Return type

Row | RowList | None

class cognite.client.data_classes.raw.TableCore(name: Optional[str] = None)

Bases: WriteableCogniteResource[TableWrite]

A NoSQL database table to store customer data.

Parameters: name (str | None) – Unique name of the table

class cognite.client.data_classes.raw.TableList(resources: Iterable[Any], cognite_client: CogniteClient | None = None)

Bases: WriteableCogniteResourceList[TableWrite, Table], NameTransformerMixin

as_write() → TableWriteList: Returns this TableList as a TableWriteList

class cognite.client.data_classes.raw.TableWrite(name: str)

Bases: TableCore

A NoSQL database table to store customer data. This is the writing version of the Table class, which is used when creating a table.

Parameters: name (str) – Unique name of the table

as_write() → TableWrite: Returns this TableWrite instance.

class cognite.client.data_classes.raw.TableWriteList(resources: Iterable[Any], cognite_client: CogniteClient | None = None): Bases: CogniteResourceList[TableWrite], NameTransformerMixin

Extraction pipelines

List extraction pipelines

ExtractionPipelinesAPI.list(limit: int | None = 25) → ExtractionPipelineList

List extraction pipelines

Parameters: limit (int | None) – Maximum number of ExtractionPipelines to return. Defaults to 25. Set to -1, float("inf") or None to return all items.
Returns: List of requested ExtractionPipelines
Return type: ExtractionPipelineList

Examples

List ExtractionPipelines:

>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> ep_list = client.extraction_pipelines.list(limit=5)

Create extraction pipeline

ExtractionPipelinesAPI.create(extraction_pipeline: cognite.client.data_classes.extractionpipelines.ExtractionPipeline | cognite.client.data_classes.extractionpipelines.ExtractionPipelineWrite) → ExtractionPipeline

ExtractionPipelinesAPI.create(extraction_pipeline: collections.abc.Sequence[cognite.client.data_classes.extractionpipelines.ExtractionPipeline] | collections.abc.Sequence[cognite.client.data_classes.extractionpipelines.ExtractionPipelineWrite]) → ExtractionPipelineList

Create one or more extraction pipelines.

You can create an arbitrary number of extraction pipelines, and the SDK will split the request into multiple requests if necessary.

Parameters: extraction_pipeline (ExtractionPipeline | ExtractionPipelineWrite | Sequence[ExtractionPipeline] | Sequence[ExtractionPipelineWrite]) – Extraction pipeline or list of extraction pipelines to create.
Returns: Created extraction pipeline(s)
Return type: ExtractionPipeline | ExtractionPipelineList

Examples

Create new extraction pipeline:

>>> from cognite.client import CogniteClient
>>> from cognite.client.data_classes import ExtractionPipelineWrite
>>> client = CogniteClient()
>>> extpipes = [ExtractionPipelineWrite(name="extPipe1",...), ExtractionPipelineWrite(name="extPipe2",...)]
>>> res = client.extraction_pipelines.create(extpipes)
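The request splitting mentioned above can be pictured as simple batching. The batch size the SDK uses is not stated here, so chunk_size below is a hypothetical parameter and split_into_chunks is an illustrative helper, not the SDK's code.

```python
def split_into_chunks(items, chunk_size):
    """Split a list into consecutive chunks of at most `chunk_size` items.

    Hypothetical illustration of splitting one large create call into
    several smaller requests.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [
        items[start:start + chunk_size]
        for start in range(0, len(items), chunk_size)
    ]
```

Callers do not need to do this themselves, since create accepts an arbitrary number of extraction pipelines.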

Retrieve an extraction pipeline by ID

ExtractionPipelinesAPI.retrieve(id: Optional[int] = None, external_id: Optional[str] = None) → cognite.client.data_classes.extractionpipelines.ExtractionPipeline | None

Retrieve a single extraction pipeline by id.

Parameters

id (int | None) – ID
external_id (str | None) – External ID

Returns

Requested extraction pipeline or None if it does not exist.

Return type

ExtractionPipeline | None

Examples

Get extraction pipeline by id:

>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> res = client.extraction_pipelines.retrieve(id=1)

Get extraction pipeline by external id:

>>> res = client.extraction_pipelines.retrieve(external_id="1")

Retrieve multiple extraction pipelines by ID

ExtractionPipelinesAPI.retrieve_multiple(ids: Optional[Sequence[int]] = None, external_ids: Optional[SequenceNotStr[str]] = None, ignore_unknown_ids: bool = False) → ExtractionPipelineList

Retrieve multiple extraction pipelines by ids and external ids.

Parameters

ids (Sequence[int] | None) – IDs
external_ids (SequenceNotStr[str] | None) – External IDs
ignore_unknown_ids (bool) – Ignore IDs and external IDs that are not found rather than throw an exception.

Returns

The requested ExtractionPipelines.

Return type

ExtractionPipelineList

Examples

Get ExtractionPipelines by id:

>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> res = client.extraction_pipelines.retrieve_multiple(ids=[1, 2, 3])

Get ExtractionPipelines by external id:

>>> res = client.extraction_pipelines.retrieve_multiple(external_ids=["abc", "def"], ignore_unknown_ids=True)

Update extraction pipelines

ExtractionPipelinesAPI.update(item: cognite.client.data_classes.extractionpipelines.ExtractionPipeline | cognite.client.data_classes.extractionpipelines.ExtractionPipelineWrite | cognite.client.data_classes.extractionpipelines.ExtractionPipelineUpdate) → ExtractionPipeline

ExtractionPipelinesAPI.update(item: Sequence[cognite.client.data_classes.extractionpipelines.ExtractionPipeline | cognite.client.data_classes.extractionpipelines.ExtractionPipelineWrite | cognite.client.data_classes.extractionpipelines.ExtractionPipelineUpdate]) → ExtractionPipelineList

Update one or more extraction pipelines

Parameters

item (ExtractionPipeline | ExtractionPipelineWrite | ExtractionPipelineUpdate | Sequence[ExtractionPipeline | ExtractionPipelineWrite | ExtractionPipelineUpdate]) – Extraction pipeline(s) to update
mode (Literal['replace_ignore_null', 'patch', 'replace']) – How to update data when a non-update object is given (ExtractionPipeline or -Write). If you use ‘replace_ignore_null’, only the fields you have set will be used to replace existing (default). Using ‘replace’ will additionally clear all the fields that are not specified by you. Last option, ‘patch’, will update only the fields you have set and for container-like fields such as metadata or labels, add the values to the existing. For more details, see Update and Upsert Mode Parameter.

Returns

Updated extraction pipeline(s)

Return type

ExtractionPipeline | ExtractionPipelineList

Examples

Perform a partial update on an extraction pipeline, setting a new description:

>>> from cognite.client import CogniteClient
>>> from cognite.client.data_classes import ExtractionPipelineUpdate
>>> client = CogniteClient()
>>> update = ExtractionPipelineUpdate(id=1)
>>> update.description.set("Another new extpipe")
>>> res = client.extraction_pipelines.update(update)
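The three values of the mode parameter described above can be illustrated with plain dictionaries. The following is a simplified toy model based only on that description: None stands for a field that has not been set, and the function apply_update is hypothetical. It does not reproduce server-side rules such as required fields.

```python
def apply_update(existing, new, mode="replace_ignore_null"):
    """Model how `mode` combines an existing resource with a new one.

    Hypothetical toy model on dicts; not the SDK's implementation.
    """
    if mode not in ("replace_ignore_null", "patch", "replace"):
        raise ValueError(f"unknown mode: {mode!r}")
    if mode == "replace":
        # Fields not specified in `new` are cleared (modelled as None).
        fields = set(existing) | set(new)
        return {field: new.get(field) for field in fields}
    result = dict(existing)
    for field, value in new.items():
        if value is None:
            continue  # unset fields leave the existing value untouched
        current = result.get(field)
        if mode == "patch" and isinstance(value, dict) and isinstance(current, dict):
            # Container-like field: add the new entries to the existing ones.
            result[field] = {**current, **value}
        elif mode == "patch" and isinstance(value, list) and isinstance(current, list):
            result[field] = current + value
        else:
            result[field] = value
    return result
```

With existing metadata {"a": "1"} and new metadata {"b": "2"}, "replace_ignore_null" and "replace" end up with {"b": "2"}, while "patch" ends up with both keys.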

Delete extraction pipelines

ExtractionPipelinesAPI.delete(id: Optional[Union[int, Sequence[int]]] = None, external_id: Optional[Union[str, SequenceNotStr[str]]] = None) → None

Delete one or more extraction pipelines

Parameters

id (int | Sequence[int] | None) – Id or list of ids
external_id (str | SequenceNotStr[str] | None) – External ID or list of external ids

Examples

Delete extraction pipelines by id or external id:

>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> client.extraction_pipelines.delete(id=[1,2,3], external_id="3")

Extraction pipeline runs

List runs for an extraction pipeline

ExtractionPipelineRunsAPI.list(external_id: str, statuses: Optional[Union[Literal['success', 'failure', 'seen'], Sequence[Literal['success', 'failure', 'seen']], SequenceNotStr[str]]] = None, message_substring: Optional[str] = None, created_time: Optional[Union[dict[str, Any], TimestampRange, str]] = None, limit: int | None = 25) → ExtractionPipelineRunList

List runs for an extraction pipeline with given external_id

Parameters

external_id (str) – Extraction pipeline external Id.
statuses (RunStatus | Sequence[RunStatus] | SequenceNotStr[str] | None) – One or more among "success" / "failure" / "seen".
message_substring (str | None) – Failure message part.
created_time (dict[str, Any] | TimestampRange | str | None) – Range between two timestamps. Possible keys are min and max, with values given as timestamps in ms. If a string is passed, it is assumed to be the minimum value.
limit (int | None) – Maximum number of extraction pipeline runs to return. Defaults to 25. Set to -1, float("inf") or None to return all items.

Returns

List of requested extraction pipeline runs

Return type

ExtractionPipelineRunList

Tip

The created_time parameter can also be passed as a string, to support the most typical usage pattern of fetching the most recent runs, meaning it is implicitly assumed to be the minimum created time. The format is "N[timeunit]-ago", where timeunit is w, d, h or m (week, day, hour, minute), e.g. "12d-ago".

Examples

List extraction pipeline runs:

>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> runs_list = client.extraction_pipelines.runs.list(external_id="test ext id", limit=5)

Filter extraction pipeline runs on a given status:

>>> runs_list = client.extraction_pipelines.runs.list(external_id="test ext id", statuses=["seen"], limit=5)

Get all failed pipeline runs in the last 24 hours for pipeline ‘extId’:

>>> res = client.extraction_pipelines.runs.list(external_id="extId", statuses="failure", created_time="24h-ago")
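To make the "N[timeunit]-ago" shorthand from the tip concrete, the sketch below converts such a string into a minimum timestamp in milliseconds. It covers only the units named in the tip (w, d, h, m) and is an independent illustration, not the SDK's own parser; the name time_ago_to_ms and the now_ms argument are hypothetical.

```python
import re
import time

# Milliseconds per unit, for the units named in the tip above.
_UNIT_MS = {
    "w": 7 * 24 * 60 * 60 * 1000,
    "d": 24 * 60 * 60 * 1000,
    "h": 60 * 60 * 1000,
    "m": 60 * 1000,
}


def time_ago_to_ms(text, now_ms=None):
    """Translate e.g. '12d-ago' into ms since epoch, relative to `now_ms`."""
    match = re.fullmatch(r"(\d+)([wdhm])-ago", text)
    if match is None:
        raise ValueError(f"expected 'N[w|d|h|m]-ago', got {text!r}")
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    amount, unit = int(match.group(1)), match.group(2)
    return now_ms - amount * _UNIT_MS[unit]
```

Under this reading, created_time="24h-ago" corresponds to a range whose min is time_ago_to_ms("24h-ago") and whose max is left open.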

Report new runs

ExtractionPipelineRunsAPI.create(run: cognite.client.data_classes.extractionpipelines.ExtractionPipelineRun | cognite.client.data_classes.extractionpipelines.ExtractionPipelineRunWrite) → ExtractionPipelineRun

ExtractionPipelineRunsAPI.create(run: collections.abc.Sequence[cognite.client.data_classes.extractionpipelines.ExtractionPipelineRun] | collections.abc.Sequence[cognite.client.data_classes.extractionpipelines.ExtractionPipelineRunWrite]) → ExtractionPipelineRunList

Create one or more extraction pipeline runs.

You can create an arbitrary number of extraction pipeline runs, and the SDK will split the request into multiple requests.

Parameters: run (ExtractionPipelineRun | ExtractionPipelineRunWrite | Sequence[ExtractionPipelineRun] | Sequence[ExtractionPipelineRunWrite]) – Extraction pipeline run or list of extraction pipeline runs to create.
Returns: Created extraction pipeline run(s)
Return type: ExtractionPipelineRun | ExtractionPipelineRunList

Examples

Report a new extraction pipeline run:

>>> from cognite.client import CogniteClient
>>> from cognite.client.data_classes import ExtractionPipelineRunWrite
>>> client = CogniteClient()
>>> res = client.extraction_pipelines.runs.create(
...     ExtractionPipelineRunWrite(status="success", extpipe_external_id="extId"))
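A common pattern is to report a run automatically after a job finishes. The sketch below wraps a callable and reports "success" or "failure" accordingly. Everything here is an assumption for illustration: the helper run_and_report is hypothetical, run_factory would typically be ExtractionPipelineRunWrite, and the message keyword used for the failure text is assumed rather than shown in this section.

```python
def run_and_report(client, extpipe_external_id, job, run_factory):
    """Run `job()` and report the outcome as an extraction pipeline run.

    Hypothetical helper, not part of the SDK. `run_factory` builds the run
    object (for example ExtractionPipelineRunWrite); it is passed in so the
    sketch stays independent of the SDK import.
    """
    try:
        result = job()
    except Exception as exc:
        # Assumption: the run object accepts a `message` keyword.
        client.extraction_pipelines.runs.create(
            run_factory(
                status="failure",
                extpipe_external_id=extpipe_external_id,
                message=str(exc),
            )
        )
        raise  # re-raise so the caller still sees the original error
    client.extraction_pipelines.runs.create(
        run_factory(status="success", extpipe_external_id=extpipe_external_id)
    )
    return result
```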

Extraction pipeline configs

Get the latest or a specific config revision

ExtractionPipelineConfigsAPI.retrieve(external_id: str, revision: Optional[int] = None, active_at_time: Optional[int] = None) → ExtractionPipelineConfig

Retrieve a specific configuration revision, or the latest by default <https://developer.cognite.com/api#tag/Extraction-Pipelines-Config/operation/getExtPipeConfigRevision>

By default the latest configuration revision is retrieved, or you can specify a timestamp or a revision number.

Parameters

external_id (str) – External id of the extraction pipeline to retrieve config from.
revision (int | None) – Optionally specify a revision number to retrieve.
active_at_time (int | None) – Optionally specify a timestamp the configuration revision should be active.

Returns

Retrieved extraction pipeline configuration revision

Return type

ExtractionPipelineConfig

Examples

Retrieve latest config revision:

>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> res = client.extraction_pipelines.config.retrieve("extId")

List configuration revisions

ExtractionPipelineConfigsAPI.list(external_id: str) → ExtractionPipelineConfigRevisionList

Retrieve all configuration revisions from an extraction pipeline <https://developer.cognite.com/api#tag/Extraction-Pipelines-Config/operation/listExtPipeConfigRevisions>

Parameters: external_id (str) – External id of the extraction pipeline to retrieve config from.
Returns: Retrieved extraction pipeline configuration revisions
Return type: ExtractionPipelineConfigRevisionList

Examples

Retrieve a list of config revisions:

>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> res = client.extraction_pipelines.config.list("extId")

Create a config revision

ExtractionPipelineConfigsAPI.create(config: cognite.client.data_classes.extractionpipelines.ExtractionPipelineConfig | cognite.client.data_classes.extractionpipelines.ExtractionPipelineConfigWrite) → ExtractionPipelineConfig

Create a new configuration revision <https://developer.cognite.com/api#tag/Extraction-Pipelines-Config/operation/createExtPipeConfig>

Parameters: config (ExtractionPipelineConfig | ExtractionPipelineConfigWrite) – Configuration revision to create.
Returns: Created extraction pipeline configuration revision
Return type: ExtractionPipelineConfig

Examples

Create a config revision:

>>> from cognite.client import CogniteClient
>>> from cognite.client.data_classes import ExtractionPipelineConfigWrite
>>> client = CogniteClient()
>>> res = client.extraction_pipelines.config.create(ExtractionPipelineConfigWrite(external_id="extId", config="my config contents"))

Revert to an earlier config revision

ExtractionPipelineConfigsAPI.revert(external_id: str, revision: int) → ExtractionPipelineConfig

Revert to a previous configuration revision <https://developer.cognite.com/api#tag/Extraction-Pipelines-Config/operation/revertExtPipeConfigRevision>

Parameters

external_id (str) – External id of the extraction pipeline to revert revision for.
revision (int) – Revision to revert to.

Returns

New latest extraction pipeline configuration revision.

Return type

ExtractionPipelineConfig

Examples

Revert a config revision:

>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> res = client.extraction_pipelines.config.revert("extId", 5)

Extractor Config Data classes

class cognite.client.data_classes.extractionpipelines.ExtractionPipeline

Bases: ExtractionPipelineCore

An extraction pipeline is a representation of a process writing data to CDF, such as an extractor or an ETL tool. This is the reading version of the ExtractionPipeline class, which is used when retrieving extraction pipelines.

Parameters

id (int | None) – A server-generated ID for the object.
external_id (str | None) – The external ID provided by the client. Must be unique for the resource type.
name (str | None) – The name of the extraction pipeline.
description (str | None) – The description of the extraction pipeline.
data_set_id (int | None) – The id of the dataset this extraction pipeline is related to.
raw_tables (list[dict[str, str]] | None) – list of raw tables in list format: [{"dbName": "value", "tableName": "value"}].
last_success (int | None) – Milliseconds value of last success status.
last_failure (int | None) – Milliseconds value of last failure status.
last_message (str | None) – Message of last failure.
last_seen (int | None) – Milliseconds value of last seen status.
schedule (str | None) – One of None/On trigger/Continuous/cron regex.
contacts (list[ExtractionPipelineContact] | None) – list of contacts
metadata (dict[str, str] | None) – Custom, application specific metadata. String key -> String value. Limits: Maximum length of key is 128 bytes, value 10240 bytes, up to 256 key-value pairs, of total size at most 10240.
source (str | None) – Source text value for extraction pipeline.
documentation (str | None) – Documentation text value for extraction pipeline.
notification_config (ExtractionPipelineNotificationConfiguration | None) – Notification configuration for the extraction pipeline.
created_time (int | None) – The number of milliseconds since 00:00:00 Thursday, 1 January 1970, Coordinated Universal Time (UTC), minus leap seconds.
last_updated_time (int | None) – The number of milliseconds since 00:00:00 Thursday, 1 January 1970, Coordinated Universal Time (UTC), minus leap seconds.
created_by (str | None) – Extraction pipeline creator, usually an email.
cognite_client (CogniteClient | None) – The client to associate with this object.
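The timestamp fields above (last_success, last_failure, last_seen, created_time, last_updated_time) are integers counting milliseconds since the Unix epoch. A minimal sketch of converting such a value to a timezone-aware datetime and back, using only the standard library. The helper names ms_to_datetime and datetime_to_ms are illustrative and are not part of the SDK:

```python
from datetime import datetime, timedelta, timezone

# Reference point for all millisecond timestamps: 1970-01-01T00:00:00 UTC.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def ms_to_datetime(ms: int) -> datetime:
    """Convert milliseconds since the Unix epoch to an aware UTC datetime."""
    return EPOCH + timedelta(milliseconds=ms)


def datetime_to_ms(dt: datetime) -> int:
    """Convert an aware datetime to whole milliseconds since the Unix epoch."""
    return (dt - EPOCH) // timedelta(milliseconds=1)


# Hypothetical value, as it might appear in a created_time field.
created = ms_to_datetime(1_700_000_000_000)
print(created.isoformat())
```

Working with aware datetimes avoids silently mixing local time and UTC; passing a naive datetime to datetime_to_ms raises a TypeError instead of producing a shifted value.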

as_write() → ExtractionPipelineWrite: Returns this ExtractionPipeline as a ExtractionPipelineWrite

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineConfig

Bases: ExtractionPipelineConfigCore

An extraction pipeline config

Parameters

external_id (str | None) – The external ID of the associated extraction pipeline.
config (str | None) – Contents of this configuration revision.
revision (int | None) – The revision number of this config as a positive integer.
description (str | None) – Short description of this configuration revision.
created_time (int | None) – The number of milliseconds since 00:00:00 Thursday, 1 January 1970, Coordinated Universal Time (UTC), minus leap seconds.
cognite_client (CogniteClient | None) – The client to associate with this object.

as_write() → ExtractionPipelineConfigWrite: Returns this ExtractionPipelineConfig as a ExtractionPipelineConfigWrite

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineConfigCore(external_id: Optional[str] = None, config: Optional[str] = None, description: Optional[str] = None)

Bases: WriteableCogniteResource[ExtractionPipelineConfigWrite], ABC

An extraction pipeline config

Parameters

external_id (str | None) – The external ID of the associated extraction pipeline.
config (str | None) – Contents of this configuration revision.
description (str | None) – Short description of this configuration revision.

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineConfigList(resources: Iterable[Any], cognite_client: CogniteClient | None = None): Bases: WriteableCogniteResourceList[ExtractionPipelineConfigWrite, ExtractionPipelineConfig], ExternalIDTransformerMixin

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineConfigRevision(external_id: str | None = None, revision: int | None = None, description: str | None = None, created_time: int | None = None, cognite_client: CogniteClient | None = None)

Bases: CogniteResource

An extraction pipeline config revision

Parameters

external_id (str | None) – The external ID of the associated extraction pipeline.
revision (int | None) – The revision number of this config as a positive integer.
description (str | None) – Short description of this configuration revision.
created_time (int | None) – The number of milliseconds since 00:00:00 Thursday, 1 January 1970, Coordinated Universal Time (UTC), minus leap seconds.
cognite_client (CogniteClient | None) – The client to associate with this object.

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineConfigRevisionList(resources: Iterable[Any], cognite_client: CogniteClient | None = None): Bases: CogniteResourceList[ExtractionPipelineConfigRevision], ExternalIDTransformerMixin

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineConfigWrite(external_id: str, config: Optional[str] = None, description: Optional[str] = None)

Bases: ExtractionPipelineConfigCore

An extraction pipeline config

Parameters

external_id (str) – The external ID of the associated extraction pipeline.
config (str | None) – Contents of this configuration revision.
description (str | None) – Short description of this configuration revision.

as_write() → ExtractionPipelineConfigWrite: Returns this ExtractionPipelineConfigWrite instance.

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineConfigWriteList(resources: Iterable[Any], cognite_client: CogniteClient | None = None): Bases: CogniteResourceList[ExtractionPipelineConfigWrite], ExternalIDTransformerMixin

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineContact(name: Optional[str] = None, email: Optional[str] = None, role: Optional[str] = None, send_notification: Optional[bool] = None)

Bases: CogniteObject

A contact for an extraction pipeline

Parameters

name (str | None) – Name of contact
email (str | None) – Email address of contact
role (str | None) – Role of contact, such as Owner, Maintainer, etc.
send_notification (bool | None) – Whether to send notifications to this contact or not

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineCore(external_id: Optional[str] = None, name: Optional[str] = None, description: Optional[str] = None, data_set_id: Optional[int] = None, raw_tables: Optional[list[dict[str, str]]] = None, schedule: Optional[str] = None, contacts: Optional[list[cognite.client.data_classes.extractionpipelines.ExtractionPipelineContact]] = None, metadata: Optional[dict[str, str]] = None, source: Optional[str] = None, documentation: Optional[str] = None, notification_config: Optional[ExtractionPipelineNotificationConfiguration] = None, created_by: Optional[str] = None)

Bases: WriteableCogniteResource[ExtractionPipelineWrite], ABC

An extraction pipeline is a representation of a process writing data to CDF, such as an extractor or an ETL tool.

Parameters

external_id (str | None) – The external ID provided by the client. Must be unique for the resource type.
name (str | None) – The name of the extraction pipeline.
description (str | None) – The description of the extraction pipeline.
data_set_id (int | None) – The ID of the data set this extraction pipeline is related to.
raw_tables (list[dict[str, str]] | None) – List of raw tables, each given as a dict: [{"dbName": "value", "tableName": "value"}].
schedule (str | None) – One of None, "On trigger", "Continuous", or a cron expression.
contacts (list[ExtractionPipelineContact] | None) – List of contacts.
metadata (dict[str, str] | None) – Custom, application specific metadata. String key -> String value. Limits: Maximum length of key is 128 bytes, value 10240 bytes, up to 256 key-value pairs, of total size at most 10240.
source (str | None) – Source text value for extraction pipeline.
documentation (str | None) – Documentation text value for extraction pipeline.
notification_config (ExtractionPipelineNotificationConfiguration | None) – Notification configuration for the extraction pipeline.
created_by (str | None) – Extraction pipeline creator, usually an email.
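The metadata limits listed above (key at most 128 bytes, value at most 10240 bytes, at most 256 pairs, total size at most 10240) can be checked locally before sending data. The following is only a sketch: validate_metadata is a hypothetical helper, and it assumes that sizes are measured in UTF-8 bytes and that the total size is the sum of all key and value sizes, neither of which is stated in this reference.

```python
def validate_metadata(metadata: dict[str, str]) -> list[str]:
    """Return a list of limit violations; an empty list means none were found."""
    max_key_bytes = 128
    max_value_bytes = 10240
    max_pairs = 256
    max_total_bytes = 10240  # assumption: sum of UTF-8 key and value sizes

    problems = []
    if len(metadata) > max_pairs:
        problems.append(f"{len(metadata)} pairs exceeds the limit of {max_pairs}")

    total = 0
    for key, value in metadata.items():
        key_size = len(key.encode("utf-8"))
        value_size = len(value.encode("utf-8"))
        total += key_size + value_size
        if key_size > max_key_bytes:
            problems.append(f"key {key[:20]!r} is {key_size} bytes")
        if value_size > max_value_bytes:
            problems.append(f"value for key {key[:20]!r} is {value_size} bytes")

    if total > max_total_bytes:
        problems.append(f"total size {total} exceeds {max_total_bytes} bytes")
    return problems


print(validate_metadata({"site": "plant-a", "owner": "data-team"}))
```

Returning all violations at once, rather than raising on the first, makes it easier to report every problem in a large metadata dict in a single pass.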

dump(camel_case: bool = True) → dict[str, Any]

Dump the instance into a json serializable Python data type.

Parameters: camel_case (bool) – Use camelCase for attribute names. Defaults to True.
Returns: A dictionary representation of the instance.
Return type: dict[str, Any]
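To illustrate what the camel_case flag means for the dumped keys, here is a small standalone sketch. It does not call the SDK: to_camel_case and dump_sketch are hypothetical helpers, and the choice to omit attributes that are None is an assumption made for this example.

```python
from typing import Any


def to_camel_case(name: str) -> str:
    """Convert a snake_case attribute name to camelCase."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


def dump_sketch(attributes: dict[str, Any], camel_case: bool = True) -> dict[str, Any]:
    """Drop unset (None) values and optionally rename keys to camelCase."""
    items = {key: value for key, value in attributes.items() if value is not None}
    if camel_case:
        return {to_camel_case(key): value for key, value in items.items()}
    return items


attrs = {"external_id": "pipe-1", "data_set_id": 42, "description": None}
print(dump_sketch(attrs))                    # camelCase keys
print(dump_sketch(attrs, camel_case=False))  # snake_case keys
```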

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineList(resources: Iterable[Any], cognite_client: CogniteClient | None = None): Bases: WriteableCogniteResourceList[ExtractionPipelineWrite, ExtractionPipeline], IdTransformerMixin

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineNotificationConfiguration(allowed_not_seen_range_in_minutes: Optional[int] = None)

Bases: CogniteObject

Extraction pipeline notification configuration

Parameters: allowed_not_seen_range_in_minutes (int | None) – Time in minutes to pass without any Run. Null if extraction pipeline is not checked.
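As a rough illustration of the allowed_not_seen_range_in_minutes setting, the sketch below checks whether too much time has passed since the most recent run. How the service itself evaluates this is not described here; is_overdue is a hypothetical helper, and treating a pipeline with no recorded run as overdue is an assumption of this example.

```python
from typing import Optional


def is_overdue(
    last_run_ms: Optional[int],
    allowed_not_seen_range_in_minutes: Optional[int],
    now_ms: int,
) -> bool:
    """Return True if more than the allowed number of minutes passed since the last run."""
    if allowed_not_seen_range_in_minutes is None:
        return False  # None means the pipeline is not checked
    if last_run_ms is None:
        return True  # assumption: a pipeline with no run at all counts as overdue
    allowed_ms = allowed_not_seen_range_in_minutes * 60_000
    return now_ms - last_run_ms > allowed_ms


# Hypothetical timestamps in milliseconds since the epoch.
print(is_overdue(last_run_ms=0, allowed_not_seen_range_in_minutes=60, now_ms=3_600_001))
```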

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineRun

Bases: ExtractionPipelineRunCore

A representation of an extraction pipeline run.

Parameters

extpipe_external_id (str | None) – The external ID of the extraction pipeline.
status (str | None) – success/failure/seen.
message (str | None) – Optional status message.
created_time (int | None) – The number of milliseconds since 00:00:00 Thursday, 1 January 1970, Coordinated Universal Time (UTC), minus leap seconds.
cognite_client (CogniteClient | None) – The client to associate with this object.
id (int | None) – A server-generated ID for the object.

as_write() → ExtractionPipelineRunWrite: Returns this ExtractionPipelineRun as a ExtractionPipelineRunWrite

dump(camel_case: bool = True) → dict[str, Any]

Dump the instance into a json serializable Python data type.

Parameters: camel_case (bool) – Use camelCase for attribute names. Defaults to True.
Returns: A dictionary representation of the instance.
Return type: dict[str, Any]

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineRunCore(status: Optional[str] = None, message: Optional[str] = None, created_time: Optional[int] = None)

Bases: WriteableCogniteResource[ExtractionPipelineRunWrite], ABC

A representation of an extraction pipeline run.

Parameters

status (str | None) – success/failure/seen.
message (str | None) – Optional status message.
created_time (int | None) – The number of milliseconds since 00:00:00 Thursday, 1 January 1970, Coordinated Universal Time (UTC), minus leap seconds.

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineRunFilter(external_id: Optional[str] = None, statuses: Optional[SequenceNotStr[str]] = None, message: Optional[StringFilter] = None, created_time: Optional[Union[dict[str, Any], TimestampRange]] = None)

Bases: CogniteFilter

Filter runs with exact matching

Parameters

external_id (str | None) – The external ID of related ExtractionPipeline provided by the client. Must be unique for the resource type.
statuses (SequenceNotStr[str] | None) – success/failure/seen.
message (StringFilter | None) – message filter.
created_time (dict[str, Any] | TimestampRange | None) – Range between two timestamps.
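The filter fields above combine a status list, a message substring, and a time range. The following sketch applies the same three criteria to plain dicts locally, purely to show how they interact. It does not use the SDK: matches is a hypothetical helper, and the inclusive "min"/"max" keys for the time range and the case-sensitive substring match are assumptions of this example.

```python
from typing import Any, Optional, Sequence


def matches(
    run: dict[str, Any],
    statuses: Optional[Sequence[str]] = None,
    message_substring: Optional[str] = None,
    created_time: Optional[dict[str, int]] = None,
) -> bool:
    """Return True if the run dict satisfies every criterion that is set."""
    if statuses is not None and run.get("status") not in statuses:
        return False
    if message_substring is not None:
        if message_substring not in (run.get("message") or ""):
            return False
    if created_time is not None:
        ts = run.get("createdTime")
        if ts is None:
            return False
        if "min" in created_time and ts < created_time["min"]:
            return False
        if "max" in created_time and ts > created_time["max"]:
            return False
    return True


# Hypothetical run records.
runs = [
    {"status": "success", "message": "loaded 10 rows", "createdTime": 1000},
    {"status": "failure", "message": "timeout on source", "createdTime": 2000},
    {"status": "seen", "message": None, "createdTime": 3000},
]
failures = [r for r in runs if matches(r, statuses=["failure"], message_substring="timeout")]
print(failures)
```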

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineRunList(resources: Iterable[Any], cognite_client: CogniteClient | None = None): Bases: WriteableCogniteResourceList[ExtractionPipelineRunWrite, ExtractionPipelineRun], IdTransformerMixin

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineRunWrite(extpipe_external_id: str, status: Literal['success', 'failure', 'seen'], message: Optional[str] = None, created_time: Optional[int] = None)

Bases: ExtractionPipelineRunCore

A representation of an extraction pipeline run. This is the writing version of the ExtractionPipelineRun class, which is used when creating extraction pipeline runs.

Parameters

extpipe_external_id (str) – The external ID of the extraction pipeline.
status (Literal['success', 'failure', 'seen']) – success/failure/seen.
message (str | None) – Optional status message.
created_time (int | None) – The number of milliseconds since 00:00:00 Thursday, 1 January 1970, Coordinated Universal Time (UTC), minus leap seconds.

as_write() → ExtractionPipelineRunWrite: Returns this ExtractionPipelineRunWrite instance.

dump(camel_case: bool = True) → dict[str, Any]

Dump the instance into a json serializable Python data type.

Parameters: camel_case (bool) – Use camelCase for attribute names. Defaults to True.
Returns: A dictionary representation of the instance.
Return type: dict[str, Any]

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineRunWriteList(resources: Iterable[Any], cognite_client: CogniteClient | None = None): Bases: CogniteResourceList[ExtractionPipelineRunWrite]

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineUpdate(id: Optional[int] = None, external_id: Optional[str] = None)

Bases: CogniteUpdate

Changes applied to an extraction pipeline

Parameters

id (int) – A server-generated ID for the object.
external_id (str) – The external ID provided by the client. Must be unique for the resource type.

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineWrite(external_id: str, name: str, data_set_id: int, description: Optional[str] = None, raw_tables: Optional[list[dict[str, str]]] = None, schedule: Optional[str] = None, contacts: Optional[list[cognite.client.data_classes.extractionpipelines.ExtractionPipelineContact]] = None, metadata: Optional[dict[str, str]] = None, source: Optional[str] = None, documentation: Optional[str] = None, notification_config: Optional[ExtractionPipelineNotificationConfiguration] = None, created_by: Optional[str] = None)

Bases: ExtractionPipelineCore

An extraction pipeline is a representation of a process writing data to CDF, such as an extractor or an ETL tool. This is the writing version of the ExtractionPipeline class, which is used when creating extraction pipelines.

Parameters

external_id (str) – The external ID provided by the client. Must be unique for the resource type.
name (str) – The name of the extraction pipeline.
data_set_id (int) – The ID of the data set this extraction pipeline is related to.
description (str | None) – The description of the extraction pipeline.
raw_tables (list[dict[str, str]] | None) – List of raw tables, each given as a dict: [{"dbName": "value", "tableName": "value"}].
schedule (str | None) – One of None, "On trigger", "Continuous", or a cron expression.
contacts (list[ExtractionPipelineContact] | None) – List of contacts.
metadata (dict[str, str] | None) – Custom, application specific metadata. String key -> String value. Limits: Maximum length of key is 128 bytes, value 10240 bytes, up to 256 key-value pairs, of total size at most 10240.
source (str | None) – Source text value for extraction pipeline.
documentation (str | None) – Documentation text value for extraction pipeline.
notification_config (ExtractionPipelineNotificationConfiguration | None) – Notification configuration for the extraction pipeline.
created_by (str | None) – Extraction pipeline creator, usually an email.
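The raw_tables parameter expects a list of dicts with the keys dbName and tableName. A minimal sketch of assembling that structure from (database, table) pairs; build_raw_tables is a hypothetical helper and the database and table names are placeholders:

```python
from typing import Iterable


def build_raw_tables(pairs: Iterable[tuple[str, str]]) -> list[dict[str, str]]:
    """Turn (database name, table name) pairs into the raw_tables list format."""
    return [{"dbName": db, "tableName": table} for db, table in pairs]


raw_tables = build_raw_tables([("db1", "events"), ("db1", "assets")])
print(raw_tables)
```

The resulting list has the shape described for raw_tables and could be passed as that argument when constructing an ExtractionPipelineWrite.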

as_write() → ExtractionPipelineWrite: Returns this ExtractionPipelineWrite instance.

class cognite.client.data_classes.extractionpipelines.ExtractionPipelineWriteList(resources: Iterable[Any], cognite_client: CogniteClient | None = None): Bases: CogniteResourceList[ExtractionPipelineWrite], ExternalIDTransformerMixin

class cognite.client.data_classes.extractionpipelines.StringFilter(substring: Optional[str] = None)

Bases: CogniteFilter

Filter runs on substrings of the message

Parameters: substring (str | None) – Part of message