Data Ingestion
Raw
Databases
List databases
- RawDatabasesAPI.list(limit: int | None = 25) DatabaseList
-
- Parameters
limit (int | None) – Maximum number of databases to return. Defaults to 25. Set to -1, float("inf") or None to return all items.
- Returns
List of requested databases.
- Return type
DatabaseList
Examples
List the first 5 databases:
>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> db_list = client.raw.databases.list(limit=5)
Iterate over databases:
>>> for db in client.raw.databases:
...     db  # do something with the db
Iterate over chunks of databases to reduce memory load:
>>> for db_list in client.raw.databases(chunk_size=2500):
...     db_list  # do something with the dbs
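The limit parameter accepts three different spellings of "return all items". The sketch below shows one way a caller could normalize them before deciding how much to fetch. The helper names `is_unlimited` and `effective_limit` are hypothetical and not part of the SDK; how the SDK handles these values internally is not described in this reference.

```python
import math


def is_unlimited(limit):
    """Return True when `limit` is one of the documented 'return all items' values.

    Hypothetical helper for illustration only; not part of the SDK.
    """
    return limit is None or limit == -1 or limit == math.inf


def effective_limit(limit, available):
    """Number of items a listing over `available` items would yield under `limit`."""
    if is_unlimited(limit):
        return available
    return min(limit, available)
```

With 100 items available, `effective_limit(5, 100)` gives 5, while `None`, `-1` and `float("inf")` all give 100.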
Create new databases
- RawDatabasesAPI.create(name: str) Database
- RawDatabasesAPI.create(name: list[str]) DatabaseList
-
- Parameters
name (str | list[str]) – A db name or list of db names to create.
- Returns
Database or list of databases that have been created.
- Return type
Database | DatabaseList
Examples
Create a new database:
>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> res = client.raw.databases.create("db1")
Delete databases
- RawDatabasesAPI.delete(name: Union[str, SequenceNotStr[str]], recursive: bool = False) None
-
- Parameters
name (str | SequenceNotStr[str]) – A db name or list of db names to delete.
recursive (bool) – Recursively delete all tables in the database(s).
Examples
Delete a list of databases:
>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> client.raw.databases.delete(["db1", "db2"])
Tables
List tables in a database
- RawTablesAPI.list(db_name: str, limit: int | None = 25) TableList
-
- Parameters
db_name (str) – The database to list tables from.
limit (int | None) – Maximum number of tables to return. Defaults to 25. Set to -1, float("inf") or None to return all items.
- Returns
List of requested tables.
- Return type
TableList
Examples
List the first 5 tables:
>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> table_list = client.raw.tables.list("db1", limit=5)
Iterate over tables:
>>> for table in client.raw.tables(db_name="db1"):
...     table  # do something with the table
Iterate over chunks of tables to reduce memory load:
>>> for table_list in client.raw.tables(db_name="db1", chunk_size=2500):
...     table_list  # do something with the tables
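Chunked iteration keeps at most one chunk in memory at a time. The following is a generic sketch of that idea over any iterable, written with the standard library only. The function `chunked` is a hypothetical illustration and says nothing about how the SDK pages through results.

```python
from itertools import islice


def chunked(items, chunk_size):
    """Yield lists of at most `chunk_size` items from any iterable."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be a positive integer")
    iterator = iter(items)
    while True:
        # islice pulls only the next `chunk_size` items, so earlier chunks can be freed.
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk
```

The last chunk may be shorter than `chunk_size` when the total is not an exact multiple.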
Create new tables in a database
- RawTablesAPI.create(db_name: str, name: str) Table
- RawTablesAPI.create(db_name: str, name: list[str]) TableList
-
- Parameters
db_name (str) – Database to create the tables in.
name (str | list[str]) – A table name or list of table names to create.
- Returns
raw.Table or list of tables that have been created.
- Return type
Table | TableList
Examples
Create a new table in a database:
>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> res = client.raw.tables.create("db1", "table1")
Delete tables from a database
- RawTablesAPI.delete(db_name: str, name: Union[str, SequenceNotStr[str]]) None
-
- Parameters
db_name (str) – Database to delete tables from.
name (str | SequenceNotStr[str]) – A table name or list of table names to delete.
Examples
Delete a list of tables:
>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> res = client.raw.tables.delete("db1", ["table1", "table2"])
Rows
Get a row from a table
- RawRowsAPI.retrieve(db_name: str, table_name: str, key: str) cognite.client.data_classes.raw.Row | None
-
- Parameters
db_name (str) – Name of the database.
table_name (str) – Name of the table.
key (str) – The key of the row to retrieve.
- Returns
The requested row.
- Return type
Row | None
Examples
Retrieve a row with key ‘k1’ from table ‘t1’ in database ‘db1’:
>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> row = client.raw.rows.retrieve("db1", "t1", "k1")
You may access the data directly on the row (like a dict), or use ‘.get’ when keys can be missing:
>>> val1 = row["col1"]
>>> val2 = row.get("col2")
List rows in a table
- RawRowsAPI.list(db_name: str, table_name: str, min_last_updated_time: Optional[int] = None, max_last_updated_time: Optional[int] = None, columns: Optional[list[str]] = None, limit: int | None = 25, partitions: Optional[int] = None) RowList
-
- Parameters
db_name (str) – Name of the database.
table_name (str) – Name of the table.
min_last_updated_time (int | None) – Rows must have been last updated after this time (exclusive). ms since epoch.
max_last_updated_time (int | None) – Rows must have been last updated before this time (inclusive). ms since epoch.
columns (list[str] | None) – List of column keys. Set to None for retrieving all, use [] to retrieve only row keys.
limit (int | None) – The number of rows to retrieve. Can be used with partitions. Defaults to 25. Set to -1, float("inf") or None to return all items.
partitions (int | None) – Retrieve rows in parallel using this number of workers. Can be used together with a (large) finite limit. When partitions is not passed, it defaults to 1, i.e. no concurrency for a finite limit and global_config.max_workers for an unlimited query (will be capped at this value). To prevent unexpected problems and maximize read throughput, check out concurrency limits in the API documentation.
- Returns
The requested rows.
- Return type
RowList
Examples
List a few rows:
>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> row_list = client.raw.rows.list("db1", "tbl1", limit=5)
Read an entire table efficiently by using concurrency (default behavior when limit=None):
>>> row_list = client.raw.rows.list("db1", "tbl1", limit=None)
Iterate through all rows one-by-one to reduce memory load (no concurrency used):
>>> for row in client.raw.rows("db1", "t1", columns=["col1", "col2"]):
...     val1 = row["col1"]  # You may access the data directly
...     val2 = row.get("col2")  # ...or use '.get' when keys can be missing
Iterate through all rows, one chunk at a time, to reduce memory load (no concurrency used):
>>> for row_list in client.raw.rows("db1", "t1", chunk_size=2500):
...     row_list  # Do something with the rows
Iterate through a massive table to reduce memory load while using concurrency for high throughput. Note: partitions must be specified for concurrency to be used (this is different from list() to keep backward compatibility). Supplying a finite limit does not affect concurrency settings (except for very small values).
>>> rows_iterator = client.raw.rows(
...     db_name="db1", table_name="t1", partitions=5, chunk_size=5000, limit=1_000_000
... )
>>> for row_list in rows_iterator:
...     row_list  # Do something with the rows
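The min_last_updated_time and max_last_updated_time filters are given in milliseconds since the epoch, with an exclusive lower bound and an inclusive upper bound. A minimal sketch of both ideas in plain Python follows. The helpers `to_ms` and `in_update_window` are hypothetical names used only to model the documented bounds; they are not SDK functions.

```python
from datetime import datetime, timezone


def to_ms(dt):
    """Convert a timezone-aware datetime to milliseconds since the Unix epoch."""
    return round(dt.timestamp() * 1000)


def in_update_window(last_updated_time, min_last_updated_time=None, max_last_updated_time=None):
    """Model the documented bounds: after `min` (exclusive), up to `max` (inclusive)."""
    if min_last_updated_time is not None and last_updated_time <= min_last_updated_time:
        return False
    if max_last_updated_time is not None and last_updated_time > max_last_updated_time:
        return False
    return True
```

A row updated exactly at the minimum is excluded, while a row updated exactly at the maximum is included.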
Insert rows into a table
- RawRowsAPI.insert(db_name: str, table_name: str, row: collections.abc.Sequence[cognite.client.data_classes.raw.Row] | collections.abc.Sequence[cognite.client.data_classes.raw.RowWrite] | cognite.client.data_classes.raw.Row | cognite.client.data_classes.raw.RowWrite | dict, ensure_parent: bool = False) None
Insert one or more rows into a table.
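- Parameters
db_name (str) – Name of the database.
table_name (str) – Name of the table.
row (Sequence[Row] | Sequence[RowWrite] | Row | RowWrite | dict) – The row(s) to insert.
ensure_parent (bool) – Create database/table if they don’t already exist.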
Examples
Insert new rows into a table:
>>> from cognite.client import CogniteClient
>>> from cognite.client.data_classes import RowWrite
>>> client = CogniteClient()
>>> rows = [RowWrite(key="r1", columns={"col1": "val1", "col2": "val1"}),
...         RowWrite(key="r2", columns={"col1": "val2", "col2": "val2"})]
>>> client.raw.rows.insert("db1", "table1", rows)
You may also insert a dictionary directly:
>>> rows = {
...     "key-1": {"col1": 1, "col2": 2},
...     "key-2": {"col1": 3, "col2": 4, "col3": "high five"},
... }
>>> client.raw.rows.insert("db1", "table1", rows)
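The dictionary form maps each row key to its columns, so it carries the same information as a list of write rows. The sketch below makes that correspondence explicit with a stand-in dataclass. `RowRecord` and `rows_from_mapping` are hypothetical names for illustration, and the SDK's own conversion is not described in this reference.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class RowRecord:
    """Stand-in for a write row: a unique key plus a JSON-like columns mapping."""

    key: str
    columns: dict[str, Any]


def rows_from_mapping(mapping):
    """Turn {row_key: {column: value}} into a list of RowRecord objects."""
    # dict(columns) copies each inner mapping so the records do not alias the input.
    return [RowRecord(key=key, columns=dict(columns)) for key, columns in mapping.items()]
```

Rows may have different column sets, as in the "key-2" entry above, because each row stores its own mapping.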
Delete rows from a table
- RawRowsAPI.delete(db_name: str, table_name: str, key: Union[str, SequenceNotStr[str]]) None
-
- Parameters
db_name (str) – Name of the database.
table_name (str) – Name of the table.
key (str | SequenceNotStr[str]) – The key(s) of the row(s) to delete.
Examples
Delete rows from table:
>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> keys_to_delete = ["k1", "k2", "k3"]
>>> client.raw.rows.delete("db1", "table1", keys_to_delete)
Retrieve pandas dataframe
- RawRowsAPI.retrieve_dataframe(db_name: str, table_name: str, min_last_updated_time: int | None = None, max_last_updated_time: int | None = None, columns: list[str] | None = None, limit: int | None = 25, partitions: int | None = None, last_updated_time_in_index: bool = False) pd.DataFrame
Retrieve rows in a table as a pandas dataframe.
Row keys are used as the index.
- Parameters
db_name (str) – Name of the database.
table_name (str) – Name of the table.
min_last_updated_time (int | None) – Rows must have been last updated after this time. ms since epoch.
max_last_updated_time (int | None) – Rows must have been last updated before this time. ms since epoch.
columns (list[str] | None) – List of column keys. Set to None for retrieving all, use [] to retrieve only row keys.
limit (int | None) – The number of rows to retrieve. Defaults to 25. Set to -1, float("inf") or None to return all items.
partitions (int | None) – Retrieve rows in parallel using this number of workers. Can be used together with a (large) finite limit. When partitions is not passed, it defaults to 1, i.e. no concurrency for a finite limit and global_config.max_workers for an unlimited query (will be capped at this value). To prevent unexpected problems and maximize read throughput, check out concurrency limits in the API documentation.
last_updated_time_in_index (bool) – Use a MultiIndex with row keys and last_updated_time as index.
- Returns
The requested rows in a pandas dataframe.
- Return type
pd.DataFrame
Examples
Get dataframe:
>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> df = client.raw.rows.retrieve_dataframe("db1", "t1", limit=5)
Insert pandas dataframe
- RawRowsAPI.insert_dataframe(db_name: str, table_name: str, dataframe: pd.DataFrame, ensure_parent: bool = False, dropna: bool = True) None
Insert a pandas dataframe into a table.
Uses index for row keys.
- Parameters
db_name (str) – Name of the database.
table_name (str) – Name of the table.
dataframe (pd.DataFrame) – The dataframe to insert. Index will be used as row keys.
ensure_parent (bool) – Create database/table if they don’t already exist.
dropna (bool) – Remove NaNs (but keep None’s in dtype=object columns) before inserting. Done individually per column. Default: True
Examples
Insert new rows into a table:
>>> import pandas as pd
>>> from cognite.client import CogniteClient
>>>
>>> client = CogniteClient()
>>> df = pd.DataFrame(
...     {"col-a": [1, 3, None], "col-b": [2, -1, 9]},
...     index=["r1", "r2", "r3"])
>>> res = client.raw.rows.insert_dataframe(
...     "db1", "table1", df, dropna=True)
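The dropna option removes NaN values per column, so a row loses only the affected cell and keeps its other columns. In the example above, pandas stores the None in the numeric column "col-a" as NaN, so row "r3" would be left with "col-b" only. The sketch below approximates this in plain Python without pandas or the SDK. The function `rows_without_nans` is a hypothetical illustration, not the SDK's implementation.

```python
import math


def rows_without_nans(index, columns):
    """Build {row_key: {column: value}}, skipping cells that hold a float NaN.

    None values are kept, mirroring the documented handling of None in object columns.
    """
    rows = {key: {} for key in index}
    for name, values in columns.items():
        for key, value in zip(index, values):
            if isinstance(value, float) and math.isnan(value):
                continue  # drop this cell only; the row's other columns are unaffected
            rows[key][name] = value
    return rows
```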
RAW Data classes
- class cognite.client.data_classes.raw.Database(name: str | None = None, created_time: int | None = None, cognite_client: CogniteClient | None = None)
Bases:
DatabaseCore
A NoSQL database to store customer data.
- Parameters
name (str | None) – Unique name of a database.
created_time (int | None) – Time the database was created.
cognite_client (CogniteClient | None) – The client to associate with this object.
- as_write() DatabaseWrite
Returns this Database as a DatabaseWrite
- class cognite.client.data_classes.raw.DatabaseCore(name: Optional[str] = None)
Bases:
WriteableCogniteResource[DatabaseWrite], ABC
A NoSQL database to store customer data.
- Parameters
name (str | None) – Unique name of a database.
- class cognite.client.data_classes.raw.DatabaseList(resources: Iterable[Any], cognite_client: CogniteClient | None = None)
Bases:
WriteableCogniteResourceList[DatabaseWrite, Database], NameTransformerMixin
- as_write() DatabaseWriteList
Returns this DatabaseList as a DatabaseWriteList
- class cognite.client.data_classes.raw.DatabaseWrite(name: str)
Bases:
DatabaseCore
A NoSQL database to store customer data.
- Parameters
name (str) – Unique name of a database.
- as_write() DatabaseWrite
Returns this DatabaseWrite instance.
- class cognite.client.data_classes.raw.DatabaseWriteList(resources: Iterable[Any], cognite_client: CogniteClient | None = None)
Bases:
CogniteResourceList[DatabaseWrite], NameTransformerMixin
- class cognite.client.data_classes.raw.Row(key: str | None = None, columns: dict[str, Any] | None = None, last_updated_time: int | None = None, cognite_client: CogniteClient | None = None)
Bases:
RowCore
This represents a row in a NoSQL table. This is the reading version of the Row class, which is used when retrieving a row.
- Parameters
key (str | None) – Unique row key
columns (dict[str, Any] | None) – Row data stored as a JSON object.
last_updated_time (int | None) – The number of milliseconds since 00:00:00 Thursday, 1 January 1970, Coordinated Universal Time (UTC), minus leap seconds.
cognite_client (CogniteClient | None) – The client to associate with this object.
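Timestamps such as last_updated_time are integers counting milliseconds since the Unix epoch in UTC. A small sketch of converting such a value to a timezone-aware datetime with the standard library follows. The helper name `ms_to_datetime` is hypothetical and not an SDK function.

```python
from datetime import datetime, timedelta, timezone

_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def ms_to_datetime(ms):
    """Convert milliseconds since the Unix epoch to a UTC datetime."""
    # Adding a timedelta avoids float rounding on large millisecond values.
    return _EPOCH + timedelta(milliseconds=ms)
```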
- class cognite.client.data_classes.raw.RowCore(key: Optional[str] = None, columns: Optional[dict[str, Any]] = None)
Bases:
WriteableCogniteResource[RowWrite], ABC
No description.
- Parameters
key (str | None) – Unique row key
columns (dict[str, Any] | None) – Row data stored as a JSON object.
- to_pandas() pandas.DataFrame
Convert the instance into a pandas DataFrame.
- Returns
The pandas DataFrame representing this instance.
- Return type
pandas.DataFrame
- class cognite.client.data_classes.raw.RowList(resources: Iterable[Any], cognite_client: CogniteClient | None = None)
Bases:
RowListCore[Row]
- as_write() RowWriteList
Returns this RowList as a RowWriteList
- class cognite.client.data_classes.raw.RowListCore(resources: Iterable[Any], cognite_client: CogniteClient | None = None)
Bases:
WriteableCogniteResourceList[RowWrite, T_Row], ABC
- to_pandas() pandas.DataFrame
Convert the instance into a pandas DataFrame.
- Returns
The pandas DataFrame representing this instance.
- Return type
pandas.DataFrame
- class cognite.client.data_classes.raw.RowWrite(key: str, columns: dict[str, Any])
Bases:
RowCore
This represents a row in a NoSQL table. This is the writing version of the Row class, which is used when creating a row.
- Parameters
key (str) – Unique row key
columns (dict[str, Any]) – Row data stored as a JSON object.
- class cognite.client.data_classes.raw.RowWriteList(resources: Iterable[Any], cognite_client: CogniteClient | None = None)
Bases:
RowListCore[RowWrite]
- class cognite.client.data_classes.raw.Table(name: str | None = None, created_time: int | None = None, cognite_client: CogniteClient | None = None)
Bases:
TableCore
A NoSQL database table to store customer data. This is the reading version of the Table class, which is used when retrieving a table.
- Parameters
name (str | None) – Unique name of the table
created_time (int | None) – Time the table was created.
cognite_client (CogniteClient | None) – The client to associate with this object.
- as_write() TableWrite
Returns this Table as a TableWrite
- rows(key: str, limit: int | None = None) cognite.client.data_classes.raw.Row | None
- rows(key: None = None, limit: int | None = None) RowList
Get the rows in this table.
- class cognite.client.data_classes.raw.TableCore(name: Optional[str] = None)
Bases:
WriteableCogniteResource[TableWrite]
A NoSQL database table to store customer data.
- Parameters
name (str | None) – Unique name of the table
- class cognite.client.data_classes.raw.TableList(resources: Iterable[Any], cognite_client: CogniteClient | None = None)
Bases:
WriteableCogniteResourceList[TableWrite, Table], NameTransformerMixin
- as_write() TableWriteList
Returns this TableList as a TableWriteList
- class cognite.client.data_classes.raw.TableWrite(name: str)
Bases:
TableCore
A NoSQL database table to store customer data. This is the writing version of the Table class, which is used when creating a table.
- Parameters
name (str) – Unique name of the table
- as_write() TableWrite
Returns this TableWrite instance.
- class cognite.client.data_classes.raw.TableWriteList(resources: Iterable[Any], cognite_client: CogniteClient | None = None)
Bases:
CogniteResourceList[TableWrite], NameTransformerMixin
Extraction pipelines
List extraction pipelines
- ExtractionPipelinesAPI.list(limit: int | None = 25) ExtractionPipelineList
-
- Parameters
limit (int | None) – Maximum number of ExtractionPipelines to return. Defaults to 25. Set to -1, float("inf") or None to return all items.
- Returns
List of requested ExtractionPipelines
- Return type
ExtractionPipelineList
Examples
List ExtractionPipelines:
>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> ep_list = client.extraction_pipelines.list(limit=5)
Create extraction pipeline
- ExtractionPipelinesAPI.create(extraction_pipeline: cognite.client.data_classes.extractionpipelines.ExtractionPipeline | cognite.client.data_classes.extractionpipelines.ExtractionPipelineWrite) ExtractionPipeline
- ExtractionPipelinesAPI.create(extraction_pipeline: collections.abc.Sequence[cognite.client.data_classes.extractionpipelines.ExtractionPipeline] | collections.abc.Sequence[cognite.client.data_classes.extractionpipelines.ExtractionPipelineWrite]) ExtractionPipelineList
Create one or more extraction pipelines.
You can create an arbitrary number of extraction pipelines, and the SDK will split the request into multiple requests if necessary.
- Parameters
extraction_pipeline (ExtractionPipeline | ExtractionPipelineWrite | Sequence[ExtractionPipeline] | Sequence[ExtractionPipelineWrite]) – Extraction pipeline or list of extraction pipelines to create.
- Returns
Created extraction pipeline(s)
- Return type
ExtractionPipeline | ExtractionPipelineList
Examples
Create new extraction pipelines:
>>> from cognite.client import CogniteClient
>>> from cognite.client.data_classes import ExtractionPipelineWrite
>>> client = CogniteClient()
>>> extpipes = [ExtractionPipelineWrite(name="extPipe1",...), ExtractionPipelineWrite(name="extPipe2",...)]
>>> res = client.extraction_pipelines.create(extpipes)
Retrieve an extraction pipeline by ID
- ExtractionPipelinesAPI.retrieve(id: Optional[int] = None, external_id: Optional[str] = None) cognite.client.data_classes.extractionpipelines.ExtractionPipeline | None
Retrieve a single extraction pipeline by id.
- Parameters
id (int | None) – ID
external_id (str | None) – External ID
- Returns
Requested extraction pipeline or None if it does not exist.
- Return type
ExtractionPipeline | None
Examples
Get extraction pipeline by id:
>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> res = client.extraction_pipelines.retrieve(id=1)
Get extraction pipeline by external id:
>>> res = client.extraction_pipelines.retrieve(external_id="1")
Retrieve multiple extraction pipelines by ID
- ExtractionPipelinesAPI.retrieve_multiple(ids: Optional[Sequence[int]] = None, external_ids: Optional[SequenceNotStr[str]] = None, ignore_unknown_ids: bool = False) ExtractionPipelineList
Retrieve multiple extraction pipelines by ids and external ids.
- Parameters
ids (Sequence[int] | None) – IDs
external_ids (SequenceNotStr[str] | None) – External IDs
ignore_unknown_ids (bool) – Ignore IDs and external IDs that are not found rather than throw an exception.
- Returns
The requested ExtractionPipelines.
- Return type
ExtractionPipelineList
Examples
Get ExtractionPipelines by id:
>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> res = client.extraction_pipelines.retrieve_multiple(ids=[1, 2, 3])
Get ExtractionPipelines by external id:
>>> res = client.extraction_pipelines.retrieve_multiple(external_ids=["abc", "def"], ignore_unknown_ids=True)
Update extraction pipelines
- ExtractionPipelinesAPI.update(item: cognite.client.data_classes.extractionpipelines.ExtractionPipeline | cognite.client.data_classes.extractionpipelines.ExtractionPipelineWrite | cognite.client.data_classes.extractionpipelines.ExtractionPipelineUpdate) ExtractionPipeline
- ExtractionPipelinesAPI.update(item: Sequence[cognite.client.data_classes.extractionpipelines.ExtractionPipeline | cognite.client.data_classes.extractionpipelines.ExtractionPipelineWrite | cognite.client.data_classes.extractionpipelines.ExtractionPipelineUpdate]) ExtractionPipelineList
Update one or more extraction pipelines
- Parameters
item (ExtractionPipeline | ExtractionPipelineWrite | ExtractionPipelineUpdate | Sequence[ExtractionPipeline | ExtractionPipelineWrite | ExtractionPipelineUpdate]) – Extraction pipeline(s) to update
mode (Literal['replace_ignore_null', 'patch', 'replace']) – How to update data when a non-update object is given (ExtractionPipeline or -Write). If you use ‘replace_ignore_null’, only the fields you have set will be used to replace existing (default). Using ‘replace’ will additionally clear all the fields that are not specified by you. Last option, ‘patch’, will update only the fields you have set and for container-like fields such as metadata or labels, add the values to the existing. For more details, see Update and Upsert Mode Parameter.
- Returns
Updated extraction pipeline(s)
- Return type
ExtractionPipeline | ExtractionPipelineList
Examples
Perform a partial update on an extraction pipeline, setting a new description:
>>> from cognite.client import CogniteClient
>>> from cognite.client.data_classes import ExtractionPipelineUpdate
>>> client = CogniteClient()
>>> update = ExtractionPipelineUpdate(id=1)
>>> update.description.set("Another new extpipe")
>>> res = client.extraction_pipelines.update(update)
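The three values of mode differ in how unset fields and container-like fields are treated. The sketch below models the description above on plain dictionaries, where a field that is missing or None counts as "not set". The function `apply_update` and its `container_fields` argument are hypothetical, and this is a simplified model rather than the SDK's or the API's implementation.

```python
def apply_update(existing, given, mode="replace_ignore_null", container_fields=("metadata",)):
    """Return the fields that would result from applying `given` on top of `existing`."""
    set_fields = {name: value for name, value in given.items() if value is not None}
    if mode == "replace_ignore_null":
        # Only the fields that were set replace existing values.
        return {**existing, **set_fields}
    if mode == "replace":
        # Fields that were not specified are additionally cleared.
        cleared = {name: None for name in existing}
        return {**cleared, **set_fields}
    if mode == "patch":
        result = dict(existing)
        for name, value in set_fields.items():
            if name in container_fields and isinstance(result.get(name), dict):
                # Container-like fields get the new values added to the existing ones.
                result[name] = {**result[name], **value}
            else:
                result[name] = value
        return result
    raise ValueError(f"unknown mode: {mode!r}")
```

For an existing pipeline with metadata `{"a": "1"}` and an update carrying metadata `{"b": "2"}`, the two replace modes end with `{"b": "2"}`, while "patch" ends with both keys.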
Delete extraction pipelines
- ExtractionPipelinesAPI.delete(id: Optional[Union[int, Sequence[int]]] = None, external_id: Optional[Union[str, SequenceNotStr[str]]] = None) None
Delete one or more extraction pipelines
- Parameters
id (int | Sequence[int] | None) – Id or list of ids
external_id (str | SequenceNotStr[str] | None) – External ID or list of external ids
Examples
Delete extraction pipelines by id or external id:
>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> client.extraction_pipelines.delete(id=[1,2,3], external_id="3")
Extraction pipeline runs
List runs for an extraction pipeline
- ExtractionPipelineRunsAPI.list(external_id: str, statuses: Optional[Union[Literal['success', 'failure', 'seen'], Sequence[Literal['success', 'failure', 'seen']], SequenceNotStr[str]]] = None, message_substring: Optional[str] = None, created_time: Optional[Union[dict[str, Any], TimestampRange, str]] = None, limit: int | None = 25) ExtractionPipelineRunList
List runs for an extraction pipeline with given external_id
- Parameters
external_id (str) – Extraction pipeline external Id.
statuses (RunStatus | Sequence[RunStatus] | SequenceNotStr[str] | None) – One or more among “success” / “failure” / “seen”.
message_substring (str | None) – Failure message part.
created_time (dict[str, Any] | TimestampRange | str | None) – Range between two timestamps. Possible keys are min and max, with values given as timestamps in ms. If a string is passed, it is assumed to be the minimum value.
limit (int | None) – Maximum number of extraction pipeline runs to return. Defaults to 25. Set to -1, float("inf") or None to return all items.
- Returns
List of requested extraction pipeline runs
- Return type
ExtractionPipelineRunList
Tip
The created_time parameter can also be passed as a string to support the most typical usage pattern of fetching the most recent runs; the string is implicitly taken as the minimum created time. The format is “N[timeunit]-ago”, where timeunit is w, d, h or m (week, day, hour, minute), e.g. “12d-ago”.
Examples
List extraction pipeline runs:
>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> runs_list = client.extraction_pipelines.runs.list(external_id="test ext id", limit=5)
Filter extraction pipeline runs on a given status:
>>> runs_list = client.extraction_pipelines.runs.list(external_id="test ext id", statuses=["seen"], limit=5)
Get all failed pipeline runs in the last 24 hours for pipeline ‘extId’:
>>> from cognite.client.data_classes import ExtractionPipelineRun
>>> res = client.extraction_pipelines.runs.list(external_id="extId", statuses="failure", created_time="24h-ago")
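A string such as "24h-ago" stands for a minimum created time, which is equivalent to a range with only the min key set. The sketch below shows how such a string could be turned into that range, covering only the four units named in the tip. The names `time_ago_to_ms` and `created_time_range` are hypothetical, and the SDK's own parser may accept other forms.

```python
import re
from datetime import datetime, timedelta, timezone

# Units named in the tip: week, day, hour, minute.
_UNIT_TO_TIMEDELTA_ARG = {"w": "weeks", "d": "days", "h": "hours", "m": "minutes"}
_TIME_AGO = re.compile(r"^(\d+)([wdhm])-ago$")


def time_ago_to_ms(expression, now=None):
    """Convert 'N[w|d|h|m]-ago' to milliseconds since the epoch, relative to `now`."""
    match = _TIME_AGO.match(expression)
    if match is None:
        raise ValueError(f"expected 'N[w|d|h|m]-ago', got {expression!r}")
    amount, unit = int(match.group(1)), match.group(2)
    if now is None:
        now = datetime.now(timezone.utc)
    moment = now - timedelta(**{_UNIT_TO_TIMEDELTA_ARG[unit]: amount})
    return round(moment.timestamp() * 1000)


def created_time_range(expression, now=None):
    """Express the string as a range with only the minimum timestamp set."""
    return {"min": time_ago_to_ms(expression, now)}
```

Passing an explicit `now` keeps the result reproducible, which is convenient in tests.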
Report new runs
- ExtractionPipelineRunsAPI.create(run: cognite.client.data_classes.extractionpipelines.ExtractionPipelineRun | cognite.client.data_classes.extractionpipelines.ExtractionPipelineRunWrite) ExtractionPipelineRun
- ExtractionPipelineRunsAPI.create(run: collections.abc.Sequence[cognite.client.data_classes.extractionpipelines.ExtractionPipelineRun] | collections.abc.Sequence[cognite.client.data_classes.extractionpipelines.ExtractionPipelineRunWrite]) ExtractionPipelineRunList
Create one or more extraction pipeline runs.
You can create an arbitrary number of extraction pipeline runs, and the SDK will split the request into multiple requests.
- Parameters
run (ExtractionPipelineRun | ExtractionPipelineRunWrite | Sequence[ExtractionPipelineRun] | Sequence[ExtractionPipelineRunWrite]) – Extraction pipeline run or list of extraction pipeline runs to create.
- Returns
Created extraction pipeline run(s)
- Return type
ExtractionPipelineRun | ExtractionPipelineRunList
Examples
Report a new extraction pipeline run:
>>> from cognite.client import CogniteClient
>>> from cognite.client.data_classes import ExtractionPipelineRunWrite
>>> client = CogniteClient()
>>> res = client.extraction_pipelines.runs.create(
...     ExtractionPipelineRunWrite(status="success", extpipe_external_id="extId"))
Extraction pipeline configs
Get the latest or a specific config revision
- ExtractionPipelineConfigsAPI.retrieve(external_id: str, revision: Optional[int] = None, active_at_time: Optional[int] = None) ExtractionPipelineConfig
Retrieve a specific configuration revision, or the latest by default <https://developer.cognite.com/api#tag/Extraction-Pipelines-Config/operation/getExtPipeConfigRevision>
By default the latest configuration revision is retrieved, or you can specify a timestamp or a revision number.
- Parameters
external_id (str) – External id of the extraction pipeline to retrieve config from.
revision (int | None) – Optionally specify a revision number to retrieve.
active_at_time (int | None) – Optionally specify a timestamp the configuration revision should be active.
- Returns
Retrieved extraction pipeline configuration revision
- Return type
ExtractionPipelineConfig
Examples
Retrieve latest config revision:
>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> res = client.extraction_pipelines.config.retrieve("extId")
List configuration revisions
- ExtractionPipelineConfigsAPI.list(external_id: str) ExtractionPipelineConfigRevisionList
Retrieve all configuration revisions from an extraction pipeline <https://developer.cognite.com/api#tag/Extraction-Pipelines-Config/operation/listExtPipeConfigRevisions>
- Parameters
external_id (str) – External id of the extraction pipeline to retrieve config from.
- Returns
Retrieved extraction pipeline configuration revisions
- Return type
ExtractionPipelineConfigRevisionList
Examples
Retrieve a list of config revisions:
>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> res = client.extraction_pipelines.config.list("extId")
Create a config revision
- ExtractionPipelineConfigsAPI.create(config: cognite.client.data_classes.extractionpipelines.ExtractionPipelineConfig | cognite.client.data_classes.extractionpipelines.ExtractionPipelineConfigWrite) ExtractionPipelineConfig
Create a new configuration revision <https://developer.cognite.com/api#tag/Extraction-Pipelines-Config/operation/createExtPipeConfig>
- Parameters
config (ExtractionPipelineConfig | ExtractionPipelineConfigWrite) – Configuration revision to create.
- Returns
Created extraction pipeline configuration revision
- Return type
ExtractionPipelineConfig
Examples
Create a config revision:
>>> from cognite.client import CogniteClient
>>> from cognite.client.data_classes import ExtractionPipelineConfigWrite
>>> client = CogniteClient()
>>> res = client.extraction_pipelines.config.create(ExtractionPipelineConfigWrite(external_id="extId", config="my config contents"))
Revert to an earlier config revision
- ExtractionPipelineConfigsAPI.revert(external_id: str, revision: int) ExtractionPipelineConfig
Revert to a previous configuration revision <https://developer.cognite.com/api#tag/Extraction-Pipelines-Config/operation/revertExtPipeConfigRevision>
- Parameters
external_id (str) – External id of the extraction pipeline to revert revision for.
revision (int) – Revision to revert to.
- Returns
New latest extraction pipeline configuration revision.
- Return type
ExtractionPipelineConfig
Examples
Revert a config revision:
>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> res = client.extraction_pipelines.config.revert("extId", 5)
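The create, retrieve and revert operations can be pictured as working on an append-only list of revisions. The sketch below is an in-memory model built on explicit assumptions that this reference does not confirm: revisions are numbered from 1, revert appends a new revision that copies the old contents, and active_at_time selects the newest revision created at or before that time. `ConfigRevision` and `RevisionStore` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class ConfigRevision:
    revision: int
    config: str
    created_time: int  # milliseconds since the epoch


@dataclass
class RevisionStore:
    """Assumed model: an append-only list of config revisions for one pipeline."""

    revisions: list = field(default_factory=list)

    def create(self, config, created_time):
        new = ConfigRevision(len(self.revisions) + 1, config, created_time)
        self.revisions.append(new)
        return new

    def retrieve(self, revision=None, active_at_time=None):
        if revision is not None:
            return self.revisions[revision - 1]
        if active_at_time is not None:
            # Assumption: the active revision is the newest one created by that time.
            candidates = [r for r in self.revisions if r.created_time <= active_at_time]
            if not candidates:
                raise LookupError("no revision was active at the given time")
            return candidates[-1]
        return self.revisions[-1]  # latest by default

    def revert(self, revision, created_time):
        # Assumption: reverting appends a new latest revision with the old contents.
        return self.create(self.retrieve(revision=revision).config, created_time)
```

Under this model, history is never rewritten, so the revision that was reverted away from stays retrievable by number.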
Extractor Config Data classes
- class cognite.client.data_classes.extractionpipelines.ExtractionPipeline(id: int | None = None, external_id: str | None = None, name: str | None = None, description: str | None = None, data_set_id: int | None = None, raw_tables: list[dict[str, str]] | None = None, last_success: int | None = None, last_failure: int | None = None, last_message: str | None = None, last_seen: int | None = None, schedule: str | None = None, contacts: list[ExtractionPipelineContact] | None = None, metadata: dict[str, str] | None = None, source: str | None = None, documentation: str | None = None, notification_config: ExtractionPipelineNotificationConfiguration | None = None, created_time: int | None = None, last_updated_time: int | None = None, created_by: str | None = None, cognite_client: CogniteClient | None = None)
Bases:
ExtractionPipelineCore
An extraction pipeline is a representation of a process writing data to CDF, such as an extractor or an ETL tool. This is the reading version of the ExtractionPipeline class, which is used when retrieving extraction pipelines.
- Parameters
id (int | None) – A server-generated ID for the object.
external_id (str | None) – The external ID provided by the client. Must be unique for the resource type.
name (str | None) – The name of the extraction pipeline.
description (str | None) – The description of the extraction pipeline.
data_set_id (int | None) – The id of the dataset this extraction pipeline is related to.
raw_tables (list[dict[str, str]] | None) – List of raw tables in the format: [{"dbName": "value", "tableName": "value"}].
last_success (int | None) – Milliseconds value of last success status.
last_failure (int | None) – Milliseconds value of last failure status.
last_message (str | None) – Message of last failure.
last_seen (int | None) – Milliseconds value of last seen status.
schedule (str | None) – One of None/On trigger/Continuous/cron regex.
contacts (list[ExtractionPipelineContact] | None) – list of contacts
metadata (dict[str, str] | None) – Custom, application specific metadata. String key -> String value. Limits: Maximum length of key is 128 bytes, value 10240 bytes, up to 256 key-value pairs, of total size at most 10240.
source (str | None) – Source text value for extraction pipeline.
documentation (str | None) – Documentation text value for extraction pipeline.
notification_config (ExtractionPipelineNotificationConfiguration | None) – Notification configuration for the extraction pipeline.
created_time (int | None) – The number of milliseconds since 00:00:00 Thursday, 1 January 1970, Coordinated Universal Time (UTC), minus leap seconds.
last_updated_time (int | None) – The number of milliseconds since 00:00:00 Thursday, 1 January 1970, Coordinated Universal Time (UTC), minus leap seconds.
created_by (str | None) – Extraction pipeline creator, usually an email.
cognite_client (CogniteClient | None) – The client to associate with this object.
- as_write() ExtractionPipelineWrite
Returns this ExtractionPipeline as an ExtractionPipelineWrite
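The metadata limits listed for ExtractionPipeline can be checked locally before a pipeline is created. The sketch below is a minimal validator under stated assumptions: sizes are measured in UTF-8 bytes, and the total size is the sum of all key and value sizes. How the API actually measures the total is not specified here, and `metadata_problems` is a hypothetical helper, not an SDK function.

```python
def metadata_problems(metadata, max_key=128, max_value=10240, max_pairs=256, max_total=10240):
    """Return a list of human-readable limit violations (empty when none are found)."""
    problems = []
    if len(metadata) > max_pairs:
        problems.append(f"{len(metadata)} key-value pairs exceeds the limit of {max_pairs}")
    total = 0
    for key, value in metadata.items():
        key_size = len(key.encode("utf-8"))
        value_size = len(value.encode("utf-8"))
        total += key_size + value_size  # assumption: total counts keys and values
        if key_size > max_key:
            problems.append(f"key of {key_size} bytes exceeds the limit of {max_key}")
        if value_size > max_value:
            problems.append(f"value of {value_size} bytes exceeds the limit of {max_value}")
    if total > max_total:
        problems.append(f"total size of {total} bytes exceeds the limit of {max_total}")
    return problems
```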
- class cognite.client.data_classes.extractionpipelines.ExtractionPipelineConfig(external_id: str | None = None, config: str | None = None, revision: int | None = None, description: str | None = None, created_time: int | None = None, cognite_client: CogniteClient | None = None)
Bases:
ExtractionPipelineConfigCore
An extraction pipeline config
- Parameters
external_id (str | None) – The external ID of the associated extraction pipeline.
config (str | None) – Contents of this configuration revision.
revision (int | None) – The revision number of this config as a positive integer.
description (str | None) – Short description of this configuration revision.
created_time (int | None) – The number of milliseconds since 00:00:00 Thursday, 1 January 1970, Coordinated Universal Time (UTC), minus leap seconds.
cognite_client (CogniteClient | None) – The client to associate with this object.
- as_write() ExtractionPipelineConfigWrite
Returns this ExtractionPipelineConfig as an ExtractionPipelineConfigWrite.
- class cognite.client.data_classes.extractionpipelines.ExtractionPipelineConfigCore(external_id: Optional[str] = None, config: Optional[str] = None, description: Optional[str] = None)
Bases: WriteableCogniteResource[ExtractionPipelineConfigWrite], ABC
An extraction pipeline config
- Parameters
external_id (str | None) – The external ID of the associated extraction pipeline.
config (str | None) – Contents of this configuration revision.
description (str | None) – Short description of this configuration revision.
- class cognite.client.data_classes.extractionpipelines.ExtractionPipelineConfigList(resources: Iterable[Any], cognite_client: CogniteClient | None = None)
Bases: WriteableCogniteResourceList[ExtractionPipelineConfigWrite, ExtractionPipelineConfig], ExternalIDTransformerMixin
- class cognite.client.data_classes.extractionpipelines.ExtractionPipelineConfigRevision(external_id: str | None = None, revision: int | None = None, description: str | None = None, created_time: int | None = None, cognite_client: CogniteClient | None = None)
Bases: CogniteResource
An extraction pipeline config revision
- Parameters
external_id (str | None) – The external ID of the associated extraction pipeline.
revision (int | None) – The revision number of this config as a positive integer.
description (str | None) – Short description of this configuration revision.
created_time (int | None) – The number of milliseconds since 00:00:00 Thursday, 1 January 1970, Coordinated Universal Time (UTC), minus leap seconds.
cognite_client (CogniteClient | None) – The client to associate with this object.
- class cognite.client.data_classes.extractionpipelines.ExtractionPipelineConfigRevisionList(resources: Iterable[Any], cognite_client: CogniteClient | None = None)
Bases: CogniteResourceList[ExtractionPipelineConfigRevision], ExternalIDTransformerMixin
- class cognite.client.data_classes.extractionpipelines.ExtractionPipelineConfigWrite(external_id: str, config: Optional[str] = None, description: Optional[str] = None)
Bases: ExtractionPipelineConfigCore
An extraction pipeline config
- Parameters
external_id (str) – The external ID of the associated extraction pipeline.
config (str | None) – Contents of this configuration revision.
description (str | None) – Short description of this configuration revision.
- as_write() ExtractionPipelineConfigWrite
Returns this ExtractionPipelineConfigWrite instance.
- class cognite.client.data_classes.extractionpipelines.ExtractionPipelineConfigWriteList(resources: Iterable[Any], cognite_client: CogniteClient | None = None)
Bases: CogniteResourceList[ExtractionPipelineConfigWrite], ExternalIDTransformerMixin
- class cognite.client.data_classes.extractionpipelines.ExtractionPipelineContact(name: Optional[str] = None, email: Optional[str] = None, role: Optional[str] = None, send_notification: Optional[bool] = None)
Bases: CogniteObject
A contact for an extraction pipeline
- Parameters
name (str | None) – Name of contact
email (str | None) – Email address of contact
role (str | None) – Role of contact, such as Owner, Maintainer, etc.
send_notification (bool | None) – Whether to send notifications to this contact or not
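To illustrate how the send_notification flag might be used, here is a minimal sketch that picks out the email addresses of contacts who opted in. It works on plain dicts whose keys mirror the constructor parameters above rather than on SDK objects, and the helper name notification_recipients is hypothetical.

```python
from typing import Any


def notification_recipients(contacts: list[dict[str, Any]]) -> list[str]:
    """Return emails of contacts that have send_notification set and an email."""
    return [
        contact["email"]
        for contact in contacts
        if contact.get("send_notification") and contact.get("email")
    ]


# Made-up example contacts.
contacts = [
    {"name": "Ada", "email": "ada@example.com", "role": "Owner", "send_notification": True},
    {"name": "Bob", "email": "bob@example.com", "role": "Maintainer", "send_notification": False},
    {"name": "Cy", "email": None, "role": "Maintainer", "send_notification": True},
]
print(notification_recipients(contacts))
```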
- class cognite.client.data_classes.extractionpipelines.ExtractionPipelineCore(external_id: Optional[str] = None, name: Optional[str] = None, description: Optional[str] = None, data_set_id: Optional[int] = None, raw_tables: Optional[list[dict[str, str]]] = None, schedule: Optional[str] = None, contacts: Optional[list[cognite.client.data_classes.extractionpipelines.ExtractionPipelineContact]] = None, metadata: Optional[dict[str, str]] = None, source: Optional[str] = None, documentation: Optional[str] = None, notification_config: Optional[ExtractionPipelineNotificationConfiguration] = None, created_by: Optional[str] = None)
Bases: WriteableCogniteResource[ExtractionPipelineWrite], ABC
An extraction pipeline is a representation of a process writing data to CDF, such as an extractor or an ETL tool.
- Parameters
external_id (str | None) – The external ID provided by the client. Must be unique for the resource type.
name (str | None) – The name of the extraction pipeline.
description (str | None) – The description of the extraction pipeline.
data_set_id (int | None) – The ID of the data set this extraction pipeline is related to.
raw_tables (list[dict[str, str]] | None) – List of raw tables in the format [{"dbName": "value", "tableName": "value"}].
schedule (str | None) – One of None, "On trigger", "Continuous", or a cron expression.
contacts (list[ExtractionPipelineContact] | None) – List of contacts.
metadata (dict[str, str] | None) – Custom, application-specific metadata as string key to string value pairs. Limits: each key is at most 128 bytes, each value at most 10240 bytes, with up to 256 key-value pairs and a total size of at most 10240 bytes.
source (str | None) – Source text value for extraction pipeline.
documentation (str | None) – Documentation text value for extraction pipeline.
notification_config (ExtractionPipelineNotificationConfiguration | None) – Notification configuration for the extraction pipeline.
created_by (str | None) – Extraction pipeline creator, usually an email.
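The metadata limits listed above can be checked locally before a pipeline is created. The following is a minimal sketch, not the validation the service performs. It assumes that sizes are measured on the UTF-8 encoding and that the total size is the sum of all key and value sizes; neither assumption is confirmed by this reference. The name check_metadata is hypothetical.

```python
MAX_KEY_BYTES = 128
MAX_VALUE_BYTES = 10240
MAX_PAIRS = 256
MAX_TOTAL_BYTES = 10240


def check_metadata(metadata: dict[str, str]) -> list[str]:
    """Return a list of limit violations; an empty list means none were found."""
    problems = []
    if len(metadata) > MAX_PAIRS:
        problems.append(f"{len(metadata)} pairs exceeds the limit of {MAX_PAIRS}")
    total = 0
    for key, value in metadata.items():
        # Assumption: byte sizes are measured on the UTF-8 encoding.
        key_size = len(key.encode("utf-8"))
        value_size = len(value.encode("utf-8"))
        total += key_size + value_size
        if key_size > MAX_KEY_BYTES:
            problems.append(f"key {key[:20]!r} is {key_size} bytes")
        if value_size > MAX_VALUE_BYTES:
            problems.append(f"value for key {key[:20]!r} is {value_size} bytes")
    # Assumption: total size is the sum of all key and value sizes.
    if total > MAX_TOTAL_BYTES:
        problems.append(f"total size {total} exceeds {MAX_TOTAL_BYTES}")
    return problems


print(check_metadata({"site": "Oslo", "owner": "team-a"}))
```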
- dump(camel_case: bool = True) dict[str, Any]
Dump the instance into a json serializable Python data type.
- Parameters
camel_case (bool) – Use camelCase for attribute names. Defaults to True.
- Returns
A dictionary representation of the instance.
- Return type
dict[str, Any]
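The camel_case flag of dump controls whether attribute names such as data_set_id are emitted as dataSetId. The sketch below illustrates that naming conversion on a plain dict. It is not the SDK's implementation, and the names to_camel_case and dump_sketch are hypothetical.

```python
import json
from typing import Any


def to_camel_case(name: str) -> str:
    """Convert a snake_case name to camelCase, e.g. data_set_id -> dataSetId."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


def dump_sketch(attributes: dict[str, Any], camel_case: bool = True) -> dict[str, Any]:
    """Return a copy of the attributes, optionally with camelCase keys."""
    if not camel_case:
        return dict(attributes)
    return {to_camel_case(key): value for key, value in attributes.items()}


# Made-up attribute values for illustration.
attributes = {"external_id": "my-pipeline", "name": "My pipeline", "data_set_id": 123}
print(json.dumps(dump_sketch(attributes)))
```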
- class cognite.client.data_classes.extractionpipelines.ExtractionPipelineList(resources: Iterable[Any], cognite_client: CogniteClient | None = None)
Bases: WriteableCogniteResourceList[ExtractionPipelineWrite, ExtractionPipeline], IdTransformerMixin
- class cognite.client.data_classes.extractionpipelines.ExtractionPipelineNotificationConfiguration(allowed_not_seen_range_in_minutes: Optional[int] = None)
Bases: CogniteObject
Extraction pipeline notification configuration
- Parameters
allowed_not_seen_range_in_minutes (int | None) – Maximum time in minutes that may pass without any run. None if the extraction pipeline is not checked.
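One way to read allowed_not_seen_range_in_minutes is as a threshold on the time since the most recent run. The sketch below expresses that reading locally. How the service actually evaluates the setting is not described here, so the comparison is an assumption, and the name is_overdue is hypothetical.

```python
from typing import Optional


def is_overdue(
    last_run_ms: int,
    allowed_not_seen_range_in_minutes: Optional[int],
    now_ms: int,
) -> bool:
    """Return True if more than the allowed number of minutes has passed."""
    if allowed_not_seen_range_in_minutes is None:
        # No threshold configured: the pipeline is not checked.
        return False
    allowed_ms = allowed_not_seen_range_in_minutes * 60 * 1000
    # Assumption: the threshold is exceeded only when strictly more time has passed.
    return now_ms - last_run_ms > allowed_ms


# Made-up timestamps in milliseconds since the epoch.
print(is_overdue(last_run_ms=0, allowed_not_seen_range_in_minutes=10, now_ms=900_000))
```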
- class cognite.client.data_classes.extractionpipelines.ExtractionPipelineRun(extpipe_external_id: str | None = None, status: str | None = None, message: str | None = None, created_time: int | None = None, cognite_client: CogniteClient | None = None, id: int | None = None)
Bases: ExtractionPipelineRunCore
A representation of an extraction pipeline run.
- Parameters
extpipe_external_id (str | None) – The external ID of the extraction pipeline.
status (str | None) – Status of the run: success, failure, or seen.
message (str | None) – Optional status message.
created_time (int | None) – The number of milliseconds since 00:00:00 Thursday, 1 January 1970, Coordinated Universal Time (UTC), minus leap seconds.
cognite_client (CogniteClient | None) – The client to associate with this object.
id (int | None) – A server-generated ID for the object.
- as_write() ExtractionPipelineRunWrite
Returns this ExtractionPipelineRun as an ExtractionPipelineRunWrite.
- dump(camel_case: bool = True) dict[str, Any]
Dump the instance into a json serializable Python data type.
- Parameters
camel_case (bool) – Use camelCase for attribute names. Defaults to True.
- Returns
A dictionary representation of the instance.
- Return type
dict[str, Any]
- class cognite.client.data_classes.extractionpipelines.ExtractionPipelineRunCore(status: Optional[str] = None, message: Optional[str] = None, created_time: Optional[int] = None)
Bases: WriteableCogniteResource[ExtractionPipelineRunWrite], ABC
A representation of an extraction pipeline run.
- Parameters
status (str | None) – Status of the run: success, failure, or seen.
message (str | None) – Optional status message.
created_time (int | None) – The number of milliseconds since 00:00:00 Thursday, 1 January 1970, Coordinated Universal Time (UTC), minus leap seconds.
- class cognite.client.data_classes.extractionpipelines.ExtractionPipelineRunFilter(external_id: Optional[str] = None, statuses: Optional[SequenceNotStr[str]] = None, message: Optional[StringFilter] = None, created_time: Optional[Union[dict[str, Any], TimestampRange]] = None)
Bases: CogniteFilter
Filter runs with exact matching
- Parameters
external_id (str | None) – The external ID of related ExtractionPipeline provided by the client. Must be unique for the resource type.
statuses (SequenceNotStr[str] | None) – Statuses to match: success, failure, and/or seen.
message (StringFilter | None) – Filter on the run message.
created_time (dict[str, Any] | TimestampRange | None) – Range between two timestamps.
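To make the filter semantics concrete, here is a minimal sketch that applies the same three criteria (statuses, message substring, created_time range) to plain dicts locally. It is not how the service evaluates the filter. It assumes the created_time dict uses the keys "min" and "max" with inclusive bounds and that substring matching is case-sensitive. The name run_matches is hypothetical.

```python
from typing import Any, Optional, Sequence


def run_matches(
    run: dict[str, Any],
    statuses: Optional[Sequence[str]] = None,
    substring: Optional[str] = None,
    created_time: Optional[dict[str, int]] = None,
) -> bool:
    """Return True if the run satisfies every criterion that is not None."""
    if statuses is not None and run["status"] not in statuses:
        return False
    if substring is not None and substring not in (run.get("message") or ""):
        return False
    if created_time is not None:
        # Assumption: "min" and "max" are inclusive millisecond bounds.
        lower, upper = created_time.get("min"), created_time.get("max")
        if lower is not None and run["created_time"] < lower:
            return False
        if upper is not None and run["created_time"] > upper:
            return False
    return True


# Made-up runs for illustration.
runs = [
    {"status": "success", "message": None, "created_time": 1000},
    {"status": "failure", "message": "connection timeout", "created_time": 2000},
    {"status": "failure", "message": "bad credentials", "created_time": 3000},
]
hits = [r for r in runs if run_matches(r, statuses=["failure"], substring="timeout")]
print(hits)
```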
- class cognite.client.data_classes.extractionpipelines.ExtractionPipelineRunList(resources: Iterable[Any], cognite_client: CogniteClient | None = None)
Bases: WriteableCogniteResourceList[ExtractionPipelineRunWrite, ExtractionPipelineRun], IdTransformerMixin
- class cognite.client.data_classes.extractionpipelines.ExtractionPipelineRunWrite(extpipe_external_id: str, status: Literal['success', 'failure', 'seen'], message: Optional[str] = None, created_time: Optional[int] = None)
Bases: ExtractionPipelineRunCore
A representation of an extraction pipeline run. This is the writing version of the ExtractionPipelineRun class, which is used when creating extraction pipeline runs.
- Parameters
extpipe_external_id (str) – The external ID of the extraction pipeline.
status (Literal['success', 'failure', 'seen']) – Status of the run: success, failure, or seen.
message (str | None) – Optional status message.
created_time (int | None) – The number of milliseconds since 00:00:00 Thursday, 1 January 1970, Coordinated Universal Time (UTC), minus leap seconds.
- as_write() ExtractionPipelineRunWrite
Returns this ExtractionPipelineRunWrite instance.
- dump(camel_case: bool = True) dict[str, Any]
Dump the instance into a json serializable Python data type.
- Parameters
camel_case (bool) – Use camelCase for attribute names. Defaults to True.
- Returns
A dictionary representation of the instance.
- Return type
dict[str, Any]
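Because status only accepts success, failure, or seen, it can help to validate the value before constructing a run. The sketch below builds a dict of keyword arguments that mirrors the ExtractionPipelineRunWrite parameters documented above. The helper name build_run_kwargs is hypothetical, and the sketch does not call the SDK.

```python
from typing import Any, Optional

ALLOWED_STATUSES = ("success", "failure", "seen")


def build_run_kwargs(
    extpipe_external_id: str,
    status: str,
    message: Optional[str] = None,
    created_time: Optional[int] = None,
) -> dict[str, Any]:
    """Validate the status and collect the arguments that were actually given."""
    if status not in ALLOWED_STATUSES:
        raise ValueError(f"status must be one of {ALLOWED_STATUSES}, got {status!r}")
    kwargs: dict[str, Any] = {
        "extpipe_external_id": extpipe_external_id,
        "status": status,
    }
    # Optional fields are left out entirely when not provided.
    if message is not None:
        kwargs["message"] = message
    if created_time is not None:
        kwargs["created_time"] = created_time
    return kwargs


print(build_run_kwargs("my-pipeline", "failure", message="connection timeout"))
```

With the SDK installed, the resulting dict could be unpacked into the constructor as ExtractionPipelineRunWrite(**kwargs), since its keys match the parameter names above.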
- class cognite.client.data_classes.extractionpipelines.ExtractionPipelineRunWriteList(resources: Iterable[Any], cognite_client: CogniteClient | None = None)
- class cognite.client.data_classes.extractionpipelines.ExtractionPipelineUpdate(id: Optional[int] = None, external_id: Optional[str] = None)
Bases: CogniteUpdate
Changes applied to an extraction pipeline
- Parameters
id (int) – A server-generated ID for the object.
external_id (str) – The external ID provided by the client. Must be unique for the resource type.
- class cognite.client.data_classes.extractionpipelines.ExtractionPipelineWrite(external_id: str, name: str, data_set_id: int, description: Optional[str] = None, raw_tables: Optional[list[dict[str, str]]] = None, schedule: Optional[str] = None, contacts: Optional[list[cognite.client.data_classes.extractionpipelines.ExtractionPipelineContact]] = None, metadata: Optional[dict[str, str]] = None, source: Optional[str] = None, documentation: Optional[str] = None, notification_config: Optional[ExtractionPipelineNotificationConfiguration] = None, created_by: Optional[str] = None)
Bases: ExtractionPipelineCore
An extraction pipeline is a representation of a process writing data to CDF, such as an extractor or an ETL tool. This is the writing version of the ExtractionPipeline class, which is used when creating extraction pipelines.
- Parameters
external_id (str) – The external ID provided by the client. Must be unique for the resource type.
name (str) – The name of the extraction pipeline.
data_set_id (int) – The ID of the data set this extraction pipeline is related to.
description (str | None) – The description of the extraction pipeline.
raw_tables (list[dict[str, str]] | None) – List of raw tables in the format [{"dbName": "value", "tableName": "value"}].
schedule (str | None) – One of None, "On trigger", "Continuous", or a cron expression.
contacts (list[ExtractionPipelineContact] | None) – List of contacts.
metadata (dict[str, str] | None) – Custom, application-specific metadata as string key to string value pairs. Limits: each key is at most 128 bytes, each value at most 10240 bytes, with up to 256 key-value pairs and a total size of at most 10240 bytes.
source (str | None) – Source text value for extraction pipeline.
documentation (str | None) – Documentation text value for extraction pipeline.
notification_config (ExtractionPipelineNotificationConfiguration | None) – Notification configuration for the extraction pipeline.
created_by (str | None) – Extraction pipeline creator, usually an email.
- as_write() ExtractionPipelineWrite
Returns this ExtractionPipelineWrite instance.
- class cognite.client.data_classes.extractionpipelines.ExtractionPipelineWriteList(resources: Iterable[Any], cognite_client: CogniteClient | None = None)
Bases: CogniteResourceList[ExtractionPipelineWrite], ExternalIDTransformerMixin
- class cognite.client.data_classes.extractionpipelines.StringFilter(substring: Optional[str] = None)
Bases: CogniteFilter
Filter runs on substrings of the message
- Parameters
substring (str | None) – Substring to match in the run message.