Data Ingestion
Raw
Databases
List databases
- async RawDatabasesAPI.list(
- limit: int | None = 25,
-
- Parameters:
limit (int | None) – Maximum number of databases to return. Defaults to 25. Set to -1, float("inf") or None to return all items.
- Returns:
List of requested databases.
- Return type:
Examples
List the first 5 databases:
>>> from cognite.client import CogniteClient, AsyncCogniteClient
>>> client = CogniteClient()
>>> # async_client = AsyncCogniteClient()  # another option
>>> db_list = client.raw.databases.list(limit=5)
Iterate over databases, one-by-one:
>>> for db in client.raw.databases():
...     db  # do something with the db
Iterate over chunks of databases to reduce memory load:
>>> for db_list in client.raw.databases(chunk_size=2500):
...     db_list  # do something with the dbs
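As a rough illustration of the limit convention described above, the following standalone sketch maps the three "return all items" sentinels to a single value. `normalize_limit` is a hypothetical helper written for this example; it is not part of the SDK and does not show how the SDK handles `limit` internally.

```python
from __future__ import annotations

import math


def normalize_limit(limit: int | float | None) -> int | None:
    """Return None for 'no limit' (-1, float("inf") or None), else the finite limit."""
    if limit is None or limit == -1 or limit == math.inf:
        return None
    # Anything else must be a plain non-negative integer (hypothetical validation rule).
    if isinstance(limit, bool) or not isinstance(limit, int) or limit < 0:
        raise ValueError(f"Invalid limit: {limit!r}")
    return limit


print(normalize_limit(5))             # a finite limit is passed through
print(normalize_limit(float("inf")))  # treated as 'return all items'
```

Collapsing the sentinels early means later code only has to distinguish "finite" from "unlimited".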
Create new databases
- async RawDatabasesAPI.create(
- name: str | list[str],
-
- Parameters:
name (str | list[str]) – A db name or list of db names to create.
- Returns:
Database or list of databases that have been created.
- Return type:
Examples
Create a new database:
>>> from cognite.client import CogniteClient, AsyncCogniteClient
>>> client = CogniteClient()
>>> # async_client = AsyncCogniteClient()  # another option
>>> res = client.raw.databases.create("db1")
Delete databases
- async RawDatabasesAPI.delete(
- name: str | SequenceNotStr[str],
- recursive: bool = False,
-
- Parameters:
name (str | SequenceNotStr[str]) – A db name or list of db names to delete.
recursive (bool) – Recursively delete all tables in the database(s).
Examples
Delete a list of databases:
>>> from cognite.client import CogniteClient, AsyncCogniteClient
>>> client = CogniteClient()
>>> # async_client = AsyncCogniteClient()  # another option
>>> client.raw.databases.delete(["db1", "db2"])
Tables
List tables in a database
- async RawTablesAPI.list(
- db_name: str,
- limit: int | None = 25,
-
- Parameters:
db_name (str) – The database to list tables from.
limit (int | None) – Maximum number of tables to return. Defaults to 25. Set to -1, float("inf") or None to return all items.
- Returns:
List of requested tables.
- Return type:
Examples
List the first 5 tables:
>>> from cognite.client import CogniteClient, AsyncCogniteClient
>>> client = CogniteClient()
>>> # async_client = AsyncCogniteClient()  # another option
>>> table_list = client.raw.tables.list("db1", limit=5)
Iterate over tables, one-by-one:
>>> for table in client.raw.tables(db_name="db1"):
...     table  # do something with the table
Iterate over chunks of tables to reduce memory load:
>>> for table_list in client.raw.tables(db_name="db1", chunk_size=25):
...     table_list  # do something with the tables
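The idea behind `chunk_size` (processing a bounded number of items at a time instead of holding everything in memory) can be sketched without the SDK. The generator below is a generic illustration over a local list of table names; `iter_chunks` is a hypothetical name and the code makes no claim about how the SDK pages through results.

```python
from __future__ import annotations

from collections.abc import Iterable, Iterator
from itertools import islice


def iter_chunks(items: Iterable[str], chunk_size: int) -> Iterator[list[str]]:
    """Yield successive lists holding at most chunk_size items each."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be a positive integer")
    iterator = iter(items)
    # islice pulls the next chunk lazily; an empty list signals exhaustion.
    while chunk := list(islice(iterator, chunk_size)):
        yield chunk


table_names = [f"table{i}" for i in range(7)]
chunks = list(iter_chunks(table_names, chunk_size=3))
for chunk in chunks:
    print(chunk)  # do something with the chunk
```

Only one chunk is materialized at a time inside the generator, which is what keeps memory use bounded.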
Create new tables in a database
- async RawTablesAPI.create(
- db_name: str,
- name: str | list[str],
-
- Parameters:
db_name (str) – Database to create the tables in.
name (str | list[str]) – A table name or list of table names to create.
- Returns:
raw.Table or list of tables that have been created.
- Return type:
Examples
Create a new table in a database:
>>> from cognite.client import CogniteClient, AsyncCogniteClient
>>> client = CogniteClient()
>>> # async_client = AsyncCogniteClient()  # another option
>>> res = client.raw.tables.create("db1", "table1")
Delete tables from a database
- async RawTablesAPI.delete(
- db_name: str,
- name: str | SequenceNotStr[str],
-
- Parameters:
db_name (str) – Database to delete tables from.
name (str | SequenceNotStr[str]) – A table name or list of table names to delete.
Examples
Delete a list of tables:
>>> from cognite.client import CogniteClient, AsyncCogniteClient
>>> client = CogniteClient()
>>> # async_client = AsyncCogniteClient()  # another option
>>> res = client.raw.tables.delete("db1", ["table1", "table2"])
Rows
Get a row from a table
- async RawRowsAPI.retrieve(
- db_name: str,
- table_name: str,
- key: str,
-
- Parameters:
db_name (str) – Name of the database.
table_name (str) – Name of the table.
key (str) – The key of the row to retrieve.
- Returns:
The requested row.
- Return type:
Row | None
Examples
Retrieve a row with key ‘k1’ from table ‘t1’ in database ‘db1’:
>>> from cognite.client import CogniteClient, AsyncCogniteClient
>>> client = CogniteClient()
>>> # async_client = AsyncCogniteClient()  # another option
>>> row = client.raw.rows.retrieve("db1", "t1", "k1")
You may access the data directly on the row (like a dict), or use ‘.get’ when keys can be missing:
>>> val1 = row["col1"]
>>> val2 = row.get("col2")
List rows in a table
- async RawRowsAPI.list(
- db_name: str,
- table_name: str,
- min_last_updated_time: int | None = None,
- max_last_updated_time: int | None = None,
- columns: list[str] | None = None,
- limit: int | None = 25,
- partitions: int | None = None,
-
- Parameters:
db_name (str) – Name of the database.
table_name (str) – Name of the table.
min_last_updated_time (int | None) – Rows must have been last updated after this time (exclusive). Milliseconds since epoch.
max_last_updated_time (int | None) – Rows must have been last updated before this time (inclusive). Milliseconds since epoch.
columns (list[str] | None) – List of column keys. Set to None to retrieve all columns; use an empty list, [], to retrieve only row keys.
limit (int | None) – The number of rows to retrieve. Can be used with partitions. Defaults to 25. Set to -1, float("inf") or None to return all items.
partitions (int | None) – Retrieve rows in parallel using this number of workers. Can be used together with a (large) finite limit. When partitions is not passed, it defaults to 1 (i.e. no concurrency) for a finite limit, and to global_config.concurrency_settings.raw.read for an unlimited query (it will be capped at this value). To prevent unexpected problems and maximize read throughput, check out the concurrency limits in the API documentation.
- Returns:
The requested rows.
- Return type:
Examples
List a few rows:
>>> from cognite.client import CogniteClient, AsyncCogniteClient
>>> client = CogniteClient()
>>> # async_client = AsyncCogniteClient()  # another option
>>> row_list = client.raw.rows.list("db1", "tbl1", limit=5)
Read an entire table efficiently by using concurrency (default behavior when limit=None):
>>> row_list = client.raw.rows.list("db1", "tbl1", limit=None)
Iterate through all rows one-by-one to reduce memory load (no concurrency used):
>>> for row in client.raw.rows("db1", "t1", columns=["col1","col2"]):
...     val1 = row["col1"]  # You may access the data directly
...     val2 = row.get("col2")  # ...or use '.get' when keys can be missing
Iterate through all rows, one chunk at a time, to reduce memory load (no concurrency used):
>>> for row_list in client.raw.rows("db1", "t1", chunk_size=2500):
...     row_list  # Do something with the rows
Iterate through a massive table to reduce memory load while using concurrency for high throughput. Note: partitions must be specified for concurrency to be used (this is different from list() to keep backward compatibility). Supplying a finite limit does not affect concurrency settings (except for very small values).
>>> rows_iterator = client.raw.rows(
...     db_name="db1", table_name="t1", partitions=5, chunk_size=5000, limit=1_000_000
... )
>>> for row_list in rows_iterator:
...     row_list  # Do something with the rows
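The two time filters take milliseconds since the epoch, with the lower bound exclusive and the upper bound inclusive. The standalone sketch below shows one way to produce such timestamps from timezone-aware datetimes and spells out the boundary behaviour on plain integers. `to_ms` and `in_update_window` are hypothetical helpers for illustration only; the actual filtering is done by the API.

```python
from __future__ import annotations

from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_ms(dt: datetime) -> int:
    """Convert a timezone-aware datetime to whole milliseconds since the epoch."""
    if dt.tzinfo is None:
        raise ValueError("Expected a timezone-aware datetime")
    return (dt - EPOCH) // timedelta(milliseconds=1)


def in_update_window(
    last_updated_time: int,
    min_last_updated_time: int | None = None,
    max_last_updated_time: int | None = None,
) -> bool:
    """Mirror the documented bounds: min is exclusive, max is inclusive."""
    if min_last_updated_time is not None and last_updated_time <= min_last_updated_time:
        return False
    if max_last_updated_time is not None and last_updated_time > max_last_updated_time:
        return False
    return True


window_start = to_ms(datetime(2024, 1, 1, tzinfo=timezone.utc))
window_end = to_ms(datetime(2024, 1, 2, tzinfo=timezone.utc))
print(in_update_window(window_start, window_start, window_end))  # on the lower bound
print(in_update_window(window_end, window_start, window_end))    # on the upper bound
```

Because the lower bound is exclusive, passing the largest `last_updated_time` seen so far as the next `min_last_updated_time` should not return the same row twice.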
Insert rows into a table
- async RawRowsAPI.insert(
- db_name: str,
- table_name: str,
- row: Sequence[Row] | Sequence[RowWrite] | Row | RowWrite | dict,
- ensure_parent: bool = False,
Insert one or more rows into a table.
- Parameters:
Examples
Insert new rows into a table:
>>> from cognite.client import CogniteClient
>>> from cognite.client.data_classes import RowWrite
>>> client = CogniteClient()
>>> # async_client = AsyncCogniteClient()  # another option
>>> rows = [RowWrite(key="r1", columns={"col1": "val1", "col2": "val1"}),
...         RowWrite(key="r2", columns={"col1": "val2", "col2": "val2"})]
>>> client.raw.rows.insert("db1", "table1", rows)
You may also insert a dictionary directly:
>>> rows = {
...     "key-1": {"col1": 1, "col2": 2},
...     "key-2": {"col1": 3, "col2": 4, "col3": "high five"},
... }
>>> client.raw.rows.insert("db1", "table1", rows)
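The dictionary form maps each row key to its columns, so it carries the same information as a list of key/columns objects. The sketch below makes that correspondence explicit using a small stand-in dataclass. `SimpleRow` and `rows_from_dict` are hypothetical names for this example; they are not SDK classes, and the SDK's own conversion may differ.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass
class SimpleRow:
    """Stand-in for a write-side row: a unique key plus a JSON-like columns mapping."""

    key: str
    columns: dict[str, Any]


def rows_from_dict(rows: dict[str, dict[str, Any]]) -> list[SimpleRow]:
    """Turn {key: columns} into a list of row objects, preserving insertion order."""
    return [SimpleRow(key=key, columns=dict(columns)) for key, columns in rows.items()]


as_dict = {
    "key-1": {"col1": 1, "col2": 2},
    "key-2": {"col1": 3, "col2": 4, "col3": "high five"},
}
for row in rows_from_dict(as_dict):
    print(row.key, sorted(row.columns))
```

Rows in the same table need not share the same set of columns, as the second row shows.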
Delete rows from a table
- async RawRowsAPI.delete(
- db_name: str,
- table_name: str,
- key: str | SequenceNotStr[str],
-
- Parameters:
db_name (str) – Name of the database.
table_name (str) – Name of the table.
key (str | SequenceNotStr[str]) – The key(s) of the row(s) to delete.
Examples
Delete rows from table:
>>> from cognite.client import CogniteClient, AsyncCogniteClient
>>> client = CogniteClient()
>>> # async_client = AsyncCogniteClient()  # another option
>>> keys_to_delete = ["k1", "k2", "k3"]
>>> client.raw.rows.delete("db1", "table1", keys_to_delete)
Retrieve pandas dataframe
- async RawRowsAPI.retrieve_dataframe(
- db_name: str,
- table_name: str,
- min_last_updated_time: int | None = None,
- max_last_updated_time: int | None = None,
- columns: list[str] | None = None,
- limit: int | None = 25,
- partitions: int | None = None,
- last_updated_time_in_index: bool = False,
- infer_dtypes: bool = True,
Retrieve rows in a table as a pandas dataframe.
Row keys are used as the index.
- Parameters:
db_name (str) – Name of the database.
table_name (str) – Name of the table.
min_last_updated_time (int | None) – Rows must have been last updated after this time. Milliseconds since epoch.
max_last_updated_time (int | None) – Rows must have been last updated before this time. Milliseconds since epoch.
columns (list[str] | None) – List of column keys. Set to None to retrieve all columns; use an empty list, [], to retrieve only row keys.
limit (int | None) – The number of rows to retrieve. Defaults to 25. Set to -1, float("inf") or None to return all items.
partitions (int | None) – Retrieve rows in parallel using this number of workers. Can be used together with a (large) finite limit. When partitions is not passed, it defaults to 1 (i.e. no concurrency) for a finite limit, and to global_config.concurrency_settings.raw.read for an unlimited query (it will be capped at this value). To prevent unexpected problems and maximize read throughput, check out the concurrency limits in the API documentation.
last_updated_time_in_index (bool) – Use a MultiIndex with row keys and last_updated_time as index.
infer_dtypes (bool) – If True, pandas will try to infer dtypes of the columns. Defaults to True.
- Returns:
The requested rows in a pandas dataframe.
- Return type:
pd.DataFrame
Examples
Get dataframe:
>>> from cognite.client import CogniteClient, AsyncCogniteClient
>>> client = CogniteClient()
>>> # async_client = AsyncCogniteClient()  # another option
>>> df = client.raw.rows.retrieve_dataframe("db1", "t1", limit=5)
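One possible reading of the `partitions` default described in the parameter list can be written down as a small pure function. This is an interpretation of the prose, not the SDK's code: `resolve_partitions` is a hypothetical name, `max_read_concurrency` stands in for `global_config.concurrency_settings.raw.read`, and the assumption that explicitly passed values are also capped is ours.

```python
from __future__ import annotations


def resolve_partitions(partitions: int | None, limit: int | float | None, max_read_concurrency: int) -> int:
    """Return the number of parallel workers implied by the documented defaults."""
    unlimited = limit is None or limit == -1 or limit == float("inf")
    if partitions is None:
        # Default: no concurrency for a finite limit, full read concurrency otherwise.
        partitions = max_read_concurrency if unlimited else 1
    # Assumption: the value never exceeds the configured read concurrency.
    return max(1, min(partitions, max_read_concurrency))


print(resolve_partitions(None, 25, max_read_concurrency=10))       # finite limit, no partitions given
print(resolve_partitions(None, None, max_read_concurrency=10))     # unlimited query
print(resolve_partitions(5, 1_000_000, max_read_concurrency=10))   # explicit partitions with a large limit
```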
Insert pandas dataframe
- async RawRowsAPI.insert_dataframe(
- db_name: str,
- table_name: str,
- dataframe: pd.DataFrame,
- ensure_parent: bool = False,
- dropna: bool = True,
Insert pandas dataframe into a table
Uses index for row keys.
- Parameters:
db_name (str) – Name of the database.
table_name (str) – Name of the table.
dataframe (pd.DataFrame) – The dataframe to insert. Index will be used as row keys.
ensure_parent (bool) – Create database/table if they don’t already exist.
dropna (bool) – Remove NaNs (but keep None’s in dtype=object columns) before inserting. Done individually per column. Default: True
Examples
Insert new rows into a table:
>>> import pandas as pd
>>> from cognite.client import CogniteClient
>>>
>>> client = CogniteClient()
>>> # async_client = AsyncCogniteClient()  # another option
>>> df = pd.DataFrame(
...     {"col-a": [1, 3, None], "col-b": [2, -1, 9]},
...     index=["r1", "r2", "r3"])
>>> res = client.raw.rows.insert_dataframe(
...     "db1", "table1", df, dropna=True)
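The effect of `dropna=True` (NaNs removed per column, None values kept) can be pictured on plain dictionaries, without pandas. The sketch below is an illustration of the documented behaviour only; `drop_nan_values` is a hypothetical helper and the data is made up to resemble the example above, where a missing value in a numeric column becomes NaN.

```python
from __future__ import annotations

import math
from typing import Any


def drop_nan_values(columns: dict[str, Any]) -> dict[str, Any]:
    """Drop entries whose value is a float NaN; None and all other values are kept."""
    return {
        name: value
        for name, value in columns.items()
        if not (isinstance(value, float) and math.isnan(value))
    }


rows = {
    "r1": {"col-a": 1.0, "col-b": 2},
    "r3": {"col-a": float("nan"), "col-b": 9, "col-c": None},
}
cleaned = {key: drop_nan_values(columns) for key, columns in rows.items()}
print(cleaned["r3"])  # 'col-a' is gone for this row only; 'col-c' keeps its None
```

Dropping per value rather than per row means a row with one missing cell is still inserted with its remaining columns.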
RAW Data classes
- class cognite.client.data_classes.raw.Database(name: str, created_time: int | None)
Bases:
WriteableCogniteResourceWithClientRef[DatabaseWrite]
A NoSQL database to store customer data.
- Parameters:
name (str) – Unique name of a database.
created_time (int | None) – Time the database was created.
- as_write() DatabaseWrite
Returns this Database as a DatabaseWrite
- class cognite.client.data_classes.raw.DatabaseList(
- resources: Sequence[T_CogniteResource],
Bases:
WriteableCogniteResourceList[DatabaseWrite, Database], NameTransformerMixin
- as_write() DatabaseWriteList
Returns this DatabaseList as a DatabaseWriteList
- class cognite.client.data_classes.raw.DatabaseWrite(name: str)
Bases:
WriteableCogniteResource[DatabaseWrite]
A NoSQL database to store customer data.
- Parameters:
name (str) – Unique name of a database.
- as_write() DatabaseWrite
Returns this DatabaseWrite instance.
- class cognite.client.data_classes.raw.DatabaseWriteList(
- resources: Sequence[T_CogniteResource],
Bases:
CogniteResourceList[DatabaseWrite],NameTransformerMixin
- class cognite.client.data_classes.raw.Row(key: str, columns: dict[str, Any], last_updated_time: int)
Bases:
RowCore
This represents a row in a NoSQL table. This is the read version of the Row class, which is used when retrieving a row.
- Parameters:
key (str) – Unique row key
columns (dict[str, Any]) – Row data stored as a JSON object.
last_updated_time (int) – The number of milliseconds since 00:00:00 Thursday, 1 January 1970, Coordinated Universal Time (UTC), minus leap seconds.
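Since `last_updated_time` is an integer count of milliseconds since the epoch, it usually needs converting before it is shown to a person. A minimal standard-library sketch follows; `ms_to_datetime` is a hypothetical helper name, not an SDK function.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def ms_to_datetime(ms: int) -> datetime:
    """Convert milliseconds since the epoch to a timezone-aware UTC datetime."""
    # Integer timedelta arithmetic avoids float rounding on large timestamps.
    return EPOCH + timedelta(milliseconds=ms)


print(ms_to_datetime(1704067200000).isoformat())
```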
- class cognite.client.data_classes.raw.RowCore(key: str, columns: dict[str, Any])
Bases:
WriteableCogniteResource[RowWrite], ABC
No description.
- Parameters:
key (str) – Unique row key
columns (dict[str, Any]) – Row data stored as a JSON object.
- to_pandas() pandas.DataFrame
Convert the instance into a pandas DataFrame.
- Returns:
The pandas DataFrame representing this instance.
- Return type:
pandas.DataFrame
- class cognite.client.data_classes.raw.RowList(
- resources: Sequence[T_CogniteResource],
Bases:
RowListCore[Row]
- as_write() RowWriteList
Returns this RowList as a RowWriteList
- class cognite.client.data_classes.raw.RowListCore(
- resources: Sequence[T_CogniteResource],
Bases:
WriteableCogniteResourceList[RowWrite, T_Row], ABC
- to_pandas() pandas.DataFrame
Convert the instance into a pandas DataFrame.
- Returns:
The pandas DataFrame representing this instance.
- Return type:
pandas.DataFrame
- class cognite.client.data_classes.raw.RowWrite(key: str, columns: dict[str, Any])
Bases:
RowCore
This represents a row in a NoSQL table. This is the write version of the Row class, which is used when creating a row.
- Parameters:
key (str) – Unique row key
columns (dict[str, Any]) – Row data stored as a JSON object.
- class cognite.client.data_classes.raw.RowWriteList(
- resources: Sequence[T_CogniteResource],
Bases:
RowListCore[RowWrite]
- class cognite.client.data_classes.raw.Table(name: str, created_time: int | None)
Bases:
WriteableCogniteResourceWithClientRef[TableWrite]
A NoSQL database table to store customer data. This is the read version of the Table class, which is used when retrieving a table.
- Parameters:
name (str) – Unique name of the table
created_time (int | None) – Time the table was created.
- as_write() TableWrite
Returns this Table as a TableWrite
- rows(
- key: str | None = None,
- limit: int | None = None,
Get the rows in this table.
- class cognite.client.data_classes.raw.TableList(
- resources: Sequence[T_CogniteResource],
Bases:
WriteableCogniteResourceList[TableWrite, Table], NameTransformerMixin
- as_write() TableWriteList
Returns this TableList as a TableWriteList
- class cognite.client.data_classes.raw.TableWrite(name: str)
Bases:
WriteableCogniteResource[TableWrite]
A NoSQL database table to store customer data. This is the write version of the Table class, which is used when creating a table.
- Parameters:
name (str) – Unique name of the table
- as_write() TableWrite
Returns this TableWrite instance.
- class cognite.client.data_classes.raw.TableWriteList(
- resources: Sequence[T_CogniteResource],
Bases:
CogniteResourceList[TableWrite],NameTransformerMixin
Extraction pipelines
List extraction pipelines
- async ExtractionPipelinesAPI.list(
- limit: int | None = 25,
-
- Parameters:
limit (int | None) – Maximum number of ExtractionPipelines to return. Defaults to 25. Set to -1, float("inf") or None to return all items.
- Returns:
List of requested ExtractionPipelines
- Return type:
Examples
List ExtractionPipelines:
>>> from cognite.client import CogniteClient, AsyncCogniteClient
>>> client = CogniteClient()
>>> # async_client = AsyncCogniteClient()  # another option
>>> ep_list = client.extraction_pipelines.list(limit=5)
Create extraction pipeline
- async ExtractionPipelinesAPI.create(
- extraction_pipeline: ExtractionPipeline | ExtractionPipelineWrite | Sequence[ExtractionPipeline] | Sequence[ExtractionPipelineWrite],
Create one or more extraction pipelines.
You can create an arbitrary number of extraction pipelines, and the SDK will split the request into multiple requests if necessary.
- Parameters:
extraction_pipeline (ExtractionPipeline | ExtractionPipelineWrite | Sequence[ExtractionPipeline] | Sequence[ExtractionPipelineWrite]) – Extraction pipeline or list of extraction pipelines to create.
- Returns:
Created extraction pipeline(s)
- Return type:
Examples
Create new extraction pipeline:
>>> from cognite.client import CogniteClient
>>> from cognite.client.data_classes import ExtractionPipelineWrite
>>> client = CogniteClient()
>>> # async_client = AsyncCogniteClient()  # another option
>>> extpipes = [ExtractionPipelineWrite(name="extPipe1",...), ExtractionPipelineWrite(name="extPipe2",...)]
>>> res = client.extraction_pipelines.create(extpipes)
Retrieve an extraction pipeline by ID
- async ExtractionPipelinesAPI.retrieve(
- id: int | None = None,
- external_id: str | None = None,
Retrieve a single extraction pipeline by id.
- Parameters:
id (int | None) – ID
external_id (str | None) – External ID
- Returns:
Requested extraction pipeline or None if it does not exist.
- Return type:
ExtractionPipeline | None
Examples
Get extraction pipeline by id:
>>> from cognite.client import CogniteClient, AsyncCogniteClient
>>> client = CogniteClient()
>>> # async_client = AsyncCogniteClient()  # another option
>>> res = client.extraction_pipelines.retrieve(id=1)
Get extraction pipeline by external id:
>>> res = client.extraction_pipelines.retrieve(external_id="1")
Retrieve multiple extraction pipelines by ID
- async ExtractionPipelinesAPI.retrieve_multiple(
- ids: Sequence[int] | None = None,
- external_ids: SequenceNotStr[str] | None = None,
- ignore_unknown_ids: bool = False,
Retrieve multiple extraction pipelines by ids and external ids.
- Parameters:
ids (Sequence[int] | None) – IDs
external_ids (SequenceNotStr[str] | None) – External IDs
ignore_unknown_ids (bool) – Ignore IDs and external IDs that are not found rather than throw an exception.
- Returns:
The requested ExtractionPipelines.
- Return type:
Examples
Get ExtractionPipelines by id:
>>> from cognite.client import CogniteClient, AsyncCogniteClient
>>> client = CogniteClient()
>>> # async_client = AsyncCogniteClient()  # another option
>>> res = client.extraction_pipelines.retrieve_multiple(ids=[1, 2, 3])
Get ExtractionPipelines by external id:
>>> res = client.extraction_pipelines.retrieve_multiple(external_ids=["abc", "def"], ignore_unknown_ids=True)
Update extraction pipelines
- async ExtractionPipelinesAPI.update(
- item: ExtractionPipeline | ExtractionPipelineWrite | ExtractionPipelineUpdate | Sequence[ExtractionPipeline | ExtractionPipelineWrite | ExtractionPipelineUpdate],
- mode: Literal['replace_ignore_null', 'patch', 'replace'] = 'replace_ignore_null',
Update one or more extraction pipelines
- Parameters:
item (ExtractionPipeline | ExtractionPipelineWrite | ExtractionPipelineUpdate | Sequence[ExtractionPipeline | ExtractionPipelineWrite | ExtractionPipelineUpdate]) – Extraction pipeline(s) to update
mode (Literal['replace_ignore_null', 'patch', 'replace']) – How to update data when a non-update object is given (ExtractionPipeline or -Write). If you use ‘replace_ignore_null’, only the fields you have set will be used to replace existing (default). Using ‘replace’ will additionally clear all the fields that are not specified by you. Last option, ‘patch’, will update only the fields you have set and for container-like fields such as metadata or labels, add the values to the existing. For more details, see Update and Upsert Mode Parameter.
- Returns:
Updated extraction pipeline(s)
- Return type:
Examples
Perform a partial update on an extraction pipeline, setting a new description:
>>> from cognite.client import CogniteClient
>>> from cognite.client.data_classes import ExtractionPipelineUpdate
>>> client = CogniteClient()
>>> # async_client = AsyncCogniteClient()  # another option
>>> update = ExtractionPipelineUpdate(id=1)
>>> update.description.set("Another new extpipe")
>>> res = client.extraction_pipelines.update(update)
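The three `mode` values are easiest to compare side by side. The toy model below applies them to plain dictionaries, following the wording of the parameter description. It is a sketch under simplifying assumptions: `apply_update` is a hypothetical function, fields left as None count as "not set", and required or identifier fields (which a real update would not clear) are ignored. It does not reproduce the SDK's or the API's implementation.

```python
from __future__ import annotations

from typing import Any


def apply_update(existing: dict[str, Any], changes: dict[str, Any], mode: str = "replace_ignore_null") -> dict[str, Any]:
    """Return the resource state after applying `changes` in the given mode."""
    set_fields = {name: value for name, value in changes.items() if value is not None}
    if mode == "replace_ignore_null":
        # Only the fields that were set replace the existing values.
        return {**existing, **set_fields}
    if mode == "replace":
        # Fields that were not set are cleared as well.
        return {name: set_fields.get(name) for name in existing.keys() | set_fields.keys()}
    if mode == "patch":
        # Container-like fields (here: dicts) are merged instead of replaced.
        result = dict(existing)
        for name, value in set_fields.items():
            current = result.get(name)
            if isinstance(value, dict) and isinstance(current, dict):
                result[name] = {**current, **value}
            else:
                result[name] = value
        return result
    raise ValueError(f"Unknown mode: {mode!r}")


existing = {"description": "old", "schedule": "Continuous", "metadata": {"a": "1"}}
changes = {"description": "new", "metadata": {"b": "2"}}
for mode in ("replace_ignore_null", "replace", "patch"):
    print(mode, apply_update(existing, changes, mode))
```

In this model the modes differ in exactly two places: whether unset fields such as `schedule` survive, and whether `metadata` is merged or replaced.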
Delete extraction pipelines
- async ExtractionPipelinesAPI.delete(
- id: int | Sequence[int] | None = None,
- external_id: str | SequenceNotStr[str] | None = None,
Delete one or more extraction pipelines
- Parameters:
id (int | Sequence[int] | None) – Id or list of ids
external_id (str | SequenceNotStr[str] | None) – External ID or list of external ids
Examples
Delete extraction pipelines by id or external id:
>>> from cognite.client import CogniteClient, AsyncCogniteClient
>>> client = CogniteClient()
>>> # async_client = AsyncCogniteClient()  # another option
>>> client.extraction_pipelines.delete(id=[1,2,3], external_id="3")
Extraction pipeline runs
List runs for an extraction pipeline
- async ExtractionPipelineRunsAPI.list(
- external_id: str,
- statuses: Literal['success', 'failure', 'seen'] | Sequence[Literal['success', 'failure', 'seen']] | SequenceNotStr[str] | None = None,
- message_substring: str | None = None,
- created_time: dict[str, Any] | TimestampRange | str | None = None,
- limit: int | None = 25,
List runs for an extraction pipeline with given external_id
- Parameters:
external_id (str) – Extraction pipeline external Id.
statuses (RunStatus | Sequence[RunStatus] | SequenceNotStr[str] | None) – One or more among “success” / “failure” / “seen”.
message_substring (str | None) – Failure message part.
created_time (dict[str, Any] | TimestampRange | str | None) – Range between two timestamps. Possible keys are min and max, with values given as timestamps in ms. If a string is passed, it is assumed to be the minimum value.
limit (int | None) – Maximum number of extraction pipeline runs to return. Defaults to 25. Set to -1, float("inf") or None to return all items.
- Returns:
List of requested extraction pipeline runs
- Return type:
Tip
The created_time parameter can also be passed as a string, to support the most typical usage pattern of fetching the most recent runs, meaning it is implicitly assumed to be the minimum created time. The format is “N[timeunit]-ago”, where timeunit is w,d,h,m (week, day, hour, minute), e.g. “12d-ago”.
Examples
List extraction pipeline runs:
>>> from cognite.client import CogniteClient, AsyncCogniteClient
>>> client = CogniteClient()
>>> # async_client = AsyncCogniteClient()  # another option
>>> runsList = client.extraction_pipelines.runs.list(external_id="test ext id", limit=5)
Filter extraction pipeline runs on a given status:
>>> runs_list = client.extraction_pipelines.runs.list(external_id="test ext id", statuses=["seen"], limit=5)
Get all failed pipeline runs in the last 24 hours for pipeline ‘extId’:
>>> from cognite.client.data_classes import ExtractionPipelineRun
>>> res = client.extraction_pipelines.runs.list(external_id="extId", statuses="failure", created_time="24h-ago")
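To make the "N[timeunit]-ago" format from the tip concrete, here is a small standalone parser that turns such a string into a minimum timestamp in milliseconds. It covers only the four units named above and takes the current time as an argument so the result is reproducible. `time_ago_to_ms` is a hypothetical helper; the SDK's own parsing may accept more forms or behave differently.

```python
from __future__ import annotations

import re
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
UNIT_TO_KWARG = {"w": "weeks", "d": "days", "h": "hours", "m": "minutes"}
TIME_AGO = re.compile(r"(\d+)([wdhm])-ago")


def time_ago_to_ms(text: str, now: datetime) -> int:
    """Parse e.g. '12d-ago' relative to the timezone-aware `now`; return ms since the epoch."""
    match = TIME_AGO.fullmatch(text)
    if match is None:
        raise ValueError(f"Expected 'N[w|d|h|m]-ago', got {text!r}")
    amount, unit = int(match.group(1)), match.group(2)
    moment = now - timedelta(**{UNIT_TO_KWARG[unit]: amount})
    return (moment - EPOCH) // timedelta(milliseconds=1)


now = datetime(2024, 1, 2, tzinfo=timezone.utc)
min_created_time = time_ago_to_ms("24h-ago", now)
print(min_created_time)
```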
Report new runs
- async ExtractionPipelineRunsAPI.create(
- run: ExtractionPipelineRun | ExtractionPipelineRunWrite | Sequence[ExtractionPipelineRun] | Sequence[ExtractionPipelineRunWrite],
Create one or more extraction pipeline runs.
You can create an arbitrary number of extraction pipeline runs, and the SDK will split the request into multiple requests.
- Parameters:
run (ExtractionPipelineRun | ExtractionPipelineRunWrite | Sequence[ExtractionPipelineRun] | Sequence[ExtractionPipelineRunWrite]) – Extraction pipeline run or list of extraction pipeline runs to create.
- Returns:
Created extraction pipeline run(s)
- Return type:
Examples
Report a new extraction pipeline run:
>>> from cognite.client import CogniteClient
>>> from cognite.client.data_classes import ExtractionPipelineRunWrite
>>> client = CogniteClient()
>>> res = client.extraction_pipelines.runs.create(
...     ExtractionPipelineRunWrite(status="success", extpipe_external_id="extId"))
Extraction pipeline configs
Get the latest or a specific config revision
- async ExtractionPipelineConfigsAPI.retrieve(
- external_id: str,
- revision: int | None = None,
- active_at_time: int | None = None,
Retrieve a specific configuration revision, or the latest by default <https://developer.cognite.com/api#tag/Extraction-Pipelines-Config/operation/getExtPipeConfigRevision>
By default the latest configuration revision is retrieved, or you can specify a timestamp or a revision number.
- Parameters:
external_id (str) – External id of the extraction pipeline to retrieve config from.
revision (int | None) – Optionally specify a revision number to retrieve.
active_at_time (int | None) – Optionally specify a timestamp the configuration revision should be active.
- Returns:
Retrieved extraction pipeline configuration revision
- Return type:
Examples
Retrieve latest config revision:
>>> from cognite.client import CogniteClient, AsyncCogniteClient
>>> client = CogniteClient()
>>> # async_client = AsyncCogniteClient()  # another option
>>> res = client.extraction_pipelines.config.retrieve("extId")
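How `revision` and `active_at_time` might select among stored revisions can be sketched on an in-memory list. The model below assumes that "active at a time" means the newest revision created at or before that time, and that `revision` takes precedence when both are given; both are assumptions made for illustration. `ConfigRevision` and `select_revision` are hypothetical names, not SDK classes.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ConfigRevision:
    revision: int
    created_time: int  # milliseconds since the epoch
    config: str


def select_revision(
    revisions: list[ConfigRevision],
    revision: int | None = None,
    active_at_time: int | None = None,
) -> ConfigRevision | None:
    """Pick a specific revision, the one active at a time, or the latest by default."""
    if revision is not None:
        return next((r for r in revisions if r.revision == revision), None)
    if active_at_time is not None:
        # Assumption: a revision is active from its creation until the next one is created.
        revisions = [r for r in revisions if r.created_time <= active_at_time]
    return max(revisions, key=lambda r: r.revision, default=None)


history = [
    ConfigRevision(1, 1_000, "interval: 60"),
    ConfigRevision(2, 2_000, "interval: 30"),
    ConfigRevision(3, 3_000, "interval: 10"),
]
print(select_revision(history))                       # latest revision
print(select_revision(history, active_at_time=2_500)) # revision active at t=2500
```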
List configuration revisions
- async ExtractionPipelineConfigsAPI.list(
- external_id: str,
Retrieve all configuration revisions from an extraction pipeline <https://developer.cognite.com/api#tag/Extraction-Pipelines-Config/operation/listExtPipeConfigRevisions>
- Parameters:
external_id (str) – External id of the extraction pipeline to retrieve config from.
- Returns:
Retrieved extraction pipeline configuration revisions
- Return type:
Examples
Retrieve a list of config revisions:
>>> from cognite.client import CogniteClient, AsyncCogniteClient
>>> client = CogniteClient()
>>> # async_client = AsyncCogniteClient()  # another option
>>> res = client.extraction_pipelines.config.list("extId")
Create a config revision
- async ExtractionPipelineConfigsAPI.create( ) ExtractionPipelineConfig
Create a new configuration revision <https://developer.cognite.com/api#tag/Extraction-Pipelines-Config/operation/createExtPipeConfig>
- Parameters:
config (ExtractionPipelineConfig | ExtractionPipelineConfigWrite) – Configuration revision to create.
- Returns:
Created extraction pipeline configuration revision
- Return type:
Examples
Create a config revision:
>>> from cognite.client import CogniteClient
>>> from cognite.client.data_classes import ExtractionPipelineConfigWrite
>>> client = CogniteClient()
>>> res = client.extraction_pipelines.config.create(ExtractionPipelineConfigWrite(external_id="extId", config="my config contents"))
Revert to an earlier config revision
- async ExtractionPipelineConfigsAPI.revert(
- external_id: str,
- revision: int,
Revert to a previous configuration revision <https://developer.cognite.com/api#tag/Extraction-Pipelines-Config/operation/revertExtPipeConfigRevision>
- Parameters:
external_id (str) – External id of the extraction pipeline to revert revision for.
revision (int) – Revision to revert to.
- Returns:
New latest extraction pipeline configuration revision.
- Return type:
Examples
Revert a config revision:
>>> from cognite.client import CogniteClient, AsyncCogniteClient
>>> client = CogniteClient()
>>> # async_client = AsyncCogniteClient()  # another option
>>> res = client.extraction_pipelines.config.revert("extId", 5)
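Because revert returns a "new latest" revision, one way to picture it is as appending a copy of the older contents under the next revision number, leaving history intact. The sketch below models that reading on a list of (revision, config) pairs. This is an assumption made for illustration, and `revert_to` is a hypothetical function; consult the API documentation linked above for the authoritative behaviour.

```python
from __future__ import annotations


def revert_to(history: list[tuple[int, str]], revision: int) -> list[tuple[int, str]]:
    """Return a new history whose latest entry repeats the contents of `revision`."""
    configs = dict(history)
    if revision not in configs:
        raise KeyError(f"Unknown revision: {revision}")
    next_revision = max(configs) + 1
    # Earlier revisions are kept; the reverted contents become the newest entry.
    return [*history, (next_revision, configs[revision])]


history = [(1, "interval: 60"), (2, "interval: 30"), (3, "interval: 10")]
reverted = revert_to(history, 1)
print(reverted[-1])  # the new latest revision
```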
Extractor Config Data classes
- class cognite.client.data_classes.extractionpipelines.ExtractionPipeline(
- id: int,
- external_id: str,
- name: str,
- description: str | None,
- data_set_id: int,
- raw_tables: list[dict[str, str]] | None,
- last_success: int | None,
- last_failure: int | None,
- last_message: str | None,
- last_seen: int | None,
- schedule: str | None,
- contacts: list[ExtractionPipelineContact] | None,
- metadata: dict[str, str] | None,
- source: str | None,
- documentation: str | None,
- notification_config: ExtractionPipelineNotificationConfiguration | None,
- created_time: int,
- last_updated_time: int,
- created_by: str | None,
Bases:
ExtractionPipelineCore
An extraction pipeline is a representation of a process writing data to CDF, such as an extractor or an ETL tool. This is the read version of the ExtractionPipeline class, which is used when retrieving extraction pipelines.
- Parameters:
id (int) – A server-generated ID for the object.
external_id (str) – The external ID provided by the client. Must be unique for the resource type.
name (str) – The name of the extraction pipeline.
description (str | None) – The description of the extraction pipeline.
data_set_id (int) – The id of the dataset this extraction pipeline is related to.
raw_tables (list[dict[str, str]] | None) – list of raw tables in list format: [{“dbName”: “value”, “tableName” : “value”}].
last_success (int | None) – Milliseconds value of last success status.
last_failure (int | None) – Milliseconds value of last failure status.
last_message (str | None) – Message of last failure.
last_seen (int | None) – Milliseconds value of last seen status.
schedule (str | None) – One of None/On trigger/Continuous/cron regex.
contacts (list[ExtractionPipelineContact] | None) – list of contacts
metadata (dict[str, str] | None) – Custom, application specific metadata. String key -> String value. Limits: Maximum length of key is 128 bytes, value 10240 bytes, up to 256 key-value pairs, of total size at most 10240.
source (str | None) – Source text value for extraction pipeline.
documentation (str | None) – Documentation text value for extraction pipeline.
notification_config (ExtractionPipelineNotificationConfiguration | None) – Notification configuration for the extraction pipeline.
created_time (int) – The number of milliseconds since 00:00:00 Thursday, 1 January 1970, Coordinated Universal Time (UTC), minus leap seconds.
last_updated_time (int) – The number of milliseconds since 00:00:00 Thursday, 1 January 1970, Coordinated Universal Time (UTC), minus leap seconds.
created_by (str | None) – Extraction pipeline creator, usually an email.
- as_write() ExtractionPipelineWrite
Returns this ExtractionPipeline as an ExtractionPipelineWrite
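The metadata limits listed above can be checked locally before sending a request. The validator below is a sketch under two assumptions that the text does not spell out: sizes are measured in UTF-8 bytes, and "total size" is the sum of all key and value sizes. `check_metadata` is a hypothetical helper, not part of the SDK; the API remains the authority on what is accepted.

```python
from __future__ import annotations

MAX_KEY_BYTES = 128
MAX_VALUE_BYTES = 10240
MAX_PAIRS = 256
MAX_TOTAL_BYTES = 10240


def check_metadata(metadata: dict[str, str]) -> list[str]:
    """Return a list of human-readable limit violations (empty if none were found)."""
    problems: list[str] = []
    if len(metadata) > MAX_PAIRS:
        problems.append(f"too many key-value pairs: {len(metadata)} > {MAX_PAIRS}")
    total = 0
    for key, value in metadata.items():
        key_size = len(key.encode("utf-8"))
        value_size = len(value.encode("utf-8"))
        total += key_size + value_size
        if key_size > MAX_KEY_BYTES:
            problems.append(f"key too long ({key_size} bytes): {key[:20]!r}")
        if value_size > MAX_VALUE_BYTES:
            problems.append(f"value too long ({value_size} bytes) for key {key[:20]!r}")
    if total > MAX_TOTAL_BYTES:
        problems.append(f"total size too large: {total} > {MAX_TOTAL_BYTES}")
    return problems


print(check_metadata({"source_system": "historian", "owner": "ops"}))
```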
- class cognite.client.data_classes.extractionpipelines.ExtractionPipelineConfig(
- external_id: str,
- config: str | None,
- revision: int,
- description: str | None,
- created_time: int,
Bases:
ExtractionPipelineConfigCore
An extraction pipeline config
- Parameters:
external_id (str) – The external ID of the associated extraction pipeline.
config (str | None) – Contents of this configuration revision.
revision (int) – The revision number of this config as a positive integer.
description (str | None) – Short description of this configuration revision.
created_time (int) – The number of milliseconds since 00:00:00 Thursday, 1 January 1970, Coordinated Universal Time (UTC), minus leap seconds.
- as_write() ExtractionPipelineConfigWrite
Returns this ExtractionPipelineConfig as an ExtractionPipelineConfigWrite
- class cognite.client.data_classes.extractionpipelines.ExtractionPipelineConfigCore(
- external_id: str | None = None,
- config: str | None = None,
- description: str | None = None,
Bases:
WriteableCogniteResource[ExtractionPipelineConfigWrite], ABC
An extraction pipeline config
- Parameters:
external_id (str | None) – The external ID of the associated extraction pipeline.
config (str | None) – Contents of this configuration revision.
description (str | None) – Short description of this configuration revision.
- class cognite.client.data_classes.extractionpipelines.ExtractionPipelineConfigList(
- resources: Sequence[T_CogniteResource],
Bases:
WriteableCogniteResourceList[ExtractionPipelineConfigWrite, ExtractionPipelineConfig], ExternalIDTransformerMixin
- class cognite.client.data_classes.extractionpipelines.ExtractionPipelineConfigRevision(
- external_id: str,
- revision: int,
- description: str | None,
- created_time: int,
Bases:
CogniteResource
An extraction pipeline config revision.
- Parameters:
external_id (str) – The external ID of the associated extraction pipeline.
revision (int) – The revision number of this config as a positive integer.
description (str | None) – Short description of this configuration revision.
created_time (int) – The number of milliseconds since 00:00:00 Thursday, 1 January 1970, Coordinated Universal Time (UTC), minus leap seconds.
- class cognite.client.data_classes.extractionpipelines.ExtractionPipelineConfigRevisionList(
- resources: Sequence[T_CogniteResource],
Bases:
CogniteResourceList[ExtractionPipelineConfigRevision], ExternalIDTransformerMixin
- class cognite.client.data_classes.extractionpipelines.ExtractionPipelineConfigWrite(
- external_id: str,
- config: str | None = None,
- description: str | None = None,
Bases:
ExtractionPipelineConfigCore
An extraction pipeline config.
- Parameters:
external_id (str) – The external ID of the associated extraction pipeline.
config (str | None) – Contents of this configuration revision.
description (str | None) – Short description of this configuration revision.
- as_write() ExtractionPipelineConfigWrite
Returns this ExtractionPipelineConfigWrite instance.
- class cognite.client.data_classes.extractionpipelines.ExtractionPipelineConfigWriteList(
- resources: Sequence[T_CogniteResource],
Bases:
CogniteResourceList[ExtractionPipelineConfigWrite], ExternalIDTransformerMixin
- class cognite.client.data_classes.extractionpipelines.ExtractionPipelineContact(
- name: str | None = None,
- email: str | None = None,
- role: str | None = None,
- send_notification: bool | None = None,
Bases:
CogniteResource
A contact for an extraction pipeline.
- Parameters:
name (str | None) – Name of contact
email (str | None) – Email address of contact
role (str | None) – Role of contact, such as Owner, Maintainer, etc.
send_notification (bool | None) – Whether to send notifications to this contact or not
- class cognite.client.data_classes.extractionpipelines.ExtractionPipelineCore(
- external_id: str,
- name: str | None,
- description: str | None,
- data_set_id: int | None,
- raw_tables: list[dict[str, str]] | None,
- schedule: str | None,
- contacts: list[ExtractionPipelineContact] | None,
- metadata: dict[str, str] | None,
- source: str | None,
- documentation: str | None,
- notification_config: ExtractionPipelineNotificationConfiguration | None,
- created_by: str | None,
Bases:
WriteableCogniteResource[ExtractionPipelineWrite], ABC
An extraction pipeline is a representation of a process writing data to CDF, such as an extractor or an ETL tool.
- Parameters:
external_id (str) – The external ID provided by the client. Must be unique for the resource type.
name (str | None) – The name of the extraction pipeline.
description (str | None) – The description of the extraction pipeline.
data_set_id (int | None) – The ID of the data set this extraction pipeline is related to.
raw_tables (list[dict[str, str]] | None) – List of raw tables, each given as a dict: [{"dbName": "value", "tableName": "value"}].
schedule (str | None) – One of None, "On trigger", "Continuous", or a cron expression.
contacts (list[ExtractionPipelineContact] | None) – List of contacts.
metadata (dict[str, str] | None) – Custom, application specific metadata. String key -> String value. Limits: Maximum length of key is 128 bytes, value 10240 bytes, up to 256 key-value pairs, of total size at most 10240.
source (str | None) – Source text value for extraction pipeline.
documentation (str | None) – Documentation text value for extraction pipeline.
notification_config (ExtractionPipelineNotificationConfiguration | None) – Notification configuration for the extraction pipeline.
created_by (str | None) – Extraction pipeline creator, usually an email.
- dump(camel_case: bool = True) dict[str, Any]
Dump the instance into a json serializable Python data type.
- Parameters:
camel_case (bool) – Use camelCase for attribute names. Defaults to True.
- Returns:
A dictionary representation of the instance.
- Return type:
dict[str, Any]
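To illustrate what the camel_case flag changes, here is a minimal sketch of a snake_case to camelCase key conversion. It is not the SDK's implementation: the helpers to_camel_case and dump_fields are hypothetical, and dropping None values is an assumption made for this example.

```python
from __future__ import annotations

from typing import Any


def to_camel_case(snake: str) -> str:
    """Turn 'data_set_id' into 'dataSetId' (hypothetical helper)."""
    head, *tail = snake.split("_")
    return head + "".join(part.capitalize() for part in tail)


def dump_fields(fields: dict[str, Any], camel_case: bool = True) -> dict[str, Any]:
    """Build a JSON-serializable dict from attribute names and values.

    Assumption for this sketch: attributes set to None are left out.
    """
    return {
        (to_camel_case(key) if camel_case else key): value
        for key, value in fields.items()
        if value is not None
    }


pipeline_fields = {"external_id": "extractor-1", "data_set_id": 42, "schedule": None}
camel = dump_fields(pipeline_fields)                    # keys as used in API payloads
snake = dump_fields(pipeline_fields, camel_case=False)  # keys as used on the Python object
```

With camel_case=True the keys follow the API's JSON naming; with camel_case=False they match the Python attribute names.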
- class cognite.client.data_classes.extractionpipelines.ExtractionPipelineList(
- resources: Sequence[T_CogniteResource],
Bases:
WriteableCogniteResourceList[ExtractionPipelineWrite, ExtractionPipeline], IdTransformerMixin
- class cognite.client.data_classes.extractionpipelines.ExtractionPipelineNotificationConfiguration(
- allowed_not_seen_range_in_minutes: int | None = None,
Bases:
CogniteResource
Extraction pipeline notification configuration.
- Parameters:
allowed_not_seen_range_in_minutes (int | None) – Time in minutes to pass without any Run. Null if extraction pipeline is not checked.
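One way to read allowed_not_seen_range_in_minutes is as a threshold on the time since the most recent run. The sketch below illustrates that reading with plain integers. The function is_overdue is hypothetical, the actual check is performed by CDF, and its exact rules (for example, whether the boundary is inclusive) are assumptions here.

```python
from __future__ import annotations

MS_PER_MINUTE = 60_000


def is_overdue(
    last_seen_ms: int,
    now_ms: int,
    allowed_not_seen_range_in_minutes: int | None,
) -> bool:
    """Return True if more than the allowed number of minutes passed since the last run.

    Hypothetical helper; both timestamps are milliseconds since the epoch.
    """
    if allowed_not_seen_range_in_minutes is None:
        # None means the pipeline is not checked at all.
        return False
    allowed_ms = allowed_not_seen_range_in_minutes * MS_PER_MINUTE
    return now_ms - last_seen_ms > allowed_ms
```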
- class cognite.client.data_classes.extractionpipelines.ExtractionPipelineRun(
- id: int,
- extpipe_external_id: str | None,
- status: str,
- message: str | None,
- created_time: int | None,
Bases:
ExtractionPipelineRunCore
A representation of an extraction pipeline run.
- Parameters:
id (int) – A server-generated ID for the object.
extpipe_external_id (str | None) – The external ID of the extraction pipeline.
status (str) – success/failure/seen.
message (str | None) – Optional status message.
created_time (int | None) – The number of milliseconds since 00:00:00 Thursday, 1 January 1970, Coordinated Universal Time (UTC), minus leap seconds.
- as_write() ExtractionPipelineRunWrite
Returns this ExtractionPipelineRun as an ExtractionPipelineRunWrite.
- dump(camel_case: bool = True) dict[str, Any]
Dump the instance into a json serializable Python data type.
- Parameters:
camel_case (bool) – Use camelCase for attribute names. Defaults to True.
- Returns:
A dictionary representation of the instance.
- Return type:
dict[str, Any]
- class cognite.client.data_classes.extractionpipelines.ExtractionPipelineRunCore(
- status: str,
- message: str | None,
- created_time: int | None,
Bases:
WriteableCogniteResource[ExtractionPipelineRunWrite], ABC
A representation of an extraction pipeline run.
- Parameters:
status (str) – success/failure/seen.
message (str | None) – Optional status message.
created_time (int | None) – The number of milliseconds since 00:00:00 Thursday, 1 January 1970, Coordinated Universal Time (UTC), minus leap seconds.
- class cognite.client.data_classes.extractionpipelines.ExtractionPipelineRunFilter(
- external_id: str | None = None,
- statuses: SequenceNotStr[str] | None = None,
- message: StringFilter | None = None,
- created_time: dict[str, Any] | TimestampRange | None = None,
Bases:
CogniteFilter
Filter runs with exact matching.
- Parameters:
external_id (str | None) – The external ID of related ExtractionPipeline provided by the client. Must be unique for the resource type.
statuses (SequenceNotStr[str] | None) – success/failure/seen.
message (StringFilter | None) – message filter.
created_time (dict[str, Any] | TimestampRange | None) – Range between two timestamps.
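The filter fields above combine a set of statuses, a message substring, and a time range. As a rough local illustration of those semantics, the sketch below applies the same three conditions to plain dicts. The real filtering happens in the API; matches_run_filter is a hypothetical helper, and the "min"/"max" keys with inclusive bounds are assumptions.

```python
from __future__ import annotations

from typing import Any, Sequence


def matches_run_filter(
    run: dict[str, Any],
    statuses: Sequence[str] | None = None,
    message_substring: str | None = None,
    created_time: dict[str, int] | None = None,
) -> bool:
    """Check one run against the filter; criteria left as None are ignored."""
    if statuses is not None and run["status"] not in statuses:
        return False
    if message_substring is not None and message_substring not in (run.get("message") or ""):
        return False
    if created_time is not None:
        ts = run["created_time"]
        # Assumed range format: {"min": ..., "max": ...} in epoch milliseconds, inclusive.
        if "min" in created_time and ts < created_time["min"]:
            return False
        if "max" in created_time and ts > created_time["max"]:
            return False
    return True


runs = [
    {"status": "success", "message": None, "created_time": 1_000},
    {"status": "failure", "message": "connection timeout", "created_time": 2_000},
    {"status": "seen", "message": None, "created_time": 3_000},
]
failed_timeouts = [
    r for r in runs if matches_run_filter(r, statuses=["failure"], message_substring="timeout")
]
```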
- class cognite.client.data_classes.extractionpipelines.ExtractionPipelineRunList(
- resources: Sequence[T_CogniteResource],
Bases:
WriteableCogniteResourceList[ExtractionPipelineRunWrite, ExtractionPipelineRun], IdTransformerMixin
- class cognite.client.data_classes.extractionpipelines.ExtractionPipelineRunWrite(
- extpipe_external_id: str,
- status: Literal['success', 'failure', 'seen'],
- message: str | None = None,
- created_time: int | None = None,
Bases:
ExtractionPipelineRunCore
A representation of an extraction pipeline run. This is the write version of the ExtractionPipelineRun class, which is used when creating extraction pipeline runs.
- Parameters:
extpipe_external_id (str) – The external ID of the extraction pipeline.
status (Literal['success', 'failure', 'seen']) – success/failure/seen.
message (str | None) – Optional status message.
created_time (int | None) – The number of milliseconds since 00:00:00 Thursday, 1 January 1970, Coordinated Universal Time (UTC), minus leap seconds.
- as_write() ExtractionPipelineRunWrite
Returns this ExtractionPipelineRunWrite instance.
- dump(
- camel_case: bool = True,
Dump the instance into a json serializable Python data type.
- Parameters:
camel_case (bool) – Use camelCase for attribute names. Defaults to True.
- Returns:
A dictionary representation of the instance.
- Return type:
dict[str, Any]
- class cognite.client.data_classes.extractionpipelines.ExtractionPipelineRunWriteList(
- resources: Sequence[T_CogniteResource],
- class cognite.client.data_classes.extractionpipelines.ExtractionPipelineUpdate(id: int | None = None, external_id: str | None = None)
Bases:
CogniteUpdate
Changes applied to an extraction pipeline.
- Parameters:
id (int) – A server-generated ID for the object.
external_id (str) – The external ID provided by the client. Must be unique for the resource type.
- class cognite.client.data_classes.extractionpipelines.ExtractionPipelineWrite(
- external_id: str,
- name: str,
- data_set_id: int,
- description: str | None = None,
- raw_tables: list[dict[str, str]] | None = None,
- schedule: str | None = None,
- contacts: list[ExtractionPipelineContact] | None = None,
- metadata: dict[str, str] | None = None,
- source: str | None = None,
- documentation: str | None = None,
- notification_config: ExtractionPipelineNotificationConfiguration | None = None,
- created_by: str | None = None,
Bases:
ExtractionPipelineCore
An extraction pipeline is a representation of a process writing data to CDF, such as an extractor or an ETL tool. This is the write version of the ExtractionPipeline class, which is used when creating extraction pipelines.
- Parameters:
external_id (str) – The external ID provided by the client. Must be unique for the resource type.
name (str) – The name of the extraction pipeline.
data_set_id (int) – The ID of the data set this extraction pipeline is related to.
description (str | None) – The description of the extraction pipeline.
raw_tables (list[dict[str, str]] | None) – List of raw tables, each given as a dict: [{"dbName": "value", "tableName": "value"}].
schedule (str | None) – One of None, "On trigger", "Continuous", or a cron expression.
contacts (list[ExtractionPipelineContact] | None) – List of contacts.
metadata (dict[str, str] | None) – Custom, application specific metadata. String key -> String value. Limits: Maximum length of key is 128 bytes, value 10240 bytes, up to 256 key-value pairs, of total size at most 10240.
source (str | None) – Source text value for extraction pipeline.
documentation (str | None) – Documentation text value for extraction pipeline.
notification_config (ExtractionPipelineNotificationConfiguration | None) – Notification configuration for the extraction pipeline.
created_by (str | None) – Extraction pipeline creator, usually an email.
- as_write() ExtractionPipelineWrite
Returns this ExtractionPipelineWrite instance.
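The metadata limits listed in the parameters can be checked locally before a pipeline is created. The following is a minimal sketch, not SDK functionality: metadata_problems is a hypothetical helper, sizes are measured as UTF-8 bytes, and the total-size limit is assumed to be in bytes summed over all keys and values.

```python
from __future__ import annotations

MAX_KEY_BYTES = 128
MAX_VALUE_BYTES = 10240
MAX_PAIRS = 256
MAX_TOTAL_BYTES = 10240  # assumption: bytes, summed over all keys and values


def metadata_problems(metadata: dict[str, str]) -> list[str]:
    """Return human-readable limit violations; an empty list means none were found."""
    problems: list[str] = []
    if len(metadata) > MAX_PAIRS:
        problems.append(f"too many key-value pairs: {len(metadata)} > {MAX_PAIRS}")
    total = 0
    for key, value in metadata.items():
        key_size = len(key.encode("utf-8"))
        value_size = len(value.encode("utf-8"))
        total += key_size + value_size
        if key_size > MAX_KEY_BYTES:
            problems.append(f"key too long ({key_size} bytes): {key[:20]!r}")
        if value_size > MAX_VALUE_BYTES:
            problems.append(f"value too long ({value_size} bytes) for key {key[:20]!r}")
    if total > MAX_TOTAL_BYTES:
        problems.append(f"total size too large: {total} > {MAX_TOTAL_BYTES} bytes")
    return problems
```

Returning a list of problems instead of raising on the first one lets a caller report every violation at once.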
- class cognite.client.data_classes.extractionpipelines.ExtractionPipelineWriteList(
- resources: Sequence[T_CogniteResource],
Bases:
CogniteResourceList[ExtractionPipelineWrite], ExternalIDTransformerMixin
- class cognite.client.data_classes.extractionpipelines.StringFilter(substring: str | None = None)
Bases:
CogniteFilter
Filter runs on substrings of the message.
- Parameters:
substring (str | None) – Part of message