Time Series
Metadata
Retrieve a time series by id
- TimeSeriesAPI.retrieve(id: Optional[int] = None, external_id: Optional[str] = None, instance_id: Optional[NodeId] = None) cognite.client.data_classes.time_series.TimeSeries | None
Retrieve a single time series by id.
- Parameters
id (int | None) – ID
external_id (str | None) – External ID
instance_id (NodeId | None) – Instance ID
- Returns
Requested time series or None if it does not exist.
- Return type
TimeSeries | None
Examples
Get time series by id:
>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> res = client.time_series.retrieve(id=1)
Get time series by external id:
>>> res = client.time_series.retrieve(external_id="1")
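Get time series by instance id (a minimal sketch; assumes a time series node "my-ts-xid" exists in the space "my-space"):
>>> from cognite.client.data_classes.data_modeling import NodeId
>>> res = client.time_series.retrieve(instance_id=NodeId("my-space", "my-ts-xid"))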
Retrieve multiple time series by id
- TimeSeriesAPI.retrieve_multiple(ids: Optional[Sequence[int]] = None, external_ids: Optional[SequenceNotStr[str]] = None, instance_ids: Optional[Sequence[NodeId]] = None, ignore_unknown_ids: bool = False) TimeSeriesList
Retrieve multiple time series by id.
- Parameters
ids (Sequence[int] | None) – IDs
external_ids (SequenceNotStr[str] | None) – External IDs
instance_ids (Sequence[NodeId] | None) – Instance IDs
ignore_unknown_ids (bool) – Ignore IDs and external IDs that are not found rather than throw an exception.
- Returns
The requested time series.
- Return type
TimeSeriesList
Examples
Get time series by id:
>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> res = client.time_series.retrieve_multiple(ids=[1, 2, 3])
Get time series by external id:
>>> res = client.time_series.retrieve_multiple(external_ids=["abc", "def"])
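If some of the given identifiers may not exist, you can ask the SDK to skip them instead of raising an exception (a sketch; the external ids here are placeholders):
>>> res = client.time_series.retrieve_multiple(
...     external_ids=["abc", "possibly-missing"], ignore_unknown_ids=True)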
List time series
- TimeSeriesAPI.list(name: Optional[str] = None, unit: Optional[str] = None, unit_external_id: Optional[str] = None, unit_quantity: Optional[str] = None, is_string: Optional[bool] = None, is_step: Optional[bool] = None, asset_ids: Optional[Sequence[int]] = None, asset_external_ids: Optional[SequenceNotStr[str]] = None, asset_subtree_ids: Optional[Union[int, Sequence[int]]] = None, asset_subtree_external_ids: Optional[Union[str, SequenceNotStr[str]]] = None, data_set_ids: Optional[Union[int, Sequence[int]]] = None, data_set_external_ids: Optional[Union[str, SequenceNotStr[str]]] = None, metadata: Optional[dict[str, Any]] = None, external_id_prefix: Optional[str] = None, created_time: Optional[dict[str, Any]] = None, last_updated_time: Optional[dict[str, Any]] = None, partitions: Optional[int] = None, limit: int | None = 25, advanced_filter: Optional[Union[Filter, dict[str, Any]]] = None, sort: Optional[Union[TimeSeriesSort, str, SortableTimeSeriesProperty, tuple[str, Literal['asc', 'desc']], tuple[str, Literal['asc', 'desc'], Literal['auto', 'first', 'last']], list[cognite.client.data_classes.time_series.TimeSeriesSort | str | cognite.client.data_classes.time_series.SortableTimeSeriesProperty | tuple[str, Literal['asc', 'desc']] | tuple[str, Literal['asc', 'desc'], Literal['auto', 'first', 'last']]]]] = None) TimeSeriesList
List time series.
- Parameters
name (str | None) – Name of the time series. Often referred to as tag.
unit (str | None) – Unit of the time series.
unit_external_id (str | None) – Filter on unit external ID.
unit_quantity (str | None) – Filter on unit quantity.
is_string (bool | None) – Whether the time series is a string time series.
is_step (bool | None) – Whether the time series is a step (piecewise constant) time series.
asset_ids (Sequence[int] | None) – List time series related to these assets.
asset_external_ids (SequenceNotStr[str] | None) – List time series related to these assets.
asset_subtree_ids (int | Sequence[int] | None) – Only include time series that are related to an asset in a subtree rooted at any of these assetIds. If the total size of the given subtrees exceeds 100,000 assets, an error will be returned.
asset_subtree_external_ids (str | SequenceNotStr[str] | None) – Only include time series that are related to an asset in a subtree rooted at any of these assetExternalIds. If the total size of the given subtrees exceeds 100,000 assets, an error will be returned.
data_set_ids (int | Sequence[int] | None) – Return only time series in the specified data set(s) with this id / these ids.
data_set_external_ids (str | SequenceNotStr[str] | None) – Return only time series in the specified data set(s) with this external id / these external ids.
metadata (dict[str, Any] | None) – Custom, application specific metadata. String key -> String value
external_id_prefix (str | None) – Filter by this (case-sensitive) prefix for the external ID.
created_time (dict[str, Any] | None) – Range between two timestamps. Possible keys are min and max, with values given as time stamps in ms.
last_updated_time (dict[str, Any] | None) – Range between two timestamps. Possible keys are min and max, with values given as time stamps in ms.
partitions (int | None) – Retrieve resources in parallel using this number of workers (values up to 10 allowed), limit must be set to None (or -1).
limit (int | None) – Maximum number of time series to return. Defaults to 25. Set to -1, float(“inf”) or None to return all items.
advanced_filter (Filter | dict[str, Any] | None) – Advanced filter query using the filter DSL (Domain Specific Language). It allows defining complex filtering expressions that combine simple operations, such as equals, prefix, exists, etc., using boolean operators and, or, and not. See examples below for usage.
sort (SortSpec | list[SortSpec] | None) – The criteria to sort by. Defaults to desc for _score_ and asc for all other properties. Sort is not allowed if partitions is used.
- Returns
The requested time series.
- Return type
TimeSeriesList
Note
- When using partitions, there are a few considerations to keep in mind (see the partitions example at the end of the Examples below):
limit has to be set to None (or -1).
The API may reject requests if you specify more than 10 partitions. When Cognite enforces this behavior, the requests result in a 400 Bad Request status.
Partitioning is done independently of sorting: there is no guarantee of the sort order between elements from different partitions. For this reason, providing a sort parameter when using partitions is not allowed.
Examples
List time series:
>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> res = client.time_series.list(limit=5)
Iterate over time series:
>>> for ts in client.time_series:
...     ts # do something with the time series
Iterate over chunks of time series to reduce memory load:
>>> for ts_list in client.time_series(chunk_size=2500):
...     ts_list # do something with the time series
Using advanced filter, find all time series that have a metadata key ‘timezone’ starting with ‘Europe’, and sort by external id ascending:
>>> from cognite.client.data_classes import filters
>>> in_timezone = filters.Prefix(["metadata", "timezone"], "Europe")
>>> res = client.time_series.list(advanced_filter=in_timezone, sort=("external_id", "asc"))
Note that you can check the API documentation above to see which properties you can filter on with which filters.
To make it easier to avoid spelling mistakes and easier to look up available properties for filtering and sorting, you can also use the TimeSeriesProperty and SortableTimeSeriesProperty Enums.
>>> from cognite.client.data_classes import filters
>>> from cognite.client.data_classes.time_series import TimeSeriesProperty, SortableTimeSeriesProperty
>>> in_timezone = filters.Prefix(TimeSeriesProperty.metadata_key("timezone"), "Europe")
>>> res = client.time_series.list(
...     advanced_filter=in_timezone,
...     sort=(SortableTimeSeriesProperty.external_id, "asc"))
Combine filter and advanced filter:
>>> from cognite.client.data_classes import filters
>>> not_instrument_lvl5 = filters.And(
...     filters.ContainsAny("labels", ["Level5"]),
...     filters.Not(filters.ContainsAny("labels", ["Instrument"]))
... )
>>> res = client.time_series.list(asset_subtree_ids=[123456], advanced_filter=not_instrument_lvl5)
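Fetch all time series in parallel using partitions, as referenced in the note above (a sketch; limit must be None and sort cannot be combined with partitions):
>>> res = client.time_series.list(partitions=10, limit=None)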
Aggregate time series
- TimeSeriesAPI.aggregate(filter: Optional[Union[TimeSeriesFilter, dict[str, Any]]] = None) list[cognite.client.data_classes.aggregations.CountAggregate]
Aggregate time series.
- Parameters
filter (TimeSeriesFilter | dict[str, Any] | None) – Filter on time series filter with exact match
- Returns
List of time series aggregates
- Return type
list[CountAggregate]
Examples
Aggregate the number of time series with unit "kpa":
>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> res = client.time_series.aggregate(filter={"unit": "kpa"})
Aggregate Time Series Count
- TimeSeriesAPI.aggregate_count(advanced_filter: Optional[Union[Filter, dict[str, Any]]] = None, filter: Optional[Union[TimeSeriesFilter, dict[str, Any]]] = None) int
Count of time series matching the specified filters and search.
- Parameters
advanced_filter (Filter | dict[str, Any] | None) – The filter to narrow down the time series to count.
filter (TimeSeriesFilter | dict[str, Any] | None) – The filter to narrow down time series to count requiring exact match.
- Returns
The number of time series matching the specified filters and search.
- Return type
int
Examples
Count the number of time series in your CDF project:
>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> count = client.time_series.aggregate_count()
Count the number of numeric time series in your CDF project:
>>> from cognite.client.data_classes import filters
>>> from cognite.client.data_classes.time_series import TimeSeriesProperty
>>> is_numeric = filters.Equals(TimeSeriesProperty.is_string, False)
>>> count = client.time_series.aggregate_count(advanced_filter=is_numeric)
Aggregate Time Series Values Cardinality
- TimeSeriesAPI.aggregate_cardinality_values(property: cognite.client.data_classes.time_series.TimeSeriesProperty | str | list[str], advanced_filter: Optional[Union[Filter, dict[str, Any]]] = None, aggregate_filter: Optional[Union[AggregationFilter, dict[str, Any]]] = None, filter: Optional[Union[TimeSeriesFilter, dict[str, Any]]] = None) int
Find approximate property count for time series.
- Parameters
property (TimeSeriesProperty | str | list[str]) – The property to count the cardinality of.
advanced_filter (Filter | dict[str, Any] | None) – The filter to narrow down the time series to count cardinality.
aggregate_filter (AggregationFilter | dict[str, Any] | None) – The filter to apply to the resulting buckets.
filter (TimeSeriesFilter | dict[str, Any] | None) – The filter to narrow down the time series to count requiring exact match.
- Returns
The number of properties matching the specified filters and search.
- Return type
int
Examples
Count the number of different units used for time series in your CDF project:
>>> from cognite.client import CogniteClient
>>> from cognite.client.data_classes.time_series import TimeSeriesProperty
>>> client = CogniteClient()
>>> unit_count = client.time_series.aggregate_cardinality_values(TimeSeriesProperty.unit)
Count the number of timezones (metadata key) for time series with the word “critical” in the description in your CDF project, but exclude timezones from america:
>>> from cognite.client.data_classes import filters, aggregations as aggs
>>> from cognite.client.data_classes.time_series import TimeSeriesProperty
>>> not_america = aggs.Not(aggs.Prefix("america"))
>>> is_critical = filters.Search(TimeSeriesProperty.description, "critical")
>>> timezone_count = client.time_series.aggregate_cardinality_values(
...     TimeSeriesProperty.metadata_key("timezone"),
...     advanced_filter=is_critical,
...     aggregate_filter=not_america)
Aggregate Time Series Property Cardinality
- TimeSeriesAPI.aggregate_cardinality_properties(path: cognite.client.data_classes.time_series.TimeSeriesProperty | str | list[str], advanced_filter: Optional[Union[Filter, dict[str, Any]]] = None, aggregate_filter: Optional[Union[AggregationFilter, dict[str, Any]]] = None, filter: Optional[Union[TimeSeriesFilter, dict[str, Any]]] = None) int
Find the approximate number of property paths for time series.
- Parameters
path (TimeSeriesProperty | str | list[str]) – The scope in every document to aggregate properties. The only value allowed now is [“metadata”]. It means to aggregate only metadata properties (aka keys).
advanced_filter (Filter | dict[str, Any] | None) – The filter to narrow down the time series to count cardinality.
aggregate_filter (AggregationFilter | dict[str, Any] | None) – The filter to apply to the resulting buckets.
filter (TimeSeriesFilter | dict[str, Any] | None) – The filter to narrow down the time series to count requiring exact match.
- Returns
The number of properties matching the specified filters and search.
- Return type
int
Examples
Count the number of metadata keys in your CDF project:
>>> from cognite.client import CogniteClient
>>> from cognite.client.data_classes.time_series import TimeSeriesProperty
>>> client = CogniteClient()
>>> key_count = client.time_series.aggregate_cardinality_properties(TimeSeriesProperty.metadata)
Aggregate Time Series Unique Values
- TimeSeriesAPI.aggregate_unique_values(property: cognite.client.data_classes.time_series.TimeSeriesProperty | str | list[str], advanced_filter: Optional[Union[Filter, dict[str, Any]]] = None, aggregate_filter: Optional[Union[AggregationFilter, dict[str, Any]]] = None, filter: Optional[Union[TimeSeriesFilter, dict[str, Any]]] = None) UniqueResultList
Get unique properties with counts for time series.
- Parameters
property (TimeSeriesProperty | str | list[str]) – The property to group by.
advanced_filter (Filter | dict[str, Any] | None) – The filter to narrow down the time series to count cardinality.
aggregate_filter (AggregationFilter | dict[str, Any] | None) – The filter to apply to the resulting buckets.
filter (TimeSeriesFilter | dict[str, Any] | None) – The filter to narrow down the time series to count requiring exact match.
- Returns
List of unique values of time series matching the specified filters and search.
- Return type
UniqueResultList
Examples
Get the timezones (metadata key) with count for your time series in your CDF project:
>>> from cognite.client import CogniteClient
>>> from cognite.client.data_classes.time_series import TimeSeriesProperty
>>> client = CogniteClient()
>>> result = client.time_series.aggregate_unique_values(TimeSeriesProperty.metadata_key("timezone"))
>>> print(result.unique)
Get the different units with count used for time series created after 2020-01-01 in your CDF project:
>>> from cognite.client.data_classes import filters
>>> from cognite.client.data_classes.time_series import TimeSeriesProperty
>>> from cognite.client.utils import timestamp_to_ms
>>> from datetime import datetime
>>> created_after_2020 = filters.Range(TimeSeriesProperty.created_time, gte=timestamp_to_ms(datetime(2020, 1, 1)))
>>> result = client.time_series.aggregate_unique_values(TimeSeriesProperty.unit, advanced_filter=created_after_2020)
>>> print(result.unique)
Get the different units with count for time series updated after 2020-01-01 in your CDF project, but exclude all units that start with “test”:
>>> from cognite.client.data_classes.time_series import TimeSeriesProperty
>>> from cognite.client.data_classes import aggregations as aggs, filters
>>> not_test = aggs.Not(aggs.Prefix("test"))
>>> created_after_2020 = filters.Range(TimeSeriesProperty.last_updated_time, gte=timestamp_to_ms(datetime(2020, 1, 1)))
>>> result = client.time_series.aggregate_unique_values(TimeSeriesProperty.unit, advanced_filter=created_after_2020, aggregate_filter=not_test)
>>> print(result.unique)
Aggregate Time Series Unique Properties
- TimeSeriesAPI.aggregate_unique_properties(path: cognite.client.data_classes.time_series.TimeSeriesProperty | str | list[str], advanced_filter: Optional[Union[Filter, dict[str, Any]]] = None, aggregate_filter: Optional[Union[AggregationFilter, dict[str, Any]]] = None, filter: Optional[Union[TimeSeriesFilter, dict[str, Any]]] = None) UniqueResultList
Get unique paths with counts for time series.
- Parameters
path (TimeSeriesProperty | str | list[str]) – The scope in every document to aggregate properties. The only value allowed now is [“metadata”]. It means to aggregate only metadata properties (aka keys).
advanced_filter (Filter | dict[str, Any] | None) – The filter to narrow down the time series to count cardinality.
aggregate_filter (AggregationFilter | dict[str, Any] | None) – The filter to apply to the resulting buckets.
filter (TimeSeriesFilter | dict[str, Any] | None) – The filter to narrow down the time series to count requiring exact match.
- Returns
List of unique values of time series matching the specified filters and search.
- Return type
UniqueResultList
Examples
Get the metadata keys with count for your time series in your CDF project:
>>> from cognite.client import CogniteClient
>>> from cognite.client.data_classes.time_series import TimeSeriesProperty
>>> client = CogniteClient()
>>> result = client.time_series.aggregate_unique_properties(TimeSeriesProperty.metadata)
Search for time series
- TimeSeriesAPI.search(name: Optional[str] = None, description: Optional[str] = None, query: Optional[str] = None, filter: Optional[Union[TimeSeriesFilter, dict[str, Any]]] = None, limit: int = 25) TimeSeriesList
Search for time series. Primarily meant for human-centric use-cases and data exploration, not for programs, since matching and ordering may change over time. Use the list function if stable or exact matches are required.
- Parameters
name (str | None) – Prefix and fuzzy search on name.
description (str | None) – Prefix and fuzzy search on description.
query (str | None) – Search on name and description using wildcard search on each of the words (separated by spaces). Retrieves results where at least one word must match. Example: ‘some other’
filter (TimeSeriesFilter | dict[str, Any] | None) – Filter to apply. Performs exact match on these fields.
limit (int) – Max number of results to return.
- Returns
List of requested time series.
- Return type
TimeSeriesList
Examples
Search for a time series:
>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> res = client.time_series.search(name="some name")
Search for all time series connected to asset with id 123:
>>> res = client.time_series.search(filter={"asset_ids": [123]})
Create time series
- TimeSeriesAPI.create(time_series: collections.abc.Sequence[cognite.client.data_classes.time_series.TimeSeries] | collections.abc.Sequence[cognite.client.data_classes.time_series.TimeSeriesWrite]) TimeSeriesList
- TimeSeriesAPI.create(time_series: cognite.client.data_classes.time_series.TimeSeries | cognite.client.data_classes.time_series.TimeSeriesWrite) TimeSeries
Create one or more time series.
- Parameters
time_series (TimeSeries | TimeSeriesWrite | Sequence[TimeSeries] | Sequence[TimeSeriesWrite]) – TimeSeries or list of TimeSeries to create.
- Returns
The created time series.
- Return type
TimeSeries | TimeSeriesList
Examples
Create a new time series:
>>> from cognite.client import CogniteClient
>>> from cognite.client.data_classes import TimeSeriesWrite
>>> client = CogniteClient()
>>> ts = client.time_series.create(TimeSeriesWrite(name="my_ts", data_set_id=123, external_id="foo"))
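You can also create several time series in one call by passing a list (a sketch with placeholder external ids):
>>> ts_list = client.time_series.create(
...     [TimeSeriesWrite(external_id="numeric_ts", name="my numeric ts"),
...      TimeSeriesWrite(external_id="string_ts", name="my string ts", is_string=True)])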
Delete time series
- TimeSeriesAPI.delete(id: Optional[Union[int, Sequence[int]]] = None, external_id: Optional[Union[str, SequenceNotStr[str]]] = None, ignore_unknown_ids: bool = False) None
Delete one or more time series.
- Parameters
id (int | Sequence[int] | None) – Id or list of ids
external_id (str | SequenceNotStr[str] | None) – External ID or list of external ids
ignore_unknown_ids (bool) – Ignore IDs and external IDs that are not found rather than throw an exception.
Examples
Delete time series by id or external id:
>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> client.time_series.delete(id=[1, 2, 3], external_id="3")
Filter time series
- TimeSeriesAPI.filter(filter: cognite.client.data_classes.filters.Filter | dict, sort: Optional[Union[TimeSeriesSort, str, SortableTimeSeriesProperty, tuple[str, Literal['asc', 'desc']], tuple[str, Literal['asc', 'desc'], Literal['auto', 'first', 'last']], list[cognite.client.data_classes.time_series.TimeSeriesSort | str | cognite.client.data_classes.time_series.SortableTimeSeriesProperty | tuple[str, Literal['asc', 'desc']] | tuple[str, Literal['asc', 'desc'], Literal['auto', 'first', 'last']]]]] = None, limit: int | None = 25) TimeSeriesList
Advanced filter lets you create complex filtering expressions that combine simple operations, such as equals, prefix, exists, etc., using boolean operators and, or, and not. It applies to basic fields as well as metadata.
- Parameters
filter (Filter | dict) – Filter to apply.
sort (SortSpec | list[SortSpec] | None) – The criteria to sort by. You can sort by up to two properties; the default sort order is ascending.
limit (int | None) – Maximum number of results to return. Defaults to 25. Set to -1, float(“inf”) or None to return all items.
- Returns
List of time series that match the filter criteria.
- Return type
TimeSeriesList
Examples
Find all numeric time series and return them sorted by external id:
>>> from cognite.client import CogniteClient
>>> from cognite.client.data_classes.filters import Equals
>>> client = CogniteClient()
>>> is_numeric = Equals("is_string", False)
>>> res = client.time_series.filter(filter=is_numeric, sort="external_id")
Note that you can check the API documentation above to see which properties you can filter on with which filters.
To make it easier to avoid spelling mistakes and easier to look up available properties for filtering and sorting, you can also use the TimeSeriesProperty and SortableTimeSeriesProperty enums.
>>> from cognite.client.data_classes.filters import Equals
>>> from cognite.client.data_classes.time_series import TimeSeriesProperty, SortableTimeSeriesProperty
>>> is_numeric = Equals(TimeSeriesProperty.is_string, False)
>>> res = client.time_series.filter(filter=is_numeric, sort=SortableTimeSeriesProperty.external_id)
Update time series
- TimeSeriesAPI.update(item: Sequence[cognite.client.data_classes.time_series.TimeSeries | cognite.client.data_classes.time_series.TimeSeriesWrite | cognite.client.data_classes.time_series.TimeSeriesUpdate], mode: Literal['replace_ignore_null', 'patch', 'replace'] = 'replace_ignore_null') TimeSeriesList
- TimeSeriesAPI.update(item: cognite.client.data_classes.time_series.TimeSeries | cognite.client.data_classes.time_series.TimeSeriesWrite | cognite.client.data_classes.time_series.TimeSeriesUpdate, mode: Literal['replace_ignore_null', 'patch', 'replace'] = 'replace_ignore_null') TimeSeries
Update one or more time series.
- Parameters
item (TimeSeries | TimeSeriesWrite | TimeSeriesUpdate | Sequence[TimeSeries | TimeSeriesWrite | TimeSeriesUpdate]) – Time series to update
mode (Literal['replace_ignore_null', 'patch', 'replace']) – How to update data when a non-update object is given (TimeSeries or -Write). If you use 'replace_ignore_null', only the fields you have set will be used to replace existing values (default). Using 'replace' will additionally clear all the fields that are not specified by you. The last option, 'patch', will update only the fields you have set and, for container-like fields such as metadata or labels, add the values to the existing ones. For more details, see Update and Upsert Mode Parameter.
- Returns
Updated time series.
- Return type
TimeSeries | TimeSeriesList
Examples
Update a time series that you have fetched. This will perform a full update of the time series:
>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> res = client.time_series.retrieve(id=1)
>>> res.description = "New description"
>>> res = client.time_series.update(res)
Perform a partial update on a time series, updating the description and adding a new field to metadata:
>>> from cognite.client.data_classes import TimeSeriesUpdate
>>> my_update = TimeSeriesUpdate(id=1).description.set("New description").metadata.add({"key": "value"})
>>> res = client.time_series.update(my_update)
Perform a partial update on a time series by instance id:
>>> from cognite.client.data_classes import TimeSeriesUpdate
>>> from cognite.client.data_classes.data_modeling import NodeId
>>> my_update = (
...     TimeSeriesUpdate(instance_id=NodeId("test", "hello"))
...     .external_id.set("test:hello")
...     .metadata.add({"test": "hello"})
... )
>>> client.time_series.update(my_update)
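The mode parameter only applies when passing a TimeSeries or TimeSeriesWrite object. As a sketch (assuming a time series with id 1 exists), 'patch' merges container-like fields such as metadata into the existing values instead of replacing them:
>>> ts = client.time_series.retrieve(id=1)
>>> ts.metadata = {"new_key": "new_value"}
>>> res = client.time_series.update(ts, mode="patch")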
Upsert time series
- TimeSeriesAPI.upsert(item: Sequence[cognite.client.data_classes.time_series.TimeSeries | cognite.client.data_classes.time_series.TimeSeriesWrite], mode: Literal['patch', 'replace'] = 'patch') TimeSeriesList
- TimeSeriesAPI.upsert(item: cognite.client.data_classes.time_series.TimeSeries | cognite.client.data_classes.time_series.TimeSeriesWrite, mode: Literal['patch', 'replace'] = 'patch') TimeSeries
- Upsert time series, i.e., update if it exists, and create if it does not exist.
Note that this is a convenience method that handles the upserting for you by first calling update on all items; any items that fail because they do not exist will then be created.
For more details, see Upsert.
- Parameters
item (TimeSeries | TimeSeriesWrite | Sequence[TimeSeries | TimeSeriesWrite]) – TimeSeries or list of TimeSeries to upsert.
mode (Literal['patch', 'replace']) – Whether to patch or replace in the case the time series are existing. If you set ‘patch’, the call will only update fields with non-null values (default). Setting ‘replace’ will unset any fields that are not specified.
- Returns
The upserted time series.
- Return type
TimeSeries | TimeSeriesList
Examples
Upsert for TimeSeries:
>>> from cognite.client import CogniteClient
>>> from cognite.client.data_classes import TimeSeries
>>> client = CogniteClient()
>>> existing_time_series = client.time_series.retrieve(id=1)
>>> existing_time_series.description = "New description"
>>> new_time_series = TimeSeries(external_id="new_timeSeries", description="New timeSeries")
>>> res = client.time_series.upsert([existing_time_series, new_time_series], mode="replace")
Time Series Data classes
- class cognite.client.data_classes.time_series.SortableTimeSeriesProperty(value)
Bases:
EnumProperty
An enumeration.
- class cognite.client.data_classes.time_series.TimeSeries(id: int | None = None, external_id: str | None = None, instance_id: NodeId | None = None, name: str | None = None, is_string: bool | None = None, metadata: dict[str, str] | None = None, unit: str | None = None, unit_external_id: str | None = None, asset_id: int | None = None, is_step: bool | None = None, description: str | None = None, security_categories: Sequence[int] | None = None, data_set_id: int | None = None, created_time: int | None = None, last_updated_time: int | None = None, legacy_name: str | None = None, cognite_client: CogniteClient | None = None)
Bases:
TimeSeriesCore
This represents a sequence of data points. The TimeSeries object holds the metadata about the datapoints, while the Datapoint objects are the actual data points. This is the reading version of TimeSeries, which is used when retrieving from CDF.
- Parameters
id (int | None) – A server-generated ID for the object.
external_id (str | None) – The externally supplied ID for the time series.
instance_id (NodeId | None) – The Instance ID for the time series. (Only applicable for time series created in DMS)
name (str | None) – The display short name of the time series.
is_string (bool | None) – Whether the time series is string valued or not.
metadata (dict[str, str] | None) – Custom, application-specific metadata. String key -> String value. Limits: Maximum length of key is 32 bytes, value 512 bytes, up to 16 key-value pairs.
unit (str | None) – The physical unit of the time series.
unit_external_id (str | None) – The physical unit of the time series (reference to unit catalog). Only available for numeric time series.
asset_id (int | None) – Asset ID of equipment linked to this time series.
is_step (bool | None) – Whether the time series is a step series or not.
description (str | None) – Description of the time series.
security_categories (Sequence[int] | None) – The required security categories to access this time series.
data_set_id (int | None) – The dataSet ID for the item.
created_time (int | None) – The number of milliseconds since 00:00:00 Thursday, 1 January 1970, Coordinated Universal Time (UTC), minus leap seconds.
last_updated_time (int | None) – The number of milliseconds since 00:00:00 Thursday, 1 January 1970, Coordinated Universal Time (UTC), minus leap seconds.
legacy_name (str | None) – This field is not used by the API and will be removed October 2024.
cognite_client (CogniteClient | None) – The client to associate with this object.
- as_write() TimeSeriesWrite
Returns a TimeSeriesWrite object with the same properties as this TimeSeries.
- asset() Asset
Returns the asset this time series belongs to.
- Returns
The asset given by its asset_id.
- Return type
Asset
- Raises
ValueError – If asset_id is missing.
- count() int
Returns the number of datapoints in this time series.
This result may not be completely accurate, as it is based on aggregates which may be occasionally out of date.
- Returns
The number of datapoints in this time series.
- Return type
int
- Raises
RuntimeError – If the time series is a string time series, since the count aggregate is only supported for numeric data.
- first() Datapoint | None
Returns the first datapoint in this time series. If empty, returns None.
- Returns
A datapoint object containing the value and timestamp of the first datapoint.
- Return type
Datapoint | None
- latest(before: int | str | datetime | None = None) Datapoint | None
Returns the latest datapoint in this time series. If empty, returns None.
- Parameters
before (int | str | datetime | None) – Get the latest datapoint before this time. Defaults to the current time.
- Returns
A datapoint object containing the value and timestamp of the latest datapoint.
- Return type
Datapoint | None
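A minimal usage sketch of these convenience methods on a retrieved time series (assumes a numeric time series with id 1 that has asset_id set):
>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> ts = client.time_series.retrieve(id=1)
>>> n_dps = ts.count()        # approximate number of datapoints (numeric series only)
>>> first_dp = ts.first()     # earliest datapoint, or None if the series is empty
>>> latest_dp = ts.latest()   # most recent datapoint, or None if the series is empty
>>> linked_asset = ts.asset() # the asset given by asset_id; raises ValueError if asset_id is missing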
- class cognite.client.data_classes.time_series.TimeSeriesCore(external_id: Optional[str] = None, instance_id: Optional[NodeId] = None, name: Optional[str] = None, is_string: Optional[bool] = None, metadata: Optional[dict[str, str]] = None, unit: Optional[str] = None, unit_external_id: Optional[str] = None, asset_id: Optional[int] = None, is_step: Optional[bool] = None, description: Optional[str] = None, security_categories: Optional[Sequence[int]] = None, data_set_id: Optional[int] = None, legacy_name: Optional[str] = None)
Bases:
WriteableCogniteResource[TimeSeriesWrite], ABC
No description.
- Parameters
external_id (str | None) – The externally supplied ID for the time series.
instance_id (NodeId | None) – The Instance ID for the time series. (Only applicable for time series created in DMS)
name (str | None) – The display short name of the time series.
is_string (bool | None) – Whether the time series is string valued or not.
metadata (dict[str, str] | None) – Custom, application-specific metadata. String key -> String value. Limits: Maximum length of key is 32 bytes, value 512 bytes, up to 16 key-value pairs.
unit (str | None) – The physical unit of the time series.
unit_external_id (str | None) – The physical unit of the time series (reference to unit catalog). Only available for numeric time series.
asset_id (int | None) – Asset ID of equipment linked to this time series.
is_step (bool | None) – Whether the time series is a step series or not.
description (str | None) – Description of the time series.
security_categories (Sequence[int] | None) – The required security categories to access this time series.
data_set_id (int | None) – The dataSet ID for the item.
legacy_name (str | None) – This field is not used by the API and will be removed October 2024.
- dump(camel_case: bool = True) dict[str, Any]
Dump the object to a dictionary
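A small sketch of dumping a write object to a dictionary (keys are camelCased by default; the output shown in the comment is illustrative):
>>> from cognite.client.data_classes import TimeSeriesWrite
>>> ts = TimeSeriesWrite(external_id="foo", name="my_ts", is_step=True)
>>> ts.dump(camel_case=True)  # e.g. {'externalId': 'foo', 'name': 'my_ts', 'isStep': True}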
- class cognite.client.data_classes.time_series.TimeSeriesFilter(name: Optional[str] = None, unit: Optional[str] = None, unit_external_id: Optional[str] = None, unit_quantity: Optional[str] = None, is_string: Optional[bool] = None, is_step: Optional[bool] = None, metadata: Optional[dict[str, str]] = None, asset_ids: Optional[Sequence[int]] = None, asset_external_ids: Optional[SequenceNotStr[str]] = None, asset_subtree_ids: Optional[Sequence[dict[str, Any]]] = None, data_set_ids: Optional[Sequence[dict[str, Any]]] = None, external_id_prefix: Optional[str] = None, created_time: Optional[Union[dict[str, Any], TimestampRange]] = None, last_updated_time: Optional[Union[dict[str, Any], TimestampRange]] = None)
Bases:
CogniteFilter
No description.
- Parameters
name (str | None) – Filter on name.
unit (str | None) – Filter on unit.
unit_external_id (str | None) – Filter on unit external ID.
unit_quantity (str | None) – Filter on unit quantity.
is_string (bool | None) – Filter on isString.
is_step (bool | None) – Filter on isStep.
metadata (dict[str, str] | None) – Custom, application specific metadata. String key -> String value. Limits: Maximum length of key is 32 bytes, value 512 bytes, up to 16 key-value pairs.
asset_ids (Sequence[int] | None) – Only include time series that reference these specific asset IDs.
asset_external_ids (SequenceNotStr[str] | None) – Asset External IDs of related equipment that this time series relates to.
asset_subtree_ids (Sequence[dict[str, Any]] | None) – Only include time series that are related to an asset in a subtree rooted at any of these asset IDs or external IDs. If the total size of the given subtrees exceeds 100,000 assets, an error will be returned.
data_set_ids (Sequence[dict[str, Any]] | None) – No description.
external_id_prefix (str | None) – Filter by this (case-sensitive) prefix for the external ID.
created_time (dict[str, Any] | TimestampRange | None) – Range between two timestamps.
last_updated_time (dict[str, Any] | TimestampRange | None) – Range between two timestamps.
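A minimal sketch of building a TimeSeriesFilter and passing it to a method that accepts an exact-match filter, such as search (the unit value is a placeholder):
>>> from cognite.client import CogniteClient
>>> from cognite.client.data_classes import TimeSeriesFilter
>>> client = CogniteClient()
>>> flt = TimeSeriesFilter(unit="kPa", is_string=False)
>>> res = client.time_series.search(filter=flt, limit=10)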
- class cognite.client.data_classes.time_series.TimeSeriesList(resources: Iterable[Any], cognite_client: CogniteClient | None = None)
Bases:
WriteableCogniteResourceList[TimeSeriesWrite, TimeSeries], IdTransformerMixin
- class cognite.client.data_classes.time_series.TimeSeriesProperty(value)
Bases:
EnumProperty
An enumeration.
- class cognite.client.data_classes.time_series.TimeSeriesUpdate(id: Optional[int] = None, external_id: Optional[str] = None, instance_id: Optional[NodeId] = None)
Bases:
CogniteUpdate
Changes will be applied to time series.
- Parameters
id (int | None) – A server-generated ID for the object.
external_id (str | None) – The external ID provided by the client. Must be unique for the resource type.
instance_id (NodeId | None) – The ID of the instance this time series belongs to.
- dump(camel_case: Literal[True] = True) dict[str, Any]
Dump the instance into a json serializable Python data type.
- Parameters
camel_case (Literal[True]) – No description.
- Returns
A dictionary representation of the instance.
- Return type
dict[str, Any]
- class cognite.client.data_classes.time_series.TimeSeriesWrite(external_id: Optional[str] = None, instance_id: Optional[NodeId] = None, name: Optional[str] = None, is_string: Optional[bool] = None, metadata: Optional[dict[str, str]] = None, unit: Optional[str] = None, unit_external_id: Optional[str] = None, asset_id: Optional[int] = None, is_step: Optional[bool] = None, description: Optional[str] = None, security_categories: Optional[Sequence[int]] = None, data_set_id: Optional[int] = None, legacy_name: Optional[str] = None)
Bases:
TimeSeriesCore
This is the write version of TimeSeries, which is used when writing to CDF.
- Parameters
external_id (str | None) – The externally supplied ID for the time series.
instance_id (NodeId | None) – The Instance ID for the time series. (Only applicable for time series created in DMS)
name (str | None) – The display short name of the time series.
is_string (bool | None) – Whether the time series is string valued or not.
metadata (dict[str, str] | None) – Custom, application-specific metadata. String key -> String value. Limits: Maximum length of key is 32 bytes, value 512 bytes, up to 16 key-value pairs.
unit (str | None) – The physical unit of the time series.
unit_external_id (str | None) – The physical unit of the time series (reference to unit catalog). Only available for numeric time series.
asset_id (int | None) – Asset ID of equipment linked to this time series.
is_step (bool | None) – Whether the time series is a step series or not.
description (str | None) – Description of the time series.
security_categories (Sequence[int] | None) – The required security categories to access this time series.
data_set_id (int | None) – The dataSet ID for the item.
legacy_name (str | None) – This field is not used by the API and will be removed October 2024.
- as_write() TimeSeriesWrite
Returns this TimeSeriesWrite object.
- class cognite.client.data_classes.time_series.TimeSeriesWriteList(resources: Iterable[Any], cognite_client: CogniteClient | None = None)
Bases:
CogniteResourceList[TimeSeriesWrite], ExternalIDTransformerMixin
Synthetic time series
Calculate the result of a function on time series
- SyntheticDatapointsAPI.query(expressions: str | sympy.Basic | Sequence[str | sympy.Basic], start: int | str | datetime, end: int | str | datetime, limit: int | None = None, variables: dict[str | sympy.Symbol, str | NodeId | TimeSeries | TimeSeriesWrite] | None = None, aggregate: str | None = None, granularity: str | None = None, target_unit: str | None = None, target_unit_system: str | None = None) Datapoints | DatapointsList
Calculate the result of a function on time series.
- Parameters
expressions (str | sympy.Basic | Sequence[str | sympy.Basic]) – Functions to be calculated. Supports both strings and sympy expressions. Strings can have either the API ts{} syntax, or contain variable names to be replaced using the variables parameter.
start (int | str | datetime) – Inclusive start.
end (int | str | datetime) – Exclusive end.
limit (int | None) – Number of datapoints per expression to retrieve.
variables (dict[str | sympy.Symbol, str | NodeId | TimeSeries | TimeSeriesWrite] | None) – An optional map of symbol replacements.
aggregate (str | None) – Use this aggregate when replacing entries from variables; it does not affect time series given in the ts{} syntax.
granularity (str | None) – Use this granularity with the aggregate.
target_unit (str | None) – Use this target_unit when replacing entries from variables; it does not affect time series given in the ts{} syntax.
target_unit_system (str | None) – Same as target_unit, but with unit system (e.g. SI). Only one of target_unit and target_unit_system can be specified.
- Returns
A DatapointsList object containing the calculated data.
- Return type
Datapoints | DatapointsList
Examples
Execute a synthetic time series query with an expression. Here we sum three time series plus a constant. The first is referenced by ID, the second by external ID, and the third by instance ID:
>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> expression = '''
...     123
...     + ts{id:123}
...     + ts{externalId:'abc'}
...     + ts{space:'my-space',externalId:'my-ts-xid'}
... '''
>>> dps = client.time_series.data.synthetic.query(
...     expressions=expression,
...     start="2w-ago",
...     end="now")
You can also specify variables for an easier query syntax:
>>> from cognite.client.data_classes.data_modeling.ids import NodeId
>>> ts = client.time_series.retrieve(id=123)
>>> variables = {
...     "A": ts,
...     "B": "my_ts_external_id",
...     "C": NodeId("my-space", "my-ts-xid"),
... }
>>> dps = client.time_series.data.synthetic.query(
...     expressions="A+B+C", start="2w-ago", end="2w-ahead", variables=variables)
Use sympy to build complex expressions:
>>> from sympy import symbols, cos, sin
>>> x, y = symbols("x y")
>>> dps = client.time_series.data.synthetic.query(
...     [sin(x), y*cos(x)],
...     start="2w-ago",
...     end="now",
...     variables={x: "foo", y: "bar"},
...     aggregate="interpolation",
...     granularity="15m",
...     target_unit="temperature:deg_c")
Datapoints
Retrieve datapoints
- DatapointsAPI.retrieve(*, id: Union[None, int, DatapointsQuery, Sequence[int | cognite.client.data_classes.datapoints.DatapointsQuery]] = None, external_id: Union[None, str, DatapointsQuery, SequenceNotStr[str | cognite.client.data_classes.datapoints.DatapointsQuery]] = None, instance_id: Union[None, NodeId, Sequence[NodeId], DatapointsQuery, Sequence[cognite.client.data_classes.data_modeling.ids.NodeId | cognite.client.data_classes.datapoints.DatapointsQuery]] = None, start: Optional[Union[int, str, datetime]] = None, end: Optional[Union[int, str, datetime]] = None, aggregates: Optional[Union[Literal['average', 'continuous_variance', 'count', 'count_bad', 'count_good', 'count_uncertain', 'discrete_variance', 'duration_bad', 'duration_good', 'duration_uncertain', 'interpolation', 'max', 'min', 'step_interpolation', 'sum', 'total_variation'], str, list[Union[Literal['average', 'continuous_variance', 'count', 'count_bad', 'count_good', 'count_uncertain', 'discrete_variance', 'duration_bad', 'duration_good', 'duration_uncertain', 'interpolation', 'max', 'min', 'step_interpolation', 'sum', 'total_variation'], str]]]] = None, granularity: Optional[str] = None, timezone: Optional[Union[str, timezone, ZoneInfo]] = None, target_unit: Optional[str] = None, target_unit_system: Optional[str] = None, limit: Optional[int] = None, include_outside_points: bool = False, ignore_unknown_ids: bool = False, include_status: bool = False, ignore_bad_datapoints: bool = True, treat_uncertain_as_bad: bool = True) cognite.client.data_classes.datapoints.Datapoints | cognite.client.data_classes.datapoints.DatapointsList | None
Retrieve datapoints for one or more time series.
- Performance guide:
In order to retrieve millions of datapoints as efficiently as possible, here are a few guidelines:
Make one call to retrieve and fetch all time series in one go, rather than making multiple calls (if your memory allows it). The SDK will optimize the retrieval strategy for you!
For best speed, and significantly lower memory usage, consider using retrieve_arrays(...) which uses numpy.ndarrays for data storage.
Unlimited queries (limit=None) are most performant as they are always fetched in parallel, for any number of requested time series.
Limited queries (e.g. limit=500_000) are much less performant, at least for large limits, as each individual time series is fetched serially (we can’t predict where on the timeline the datapoints are). Thus, parallelisation is only used when asking for multiple “limited” time series.
Try to avoid specifying start and end to be very far from the actual data: if you have data from 2000 to 2015, don’t use start=0 (1970).
Using timezone and/or calendar granularities like month/quarter/year in aggregate queries comes at a penalty.
Tip
To read datapoints efficiently, while keeping a low memory footprint, e.g. to copy from one project to another, check out __call__(). It allows you to iterate through datapoints in chunks, and also to control how many time series to iterate over at the same time.
Time series support status codes like Good, Uncertain and Bad. You can read more in the Cognite Data Fusion developer documentation on status codes.
- Parameters
id (None | int | DatapointsQuery | Sequence[int | DatapointsQuery]) – Id, dict (with id) or (mixed) sequence of these. See examples below.
external_id (None | str | DatapointsQuery | SequenceNotStr[str | DatapointsQuery]) – External id, dict (with external id) or (mixed) sequence of these. See examples below.
instance_id (None | NodeId | Sequence[NodeId] | DatapointsQuery | Sequence[NodeId | DatapointsQuery]) – Instance id or sequence of instance ids.
start (int | str | datetime.datetime | None) – Inclusive start. Default: 1970-01-01 UTC.
end (int | str | datetime.datetime | None) – Exclusive end. Default: “now”
aggregates (Aggregate | str | list[Aggregate | str] | None) – Single aggregate or list of aggregates to retrieve. Available options: average, continuous_variance, count, count_bad, count_good, count_uncertain, discrete_variance, duration_bad, duration_good, duration_uncertain, interpolation, max, min, step_interpolation, sum and total_variation. Default: None (raw datapoints returned)
granularity (str | None) – The granularity to fetch aggregates at. Can be given as an abbreviation or spelled out for clarity: s/second(s), m/minute(s), h/hour(s), d/day(s), w/week(s), mo/month(s), q/quarter(s), or y/year(s). Examples: 30s, 5m, 1day, 2weeks. Default: None.
timezone (str | datetime.timezone | ZoneInfo | None) – For raw datapoints, which timezone to use when displaying (will not affect what is retrieved). For aggregates, which timezone to align to for granularity ‘hour’ and longer. Align to the start of the hour, day or month. For timezones of type Region/Location, like ‘Europe/Oslo’, pass a string or ZoneInfo instance. The aggregate duration will then vary, typically due to daylight saving time. You can also use a fixed offset from UTC by passing a string like ‘+04:00’, ‘UTC-7’ or ‘UTC-02:30’ or an instance of datetime.timezone. Note: Historical timezones with second offset are not supported, and timezones with minute offsets (e.g. UTC+05:30 or Asia/Kolkata) may take longer to execute.
target_unit (str | None) – The unit_external_id of the datapoints returned. If the time series does not have a unit_external_id that can be converted to the target_unit, an error will be returned. Cannot be used with target_unit_system.
target_unit_system (str | None) – The unit system of the datapoints returned. Cannot be used with target_unit.
limit (int | None) – Maximum number of datapoints to return for each time series. Default: None (no limit)
include_outside_points (bool) – Whether to include outside points. Not allowed when fetching aggregates. Default: False
ignore_unknown_ids (bool) – Whether to ignore missing time series rather than raising an exception. Default: False
include_status (bool) – Also return the status code, an integer, for each datapoint in the response. Only relevant for raw datapoint queries, not aggregates.
ignore_bad_datapoints (bool) – Treat datapoints with a bad status code as if they do not exist. If set to false, raw queries will include bad datapoints in the response, and aggregates will in general omit the time period between a bad datapoint and the next good datapoint. Also, the period between a bad datapoint and the previous good datapoint will be considered constant. Default: True.
treat_uncertain_as_bad (bool) – Treat datapoints with uncertain status codes as bad. If false, treat datapoints with uncertain status codes as good. Used for both raw queries and aggregates. Default: True.
- Returns
A Datapoints object containing the requested data, or a DatapointsList if multiple time series were asked for (the ordering is ids first, then external_ids). If ignore_unknown_ids is True, a single time series is requested and it is not found, the function will return None.
- Return type
Datapoints | DatapointsList | None
Examples
You can specify the identifiers of the datapoints you wish to retrieve in a number of ways. In this example we are using the time-ago format, "2w-ago", to get raw data for the time series with id=42 from 2 weeks ago up until now. You can also use the time-ahead format, like "3d-ahead", to specify a relative time in the future.
>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> dps = client.time_series.data.retrieve(id=42, start="2w-ago")
>>> # You can also use instance_id:
>>> from cognite.client.data_classes.data_modeling import NodeId
>>> dps = client.time_series.data.retrieve(instance_id=NodeId("ts-space", "foo"))
Although raw datapoints are returned by default, you can also get aggregated values, such as max or average. You may also fetch more than one time series simultaneously. Here we are getting daily averages and maximum values for all of 2018, for two different time series, where we’re specifying start and end as integers (milliseconds after epoch). In the below example, we fetch them using their external ids:
>>> dps_lst = client.time_series.data.retrieve(
...     external_id=["foo", "bar"],
...     start=1514764800000,
...     end=1546300800000,
...     aggregates=["max", "average"],
...     granularity="1d")
In the two code examples above, we have a dps object (an instance of Datapoints), and a dps_lst object (an instance of DatapointsList). On dps, which in this case contains raw datapoints, you may access the underlying data directly by using the .value attribute. This works for both numeric and string (raw) datapoints, but not aggregates - they must be accessed by their respective names, because you’re allowed to fetch all available aggregates simultaneously, and they are stored on the same object:
>>> raw_data = dps.value
>>> first_dps = dps_lst[0]  # optionally: `dps_lst.get(external_id="foo")`
>>> avg_data = first_dps.average
>>> max_data = first_dps.max
You may also slice a Datapoints object (you get Datapoints back), or ask for “a row of data” at a single index in the same way you would do with a built-in list (you get a Datapoint object back, note the singular name). You’ll also get Datapoint objects when iterating through a Datapoints object, but this should generally be avoided (consider this a performance warning):
>>> dps_slice = dps[-10:]  # Last ten values
>>> dp = dps[3]  # The fourth value (index 3)
>>> for dp in dps_slice:
...     pass  # do something!
All parameters can be individually set if you use and pass DatapointsQuery objects (even ignore_unknown_ids, contrary to the API). If you also pass top-level parameters, these will be overruled by the individual parameters (where both exist, so think of these as defaults). You are free to mix any kind of ids and external ids: single identifiers, single DatapointsQuery objects and (mixed) lists of these.
Let’s say you want different aggregates and end-times for a few time series (when only fetching a single aggregate, you may pass the string directly for convenience):
>>> from cognite.client.data_classes import DatapointsQuery
>>> dps_lst = client.time_series.data.retrieve(
...     id=[
...         DatapointsQuery(id=42, end="1d-ago", aggregates="average"),
...         DatapointsQuery(id=69, end="2d-ahead", aggregates=["average"]),
...         DatapointsQuery(id=96, end="3d-ago", aggregates=["min", "max", "count"]),
...     ],
...     external_id=DatapointsQuery(external_id="foo", aggregates="max"),
...     start="5d-ago",
...     granularity="1h")
Certain aggregates are very useful when they follow the calendar, for example electricity consumption per day, week, month or year. You may request such calendar-based aggregates in a specific timezone to make them even more useful: daylight savings (DST) will be taken care of automatically and the datapoints will be aligned to the timezone. Note: Calendar granularities and timezone can be used independently. To get monthly local aggregates in Oslo, Norway you can do:
>>> dps = client.time_series.data.retrieve(
...     id=123,
...     aggregates="sum",
...     granularity="1month",
...     timezone="Europe/Oslo")
When requesting multiple time series, an easy way to get the datapoints of a specific one is to use the .get method on the returned DatapointsList object, then specify if you want id or external_id. Note: If you fetch a time series by using id, you can still access it with its external_id (and the opposite way around), if you know it:
>>> from datetime import datetime, timezone
>>> utc = timezone.utc
>>> dps_lst = client.time_series.data.retrieve(
...     start=datetime(1907, 10, 14, tzinfo=utc),
...     end=datetime(1907, 11, 6, tzinfo=utc),
...     id=[42, 43, 44, ..., 499, 500],
... )
>>> ts_350 = dps_lst.get(id=350)  # ``Datapoints`` object
…but what happens if you request some duplicate ids or external_ids? In this example we will show how to get data from multiple disconnected periods. Let’s say you’re tasked to train a machine learning model to recognize a specific failure mode of a system, and you want the training data to only be from certain periods (when an alarm was on/high). Assuming these alarms are stored as events in CDF, with both start- and end times, we can use these directly in the query.
After fetching, the .get method will return a list of Datapoints instead (assuming we have more than one event), in the same order, similar to how slicing works with non-unique indices on Pandas DataFrames:
>>> periods = client.events.list(type="alarm", subtype="pressure")
>>> sensor_xid = "foo-pressure-bar"
>>> dps_lst = client.time_series.data.retrieve(
...     id=[42, 43, 44],
...     external_id=[
...         DatapointsQuery(external_id=sensor_xid, start=ev.start_time, end=ev.end_time)
...         for ev in periods
...     ])
>>> ts_44 = dps_lst.get(id=44)  # Single ``Datapoints`` object
>>> ts_lst = dps_lst.get(external_id=sensor_xid)  # List of ``len(periods)`` ``Datapoints`` objects
The API has an endpoint to retrieve_latest(), i.e. “before”, but not “after”. Luckily, we can emulate that behaviour easily. Let’s say we have a very dense time series and do not want to fetch all of the available raw data (or fetch less precise aggregate data), just to get the very first datapoint of every month (from e.g. the year 2000 through 2010):
>>> import itertools
>>> month_starts = [
...     datetime(year, month, 1, tzinfo=utc)
...     for year, month in itertools.product(range(2000, 2011), range(1, 13))]
>>> dps_lst = client.time_series.data.retrieve(
...     external_id=[DatapointsQuery(external_id="foo", start=start) for start in month_starts],
...     limit=1)
To get all historic and future datapoints for a time series, e.g. to do a backup, you may want to import the two integer constants: MIN_TIMESTAMP_MS and MAX_TIMESTAMP_MS, to make sure you do not miss any. Performance warning: This pattern of fetching datapoints from the entire valid time domain is slower and shouldn’t be used for regular “day-to-day” queries:
>>> from cognite.client.utils import MIN_TIMESTAMP_MS, MAX_TIMESTAMP_MS
>>> dps_backup = client.time_series.data.retrieve(
...     id=123,
...     start=MIN_TIMESTAMP_MS,
...     end=MAX_TIMESTAMP_MS + 1)  # end is exclusive
If you have a time series with ‘unit_external_id’ set, you can use the ‘target_unit’ parameter to convert the datapoints to the desired unit. In the example below, we are converting temperature readings from a sensor measured and stored in Celsius, to Fahrenheit (we’re assuming that the time series has e.g. unit_external_id="temperature:deg_c"):
>>> client.time_series.data.retrieve(
...     id=42, start="2w-ago", target_unit="temperature:deg_f")
Or alternatively, you can use the ‘target_unit_system’ parameter to convert the datapoints to the desired unit system:
>>> client.time_series.data.retrieve(
...     id=42, start="2w-ago", target_unit_system="Imperial")
To retrieve status codes for a time series, pass include_status=True. This is only possible for raw datapoint queries. You would typically also pass ignore_bad_datapoints=False to not hide all the datapoints that are marked as uncertain or bad, which is the API’s default behaviour. You may also use treat_uncertain_as_bad to control how uncertain values are interpreted.
>>> dps = client.time_series.data.retrieve(
...     id=42, include_status=True, ignore_bad_datapoints=False)
>>> dps.status_code  # list of integer codes, e.g.: [0, 1073741824, 2147483648]
>>> dps.status_symbol  # list of symbolic representations, e.g. [Good, Uncertain, Bad]
There are six aggregates directly related to status codes, three for count: ‘count_good’, ‘count_uncertain’ and ‘count_bad’, and three for duration: ‘duration_good’, ‘duration_uncertain’ and ‘duration_bad’. These may be fetched as any other aggregate. It is important to note that status codes may influence how other aggregates are computed: Aggregates will in general omit the time period between a bad datapoint and the next good datapoint. Also, the period between a bad datapoint and the previous good datapoint will be considered constant. Put simply, what ‘average’ may return depends on your setting for ‘ignore_bad_datapoints’ and ‘treat_uncertain_as_bad’ (in the presence of uncertain/bad datapoints).
Retrieve datapoints as numpy arrays
- DatapointsAPI.retrieve_arrays(*, id: Union[None, int, DatapointsQuery, Sequence[int | cognite.client.data_classes.datapoints.DatapointsQuery]] = None, external_id: Union[None, str, DatapointsQuery, SequenceNotStr[str | cognite.client.data_classes.datapoints.DatapointsQuery]] = None, instance_id: Union[None, NodeId, Sequence[NodeId], DatapointsQuery, Sequence[cognite.client.data_classes.data_modeling.ids.NodeId | cognite.client.data_classes.datapoints.DatapointsQuery]] = None, start: Optional[Union[int, str, datetime]] = None, end: Optional[Union[int, str, datetime]] = None, aggregates: Optional[Union[Literal['average', 'continuous_variance', 'count', 'count_bad', 'count_good', 'count_uncertain', 'discrete_variance', 'duration_bad', 'duration_good', 'duration_uncertain', 'interpolation', 'max', 'min', 'step_interpolation', 'sum', 'total_variation'], str, list[Union[Literal['average', 'continuous_variance', 'count', 'count_bad', 'count_good', 'count_uncertain', 'discrete_variance', 'duration_bad', 'duration_good', 'duration_uncertain', 'interpolation', 'max', 'min', 'step_interpolation', 'sum', 'total_variation'], str]]]] = None, granularity: Optional[str] = None, timezone: Optional[Union[str, timezone, ZoneInfo]] = None, target_unit: Optional[str] = None, target_unit_system: Optional[str] = None, limit: Optional[int] = None, include_outside_points: bool = False, ignore_unknown_ids: bool = False, include_status: bool = False, ignore_bad_datapoints: bool = True, treat_uncertain_as_bad: bool = True) cognite.client.data_classes.datapoints.DatapointsArray | cognite.client.data_classes.datapoints.DatapointsArrayList | None
Retrieve datapoints for one or more time series.
Note
This method requires numpy to be installed.
Time series support status codes like Good, Uncertain and Bad. You can read more in the Cognite Data Fusion developer documentation on status codes.
- Parameters
id (None | int | DatapointsQuery | Sequence[int | DatapointsQuery]) – Id, dict (with id) or (mixed) sequence of these. See examples below.
external_id (None | str | DatapointsQuery | SequenceNotStr[str | DatapointsQuery]) – External id, dict (with external id) or (mixed) sequence of these. See examples below.
instance_id (None | NodeId | Sequence[NodeId] | DatapointsQuery | Sequence[NodeId | DatapointsQuery]) – Instance id or sequence of instance ids.
start (int | str | datetime.datetime | None) – Inclusive start. Default: 1970-01-01 UTC.
end (int | str | datetime.datetime | None) – Exclusive end. Default: “now”
aggregates (Aggregate | str | list[Aggregate | str] | None) – Single aggregate or list of aggregates to retrieve. Available options: average, continuous_variance, count, count_bad, count_good, count_uncertain, discrete_variance, duration_bad, duration_good, duration_uncertain, interpolation, max, min, step_interpolation, sum and total_variation. Default: None (raw datapoints returned)
granularity (str | None) – The granularity to fetch aggregates at. Can be given as an abbreviation or spelled out for clarity: s/second(s), m/minute(s), h/hour(s), d/day(s), w/week(s), mo/month(s), q/quarter(s), or y/year(s). Examples: 30s, 5m, 1day, 2weeks. Default: None.
timezone (str | datetime.timezone | ZoneInfo | None) – For raw datapoints, which timezone to use when displaying (will not affect what is retrieved). For aggregates, which timezone to align to for granularity ‘hour’ and longer. Align to the start of the hour, day or month. For timezones of type Region/Location, like ‘Europe/Oslo’, pass a string or ZoneInfo instance. The aggregate duration will then vary, typically due to daylight saving time. You can also use a fixed offset from UTC by passing a string like ‘+04:00’, ‘UTC-7’ or ‘UTC-02:30’ or an instance of datetime.timezone. Note: Historical timezones with second offset are not supported, and timezones with minute offsets (e.g. UTC+05:30 or Asia/Kolkata) may take longer to execute.
target_unit (str | None) – The unit_external_id of the datapoints returned. If the time series does not have a unit_external_id that can be converted to the target_unit, an error will be returned. Cannot be used with target_unit_system.
target_unit_system (str | None) – The unit system of the datapoints returned. Cannot be used with target_unit.
limit (int | None) – Maximum number of datapoints to return for each time series. Default: None (no limit)
include_outside_points (bool) – Whether to include outside points. Not allowed when fetching aggregates. Default: False
ignore_unknown_ids (bool) – Whether to ignore missing time series rather than raising an exception. Default: False
include_status (bool) – Also return the status code, an integer, for each datapoint in the response. Only relevant for raw datapoint queries, not aggregates.
ignore_bad_datapoints (bool) – Treat datapoints with a bad status code as if they do not exist. If set to false, raw queries will include bad datapoints in the response, and aggregates will in general omit the time period between a bad datapoint and the next good datapoint. Also, the period between a bad datapoint and the previous good datapoint will be considered constant. Default: True.
treat_uncertain_as_bad (bool) – Treat datapoints with uncertain status codes as bad. If false, treat datapoints with uncertain status codes as good. Used for both raw queries and aggregates. Default: True.
- Returns
A DatapointsArray object containing the requested data, or a DatapointsArrayList if multiple time series were asked for (the ordering is ids first, then external_ids). If ignore_unknown_ids is True and a single time series is requested but not found, the function will return None.
- Return type
DatapointsArray | DatapointsArrayList | None
Note
For many more usage examples, check out the retrieve() method, which accepts exactly the same arguments.
When retrieving raw datapoints with ignore_bad_datapoints=False, bad datapoints with the value NaN cannot be distinguished from those missing a value (due to being stored in a numpy array). To solve this, all missing values have their timestamp recorded in a set you may access: dps.null_timestamps. If you choose to pass a DatapointsArray to an insert method, this will be inspected automatically to replicate correctly (inserting status codes will soon be supported).
Examples
Get weekly min and max aggregates for a time series with id=42 since the start of 2020, then compute the range of values:
>>> from cognite.client import CogniteClient >>> from datetime import datetime, timezone >>> client = CogniteClient() >>> dps = client.time_series.data.retrieve_arrays( ... id=42, ... start=datetime(2020, 1, 1, tzinfo=timezone.utc), ... aggregates=["min", "max"], ... granularity="7d") >>> weekly_range = dps.max - dps.min
Get up to 2 million raw datapoints for the last 48 hours for a noisy time series with external_id=”ts-noisy”, then use a small and a wide moving average filter to smooth it out:
>>> import numpy as np >>> dps = client.time_series.data.retrieve_arrays( ... external_id="ts-noisy", ... start="2d-ago", ... limit=2_000_000) >>> smooth = np.convolve(dps.value, np.ones(5) / 5) >>> smoother = np.convolve(dps.value, np.ones(20) / 20)
Get raw datapoints for multiple time series that may or may not exist, from the last 2 hours, then find the largest gap between two consecutive values for each time series, also taking the previous value into account (outside point).
>>> id_lst = [42, 43, 44] >>> dps_lst = client.time_series.data.retrieve_arrays( ... id=id_lst, ... start="2h-ago", ... include_outside_points=True, ... ignore_unknown_ids=True) >>> largest_gaps = [np.max(np.diff(dps.timestamp)) for dps in dps_lst]
Get raw datapoints for a time series with external_id=”bar” from the last 10 weeks, then convert to a pandas.Series (you can of course also use the to_pandas() convenience method if you want a pandas.DataFrame):
>>> import pandas as pd >>> dps = client.time_series.data.retrieve_arrays(external_id="bar", start="10w-ago") >>> series = pd.Series(dps.value, index=dps.timestamp)
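When fetching with ignore_bad_datapoints=False, the timestamps of value-less (bad) datapoints are recorded in dps.null_timestamps, as described in the note above. A small, hedged sketch (assuming null_timestamps holds epoch-millisecond integers and that the timestamp array is datetime64[ns]) of flagging those positions:
>>> import numpy as np >>> dps = client.time_series.data.retrieve_arrays( ... id=42, start="2w-ago", include_status=True, ignore_bad_datapoints=False) >>> ts_ms = dps.timestamp.astype("datetime64[ms]").astype("int64") >>> is_missing = np.isin(ts_ms, list(dps.null_timestamps or [])) # boolean mask of value-less datapoints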
Retrieve datapoints in pandas dataframe
- DatapointsAPI.retrieve_dataframe(*, id: None | int | DatapointsQuery | Sequence[int | DatapointsQuery] = None, external_id: None | str | DatapointsQuery | SequenceNotStr[str | DatapointsQuery] = None, instance_id: None | NodeId | Sequence[NodeId] | DatapointsQuery | Sequence[NodeId | DatapointsQuery] = None, start: int | str | datetime.datetime | None = None, end: int | str | datetime.datetime | None = None, aggregates: Aggregate | str | list[Aggregate | str] | None = None, granularity: str | None = None, timezone: str | datetime.timezone | ZoneInfo | None = None, target_unit: str | None = None, target_unit_system: str | None = None, limit: int | None = None, include_outside_points: bool = False, ignore_unknown_ids: bool = False, include_status: bool = False, ignore_bad_datapoints: bool = True, treat_uncertain_as_bad: bool = True, uniform_index: bool = False, include_aggregate_name: bool = True, include_granularity_name: bool = False, column_names: Literal['id', 'external_id', 'instance_id'] = 'instance_id') pd.DataFrame
Get datapoints directly in a pandas dataframe.
Time series support status codes like Good, Uncertain and Bad. You can read more in the Cognite Data Fusion developer documentation on status codes.
Note
For many more usage examples, check out the retrieve() method, which accepts exactly the same arguments.
- Parameters
id (None | int | DatapointsQuery | Sequence[int | DatapointsQuery]) – Id, dict (with id) or (mixed) sequence of these. See examples below.
external_id (None | str | DatapointsQuery | SequenceNotStr[str | DatapointsQuery]) – External id, dict (with external id) or (mixed) sequence of these. See examples below.
instance_id (None | NodeId | Sequence[NodeId] | DatapointsQuery | Sequence[NodeId | DatapointsQuery]) – Instance id or sequence of instance ids.
start (int | str | datetime.datetime | None) – Inclusive start. Default: 1970-01-01 UTC.
end (int | str | datetime.datetime | None) – Exclusive end. Default: “now”
aggregates (Aggregate | str | list[Aggregate | str] | None) – Single aggregate or list of aggregates to retrieve. Available options: average, continuous_variance, count, count_bad, count_good, count_uncertain, discrete_variance, duration_bad, duration_good, duration_uncertain, interpolation, max, min, step_interpolation, sum and total_variation. Default: None (raw datapoints returned)
granularity (str | None) – The granularity to fetch aggregates at. Can be given as an abbreviation or spelled out for clarity: s/second(s), m/minute(s), h/hour(s), d/day(s), w/week(s), mo/month(s), q/quarter(s), or y/year(s). Examples: 30s, 5m, 1day, 2weeks. Default: None.
timezone (str | datetime.timezone | ZoneInfo | None) – For raw datapoints, which timezone to use when displaying (will not affect what is retrieved). For aggregates, which timezone to align to for granularity ‘hour’ and longer. Align to the start of the hour, day or month. For timezones of type Region/Location, like ‘Europe/Oslo’, pass a string or ZoneInfo instance. The aggregate duration will then vary, typically due to daylight saving time. You can also use a fixed offset from UTC by passing a string like ‘+04:00’, ‘UTC-7’ or ‘UTC-02:30’ or an instance of datetime.timezone. Note: Historical timezones with second offset are not supported, and timezones with minute offsets (e.g. UTC+05:30 or Asia/Kolkata) may take longer to execute.
target_unit (str | None) – The unit_external_id of the datapoints returned. If the time series does not have a unit_external_id that can be converted to the target_unit, an error will be returned. Cannot be used with target_unit_system.
target_unit_system (str | None) – The unit system of the datapoints returned. Cannot be used with target_unit.
limit (int | None) – Maximum number of datapoints to return for each time series. Default: None (no limit)
include_outside_points (bool) – Whether to include outside points. Not allowed when fetching aggregates. Default: False
ignore_unknown_ids (bool) – Whether to ignore missing time series rather than raising an exception. Default: False
include_status (bool) – Also return the status code, an integer, for each datapoint in the response. Only relevant for raw datapoint queries, not aggregates.
ignore_bad_datapoints (bool) – Treat datapoints with a bad status code as if they do not exist. If set to false, raw queries will include bad datapoints in the response, and aggregates will in general omit the time period between a bad datapoint and the next good datapoint. Also, the period between a bad datapoint and the previous good datapoint will be considered constant. Default: True.
treat_uncertain_as_bad (bool) – Treat datapoints with uncertain status codes as bad. If false, treat datapoints with uncertain status codes as good. Used for both raw queries and aggregates. Default: True.
uniform_index (bool) – If only querying aggregates AND a single granularity is used AND no limit is used, specifying uniform_index=True will return a dataframe with an equidistant datetime index from the earliest start to the latest end (missing values will be NaNs). If these requirements are not met, a ValueError is raised. Default: False
include_aggregate_name (bool) – Include ‘aggregate’ in the column name, e.g. my-ts|average. Ignored for raw time series. Default: True
include_granularity_name (bool) – Include ‘granularity’ in the column name, e.g. my-ts|12h. Added after ‘aggregate’ when present. Ignored for raw time series. Default: False
column_names (Literal['id', 'external_id', 'instance_id']) – Use either instance IDs, external IDs or IDs as column names. Time series missing an instance ID will fall back to the external ID if it exists, and otherwise to the ID. Default: “instance_id”
- Returns
A pandas DataFrame containing the requested time series. The ordering of columns is ids first, then external_ids. For time series with multiple aggregates, they will be sorted in alphabetical order (“average” before “max”).
- Return type
pd.DataFrame
Warning
If you have duplicated time series in your query, the dataframe columns will also contain duplicates.
When retrieving raw datapoints with ignore_bad_datapoints=False, bad datapoints with the value NaN cannot be distinguished from those missing a value (due to being stored in a numpy array); all will become NaNs in the dataframe.
Examples
Get a pandas dataframe using a single id, and use this id as column name, with no more than 100 datapoints:
>>> from cognite.client import CogniteClient >>> client = CogniteClient() >>> df = client.time_series.data.retrieve_dataframe( ... id=12345, ... start="2w-ago", ... end="now", ... limit=100, ... column_names="id")
Get the pandas dataframe with a uniform index (fixed spacing between points) of 1 day, for two time series with individually specified aggregates, from 1990 through 2020:
>>> from datetime import datetime, timezone >>> from cognite.client.data_classes import DatapointsQuery >>> df = client.time_series.data.retrieve_dataframe( ... external_id=[ ... DatapointsQuery(external_id="foo", aggregates="discrete_variance"), ... DatapointsQuery(external_id="bar", aggregates=["total_variation", "continuous_variance"]), ... ], ... granularity="1d", ... start=datetime(1990, 1, 1, tzinfo=timezone.utc), ... end=datetime(2020, 12, 31, tzinfo=timezone.utc), ... uniform_index=True)
Get a pandas dataframe containing the ‘average’ aggregate for two time series using a 30-day granularity, starting Jan 1, 1970 all the way up to present, without having the aggregate name in the column names:
>>> df = client.time_series.data.retrieve_dataframe( ... external_id=["foo", "bar"], ... aggregates="average", ... granularity="30d", ... include_aggregate_name=False)
You may also use pandas.Timestamp to define start and end:
>>> import pandas as pd >>> df = client.time_series.data.retrieve_dataframe( ... external_id="foo", ... start=pd.Timestamp("2023-01-01"), ... end=pd.Timestamp("2023-02-01"))
Retrieve datapoints in time zone in pandas dataframe
- DatapointsAPI.retrieve_dataframe_in_tz(*, id: int | Sequence[int] | None = None, external_id: str | SequenceNotStr[str] | None = None, start: datetime.datetime, end: datetime.datetime, aggregates: Aggregate | str | list[Aggregate | str] | None = None, granularity: str | None = None, target_unit: str | None = None, target_unit_system: str | None = None, ignore_unknown_ids: bool = False, include_status: bool = False, ignore_bad_datapoints: bool = True, treat_uncertain_as_bad: bool = True, uniform_index: bool = False, include_aggregate_name: bool = True, include_granularity_name: bool = False, column_names: Literal['id', 'external_id'] = 'external_id') pd.DataFrame
Get datapoints directly in a pandas dataframe in the same timezone as start and end.
Deprecation Warning
This SDK function is deprecated and will be removed in the next major release. Reason: Cognite Data Fusion now has native support for timezone and calendar-based aggregations. Please consider migrating already today: the API also supports fixed offsets, yields more accurate results and has better support for exotic timezones and unusual DST offsets. You can use the normal retrieve methods instead; just pass ‘timezone’ as a parameter.
- Parameters
id (int | Sequence[int] | None) – ID or list of IDs.
external_id (str | SequenceNotStr[str] | None) – External ID or list of External IDs.
start (datetime.datetime) – Inclusive start, must be timezone aware.
end (datetime.datetime) – Exclusive end, must be timezone aware and have the same timezone as start.
aggregates (Aggregate | str | list[Aggregate | str] | None) – Single aggregate or list of aggregates to retrieve. Available options: average, continuous_variance, count, count_bad, count_good, count_uncertain, discrete_variance, duration_bad, duration_good, duration_uncertain, interpolation, max, min, step_interpolation, sum and total_variation. Default: None (raw datapoints returned)
granularity (str | None) – The granularity to fetch aggregates at. Can be given as an abbreviation or spelled out for clarity: s/second(s), m/minute(s), h/hour(s), d/day(s), w/week(s), mo/month(s), q/quarter(s), or y/year(s). Examples: 30s, 5m, 1day, 2weeks. Default: None.
target_unit (str | None) – The unit_external_id of the datapoints returned. If the time series does not have a unit_external_id that can be converted to the target_unit, an error will be returned. Cannot be used with target_unit_system.
target_unit_system (str | None) – The unit system of the datapoints returned. Cannot be used with target_unit.
ignore_unknown_ids (bool) – Whether to ignore missing time series rather than raising an exception. Default: False
include_status (bool) – Also return the status code, an integer, for each datapoint in the response. Only relevant for raw datapoint queries, not aggregates.
ignore_bad_datapoints (bool) – Treat datapoints with a bad status code as if they do not exist. If set to false, raw queries will include bad datapoints in the response, and aggregates will in general omit the time period between a bad datapoint and the next good datapoint. Also, the period between a bad datapoint and the previous good datapoint will be considered constant. Default: True.
treat_uncertain_as_bad (bool) – Treat datapoints with uncertain status codes as bad. If false, treat datapoints with uncertain status codes as good. Used for both raw queries and aggregates. Default: True.
uniform_index (bool) – If querying aggregates with a non-calendar granularity, specifying uniform_index=True will return a dataframe with an index with constant spacing between timestamps, decided by the granularity, all the way from start to end (missing values will be NaNs). Default: False
include_aggregate_name (bool) – Include ‘aggregate’ in the column name, e.g. my-ts|average. Ignored for raw time series. Default: True
include_granularity_name (bool) – Include ‘granularity’ in the column name, e.g. my-ts|12h. Added after ‘aggregate’ when present. Ignored for raw time series. Default: False
column_names (Literal['id', 'external_id']) – Use either ids or external ids as column names. Time series missing external id will use id as backup. Default: “external_id”
- Returns
A pandas DataFrame containing the requested time series with a DatetimeIndex localized in the given timezone.
- Return type
pd.DataFrame
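A brief, hedged sketch (the external id is made up) of fetching daily averages aligned to the Europe/Oslo calendar with this deprecated method, followed by an equivalent call using retrieve_dataframe with the ‘timezone’ parameter, as suggested in the deprecation note above:
>>> from datetime import datetime >>> from zoneinfo import ZoneInfo >>> from cognite.client import CogniteClient >>> client = CogniteClient() >>> oslo = ZoneInfo("Europe/Oslo") >>> df_tz = client.time_series.data.retrieve_dataframe_in_tz( ... external_id="foo", ... start=datetime(2023, 1, 1, tzinfo=oslo), ... end=datetime(2023, 6, 1, tzinfo=oslo), ... aggregates="average", ... granularity="1d") >>> df = client.time_series.data.retrieve_dataframe( ... external_id="foo", ... start=datetime(2023, 1, 1, tzinfo=oslo), ... end=datetime(2023, 6, 1, tzinfo=oslo), ... aggregates="average", ... granularity="1d", ... timezone="Europe/Oslo")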
Iterate through datapoints in chunks
- DatapointsAPI.__call__(queries: DatapointsQuery, *, return_arrays: Literal[True]) Iterator[DatapointsArray]
- DatapointsAPI.__call__(queries: Sequence[DatapointsQuery], *, return_arrays: Literal[True]) Iterator[DatapointsArrayList]
- DatapointsAPI.__call__(queries: DatapointsQuery, *, return_arrays: Literal[False]) Iterator[Datapoints]
- DatapointsAPI.__call__(queries: Sequence[DatapointsQuery], *, return_arrays: Literal[False]) Iterator[DatapointsList]
Iterate through datapoints in chunks, for one or more time series.
Note
Control memory usage by specifying chunk_size_time_series (how many time series to iterate simultaneously) and chunk_size_datapoints (how many datapoints to yield per iteration, per individual time series). See the full example below. Note that in order to make efficient use of the API request limits, this method will never hold fewer than 100k datapoints in memory at a time, per time series.
If you run with memory constraints, use return_arrays=True (the default).
No empty chunk is ever returned.
- Parameters
queries (DatapointsQuery | Sequence[DatapointsQuery]) – Query, or queries, using id, external_id or instance_id for the time series to fetch data for, with individual settings specified. The options ‘limit’ and ‘include_outside_points’ are not supported when iterating.
chunk_size_datapoints (int) – The number of datapoints per time series to yield per iteration. Must evenly divide 100k OR be an integer multiple of 100k. Default: 100_000.
chunk_size_time_series (int | None) – The max number of time series to yield per iteration (varies as time series get exhausted, but is never empty). Default: None (all given queries are iterated at the same time).
return_arrays (bool) – Whether to return the datapoints as numpy arrays. Default: True.
- Yields
DatapointsArray | DatapointsArrayList | Datapoints | DatapointsList – If return_arrays=True, a DatapointsArray object containing the datapoints chunk, or a DatapointsArrayList if multiple time series were asked for. When False, a Datapoints object containing the datapoints chunk, or a DatapointsList if multiple time series were asked for.
Examples
Iterate through the datapoints of a single time series with external_id=”foo”, in chunks of 25k:
>>> from cognite.client import CogniteClient >>> from cognite.client.data_classes import DatapointsQuery >>> client = CogniteClient() >>> query = DatapointsQuery(external_id="foo", start="2w-ago") >>> for chunk in client.time_series.data(query, chunk_size_datapoints=25_000): ... pass # do something with the datapoints chunk
Iterate through datapoints from multiple time series, without returning them as memory-efficient numpy arrays. As one or more time series get exhausted (no more data), they are no longer part of the returned “chunk list”. Note that the order of the remaining time series is still preserved.
If you run with chunk_size_time_series=None, an easy way to check when a time series is exhausted is to use the .get method, as illustrated below:
>>> from cognite.client.data_classes.data_modeling import NodeId >>> queries = [ ... DatapointsQuery(id=123), ... DatapointsQuery(external_id="foo"), ... DatapointsQuery(instance_id=NodeId("my-space", "my-ts-xid")) ... ] >>> for chunk_lst in client.time_series.data(queries, return_arrays=False): ... if chunk_lst.get(id=123) is None: ... print("Time series with id=123 has no more datapoints!")
A likely use case for iterating datapoints is to clone data from one project to another, while keeping a low memory footprint and without having to write very custom logic involving count aggregates (which won’t work for string data) or do time-domain splitting yourself.
Here’s an example of how to do so efficiently, while including bad and uncertain data (ignore_bad_datapoints=False) and copying status codes (include_status=True). This is automatically taken care of when the Datapoints(-Array) objects are passed directly to an insert method. The only assumption below is that the time series have already been created in the target project.
>>> from cognite.client.utils import MIN_TIMESTAMP_MS, MAX_TIMESTAMP_MS >>> target_client = CogniteClient() >>> ts_to_copy = client.time_series.list(data_set_external_ids="my-use-case") >>> queries = [ ... DatapointsQuery( ... external_id=ts.external_id, ... include_status=True, ... ignore_bad_datapoints=False, ... start=MIN_TIMESTAMP_MS, ... end=MAX_TIMESTAMP_MS + 1, # end is exclusive ... ) ... for ts in ts_to_copy ... ] >>> for dps_chunk in client.time_series.data( ... queries, # may be several thousand time series... ... chunk_size_time_series=20, # control memory usage by specifying how many to iterate at a time ... chunk_size_datapoints=100_000, ... ): ... target_client.time_series.data.insert_multiple( ... [{"external_id": dps.external_id, "datapoints": dps} for dps in dps_chunk] ... )
Retrieve latest datapoint
- DatapointsAPI.retrieve_latest(id: Optional[Union[int, LatestDatapointQuery, list[int | cognite.client.data_classes.datapoints.LatestDatapointQuery]]] = None, external_id: Optional[Union[str, LatestDatapointQuery, list[str | cognite.client.data_classes.datapoints.LatestDatapointQuery]]] = None, before: Optional[Union[int, str, datetime]] = None, target_unit: Optional[str] = None, target_unit_system: Optional[str] = None, include_status: bool = False, ignore_bad_datapoints: bool = True, treat_uncertain_as_bad: bool = True, ignore_unknown_ids: bool = False) cognite.client.data_classes.datapoints.Datapoints | cognite.client.data_classes.datapoints.DatapointsList | None
Get the latest datapoint for one or more time series
Time series support status codes like Good, Uncertain and Bad. You can read more in the Cognite Data Fusion developer documentation on status codes.
- Parameters
id (int | LatestDatapointQuery | list[int | LatestDatapointQuery] | None) – Id or list of ids.
external_id (str | LatestDatapointQuery | list[str | LatestDatapointQuery] | None) – External id or list of external ids.
before (None | int | str | datetime.datetime) – Get latest datapoint before this time. Not used when passing ‘LatestDatapointQuery’.
target_unit (str | None) – The unit_external_id of the datapoint returned. If the time series does not have a unit_external_id that can be converted to the target_unit, an error will be returned. Cannot be used with target_unit_system.
target_unit_system (str | None) – The unit system of the datapoint returned. Cannot be used with target_unit.
include_status (bool) – Also return the status code, an integer, for each datapoint in the response.
ignore_bad_datapoints (bool) – Prevent datapoints with a bad status code from being returned. Default: True.
treat_uncertain_as_bad (bool) – Treat uncertain status codes as bad. If false, treat uncertain as good. Default: True.
ignore_unknown_ids (bool) – Ignore IDs and external IDs that are not found rather than throw an exception.
- Returns
A Datapoints object containing the requested data, or a DatapointsList if multiple were requested. If ignore_unknown_ids is True and a single time series is requested but not found, the function will return None.
- Return type
Datapoints | DatapointsList | None
Examples
Getting the latest datapoint in a time series. This method returns a Datapoints object, so the datapoint (if it exists) will be the first element:
>>> from cognite.client import CogniteClient >>> client = CogniteClient() >>> res = client.time_series.data.retrieve_latest(id=1)[0]
You can also get the first datapoint before a specific time:
>>> res = client.time_series.data.retrieve_latest(id=1, before="2d-ago")[0]
You can also get the first datapoint before a specific time in the future, e.g. for forecast data:
>>> res = client.time_series.data.retrieve_latest(id=1, before="2d-ahead")[0]
You can also retrieve the datapoint in a different unit or unit system:
>>> res = client.time_series.data.retrieve_latest(id=1, target_unit="temperature:deg_f")[0] >>> res = client.time_series.data.retrieve_latest(id=1, target_unit_system="Imperial")[0]
You may also pass an instance of LatestDatapointQuery:
>>> from cognite.client.data_classes import LatestDatapointQuery >>> res = client.time_series.data.retrieve_latest(id=LatestDatapointQuery(id=1, before=60_000))[0]
If you need the latest datapoint for multiple time series, simply give a list of ids. Note that we are using external ids here, but either will work:
>>> res = client.time_series.data.retrieve_latest(external_id=["abc", "def"]) >>> latest_abc = res[0][0] >>> latest_def = res[1][0]
If you, for example, need to specify a different value of ‘before’ for each time series, you may pass several LatestDatapointQuery objects. These will override any parameter passed directly to the function and also allow for individual customisation of ‘target_unit’, ‘target_unit_system’, ‘include_status’, ‘ignore_bad_datapoints’ and ‘treat_uncertain_as_bad’.
>>> from datetime import datetime, timezone >>> id_queries = [ ... 123, ... LatestDatapointQuery(id=456, before="1w-ago"), ... LatestDatapointQuery(id=789, before=datetime(2018,1,1, tzinfo=timezone.utc)), ... LatestDatapointQuery(id=987, target_unit="temperature:deg_f")] >>> ext_id_queries = [ ... "foo", ... LatestDatapointQuery(external_id="abc", before="3h-ago", target_unit_system="Imperial"), ... LatestDatapointQuery(external_id="def", include_status=True), ... LatestDatapointQuery(external_id="ghi", treat_uncertain_as_bad=False), ... LatestDatapointQuery(external_id="jkl", include_status=True, ignore_bad_datapoints=False)] >>> res = client.time_series.data.retrieve_latest( ... id=id_queries, external_id=ext_id_queries)
Insert datapoints
- DatapointsAPI.insert(datapoints: cognite.client.data_classes.datapoints.Datapoints | cognite.client.data_classes.datapoints.DatapointsArray | collections.abc.Sequence[dict[str, int | float | str | datetime.datetime]] | collections.abc.Sequence[tuple[int | float | datetime.datetime, int | float | str]], id: Optional[int] = None, external_id: Optional[str] = None, instance_id: Optional[NodeId] = None) None
Insert datapoints into a time series
Timestamps can be represented as milliseconds since epoch or datetime objects. Note that naive datetimes are interpreted to be in the local timezone (not UTC), adhering to Python conventions for datetime handling.
Time series support status codes like Good, Uncertain and Bad. You can read more in the Cognite Data Fusion developer documentation on status codes.
- Parameters
datapoints (Datapoints | DatapointsArray | Sequence[dict[str, int | float | str | datetime.datetime]] | Sequence[tuple[int | float | datetime.datetime, int | float | str]]) – The datapoints you wish to insert. Can either be a list of tuples, a list of dictionaries, a Datapoints object or a DatapointsArray object. See examples below.
id (int | None) – Id of time series to insert datapoints into.
external_id (str | None) – External id of time series to insert datapoint into.
instance_id (NodeId | None) – Instance ID of time series to insert datapoints into.
Note
All datapoints inserted without a status code (or symbol) are assumed to be good (code 0). To mark a value, pass either the status code (int) or status symbol (str). Only one of code and symbol is required. If both are given, they must match or an API error will be raised.
Datapoints marked bad can take on any of the following values: None (missing), NaN, and +/- Infinity. They are also not restricted to the normal numeric range [-1e100, 1e100] (i.e. they can be any valid float64).
Examples
Your datapoints can be a list of tuples where the first element is the timestamp and the second element is the value. The third element is optional and may contain the status code for the datapoint. To pass by symbol, a dictionary must be used.
>>> from cognite.client import CogniteClient >>> from cognite.client.data_classes import StatusCode >>> from datetime import datetime, timezone >>> client = CogniteClient() >>> datapoints = [ ... (datetime(2018,1,1, tzinfo=timezone.utc), 1000), ... (datetime(2018,1,2, tzinfo=timezone.utc), 2000, StatusCode.Good), ... (datetime(2018,1,3, tzinfo=timezone.utc), 3000, StatusCode.Uncertain), ... (datetime(2018,1,4, tzinfo=timezone.utc), None, StatusCode.Bad), ... ] >>> client.time_series.data.insert(datapoints, id=1)
The timestamp can be given by datetime as above, or in milliseconds since epoch. Status codes can also be passed as normal integers; this is necessary if a subcategory or modifier flag is needed, e.g. 3145728: ‘GoodClamped’:
>>> from cognite.client.data_classes.data_modeling import NodeId >>> datapoints = [ ... (150000000000, 1000), ... (160000000000, 2000, 3145728), ... (170000000000, 2000, 2147483648), # Same as StatusCode.Bad ... ] >>> client.time_series.data.insert(datapoints, instance_id=NodeId("my-space", "my-ts-xid"))
Or they can be a list of dictionaries:
>>> import math >>> datapoints = [ ... {"timestamp": 150000000000, "value": 1000}, ... {"timestamp": 160000000000, "value": 2000}, ... {"timestamp": 170000000000, "value": 3000, "status": {"code": 0}}, ... {"timestamp": 180000000000, "value": 4000, "status": {"symbol": "Uncertain"}}, ... {"timestamp": 190000000000, "value": math.nan, "status": {"code": StatusCode.Bad, "symbol": "Bad"}}, ... ] >>> client.time_series.data.insert(datapoints, external_id="abcd")
Or they can be a Datapoints or DatapointsArray object (with raw datapoints only). Note that the id or external_id set on these objects is not inspected/used (as it belongs to the “from-time-series”, not the “to-time-series”), so you must explicitly pass the identifier of the time series you want to insert into, which in this example is external_id=”foo”.
If the Datapoints or DatapointsArray are fetched with status codes, these will be automatically used in the insert:
>>> data = client.time_series.data.retrieve( ... external_id="abc", ... start="1w-ago", ... end="now", ... include_status=True, ... ignore_bad_datapoints=False, ... ) >>> client.time_series.data.insert(data, external_id="foo")
Insert datapoints into multiple time series
- DatapointsAPI.insert_multiple(datapoints: list[dict[str, str | int | list | cognite.client.data_classes.datapoints.Datapoints | cognite.client.data_classes.datapoints.DatapointsArray]]) None
Insert datapoints into multiple time series
Timestamps can be represented as milliseconds since epoch or datetime objects. Note that naive datetimes are interpreted to be in the local timezone (not UTC), adhering to Python conventions for datetime handling.
Time series support status codes like Good, Uncertain and Bad. You can read more in the Cognite Data Fusion developer documentation on status codes.
- Parameters
datapoints (list[dict[str, str | int | list | Datapoints | DatapointsArray]]) – The datapoints you wish to insert along with the ids of the time series. See examples below.
Note
All datapoints inserted without a status code (or symbol) are assumed to be good (code 0). To mark a value, pass either the status code (int) or status symbol (str). Only one of code and symbol is required. If both are given, they must match or an API error will be raised.
Datapoints marked bad can take on any of the following values: None (missing), NaN, and +/- Infinity. They are also not restricted to the normal numeric range [-1e100, 1e100] (i.e. they can be any valid float64).
Examples
Your datapoints can be a list of dictionaries, each containing datapoints for a different (presumably) time series. These dictionaries must have the key “datapoints” (containing the data) specified as a Datapoints object, a DatapointsArray object, or a list of either tuples (timestamp, value) or dictionaries, {“timestamp”: ts, “value”: value}.
When passing tuples, the third element is optional and may contain the status code for the datapoint. To pass by symbol, a dictionary must be used.
>>> from cognite.client import CogniteClient >>> from cognite.client.data_classes.data_modeling import NodeId >>> from cognite.client.data_classes import StatusCode >>> from datetime import datetime, timezone >>> client = CogniteClient() >>> to_insert = [ ... {"id": 1, "datapoints": [ ... (datetime(2018,1,1, tzinfo=timezone.utc), 1000), ... (datetime(2018,1,2, tzinfo=timezone.utc), 2000, StatusCode.Good)], ... }, ... {"external_id": "foo", "datapoints": [ ... (datetime(2018,1,3, tzinfo=timezone.utc), 3000), ... (datetime(2018,1,4, tzinfo=timezone.utc), 4000, StatusCode.Uncertain)], ... }, ... {"instance_id": NodeId("my-space", "my-ts-xid"), "datapoints": [ ... (datetime(2018,1,5, tzinfo=timezone.utc), 5000), ... (datetime(2018,1,6, tzinfo=timezone.utc), None, StatusCode.Bad)], ... } ... ]
Passing datapoints using the dictionary format with timestamp given in milliseconds since epoch:
>>> import math >>> to_insert.append( ... {"external_id": "bar", "datapoints": [ ... {"timestamp": 170000000, "value": 7000}, ... {"timestamp": 180000000, "value": 8000, "status": {"symbol": "Uncertain"}}, ... {"timestamp": 190000000, "value": None, "status": {"code": StatusCode.Bad}}, ... {"timestamp": 200000000, "value": math.inf, "status": {"code": StatusCode.Bad, "symbol": "Bad"}}, ... ]})
If the Datapoints or DatapointsArray are fetched with status codes, these will be automatically used in the insert:
>>> data_to_clone = client.time_series.data.retrieve( ... external_id="bar", include_status=True, ignore_bad_datapoints=False) >>> to_insert.append({"external_id": "bar-clone", "datapoints": data_to_clone}) >>> client.time_series.data.insert_multiple(to_insert)
Insert pandas dataframe
- DatapointsAPI.insert_dataframe(df: pd.DataFrame, external_id_headers: bool = True, dropna: bool = True) None
Insert a dataframe (columns must be unique).
The index of the dataframe must contain the timestamps (pd.DatetimeIndex). The names of the columns specify the ids or external ids of the time series to which the datapoints will be written.
Said time series must already exist.
- Parameters
df (pd.DataFrame) – Pandas DataFrame object containing the time series.
external_id_headers (bool) – Interpret the column names as external id. Pass False if using ids. Default: True.
dropna (bool) – Set to True to ignore NaNs in the given DataFrame, applied per column. Default: True.
Warning
You cannot insert datapoints with status codes using this method (insert_dataframe); you’ll need to use the insert() method instead (or insert_multiple())!
Examples
Post a dataframe with white noise:
>>> import numpy as np >>> import pandas as pd >>> from cognite.client import CogniteClient >>> client = CogniteClient() >>> ts_xid = "my-foo-ts" >>> idx = pd.date_range(start="2018-01-01", periods=100, freq="1d") >>> noise = np.random.normal(0, 1, 100) >>> df = pd.DataFrame({ts_xid: noise}, index=idx) >>> client.time_series.data.insert_dataframe(df)
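If your columns are internal ids rather than external ids, pass external_id_headers=False. A minimal, hedged sketch continuing the example above (the id 123 is made up):
>>> df_by_id = pd.DataFrame({123: noise}, index=idx) >>> client.time_series.data.insert_dataframe(df_by_id, external_id_headers=False)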
Delete a range of datapoints
- DatapointsAPI.delete_range(start: int | str | datetime.datetime, end: int | str | datetime.datetime, id: Optional[int] = None, external_id: Optional[str] = None, instance_id: Optional[NodeId] = None) None
Delete a range of datapoints from a time series.
- Parameters
start (int | str | datetime.datetime) – Inclusive start of delete range
end (int | str | datetime.datetime) – Exclusive end of delete range
id (int | None) – Id of time series to delete data from
external_id (str | None) – External id of time series to delete data from
instance_id (NodeId | None) – Instance ID of time series to delete data from
Examples
Deleting the last week of data from a time series:
>>> from cognite.client import CogniteClient >>> client = CogniteClient() >>> client.time_series.data.delete_range(start="1w-ago", end="now", id=1)
Deleting the data from now until 2 days in the future from a time series containing e.g. forecasted data:
>>> client.time_series.data.delete_range(start="now", end="2d-ahead", id=1)
Delete ranges of datapoints
- DatapointsAPI.delete_ranges(ranges: list[dict[str, Any]]) None
Delete a range of datapoints from multiple time series.
- Parameters
ranges (list[dict[str, Any]]) – The list of datapoint ids along with time range to delete. See examples below.
Examples
Each element in the list of ranges must specify either id or external_id, and a range:
>>> from cognite.client import CogniteClient >>> client = CogniteClient() >>> ranges = [{"id": 1, "start": "2d-ago", "end": "now"}, ... {"external_id": "abc", "start": "2d-ago", "end": "2d-ahead"}] >>> client.time_series.data.delete_ranges(ranges)
Datapoints Data classes
- class cognite.client.data_classes.datapoints.Datapoint(timestamp: Optional[int] = None, value: Optional[Union[str, float]] = None, average: Optional[float] = None, max: Optional[float] = None, min: Optional[float] = None, count: Optional[int] = None, sum: Optional[float] = None, interpolation: Optional[float] = None, step_interpolation: Optional[float] = None, continuous_variance: Optional[float] = None, discrete_variance: Optional[float] = None, total_variation: Optional[float] = None, count_bad: Optional[int] = None, count_good: Optional[int] = None, count_uncertain: Optional[int] = None, duration_bad: Optional[int] = None, duration_good: Optional[int] = None, duration_uncertain: Optional[int] = None, status_code: Optional[int] = None, status_symbol: Optional[str] = None, timezone: Optional[Union[timezone, ZoneInfo]] = None)
Bases:
CogniteResource
An object representing a datapoint.
- Parameters
timestamp (int | None) – The data timestamp in milliseconds since the epoch (Jan 1, 1970). Can be negative to define a date before 1970. Minimum timestamp is 1900.01.01 00:00:00 UTC
value (str | float | None) – The raw data value. Can be string or numeric.
average (float | None) – The time-weighted average value in the aggregate interval.
max (float | None) – The maximum value in the aggregate interval.
min (float | None) – The minimum value in the aggregate interval.
count (int | None) – The number of raw datapoints in the aggregate interval.
sum (float | None) – The sum of the raw datapoints in the aggregate interval.
interpolation (float | None) – The interpolated value at the beginning of the aggregate interval.
step_interpolation (float | None) – The interpolated value at the beginning of the aggregate interval using stepwise interpretation.
continuous_variance (float | None) – The variance of the interpolated underlying function.
discrete_variance (float | None) – The variance of the datapoint values.
total_variation (float | None) – The total variation of the interpolated underlying function.
count_bad (int | None) – The number of raw datapoints with a bad status code, in the aggregate interval.
count_good (int | None) – The number of raw datapoints with a good status code, in the aggregate interval.
count_uncertain (int | None) – The number of raw datapoints with an uncertain status code, in the aggregate interval.
duration_bad (int | None) – The duration the aggregate is defined and marked as bad (measured in milliseconds).
duration_good (int | None) – The duration the aggregate is defined and marked as good (measured in milliseconds).
duration_uncertain (int | None) – The duration the aggregate is defined and marked as uncertain (measured in milliseconds).
status_code (int | None) – The status code for the raw datapoint.
status_symbol (str | None) – The status symbol for the raw datapoint.
timezone (datetime.timezone | ZoneInfo | None) – The timezone to use when displaying the datapoint.
- dump(camel_case: bool = True, include_timezone: bool = True) dict[str, Any]
Dump the instance into a json serializable Python data type.
- Parameters
camel_case (bool) – Use camelCase for attribute names. Defaults to True.
- Returns
A dictionary representation of the instance.
- Return type
dict[str, Any]
- to_pandas(camel_case: bool = False) pandas.DataFrame
Convert the datapoint into a pandas DataFrame.
- Parameters
camel_case (bool) – Convert column names to camel case (e.g. stepInterpolation instead of step_interpolation)
- Returns
pandas.DataFrame
- Return type
pandas.DataFrame
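A small, hedged sketch (the time series id is made up) of obtaining a single Datapoint via retrieve_latest, converting it to a one-row DataFrame with to_pandas(), or dumping it to a dictionary:
>>> from cognite.client import CogniteClient >>> client = CogniteClient() >>> latest = client.time_series.data.retrieve_latest(id=1)[0] >>> single_row_df = latest.to_pandas() >>> dumped = latest.dump(camel_case=True)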
- class cognite.client.data_classes.datapoints.Datapoints(id: Optional[int] = None, external_id: Optional[str] = None, instance_id: Optional[NodeId] = None, is_string: Optional[bool] = None, is_step: Optional[bool] = None, unit: Optional[str] = None, unit_external_id: Optional[str] = None, granularity: Optional[str] = None, timestamp: Optional[Sequence[int]] = None, value: Optional[Union[SequenceNotStr[str], Sequence[float]]] = None, average: Optional[list[float]] = None, max: Optional[list[float]] = None, min: Optional[list[float]] = None, count: Optional[list[int]] = None, sum: Optional[list[float]] = None, interpolation: Optional[list[float]] = None, step_interpolation: Optional[list[float]] = None, continuous_variance: Optional[list[float]] = None, discrete_variance: Optional[list[float]] = None, total_variation: Optional[list[float]] = None, count_bad: Optional[list[int]] = None, count_good: Optional[list[int]] = None, count_uncertain: Optional[list[int]] = None, duration_bad: Optional[list[int]] = None, duration_good: Optional[list[int]] = None, duration_uncertain: Optional[list[int]] = None, status_code: Optional[list[int]] = None, status_symbol: Optional[list[str]] = None, error: Optional[list[None | str]] = None, timezone: Optional[Union[timezone, ZoneInfo]] = None)
Bases:
CogniteResource
An object representing a list of datapoints.
- Parameters
id (int | None) – Id of the time series the datapoints belong to
external_id (str | None) – External id of the time series the datapoints belong to
instance_id (NodeId | None) – The instance id of the time series the datapoints belong to
is_string (bool | None) – Whether the time series contains numerical or string data.
is_step (bool | None) – Whether the time series is stepwise or continuous.
unit (str | None) – The physical unit of the time series (free-text field). Omitted if the datapoints were converted to another unit.
unit_external_id (str | None) – The unit_external_id (as defined in the unit catalog) of the returned data points. If the datapoints were converted to a compatible unit, this will equal the converted unit, not the one defined on the time series.
granularity (str | None) – The granularity of the aggregate datapoints (does not apply to raw data)
timestamp (Sequence[int] | None) – The data timestamps in milliseconds since the epoch (Jan 1, 1970). Can be negative to define a date before 1970. Minimum timestamp is 1900.01.01 00:00:00 UTC
value (SequenceNotStr[str] | Sequence[float] | None) – The raw data values. Can be string or numeric.
average (list[float] | None) – The time-weighted average values per aggregate interval.
max (list[float] | None) – The maximum values per aggregate interval.
min (list[float] | None) – The minimum values per aggregate interval.
count (list[int] | None) – The number of raw datapoints per aggregate interval.
sum (list[float] | None) – The sum of the raw datapoints per aggregate interval.
interpolation (list[float] | None) – The interpolated values at the beginning of each aggregate interval.
step_interpolation (list[float] | None) – The interpolated values at the beginning of each aggregate interval, using stepwise interpretation.
continuous_variance (list[float] | None) – The variance of the interpolated underlying function.
discrete_variance (list[float] | None) – The variance of the datapoint values.
total_variation (list[float] | None) – The total variation of the interpolated underlying function.
count_bad (list[int] | None) – The number of raw datapoints with a bad status code, per aggregate interval.
count_good (list[int] | None) – The number of raw datapoints with a good status code, per aggregate interval.
count_uncertain (list[int] | None) – The number of raw datapoints with an uncertain status code, per aggregate interval.
duration_bad (list[int] | None) – The duration the aggregate is defined and marked as bad (measured in milliseconds).
duration_good (list[int] | None) – The duration the aggregate is defined and marked as good (measured in milliseconds).
duration_uncertain (list[int] | None) – The duration the aggregate is defined and marked as uncertain (measured in milliseconds).
status_code (list[int] | None) – The status codes for the raw datapoints.
status_symbol (list[str] | None) – The status symbols for the raw datapoints.
error (list[None | str] | None) – Human readable strings with description of what went wrong (returned by synthetic datapoints queries).
timezone (datetime.timezone | ZoneInfo | None) – The timezone to use when displaying the datapoints.
- dump(camel_case: bool = True) dict[str, Any]
Dump the datapoints into a json serializable Python data type.
- Parameters
camel_case (bool) – Use camelCase for attribute names. Defaults to True.
- Returns
A dictionary representing the instance.
- Return type
dict[str, Any]
- to_pandas(column_names: Literal['id', 'external_id', 'instance_id'] = 'instance_id', include_aggregate_name: bool = True, include_granularity_name: bool = False, include_errors: bool = False, include_status: bool = True) pandas.DataFrame
Convert the datapoints into a pandas DataFrame.
- Parameters
column_names (Literal['id', 'external_id', 'instance_id']) – Which field to use for the columns. Defaults to “instance_id”, if it exists, then uses “external_id” if available, and “id” as fallback.
include_aggregate_name (bool) – Include aggregate in the column name
include_granularity_name (bool) – Include granularity in the column name (after aggregate if present)
include_errors (bool) – For synthetic datapoint queries, include a column with errors.
include_status (bool) – Include status code and status symbol as separate columns, if available.
- Returns
The dataframe.
- Return type
pandas.DataFrame
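A short, hedged sketch (the external id is made up) of converting a retrieved Datapoints object to a DataFrame, using external ids as column names and keeping the status columns:
>>> from cognite.client import CogniteClient >>> client = CogniteClient() >>> dps = client.time_series.data.retrieve( ... external_id="foo", start="1d-ago", include_status=True, ignore_bad_datapoints=False) >>> df = dps.to_pandas(column_names="external_id", include_status=True)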
- class cognite.client.data_classes.datapoints.DatapointsArray(id: int | None = None, external_id: str | None = None, instance_id: NodeId | None = None, is_string: bool | None = None, is_step: bool | None = None, unit: str | None = None, unit_external_id: str | None = None, granularity: str | None = None, timestamp: NumpyDatetime64NSArray | None = None, value: NumpyFloat64Array | NumpyObjArray | None = None, average: NumpyFloat64Array | None = None, max: NumpyFloat64Array | None = None, min: NumpyFloat64Array | None = None, count: NumpyInt64Array | None = None, sum: NumpyFloat64Array | None = None, interpolation: NumpyFloat64Array | None = None, step_interpolation: NumpyFloat64Array | None = None, continuous_variance: NumpyFloat64Array | None = None, discrete_variance: NumpyFloat64Array | None = None, total_variation: NumpyFloat64Array | None = None, count_bad: NumpyInt64Array | None = None, count_good: NumpyInt64Array | None = None, count_uncertain: NumpyInt64Array | None = None, duration_bad: NumpyInt64Array | None = None, duration_good: NumpyInt64Array | None = None, duration_uncertain: NumpyInt64Array | None = None, status_code: NumpyUInt32Array | None = None, status_symbol: NumpyObjArray | None = None, null_timestamps: set[int] | None = None, timezone: datetime.timezone | ZoneInfo | None = None)
Bases:
CogniteResource
An object representing datapoints using numpy arrays.
- dump(camel_case: bool = True, convert_timestamps: bool = False) dict[str, Any]
Dump the DatapointsArray into a json serializable Python data type.
- Parameters
camel_case (bool) – Use camelCase for attribute names. Defaults to True.
convert_timestamps (bool) – Convert timestamps to ISO 8601 formatted strings. Default: False (returns as integer, milliseconds since epoch)
- Returns
A dictionary representing the instance.
- Return type
dict[str, Any]
- to_pandas(column_names: Literal['id', 'external_id', 'instance_id'] = 'instance_id', include_aggregate_name: bool = True, include_granularity_name: bool = False, include_status: bool = True) pandas.DataFrame
Convert the DatapointsArray into a pandas DataFrame.
- Parameters
column_names (Literal['id', 'external_id', 'instance_id']) – Which field to use for the columns. Defaults to “instance_id”, if it exists, then uses “external_id” if available, and “id” as fallback.
include_aggregate_name (bool) – Include aggregate in the column name
include_granularity_name (bool) – Include granularity in the column name (after aggregate if present)
include_status (bool) – Include status code and status symbol as separate columns, if available.
- Returns
The datapoints as a pandas DataFrame.
- Return type
pandas.DataFrame
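A brief, hedged sketch (the id is made up) showing how the numpy-backed DatapointsArray can be dumped with ISO 8601 timestamps or converted to a DataFrame:
>>> from cognite.client import CogniteClient >>> client = CogniteClient() >>> arr = client.time_series.data.retrieve_arrays(id=42, start="1d-ago") >>> dumped = arr.dump(camel_case=True, convert_timestamps=True) >>> df = arr.to_pandas(column_names="id")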
- class cognite.client.data_classes.datapoints.DatapointsArrayList(resources: Collection[Any], cognite_client: CogniteClient | None = None)
Bases: CogniteResourceList[DatapointsArray]
- concat_duplicate_ids() None
Concatenates all arrays with duplicated IDs.
Arrays with the same ids are stacked in chronological order.
Caveat: This method is not guaranteed to preserve the order of the list.
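A minimal, hedged sketch (id and time windows made up) of a query that yields duplicated ids, and then stacking those arrays chronologically:
>>> from cognite.client import CogniteClient >>> from cognite.client.data_classes import DatapointsQuery >>> client = CogniteClient() >>> dps_lst = client.time_series.data.retrieve_arrays( ... id=[DatapointsQuery(id=42, start="2w-ago", end="1w-ago"), ... DatapointsQuery(id=42, start="1w-ago", end="now")]) >>> dps_lst.concat_duplicate_ids() # arrays for id=42 are now concatenated in chronological order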
- dump(camel_case: bool = True, convert_timestamps: bool = False) list[dict[str, Any]]
Dump the instance into a json serializable Python data type.
- Parameters
camel_case (bool) – Use camelCase for attribute names. Defaults to True.
convert_timestamps (bool) – Convert timestamps to ISO 8601 formatted strings. Default: False (returns as integer, milliseconds since epoch)
- Returns
A list of dicts representing the instance.
- Return type
list[dict[str, Any]]
- get(id: Optional[int] = None, external_id: Optional[str] = None, instance_id: Optional[Union[NodeId, tuple[str, str]]] = None) cognite.client.data_classes.datapoints.DatapointsArray | list[cognite.client.data_classes.datapoints.DatapointsArray] | None
Get a specific DatapointsArray from this list by id or external_id.
Note
For duplicated time series, returns a list of DatapointsArray.
- Parameters
id (int | None) – The id of the item(s) to get.
external_id (str | None) – The external_id of the item(s) to get.
instance_id (NodeId | tuple[str, str] | None) – The instance_id of the item(s) to get.
- Returns
The requested item(s)
- Return type
DatapointsArray | list[DatapointsArray] | None
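A short, hedged sketch (ids made up) of fetching several time series and then pulling one DatapointsArray out of the list by id:
>>> from cognite.client import CogniteClient >>> client = CogniteClient() >>> dps_lst = client.time_series.data.retrieve_arrays(id=[42, 43], start="1h-ago") >>> dps_42 = dps_lst.get(id=42) # None if id=42 was not part of the result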
- to_pandas(column_names: Literal['id', 'external_id', 'instance_id'] = 'instance_id', include_aggregate_name: bool = True, include_granularity_name: bool = False, include_status: bool = True) pandas.DataFrame
Convert the DatapointsArrayList into a pandas DataFrame.
- Parameters
column_names (Literal['id', 'external_id', 'instance_id']) – Which field to use for the columns. Defaults to “instance_id”, if it exists, then uses “external_id” if available, and “id” as fallback.
include_aggregate_name (bool) – Include aggregate in the column name
include_granularity_name (bool) – Include granularity in the column name (after aggregate if present)
include_status (bool) – Include status code and status symbol as separate columns, if available.
- Returns
The datapoints as a pandas DataFrame.
- Return type
pandas.DataFrame
- class cognite.client.data_classes.datapoints.DatapointsList(resources: Collection[Any], cognite_client: CogniteClient | None = None)
Bases: CogniteResourceList[Datapoints]
- get(id: Optional[int] = None, external_id: Optional[str] = None, instance_id: Optional[Union[InstanceId, tuple[str, str]]] = None) cognite.client.data_classes.datapoints.Datapoints | list[cognite.client.data_classes.datapoints.Datapoints] | None
Get a specific Datapoints from this list by id, external_id or instance_id.
Note
For duplicated time series, returns a list of Datapoints.
- Parameters
id (int | None) – The id of the item(s) to get.
external_id (str | None) – The external_id of the item(s) to get.
instance_id (InstanceId | tuple[str, str] | None) – The instance_id of the item(s) to get.
- Returns
The requested item(s)
- Return type
Datapoints | list[Datapoints] | None
- to_pandas(column_names: Literal['id', 'external_id', 'instance_id'] = 'instance_id', include_aggregate_name: bool = True, include_granularity_name: bool = False, include_status: bool = True) pandas.DataFrame
Convert the datapoints list into a pandas DataFrame.
- Parameters
column_names (Literal['id', 'external_id', 'instance_id']) – Which field to use for the columns. Defaults to “instance_id”, if it exists, then uses “external_id” if available, and “id” as fallback.
include_aggregate_name (bool) – Include aggregate in the column name
include_granularity_name (bool) – Include granularity in the column name (after aggregate if present)
include_status (bool) – Include status code and status symbol as separate columns, if available.
- Returns
The datapoints list as a pandas DataFrame.
- Return type
pandas.DataFrame
- class cognite.client.data_classes.datapoints.DatapointsQuery(id: InitVar[int | None] = None, external_id: InitVar[str | None] = None, instance_id: InitVar[NodeId | tuple[str, str] | None] = None, start: int | str | datetime.datetime = <object object>, end: int | str | datetime.datetime = <object object>, aggregates: Aggregate | list[Aggregate] | None = <object object>, granularity: str | None = <object object>, timezone: str | datetime.timezone | ZoneInfo | None = <object object>, target_unit: str | None = <object object>, target_unit_system: str | None = <object object>, limit: int | None = <object object>, include_outside_points: bool = <object object>, ignore_unknown_ids: bool = <object object>, include_status: bool = <object object>, ignore_bad_datapoints: bool = <object object>, treat_uncertain_as_bad: bool = <object object>)
Bases:
object
Represent a user request for datapoints for a single time series
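A minimal, hedged sketch (identifier and settings made up) of using DatapointsQuery to give a time series individual settings when fetching:
>>> from cognite.client import CogniteClient >>> from cognite.client.data_classes import DatapointsQuery >>> client = CogniteClient() >>> query = DatapointsQuery( ... external_id="foo", start="30d-ago", aggregates="average", granularity="1d", limit=None) >>> dps = client.time_series.data.retrieve(external_id=query)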
- class cognite.client.data_classes.datapoints.LatestDatapointQuery(id: InitVar[int | None] = None, external_id: InitVar[str | None] = None, before: None | int | str | datetime.datetime = None, target_unit: str | None = None, target_unit_system: str | None = None, include_status: bool | None = None, ignore_bad_datapoints: bool | None = None, treat_uncertain_as_bad: bool | None = None)
Bases:
object
Parameters describing a query for the latest datapoint from a time series.
Note
Pass either ID or external ID.
- Parameters
id (Optional[int]) – The internal ID of the time series to query.
external_id (Optional[str]) – The external ID of the time series to query.
before (Union[None, int, str, datetime]) – Get latest datapoint before this time. None means ‘now’.
target_unit (str | None) – The unit_external_id of the data points returned. If the time series does not have a unit_external_id that can be converted to the target_unit, an error will be returned. Cannot be used with target_unit_system.
target_unit_system (str | None) – The unit system of the data points returned. Cannot be used with target_unit.
include_status (bool | None) – Also return the status code, an integer, for each datapoint in the response.
ignore_bad_datapoints (bool | None) – Prevent data points with a bad status code from being returned. Default: True.
treat_uncertain_as_bad (bool | None) – Treat uncertain status codes as bad. If false, treat uncertain as good. Default: True.
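For example (the id 1 is a placeholder), a LatestDatapointQuery can be passed to retrieve_latest to customize the lookup for a single time series:
>>> from cognite.client import CogniteClient
>>> from cognite.client.data_classes import LatestDatapointQuery
>>> client = CogniteClient()
>>> res = client.time_series.data.retrieve_latest(
...     id=LatestDatapointQuery(id=1, before="2h-ago"))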
- class cognite.client.data_classes.datapoints.StatusCode(value)
Bases:
IntEnum
The three main categories of status codes
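A hedged sketch (assuming the time series "my_ts" exists and that datapoints may be passed as (timestamp, value, status) tuples): the enum members can accompany values when writing datapoints with status codes.
>>> from datetime import datetime, timezone
>>> from cognite.client import CogniteClient
>>> from cognite.client.data_classes.datapoints import StatusCode
>>> client = CogniteClient()
>>> datapoints = [
...     (datetime(2024, 1, 1, tzinfo=timezone.utc), 21.5, StatusCode.Good),
...     (datetime(2024, 1, 2, tzinfo=timezone.utc), 21.7, StatusCode.Uncertain),
... ]
>>> client.time_series.data.insert(datapoints, external_id="my_ts")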
Datapoint Subscriptions
Create datapoint subscriptions
- DatapointsSubscriptionAPI.create(subscription: DataPointSubscriptionWrite) DatapointSubscription
-
Create a subscription that can be used to listen for changes in data points for a set of time series.
- Parameters
subscription (DataPointSubscriptionWrite) – Subscription to create.
- Returns
Created subscription
- Return type
DatapointSubscription
Examples
Create a subscription with explicit time series IDs:
>>> from cognite.client import CogniteClient
>>> from cognite.client.data_classes import DataPointSubscriptionWrite
>>> client = CogniteClient()
>>> sub = DataPointSubscriptionWrite(
...     external_id="my_subscription",
...     name="My subscription",
...     partition_count=1,
...     time_series_ids=["myFirstTimeSeries", "mySecondTimeSeries"])
>>> created = client.time_series.subscriptions.create(sub)
Create a subscription with explicit time series given as instance IDs (NodeId), either from CogniteTimeSeries or an extension of it:
>>> from cognite.client.data_classes import DataPointSubscriptionWrite
>>> from cognite.client.data_classes.data_modeling import NodeId
>>> sub = DataPointSubscriptionWrite(
...     external_id="my_subscription",
...     name="My subscription with Data Model Ids",
...     partition_count=1,
...     instance_ids=[NodeId("my_space", "myFirstTimeSeries"), NodeId("my_space", "mySecondTimeSeries")])
>>> created = client.time_series.subscriptions.create(sub)
Create a filter defined subscription for all numeric time series that are stepwise:
>>> from cognite.client.data_classes import DataPointSubscriptionWrite
>>> from cognite.client.data_classes import filters as flt
>>> from cognite.client.data_classes.datapoints_subscriptions import DatapointSubscriptionProperty
>>> is_numeric_stepwise = flt.And(
...     flt.Equals(DatapointSubscriptionProperty.is_string, False),
...     flt.Equals(DatapointSubscriptionProperty.is_step, True))
>>> sub = DataPointSubscriptionWrite(
...     external_id="my_subscription",
...     name="My subscription for numeric, stepwise time series",
...     partition_count=1,
...     filter=is_numeric_stepwise)
>>> created = client.time_series.subscriptions.create(sub)
Retrieve a datapoint subscription by id(s)
- DatapointsSubscriptionAPI.retrieve(external_id: str) cognite.client.data_classes.datapoints_subscriptions.DatapointSubscription | None
Retrieve one subscription by external ID.
- Parameters
external_id (str) – External ID of the subscription to retrieve.
- Returns
The requested subscription.
- Return type
DatapointSubscription | None
Examples
Retrieve a subscription by external ID:
>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> res = client.time_series.subscriptions.retrieve("my_subscription")
List datapoint subscriptions
- DatapointsSubscriptionAPI.list(limit: int | None = 25) DatapointSubscriptionList
-
- Parameters
limit (int | None) – Maximum number of subscriptions to return. Defaults to 25. Set to -1, float(“inf”) or None to return all items.
- Returns
List of requested datapoint subscriptions
- Return type
DatapointSubscriptionList
Examples
List 5 subscriptions:
>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> subscriptions = client.time_series.subscriptions.list(limit=5)
List member time series of subscription
- DatapointsSubscriptionAPI.list_member_time_series(external_id: str, limit: int | None = 25) TimeSeriesIDList
List time series in a subscription
Retrieve a list of time series (IDs) that the subscription is currently retrieving updates from
- Parameters
external_id (str) – External ID of the subscription to retrieve members of.
limit (int | None) – Maximum number of time series to return. Defaults to 25. Set to -1, float(“inf”) or None to return all items.
- Returns
List of time series in the subscription.
- Return type
TimeSeriesIDList
Examples
List time series in a subscription:
>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> members = client.time_series.subscriptions.list_member_time_series("my_subscription")
>>> timeseries_external_ids = members.as_external_ids()
Iterate over subscriptions data
- DatapointsSubscriptionAPI.iterate_data(external_id: str, start: Optional[str] = None, limit: int = 25, partition: int = 0, poll_timeout: int = 5, cursor: Optional[str] = None, include_status: bool = False, ignore_bad_datapoints: bool = True, treat_uncertain_as_bad: bool = True) Iterator[DatapointSubscriptionBatch]
Iterate over data from a given subscription.
The data returned can be both newly ingested datapoints and time ranges where data has been deleted. This endpoint also returns changes to the subscription itself, that is, when time series are added to or removed from the subscription.
Warning
This endpoint will store updates from when the subscription was created, but updates older than 7 days may be discarded.
- Parameters
external_id (str) – The external ID of the subscription.
start (str | None) – When to start the iteration. If set to None, the iteration will start from the beginning. The format is “N[timeunit]-ago”, where timeunit is w,d,h,m (week, day, hour, minute). For example, “12h-ago” will start the iteration from 12 hours ago. You can also set it to “now” to jump straight to the end. Defaults to None.
limit (int) – Approximate number of results to return across all partitions.
partition (int) – The partition to iterate over. Defaults to 0.
poll_timeout (int) – How many seconds to wait for new data, until an empty response is sent. Defaults to 5.
cursor (str | None) – Optional cursor to start iterating from.
include_status (bool) – Also return the status code, an integer, for each datapoint in the response.
ignore_bad_datapoints (bool) – Do not return bad datapoints. Default: True.
treat_uncertain_as_bad (bool) – Treat datapoints with uncertain status codes as bad. If false, treat datapoints with uncertain status codes as good. Default: True.
- Yields
DatapointSubscriptionBatch – Changes to the subscription and data in the subscribed time series.
Examples
Iterate over changes to subscription timeseries since the beginning until there is no more data:
>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> for batch in client.time_series.subscriptions.iterate_data("my_subscription"):
...     # Changes to the subscription itself:
...     print(f"Added {len(batch.subscription_changes.added)} timeseries")
...     print(f"Removed {len(batch.subscription_changes.removed)} timeseries")
...     print(f"Changed timeseries data in {len(batch.updates)} updates")
...     # Changes to datapoints for time series in the subscription:
...     for update in batch.updates:
...         update.time_series  # The time series the update belongs to
...         update.upserts      # The upserted datapoints, if any
...         update.deletes      # Ranges of deleted periods, if any
...     if not batch.has_next:
...         break
Iterate continuously over all changes to the subscription newer than 3 days:
>>> for batch in client.time_series.subscriptions.iterate_data("my_subscription", "3d-ago"):
...     pass  # do something
Update datapoint subscription
- DatapointsSubscriptionAPI.update(update: cognite.client.data_classes.datapoints_subscriptions.DataPointSubscriptionUpdate | cognite.client.data_classes.datapoints_subscriptions.DataPointSubscriptionWrite, mode: Literal['replace_ignore_null', 'patch', 'replace'] = 'replace_ignore_null') DatapointSubscription
-
Update a subscription. Note that fields that are not included in the request are not changed. Furthermore, the partition count of a subscription cannot be changed.
- Parameters
update (DataPointSubscriptionUpdate | DataPointSubscriptionWrite) – The subscription update.
mode (Literal['replace_ignore_null', 'patch', 'replace']) – How to update data when a non-update object (DataPointSubscriptionWrite) is given. With 'replace_ignore_null' (the default), only the fields you have set are used to replace existing values. With 'replace', all fields you have not specified are additionally cleared. With 'patch', only the fields you have set are updated, and for container-like fields such as metadata or labels the given values are added to the existing ones.
- Returns
Updated subscription.
- Return type
Examples
Change the name of a preexisting subscription:
>>> from cognite.client import CogniteClient
>>> from cognite.client.data_classes import DataPointSubscriptionUpdate
>>> client = CogniteClient()
>>> update = DataPointSubscriptionUpdate("my_subscription").name.set("My New Name")
>>> updated = client.time_series.subscriptions.update(update)
Add a time series to a preexisting subscription:
>>> from cognite.client.data_classes import DataPointSubscriptionUpdate
>>> update = DataPointSubscriptionUpdate("my_subscription").time_series_ids.add(["MyNewTimeSeriesExternalId"])
>>> updated = client.time_series.subscriptions.update(update)
Delete datapoint subscription
- DatapointsSubscriptionAPI.delete(external_id: Union[str, SequenceNotStr[str]], ignore_unknown_ids: bool = False) None
Delete subscription(s). This operation cannot be undone.
- Parameters
external_id (str | SequenceNotStr[str]) – External ID or list of external IDs of subscriptions to delete.
ignore_unknown_ids (bool) – Whether to ignore IDs and external IDs that are not found rather than throw an exception.
Examples
Delete a subscription by external ID:
>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> client.time_series.subscriptions.delete("my_subscription")
Datapoint Subscription classes
- class cognite.client.data_classes.datapoints_subscriptions.DataDeletion(inclusive_begin: 'int', exclusive_end: 'int | None')
Bases:
object
- cognite.client.data_classes.datapoints_subscriptions.DataPointSubscriptionCreate
alias of
DataPointSubscriptionWrite
- class cognite.client.data_classes.datapoints_subscriptions.DataPointSubscriptionUpdate(external_id: str)
Bases:
CogniteUpdate
Changes applied to datapoint subscription
- Parameters
external_id (str) – The external ID provided by the client. Must be unique for the resource type.
- class cognite.client.data_classes.datapoints_subscriptions.DataPointSubscriptionWrite(external_id: str, partition_count: int, time_series_ids: Optional[list[str]] = None, instance_ids: Optional[list[cognite.client.data_classes.data_modeling.ids.NodeId]] = None, filter: Optional[Filter] = None, name: Optional[str] = None, description: Optional[str] = None, data_set_id: Optional[int] = None)
Bases:
DatapointSubscriptionCore
- A data point subscription is a way to listen to changes to time series data points, in ingestion order.
This is the write version of a subscription, used to create new subscriptions.
A subscription can either be defined directly by a list of time series ids or indirectly by a filter.
- Parameters
external_id (str) – Externally provided ID for the subscription. Must be unique.
partition_count (int) – The maximum effective parallelism of this subscription (the number of clients that can read from it concurrently) will be limited to this number, but a higher partition count will cause a higher time overhead. The partition count must be between 1 and 100. CAVEAT: This cannot change after the subscription has been created.
time_series_ids (list[ExternalId] | None) – List of (external) ids of time series that this subscription will listen to. Not compatible with filter.
instance_ids (list[NodeId] | None) – List of instance ids of time series that this subscription will listen to. Not compatible with filter.
filter (Filter | None) – A filter DSL (Domain Specific Language) to define advanced filter queries. Not compatible with time_series_ids.
name (str | None) – No description.
description (str | None) – A summary explanation for the subscription.
data_set_id (int | None) – The id of the dataset this subscription belongs to.
- as_write() DataPointSubscriptionWrite
Returns this DataPointSubscriptionWrite instance.
- dump(camel_case: bool = True) dict[str, Any]
Dump the instance into a json serializable Python data type.
- Parameters
camel_case (bool) – Use camelCase for attribute names. Defaults to True.
- Returns
A dictionary representation of the instance.
- Return type
dict[str, Any]
- class cognite.client.data_classes.datapoints_subscriptions.DatapointSubscription(external_id: str, partition_count: int, created_time: int, last_updated_time: int, time_series_count: int, filter: Optional[Filter] = None, name: Optional[str] = None, description: Optional[str] = None, data_set_id: Optional[int] = None)
Bases:
DatapointSubscriptionCore
- A data point subscription is a way to listen to changes to time series data points, in ingestion order.
This is the read version of a subscription, used when reading subscriptions from CDF.
- Parameters
external_id (ExternalId) – Externally provided ID for the subscription. Must be unique.
partition_count (int) – The maximum effective parallelism of this subscription (the number of clients that can read from it concurrently) will be limited to this number, but a higher partition count will cause a higher time overhead.
created_time (int) – Time when the subscription was created in CDF in milliseconds since Jan 1, 1970.
last_updated_time (int) – Time when the subscription was last updated in CDF in milliseconds since Jan 1, 1970.
time_series_count (int) – The number of time series in the subscription.
filter (Filter | None) – If present, the subscription is defined by this filter.
name (str | None) – No description.
description (str | None) – A summary explanation for the subscription.
data_set_id (int | None) – The id of the dataset this subscription belongs to.
- as_write() DataPointSubscriptionWrite
Returns this DatapointSubscription as a DataPointSubscriptionWrite
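For example (assuming a subscription with external ID "my_subscription" exists), a subscription read from CDF can be converted back into its write representation:
>>> from cognite.client import CogniteClient
>>> client = CogniteClient()
>>> sub = client.time_series.subscriptions.retrieve("my_subscription")
>>> write_version = sub.as_write()  # DataPointSubscriptionWrite, e.g. for re-creating the subscription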
- class cognite.client.data_classes.datapoints_subscriptions.DatapointSubscriptionBatch(updates: 'list[DatapointsUpdate]', subscription_changes: 'SubscriptionTimeSeriesUpdate', has_next: 'bool', cursor: 'str')
Bases:
object
- class cognite.client.data_classes.datapoints_subscriptions.DatapointSubscriptionCore(external_id: str, partition_count: int, filter: cognite.client.data_classes.filters.Filter | None, name: str | None, description: str | None, data_set_id: int | None)
Bases: WriteableCogniteResource[DataPointSubscriptionWrite], ABC
- dump(camel_case: bool = True) dict[str, Any]
Dump the instance into a json serializable Python data type.
- Parameters
camel_case (bool) – Use camelCase for attribute names. Defaults to True.
- Returns
A dictionary representation of the instance.
- Return type
dict[str, Any]
- cognite.client.data_classes.datapoints_subscriptions.DatapointSubscriptionFilterProperties
alias of
DatapointSubscriptionProperty
- class cognite.client.data_classes.datapoints_subscriptions.DatapointSubscriptionList(resources: Iterable[Any], cognite_client: CogniteClient | None = None)
Bases: WriteableCogniteResourceList[DataPointSubscriptionWrite, DatapointSubscription], ExternalIDTransformerMixin
- as_write() DatapointSubscriptionWriteList
Returns this DatapointSubscriptionList as a DatapointSubscriptionWriteList
- class cognite.client.data_classes.datapoints_subscriptions.DatapointSubscriptionPartition(index: 'int', cursor: 'str | None' = None)
Bases:
object
- class cognite.client.data_classes.datapoints_subscriptions.DatapointSubscriptionProperty(value)
Bases:
EnumProperty
An enumeration.
- class cognite.client.data_classes.datapoints_subscriptions.DatapointSubscriptionWriteList(resources: Iterable[Any], cognite_client: CogniteClient | None = None)
Bases: CogniteResourceList[DataPointSubscriptionWrite], ExternalIDTransformerMixin
- class cognite.client.data_classes.datapoints_subscriptions.DatapointsUpdate(time_series: 'TimeSeriesID', upserts: 'Datapoints', deletes: 'list[DataDeletion]')
Bases:
object
- class cognite.client.data_classes.datapoints_subscriptions.SubscriptionTimeSeriesUpdate(added: 'list[TimeSeriesID]', removed: 'list[TimeSeriesID]')
Bases:
object
- class cognite.client.data_classes.datapoints_subscriptions.TimeSeriesID(id: int, external_id: Optional[str] = None, instance_id: Optional[NodeId] = None)
Bases:
CogniteResource
A TimeSeries Identifier to uniquely identify a time series.
- Parameters
id (int) – A server-generated ID for the object.
external_id (ExternalId | None) – The external ID provided by the client. Must be unique for the resource type.
instance_id (NodeId | None) – The ID of an instance in Cognite Data Models.
- dump(camel_case: bool = True) dict[str, Any]
Dump the instance into a json serializable Python data type.
- Parameters
camel_case (bool) – Use camelCase for attribute names. Defaults to True.
- Returns
A dictionary representation of the instance.
- Return type
dict[str, Any]
- class cognite.client.data_classes.datapoints_subscriptions.TimeSeriesIDList(resources: Iterable[Any], cognite_client: CogniteClient | None = None)
Bases: CogniteResourceList[TimeSeriesID], IdTransformerMixin