Fit Entity Matching Model
async AsyncCogniteClient.entity_matching.fit(
    sources: Sequence[dict | CogniteResource],
    targets: Sequence[dict | CogniteResource],
    true_matches: Sequence[dict | tuple[int | str, int | str]] | None = None,
    match_fields: dict | Sequence[tuple[str, str]] | None = None,
    feature_type: str | None = None,
    classifier: str | None = None,
    ignore_missing_fields: bool = False,
    name: str | None = None,
    description: str | None = None,
    external_id: str | None = None,
)
Fit entity matching model.
Note
All users on this CDF subscription who have the assets read-all capability and the entitymatching read-all and write-all capabilities in the project can access the data sent to this endpoint.
- Parameters:
sources (Sequence[dict | CogniteResource]) – entities to match from, should have an ‘id’ field. Tolerant to passing more than is needed or used (e.g. json dump of time series list). Metadata fields are automatically flattened to “metadata.key” entries, such that they can be used in match_fields.
targets (Sequence[dict | CogniteResource]) – entities to match to, should have an ‘id’ field. Tolerant to passing more than is needed or used.
true_matches (Sequence[dict | tuple[int | str, int | str]] | None) – Known valid matches, given as a list of dicts with the keys ‘sourceId’, ‘sourceExternalId’, ‘targetId’, ‘targetExternalId’. If omitted, an unsupervised model is used. A tuple can be used instead of a dict for convenience; each element is interpreted as an id or an externalId based on its type.
match_fields (dict | Sequence[tuple[str, str]] | None) – List of (from, to) keys to use in matching. Default in the API is [(‘name’, ‘name’)]. Also accepts {“source”: .., “target”: ..}.
feature_type (str | None) – feature type that defines the combination of features used, see API docs for details.
classifier (str | None) – classifier used in training.
ignore_missing_fields (bool) – whether missing data in match_fields should return an error or be filled in with an empty string.
name (str | None) – Optional user-defined name of model.
description (str | None) – Optional user-defined description of model.
external_id (str | None) – Optional external id. Must be unique within the project.
- Returns:
Resulting queued model.
- Return type:
Example
>>> from cognite.client import CogniteClient, AsyncCogniteClient
>>> client = CogniteClient()
>>> # async_client = AsyncCogniteClient()  # another option
>>> sources = [
...     {"id": 101, "name": "ChildAsset1", "description": "Child of ParentAsset1"}
... ]
>>> targets = [{"id": 1, "name": "ParentAsset1", "description": "Parent to ChildAsset1"}]
>>> true_matches = [(101, 1)]  # (source id, target id)
>>> model = client.entity_matching.fit(
...     sources=sources,
...     targets=targets,
...     true_matches=true_matches,
...     description="AssetMatchingJob1",
... )
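
The parameter descriptions mention two conversions that are easier to follow with concrete data: metadata being flattened to “metadata.key” entries, and tuples in true_matches being read as id or externalId depending on type. The following is a minimal, offline sketch of what those conversions could look like. The helper names `flatten_metadata` and `true_match_to_dict` are hypothetical and not part of the SDK, and the sketch assumes that a tuple is ordered (source, target).

```python
from __future__ import annotations

from typing import Any


def flatten_metadata(entity: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: lift nested metadata to 'metadata.<key>' entries."""
    # Copy every top-level field except the nested metadata dict itself.
    flat = {key: value for key, value in entity.items() if key != "metadata"}
    # Each metadata entry becomes a top-level "metadata.<key>" field, which is
    # the form the docs say can be referenced from match_fields.
    for key, value in (entity.get("metadata") or {}).items():
        flat[f"metadata.{key}"] = value
    return flat


def true_match_to_dict(
    match: dict[str, int | str] | tuple[int | str, int | str],
) -> dict[str, int | str]:
    """Hypothetical helper: expand a tuple into the dict form of a true match."""
    if isinstance(match, dict):
        return dict(match)  # already in dict form, return a copy
    # Assumption: the tuple is ordered (source, target).
    result: dict[str, int | str] = {}
    for prefix, value in zip(("source", "target"), match):
        # Strings are treated as external ids, integers as internal ids.
        suffix = "ExternalId" if isinstance(value, str) else "Id"
        result[prefix + suffix] = value
    return result


source = {"id": 101, "name": "ChildAsset1", "metadata": {"area": "North"}}
print(flatten_metadata(source))
print(true_match_to_dict((101, 1)))
print(true_match_to_dict(("child-asset-1", 1)))
```

Under these assumptions, a flattened key such as “metadata.area” could then be named in match_fields, and a mixed tuple such as ("child-asset-1", 1) would pair a source external id with a target internal id.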