Smart Datasets (🧪Beta)
albert.resources.smart_datasets
Attributes:
| Name | Type | Description |
|---|---|---|
| `SmartDatasetVariable` | | |
SmartDatasetVariable
SmartDatasetVariable = Annotated[
    MaterialAmountVariable
    | ParameterVariable
    | MoleculeVariable
    | PropertyVariable,
    Field(discriminator="type"),
]
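
`Field(discriminator="type")` means the union member is chosen by reading the `type` value of each payload. The sketch below illustrates that dispatch with the standard library only. The dataclasses are simplified stand-ins rather than the SDK classes, and `parse_variable` is a hypothetical helper; only the four type tags are taken from the JSON schemas on this page.

```python
from dataclasses import dataclass


@dataclass
class Variable:
    """Simplified stand-in for the shared variable fields."""
    key: str
    name: str


class MaterialAmount(Variable):
    """Stand-in for MaterialAmountVariable."""


class Parameter(Variable):
    """Stand-in for ParameterVariable."""


class Molecule(Variable):
    """Stand-in for MoleculeVariable."""


class Property(Variable):
    """Stand-in for PropertyVariable."""


# Type tags as they appear in the "const" entries of the JSON schemas.
_BY_TYPE = {
    "material_amount": MaterialAmount,
    "parameter": Parameter,
    "molecule": Molecule,
    "property": Property,
}


def parse_variable(payload: dict) -> Variable:
    """Pick the stand-in class from the payload's `type` tag (hypothetical helper)."""
    tag = payload.get("type")
    if tag not in _BY_TYPE:
        raise ValueError(f"unknown variable type: {tag!r}")
    return _BY_TYPE[tag](key=payload["key"], name=payload["name"])


var = parse_variable({"type": "molecule", "key": "m1", "name": "Solvent"})
print(type(var).__name__)  # prints "Molecule"
```

A payload whose `type` is not one of the four tags is rejected here with `ValueError`, which is the behavior this sketch chooses for an unknown tag.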
SmartDatasetBuildState
SmartDatasetScope
Bases: BaseAlbertModel
Represents the scope of a smart dataset.
Attributes:
| Name | Type | Description |
|---|---|---|
| `project_ids` | `list[ProjectId]` | List of project IDs. |
| `target_ids` | `list[TargetId]` | List of target IDs. |
| `sheet_ids` | `list[WorksheetId] \| None` | List of worksheet IDs. If None, all worksheets in the projects will be used. |
| `target_parent_ids` | `dict[TargetId, ProjectId] \| None` | Optional mapping from target ID to a parent project ID. When set, the target inherits its ACL policy from the referenced project. |

Show JSON schema:
{
"description": "Represents the scope of a smart dataset.\n\nAttributes\n----------\nproject_ids : list[ProjectId]\n List of project IDs.\ntarget_ids : list[TargetId]\n List of target IDs.\nsheet_ids : list[WorksheetId] | None\n List of worksheet IDs. If None, all worksheets in the projects will be used.\ntarget_parent_ids : dict[TargetId, ProjectId] | None\n Optional mapping from target ID to a parent project ID. When set, the target\n inherits its ACL policy from the referenced project.",
"properties": {
"projectIds": {
"items": {
"type": "string"
},
"title": "Projectids",
"type": "array"
},
"targetIds": {
"items": {
"type": "string"
},
"title": "Targetids",
"type": "array"
},
"sheetIds": {
"anyOf": [
{
"items": {
"type": "string"
},
"type": "array"
},
{
"type": "null"
}
],
"default": null,
"title": "Sheetids"
},
"targetParentIds": {
"anyOf": [
{
"additionalProperties": {
"type": "string"
},
"type": "object"
},
{
"type": "null"
}
],
"title": "Targetparentids"
}
},
"title": "SmartDatasetScope",
"type": "object"
}
Fields:
- `project_ids` (`list[ProjectId]`)
- `target_ids` (`list[TargetId]`)
- `sheet_ids` (`list[WorksheetId] | None`)
- `target_parent_ids` (`dict[TargetId, ProjectId] | None`)
Validators:
filter_invalid_sheet_ids
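
The JSON schema above uses camelCase property names (`projectIds`, `targetIds`, `sheetIds`, `targetParentIds`) for the snake_case Python attributes. As a minimal sketch of that shape, the hypothetical helper `scope_payload` below builds a plain dict with the schema's keys. It does not use the SDK, and the ID strings are placeholders, not real identifiers.

```python
import json


def scope_payload(project_ids, target_ids, sheet_ids=None, target_parent_ids=None):
    """Build a dict shaped like the SmartDatasetScope JSON schema (hypothetical helper)."""
    return {
        "projectIds": list(project_ids),
        "targetIds": list(target_ids),
        # None is kept as-is; per the attribute description it means "all worksheets".
        "sheetIds": None if sheet_ids is None else list(sheet_ids),
        "targetParentIds": None if target_parent_ids is None else dict(target_parent_ids),
    }


# Placeholder IDs for illustration only.
payload = scope_payload(
    project_ids=["PROJECT-PLACEHOLDER"],
    target_ids=["TARGET-PLACEHOLDER"],
    target_parent_ids={"TARGET-PLACEHOLDER": "PROJECT-PLACEHOLDER"},
)
print(json.dumps(payload, indent=2))
```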
SmartDataset
Bases: BaseResource
Represents a smart dataset entity.
Attributes:
| Name | Type | Description |
|---|---|---|
| `id` | `SmartDatasetId \| None` | The unique identifier of the smart dataset. |
| `parent_id` | `ProjectId \| None` | The ID of the parent project this smart dataset belongs to. When set, the smart dataset inherits its ACL policy from the referenced project. |
| `scope` | `SmartDatasetScope \| None` | The dataset scope containing project, target, and sheet IDs. |
| `schema_` | `dict \| None` | The dataset schema. |
| `storage_key` | `str \| None` | The storage key for the dataset. |

SmartDatasetAggregateBy
The aggregation level for smart dataset experiment data.
Methods:
| Name | Description |
|---|---|
| `to_api_value` | |
| `from_api_value` | |

Attributes:
| Name | Type | Description |
|---|---|---|
| `INV` | | |
| `LOT` | | |
| `WFL` | | |
| `PTD` | | |

to_api_value
to_api_value() -> str
from_api_value
from_api_value(value: str) -> SmartDatasetAggregateBy
Source code in src/albert/resources/smart_datasets.py
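
The `SmartDatasetData` schema further down lists the enum values for this type as `"inv"`, `"lot"`, `"wfl"`, and `"ptd"`. The sketch below is a stand-in enum with those values, showing how a member can be looked up from its string value. It is not the SDK class, and it makes no claim about what `to_api_value` and `from_api_value` actually map to, since their implementation is not shown on this page.

```python
from enum import Enum


class AggregateBy(str, Enum):
    """Stand-in mirroring the enum values in the JSON schema."""
    INV = "inv"
    LOT = "lot"
    WFL = "wfl"
    PTD = "ptd"


# Look a member up by its string value, then read both name and value.
level = AggregateBy("lot")
print(level.name, level.value)  # prints "LOT lot"
```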
SmartDatasetVariableDataType
SmartDatasetRecordIdentifier
Bases: BaseAlbertModel
An identifier for a record in a smart dataset experiment data matrix.
The same shape is used across all aggregation levels (inventory, material, experiment, measurement); fields that don't apply at a given level are left unset.
Attributes:
| Name | Type | Description |
|---|---|---|
| `type` | `str` | The identifier type (e.g., `albert_inventory`, `albert_material`). |
| `inventory_id` | `str` | The inventory ID of the record. |
| `key` | `str \| None` | The unique key of the identifier. |
| `lot_id` | `str \| None` | The lot ID, if applicable. |
| `workflow_interval` | `str \| None` | The workflow interval, if applicable. |
| `task_id` | `str \| None` | The task ID, if applicable. |
| `property_data_id` | `str \| None` | The property data ID, if applicable. |

Show JSON schema:
{
"description": "An identifier for a record in a smart dataset experiment data matrix.\n\nThe same shape is used across all aggregation levels (inventory, material,\nexperiment, measurement); fields that don't apply at a given level are left\nunset.\n\nAttributes\n----------\ntype : str\n The identifier type (e.g., ``albert_inventory``, ``albert_material``).\ninventory_id : str\n The inventory ID of the record.\nkey : str | None\n The unique key of the identifier.\nlot_id : str | None\n The lot ID, if applicable.\nworkflow_interval : str | None\n The workflow interval, if applicable.\ntask_id : str | None\n The task ID, if applicable.\nproperty_data_id : str | None\n The property data ID, if applicable.",
"properties": {
"type": {
"title": "Type",
"type": "string"
},
"inventory_id": {
"title": "Inventory Id",
"type": "string"
},
"key": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Key"
},
"lot_id": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Lot Id"
},
"workflow_interval": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Workflow Interval"
},
"task_id": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Task Id"
},
"property_data_id": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Property Data Id"
}
},
"required": [
"type",
"inventory_id"
],
"title": "SmartDatasetRecordIdentifier",
"type": "object"
}
Fields:
- `type` (`str`)
- `inventory_id` (`str`)
- `key` (`str | None`)
- `lot_id` (`str | None`)
- `workflow_interval` (`str | None`)
- `task_id` (`str | None`)
- `property_data_id` (`str | None`)
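
To make the "same shape at every level" idea concrete, the sketch below uses a stand-in dataclass (not the SDK model) with the two required fields and the five optional ones defaulting to `None`. The ID strings are placeholders, and which `type` string belongs to which aggregation level is not stated on this page, so the same example value is reused.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class RecordIdentifier:
    """Stand-in with the same field names as SmartDatasetRecordIdentifier."""
    type: str
    inventory_id: str
    key: Optional[str] = None
    lot_id: Optional[str] = None
    workflow_interval: Optional[str] = None
    task_id: Optional[str] = None
    property_data_id: Optional[str] = None


def set_fields(identifier: RecordIdentifier) -> dict:
    """Return only the fields that are actually set (hypothetical helper)."""
    return {k: v for k, v in asdict(identifier).items() if v is not None}


# Coarser level: only the required fields are filled in.
coarse = RecordIdentifier(type="albert_inventory", inventory_id="INVENTORY-PLACEHOLDER")

# Finer level: the same shape with one more field set.
finer = RecordIdentifier(
    type="albert_inventory",
    inventory_id="INVENTORY-PLACEHOLDER",
    lot_id="LOT-PLACEHOLDER",
)

print(sorted(set_fields(coarse)))  # prints "['inventory_id', 'type']"
print(sorted(set_fields(finer)))   # prints "['inventory_id', 'lot_id', 'type']"
```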
MaterialAmountVariable
Bases: _BaseVariable
A material amount variable.
Show JSON schema:
{
"description": "A material amount variable.",
"properties": {
"key": {
"title": "Key",
"type": "string"
},
"name": {
"title": "Name",
"type": "string"
},
"type": {
"const": "material_amount",
"default": "material_amount",
"title": "Type",
"type": "string"
},
"data_type": {
"const": "numeric",
"default": "numeric",
"title": "Data Type",
"type": "string"
}
},
"required": [
"key",
"name"
],
"title": "MaterialAmountVariable",
"type": "object"
}
ParameterVariable
Bases: _BaseVariable
A parameter variable.
Show JSON schema:
{
"$defs": {
"SmartDatasetVariableDataType": {
"description": "The data type of a smart dataset variable.",
"enum": [
"numeric",
"categorical",
"molecular",
"boolean"
],
"title": "SmartDatasetVariableDataType",
"type": "string"
}
},
"description": "A parameter variable.",
"properties": {
"key": {
"title": "Key",
"type": "string"
},
"name": {
"title": "Name",
"type": "string"
},
"type": {
"const": "parameter",
"default": "parameter",
"title": "Type",
"type": "string"
},
"data_type": {
"$ref": "#/$defs/SmartDatasetVariableDataType"
},
"sources": {
"items": {
"enum": [
"property",
"batch",
"process_design"
],
"type": "string"
},
"title": "Sources",
"type": "array"
}
},
"required": [
"key",
"name",
"data_type"
],
"title": "ParameterVariable",
"type": "object"
}
Fields:
- `key` (`str`)
- `name` (`str`)
- `type` (`Literal['parameter']`)
- `data_type` (`SmartDatasetVariableDataType`)
- `sources` (`list[Literal['property', 'batch', 'process_design']]`)
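
The constraints in the `ParameterVariable` schema (required fields, the `data_type` enum, and the allowed `sources` values) can be checked by hand as in the sketch below. `check_parameter_variable` is a hypothetical helper written for illustration and is not part of the SDK, where the Pydantic model performs this validation itself.

```python
# Allowed values copied from the JSON schema above.
DATA_TYPES = {"numeric", "categorical", "molecular", "boolean"}
SOURCES = {"property", "batch", "process_design"}
REQUIRED = ("key", "name", "data_type")


def check_parameter_variable(payload: dict) -> list[str]:
    """Return a list of schema problems; an empty list means the payload fits."""
    problems = [f"missing required field: {f}" for f in REQUIRED if f not in payload]
    # `type` has a default in the schema, so it may be omitted.
    if payload.get("type", "parameter") != "parameter":
        problems.append("type must be 'parameter'")
    if "data_type" in payload and payload["data_type"] not in DATA_TYPES:
        problems.append(f"unknown data_type: {payload['data_type']!r}")
    bad_sources = [s for s in payload.get("sources", []) if s not in SOURCES]
    if bad_sources:
        problems.append(f"unknown sources: {bad_sources}")
    return problems


ok = {"key": "p1", "name": "Temperature", "data_type": "numeric", "sources": ["batch"]}
bad = {"key": "p2", "name": "Speed", "data_type": "text", "sources": ["sheet"]}

print(check_parameter_variable(ok))   # prints "[]"
print(check_parameter_variable(bad))
```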
MoleculeVariable
Bases: _BaseVariable
A molecule variable.
Show JSON schema:
{
"description": "A molecule variable.",
"properties": {
"key": {
"title": "Key",
"type": "string"
},
"name": {
"title": "Name",
"type": "string"
},
"type": {
"const": "molecule",
"default": "molecule",
"title": "Type",
"type": "string"
},
"data_type": {
"const": "molecular",
"default": "molecular",
"title": "Data Type",
"type": "string"
}
},
"required": [
"key",
"name"
],
"title": "MoleculeVariable",
"type": "object"
}
PropertyVariable
Bases: _BaseVariable
A property variable.
Show JSON schema:
{
"$defs": {
"SmartDatasetVariableDataType": {
"description": "The data type of a smart dataset variable.",
"enum": [
"numeric",
"categorical",
"molecular",
"boolean"
],
"title": "SmartDatasetVariableDataType",
"type": "string"
}
},
"description": "A property variable.",
"properties": {
"key": {
"title": "Key",
"type": "string"
},
"name": {
"title": "Name",
"type": "string"
},
"type": {
"const": "property",
"default": "property",
"title": "Type",
"type": "string"
},
"data_type": {
"$ref": "#/$defs/SmartDatasetVariableDataType"
}
},
"required": [
"key",
"name",
"data_type"
],
"title": "PropertyVariable",
"type": "object"
}
SmartDatasetData
Bases: BaseAlbertModel
The experiment data matrix for a smart dataset.
Attributes:
| Name | Type | Description |
|---|---|---|
| `aggregate_by` | `SmartDatasetAggregateBy` | The aggregation level of the returned data. |
| `identifiers` | `list[SmartDatasetRecordIdentifier]` | The identifier metadata for each row index entry. |
| `variables` | `list[SmartDatasetVariable]` | The variable metadata for each column entry. |
| `data` | `OrientTightDataFrame` | The experiment data values. |
| `uncertainty` | `OrientTightDataFrame \| None` | The associated uncertainty values, if available. |
| `counts` | `OrientTightDataFrame \| None` | The associated observation counts, if available. |

Show JSON schema:
{
"$defs": {
"MaterialAmountVariable": {
"description": "A material amount variable.",
"properties": {
"key": {
"title": "Key",
"type": "string"
},
"name": {
"title": "Name",
"type": "string"
},
"type": {
"const": "material_amount",
"default": "material_amount",
"title": "Type",
"type": "string"
},
"data_type": {
"const": "numeric",
"default": "numeric",
"title": "Data Type",
"type": "string"
}
},
"required": [
"key",
"name"
],
"title": "MaterialAmountVariable",
"type": "object"
},
"MoleculeVariable": {
"description": "A molecule variable.",
"properties": {
"key": {
"title": "Key",
"type": "string"
},
"name": {
"title": "Name",
"type": "string"
},
"type": {
"const": "molecule",
"default": "molecule",
"title": "Type",
"type": "string"
},
"data_type": {
"const": "molecular",
"default": "molecular",
"title": "Data Type",
"type": "string"
}
},
"required": [
"key",
"name"
],
"title": "MoleculeVariable",
"type": "object"
},
"ParameterVariable": {
"description": "A parameter variable.",
"properties": {
"key": {
"title": "Key",
"type": "string"
},
"name": {
"title": "Name",
"type": "string"
},
"type": {
"const": "parameter",
"default": "parameter",
"title": "Type",
"type": "string"
},
"data_type": {
"$ref": "#/$defs/SmartDatasetVariableDataType"
},
"sources": {
"items": {
"enum": [
"property",
"batch",
"process_design"
],
"type": "string"
},
"title": "Sources",
"type": "array"
}
},
"required": [
"key",
"name",
"data_type"
],
"title": "ParameterVariable",
"type": "object"
},
"PropertyVariable": {
"description": "A property variable.",
"properties": {
"key": {
"title": "Key",
"type": "string"
},
"name": {
"title": "Name",
"type": "string"
},
"type": {
"const": "property",
"default": "property",
"title": "Type",
"type": "string"
},
"data_type": {
"$ref": "#/$defs/SmartDatasetVariableDataType"
}
},
"required": [
"key",
"name",
"data_type"
],
"title": "PropertyVariable",
"type": "object"
},
"SmartDatasetAggregateBy": {
"description": "The aggregation level for smart dataset experiment data.",
"enum": [
"inv",
"lot",
"wfl",
"ptd"
],
"title": "SmartDatasetAggregateBy",
"type": "string"
},
"SmartDatasetRecordIdentifier": {
"description": "An identifier for a record in a smart dataset experiment data matrix.\n\nThe same shape is used across all aggregation levels (inventory, material,\nexperiment, measurement); fields that don't apply at a given level are left\nunset.\n\nAttributes\n----------\ntype : str\n The identifier type (e.g., ``albert_inventory``, ``albert_material``).\ninventory_id : str\n The inventory ID of the record.\nkey : str | None\n The unique key of the identifier.\nlot_id : str | None\n The lot ID, if applicable.\nworkflow_interval : str | None\n The workflow interval, if applicable.\ntask_id : str | None\n The task ID, if applicable.\nproperty_data_id : str | None\n The property data ID, if applicable.",
"properties": {
"type": {
"title": "Type",
"type": "string"
},
"inventory_id": {
"title": "Inventory Id",
"type": "string"
},
"key": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Key"
},
"lot_id": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Lot Id"
},
"workflow_interval": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Workflow Interval"
},
"task_id": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Task Id"
},
"property_data_id": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Property Data Id"
}
},
"required": [
"type",
"inventory_id"
],
"title": "SmartDatasetRecordIdentifier",
"type": "object"
},
"SmartDatasetVariableDataType": {
"description": "The data type of a smart dataset variable.",
"enum": [
"numeric",
"categorical",
"molecular",
"boolean"
],
"title": "SmartDatasetVariableDataType",
"type": "string"
}
},
"description": "The experiment data matrix for a smart dataset.\n\nAttributes\n----------\naggregate_by : SmartDatasetAggregateBy\n The aggregation level of the returned data.\nidentifiers : list[SmartDatasetRecordIdentifier]\n The identifier metadata for each row index entry.\nvariables : list[SmartDatasetVariable]\n The variable metadata for each column entry.\ndata : OrientTightDataFrame\n The experiment data values.\nuncertainty : OrientTightDataFrame | None\n The associated uncertainty values, if available.\ncounts : OrientTightDataFrame | None\n The associated observation counts, if available.",
"properties": {
"aggregate_by": {
"$ref": "#/$defs/SmartDatasetAggregateBy"
},
"identifiers": {
"items": {
"$ref": "#/$defs/SmartDatasetRecordIdentifier"
},
"title": "Identifiers",
"type": "array"
},
"variables": {
"items": {
"discriminator": {
"mapping": {
"material_amount": "#/$defs/MaterialAmountVariable",
"molecule": "#/$defs/MoleculeVariable",
"parameter": "#/$defs/ParameterVariable",
"property": "#/$defs/PropertyVariable"
},
"propertyName": "type"
},
"oneOf": [
{
"$ref": "#/$defs/MaterialAmountVariable"
},
{
"$ref": "#/$defs/ParameterVariable"
},
{
"$ref": "#/$defs/MoleculeVariable"
},
{
"$ref": "#/$defs/PropertyVariable"
}
]
},
"title": "Variables",
"type": "array"
},
"data": {
"title": "Data",
"type": "object"
},
"uncertainty": {
"anyOf": [
{
"type": "object"
},
{
"type": "null"
}
],
"default": null,
"title": "Uncertainty"
},
"counts": {
"anyOf": [
{
"type": "object"
},
{
"type": "null"
}
],
"default": null,
"title": "Counts"
}
},
"required": [
"aggregate_by",
"data"
],
"title": "SmartDatasetData",
"type": "object"
}
Fields:
- `aggregate_by` (`SmartDatasetAggregateBy`)
- `identifiers` (`list[SmartDatasetRecordIdentifier]`)
- `variables` (`list[SmartDatasetVariable]`)
- `data` (`OrientTightDataFrame`)
- `uncertainty` (`OrientTightDataFrame | None`)
- `counts` (`OrientTightDataFrame | None`)
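
The schema describes `data`, `uncertainty`, and `counts` only as JSON objects. The sketch below assumes, from the name `OrientTightDataFrame`, that the serialized form follows the layout pandas produces with `DataFrame.to_dict(orient="tight")` (keys `index`, `columns`, `data`, `index_names`, `column_names`). That is an assumption, not something this page confirms. `tight_to_rows` is a hypothetical helper, and the row and column labels are placeholders.

```python
def tight_to_rows(tight: dict) -> list[dict]:
    """Turn an assumed tight-orient dict into one dict per row (hypothetical helper)."""
    columns = tight["columns"]
    return [
        {"index": idx, **dict(zip(columns, values))}
        for idx, values in zip(tight["index"], tight["data"])
    ]


# Placeholder payload in the assumed tight layout: 2 rows x 2 columns.
tight = {
    "index": ["row-0", "row-1"],
    "columns": ["var-a", "var-b"],
    "data": [[1.5, 2.0], [3.0, None]],
    "index_names": [None],
    "column_names": [None],
}

rows = tight_to_rows(tight)
print(rows[1])  # prints "{'index': 'row-1', 'var-a': 3.0, 'var-b': None}"
```

If the assumption holds and pandas is installed, `pandas.DataFrame.from_dict(tight, orient="tight")` would rebuild the frame directly. The `identifiers` and `variables` lists would then describe its rows and columns respectively, per the attribute table above.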