example
startorch.example ¶
Contain example generators.
startorch.example.BaseExampleGenerator ¶
Bases: Generic[T], ABC
Define the base class to generate examples.
Example usage:
>>> from startorch.example import HypercubeClassification
>>> generator = HypercubeClassification(num_classes=5, feature_size=6)
>>> generator
HypercubeClassificationExampleGenerator(num_classes=5, feature_size=6, noise_std=0.2)
>>> batch = generator.generate(batch_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
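A custom generator can be written by subclassing BaseExampleGenerator and implementing generate. The sketch below assumes generate is the only abstract method; the class ConstantExampleGenerator is made up for illustration and is not part of the library.
>>> import torch
>>> from startorch.example import BaseExampleGenerator
>>> class ConstantExampleGenerator(BaseExampleGenerator):
...     r"""Generate batches where every feature value is a constant."""
...     def __init__(self, value=1.0, feature_size=4):
...         self._value = value
...         self._feature_size = feature_size
...     def generate(self, batch_size=1, rng=None):
...         # rng is ignored because the output is deterministic
...         return {"feature": torch.full((batch_size, self._feature_size), self._value)}
...
>>> batch = ConstantExampleGenerator().generate(batch_size=2)
>>> batch["feature"].shape
torch.Size([2, 4])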
startorch.example.BaseExampleGenerator.generate
abstractmethod
¶
generate(
batch_size: int = 1, rng: Generator | None = None
) -> dict[Hashable, Tensor]
Generate a batch of examples.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
batch_size | int | The batch size. | 1 |
rng | Generator \| None | An optional random number generator. | None |
Returns:
Type | Description |
---|---|
dict[Hashable, Tensor] | A batch of examples. |
Example usage:
>>> from startorch.example import HypercubeClassification
>>> generator = HypercubeClassification(num_classes=5, feature_size=6)
>>> batch = generator.generate(batch_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
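Passing a seeded torch.Generator as rng makes it possible to reproduce a batch. The check below is a sketch and assumes the generator routes all of its sampling through rng:
>>> import torch
>>> from startorch.example import HypercubeClassification
>>> generator = HypercubeClassification(num_classes=5, feature_size=6)
>>> batch1 = generator.generate(batch_size=4, rng=torch.Generator().manual_seed(42))
>>> batch2 = generator.generate(batch_size=4, rng=torch.Generator().manual_seed(42))
>>> torch.equal(batch1["feature"], batch2["feature"])
True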
startorch.example.BlobsClassification ¶
Bases: BaseExampleGenerator
Implement a classification example generator where the data are generated from isotropic Gaussian blobs.
The implementation is based on https://scikit-learn.org/stable/modules/generated/sklearn.datasets.make_blobs.html
Parameters:
Name | Type | Description | Default |
---|---|---|---|
centers | Tensor | The cluster centers used to generate the examples. It must be a float tensor of shape (num_clusters, feature_size). | required |
cluster_std | Tensor \| float | The standard deviation of the clusters. It must be a float tensor of shape (num_clusters, feature_size). | 1.0 |
Raises:
Type | Description |
---|---|
TypeError | if one of the parameters has an invalid type. |
RuntimeError | if one of the parameters is not valid. |
Example usage:
>>> import torch
>>> from startorch.example import BlobsClassification
>>> generator = BlobsClassification(torch.rand(5, 4))
>>> generator
BlobsClassificationExampleGenerator(num_clusters=5, feature_size=4)
>>> batch = generator.generate(batch_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
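cluster_std controls how spread out each blob is and accepts a float or a tensor. A short sketch, assuming the keyword arguments match the parameter names above:
>>> import torch
>>> from startorch.example import BlobsClassification
>>> generator = BlobsClassification(centers=torch.rand(5, 4) * 10.0, cluster_std=0.5)
>>> generator.num_clusters, generator.feature_size
(5, 4)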
startorch.example.BlobsClassification.centers
property
¶
centers: Tensor
torch.Tensor of type float and shape (num_clusters, feature_size): The cluster centers.
startorch.example.BlobsClassification.cluster_std
property
¶
cluster_std: Tensor
torch.Tensor of type float and shape (num_clusters, feature_size): The standard deviation for each cluster.
startorch.example.BlobsClassification.feature_size
property
¶
feature_size: int
The feature size i.e. the number of features.
startorch.example.BlobsClassification.num_clusters
property
¶
num_clusters: int
The number of clusters i.e. categories.
startorch.example.BlobsClassification.create_uniform_centers
classmethod
¶
create_uniform_centers(
num_clusters: int = 3,
feature_size: int = 2,
random_seed: int = 17532042831661189422,
) -> BlobsClassificationExampleGenerator
Instantiate a BlobsClassificationExampleGenerator
where
the centers are sampled from a uniform distribution.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
num_clusters | int | The number of clusters. | 3 |
feature_size | int | The feature size. | 2 |
random_seed | int | The random seed used to generate the cluster centers. | 17532042831661189422 |
Returns:
Type | Description |
---|---|
BlobsClassificationExampleGenerator | An instantiated example generator. |
Example usage:
>>> from startorch.example import BlobsClassification
>>> generator = BlobsClassification.create_uniform_centers()
>>> generator
BlobsClassificationExampleGenerator(num_clusters=3, feature_size=2)
>>> batch = generator.generate(batch_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
startorch.example.BlobsClassificationExampleGenerator ¶
Bases: BaseExampleGenerator
Implement a classification example generator where the data are generated from isotropic Gaussian blobs.
The implementation is based on https://scikit-learn.org/stable/modules/generated/sklearn.datasets.make_blobs.html
Parameters:
Name | Type | Description | Default |
---|---|---|---|
centers | Tensor | The cluster centers used to generate the examples. It must be a float tensor of shape (num_clusters, feature_size). | required |
cluster_std | Tensor \| float | The standard deviation of the clusters. It must be a float tensor of shape (num_clusters, feature_size). | 1.0 |
Raises:
Type | Description |
---|---|
TypeError | if one of the parameters has an invalid type. |
RuntimeError | if one of the parameters is not valid. |
Example usage:
>>> import torch
>>> from startorch.example import BlobsClassification
>>> generator = BlobsClassification(torch.rand(5, 4))
>>> generator
BlobsClassificationExampleGenerator(num_clusters=5, feature_size=4)
>>> batch = generator.generate(batch_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
startorch.example.BlobsClassificationExampleGenerator.centers
property
¶
centers: Tensor
torch.Tensor of type float and shape (num_clusters, feature_size): The cluster centers.
startorch.example.BlobsClassificationExampleGenerator.cluster_std
property
¶
cluster_std: Tensor
torch.Tensor of type float and shape (num_clusters, feature_size): The standard deviation for each cluster.
startorch.example.BlobsClassificationExampleGenerator.feature_size
property
¶
feature_size: int
The feature size i.e. the number of features.
startorch.example.BlobsClassificationExampleGenerator.num_clusters
property
¶
num_clusters: int
The number of clusters i.e. categories.
startorch.example.BlobsClassificationExampleGenerator.create_uniform_centers
classmethod
¶
create_uniform_centers(
num_clusters: int = 3,
feature_size: int = 2,
random_seed: int = 17532042831661189422,
) -> BlobsClassificationExampleGenerator
Instantiate a BlobsClassificationExampleGenerator
where
the centers are sampled from a uniform distribution.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
num_clusters | int | The number of clusters. | 3 |
feature_size | int | The feature size. | 2 |
random_seed | int | The random seed used to generate the cluster centers. | 17532042831661189422 |
Returns:
Type | Description |
---|---|
BlobsClassificationExampleGenerator | An instantiated example generator. |
Example usage:
>>> from startorch.example import BlobsClassification
>>> generator = BlobsClassification.create_uniform_centers()
>>> generator
BlobsClassificationExampleGenerator(num_clusters=3, feature_size=2)
>>> batch = generator.generate(batch_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
startorch.example.Cache ¶
Bases: BaseExampleGenerator
Implement an example generator that caches the last batch and returns it every time a batch is generated.
A new batch is generated only if the batch size changes.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
generator | BaseExampleGenerator \| dict | The example generator or its configuration. | required |
deepcopy | bool | If True, the cached batch is deep-copied when it is returned. | False |
Example usage:
>>> from startorch.example import Cache, SwissRoll
>>> generator = Cache(SwissRoll())
>>> generator
CacheExampleGenerator(
(generator): SwissRollExampleGenerator(noise_std=0.0, spin=1.5, hole=False)
(deepcopy): False
)
>>> batch = generator.generate(batch_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
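Because the batch is cached, two consecutive calls with the same batch size should return the same data. A quick sketch of that behavior with deepcopy=False:
>>> import torch
>>> from startorch.example import Cache, SwissRoll
>>> generator = Cache(SwissRoll())
>>> batch1 = generator.generate(batch_size=10)
>>> batch2 = generator.generate(batch_size=10)
>>> torch.equal(batch1["feature"], batch2["feature"])
True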
startorch.example.CacheExampleGenerator ¶
Bases: BaseExampleGenerator
Implement an example generator that caches the last batch and returns it every time a batch is generated.
A new batch is generated only if the batch size changes.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
generator | BaseExampleGenerator \| dict | The example generator or its configuration. | required |
deepcopy | bool | If True, the cached batch is deep-copied when it is returned. | False |
Example usage:
>>> from startorch.example import Cache, SwissRoll
>>> generator = Cache(SwissRoll())
>>> generator
CacheExampleGenerator(
(generator): SwissRollExampleGenerator(noise_std=0.0, spin=1.5, hole=False)
(deepcopy): False
)
>>> batch = generator.generate(batch_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
startorch.example.CirclesClassification ¶
Bases: BaseExampleGenerator
Implements a binary classification example generator where the data are generated with a large circle containing a smaller circle in 2d.
The implementation is based on
sklearn.datasets.make_circles
.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
shuffle | bool | If True, the examples are shuffled. | True |
noise_std | float | The standard deviation of the Gaussian noise. | 0.0 |
factor | float | The scale factor between inner and outer circle in the range [0, 1). | 0.8 |
ratio | float | The ratio between the number of examples in outer circle and inner circle. | 0.5 |
Raises:
Type | Description |
---|---|
TypeError | if one of the parameters has an invalid type. |
RuntimeError | if one of the parameters is not valid. |
Example usage:
>>> from startorch.example import CirclesClassification
>>> generator = CirclesClassification()
>>> generator
CirclesClassificationExampleGenerator(shuffle=True, noise_std=0.0, factor=0.8, ratio=0.5)
>>> batch = generator.generate(batch_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
startorch.example.CirclesClassification.factor
property
¶
factor: float
The scale factor between inner and outer circle.
startorch.example.CirclesClassification.noise_std
property
¶
noise_std: float
The standard deviation of the Gaussian noise.
startorch.example.CirclesClassification.ratio
property
¶
ratio: float
The ratio between the number of examples in outer circle and inner circle.
startorch.example.CirclesClassificationExampleGenerator ¶
Bases: BaseExampleGenerator
Implements a binary classification example generator where the data are generated with a large circle containing a smaller circle in 2d.
The implementation is based on
sklearn.datasets.make_circles
.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
shuffle | bool | If True, the examples are shuffled. | True |
noise_std | float | The standard deviation of the Gaussian noise. | 0.0 |
factor | float | The scale factor between inner and outer circle in the range [0, 1). | 0.8 |
ratio | float | The ratio between the number of examples in outer circle and inner circle. | 0.5 |
Raises:
Type | Description |
---|---|
TypeError | if one of the parameters has an invalid type. |
RuntimeError | if one of the parameters is not valid. |
Example usage:
>>> from startorch.example import CirclesClassification
>>> generator = CirclesClassification()
>>> generator
CirclesClassificationExampleGenerator(shuffle=True, noise_std=0.0, factor=0.8, ratio=0.5)
>>> batch = generator.generate(batch_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
startorch.example.CirclesClassificationExampleGenerator.factor
property
¶
factor: float
The scale factor between inner and outer circle.
startorch.example.CirclesClassificationExampleGenerator.noise_std
property
¶
noise_std: float
The standard deviation of the Gaussian noise.
startorch.example.CirclesClassificationExampleGenerator.ratio
property
¶
ratio: float
The ratio between the number of examples in outer circle and inner circle.
startorch.example.Concatenate ¶
Bases: BaseExampleGenerator
Implement an example generator that concatenates the outputs of multiple example generators.
Note that the last value is used if there are duplicated keys.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
generators | Sequence[BaseExampleGenerator \| dict] | The example generators or their configurations. | required |
Example usage:
>>> from startorch.example import TensorExampleGenerator, Concatenate
>>> from startorch.tensor import RandInt, RandUniform
>>> generator = Concatenate(
... [
... TensorExampleGenerator(
... generators={"value": RandUniform(), "time": RandUniform()},
... size=(6,),
... ),
... TensorExampleGenerator(generators={"label": RandInt(0, 10)}),
... ]
... )
>>> generator
ConcatenateExampleGenerator(
(0): TensorExampleGenerator(
(value): RandUniformTensorGenerator(low=0.0, high=1.0)
(time): RandUniformTensorGenerator(low=0.0, high=1.0)
(size): (6,)
)
(1): TensorExampleGenerator(
(label): RandIntTensorGenerator(low=0, high=10)
(size): ()
)
)
>>> generator.generate(batch_size=10)
{'value': tensor([[...]]), 'time': tensor([[...]]), 'label': tensor([...])}
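Because the last value wins for duplicated keys, the 'value' entry below comes from the second generator (size=(7,)). This is a sketch illustrating the rule stated above:
>>> from startorch.example import Concatenate, TensorExampleGenerator
>>> from startorch.tensor import RandUniform
>>> generator = Concatenate(
...     [
...         TensorExampleGenerator(generators={"value": RandUniform()}, size=(3,)),
...         TensorExampleGenerator(generators={"value": RandUniform()}, size=(7,)),
...     ]
... )
>>> generator.generate(batch_size=2)["value"].shape
torch.Size([2, 7])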
startorch.example.ConcatenateExampleGenerator ¶
Bases: BaseExampleGenerator
Implement an example generator that concatenates the outputs of multiple example generators.
Note that the last value is used if there are duplicated keys.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
generators | Sequence[BaseExampleGenerator \| dict] | The example generators or their configurations. | required |
Example usage:
>>> from startorch.example import TensorExampleGenerator, Concatenate
>>> from startorch.tensor import RandInt, RandUniform
>>> generator = Concatenate(
... [
... TensorExampleGenerator(
... generators={"value": RandUniform(), "time": RandUniform()},
... size=(6,),
... ),
... TensorExampleGenerator(generators={"label": RandInt(0, 10)}),
... ]
... )
>>> generator
ConcatenateExampleGenerator(
(0): TensorExampleGenerator(
(value): RandUniformTensorGenerator(low=0.0, high=1.0)
(time): RandUniformTensorGenerator(low=0.0, high=1.0)
(size): (6,)
)
(1): TensorExampleGenerator(
(label): RandIntTensorGenerator(low=0, high=10)
(size): ()
)
)
>>> generator.generate(batch_size=10)
{'value': tensor([[...]]), 'time': tensor([[...]]), 'label': tensor([...])}
startorch.example.Friedman1Regression ¶
Bases: BaseExampleGenerator
Implement the "Friedman #1" regression example generator.
The implementation is based on
sklearn.datasets.make_friedman1
.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
feature_size | int | The feature size. The feature size has to be greater than or equal to 5. Out of all features, only 5 are actually used to compute the targets. The remaining features are independent of targets. | 10 |
noise_std | float | The standard deviation of the Gaussian noise. | 0.0 |
Raises:
Type | Description |
---|---|
ValueError | if one of the parameters is not valid. |
Example usage:
>>> from startorch.example import Friedman1Regression
>>> generator = Friedman1Regression(feature_size=6)
>>> generator
Friedman1RegressionExampleGenerator(feature_size=6, noise_std=0.0)
>>> batch = generator.generate(batch_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
startorch.example.Friedman1RegressionExampleGenerator ¶
Bases: BaseExampleGenerator
Implement the "Friedman #1" regression example generator.
The implementation is based on
sklearn.datasets.make_friedman1
.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
feature_size | int | The feature size. The feature size has to be greater than or equal to 5. Out of all features, only 5 are actually used to compute the targets. The remaining features are independent of targets. | 10 |
noise_std | float | The standard deviation of the Gaussian noise. | 0.0 |
Raises:
Type | Description |
---|---|
ValueError | if one of the parameters is not valid. |
Example usage:
>>> from startorch.example import Friedman1Regression
>>> generator = Friedman1Regression(feature_size=6)
>>> generator
Friedman1RegressionExampleGenerator(feature_size=6, noise_std=0.0)
>>> batch = generator.generate(batch_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
startorch.example.Friedman2Regression ¶
Bases: BaseExampleGenerator
Implement the "Friedman #2" regression example generator.
The implementation is based on
sklearn.datasets.make_friedman2
.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
feature_size | int | The feature size. The feature size has to be greater than or equal to 4. Out of all features, only 4 are actually used to compute the targets. The remaining features are independent of targets. | 4 |
noise_std | float | The standard deviation of the Gaussian noise. | 0.0 |
Raises:
Type | Description |
---|---|
ValueError | if one of the parameters is not valid. |
Example usage:
>>> from startorch.example import Friedman2Regression
>>> generator = Friedman2Regression(feature_size=6)
>>> generator
Friedman2RegressionExampleGenerator(feature_size=6, noise_std=0.0)
>>> batch = generator.generate(batch_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
startorch.example.Friedman2RegressionExampleGenerator ¶
Bases: BaseExampleGenerator
Implement the "Friedman #2" regression example generator.
The implementation is based on
sklearn.datasets.make_friedman2
.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
feature_size | int | The feature size. The feature size has to be greater than or equal to 4. Out of all features, only 4 are actually used to compute the targets. The remaining features are independent of targets. | 4 |
noise_std | float | The standard deviation of the Gaussian noise. | 0.0 |
Raises:
Type | Description |
---|---|
ValueError | if one of the parameters is not valid. |
Example usage:
>>> from startorch.example import Friedman2Regression
>>> generator = Friedman2Regression(feature_size=6)
>>> generator
Friedman2RegressionExampleGenerator(feature_size=6, noise_std=0.0)
>>> batch = generator.generate(batch_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
startorch.example.Friedman3Regression ¶
Bases: BaseExampleGenerator
Implement the "Friedman #3" regression example generator.
The implementation is based on
sklearn.datasets.make_friedman3
.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
feature_size | int | The feature size. The feature size has to be greater than or equal to 4. Out of all features, only 4 are actually used to compute the targets. The remaining features are independent of targets. | 4 |
noise_std | float | The standard deviation of the Gaussian noise. | 0.0 |
Raises:
Type | Description |
---|---|
ValueError | if one of the parameters is not valid. |
Example usage:
>>> from startorch.example import Friedman3Regression
>>> generator = Friedman3Regression(feature_size=6)
>>> generator
Friedman3RegressionExampleGenerator(feature_size=6, noise_std=0.0)
>>> batch = generator.generate(batch_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
startorch.example.Friedman3RegressionExampleGenerator ¶
Bases: BaseExampleGenerator
Implement the "Friedman #3" regression example generator.
The implementation is based on
sklearn.datasets.make_friedman3
.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
feature_size | int | The feature size. The feature size has to be greater than or equal to 4. Out of all features, only 4 are actually used to compute the targets. The remaining features are independent of targets. | 4 |
noise_std | float | The standard deviation of the Gaussian noise. | 0.0 |
Raises:
Type | Description |
---|---|
ValueError | if one of the parameters is not valid. |
Example usage:
>>> from startorch.example import Friedman3Regression
>>> generator = Friedman3Regression(feature_size=6)
>>> generator
Friedman3RegressionExampleGenerator(feature_size=6, noise_std=0.0)
>>> batch = generator.generate(batch_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
startorch.example.HypercubeClassification ¶
Bases: BaseExampleGenerator
Implement a classification example generator.
The data are generated by using a hypercube. The targets are some vertices of the hypercube. Each input feature is a 1-hot representation of the target plus Gaussian noise. These data can be used for a multi-class classification task.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
num_classes | int | The number of classes. | 50 |
feature_size | int | The feature size. The feature size has to be greater than the number of classes. | 64 |
noise_std | float | The standard deviation of the Gaussian noise. | 0.2 |
Raises:
Type | Description |
---|---|
ValueError | if one of the parameters is not valid. |
Example usage:
>>> from startorch.example import HypercubeClassification
>>> generator = HypercubeClassification(num_classes=5, feature_size=6)
>>> generator
HypercubeClassificationExampleGenerator(num_classes=5, feature_size=6, noise_std=0.2)
>>> batch = generator.generate(batch_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
startorch.example.HypercubeClassification.feature_size
property
¶
feature_size: int
The feature size when the data are created.
startorch.example.HypercubeClassification.noise_std
property
¶
noise_std: float
The standard deviation of the Gaussian noise.
startorch.example.HypercubeClassification.num_classes
property
¶
num_classes: int
The number of classes when the data are created.
startorch.example.HypercubeClassificationExampleGenerator ¶
Bases: BaseExampleGenerator
Implement a classification example generator.
The data are generated by using a hypercube. The targets are some vertices of the hypercube. Each input feature is a 1-hot representation of the target plus Gaussian noise. These data can be used for a multi-class classification task.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
num_classes | int | The number of classes. | 50 |
feature_size | int | The feature size. The feature size has to be greater than the number of classes. | 64 |
noise_std | float | The standard deviation of the Gaussian noise. | 0.2 |
Raises:
Type | Description |
---|---|
ValueError | if one of the parameters is not valid. |
Example usage:
>>> from startorch.example import HypercubeClassification
>>> generator = HypercubeClassification(num_classes=5, feature_size=6)
>>> generator
HypercubeClassificationExampleGenerator(num_classes=5, feature_size=6, noise_std=0.2)
>>> batch = generator.generate(batch_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
startorch.example.HypercubeClassificationExampleGenerator.feature_size
property
¶
feature_size: int
The feature size when the data are created.
startorch.example.HypercubeClassificationExampleGenerator.noise_std
property
¶
noise_std: float
The standard deviation of the Gaussian noise.
startorch.example.HypercubeClassificationExampleGenerator.num_classes
property
¶
num_classes: int
The number of classes when the data are created.
startorch.example.LinearRegression ¶
Bases: BaseExampleGenerator
Implement a regression example generator where the data are generated with an underlying linear model.
The implementation is based on
sklearn.datasets.make_regression
.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
weights | Tensor \| Sequence[float] | The linear weights in the underlying linear model. It must be a float tensor of shape (feature_size,). | required |
bias | float | The bias term in the underlying linear model. | 0.0 |
noise_std | float | The standard deviation of the Gaussian noise. | 0.0 |
Raises:
Type | Description |
---|---|
ValueError | if one of the parameters is not valid. |
Example usage:
>>> from startorch.example import LinearRegression
>>> generator = LinearRegression.create_uniform_weights()
>>> generator
LinearRegressionExampleGenerator(feature_size=100, bias=0.0, noise_std=0.0)
>>> batch = generator.generate(batch_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
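The weights can also be given explicitly, in which case the feature size should match the number of weights. A short sketch, assuming the keyword arguments match the parameter names above:
>>> import torch
>>> from startorch.example import LinearRegression
>>> generator = LinearRegression(weights=torch.rand(8), bias=1.0, noise_std=0.1)
>>> batch = generator.generate(batch_size=4)
>>> sorted(batch)
['feature', 'target']
>>> batch["feature"].shape
torch.Size([4, 8])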
startorch.example.LinearRegression.bias
property
¶
bias: float
The bias of the underlying linear model.
startorch.example.LinearRegression.noise_std
property
¶
noise_std: float
The standard deviation of the Gaussian noise.
startorch.example.LinearRegression.weights
property
¶
weights: Tensor
torch.Tensor: The weights of the underlying linear model.
startorch.example.LinearRegressionExampleGenerator ¶
Bases: BaseExampleGenerator
Implement a regression example generator where the data are generated with an underlying linear model.
The implementation is based on
sklearn.datasets.make_regression
.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
weights | Tensor \| Sequence[float] | The linear weights in the underlying linear model. It must be a float tensor of shape (feature_size,). | required |
bias | float | The bias term in the underlying linear model. | 0.0 |
noise_std | float | The standard deviation of the Gaussian noise. | 0.0 |
Raises:
Type | Description |
---|---|
ValueError | if one of the parameters is not valid. |
Example usage:
>>> from startorch.example import LinearRegression
>>> generator = LinearRegression.create_uniform_weights()
>>> generator
LinearRegressionExampleGenerator(feature_size=100, bias=0.0, noise_std=0.0)
>>> batch = generator.generate(batch_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
startorch.example.LinearRegressionExampleGenerator.bias
property
¶
bias: float
The bias of the underlying linear model.
startorch.example.LinearRegressionExampleGenerator.feature_size
property
¶
feature_size: int
The feature size.
startorch.example.LinearRegressionExampleGenerator.noise_std
property
¶
noise_std: float
The standard deviation of the Gaussian noise.
startorch.example.LinearRegressionExampleGenerator.weights
property
¶
weights: Tensor
torch.Tensor: The weights of the underlying linear model.
startorch.example.MoonsClassification ¶
Bases: BaseExampleGenerator
Implements a binary classification example generator where the data are two interleaving half circles in 2d.
The implementation is based on
sklearn.datasets.make_moons
.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
shuffle | bool | If True, the examples are shuffled. | True |
noise_std | float | The standard deviation of the Gaussian noise. | 0.0 |
ratio | float | The ratio between the number of examples in outer circle and inner circle. | 0.5 |
Raises:
Type | Description |
---|---|
TypeError | if one of the parameters has an invalid type. |
RuntimeError | if one of the parameters has an invalid value. |
Example usage:
>>> from startorch.example import MoonsClassification
>>> generator = MoonsClassification()
>>> generator
MoonsClassificationExampleGenerator(shuffle=True, noise_std=0.0, ratio=0.5)
>>> batch = generator.generate(batch_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
startorch.example.MoonsClassificationExampleGenerator ¶
Bases: BaseExampleGenerator
Implements a binary classification example generator where the data are two interleaving half circles in 2d.
The implementation is based on
sklearn.datasets.make_moons
.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
shuffle | bool | If True, the examples are shuffled. | True |
noise_std | float | The standard deviation of the Gaussian noise. | 0.0 |
ratio | float | The ratio between the number of examples in outer circle and inner circle. | 0.5 |
Raises:
Type | Description |
---|---|
TypeError | if one of the parameters has an invalid type. |
RuntimeError | if one of the parameters has an invalid value. |
Example usage:
>>> from startorch.example import MoonsClassification
>>> generator = MoonsClassification()
>>> generator
MoonsClassificationExampleGenerator(shuffle=True, noise_std=0.0, ratio=0.5)
>>> batch = generator.generate(batch_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
startorch.example.SwissRoll ¶
Bases: BaseExampleGenerator
Implements a manifold example generator based on the Swiss roll pattern.
The implementation is based on
sklearn.datasets.make_swiss_roll
.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
noise_std | float | The standard deviation of the Gaussian noise. | 0.0 |
spin | float | The number of spins of the Swiss roll. | 1.5 |
hole | bool | If True, the Swiss roll is generated with a hole. | False |
Raises:
Type | Description |
---|---|
ValueError | if one of the parameters is not valid. |
Example usage:
>>> from startorch.example import SwissRoll
>>> generator = SwissRoll()
>>> generator
SwissRollExampleGenerator(noise_std=0.0, spin=1.5, hole=False)
>>> batch = generator.generate(batch_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
startorch.example.SwissRollExampleGenerator ¶
Bases: BaseExampleGenerator
Implements a manifold example generator based on the Swiss roll pattern.
The implementation is based on
sklearn.datasets.make_swiss_roll
.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
noise_std | float | The standard deviation of the Gaussian noise. | 0.0 |
spin | float | The number of spins of the Swiss roll. | 1.5 |
hole | bool | If True, the Swiss roll is generated with a hole. | False |
Raises:
Type | Description |
---|---|
ValueError | if one of the parameters is not valid. |
Example usage:
>>> from startorch.example import SwissRoll
>>> generator = SwissRoll()
>>> generator
SwissRollExampleGenerator(noise_std=0.0, spin=1.5, hole=False)
>>> batch = generator.generate(batch_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
startorch.example.TensorExampleGenerator ¶
Bases: BaseExampleGenerator
Implement an example generator to generate batches of tensors.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
generators | Mapping[str, BaseTensorGenerator \| dict] | The tensor generators or their configurations. | required |
size | Sequence[int] | The output tensor shape except the first dimension, which is set to the batch size. | () |
Example usage:
>>> from startorch.example import TensorExampleGenerator
>>> from startorch.tensor import RandInt, RandUniform
>>> generator = TensorExampleGenerator(
... generators={"value": RandUniform(), "time": RandUniform()},
... size=(6,),
... )
>>> generator
TensorExampleGenerator(
(value): RandUniformTensorGenerator(low=0.0, high=1.0)
(time): RandUniformTensorGenerator(low=0.0, high=1.0)
(size): (6,)
)
>>> generator.generate(batch_size=10)
{'value': tensor([[...]]), 'time': tensor([[...]])}
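Since size defines every dimension after the batch dimension, the tensors generated above should have shape (batch_size, 6). A quick check reusing the generator defined above:
>>> batch = generator.generate(batch_size=10)
>>> batch["value"].shape, batch["time"].shape
(torch.Size([10, 6]), torch.Size([10, 6]))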
startorch.example.TimeSeriesExampleGenerator ¶
Bases: BaseExampleGenerator
Implement an example generator to generate time series.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
generators | BaseTimeSeriesGenerator \| dict | A time series generator or its configuration. | required |
seq_len | BaseTensorGenerator \| dict | The sequence length sampler or its configuration. This sampler is used to sample the sequence length at each batch. | required |
Example usage:
>>> from startorch.example import TimeSeriesExampleGenerator
>>> from startorch.timeseries import SequenceTimeSeriesGenerator
>>> from startorch.sequence import Periodic, RandUniform
>>> from startorch.tensor import RandInt
>>> generator = TimeSeriesExampleGenerator(
... generators=SequenceTimeSeriesGenerator(
... {"value": RandUniform(), "time": RandUniform()}
... ),
... seq_len=RandInt(2, 5),
... )
>>> generator
TimeSeriesExampleGenerator(
(generators): SequenceTimeSeriesGenerator(
(value): RandUniformSequenceGenerator(low=0.0, high=1.0, feature_size=(1,))
(time): RandUniformSequenceGenerator(low=0.0, high=1.0, feature_size=(1,))
)
(seq_len): RandIntTensorGenerator(low=2, high=5)
)
>>> generator.generate(batch_size=10)
{'value': tensor([[...]]), 'time': tensor([[...]])}
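The sequence length is sampled from seq_len for each batch, so only the batch dimension is fixed. The sketch below assumes each generated tensor has shape (batch_size, seq_len, 1) given feature_size=(1,):
>>> batch = generator.generate(batch_size=10)
>>> batch["value"].shape[0], batch["value"].ndim
(10, 3)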
startorch.example.TransformExampleGenerator ¶
Bases: BaseExampleGenerator
Implement an example generator that generates examples, and then transforms them.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
generator | BaseExampleGenerator \| dict | The example generator or its configuration. | required |
transformer | BaseTransformer \| dict | The data transformer or its configuration. | required |
Example usage:
>>> from startorch.example import TransformExampleGenerator, HypercubeClassification
>>> from startorch.transformer import TensorTransformer
>>> from startorch.tensor.transformer import Abs
>>> generator = TransformExampleGenerator(
... generator=HypercubeClassification(num_classes=5, feature_size=6),
... transformer=TensorTransformer(
... transformer=Abs(), input="feature", output="feature_transformed"
... ),
... )
>>> generator
TransformExampleGenerator(
(generator): HypercubeClassificationExampleGenerator(num_classes=5, feature_size=6, noise_std=0.2)
(transformer): TensorTransformer(
(transformer): AbsTensorTransformer()
(input): feature
(output): feature_transformed
(exist_ok): False
)
)
>>> generator.generate(batch_size=10)
{'target': tensor([...]), 'feature': tensor([[...]]), 'feature_transformed': tensor([[...]])}
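Since the transformer above applies an absolute value to 'feature' and writes the result to 'feature_transformed', the two entries should be related as below. A quick sketch reusing the generator defined above:
>>> import torch
>>> batch = generator.generate(batch_size=10)
>>> torch.equal(batch["feature_transformed"], batch["feature"].abs())
True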
startorch.example.VanillaExampleGenerator ¶
Bases: BaseExampleGenerator
Implement an example generator to "generate" the input data.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | dict[Hashable, Tensor] | The data to generate. The dictionary cannot be empty. | required |
Raises:
Type | Description |
---|---|
ValueError | if data is an empty dictionary. |
Example usage:
>>> import torch
>>> from startorch.example import VanillaExampleGenerator
>>> generator = VanillaExampleGenerator(
... data={"value": torch.ones(10, 3), "time": torch.arange(10)}
... )
>>> generator
VanillaExampleGenerator(batch_size=10)
>>> generator.generate(batch_size=5)
{'value': tensor([[1., 1., 1.],
[1., 1., 1.],
[1., 1., 1.],
[1., 1., 1.],
[1., 1., 1.]]),
'time': tensor([0, 1, 2, 3, 4])}
startorch.example.is_example_generator_config ¶
is_example_generator_config(config: dict) -> bool
Indicate if the input configuration is a configuration for a
BaseExampleGenerator
.
This function only checks if the value of the key _target_
is valid. It does not check the other values. If _target_
indicates a function, the returned type hint is used to check
the class.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
config | dict | The configuration to check. | required |
Returns:
Type | Description |
---|---|
bool | True if the configuration is for a BaseExampleGenerator object, otherwise False. |
Example usage:
>>> from startorch.example import is_example_generator_config
>>> is_example_generator_config({"_target_": "startorch.example.HypercubeClassification"})
True
startorch.example.make_blobs_classification ¶
make_blobs_classification(
num_examples: int,
centers: Tensor,
cluster_std: Tensor | float = 1.0,
generator: Generator | None = None,
) -> dict[str, Tensor]
Generate a classification dataset where the data are generated from isotropic Gaussian blobs for clustering.
The implementation is based on https://scikit-learn.org/stable/modules/generated/sklearn.datasets.make_blobs.html
Parameters:
Name | Type | Description | Default |
---|---|---|---|
num_examples | int | The number of examples. | required |
centers | Tensor | The cluster centers used to generate the examples. It must be a float tensor of shape (num_clusters, feature_size). | required |
cluster_std | Tensor \| float | The standard deviation of the clusters. It must be a float tensor of shape (num_clusters, feature_size). | 1.0 |
generator | Generator \| None | An optional random number generator. | None |
Returns:
Type | Description |
---|---|
dict[str, Tensor] | A dictionary with two items: 'target' and 'feature'. |
Raises:
Type | Description |
---|---|
RuntimeError | if one of the parameters is not valid. |
Example usage:
>>> import torch
>>> from startorch.example import make_blobs_classification
>>> batch = make_blobs_classification(num_examples=10, centers=torch.rand(5, 2))
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
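The feature size is taken from the second dimension of centers, so the example above should produce the shapes below (the target is shown as a 1-d tensor in the output above):
>>> batch["feature"].shape, batch["target"].shape
(torch.Size([10, 2]), torch.Size([10]))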
startorch.example.make_circles_classification ¶
make_circles_classification(
num_examples: int = 100,
shuffle: bool = True,
noise_std: float = 0.0,
factor: float = 0.8,
ratio: float = 0.5,
generator: Generator | None = None,
) -> dict[str, Tensor]
Generate a binary classification dataset where the data are generated with a large circle containing a smaller circle in 2d.
The implementation is based on
sklearn.datasets.make_circles
.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
num_examples | int | The number of examples. | 100 |
shuffle | bool | If True, the examples are shuffled. | True |
noise_std | float | The standard deviation of the Gaussian noise. | 0.0 |
factor | float | The scale factor between inner and outer circle in the range [0, 1). | 0.8 |
ratio | float | The ratio between the number of examples in outer circle and inner circle. | 0.5 |
generator | Generator \| None | An optional random generator. | None |
Returns:
Type | Description |
---|---|
dict[str, Tensor] | A dictionary with two items: 'target' and 'feature'. |
Raises:
Type | Description |
---|---|
RuntimeError | if one of the parameters is not valid. |
Example usage:
>>> from startorch.example import make_circles_classification
>>> batch = make_circles_classification(num_examples=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
startorch.example.make_friedman1_regression ¶
make_friedman1_regression(
num_examples: int = 100,
feature_size: int = 10,
noise_std: float = 0.0,
generator: Generator | None = None,
) -> dict[str, Tensor]
Generate the "Friedman #1" regression data.
The implementation is based on
sklearn.datasets.make_friedman1
.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
num_examples | int | The number of examples. | 100 |
feature_size | int | The feature size. The feature size has to be greater than or equal to 5. Out of all features, only 5 are actually used to compute the targets. The remaining features are independent of targets. | 10 |
noise_std | float | The standard deviation of the Gaussian noise. | 0.0 |
generator | Generator \| None | An optional random number generator. | None |
Returns:
Type | Description |
---|---|
dict[str, Tensor] | A dictionary with two items: 'target' and 'feature'. |
Raises:
Type | Description |
---|---|
RuntimeError | if one of the parameters is not valid. |
Example usage:
>>> from startorch.example import make_friedman1_regression
>>> batch = make_friedman1_regression(num_examples=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
startorch.example.make_friedman2_regression ¶
make_friedman2_regression(
num_examples: int = 100,
feature_size: int = 4,
noise_std: float = 0.0,
generator: Generator | None = None,
) -> dict[str, Tensor]
Generate the "Friedman #2" regression data.
The implementation is based on
sklearn.datasets.make_friedman2
.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
num_examples | int | The number of examples. | 100 |
feature_size | int | The feature size. The feature size has to be greater than or equal to 4. Out of all features, only 4 are actually used to compute the targets. The remaining features are independent of targets. | 4 |
noise_std | float | The standard deviation of the Gaussian noise. | 0.0 |
generator | Generator \| None | An optional random number generator. | None |
Returns:
Type | Description |
---|---|
dict[str, Tensor] | A dictionary with two items: 'target' and 'feature'. |
Raises:
Type | Description |
---|---|
RuntimeError | if one of the parameters is not valid. |
Example usage:
>>> from startorch.example import make_friedman2_regression
>>> batch = make_friedman2_regression(num_examples=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
startorch.example.make_friedman3_regression ¶
make_friedman3_regression(
num_examples: int = 100,
feature_size: int = 4,
noise_std: float = 0.0,
generator: Generator | None = None,
) -> dict[str, Tensor]
Generate the "Friedman #3" regression problem.
The implementation is based on
sklearn.datasets.make_friedman3
.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
num_examples | int | The number of examples. | 100 |
feature_size | int | The feature size. The feature size has to be greater than or equal to 4. Out of all features, only 4 are actually used to compute the targets. The remaining features are independent of targets. | 4 |
noise_std | float | The standard deviation of the Gaussian noise. | 0.0 |
generator | Generator \| None | An optional random number generator. | None |
Returns:
Type | Description |
---|---|
dict[str, Tensor] | A dictionary with two items: 'target' and 'feature'. |
Raises:
Type | Description |
---|---|
RuntimeError | if one of the parameters is not valid. |
Example usage:
>>> from startorch.example import make_friedman3_regression
>>> batch = make_friedman3_regression(num_examples=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
startorch.example.make_hypercube_classification ¶
make_hypercube_classification(
num_examples: int = 1000,
num_classes: int = 50,
feature_size: int = 64,
noise_std: float = 0.2,
generator: Generator | None = None,
) -> dict[str, Tensor]
Generate a synthetic classification dataset based on hypercube vertex structure.
The data are generated by using a hypercube. The targets are some vertices of the hypercube. Each input feature is a 1-hot representation of the target plus Gaussian noise. These data can be used for a multi-class classification task.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
num_examples | int | The number of examples. | 1000 |
num_classes | int | The number of classes. | 50 |
feature_size | int | The feature size. The feature size has to be greater than the number of classes. | 64 |
noise_std | float | The standard deviation of the Gaussian noise. | 0.2 |
generator | Generator \| None | An optional random generator. | None |
Returns:
Type | Description |
---|---|
dict[str, Tensor] | A dictionary with two items: 'target' and 'feature'. |
Raises:
Type | Description |
---|---|
RuntimeError | if one of the parameters is not valid. |
Example usage:
>>> from startorch.example.hypercube import make_hypercube_classification
>>> batch = make_hypercube_classification(num_examples=10, num_classes=5, feature_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
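The targets are class indices, so with num_classes=5 they should all fall in [0, 5). A quick check reusing the batch generated above:
>>> batch["feature"].shape
torch.Size([10, 10])
>>> bool((batch["target"] < 5).all())
True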
startorch.example.make_linear_regression ¶
make_linear_regression(
weights: Tensor,
bias: float = 0.0,
num_examples: int = 100,
noise_std: float = 0.0,
generator: Generator | None = None,
) -> dict[str, Tensor]
Generate a regression dataset where the data are generated with an underlying linear model.
The features are sampled from a Normal distribution. The targets are then generated by applying the linear model defined by the weights and bias, plus optional Gaussian noise.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
weights | Tensor | The linear weights in the underlying linear model. It must be a float tensor of shape (feature_size,). | required |
bias | float | The bias term in the underlying linear model. | 0.0 |
num_examples | int | The number of examples to generate. | 100 |
noise_std | float | The standard deviation of the Gaussian noise. | 0.0 |
generator | Generator \| None | An optional random generator. | None |
Returns:
Type | Description |
---|---|
dict[str, Tensor] | A dictionary with two items: 'target' and 'feature'. |
Raises:
Type | Description |
---|---|
RuntimeError | if one of the parameters is not valid. |
Example usage:
>>> import torch
>>> from startorch.example import make_linear_regression
>>> batch = make_linear_regression(weights=torch.rand(10), num_examples=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
startorch.example.make_moons_classification ¶
make_moons_classification(
num_examples: int = 100,
shuffle: bool = True,
noise_std: float = 0.0,
ratio: float = 0.5,
generator: Generator | None = None,
) -> dict[str, Tensor]
Generate a binary classification dataset where the data are two interleaving half circles in 2d.
The implementation is based on
sklearn.datasets.make_moons
.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
num_examples | int | The number of examples. | 100 |
shuffle | bool | If True, the examples are shuffled. | True |
noise_std | float | The standard deviation of the Gaussian noise. | 0.0 |
ratio | float | The ratio between the number of examples in outer circle and inner circle. | 0.5 |
generator | Generator \| None | An optional random generator. | None |
Returns:
Type | Description |
---|---|
dict[str, Tensor] | A dictionary with two items: 'target' and 'feature'. |
Raises:
Type | Description |
---|---|
RuntimeError | if one of the parameters is not valid. |
Example usage:
>>> from startorch.example import make_moons_classification
>>> batch = make_moons_classification(num_examples=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
startorch.example.make_sparse_uncorrelated_regression ¶
make_sparse_uncorrelated_regression(
num_examples: int = 100,
feature_size: int = 4,
noise_std: float = 0.0,
generator: Generator | None = None,
) -> dict[str, Tensor]
Generate a random regression problem with sparse uncorrelated design.
The implementation is based on
sklearn.datasets.make_sparse_uncorrelated
.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
num_examples | int | The number of examples. | 100 |
feature_size | int | The feature size. The feature size has to be greater than or equal to 4. Out of all features, only 4 are actually used to compute the targets. The remaining features are independent of targets. | 4 |
noise_std | float | The standard deviation of the Gaussian noise. | 0.0 |
generator | Generator \| None | An optional random generator. | None |
Returns:
Type | Description |
---|---|
dict[str, Tensor] | A batch with two items: 'target' and 'feature'. |
Raises:
Type | Description |
---|---|
RuntimeError | if one of the parameters is not valid. |
Example usage:
>>> from startorch.example import make_sparse_uncorrelated_regression
>>> data = make_sparse_uncorrelated_regression(num_examples=10)
>>> data
{'target': tensor([...]), 'feature': tensor([[...]])}
startorch.example.make_swiss_roll ¶
make_swiss_roll(
num_examples: int = 100,
noise_std: float = 0.0,
spin: float = 1.5,
hole: bool = False,
generator: Generator | None = None,
) -> dict[str, Tensor]
Generate a toy manifold dataset based on the Swiss roll pattern.
The implementation is based on
sklearn.datasets.make_swiss_roll
.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
num_examples | int | The number of examples. | 100 |
noise_std | float | The standard deviation of the Gaussian noise. | 0.0 |
spin | float | The number of spins of the Swiss roll. | 1.5 |
hole | bool | If True, the Swiss roll is generated with a hole. | False |
generator | Generator \| None | An optional random generator. | None |
Returns:
Type | Description |
---|---|
dict[str, Tensor] | A batch with two items: 'target' and 'feature'. |
Raises:
Type | Description |
---|---|
RuntimeError | if one of the parameters is not valid. |
Example usage:
>>> from startorch.example import make_swiss_roll
>>> batch = make_swiss_roll(num_examples=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
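The Swiss roll is a 3-d manifold, so the features should have three columns. A quick check reusing the batch generated above:
>>> batch["feature"].shape
torch.Size([10, 3])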
startorch.example.setup_example_generator ¶
setup_example_generator(
generator: BaseExampleGenerator | dict,
) -> BaseExampleGenerator
Set up an example generator.
The example generator is instantiated from its configuration by using the BaseExampleGenerator factory function.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
generator | BaseExampleGenerator \| dict | An example generator or its configuration. | required |
Returns:
Type | Description |
---|---|
BaseExampleGenerator | An example generator. |
Example usage:
>>> from startorch.example import setup_example_generator
>>> generator = setup_example_generator(
... {"_target_": "startorch.example.HypercubeClassification"}
... )
>>> generator
HypercubeClassificationExampleGenerator(num_classes=50, feature_size=64, noise_std=0.2)
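If the input is already an example generator, it is presumably returned as-is rather than re-instantiated; the check below is a sketch based on that assumption:
>>> from startorch.example import HypercubeClassification, setup_example_generator
>>> generator = HypercubeClassification(num_classes=5, feature_size=6)
>>> setup_example_generator(generator) is generator
True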