example

startorch.example

Contain example generators.

startorch.example.BaseExampleGenerator

Bases: Generic[T], ABC

Define the base class to generate examples.

Example usage:

>>> from startorch.example import HypercubeClassification
>>> generator = HypercubeClassification(num_classes=5, feature_size=6)
>>> generator
HypercubeClassificationExampleGenerator(num_classes=5, feature_size=6, noise_std=0.2)
>>> batch = generator.generate(batch_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}

startorch.example.BaseExampleGenerator.generate abstractmethod

generate(
    batch_size: int = 1, rng: Generator | None = None
) -> dict[Hashable, Tensor]

Generate a batch of examples.

Parameters:

Name Type Description Default
batch_size int

The batch size.

1
rng Generator | None

An optional random number generator.

None

Returns:

Type Description
dict[Hashable, Tensor]

A batch of examples.

Example usage:

>>> from startorch.example import HypercubeClassification
>>> generator = HypercubeClassification(num_classes=5, feature_size=6)
>>> batch = generator.generate(batch_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
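
The contract of generate can be illustrated without startorch. The sketch below is a hypothetical stand-in, not the library's code: the names ToyBaseGenerator and ToyUniformGenerator are invented for illustration, and random.Random replaces the torch Generator. It only shows that a batch is a dictionary and that passing a seeded rng makes the batch reproducible.

```python
import random
from abc import ABC, abstractmethod
from collections.abc import Hashable
from typing import Optional


class ToyBaseGenerator(ABC):
    """Hypothetical stand-in for the base class (not the startorch implementation)."""

    @abstractmethod
    def generate(
        self, batch_size: int = 1, rng: Optional[random.Random] = None
    ) -> dict[Hashable, list]:
        """Return a batch where every value holds ``batch_size`` items."""


class ToyUniformGenerator(ToyBaseGenerator):
    def generate(self, batch_size=1, rng=None):
        if rng is None:
            rng = random.Random()  # fall back to an unseeded generator
        return {
            "feature": [rng.random() for _ in range(batch_size)],
            "target": [rng.randint(0, 1) for _ in range(batch_size)],
        }


generator = ToyUniformGenerator()
# Two generators seeded identically produce identical batches.
batch_a = generator.generate(batch_size=4, rng=random.Random(42))
batch_b = generator.generate(batch_size=4, rng=random.Random(42))
```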

startorch.example.BlobsClassification

Bases: BaseExampleGenerator

Implement a classification example generator where the data are generated from isotropic Gaussian blobs. Each cluster corresponds to one category.

The implementation is based on https://scikit-learn.org/stable/modules/generated/sklearn.datasets.make_blobs.html

Parameters:

Name Type Description Default
centers Tensor

The cluster centers used to generate the examples. It must be a float tensor of shape (num_clusters, feature_size).

required
cluster_std Tensor | float

The standard deviation of the clusters. It must be a float or a float tensor of shape (num_clusters, feature_size).

1.0

Raises:

Type Description
TypeError

if one of the parameters has an invalid type.

RuntimeError

if one of the parameters is not valid.

Example usage:

>>> import torch
>>> from startorch.example import BlobsClassification
>>> generator = BlobsClassification(torch.rand(5, 4))
>>> generator
BlobsClassificationExampleGenerator(num_clusters=5, feature_size=4)
>>> batch = generator.generate(batch_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
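
As a rough illustration of sampling from isotropic Gaussian blobs, here is a pure-Python sketch. It is not the startorch implementation: the function name make_blobs_batch is hypothetical, plain lists stand in for tensors, and drawing the cluster index uniformly at random is an assumption.

```python
import random


def make_blobs_batch(centers, cluster_std=1.0, batch_size=1, rng=None):
    """Sample targets and features from isotropic Gaussian blobs (illustrative only)."""
    if rng is None:
        rng = random.Random()
    targets, features = [], []
    for _ in range(batch_size):
        cluster = rng.randrange(len(centers))  # assumption: clusters are equally likely
        # Isotropic noise: the same standard deviation along every feature.
        features.append([rng.gauss(c, cluster_std) for c in centers[cluster]])
        targets.append(cluster)
    return {"target": targets, "feature": features}


centers = [[0.0, 0.0], [5.0, 5.0], [-5.0, 5.0]]  # shape (num_clusters, feature_size)
batch = make_blobs_batch(centers, cluster_std=0.5, batch_size=10, rng=random.Random(0))
```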

startorch.example.BlobsClassification.centers property

centers: Tensor

torch.Tensor of type float and shape (num_clusters, feature_size): The cluster centers.

startorch.example.BlobsClassification.cluster_std property

cluster_std: Tensor

torch.Tensor of type float and shape (num_clusters, feature_size): The standard deviation for each cluster.

startorch.example.BlobsClassification.feature_size property

feature_size: int

The feature size i.e. the number of features.

startorch.example.BlobsClassification.num_clusters property

num_clusters: int

The number of clusters i.e. categories.

startorch.example.BlobsClassification.create_uniform_centers classmethod

create_uniform_centers(
    num_clusters: int = 3,
    feature_size: int = 2,
    random_seed: int = 17532042831661189422,
) -> BlobsClassificationExampleGenerator

Instantiate a BlobsClassificationExampleGenerator where the centers are sampled from a uniform distribution.

Parameters:

Name Type Description Default
num_clusters int

The number of clusters.

3
feature_size int

The feature size.

2
random_seed int

The random seed used to generate the cluster centers.

17532042831661189422

Returns:

Type Description
BlobsClassificationExampleGenerator

An instantiated example generator.

Example usage:

>>> from startorch.example import BlobsClassification
>>> generator = BlobsClassification.create_uniform_centers()
>>> generator
BlobsClassificationExampleGenerator(num_clusters=3, feature_size=2)
>>> batch = generator.generate(batch_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}

startorch.example.BlobsClassificationExampleGenerator

Bases: BaseExampleGenerator

Implement a classification example generator where the data are generated from isotropic Gaussian blobs. Each cluster corresponds to one category.

The implementation is based on https://scikit-learn.org/stable/modules/generated/sklearn.datasets.make_blobs.html

Parameters:

Name Type Description Default
centers Tensor

The cluster centers used to generate the examples. It must be a float tensor of shape (num_clusters, feature_size).

required
cluster_std Tensor | float

The standard deviation of the clusters. It must be a float or a float tensor of shape (num_clusters, feature_size).

1.0

Raises:

Type Description
TypeError

if one of the parameters has an invalid type.

RuntimeError

if one of the parameters is not valid.

Example usage:

>>> import torch
>>> from startorch.example import BlobsClassification
>>> generator = BlobsClassification(torch.rand(5, 4))
>>> generator
BlobsClassificationExampleGenerator(num_clusters=5, feature_size=4)
>>> batch = generator.generate(batch_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}

startorch.example.BlobsClassificationExampleGenerator.centers property

centers: Tensor

torch.Tensor of type float and shape (num_clusters, feature_size): The cluster centers.

startorch.example.BlobsClassificationExampleGenerator.cluster_std property

cluster_std: Tensor

torch.Tensor of type float and shape (num_clusters, feature_size): The standard deviation for each cluster.

startorch.example.BlobsClassificationExampleGenerator.feature_size property

feature_size: int

The feature size i.e. the number of features.

startorch.example.BlobsClassificationExampleGenerator.num_clusters property

num_clusters: int

The number of clusters i.e. categories.

startorch.example.BlobsClassificationExampleGenerator.create_uniform_centers classmethod

create_uniform_centers(
    num_clusters: int = 3,
    feature_size: int = 2,
    random_seed: int = 17532042831661189422,
) -> BlobsClassificationExampleGenerator

Instantiate a BlobsClassificationExampleGenerator where the centers are sampled from a uniform distribution.

Parameters:

Name Type Description Default
num_clusters int

The number of clusters.

3
feature_size int

The feature size.

2
random_seed int

The random seed used to generate the cluster centers.

17532042831661189422

Returns:

Type Description
BlobsClassificationExampleGenerator

An instantiated example generator.

Example usage:

>>> from startorch.example import BlobsClassification
>>> generator = BlobsClassification.create_uniform_centers()
>>> generator
BlobsClassificationExampleGenerator(num_clusters=3, feature_size=2)
>>> batch = generator.generate(batch_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}

startorch.example.Cache

Bases: BaseExampleGenerator

Implement an example generator that caches the last batch and returns it every time a batch is generated.

A new batch is generated only if the batch size changes.

Parameters:

Name Type Description Default
generator BaseExampleGenerator | dict

The example generator or its configuration.

required
deepcopy bool

If True, the cached batch is deep-copied before being returned.

False

Example usage:

>>> from startorch.example import Cache, SwissRoll
>>> generator = Cache(SwissRoll())
>>> generator
CacheExampleGenerator(
  (generator): SwissRollExampleGenerator(noise_std=0.0, spin=1.5, hole=False)
  (deepcopy): False
)
>>> batch = generator.generate(batch_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
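
The caching rule described above (reuse the last batch, regenerate only when the batch size changes, optionally return a deep copy) can be sketched in plain Python. ToyCache and random_batch are hypothetical names used for illustration; this is not the library's code.

```python
import copy
import random


class ToyCache:
    """Hypothetical wrapper mirroring the caching rule described above."""

    def __init__(self, generate_fn, deepcopy=False):
        self._generate_fn = generate_fn
        self._deepcopy = deepcopy
        self._batch_size = None
        self._batch = None

    def generate(self, batch_size=1):
        # Regenerate only when the requested batch size differs from the cached one.
        if self._batch is None or batch_size != self._batch_size:
            self._batch = self._generate_fn(batch_size)
            self._batch_size = batch_size
        return copy.deepcopy(self._batch) if self._deepcopy else self._batch


def random_batch(batch_size):
    return {"feature": [random.random() for _ in range(batch_size)]}


cached = ToyCache(random_batch)
first = cached.generate(batch_size=4)
second = cached.generate(batch_size=4)  # same object as `first`
third = cached.generate(batch_size=8)  # batch size changed, so a new batch is made

safe = ToyCache(random_batch, deepcopy=True)
copy_a = safe.generate(batch_size=4)
copy_b = safe.generate(batch_size=4)  # equal content, distinct objects
```

Without the deep copy, an in-place modification of the returned batch also modifies the cached batch, which is the situation the deepcopy option guards against.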

startorch.example.CacheExampleGenerator

Bases: BaseExampleGenerator

Implement an example generator that caches the last batch and returns it every time a batch is generated.

A new batch is generated only if the batch size changes.

Parameters:

Name Type Description Default
generator BaseExampleGenerator | dict

The example generator or its configuration.

required
deepcopy bool

If True, the cached batch is deep-copied before being returned.

False

Example usage:

>>> from startorch.example import Cache, SwissRoll
>>> generator = Cache(SwissRoll())
>>> generator
CacheExampleGenerator(
  (generator): SwissRollExampleGenerator(noise_std=0.0, spin=1.5, hole=False)
  (deepcopy): False
)
>>> batch = generator.generate(batch_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}

startorch.example.CirclesClassification

Bases: BaseExampleGenerator

Implement a binary classification example generator where the data are generated with a large circle containing a smaller circle in 2D.

The implementation is based on sklearn.datasets.make_circles.

Parameters:

Name Type Description Default
shuffle bool

If True, the examples are shuffled.

True
noise_std float

The standard deviation of the Gaussian noise.

0.0
factor float

The scale factor between inner and outer circle in the range [0, 1).

0.8
ratio float

The ratio between the number of examples in outer circle and inner circle.

0.5

Raises:

Type Description
TypeError

if one of the parameters has an invalid type.

RuntimeError

if one of the parameters is not valid.

Example usage:

>>> from startorch.example import CirclesClassification
>>> generator = CirclesClassification()
>>> generator
CirclesClassificationExampleGenerator(shuffle=True, noise_std=0.0, factor=0.8, ratio=0.5)
>>> batch = generator.generate(batch_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
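
A minimal sketch of the two-circle construction, in plain Python rather than torch. The function name make_circles_batch is hypothetical. Two assumptions are made here and may differ from the library: ratio is treated as the fraction of examples placed on the outer circle, and label 0 is given to the outer circle. Shuffling is omitted to keep the sketch short.

```python
import math
import random


def make_circles_batch(batch_size, factor=0.8, ratio=0.5, noise_std=0.0, rng=None):
    """Sample points on two concentric circles (illustrative only)."""
    if rng is None:
        rng = random.Random()
    # Assumption: `ratio` is the fraction of examples on the outer circle.
    num_outer = math.ceil(batch_size * ratio)
    targets, features = [], []
    for i in range(batch_size):
        is_outer = i < num_outer
        radius = 1.0 if is_outer else factor  # inner circle is scaled by `factor`
        angle = rng.uniform(0.0, 2.0 * math.pi)
        x = radius * math.cos(angle) + rng.gauss(0.0, noise_std)
        y = radius * math.sin(angle) + rng.gauss(0.0, noise_std)
        features.append([x, y])
        targets.append(0 if is_outer else 1)  # label convention chosen for this sketch
    return {"target": targets, "feature": features}


batch = make_circles_batch(batch_size=10, rng=random.Random(0))
```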

startorch.example.CirclesClassification.factor property

factor: float

The scale factor between inner and outer circle.

startorch.example.CirclesClassification.noise_std property

noise_std: float

The standard deviation of the Gaussian noise.

startorch.example.CirclesClassification.ratio property

ratio: float

The ratio between the number of examples in outer circle and inner circle.

startorch.example.CirclesClassificationExampleGenerator

Bases: BaseExampleGenerator

Implement a binary classification example generator where the data are generated with a large circle containing a smaller circle in 2D.

The implementation is based on sklearn.datasets.make_circles.

Parameters:

Name Type Description Default
shuffle bool

If True, the examples are shuffled.

True
noise_std float

The standard deviation of the Gaussian noise.

0.0
factor float

The scale factor between inner and outer circle in the range [0, 1).

0.8
ratio float

The ratio between the number of examples in outer circle and inner circle.

0.5

Raises:

Type Description
TypeError

if one of the parameters has an invalid type.

RuntimeError

if one of the parameters is not valid.

Example usage:

>>> from startorch.example import CirclesClassification
>>> generator = CirclesClassification()
>>> generator
CirclesClassificationExampleGenerator(shuffle=True, noise_std=0.0, factor=0.8, ratio=0.5)
>>> batch = generator.generate(batch_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}

startorch.example.CirclesClassificationExampleGenerator.factor property

factor: float

The scale factor between inner and outer circle.

startorch.example.CirclesClassificationExampleGenerator.noise_std property

noise_std: float

The standard deviation of the Gaussian noise.

startorch.example.CirclesClassificationExampleGenerator.ratio property

ratio: float

The ratio between the number of examples in outer circle and inner circle.

startorch.example.Concatenate

Bases: BaseExampleGenerator

Implement an example generator that concatenates the outputs of multiple example generators.

Note that the last value is used if there are duplicated keys.

Parameters:

Name Type Description Default
generators Sequence[BaseExampleGenerator | dict]

The example generators or their configurations.

required

Example usage:

>>> from startorch.example import TensorExampleGenerator, Concatenate
>>> from startorch.tensor import RandInt, RandUniform
>>> generator = Concatenate(
...     [
...         TensorExampleGenerator(
...             generators={"value": RandUniform(), "time": RandUniform()},
...             size=(6,),
...         ),
...         TensorExampleGenerator(generators={"label": RandInt(0, 10)}),
...     ]
... )
>>> generator
ConcatenateExampleGenerator(
  (0): TensorExampleGenerator(
      (value): RandUniformTensorGenerator(low=0.0, high=1.0)
      (time): RandUniformTensorGenerator(low=0.0, high=1.0)
      (size): (6,)
    )
  (1): TensorExampleGenerator(
      (label): RandIntTensorGenerator(low=0, high=10)
      (size): ()
    )
)
>>> generator.generate(batch_size=10)
{'value': tensor([[...]]), 'time': tensor([[...]]), 'label': tensor([...])}
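
The duplicate-key rule noted above behaves like successive dictionary updates. The sketch below uses plain dictionaries and a hypothetical helper name, concatenate_batches, to show the rule; it is not the library's code.

```python
def concatenate_batches(batches):
    """Merge batches into one dictionary; later batches win on duplicate keys."""
    merged = {}
    for batch in batches:
        merged.update(batch)
    return merged


merged = concatenate_batches(
    [
        {"value": [0.1, 0.2], "time": [1.0, 2.0]},
        {"label": [3, 7], "time": [10.0, 20.0]},  # "time" duplicates a key above
    ]
)
```

Here merged["time"] holds the values from the second batch, so generators with overlapping keys should be ordered with that in mind.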

startorch.example.ConcatenateExampleGenerator

Bases: BaseExampleGenerator

Implement an example generator that concatenates the outputs of multiple example generators.

Note that the last value is used if there are duplicated keys.

Parameters:

Name Type Description Default
generators Sequence[BaseExampleGenerator | dict]

The example generators or their configurations.

required

Example usage:

>>> from startorch.example import TensorExampleGenerator, Concatenate
>>> from startorch.tensor import RandInt, RandUniform
>>> generator = Concatenate(
...     [
...         TensorExampleGenerator(
...             generators={"value": RandUniform(), "time": RandUniform()},
...             size=(6,),
...         ),
...         TensorExampleGenerator(generators={"label": RandInt(0, 10)}),
...     ]
... )
>>> generator
ConcatenateExampleGenerator(
  (0): TensorExampleGenerator(
      (value): RandUniformTensorGenerator(low=0.0, high=1.0)
      (time): RandUniformTensorGenerator(low=0.0, high=1.0)
      (size): (6,)
    )
  (1): TensorExampleGenerator(
      (label): RandIntTensorGenerator(low=0, high=10)
      (size): ()
    )
)
>>> generator.generate(batch_size=10)
{'value': tensor([[...]]), 'time': tensor([[...]]), 'label': tensor([...])}

startorch.example.Friedman1Regression

Bases: BaseExampleGenerator

Implement the "Friedman #1" regression example generator.

The implementation is based on sklearn.datasets.make_friedman1.

Parameters:

Name Type Description Default
feature_size int

The feature size. The feature size has to be greater than or equal to 5. Out of all features, only 5 are actually used to compute the targets. The remaining features are independent of targets.

10
noise_std float

The standard deviation of the Gaussian noise.

0.0

Raises:

Type Description
ValueError

if one of the parameters is not valid.

Example usage:

>>> from startorch.example import Friedman1Regression
>>> generator = Friedman1Regression(feature_size=6)
>>> generator
Friedman1RegressionExampleGenerator(feature_size=6, noise_std=0.0)
>>> batch = generator.generate(batch_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
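
For reference, the "Friedman #1" target documented for sklearn.datasets.make_friedman1 depends only on the first five features, which are drawn uniformly from [0, 1]. The sketch below evaluates that formula in plain Python; the function name friedman1_target is hypothetical and the code is not taken from startorch.

```python
import math
import random


def friedman1_target(x, noise=0.0):
    """Friedman #1 target for one example; only x[0] to x[4] contribute."""
    return (
        10.0 * math.sin(math.pi * x[0] * x[1])
        + 20.0 * (x[2] - 0.5) ** 2
        + 10.0 * x[3]
        + 5.0 * x[4]
        + noise
    )


rng = random.Random(0)
features = [[rng.random() for _ in range(6)] for _ in range(10)]  # uniform in [0, 1)
targets = [friedman1_target(x) for x in features]

# Changing the sixth feature leaves the target unchanged.
modified = features[0][:5] + [0.123]
```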

startorch.example.Friedman1Regression.feature_size property

feature_size: int

The feature size when the data are created.

startorch.example.Friedman1Regression.noise_std property

noise_std: float

The standard deviation of the Gaussian noise.

startorch.example.Friedman1RegressionExampleGenerator

Bases: BaseExampleGenerator

Implement the "Friedman #1" regression example generator.

The implementation is based on sklearn.datasets.make_friedman1.

Parameters:

Name Type Description Default
feature_size int

The feature size. The feature size has to be greater than or equal to 5. Out of all features, only 5 are actually used to compute the targets. The remaining features are independent of targets.

10
noise_std float

The standard deviation of the Gaussian noise.

0.0

Raises:

Type Description
ValueError

if one of the parameters is not valid.

Example usage:

>>> from startorch.example import Friedman1Regression
>>> generator = Friedman1Regression(feature_size=6)
>>> generator
Friedman1RegressionExampleGenerator(feature_size=6, noise_std=0.0)
>>> batch = generator.generate(batch_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}

startorch.example.Friedman1RegressionExampleGenerator.feature_size property

feature_size: int

The feature size when the data are created.

startorch.example.Friedman1RegressionExampleGenerator.noise_std property

noise_std: float

The standard deviation of the Gaussian noise.

startorch.example.Friedman2Regression

Bases: BaseExampleGenerator

Implement the "Friedman #2" regression example generator.

The implementation is based on sklearn.datasets.make_friedman2.

Parameters:

Name Type Description Default
feature_size int

The feature size. The feature size has to be greater than or equal to 4. Out of all features, only 4 are actually used to compute the targets. The remaining features are independent of targets.

4
noise_std float

The standard deviation of the Gaussian noise.

0.0

Raises:

Type Description
ValueError

if one of the parameters is not valid.

Example usage:

>>> from startorch.example import Friedman2Regression
>>> generator = Friedman2Regression(feature_size=6)
>>> generator
Friedman2RegressionExampleGenerator(feature_size=6, noise_std=0.0)
>>> batch = generator.generate(batch_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
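
The "Friedman #2" formula and feature ranges documented for sklearn.datasets.make_friedman2 can be written out as follows. This is a plain-Python sketch with a hypothetical function name, friedman2_example, and is not the startorch implementation.

```python
import math
import random


def friedman2_example(rng, noise_std=0.0):
    """Sample one Friedman #2 example as (features, target)."""
    x0 = rng.uniform(0.0, 100.0)
    x1 = rng.uniform(40.0 * math.pi, 560.0 * math.pi)
    x2 = rng.uniform(0.0, 1.0)
    x3 = rng.uniform(1.0, 11.0)
    target = math.sqrt(x0**2 + (x1 * x2 - 1.0 / (x1 * x3)) ** 2)
    return [x0, x1, x2, x3], target + rng.gauss(0.0, noise_std)


features, target = friedman2_example(random.Random(0))
```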

startorch.example.Friedman2Regression.feature_size property

feature_size: int

The feature size when the data are created.

startorch.example.Friedman2Regression.noise_std property

noise_std: float

The standard deviation of the Gaussian noise.

startorch.example.Friedman2RegressionExampleGenerator

Bases: BaseExampleGenerator

Implement the "Friedman #2" regression example generator.

The implementation is based on sklearn.datasets.make_friedman2.

Parameters:

Name Type Description Default
feature_size int

The feature size. The feature size has to be greater than or equal to 4. Out of all features, only 4 are actually used to compute the targets. The remaining features are independent of targets.

4
noise_std float

The standard deviation of the Gaussian noise.

0.0

Raises:

Type Description
ValueError

if one of the parameters is not valid.

Example usage:

>>> from startorch.example import Friedman2Regression
>>> generator = Friedman2Regression(feature_size=6)
>>> generator
Friedman2RegressionExampleGenerator(feature_size=6, noise_std=0.0)
>>> batch = generator.generate(batch_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}

startorch.example.Friedman2RegressionExampleGenerator.feature_size property

feature_size: int

The feature size when the data are created.

startorch.example.Friedman2RegressionExampleGenerator.noise_std property

noise_std: float

The standard deviation of the Gaussian noise.

startorch.example.Friedman3Regression

Bases: BaseExampleGenerator

Implement the "Friedman #3" regression example generator.

The implementation is based on sklearn.datasets.make_friedman3.

Parameters:

Name Type Description Default
feature_size int

The feature size. The feature size has to be greater than or equal to 4. Out of all features, only 4 are actually used to compute the targets. The remaining features are independent of targets.

4
noise_std float

The standard deviation of the Gaussian noise.

0.0

Raises:

Type Description
ValueError

if one of the parameters is not valid.

Example usage:

>>> from startorch.example import Friedman3Regression
>>> generator = Friedman3Regression(feature_size=6)
>>> generator
Friedman3RegressionExampleGenerator(feature_size=6, noise_std=0.0)
>>> batch = generator.generate(batch_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
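
The "Friedman #3" target documented for sklearn.datasets.make_friedman3 is the arctangent of a ratio of the first four features. The sketch below is plain Python with a hypothetical function name, friedman3_target; it uses atan2 in place of a direct division, which gives the same value whenever the first feature is positive.

```python
import math
import random


def friedman3_target(x):
    """Friedman #3 target for one example; only x[0] to x[3] contribute."""
    numerator = x[1] * x[2] - 1.0 / (x[1] * x[3])
    # atan2 equals atan(numerator / x[0]) for x[0] > 0 and avoids dividing by zero.
    return math.atan2(numerator, x[0])


rng = random.Random(0)
features = [
    [
        rng.uniform(0.0, 100.0),
        rng.uniform(40.0 * math.pi, 560.0 * math.pi),
        rng.uniform(0.0, 1.0),
        rng.uniform(1.0, 11.0),
    ]
    for _ in range(10)
]
targets = [friedman3_target(x) for x in features]
```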

startorch.example.Friedman3Regression.feature_size property

feature_size: int

The feature size when the data are created.

startorch.example.Friedman3Regression.noise_std property

noise_std: float

The standard deviation of the Gaussian noise.

startorch.example.Friedman3RegressionExampleGenerator

Bases: BaseExampleGenerator

Implement the "Friedman #3" regression example generator.

The implementation is based on sklearn.datasets.make_friedman3.

Parameters:

Name Type Description Default
feature_size int

The feature size. The feature size has to be greater than or equal to 4. Out of all features, only 4 are actually used to compute the targets. The remaining features are independent of targets.

4
noise_std float

The standard deviation of the Gaussian noise.

0.0

Raises:

Type Description
ValueError

if one of the parameters is not valid.

Example usage:

>>> from startorch.example import Friedman3Regression
>>> generator = Friedman3Regression(feature_size=6)
>>> generator
Friedman3RegressionExampleGenerator(feature_size=6, noise_std=0.0)
>>> batch = generator.generate(batch_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}

startorch.example.Friedman3RegressionExampleGenerator.feature_size property

feature_size: int

The feature size when the data are created.

startorch.example.Friedman3RegressionExampleGenerator.noise_std property

noise_std: float

The standard deviation of the Gaussian noise.

startorch.example.HypercubeClassification

Bases: BaseExampleGenerator

Implement a classification example generator.

The data are generated by using a hypercube. The targets are some vertices of the hypercube. Each input feature is a one-hot representation of the target plus Gaussian noise. These data can be used for a multi-class classification task.

Parameters:

Name Type Description Default
num_classes int

The number of classes.

50
feature_size int

The feature size. The feature size has to be greater than the number of classes.

64
noise_std float

The standard deviation of the Gaussian noise.

0.2

Raises:

Type Description
ValueError

if one of the parameters is not valid.

Example usage:

>>> from startorch.example import HypercubeClassification
>>> generator = HypercubeClassification(num_classes=5, feature_size=6)
>>> generator
HypercubeClassificationExampleGenerator(num_classes=5, feature_size=6, noise_std=0.2)
>>> batch = generator.generate(batch_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
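
The description above (one-hot representation of the target plus Gaussian noise) can be sketched in plain Python. The function name make_hypercube_batch is hypothetical, lists stand in for tensors, and sampling the targets uniformly is an assumption.

```python
import random


def make_hypercube_batch(batch_size, num_classes=5, feature_size=6, noise_std=0.2, rng=None):
    """Sample noisy one-hot features and their class targets (illustrative only)."""
    if rng is None:
        rng = random.Random()
    targets, features = [], []
    for _ in range(batch_size):
        target = rng.randrange(num_classes)  # assumption: classes are equally likely
        # One-hot vector of length feature_size with a 1 at index `target`.
        one_hot = [1.0 if i == target else 0.0 for i in range(feature_size)]
        # Independent Gaussian noise is added to every coordinate.
        features.append([v + rng.gauss(0.0, noise_std) for v in one_hot])
        targets.append(target)
    return {"target": targets, "feature": features}


# With noise_std=0.0 the largest coordinate always sits at the target index.
batch = make_hypercube_batch(batch_size=10, noise_std=0.0, rng=random.Random(0))
```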

startorch.example.HypercubeClassification.feature_size property

feature_size: int

The feature size when the data are created.

startorch.example.HypercubeClassification.noise_std property

noise_std: float

The standard deviation of the Gaussian noise.

startorch.example.HypercubeClassification.num_classes property

num_classes: int

The number of classes when the data are created.

startorch.example.HypercubeClassificationExampleGenerator

Bases: BaseExampleGenerator

Implement a classification example generator.

The data are generated by using a hypercube. The targets are some vertices of the hypercube. Each input feature is a one-hot representation of the target plus Gaussian noise. These data can be used for a multi-class classification task.

Parameters:

Name Type Description Default
num_classes int

The number of classes.

50
feature_size int

The feature size. The feature size has to be greater than the number of classes.

64
noise_std float

The standard deviation of the Gaussian noise.

0.2

Raises:

Type Description
ValueError

if one of the parameters is not valid.

Example usage:

>>> from startorch.example import HypercubeClassification
>>> generator = HypercubeClassification(num_classes=5, feature_size=6)
>>> generator
HypercubeClassificationExampleGenerator(num_classes=5, feature_size=6, noise_std=0.2)
>>> batch = generator.generate(batch_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}

startorch.example.HypercubeClassificationExampleGenerator.feature_size property

feature_size: int

The feature size when the data are created.

startorch.example.HypercubeClassificationExampleGenerator.noise_std property

noise_std: float

The standard deviation of the Gaussian noise.

startorch.example.HypercubeClassificationExampleGenerator.num_classes property

num_classes: int

The number of classes when the data are created.

startorch.example.LinearRegression

Bases: BaseExampleGenerator

Implement a regression example generator where the data are generated with an underlying linear model.

The implementation is based on sklearn.datasets.make_regression.

Parameters:

Name Type Description Default
weights Tensor | Sequence[float]

The linear weights in the underlying linear model. It must be a float tensor of shape (feature_size,).

required
bias float

The bias term in the underlying linear model.

0.0
noise_std float

The standard deviation of the Gaussian noise.

0.0

Raises:

Type Description
ValueError

if one of the parameters is not valid.

Example usage:

>>> from startorch.example import LinearRegression
>>> generator = LinearRegression.create_uniform_weights()
>>> generator
LinearRegressionExampleGenerator(feature_size=100, bias=0.0, noise_std=0.0)
>>> batch = generator.generate(batch_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
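
The underlying linear model can be sketched as target = weights · feature + bias + noise. The plain-Python sketch below uses a hypothetical function name, make_linear_batch, and assumes the features are drawn from a standard normal distribution, as in sklearn.datasets.make_regression; the library may sample them differently.

```python
import random


def make_linear_batch(weights, bias=0.0, noise_std=0.0, batch_size=1, rng=None):
    """Sample features and compute targets from a linear model (illustrative only)."""
    if rng is None:
        rng = random.Random()
    # Assumption: features follow a standard normal distribution.
    features = [[rng.gauss(0.0, 1.0) for _ in weights] for _ in range(batch_size)]
    targets = [
        sum(w * x for w, x in zip(weights, row)) + bias + rng.gauss(0.0, noise_std)
        for row in features
    ]
    return {"target": targets, "feature": features}


batch = make_linear_batch([1.0, -2.0, 0.5], bias=3.0, batch_size=10, rng=random.Random(0))
```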

startorch.example.LinearRegression.bias property

bias: float

The bias of the underlying linear model.

startorch.example.LinearRegression.feature_size property

feature_size: int

The feature size.

startorch.example.LinearRegression.noise_std property

noise_std: float

The standard deviation of the Gaussian noise.

startorch.example.LinearRegression.weights property

weights: Tensor

torch.Tensor: The weights of the underlying linear model.

startorch.example.LinearRegressionExampleGenerator

Bases: BaseExampleGenerator

Implement a regression example generator where the data are generated with an underlying linear model.

The implementation is based on sklearn.datasets.make_regression.

Parameters:

Name Type Description Default
weights Tensor | Sequence[float]

The linear weights in the underlying linear model. It must be a float tensor of shape (feature_size,).

required
bias float

The bias term in the underlying linear model.

0.0
noise_std float

The standard deviation of the Gaussian noise.

0.0

Raises:

Type Description
ValueError

if one of the parameters is not valid.

Example usage:

>>> from startorch.example import LinearRegression
>>> generator = LinearRegression.create_uniform_weights()
>>> generator
LinearRegressionExampleGenerator(feature_size=100, bias=0.0, noise_std=0.0)
>>> batch = generator.generate(batch_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}

startorch.example.LinearRegressionExampleGenerator.bias property

bias: float

The bias of the underlying linear model.

startorch.example.LinearRegressionExampleGenerator.feature_size property

feature_size: int

The feature size.

startorch.example.LinearRegressionExampleGenerator.noise_std property

noise_std: float

The standard deviation of the Gaussian noise.

startorch.example.LinearRegressionExampleGenerator.weights property

weights: Tensor

torch.Tensor: The weights of the underlying linear model.

startorch.example.MoonsClassification

Bases: BaseExampleGenerator

Implement a binary classification example generator where the data are generated as two interleaving half circles in 2D.

The implementation is based on sklearn.datasets.make_moons.

Parameters:

Name Type Description Default
shuffle bool

If True, the examples are shuffled.

True
noise_std float

The standard deviation of the Gaussian noise.

0.0
ratio float

The ratio between the number of examples in outer circle and inner circle.

0.5

Raises:

Type Description
TypeError

if one of the parameters has an invalid type.

RuntimeError

if one of the parameters has an invalid value.

Example usage:

>>> from startorch.example import MoonsClassification
>>> generator = MoonsClassification()
>>> generator
MoonsClassificationExampleGenerator(shuffle=True, noise_std=0.0, ratio=0.5)
>>> batch = generator.generate(batch_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
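
sklearn.datasets.make_moons, on which this generator is based, produces two interleaving half circles. The plain-Python sketch below follows the half-circle equations used by scikit-learn but samples the angles at random instead of on a regular grid. The function name make_moons_batch is hypothetical, and treating ratio as the fraction of examples on the outer half circle is an assumption.

```python
import math
import random


def make_moons_batch(batch_size, ratio=0.5, noise_std=0.0, rng=None):
    """Sample points on two interleaving half circles (illustrative only)."""
    if rng is None:
        rng = random.Random()
    # Assumption: `ratio` is the fraction of examples on the outer half circle.
    num_outer = math.ceil(batch_size * ratio)
    targets, features = [], []
    for i in range(batch_size):
        is_outer = i < num_outer
        angle = rng.uniform(0.0, math.pi)
        if is_outer:
            x, y = math.cos(angle), math.sin(angle)
        else:
            # The inner half circle is flipped and shifted so the two interleave.
            x, y = 1.0 - math.cos(angle), 0.5 - math.sin(angle)
        features.append([x + rng.gauss(0.0, noise_std), y + rng.gauss(0.0, noise_std)])
        targets.append(0 if is_outer else 1)  # label convention chosen for this sketch
    return {"target": targets, "feature": features}


batch = make_moons_batch(batch_size=10, rng=random.Random(0))
```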

startorch.example.MoonsClassification.noise_std property

noise_std: float

The standard deviation of the Gaussian noise.

startorch.example.MoonsClassification.ratio property

ratio: float

The ratio between the number of examples in outer circle and inner circle.

startorch.example.MoonsClassificationExampleGenerator

Bases: BaseExampleGenerator

Implement a binary classification example generator where the data are generated as two interleaving half circles in 2D.

The implementation is based on sklearn.datasets.make_moons.

Parameters:

Name Type Description Default
shuffle bool

If True, the examples are shuffled.

True
noise_std float

The standard deviation of the Gaussian noise.

0.0
ratio float

The ratio between the number of examples in outer circle and inner circle.

0.5

Raises:

Type Description
TypeError

if one of the parameters has an invalid type.

RuntimeError

if one of the parameters has an invalid value.

Example usage:

>>> from startorch.example import MoonsClassification
>>> generator = MoonsClassification()
>>> generator
MoonsClassificationExampleGenerator(shuffle=True, noise_std=0.0, ratio=0.5)
>>> batch = generator.generate(batch_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}

startorch.example.MoonsClassificationExampleGenerator.noise_std property

noise_std: float

The standard deviation of the Gaussian noise.

startorch.example.MoonsClassificationExampleGenerator.ratio property

ratio: float

The ratio between the number of examples in outer circle and inner circle.

startorch.example.SwissRoll

Bases: BaseExampleGenerator

Implement a manifold example generator based on the Swiss roll pattern.

The implementation is based on sklearn.datasets.make_swiss_roll.

Parameters:

Name Type Description Default
noise_std float

The standard deviation of the Gaussian noise.

0.0
spin float

The number of spins of the Swiss roll.

1.5
hole bool

If True, generate the Swiss roll with a hole.

False

Raises:

Type Description
ValueError

if one of the parameters is not valid.

Example usage:

>>> from startorch.example import SwissRoll
>>> generator = SwissRoll()
>>> generator
SwissRollExampleGenerator(noise_std=0.0, spin=1.5, hole=False)
>>> batch = generator.generate(batch_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
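
A sketch of the Swiss roll construction, following the equations of sklearn.datasets.make_swiss_roll in plain Python. The function name make_swiss_roll_batch is hypothetical. Replacing scikit-learn's fixed factor of 1.5 with the spin parameter is an assumption about how spin enters the formula, and the hole option is not covered.

```python
import math
import random


def make_swiss_roll_batch(batch_size, spin=1.5, noise_std=0.0, rng=None):
    """Sample 3D points on a Swiss roll (illustrative only)."""
    if rng is None:
        rng = random.Random()
    targets, features = [], []
    for _ in range(batch_size):
        # Position along the roll; scikit-learn uses 1.5 where `spin` appears (assumption).
        t = spin * math.pi * (1.0 + 2.0 * rng.random())
        x = t * math.cos(t)
        y = 21.0 * rng.random()  # position across the width of the roll
        z = t * math.sin(t)
        features.append([v + rng.gauss(0.0, noise_std) for v in (x, y, z)])
        targets.append(t)
    return {"target": targets, "feature": features}


batch = make_swiss_roll_batch(batch_size=10, rng=random.Random(0))
```

Without noise, the distance of each point from the y axis equals its target value, which is what makes the target a natural coordinate along the manifold.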

startorch.example.SwissRoll.noise_std property

noise_std: float

The standard deviation of the Gaussian noise.

startorch.example.SwissRoll.spin property

spin: float

The number of spins.

startorch.example.SwissRollExampleGenerator

Bases: BaseExampleGenerator

Implement a manifold example generator based on the Swiss roll pattern.

The implementation is based on sklearn.datasets.make_swiss_roll.

Parameters:

Name Type Description Default
noise_std float

The standard deviation of the Gaussian noise.

0.0
spin float

The number of spins of the Swiss roll.

1.5
hole bool

If True, generate the Swiss roll with a hole.

False

Raises:

Type Description
ValueError

if one of the parameters is not valid.

Example usage:

>>> from startorch.example import SwissRoll
>>> generator = SwissRoll()
>>> generator
SwissRollExampleGenerator(noise_std=0.0, spin=1.5, hole=False)
>>> batch = generator.generate(batch_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}

startorch.example.SwissRollExampleGenerator.noise_std property

noise_std: float

The standard deviation of the Gaussian noise.

startorch.example.SwissRollExampleGenerator.spin property

spin: float

The number of spins.

startorch.example.TensorExampleGenerator

Bases: BaseExampleGenerator

Implement an example generator that generates examples from tensor generators.

Parameters:

Name Type Description Default
generators Mapping[str, BaseTensorGenerator | dict]

The tensor generators or their configurations.

required
size Sequence[int]

The output tensor shape, except for the first dimension, which is set to batch_size.

()

Example usage:

>>> from startorch.example import TensorExampleGenerator
>>> from startorch.tensor import RandInt, RandUniform
>>> generator = TensorExampleGenerator(
...     generators={"value": RandUniform(), "time": RandUniform()},
...     size=(6,),
... )
>>> generator
TensorExampleGenerator(
  (value): RandUniformTensorGenerator(low=0.0, high=1.0)
  (time): RandUniformTensorGenerator(low=0.0, high=1.0)
  (size): (6,)
)
>>> generator.generate(batch_size=10)
{'value': tensor([[...]]), 'time': tensor([[...]])}

startorch.example.TimeSeriesExampleGenerator

Bases: BaseExampleGenerator

Implement an example generator to generate time series.

Parameters:

Name Type Description Default
generators BaseTimeSeriesGenerator | dict

A time series generator or its configuration.

required
seq_len BaseTensorGenerator | dict

The sequence length sampler or its configuration. This sampler is used to sample the sequence length at each batch.

required

Example usage:

>>> from startorch.example import TimeSeriesExampleGenerator
>>> from startorch.timeseries import SequenceTimeSeriesGenerator
>>> from startorch.sequence import RandUniform
>>> from startorch.tensor import RandInt
>>> generator = TimeSeriesExampleGenerator(
...     generators=SequenceTimeSeriesGenerator(
...         {"value": RandUniform(), "time": RandUniform()}
...     ),
...     seq_len=RandInt(2, 5),
... )
>>> generator
TimeSeriesExampleGenerator(
  (generators): SequenceTimeSeriesGenerator(
      (value): RandUniformSequenceGenerator(low=0.0, high=1.0, feature_size=(1,))
      (time): RandUniformSequenceGenerator(low=0.0, high=1.0, feature_size=(1,))
    )
  (seq_len): RandIntTensorGenerator(low=2, high=5)
)
>>> generator.generate(batch_size=10)
{'value': tensor([[...]]), 'time': tensor([[...]])}

startorch.example.TransformExampleGenerator

Bases: BaseExampleGenerator

Implement an example generator that generates examples and then transforms them.

Parameters:

Name Type Description Default
generator BaseExampleGenerator | dict

The example generator or its configuration.

required
transformer BaseTransformer | dict

The data transformer or its configuration.

required

Example usage:

>>> from startorch.example import TransformExampleGenerator, HypercubeClassification
>>> from startorch.transformer import TensorTransformer
>>> from startorch.tensor.transformer import Abs
>>> generator = TransformExampleGenerator(
...     generator=HypercubeClassification(num_classes=5, feature_size=6),
...     transformer=TensorTransformer(
...         transformer=Abs(), input="feature", output="feature_transformed"
...     ),
... )
>>> generator
TransformExampleGenerator(
  (generator): HypercubeClassificationExampleGenerator(num_classes=5, feature_size=6, noise_std=0.2)
  (transformer): TensorTransformer(
      (transformer): AbsTensorTransformer()
      (input): feature
      (output): feature_transformed
      (exist_ok): False
    )
)
>>> generator.generate(batch_size=10)
{'target': tensor([...]), 'feature': tensor([[...]]), 'feature_transformed': tensor([[...]])}
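
The generate-then-transform flow can be sketched in plain Python with lists instead of tensors. This is a minimal illustration, not startorch's implementation: `generate_batch`, `abs_transform`, and `generate_and_transform` are hypothetical helpers, and the behaviour of `exist_ok` (refusing to overwrite an existing key) is an assumption inferred from its name.

```python
def generate_batch(batch_size):
    """Stand-in generator: returns a fixed toy batch (not random)."""
    return {
        "target": [i % 2 for i in range(batch_size)],
        "feature": [[-1.0 * i, 0.5 * i] for i in range(batch_size)],
    }


def abs_transform(batch, input_key="feature", output_key="feature_transformed", exist_ok=False):
    """Store the element-wise absolute value of batch[input_key] under output_key."""
    if output_key in batch and not exist_ok:
        # Assumption: refuse to overwrite an existing key unless exist_ok is True.
        raise KeyError(f"key {output_key!r} already exists")
    result = dict(batch)  # shallow copy so the input batch is left untouched
    result[output_key] = [[abs(v) for v in row] for row in batch[input_key]]
    return result


def generate_and_transform(batch_size):
    """Generate a batch, then apply the transformer to it."""
    return abs_transform(generate_batch(batch_size))


batch = generate_and_transform(batch_size=3)
print(sorted(batch))
```

The original 'feature' entry is kept alongside the transformed one, which matches the three keys shown in the example output above.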

startorch.example.VanillaExampleGenerator

Bases: BaseExampleGenerator

Implement an example generator to "generate" the input data.

Parameters:

Name Type Description Default
data dict[Hashable, Tensor]

The data to generate. The dictionary cannot be empty.

required

Raises:

Type Description
ValueError

if data is an empty dictionary.

Example usage:

>>> import torch
>>> from startorch.example import VanillaExampleGenerator
>>> generator = VanillaExampleGenerator(
...     data={"value": torch.ones(10, 3), "time": torch.arange(10)}
... )
>>> generator
VanillaExampleGenerator(batch_size=10)
>>> generator.generate(batch_size=5)
{'value': tensor([[1., 1., 1.],
                  [1., 1., 1.],
                  [1., 1., 1.],
                  [1., 1., 1.],
                  [1., 1., 1.]]),
 'time': tensor([0, 1, 2, 3, 4])}

startorch.example.is_example_generator_config

is_example_generator_config(config: dict) -> bool

Indicate if the input configuration is a configuration for a BaseExampleGenerator.

This function only checks if the value of the key _target_ is valid. It does not check the other values. If _target_ indicates a function, its return type hint is used to check the class.
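
The check described above can be sketched with the standard library. This is a minimal illustration, not startorch's implementation: `resolve_target` and `is_config_for` are hypothetical helpers, and the sketch covers only the case where `_target_` names a class (not a function with a return type hint).

```python
import importlib


def resolve_target(path):
    """Import the object named by a dotted path such as 'collections.OrderedDict'."""
    module_name, _, attr = path.rpartition(".")
    return getattr(importlib.import_module(module_name), attr)


def is_config_for(config, base):
    """Return True if config['_target_'] names a subclass of ``base``.

    Only the ``_target_`` value is inspected; all other keys are ignored.
    """
    target = config.get("_target_")
    if not isinstance(target, str):
        return False
    try:
        obj = resolve_target(target)
    except (ImportError, AttributeError, ValueError):
        return False
    return isinstance(obj, type) and issubclass(obj, base)


# The extra key is not validated, so this configuration is still accepted.
print(is_config_for({"_target_": "collections.OrderedDict", "unknown": 1}, dict))
```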

Parameters:

Name Type Description Default
config dict

The configuration to check.

required

Returns:

Type Description
bool

True if the input configuration is a configuration for a BaseExampleGenerator object.

Example usage:

>>> from startorch.example import is_example_generator_config
>>> is_example_generator_config({"_target_": "startorch.example.HypercubeClassification"})
True

startorch.example.make_blobs_classification

make_blobs_classification(
    num_examples: int,
    centers: Tensor,
    cluster_std: Tensor | float = 1.0,
    generator: Generator | None = None,
) -> dict[str, Tensor]

Generate a classification dataset where the data are generated from isotropic Gaussian blobs for clustering.

The implementation is based on https://scikit-learn.org/stable/modules/generated/sklearn.datasets.make_blobs.html

Parameters:

Name Type Description Default
num_examples int

The number of examples.

required
centers Tensor

The cluster centers used to generate the examples. It must be a float tensor of shape (num_clusters, feature_size).

required
cluster_std Tensor | float

The standard deviation of the clusters. It must be a float or a float tensor of shape (num_clusters, feature_size).

1.0
generator Generator | None

An optional random number generator.

None

Returns:

Type Description
dict[str, Tensor]

A dictionary with two items:

- 'feature': a torch.Tensor of type float and shape (num_examples, feature_size). This tensor represents the input features.
- 'target': a torch.Tensor of type long and shape (num_examples,). This tensor represents the targets.

Raises:

Type Description
RuntimeError

if one of the parameters is not valid.

Example usage:

>>> import torch
>>> from startorch.example import make_blobs_classification
>>> batch = make_blobs_classification(num_examples=10, centers=torch.rand(5, 2))
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
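
The blob construction can be sketched in plain Python as follows. This is a minimal illustration, not startorch's implementation: `blobs_sketch` is a hypothetical helper, it uses lists instead of tensors, and assigning examples to clusters in turn (without shuffling) is an assumption made to keep the sketch short.

```python
import random


def blobs_sketch(num_examples, centers, cluster_std=1.0, seed=0):
    """Sample points around cluster centers with isotropic Gaussian noise."""
    rng = random.Random(seed)
    features, targets = [], []
    for i in range(num_examples):
        label = i % len(centers)  # assumption: clusters are filled in turn
        center = centers[label]
        # Each coordinate is the center coordinate plus Gaussian noise.
        features.append([rng.gauss(c, cluster_std) for c in center])
        targets.append(label)
    return {"target": targets, "feature": features}


batch = blobs_sketch(num_examples=6, centers=[[0.0, 0.0], [5.0, 5.0]], cluster_std=0.5)
print(batch["target"])
```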

startorch.example.make_circles_classification

make_circles_classification(
    num_examples: int = 100,
    shuffle: bool = True,
    noise_std: float = 0.0,
    factor: float = 0.8,
    ratio: float = 0.5,
    generator: Generator | None = None,
) -> dict[str, Tensor]

Generate a binary classification dataset where the data are generated with a large circle containing a smaller circle in 2d.

The implementation is based on sklearn.datasets.make_circles.

Parameters:

Name Type Description Default
num_examples int

The number of examples.

100
shuffle bool

If True, the examples are shuffled.

True
noise_std float

The standard deviation of the Gaussian noise.

0.0
factor float

The scale factor between inner and outer circle in the range [0, 1).

0.8
ratio float

The ratio between the number of examples in outer circle and inner circle.

0.5
generator Generator | None

An optional random generator.

None

Returns:

Type Description
dict[str, Tensor]

A dictionary with two items:

- 'feature': a torch.Tensor of type float and shape (num_examples, 2). This tensor represents the input features.
- 'target': a torch.Tensor of type long and shape (num_examples,). This tensor represents the targets.

Raises:

Type Description
RuntimeError

if one of the parameters is not valid.

Example usage:

>>> from startorch.example import make_circles_classification
>>> batch = make_circles_classification(num_examples=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
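
The geometry of the two circles can be sketched in plain Python. This is a minimal illustration, not startorch's implementation: `circles_sketch` is a hypothetical helper, noise and shuffling are omitted, and treating `ratio` as the fraction of examples placed on the outer circle is an assumption.

```python
import math


def circles_sketch(num_examples=100, factor=0.8, ratio=0.5):
    """Place points on a unit circle (class 0) and on a circle of radius ``factor`` (class 1)."""
    num_outer = int(num_examples * ratio)  # assumption: ratio is the outer-circle fraction
    num_inner = num_examples - num_outer
    features, targets = [], []
    for i in range(num_outer):
        angle = 2.0 * math.pi * i / num_outer
        features.append((math.cos(angle), math.sin(angle)))
        targets.append(0)
    for i in range(num_inner):
        angle = 2.0 * math.pi * i / num_inner
        features.append((factor * math.cos(angle), factor * math.sin(angle)))
        targets.append(1)
    return {"target": targets, "feature": features}


batch = circles_sketch(num_examples=10, factor=0.5)
print(batch["target"])
```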

startorch.example.make_friedman1_regression

make_friedman1_regression(
    num_examples: int = 100,
    feature_size: int = 10,
    noise_std: float = 0.0,
    generator: Generator | None = None,
) -> dict[str, Tensor]

Generate the "Friedman #1" regression data.

The implementation is based on sklearn.datasets.make_friedman1.

Parameters:

Name Type Description Default
num_examples int

The number of examples.

100
feature_size int

The feature size. The feature size has to be greater than or equal to 5. Out of all features, only 5 are actually used to compute the targets. The remaining features are independent of targets.

10
noise_std float

The standard deviation of the Gaussian noise.

0.0
generator Generator | None

An optional random number generator.

None

Returns:

Type Description
dict[str, Tensor]

A dictionary with two items:

- 'feature': a torch.Tensor of type float and shape (num_examples, feature_size). This tensor represents the input features.
- 'target': a torch.Tensor of type float and shape (num_examples,). This tensor represents the targets.

Raises:

Type Description
RuntimeError

if one of the parameters is not valid.

Example usage:

>>> from startorch.example import make_friedman1_regression
>>> batch = make_friedman1_regression(num_examples=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
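
For reference, the noiseless target documented for sklearn.datasets.make_friedman1 is y = 10 sin(pi x0 x1) + 20 (x2 - 0.5)^2 + 10 x3 + 5 x4, with features drawn uniformly from [0, 1]. The sketch below evaluates that formula in plain Python; `friedman1_target` is a hypothetical helper, not part of startorch.

```python
import math
import random


def friedman1_target(x):
    """Noiseless Friedman #1 target; only the first five features are used."""
    return (
        10.0 * math.sin(math.pi * x[0] * x[1])
        + 20.0 * (x[2] - 0.5) ** 2
        + 10.0 * x[3]
        + 5.0 * x[4]
    )


rng = random.Random(0)
# Ten features per example, as with the default feature_size; features 5..9 are ignored.
features = [[rng.random() for _ in range(10)] for _ in range(4)]
targets = [friedman1_target(row) for row in features]
print(len(targets))
```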

startorch.example.make_friedman2_regression

make_friedman2_regression(
    num_examples: int = 100,
    feature_size: int = 4,
    noise_std: float = 0.0,
    generator: Generator | None = None,
) -> dict[str, Tensor]

Generate the "Friedman #2" regression data.

The implementation is based on sklearn.datasets.make_friedman2.

Parameters:

Name Type Description Default
num_examples int

The number of examples.

100
feature_size int

The feature size. The feature size has to be greater than or equal to 4. Out of all features, only 4 are actually used to compute the targets. The remaining features are independent of targets.

4
noise_std float

The standard deviation of the Gaussian noise.

0.0
generator Generator | None

An optional random number generator.

None

Returns:

Type Description
dict[str, Tensor]

A dictionary with two items:

- 'feature': a torch.Tensor of type float and shape (num_examples, feature_size). This tensor represents the input features.
- 'target': a torch.Tensor of type float and shape (num_examples,). This tensor represents the targets.

Raises:

Type Description
RuntimeError

if one of the parameters is not valid.

Example usage:

>>> from startorch.example import make_friedman2_regression
>>> batch = make_friedman2_regression(num_examples=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
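
For reference, the noiseless target documented for sklearn.datasets.make_friedman2 is y = sqrt(x0^2 + (x1 x2 - 1 / (x1 x3))^2). The sketch below evaluates that formula in plain Python; `friedman2_target` is a hypothetical helper, not part of startorch.

```python
import math


def friedman2_target(x):
    """Noiseless Friedman #2 target; only the first four features are used."""
    inner = x[1] * x[2] - 1.0 / (x[1] * x[3])
    return math.sqrt(x[0] ** 2 + inner**2)


# Toy input chosen so the result is easy to verify by hand: sqrt(3^2 + 4^2).
print(friedman2_target([3.0, 1.0, 5.0, 1.0]))
```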

startorch.example.make_friedman3_regression

make_friedman3_regression(
    num_examples: int = 100,
    feature_size: int = 4,
    noise_std: float = 0.0,
    generator: Generator | None = None,
) -> dict[str, Tensor]

Generate the "Friedman #3" regression problem.

The implementation is based on sklearn.datasets.make_friedman3.

Parameters:

Name Type Description Default
num_examples int

The number of examples.

100
feature_size int

The feature size. The feature size has to be greater than or equal to 4. Out of all features, only 4 are actually used to compute the targets. The remaining features are independent of targets.

4
noise_std float

The standard deviation of the Gaussian noise.

0.0
generator Generator | None

An optional random number generator.

None

Returns:

Type Description
dict[str, Tensor]

A dictionary with two items:

- 'feature': a torch.Tensor of type float and shape (num_examples, feature_size). This tensor represents the input features.
- 'target': a torch.Tensor of type float and shape (num_examples,). This tensor represents the targets.

Raises:

Type Description
RuntimeError

if one of the parameters is not valid.

Example usage:

>>> from startorch.example import make_friedman3_regression
>>> batch = make_friedman3_regression(num_examples=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
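
For reference, the noiseless target documented for sklearn.datasets.make_friedman3 is y = arctan((x1 x2 - 1 / (x1 x3)) / x0). The sketch below evaluates that formula in plain Python; `friedman3_target` is a hypothetical helper, not part of startorch.

```python
import math


def friedman3_target(x):
    """Noiseless Friedman #3 target; only the first four features are used."""
    inner = x[1] * x[2] - 1.0 / (x[1] * x[3])
    return math.atan(inner / x[0])


# Toy input chosen so the argument of arctan is exactly 1.
print(friedman3_target([4.0, 1.0, 5.0, 1.0]))
```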

startorch.example.make_hypercube_classification

make_hypercube_classification(
    num_examples: int = 1000,
    num_classes: int = 50,
    feature_size: int = 64,
    noise_std: float = 0.2,
    generator: Generator | None = None,
) -> dict[str, Tensor]

Generate a synthetic classification dataset based on hypercube vertex structure.

The data are generated by using a hypercube. The targets are some vertices of the hypercube. Each input feature is a one-hot representation of the target plus Gaussian noise. These data can be used for a multi-class classification task.

Parameters:

Name Type Description Default
num_examples int

The number of examples.

1000
num_classes int

The number of classes.

50
feature_size int

The feature size. The feature size has to be greater than the number of classes.

64
noise_std float

The standard deviation of the Gaussian noise.

0.2
generator Generator | None

An optional random generator.

None

Returns:

Type Description
dict[str, Tensor]

A dictionary with two items:

- 'feature': a torch.Tensor of type float and shape (num_examples, feature_size). This tensor represents the input features.
- 'target': a torch.Tensor of type long and shape (num_examples,). This tensor represents the targets.

Raises:

Type Description
RuntimeError

if one of the parameters is not valid.

Example usage:

>>> from startorch.example.hypercube import make_hypercube_classification
>>> batch = make_hypercube_classification(num_examples=10, num_classes=5, feature_size=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
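
The "one-hot plus Gaussian noise" construction can be sketched in plain Python. This is a minimal illustration, not startorch's implementation: `hypercube_sketch` is a hypothetical helper that uses lists instead of tensors, and sampling the targets uniformly is an assumption.

```python
import random


def hypercube_sketch(num_examples, num_classes, feature_size, noise_std=0.2, seed=0):
    """Build features as one-hot encodings of the targets plus Gaussian noise."""
    rng = random.Random(seed)
    # Assumption: class labels are sampled uniformly at random.
    targets = [rng.randrange(num_classes) for _ in range(num_examples)]
    features = []
    for target in targets:
        row = [rng.gauss(0.0, noise_std) for _ in range(feature_size)]
        row[target] += 1.0  # add the one-hot component at the target index
        features.append(row)
    return {"target": targets, "feature": features}


batch = hypercube_sketch(num_examples=10, num_classes=5, feature_size=10)
print(len(batch["feature"]), len(batch["feature"][0]))
```

With a small noise_std, the index of the largest feature usually equals the target, which is what makes the task learnable.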

startorch.example.make_linear_regression

make_linear_regression(
    weights: Tensor,
    bias: float = 0.0,
    num_examples: int = 100,
    noise_std: float = 0.0,
    generator: Generator | None = None,
) -> dict[str, Tensor]

Generate a regression dataset where the data are generated with an underlying linear model.

The features are sampled from a Normal distribution. The targets are then computed by applying the linear model defined by weights and bias, plus optional Gaussian noise.

Parameters:

Name Type Description Default
weights Tensor

The linear weights in the underlying linear model. It must be a float tensor of shape (feature_size,).

required
bias float

The bias term in the underlying linear model.

0.0
num_examples int

The number of examples to generate.

100
noise_std float

The standard deviation of the Gaussian noise.

0.0
generator Generator | None

An optional random generator.

None

Returns:

Type Description
dict[str, Tensor]

A dictionary with two items:

- 'feature': a torch.Tensor of type float and shape (num_examples, feature_size). This tensor represents the input features.
- 'target': a torch.Tensor of type float and shape (num_examples,). This tensor represents the targets.

Raises:

Type Description
RuntimeError

if one of the parameters is not valid.

Example usage:

>>> import torch
>>> from startorch.example import make_linear_regression
>>> batch = make_linear_regression(weights=torch.rand(10), num_examples=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
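
The underlying model, target = feature · weights + bias + noise, can be sketched in plain Python. This is a minimal illustration, not startorch's implementation: `linear_regression_sketch` is a hypothetical helper that uses lists instead of tensors and assumes standard normal features.

```python
import random


def linear_regression_sketch(weights, bias=0.0, num_examples=100, noise_std=0.0, seed=0):
    """Sample standard normal features and compute noisy linear targets."""
    rng = random.Random(seed)
    features = [[rng.gauss(0.0, 1.0) for _ in weights] for _ in range(num_examples)]
    targets = [
        sum(w * x for w, x in zip(weights, row)) + bias + rng.gauss(0.0, noise_std)
        for row in features
    ]
    return {"target": targets, "feature": features}


batch = linear_regression_sketch(weights=[1.0, -2.0, 0.5], bias=3.0, num_examples=10)
print(len(batch["target"]), len(batch["feature"][0]))
```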

startorch.example.make_moons_classification

make_moons_classification(
    num_examples: int = 100,
    shuffle: bool = True,
    noise_std: float = 0.0,
    ratio: float = 0.5,
    generator: Generator | None = None,
) -> dict[str, Tensor]

Generate a binary classification dataset where the data are two interleaving half circles in 2d.

The implementation is based on sklearn.datasets.make_moons.

Parameters:

Name Type Description Default
num_examples int

The number of examples.

100
shuffle bool

If True, the examples are shuffled.

True
noise_std float

The standard deviation of the Gaussian noise.

0.0
ratio float

The ratio between the number of examples in the two half circles.

0.5
generator Generator | None

An optional random generator.

None

Returns:

Type Description
dict[str, Tensor]

A dictionary with two items:

- 'feature': a torch.Tensor of type float and shape (num_examples, 2). This tensor represents the input features.
- 'target': a torch.Tensor of type long and shape (num_examples,). This tensor represents the targets.

Raises:

Type Description
RuntimeError

if one of the parameters is not valid.

Example usage:

>>> from startorch.example import make_moons_classification
>>> batch = make_moons_classification(num_examples=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
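
The two interleaving half circles can be sketched in plain Python, following the construction used by sklearn.datasets.make_moons: the first moon is (cos t, sin t) and the second is (1 - cos t, 0.5 - sin t) for t in [0, pi]. This is a minimal illustration, not startorch's implementation: `moons_sketch` is a hypothetical helper, noise and shuffling are omitted, and treating `ratio` as the fraction of examples in the first moon is an assumption.

```python
import math


def moons_sketch(num_examples=100, ratio=0.5):
    """Place points on two interleaving half circles of radius 1."""
    num_first = int(num_examples * ratio)  # assumption: ratio is the first-moon fraction
    num_second = num_examples - num_first

    def angles(n):
        """Return n evenly spaced angles in [0, pi]."""
        if n == 1:
            return [0.0]
        return [math.pi * i / (n - 1) for i in range(n)]

    features, targets = [], []
    for a in angles(num_first):
        features.append((math.cos(a), math.sin(a)))  # upper half circle around (0, 0)
        targets.append(0)
    for a in angles(num_second):
        features.append((1.0 - math.cos(a), 0.5 - math.sin(a)))  # lower half circle around (1, 0.5)
        targets.append(1)
    return {"target": targets, "feature": features}


batch = moons_sketch(num_examples=10)
print(batch["target"])
```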

startorch.example.make_sparse_uncorrelated_regression

make_sparse_uncorrelated_regression(
    num_examples: int = 100,
    feature_size: int = 4,
    noise_std: float = 0.0,
    generator: Generator | None = None,
) -> dict[str, Tensor]

Generate a random regression problem with sparse uncorrelated design.

The implementation is based on sklearn.datasets.make_sparse_uncorrelated.

Parameters:

Name Type Description Default
num_examples int

The number of examples.

100
feature_size int

The feature size. The feature size has to be greater than or equal to 4. Out of all features, only 4 are actually used to compute the targets. The remaining features are independent of targets.

4
noise_std float

The standard deviation of the Gaussian noise.

0.0
generator Generator | None

An optional random generator.

None

Returns:

Type Description
dict[str, Tensor]

A dictionary with two items:

- 'feature': a torch.Tensor of type float and shape (num_examples, feature_size). This tensor represents the input features.
- 'target': a torch.Tensor of type float and shape (num_examples,). This tensor represents the targets.

Raises:

Type Description
RuntimeError

if one of the parameters is not valid.

Example usage:

>>> from startorch.example import make_sparse_uncorrelated_regression
>>> data = make_sparse_uncorrelated_regression(num_examples=10)
>>> data
{'target': tensor([...]), 'feature': tensor([[...]])}
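
For reference, sklearn.datasets.make_sparse_uncorrelated draws standard normal features and centers the target on x0 + 2 x1 - 2 x2 - 1.5 x3. The sketch below evaluates that noiseless mean in plain Python; `sparse_uncorrelated_target` is a hypothetical helper, not part of startorch.

```python
import random


def sparse_uncorrelated_target(x):
    """Noiseless target; only the first four features are used."""
    return x[0] + 2.0 * x[1] - 2.0 * x[2] - 1.5 * x[3]


rng = random.Random(0)
# Six standard normal features per example; features 4 and 5 do not affect the target.
features = [[rng.gauss(0.0, 1.0) for _ in range(6)] for _ in range(4)]
targets = [sparse_uncorrelated_target(row) for row in features]
print(len(targets))
```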

startorch.example.make_swiss_roll

make_swiss_roll(
    num_examples: int = 100,
    noise_std: float = 0.0,
    spin: float = 1.5,
    hole: bool = False,
    generator: Generator | None = None,
) -> dict[str, Tensor]

Generate a toy manifold dataset based on the Swiss roll pattern.

The implementation is based on sklearn.datasets.make_swiss_roll.

Parameters:

Name Type Description Default
num_examples int

The number of examples.

100
noise_std float

The standard deviation of the Gaussian noise.

0.0
spin float

The number of spins of the Swiss roll.

1.5
hole bool

If True, generates the Swiss roll with a hole.

False
generator Generator | None

An optional random generator.

None

Returns:

Type Description
dict[str, Tensor]

A dictionary with two items:

- 'feature': a torch.Tensor of type float and shape (num_examples, 3). This tensor represents the input features.
- 'target': a torch.Tensor of type float and shape (num_examples,). This tensor represents the targets.

Raises:

Type Description
RuntimeError

if one of the parameters is not valid.

Example usage:

>>> from startorch.example import make_swiss_roll
>>> batch = make_swiss_roll(num_examples=10)
>>> batch
{'target': tensor([...]), 'feature': tensor([[...]])}
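
The Swiss roll construction can be sketched in plain Python, following sklearn.datasets.make_swiss_roll: t = 1.5 pi (1 + 2u), x = t cos t, y = 21 v, z = t sin t, with u and v uniform in [0, 1] and t returned as the target. This is a minimal illustration, not startorch's implementation: `swiss_roll_sketch` is a hypothetical helper, the `hole` option is omitted, and letting `spin` replace the constant 1.5 is an assumption.

```python
import math
import random


def swiss_roll_sketch(num_examples=100, noise_std=0.0, spin=1.5, seed=0):
    """Sample 3d points on a Swiss roll; the target is the position along the roll."""
    rng = random.Random(seed)
    features, targets = [], []
    for _ in range(num_examples):
        # Assumption: spin plays the role of the constant 1.5 in sklearn's formula.
        t = spin * math.pi * (1.0 + 2.0 * rng.random())
        point = (t * math.cos(t), 21.0 * rng.random(), t * math.sin(t))
        features.append(tuple(v + rng.gauss(0.0, noise_std) for v in point))
        targets.append(t)
    return {"target": targets, "feature": features}


batch = swiss_roll_sketch(num_examples=10)
print(len(batch["feature"]), len(batch["feature"][0]))
```

Without noise, the distance of each point from the y axis equals its target, which is why the target is a natural coordinate along the manifold.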

startorch.example.setup_example_generator

setup_example_generator(
    generator: BaseExampleGenerator | dict,
) -> BaseExampleGenerator

Set up an example generator.

The example generator is instantiated from its configuration by using the BaseExampleGenerator factory function.

Parameters:

Name Type Description Default
generator BaseExampleGenerator | dict

An example generator or its configuration.

required

Returns:

Type Description
BaseExampleGenerator

An example generator.

Example usage:

>>> from startorch.example import setup_example_generator
>>> generator = setup_example_generator(
...     {"_target_": "startorch.example.HypercubeClassification"}
... )
>>> generator
HypercubeClassificationExampleGenerator(num_classes=50, feature_size=64, noise_std=0.2)
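
The "object or configuration" pattern can be sketched with the standard library. This is a minimal illustration, not startorch's implementation: `setup_sketch` is a hypothetical helper, and it assumes that the keys other than `_target_` are passed as keyword arguments to the target. A standard-library class stands in for an example generator.

```python
import importlib
from datetime import timedelta


def setup_sketch(obj_or_config):
    """Return the object as is, or instantiate it from a ``_target_`` configuration."""
    if not isinstance(obj_or_config, dict):
        return obj_or_config  # already an object: nothing to do
    kwargs = dict(obj_or_config)  # copy so the caller's configuration is not modified
    module_name, _, attr = kwargs.pop("_target_").rpartition(".")
    factory = getattr(importlib.import_module(module_name), attr)
    # Assumption: the remaining keys are the keyword arguments of the target.
    return factory(**kwargs)


config = {"_target_": "datetime.timedelta", "days": 1, "hours": 2}
from_config = setup_sketch(config)
from_object = setup_sketch(timedelta(days=1, hours=2))
print(from_config == from_object)
```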