lir.datasets.synthesized_normal_multiclass module

class lir.datasets.synthesized_normal_multiclass.SynthesizedDimension(population_mean: float, population_std: float, sources_std: float)

Bases: NamedTuple

Representation of a data distribution.

population_mean: float

Alias for field number 0

population_std: float

Alias for field number 1

sources_std: float

Alias for field number 2
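
As a NamedTuple, an instance can be read by field name or by position, which is what the "Alias for field number N" entries refer to. The sketch below uses a local stand-in class that copies only the documented field names and types; it is not imported from lir, and the numeric values are arbitrary.

```python
from typing import NamedTuple


class SynthesizedDimension(NamedTuple):
    """Local stand-in mirroring the documented fields; not the lir class itself."""

    population_mean: float
    population_std: float
    sources_std: float


# Arbitrary illustrative values.
dim = SynthesizedDimension(population_mean=0.0, population_std=1.0, sources_std=0.25)

# Fields are reachable by name or by position (field number 0, 1, 2).
first_by_name = dim.population_mean
first_by_index = dim[0]

# Tuples unpack in declaration order.
mean, pop_std, src_std = dim

# NamedTuples are immutable; _replace returns a modified copy.
wider = dim._replace(sources_std=0.5)
```

Because instances are immutable, one dimension definition can be reused across several providers without the risk of one modifying it for the others.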

class lir.datasets.synthesized_normal_multiclass.SynthesizedNormalMulticlassData(dimensions: list[SynthesizedDimension], population_size: int, sources_size: int, seed: int | None)

Bases: DataProvider

Implementation of a data source generating normally distributed multiclass data.

Parameters:
  • dimensions (list[SynthesizedDimension]) – Distribution settings for the features, one SynthesizedDimension per feature dimension.

  • population_size (int) – Number of sources to sample in the synthetic population.

  • sources_size (int) – Number of source groups represented in the dataset.

  • seed (int | None) – Random seed controlling stochastic behaviour for reproducible results.

get_instances() → FeatureData

Return instances with randomly synthesized data and multi-class labels.

The features are drawn from a normal distribution, as configured.

Returns:

FeatureData object containing the synthesized features and their multi-class labels.

Return type:

FeatureData
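
To illustrate the kind of data such a provider could produce, here is a minimal standard-library sketch of a two-level normal model. It is an assumption, not the lir implementation: each source gets its own mean per dimension, drawn around population_mean with spread population_std, and each instance scatters around its source mean with spread sources_std. The function synthesize and its parameters n_sources and instances_per_source are hypothetical names; how the documented population_size and sources_size map onto these counts is not stated here. SynthesizedDimension is again a local stand-in, and the result is a pair of plain lists rather than a FeatureData object.

```python
import random
from typing import NamedTuple


class SynthesizedDimension(NamedTuple):
    """Local stand-in with the documented fields; not imported from lir."""

    population_mean: float
    population_std: float
    sources_std: float


def synthesize(dimensions, n_sources, instances_per_source, seed=None):
    """Return (features, labels) drawn from an assumed two-level normal model.

    n_sources and instances_per_source are hypothetical names for this sketch.
    """
    rng = random.Random(seed)  # seed=None gives a different sample on each run
    features, labels = [], []
    for source_id in range(n_sources):
        # Assumed step 1: each source gets its own mean per dimension,
        # drawn around the population mean.
        source_means = [
            rng.gauss(d.population_mean, d.population_std) for d in dimensions
        ]
        for _ in range(instances_per_source):
            # Assumed step 2: instances scatter around their source mean.
            row = [
                rng.gauss(m, d.sources_std)
                for m, d in zip(source_means, dimensions)
            ]
            features.append(row)
            labels.append(source_id)  # the class label is the source index
    return features, labels


dims = [
    SynthesizedDimension(population_mean=0.0, population_std=1.0, sources_std=0.1),
    SynthesizedDimension(population_mean=5.0, population_std=2.0, sources_std=0.5),
]
features, labels = synthesize(dims, n_sources=3, instances_per_source=4, seed=42)
```

Calling synthesize again with the same seed yields identical features and labels, which is the reproducibility the seed parameter is documented to provide.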