lir.datasets.synthesized_normal_binary module

class lir.datasets.synthesized_normal_binary.SynthesizedNormalBinaryData(h1_params: SynthesizedNormalData, h2_params: SynthesizedNormalData, seed: int | None = None)

Bases: DataProvider

Data provider that generates normally distributed data for two classes.

Parameters:
  • h1_params (SynthesizedNormalData) – Distribution parameters used to sample class-1 data.

  • h2_params (SynthesizedNormalData) – Distribution parameters used to sample class-2 data.

  • seed (int | None) – Random seed controlling stochastic behaviour for reproducible results.
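The effect of seed can be illustrated with NumPy alone. The sketch below assumes the seed ends up in numpy.random.default_rng, which this page does not confirm. The helper name draw is hypothetical. It only shows that an integer seed makes the draws repeatable, while None requests fresh entropy on every call.

```python
from typing import Optional

import numpy as np


def draw(seed: Optional[int], n: int = 5) -> np.ndarray:
    """Hypothetical helper: seed a generator and draw n standard-normal values."""
    # Assumption: the provider builds its generator in roughly this way.
    rng = np.random.default_rng(seed)
    return rng.normal(loc=0.0, scale=1.0, size=n)


same_a = draw(seed=42)
same_b = draw(seed=42)
fresh_a = draw(seed=None)
fresh_b = draw(seed=None)

# Same integer seed: the two arrays are identical.
print(np.array_equal(same_a, same_b))
# seed=None: each call is seeded from OS entropy, so the arrays differ.
print(np.array_equal(fresh_a, fresh_b))
```

Passing a fixed integer is the usual choice for tests and debugging runs. None suits cases where every run should see new data.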

get_instances() -> FeatureData

Return instances with randomly synthesized data and binary labels.

The features of each class are drawn from the normal distribution configured for that class.

Returns:

FeatureData object containing the synthesized features and their binary labels.

Return type:

FeatureData
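As a rough illustration of what such a provider produces, the following sketch builds features and binary labels with NumPy only. It is not the library's implementation: NormalSpec and synthesize_binary are hypothetical stand-ins, the result is a plain tuple rather than a FeatureData object, and the label encoding (1 for h1 rows, 0 for h2 rows) is an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union

import numpy as np


@dataclass
class NormalSpec:
    """Hypothetical stand-in mirroring the documented mean/std/size parameters."""
    mean: float
    std: float
    size: Union[int, Tuple[int, int]]


def synthesize_binary(
    h1: NormalSpec, h2: NormalSpec, seed: Optional[int] = None
) -> Tuple[np.ndarray, np.ndarray]:
    """Return (features, labels) for two normally distributed classes."""
    rng = np.random.default_rng(seed)
    x1 = rng.normal(h1.mean, h1.std, h1.size)
    x2 = rng.normal(h2.mean, h2.std, h2.size)
    # Stack the h1 rows first, then the h2 rows.
    features = np.concatenate([x1, x2])
    # Assumed encoding: label 1 marks h1 rows, label 0 marks h2 rows.
    labels = np.concatenate(
        [np.ones(len(x1), dtype=int), np.zeros(len(x2), dtype=int)]
    )
    return features, labels


features, labels = synthesize_binary(
    NormalSpec(mean=1.0, std=1.0, size=100),
    NormalSpec(mean=-1.0, std=1.0, size=50),
    seed=0,
)
print(features.shape)  # (150,)
print(labels.sum())    # 100
```

Well-separated means, as in this call, give an easy classification problem. Moving the means closer together or widening std makes the classes overlap more.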

class lir.datasets.synthesized_normal_binary.SynthesizedNormalData(mean: float, std: float, size: int | tuple[int, int])

Bases: object

Configuration of a normal distribution from which data is sampled using a random number generator.

Data generated this way is useful for debugging and for gaining insight into the effect of varying parts of the LR system pipeline.

Parameters:
  • mean (float) – Mean value of the generated normal distribution.

  • std (float) – Standard deviation of the generated normal distribution.

  • size (int | tuple[int, int]) – Number of samples to generate, given as a single count or as a two-integer tuple.

get(rng: Generator) -> ndarray

Draw random samples from the configured normal distribution.

Parameters:

rng (numpy.random.Generator) – Random number generator used for sampling.

Returns:

Randomly sampled values for this distribution configuration.

Return type:

np.ndarray
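A minimal sketch of how such a draw could look with NumPy follows. It assumes that mean, std and size map directly onto the loc, scale and size arguments of numpy.random.Generator.normal. That mapping is a guess, not something this page states, and sample_normal is a hypothetical helper rather than part of lir.

```python
from typing import Tuple, Union

import numpy as np


def sample_normal(
    rng: np.random.Generator,
    mean: float,
    std: float,
    size: Union[int, Tuple[int, int]],
) -> np.ndarray:
    """Hypothetical helper: draw normal samples with the given parameters."""
    # Assumption: the documented parameters correspond to loc, scale and size.
    return rng.normal(loc=mean, scale=std, size=size)


rng = np.random.default_rng(123)

# An integer size yields a one-dimensional array.
flat = sample_normal(rng, mean=0.0, std=1.0, size=1000)
print(flat.shape)  # (1000,)

# A two-integer tuple yields a two-dimensional array of that shape.
grid = sample_normal(rng, mean=5.0, std=2.0, size=(1000, 3))
print(grid.shape)  # (1000, 3)

# The sample statistics should land close to the configured values.
print(float(grid.mean()), float(grid.std()))
```

Passing the generator in from the caller lets one seed control several distributions in sequence.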