lir.transform.pairing module
- class lir.transform.pairing.InstancePairing(same_source_limit: int | None = None, different_source_limit: int | None = None, ratio_limit: float | None = None, seed: int | None = None)[source]
Bases: PairingMethod
Construct pairs from a set of instances.
Note that this pairing method may cause performance problems with large datasets, even if the number of instances in the output is limited.
The ratio is ds pairs / ss pairs, i.e. the number of different-source pairs divided by the number of same-source pairs. The number of ds pairs will not exceed ratio_limit * ss pairs. If both ratio_limit and same_source_limit / different_source_limit are specified, the number of pairs is chosen such that the ratio_limit is preserved and the limit(s) are not exceeded, while taking as many pairs as possible within these constraints.
- Parameters:
same_source_limit (int | None) – Limit for the number or fraction of same-source pairs.
different_source_limit (int | None) – Limit for the number or fraction of different-source pairs.
ratio_limit (float | None) – Maximum allowed ratio of different-source pairs to same-source pairs.
seed (int | None) – Random seed controlling stochastic behaviour for reproducible results.
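The interaction between the limits and the ratio described above can be illustrated with a small sketch. It assumes that ratio_limit acts as an upper bound on ds pairs / ss pairs and that the limits are absolute counts. The helper `choose_pair_counts` is hypothetical and is not part of `lir`; the library's own selection logic may differ.

```python
import math


def choose_pair_counts(n_ss, n_ds, same_source_limit=None,
                       different_source_limit=None, ratio_limit=None):
    """Hypothetical helper: decide how many ss and ds pairs to keep.

    n_ss and n_ds are the numbers of available same-source and
    different-source pairs before any limit is applied.
    """
    # Apply the absolute limits first.
    if same_source_limit is not None:
        n_ss = min(n_ss, same_source_limit)
    if different_source_limit is not None:
        n_ds = min(n_ds, different_source_limit)
    # Then cap the ds pairs so that n_ds / n_ss does not exceed ratio_limit.
    if ratio_limit is not None:
        n_ds = min(n_ds, math.floor(ratio_limit * n_ss))
    return n_ss, n_ds


# 100 ss pairs and 10000 ds pairs available, at most 2 ds pairs per ss pair.
ratio_only = choose_pair_counts(100, 10000, ratio_limit=2.0)

# The same data with additional absolute limits on both pair types.
all_limits = choose_pair_counts(100, 10000, same_source_limit=50,
                                different_source_limit=80, ratio_limit=2.0)
```

Keeping as many same-source pairs as the limits allow and only then capping the different-source pairs yields the largest total that satisfies all three constraints under this reading.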
- pair(instances: InstanceData, n_trace_instances: int = 1, n_ref_instances: int = 1) PairedFeatureData[source]
Construct pairs.
- Parameters:
instances (InstanceData) – Input instances to be processed by this method.
n_trace_instances (int) – Number of trace instances to include in each pairing.
n_ref_instances (int) – Number of reference instances to include in each pairing.
- Returns:
PairedFeatureData object containing the constructed pairs.
- Return type:
PairedFeatureData
- property rng: Generator
Obtain a random number generator using a provided seed.
- Returns:
Random number generator initialized from the configured seed.
- Return type:
np.random.Generator
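As a brief illustration of why a seed gives reproducible pair selection, the sketch below builds two NumPy generators from the same seed and draws the same permutation from each. The seed value is arbitrary, and whether the rng property creates a new generator on each access or reuses one is not stated here, so this shows only the general NumPy behaviour.

```python
import numpy as np

seed = 42  # arbitrary example value

# Two generators built from the same seed produce the same stream.
first = np.random.default_rng(seed).permutation(10)
second = np.random.default_rng(seed).permutation(10)
same = bool(np.array_equal(first, second))

# With seed=None, NumPy draws fresh entropy from the operating system,
# so results are not reproducible between runs.
unseeded = np.random.default_rng(None)
```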
- class lir.transform.pairing.PairingMethod[source]
Bases: ABC
Base class for pairing methods.
A pairing method should implement the pair() function.
- abstractmethod pair(instances: InstanceData, n_trace_instances: int = 1, n_ref_instances: int = 1) PairedFeatureData[source]
Take instances as input, and return pairs.
A pair may be a pair of sources, with multiple instances per source.
The returned features have dimensions (p, i, …), where the first dimension is the pairs, the second dimension is the instances, and subsequent dimensions are the features. If the input has labels, the returned labels are an array of source labels, one label per pair, where the labels are 0=different source, 1=same source. Any other attributes are combined into tuples.
- Parameters:
instances (InstanceData) – Input instances to be processed by this method.
n_trace_instances (int) – Number of trace instances to include in each pairing.
n_ref_instances (int) – Number of reference instances to include in each pairing.
- Returns:
PairedFeatureData object containing the constructed pairs.
- Return type:
PairedFeatureData
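The (p, i, …) layout and the 0/1 pair labels described above can be made concrete with plain NumPy arrays. This is a sketch on toy data, assuming one trace instance and one reference instance per pair; it does not use `InstanceData` or `PairedFeatureData`, and the variable names are illustrative only.

```python
import itertools

import numpy as np

# Toy input: 4 instances with 3 features each, from 2 sources.
features = np.arange(12, dtype=float).reshape(4, 3)
source_ids = np.array([0, 0, 1, 1])

# All unordered pairs of instance indices, shape (6, 2).
idx = np.array(list(itertools.combinations(range(len(features)), 2)))

# Fancy indexing yields shape (p, i, f) = (6 pairs, 2 instances, 3 features).
paired_features = features[idx]

# One label per pair: 1 = same source, 0 = different source.
pair_labels = (source_ids[idx[:, 0]] == source_ids[idx[:, 1]]).astype(int)
```

The number of candidate pairs grows quadratically with the number of instances, which is why enumerating them can be costly on large inputs.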
- class lir.transform.pairing.SourcePairing(same_source_limit: int | None = None, different_source_limit: int | None = None, ratio_limit: int | None = None, seed: Any | int = None)[source]
Bases: PairingMethod
Construct pairs of sources (i.e. classes) from an array of instances.
While pairing at instance level results in pairs of instances, some same-source and some different-source, pairing at source level results in pairing of multiple instances of source A against multiple instances of source B, where A and B can be same-source or different-source.
- Parameters:
same_source_limit (int | None) – Limit for the number or fraction of same-source pairs.
different_source_limit (int | None) – Limit for the number or fraction of different-source pairs.
ratio_limit (int | None) – Maximum allowed ratio of different-source pairs to same-source pairs.
seed (Any | int) – Random seed controlling stochastic behaviour for reproducible results.
- pair(instances: InstanceData, n_trace_instances: int = 1, n_ref_instances: int = 1) PairedFeatureData[source]
Pair sources.
Takes a FeatureData object that contains instances. Returns pairs as a PairedFeatureData object.
The input is expected to have source_ids, which govern how pairs are compiled. Each input instance may be used in a pair as either a trace instance or a reference instance.
- Parameters:
instances (InstanceData) – Input instances to be processed by this method.
n_trace_instances (int) – Number of trace instances to include in each pairing.
n_ref_instances (int) – Number of reference instances to include in each pairing.
- Returns:
PairedFeatureData object containing the constructed pairs.
- Return type:
PairedFeatureData
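To illustrate pairing at source level, the sketch below groups toy instances by source id and, for every pair of sources, draws n_trace instances from the first source and n_ref from the second. The function `pair_sources` is hypothetical and is not the `SourcePairing` implementation. It rests on three assumptions: trace instances are placed before reference instances, a same-source pair uses disjoint instances for trace and reference, and sources with too few instances are skipped.

```python
import itertools

import numpy as np


def pair_sources(features, source_ids, n_trace=1, n_ref=1, seed=None):
    """Hypothetical sketch of source-level pairing on plain arrays."""
    rng = np.random.default_rng(seed)
    # Map each source id to the indices of its instances.
    by_source = {s: np.flatnonzero(source_ids == s) for s in np.unique(source_ids)}
    pairs, labels = [], []
    for a, b in itertools.combinations_with_replacement(by_source, 2):
        if a == b:
            # Same source: trace and reference instances must not overlap.
            if len(by_source[a]) < n_trace + n_ref:
                continue
            chosen = rng.choice(by_source[a], size=n_trace + n_ref, replace=False)
        else:
            if len(by_source[a]) < n_trace or len(by_source[b]) < n_ref:
                continue
            trace = rng.choice(by_source[a], size=n_trace, replace=False)
            ref = rng.choice(by_source[b], size=n_ref, replace=False)
            chosen = np.concatenate([trace, ref])
        pairs.append(features[chosen])
        labels.append(int(a == b))
    # Shape (p, n_trace + n_ref, f); requires at least one valid pair.
    return np.stack(pairs), np.array(labels)


# Toy input: 2 sources with 3 instances each, 3 features per instance.
features = np.arange(18, dtype=float).reshape(6, 3)
source_ids = np.array([0, 0, 0, 1, 1, 1])
paired, labels = pair_sources(features, source_ids, n_trace=1, n_ref=2, seed=0)
```

With two sources there are three source pairs: (0, 0), (0, 1) and (1, 1). Each holds one trace instance and two reference instances.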