lir.data_strategies.pairs module
- class lir.data_strategies.pairs.PairsTrainTestSplit(test_size: float | int, seed: int | None = None)[source]
Bases: DataStrategy
A train/test split policy for paired instances.
The input data should have source_ids with two columns. This split assigns every source to either the training set or the test set. A pair is assigned to the training set or the test set only if both of its sources have that role. Pairs with mixed roles are omitted.
- Parameters:
- apply(instances: DataType) → Iterator[tuple[DataType, DataType]][source]
Split the data into a training set and a test set.
- Parameters:
instances (InstanceDataType) – Input instances to be processed by this method.
- Yields:
tuple[DataType, DataType] – An iterator over a single item, which is a tuple of the training set and the test set.
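The source-level split rule described for this class can be illustrated with a small standalone sketch. It does not use lir and is not the library's implementation. The function name split_pairs_by_source, the representation of source_ids as a list of 2-tuples, and the reading of test_size (a float as a fraction of sources, an int as an absolute number of sources) are assumptions made for illustration only.

```python
import math
import random


def split_pairs_by_source(source_ids, test_size, seed=None):
    """Hypothetical helper: split pair indices by the role of their sources.

    source_ids: sequence of (source_a, source_b) tuples, one per pair.
    test_size:  assumed here to be a fraction of sources (float) or an
                absolute number of sources (int).
    Returns (train_indices, test_indices); mixed pairs appear in neither.
    """
    # Collect the distinct sources; sort them so a fixed seed is reproducible.
    sources = sorted({s for pair in source_ids for s in pair})

    if isinstance(test_size, float):
        n_test = math.ceil(test_size * len(sources))
    else:
        n_test = test_size

    # Assign each source exactly one role: test if sampled, otherwise train.
    test_sources = set(random.Random(seed).sample(sources, n_test))

    train_indices, test_indices = [], []
    for i, (a, b) in enumerate(source_ids):
        a_is_test = a in test_sources
        b_is_test = b in test_sources
        if a_is_test and b_is_test:
            test_indices.append(i)
        elif not a_is_test and not b_is_test:
            train_indices.append(i)
        # Pairs whose sources have mixed roles are omitted.
    return train_indices, test_indices


pairs = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 3)]
train_indices, test_indices = split_pairs_by_source(pairs, test_size=0.4, seed=0)
print("train pairs:", [pairs[i] for i in train_indices])
print("test pairs:", [pairs[i] for i in test_indices])
```

Because roles are assigned to sources rather than to pairs, no source contributes to both sets. The cost is that the two sets together can contain fewer pairs than the input.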
- lir.data_strategies.pairs.is_valid_input(instances: InstanceData) → bool[source]
Return True iff pair-based strategies can be applied.