lir package

Subpackages

Submodules

lir.aggregation module

class lir.aggregation.AggregatePlot(plot_fn: Callable, plot_name: str, output_dir: Path | None = None, **kwargs: Any)

Bases: Aggregation

Aggregation that generates plots by repeatedly calling a plotting function.

report(data: AggregationData) → None: Plot the data when new results are available.

class lir.aggregation.Aggregation

Bases: ABC

Base representation of an aggregated data collection.

Other classes may extend from this class.

close() → None

Finalize the aggregation; no more results will come in.

The close method is called once all aggregations have been gathered, so that files are closed, buffers are cleared, and any other teardown is completed.

abstractmethod report(data: AggregationData) → None

Report that new results are available.

Parameters: data – a named tuple containing the results
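The report/close lifecycle described above can be sketched as follows. The `Aggregation` class here is a local stand-in that only mirrors the documented method signatures, and `CollectParameters` is a hypothetical subclass; neither is taken from the lir source. The default `close()` is assumed to be a no-op.

```python
from abc import ABC, abstractmethod
from types import SimpleNamespace
from typing import Any


class Aggregation(ABC):
    """Local stand-in mirroring the documented interface (not the lir class)."""

    @abstractmethod
    def report(self, data: Any) -> None:
        """Called each time new results are available."""

    def close(self) -> None:
        """Called once after the last report; assumed to be a no-op by default."""


class CollectParameters(Aggregation):
    """Hypothetical aggregation that records the parameters of each result."""

    def __init__(self) -> None:
        self.seen = []
        self.closed = False

    def report(self, data: Any) -> None:
        # `data` is assumed to expose a `parameters` mapping, as AggregationData does.
        self.seen.append(dict(data.parameters))

    def close(self) -> None:
        # Nothing to release here; a file-backed aggregation would close its file.
        self.closed = True


agg = CollectParameters()
agg.report(SimpleNamespace(parameters={"bins": 10}))
agg.report(SimpleNamespace(parameters={"bins": 20}))
agg.close()
```

After the run, `agg.seen` holds one parameter dictionary per reported result, in the order they were reported.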

class lir.aggregation.AggregationData(llrdata: LLRData, lrsystem: LRSystem, parameters: dict[str, Any])

Bases: NamedTuple

Representation of aggregated data.

Fields:

llrdata – the LLR data containing LLRs and labels
lrsystem – the model that produced the results
parameters – parameters that identify the system producing the results

llrdata: LLRData: Alias for field number 0

lrsystem: LRSystem: Alias for field number 1

parameters: dict[str, Any]: Alias for field number 2

class lir.aggregation.WriteMetricsToCsv(output_dir: Path, columns: Mapping[str, Callable])

Bases: Aggregation

Helper class to write aggregated results to CSV file.

close() → None: Ensure the CSV file is properly closed after writing.

report(data: AggregationData) → None: Write the metrics to CSV.

lir.bounding module

class lir.bounding.LLRBounder(lower_llr_bound: float | None = None, upper_llr_bound: float | None = None)

Bases: Transformer, ABC

Base class for LLR bounders.

A bounder updates any LLRs that are out of bounds. Any LLR values within bounds remain unchanged. LLR values that are out-of-bounds are updated to the nearest bound.

apply(instances: InstanceData) → LLRData: Recalculate the LLR data using the first step calibrator and applying the bounds.

abstractmethod calculate_bounds(llrdata: LLRData) → tuple[float | None, float | None]: Calculate and return appropriate bounds for a set of LLRs and their labels.

fit(instances: InstanceData) → Self

Configure this bounder by calculating bounds.

Labels are assumed to follow the convention that y=1 corresponds to Hp and y=0 to Hd.

class lir.bounding.NSourceBounder(lower_llr_bound: float | None = None, upper_llr_bound: float | None = None)

Bases: LLRBounder

Bound LLRs based on the number of sources.

This bounder sets the lower LLR bound to -log(N) and the upper bound to log(N), where N is the number of sources.

In non-log space, this corresponds to bounding likelihood ratios to [1/N, N]. This is a logical consequence of having N sources: the evidence cannot support one hypothesis over the other by more than a factor of N.

calculate_bounds(llrdata: LLRData) → tuple[float | None, float | None]: Calculate and return the lower and upper LLR bounds.
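As a worked example of the bounds described above, the sketch below computes them for a given number of sources. The log base is not stated in the text; base 10 is assumed here, matching the base-10 log odds used in lir.util. The function name `n_source_bounds` is hypothetical.

```python
import math
from typing import Tuple


def n_source_bounds(n_sources: int) -> Tuple[float, float]:
    """Return (lower, upper) LLR bounds for N sources, assuming base-10 LLRs."""
    if n_sources < 1:
        raise ValueError("n_sources must be at least 1")
    upper = math.log10(n_sources)
    return -upper, upper


# With 100 sources the LLRs are bounded to [-2, 2],
# which corresponds to likelihood ratios in [1/100, 100].
lower, upper = n_source_bounds(100)
```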

class lir.bounding.StaticBounder(lower_llr_bound: float | None, upper_llr_bound: float | None)

Bases: LLRBounder

Bound LLRs to constant values.

This bounder takes a lower and an upper bound as arguments. Either may be None, in which case no bound is applied on that side.

calculate_bounds(llrdata: LLRData) → tuple[float | None, float | None]: Calculate and return the lower and upper LLR bounds.
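The effect of constant bounds on a set of LLRs can be illustrated with a plain-Python sketch. The helper `clip_llrs` is hypothetical and works on a list of floats rather than on LLRData; it only shows the documented behaviour that out-of-bounds values move to the nearest bound and that None disables a bound.

```python
from typing import List, Optional


def clip_llrs(
    llrs: List[float],
    lower: Optional[float] = None,
    upper: Optional[float] = None,
) -> List[float]:
    """Move out-of-bounds LLRs to the nearest bound; None disables that side."""
    clipped = []
    for llr in llrs:
        if lower is not None and llr < lower:
            llr = lower
        elif upper is not None and llr > upper:
            llr = upper
        clipped.append(llr)
    return clipped


llrs = [-5.0, -0.5, 0.0, 1.5, 7.0]
both = clip_llrs(llrs, lower=-2.0, upper=2.0)  # both sides bounded
upper_only = clip_llrs(llrs, upper=2.0)        # lower bound disabled
```

Values already within bounds are returned unchanged, so applying the same bounds twice gives the same result as applying them once.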

lir.experiment module

class lir.experiment.Experiment(name: str, data_provider: DataProvider, splitter: DataStrategy, outputs: Sequence[Aggregation], output_path: Path)

Bases: ABC

Representation of an experiment pipeline run for each provided LR system.

run() → None

Run experiment for all configured LR systems.

Perform a single experiment over all configured LR systems and write the resulting key metrics, which describe the performance of each LR system, to the metrics.csv file in the output_path directory.

class lir.experiment.PredefinedExperiment(name: str, data_provider: DataProvider, splitter: DataStrategy, outputs: Sequence[Aggregation], output_path: Path, lrsystems: Iterable[tuple[LRSystem, dict[str, Any]]])

Bases: Experiment

Representation of an experiment run for each provided LR system.

lir.main module

lir.main.copy_yaml_definition(output_dir: Path, config_yaml_path: Path) → None: Copy the YAML definition for a given LR system experiment to persist the used configuration.

lir.main.error(msg: str, e: Exception | None = None) → None: Stop execution with given error message or raise exception.

lir.main.initialize_experiments(cfg: Configuration) → tuple[Mapping[str, Experiment], Path]

Extract which Experiment to run as dictated in the configuration.

The following predefined variables are injected into the configuration:

timestamp: a formatted timestamp of the current date/time

Parameters: cfg – a Configuration object describing the experiments
Returns: a tuple with two elements: (1) a mapping of names to experiments; (2) the path to the output directory

lir.main.initialize_logfile(output_dir: Path) → None: Set up logfile for debugging purposes when running experiment.

lir.main.main(input_args: list[str] | None = None) → None: Provide Command Line Interface (CLI) to LiR.

lir.main.setup_logging(level_increase: int) → None

Set up logging to stderr and to a file.

Parameters: level_increase – log level for stderr, relative to the default log level

lir.optuna module

class lir.optuna.OptunaExperiment(name: str, data_provider: DataProvider, splitter: DataStrategy, outputs: Sequence[Aggregation], output_path: Path, baseline_config: ContextAwareDict, hyperparameters: list[Hyperparameter], n_trials: int, metric_function: Callable[[LLRData], float])

Bases: Experiment

Representation of an experiment run for each provided LR system.

lir.persistence module

class lir.persistence.SaveModel(path: Path)

Bases: Aggregation

Write the model to a file.

report(data: AggregationData) → None: Write the trained LR system model to file.

lir.persistence.load_model(path: Path) → LRSystem: Load previously cached model.

lir.persistence.save_model(path: Path, model: LRSystem) → None: Save a model to disk.

lir.registry module

class lir.registry.ClassLoader

Bases: ConfigParserLoader

A configuration parser loader that uses reflection to resolve class names.

get(key: str, default_config_parser: Callable[[Any], ConfigParser] | None = None, search_path: list[str] | None = None) → ConfigParser: Get the accompanying config parser class from the registry.

exception lir.registry.ComponentNotFoundError

Bases: ValueError

Representation of an error when a component class cannot be found.

class lir.registry.ConfigParserLoader

Bases: ABC, Iterable

Base class for a configuration parser loader.

A configuration parser is able to interpret a dictionary-style configuration loaded from a YAML file. Subclasses are expected to implement the get() method.

abstractmethod get(key: str, default_config_parser: Callable[[Any], ConfigParser] | None = None, search_path: list[str] | None = None) → ConfigParser

Retrieve a value for a given key name.

The key may resolve to a ConfigParser class, or it is passed as an argument to default_config_parser, which in turn returns a ConfigParser class.

Parameters:

key – the key name to resolve
default_config_parser – a function that returns a ConfigParser if the key does not resolve to a ConfigParser
search_path – the domain of the search query

Returns:

a ConfigParser object

class lir.registry.FederatedLoader(registries: list[ConfigParserLoader])

Bases: ConfigParserLoader

A configuration parser loader that delegates resolution to other loaders.

get(key: str, default_config_parser: Callable[[Any], ConfigParser] | None = GenericConfigParser, search_path: list[str] | None = None) → ConfigParser: Get the accompanying config parser class from the registry.

exception lir.registry.InvalidRegistryEntryError

Bases: ValueError

Representation of an invalid registry entry.

class lir.registry.YamlRegistry(cfg: Configuration)

Bases: ConfigParserLoader

Representation of a YAML-based registry.

The YAML registry is expected to define “sections” as the top-level key names, followed by keys referring to (paths to) classnames or functions.

This registry parses this YAML mapping and provides access to these values through a get() method.

get(key: str, default_config_parser: Callable[[Any], ConfigParser] | None = None, search_path: list[str] | None = None) → ConfigParser

Retrieve a value for a given key name from the YAML-based registry.

An entry can take the following forms, available under the keys path.to.key1 and path.to.key2 respectively:

```
path.to.key1: ObjectName
path.to.key2:
    class: ObjectName
```

In the example, ObjectName refers to a Python object available in the current runtime.
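The two entry forms can be reduced to a single object name as in the sketch below. This is an illustration only: the helper `entry_to_object_name` is hypothetical, the parsed YAML is written as a flat dict to avoid a YAML dependency, and the way YamlRegistry itself normalises entries is not documented here.

```python
from collections.abc import Mapping
from typing import Any


def entry_to_object_name(entry: Any) -> str:
    """Return the object name from either documented entry form (hypothetical helper)."""
    if isinstance(entry, str):
        # Short form: the value is the object name itself.
        return entry
    if isinstance(entry, Mapping) and "class" in entry:
        # Long form: the object name is stored under the `class` key.
        return str(entry["class"])
    raise ValueError(f"invalid registry entry: {entry!r}")


# Stand-in for the parsed YAML registry section.
registry_section = {
    "path.to.key1": "ObjectName",
    "path.to.key2": {"class": "ObjectName"},
}
names = {key: entry_to_object_name(value) for key, value in registry_section.items()}
```

Both keys resolve to the same object name, which is the point of the two equivalent forms.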

lir.registry.get(name: str, default_config_parser: Callable[[Any], ConfigParser] | None = None, search_path: list[str] | None = None) → ConfigParser: Retrieve corresponding value for a given key name from the central registry.

lir.registry.registry() → ConfigParserLoader: Provide access to a centralized registry of available configuration options.

lir.util module

class lir.util.Bind

Bases: partial

Wrap partial to support the ellipsis (...) as a placeholder.

Can be used to fix parameters not at the end of the list of parameters (which is a limitation of partial).
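The placeholder idea can be sketched with a plain closure. The function `bind` below is a hypothetical illustration, not the implementation of lir.util.Bind: each `...` in the template is filled, left to right, with the next positional argument supplied at call time.

```python
from typing import Any, Callable


def bind(func: Callable[..., Any], *template: Any, **fixed_kwargs: Any) -> Callable[..., Any]:
    """Fix some positional arguments; each `...` in the template is filled at call time."""

    def bound(*args: Any, **kwargs: Any) -> Any:
        remaining = list(args)
        filled = []
        for slot in template:
            if slot is ...:
                if not remaining:
                    raise TypeError("missing argument for placeholder")
                filled.append(remaining.pop(0))
            else:
                filled.append(slot)
        # Call-time arguments beyond the placeholders are appended at the end.
        return func(*filled, *remaining, **{**fixed_kwargs, **kwargs})

    return bound


def describe(name: str, value: float, unit: str) -> str:
    return f"{name}={value}{unit}"


# Fix the last parameter while leaving the first two open,
# which functools.partial cannot do with positional arguments.
in_metres = bind(describe, ..., ..., "m")
label = in_metres("height", 1.8)
squared = bind(pow, ..., 2)(5)
```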

class lir.util.LR(lr, p0, p1)

Bases: tuple

lr: Alias for field number 0

p0: Alias for field number 1

p1: Alias for field number 2

lir.util.Xn_to_Xy(*Xn: ndarray) → tuple[ndarray, ndarray]

Convert Xn to Xy format.

Xn is a format where samples are divided into separate variables based on class. Xy is a format where all samples are concatenated, with an equal length variable y indicating class.

lir.util.Xy_to_Xn(X: ndarray, y: ndarray, classes: list[int] | None = None) → list[ndarray]

Convert Xy to Xn format.

Xn is a format where samples are divided into separate variables based on class. Xy is a format where all samples are concatenated, with an equal length variable y indicating class.
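The two formats can be illustrated with plain lists. The functions below are list-based stand-ins for the ndarray versions, with two assumptions that the text does not state: class labels are 0, 1, ... in argument order, and `classes` defaults to the sorted labels found in y.

```python
from typing import List, Optional, Tuple


def xn_to_xy(*xn: list) -> Tuple[list, List[int]]:
    """Concatenate per-class sample lists into X, with y holding the class index."""
    X, y = [], []
    for label, samples in enumerate(xn):
        X.extend(samples)
        y.extend([label] * len(samples))
    return X, y


def xy_to_xn(X: list, y: list, classes: Optional[list] = None) -> List[list]:
    """Split X into one list per class."""
    if classes is None:
        classes = sorted(set(y))
    return [[x for x, label in zip(X, y) if label == cls] for cls in classes]


# Two classes with two and three samples respectively.
X, y = xn_to_xy([0.1, 0.2], [0.8, 0.9, 0.7])
X0, X1 = xy_to_xn(X, y)
```

Converting to Xy and back returns the original per-class lists, since the sample order within each class is preserved.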

lir.util.check_type(type_class: type[AnyType], v: Any, message: str | None = None) → AnyType: Check if a given input is of the expected, specified type.

lir.util.get_classes_from_Xy(X: ndarray, y: ndarray, classes: list[Any] | None = None) → ndarray: Get the classification classes from labeled data.

lir.util.ln_to_log10(ln_data: FloatOrArray) → FloatOrArray: Convert a natural logarithm to a base-10 logarithm.

lir.util.logodds_to_odds(log_odds: FloatOrArray) → FloatOrArray: Convert base-10 log odds to odds.

lir.util.logodds_to_probability(log_odds: FloatOrArray) → FloatOrArray: Convert base-10 log odds to a probability.

lir.util.odds_to_logodds(odds: FloatOrArray) → FloatOrArray: Convert odds to base-10 log odds.
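The base-10 conversions above follow from standard logarithm identities. The functions below are scalar stand-ins written for illustration; they reuse the documented names but are not the lir implementations, which accept arrays as well.

```python
import math


def odds_to_logodds(odds: float) -> float:
    """Base-10 logarithm of the odds."""
    return math.log10(odds)


def logodds_to_odds(log_odds: float) -> float:
    """Inverse of odds_to_logodds: 10 raised to the log odds."""
    return 10.0 ** log_odds


def ln_to_log10(ln_value: float) -> float:
    """Change of base: log10(x) = ln(x) / ln(10)."""
    return ln_value / math.log(10.0)


odds = logodds_to_odds(2.0)              # odds of 100 to 1
round_trip = odds_to_logodds(odds)       # back to the log odds
converted = ln_to_log10(math.log(100.0)) # natural log of 100 expressed in base 10
```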

lir.util.odds_to_probability(odds: FloatOrArray) → FloatOrArray

Convert odds to a probability.

Returns:

1, for odds values of inf
odds / (1 + odds), otherwise
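The special case for infinite odds is needed because inf / (1 + inf) evaluates to NaN in floating-point arithmetic. A scalar sketch of the rule above, written as a stand-in rather than the lir implementation:

```python
import math


def odds_to_probability(odds: float) -> float:
    """Return 1 for infinite odds, odds / (1 + odds) otherwise."""
    if odds == math.inf:
        return 1.0
    return odds / (1.0 + odds)


p_even = odds_to_probability(1.0)          # even odds
p_three_to_one = odds_to_probability(3.0)  # odds of 3 to 1
p_certain = odds_to_probability(math.inf)  # infinite odds
```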

lir.util.probability_to_logodds(p: FloatOrArray) → FloatOrArray: Convert probability values to base-10 log odds.

lir.util.probability_to_odds(p: FloatOrArray) → FloatOrArray: Convert a probability to odds.

lir.util.warn_deprecated() → None: Provide template message for deprecated functions.

lir.visualization module

lir.visualization.ece(base_path: Path | None, llrdata: LLRData, ax: Any | None = None) → None: Generate and handle an ECE plot, either saving to a file or plotting on a given axis.

lir.visualization.llr_interval(base_path: Path | None, llrdata: LLRData, ax: Any | None = None) → None: Generate and handle a Score-LR plot, either saving to file or plotting on an axis.

lir.visualization.lr_histogram(base_path: Path | None, llrdata: LLRData, ax: Any | None = None) → None: Generate and handle a histogram plot of likelihood ratios, either saving to file or plotting on an axis.

lir.visualization.pav(base_path: Path | None, llrdata: LLRData, ax: Any | None = None) → None: Generate and handle a PAV plot, either saving to a file or plotting on a given axis.

Module contents

LiR - Toolkit for developing, optimising and evaluating Likelihood Ratio (LR) systems.

This allows benchmarking of LR systems on different datasets, investigating impact of different sampling schemes or techniques, and doing case-based validation and computation of case LRs.

lir.is_interactive() → bool

Determine if the LiR tool is running from the CLI and should be interactive.

This method is used, for example, to determine if a progress bar should be shown.