lir.algorithms.llr_overestimation module

lir.algorithms.llr_overestimation.calc_fiducial_density_functions(data: ndarray, grid: ndarray, df_type: str = 'pdf', num_fids: int = 1000, smoothing_grid_fraction: float = 0.1, smoothing_sample_size_correction: float = 1, seed: None | int = None) → ndarray

Calculate smoothed density functions of fiducial distributions for a dataset.

Parameters:
  • data (np.ndarray) – One-dimensional array of data points.

  • grid (np.ndarray) – One-dimensional array of equally spaced grid points.

  • df_type (str, optional) – Density function type: 'pdf' or 'cdf'.

  • num_fids (int, optional) – Number of fiducial distributions to generate.

  • smoothing_grid_fraction (float, optional) – Fraction of the grid points used as the half-window for smoothing.

  • smoothing_sample_size_correction (float, optional) – Sample-size correction factor for smoothing window size.

  • seed (int | None, optional) – Random seed for fiducial sampling.

Returns:

Density-function values evaluated on grid.

Return type:

np.ndarray

lir.algorithms.llr_overestimation.calc_llr_overestimation(llrs: ndarray, y: ndarray, num_fids: int = 1000, bw: tuple[str | float, str | float] = ('silverman', 'silverman'), num_grid_points: int = 100, alpha: float = 0.05, **kwargs: Any) → tuple[ndarray | None, ndarray | None, ndarray | None]

Calculate LLR-overestimation as a function of the system LLR.

The LLR-overestimation is defined as the log10 of the ratio between
  1. the system LRs, i.e. the outputs of the LR system, and

  2. the empirical LRs, i.e. the ratios between the relative frequencies of the H1-LLRs and H2-LLRs.

  • It quantifies the deviation from the requirement that ‘the LR of the LR is the LR’: the ‘LR-consistency’.

  • For a perfect LR-system, the LLR-overestimation is 0: the system and empirical LRs are the same.

  • A positive LLR-overestimation indicates that the system LRs are too high, compared to the empirical LRs.

  • An LLR-overestimation of +1 indicates that the system LRs are too high by a factor of 10.

  • An LLR-overestimation of -1 indicates that the system LRs are too low by a factor of 10.

  • The relative frequencies are estimated with KDE, by default using Silverman’s rule-of-thumb for the bandwidths (see bw).

  • An interval around the LLR-overestimation can be calculated using fiducial distributions.
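The definition above can be illustrated with a minimal, self-contained sketch. It does not call lir and is not the library's implementation: the helper name llr_overestimation_sketch, the choice of grid (the LLR range where both classes have data), and the synthetic Gaussian data are all assumptions made for illustration. It uses scipy.stats.gaussian_kde with Silverman's rule to estimate the relative frequencies.

```python
import numpy as np
from scipy.stats import gaussian_kde


def llr_overestimation_sketch(llrs, y, num_grid_points=100):
    """Illustrative only: system LLR minus a KDE-based empirical LLR on a grid."""
    llrs = np.asarray(llrs, dtype=float)
    y = np.asarray(y)
    h1, h2 = llrs[y == 1], llrs[y == 0]
    # Assumption of this sketch: only look at the range where both classes have data.
    lo = max(h1.min(), h2.min())
    hi = min(h1.max(), h2.max())
    grid = np.linspace(lo, hi, num_grid_points)
    # Relative frequencies of the H1-LLRs and H2-LLRs, estimated with Gaussian KDE.
    pdf_h1 = gaussian_kde(h1, bw_method="silverman")(grid)
    pdf_h2 = gaussian_kde(h2, bw_method="silverman")(grid)
    empirical_llr = np.log10(pdf_h1) - np.log10(pdf_h2)
    # log10(system LR / empirical LR) = system LLR - empirical LLR
    return grid, grid - empirical_llr


rng = np.random.default_rng(0)
n = 5000
sigma = 2.0  # spread of the synthetic natural-log LLRs
# Gaussian natural-log LLRs with means +/- sigma^2 / 2 and variance sigma^2
# are consistent: the LR of the LR is the LR.
ln_h1 = rng.normal(sigma**2 / 2, sigma, n)
ln_h2 = rng.normal(-sigma**2 / 2, sigma, n)
llrs = np.concatenate([ln_h1, ln_h2]) / np.log(10)  # convert to log10
y = np.concatenate([np.ones(n, dtype=int), np.zeros(n, dtype=int)])

grid, over = llr_overestimation_sketch(llrs, y)
# Inflating every system LR by a factor of 10 adds 1 to every LLR.
_, over_shifted = llr_overestimation_sketch(llrs + 1.0, y)
```

For the consistent synthetic system, over stays close to 0 near the centre of the grid. Adding 1 to every LLR raises the overestimation curve by 1 at each corresponding grid point, which matches the "too high by a factor of 10" reading above.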

Parameters:
  • llrs (np.ndarray) – Log10 likelihood ratios as calculated by the LR system.

  • y (np.ndarray) – Corresponding labels (0 for H2/Hd, 1 for H1/Hp).

  • num_fids (int, optional) – Number of fiducial distributions used for intervals.

  • bw (tuple[str | float, str | float], optional) – Bandwidth specifications for KDEs of H1 and H2.

  • num_grid_points (int, optional) – Number of grid points used for calculating overestimation.

  • alpha (float, optional) – Significance level used for the interval (default 0.05).

  • **kwargs (Any) – Additional arguments passed to calc_fiducial_density_functions.

Returns:

Tuple of LLR grid, overestimation best estimate, and overestimation interval.

Return type:

tuple[np.ndarray | None, np.ndarray | None, np.ndarray | None]
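How alpha turns a set of fiducial curves into an interval is not spelled out here. The sketch below shows one common construction, a pointwise equal-tailed interval, under the assumption that alpha is the total tail mass (so 0.05 gives a 95% interval). The function name equal_tailed_interval and the input array curves are hypothetical and are not part of lir.

```python
import numpy as np


def equal_tailed_interval(samples, alpha=0.05):
    """Pointwise (1 - alpha) interval over axis 0, ignoring NaNs.

    `samples` is assumed to have shape (number of curves, number of grid points).
    """
    samples = np.asarray(samples, dtype=float)
    lower, upper = np.nanquantile(samples, [alpha / 2, 1 - alpha / 2], axis=0)
    return lower, upper


# Hypothetical input: 101 curves on a 3-point grid, taking the values 0, 1, ..., 100.
curves = np.tile(np.arange(101.0)[:, None], (1, 3))
lower, upper = equal_tailed_interval(curves, alpha=0.05)
```

Because every element of the returned tuple may be None, callers should check for None before indexing or plotting the results.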

lir.algorithms.llr_overestimation.plot_llr_overestimation(llrdata: LLRData, num_fids: int = 1000, ax: Axes = matplotlib.pyplot, **kwargs: Any) → None

Plot LLR-overestimation as a function of the system LLR.

The LLR-overestimation is defined as the log10 of the ratio between
  1. the system LRs, i.e. the outputs of the LR system, and

  2. the empirical LRs, i.e. the ratios between the relative frequencies of the H1-LLRs and H2-LLRs.

See documentation on calc_llr_overestimation() for more details on the LLR-overestimation.

An interval around the LLR-overestimation can be calculated using fiducial distributions. The average absolute LLR-overestimation can be used as a single metric.

Parameters:
  • llrdata (LLRData) – LLR data containing LLRs and ground-truth labels.

  • num_fids (int, optional) – Number of fiducial distributions to base the interval on; use 0 for no interval.

  • ax (plt.Axes, optional) – Matplotlib axes to plot into.

  • **kwargs (Any) – Additional arguments passed to calc_llr_overestimation and/or calc_fiducial_density_functions.
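The single metric mentioned above, the average absolute LLR-overestimation, could be computed from an overestimation curve along these lines. This is a sketch under assumptions: the helper mean_absolute_overestimation is hypothetical, it takes an unweighted mean over grid points, and it skips non-finite values. The library may weight or filter the grid differently.

```python
import numpy as np


def mean_absolute_overestimation(overestimation):
    """Unweighted mean of |LLR-overestimation| over the finite grid points.

    Returns None when no curve is available (None input or no finite values).
    """
    if overestimation is None:
        return None
    values = np.asarray(overestimation, dtype=float)
    finite = values[np.isfinite(values)]
    if finite.size == 0:
        return None
    return float(np.mean(np.abs(finite)))


# Hypothetical curve: one grid point could not be evaluated (NaN).
metric = mean_absolute_overestimation(np.array([0.5, -1.0, np.nan, 0.0]))
```

Taking the absolute value before averaging prevents regions of over- and underestimation from cancelling out.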