lir.algorithms.bayeserror module

Normalised Bayes error rate (NBE).

References

Vergeer, P., van Es, A., de Jongh, A., Alberink, I., & Stoel, R. (2016). Numerical likelihood ratios output by LR systems are often based on extrapolation: When to stop extrapolating? Science and Justice, 56, 482–491.

class lir.algorithms.bayeserror.ELUBBounder(lower_llr_bound: float | None = None, upper_llr_bound: float | None = None)

Bases: LLRBounder

Calculate the Empirical Upper and Lower Bounds for a given LR system.

Given an LR system, this class outputs the same LRs as the system, but bounded by the Empirical Upper and Lower Bounds as described in P. Vergeer, A. van Es, A. de Jongh, I. Alberink, R.D. Stoel, Numerical likelihood ratios outputted by LR systems are often based on extrapolation: when to stop extrapolating? Sci. Justice 56 (2016) 482-491.

MATLAB code from the authors:

    clear all; close all;
    llrs_hp=csvread('…');
    llrs_hd=csvread('…');
    start=-7; finish=7;
    rho=start:0.01:finish; theta=10.^rho;
    nbe=[];
    for k=1:length(rho)
        if rho(k)<0
            llrs_hp=[llrs_hp;rho(k)];
            nbe=[nbe;(theta(k)^(-1))*mean(llrs_hp<=rho(k))+...
                mean(llrs_hd>rho(k))];
        else
            llrs_hd=[llrs_hd;rho(k)];
            nbe=[nbe;theta(k)*mean(llrs_hd>=rho(k))+...
                mean(llrs_hp<rho(k))];
        end
    end
    plot(rho,-log10(nbe)); hold on;
    plot([start finish],[0 0]);
    a=rho(-log10(nbe)>0);
    empirical_bounds=[min(a) max(a)]
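The MATLAB snippet above can be re-expressed in NumPy roughly as follows. This is a minimal sketch, not the code used by ELUBBounder: the names nbe_curve and empirical_bounds are hypothetical, and the sketch assumes that exactly one misleading value is added per threshold, whereas the loop above, read literally, keeps appending to llrs_hp and llrs_hd across iterations.

```python
import numpy as np


def nbe_curve(llrs_hp, llrs_hd, start=-7.0, finish=7.0, step=0.01):
    """Return (rho, nbe) for log10 thresholds rho in [start, finish].

    Hypothetical helper: assumes ONE misleading value at each threshold.
    """
    llrs_hp = np.asarray(llrs_hp, dtype=float)
    llrs_hd = np.asarray(llrs_hd, dtype=float)

    n_steps = int(round((finish - start) / step)) + 1
    rho = start + step * np.arange(n_steps)  # log10 thresholds
    theta = 10.0 ** rho                      # thresholds on the LR scale

    # Per-threshold counts via broadcasting: (n_samples, n_thresholds).
    hp_le = np.sum(llrs_hp[:, None] <= rho[None, :], axis=0)
    hp_lt = np.sum(llrs_hp[:, None] < rho[None, :], axis=0)
    hd_gt = np.sum(llrs_hd[:, None] > rho[None, :], axis=0)
    hd_ge = np.sum(llrs_hd[:, None] >= rho[None, :], axis=0)

    # rho < 0: one misleading value equal to rho joins the Hp set.
    below = (hp_le + 1) / (llrs_hp.size + 1) / theta + hd_gt / llrs_hd.size
    # rho >= 0: one misleading value equal to rho joins the Hd set.
    above = theta * (hd_ge + 1) / (llrs_hd.size + 1) + hp_lt / llrs_hp.size

    return rho, np.where(rho < 0, below, above)


def empirical_bounds(rho, nbe):
    """Smallest and largest threshold where -log10(nbe) > 0, or (None, None)."""
    accepted = rho[-np.log10(nbe) > 0]
    if accepted.size == 0:
        return None, None
    return float(accepted.min()), float(accepted.max())
```

Returning (None, None) when no threshold qualifies mirrors the documented behaviour of calculate_bounds; the MATLAB version would instead produce an empty result in that case.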

calculate_bounds(llrdata: LLRData) → tuple[float | None, float | None]

Calculate the LLR empirical upper and lower bounds (ELUB).

Parameters:

llrdata (LLRData) – An instance of LLRData containing LLRs and ground-truth labels.

Returns:

A tuple containing the lower and upper bounds. If the bounds cannot be calculated, returns (None, None).

Return type:

tuple[float | None, float | None]
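Once bounds are available, bounding the LLRs amounts to clamping them. The sketch below illustrates that idea on a plain NumPy array; apply_llr_bounds is a hypothetical helper, not part of lir, and treating None as "no bound on that side" is an assumption about how a (None, None) result would be handled.

```python
import numpy as np


def apply_llr_bounds(llrs, lower=None, upper=None):
    """Clamp log10-LRs to [lower, upper]; None leaves that side unbounded."""
    bounded = np.asarray(llrs, dtype=float).copy()
    if lower is not None:
        bounded = np.maximum(bounded, lower)
    if upper is not None:
        bounded = np.minimum(bounded, upper)
    return bounded


# Hypothetical bounds, e.g. as returned by a bounds calculation.
print(apply_llr_bounds([-5.0, 0.0, 5.0], lower=-2.0, upper=3.0))
```

Values already inside the interval pass through unchanged, so the bounded system agrees with the original one everywhere except in the extrapolated tails.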

lir.algorithms.bayeserror.calculate_expected_utility(lrs: ndarray, y: ndarray, threshold_lrs: ndarray, add_misleading: int = 0) → float

Calculate the expected utility of a set of LRs for a given threshold.

Parameters:
  • lrs (np.ndarray) – Array of LRs.

  • y (np.ndarray) – Array of ground-truth labels (0 for Hd or 1 for Hp), with the same length as lrs.

  • threshold_lrs (np.ndarray) – Array of threshold LRs used as acceptance thresholds.

  • add_misleading (int, optional) – Number of consequential misleading LRs to add.

Returns:

Expected utility values, one element for each threshold LR.

Return type:

float
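To make the array shapes concrete, here is one plausible way to evaluate a threshold-dependent error measure for many thresholds at once. It is a sketch under assumptions, not the library's implementation: the function name expected_utility_sketch is hypothetical, and the formula (miss rate plus threshold times false-acceptance rate, with add_misleading added to both the error counts and the class sizes) is inferred from the NBE expression in the MATLAB comment earlier in this module.

```python
import numpy as np


def expected_utility_sketch(lrs, y, threshold_lrs, add_misleading=0):
    """Return one value per threshold LR (assumed formula, see lead-in)."""
    lrs = np.asarray(lrs, dtype=float)
    y = np.asarray(y)
    threshold_lrs = np.asarray(threshold_lrs, dtype=float)

    # accept[i, j] is True when LR i exceeds threshold j.
    accept = lrs[:, None] > threshold_lrs[None, :]

    n_hp = np.sum(y == 1) + add_misleading
    n_hd = np.sum(y == 0) + add_misleading

    # Hp cases that are not accepted, and Hd cases that are accepted.
    misses = np.sum(~accept[y == 1], axis=0) + add_misleading
    false_accepts = np.sum(accept[y == 0], axis=0) + add_misleading

    return misses / n_hp + threshold_lrs * false_accepts / n_hd


values = expected_utility_sketch(
    lrs=[0.1, 10.0, 100.0], y=[0, 1, 1], threshold_lrs=[1.0, 50.0]
)
print(values.shape)  # one entry per threshold
```

The sketch requires at least one sample of each class (or add_misleading > 0); otherwise the class-size denominators are zero.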

lir.algorithms.bayeserror.elub(llrdata: LLRData, add_misleading: int = 1, step_size: float = 0.01, substitute_extremes: tuple[float, float] = (-9, 9)) → tuple[float, float]

Calculate and return the empirical upper and lower bound log10-LRs (ELUB LLRs).

Parameters:
  • llrdata (LLRData) – An instance of LLRData containing LLRs and ground-truth labels.

  • add_misleading (int, optional) – The number of consequential misleading LLRs to be added to both sides (labels 0 and 1).

  • step_size (float, optional) – Required accuracy on a base-10 logarithmic scale. Default is 0.01.

  • substitute_extremes (tuple[float, float], optional) – Values substituted for extreme LRs: LRs of 0 and inf are replaced by the first and second value, respectively. Default is (-9, 9).

Returns:

A tuple containing the lower and upper ELUB log10-LRs.

Return type:

tuple[float, float]
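The roles of substitute_extremes and step_size can be illustrated on a plain array of log10-LRs. This is a minimal sketch with hypothetical helper names (substitute_extreme_llrs, threshold_grid); in particular, building the threshold grid so that it always includes 0 is an assumption, not documented behaviour of elub.

```python
import numpy as np


def substitute_extreme_llrs(llrs, substitute_extremes=(-9.0, 9.0)):
    """Replace -inf / +inf log10-LRs (LRs of 0 / inf) by finite values."""
    llrs = np.asarray(llrs, dtype=float)
    low, high = substitute_extremes
    out = np.where(np.isneginf(llrs), low, llrs)
    return np.where(np.isposinf(out), high, out)


def threshold_grid(llrs, step_size=0.01):
    """Log10 thresholds spanning the data (and 0) in steps of step_size."""
    start = min(0.0, float(np.min(llrs)))
    stop = max(0.0, float(np.max(llrs)))
    # Small tolerance guards against floating-point error in the division.
    n_steps = int(np.floor((stop - start) / step_size + 1e-9)) + 1
    return start + step_size * np.arange(n_steps)


llrs = substitute_extreme_llrs(np.array([-np.inf, 0.0, np.inf]))
grid = threshold_grid(llrs, step_size=1.0)
print(llrs, len(grid))
```

A smaller step_size gives bounds that are resolved more finely on the log10 scale, at the cost of evaluating more thresholds.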

lir.algorithms.bayeserror.plot_nbe(ax: Axes, llrdata: LLRData, log_lr_threshold_range: tuple[float, float] | None = None, add_misleading: int = 1, step_size: float = 0.01) → None

Plot the normalised Bayes error rate (NBE) on a matplotlib axis.

Parameters:
  • ax (plt.Axes) – The matplotlib axis to plot on.

  • llrdata (LLRData) – An instance of LLRData containing LLRs and ground-truth labels.

  • log_lr_threshold_range (tuple[float, float] | None, optional) – The range of log LR threshold values to consider for the plot. If None, it will be determined based on the minimum and maximum LLRs in the data, with a margin of 0.5 added to both ends. Default is None.

  • add_misleading (int, optional) – The number of consequential misleading LLRs to be added to both sides (labels 0 and 1). Default is 1.

  • step_size (float, optional) – The step size for the log LR threshold range, determining the resolution of the plot. Default is 0.01.
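The default threshold range and the shape of the resulting figure can be sketched as follows. Both helpers are hypothetical and only mimic what is described above: default_threshold_range applies the 0.5 margin, and draw_nbe_curve draws -log10(NBE) against the log10 threshold with a zero reference line, as in the MATLAB comment earlier in this module. The axis labels are assumptions. The ax argument is expected to behave like a matplotlib Axes (plot, axhline, set_xlabel, set_ylabel).

```python
import numpy as np


def default_threshold_range(llrs, margin=0.5):
    """Range used when no explicit range is given: data extent plus margin."""
    llrs = np.asarray(llrs, dtype=float)
    return float(llrs.min()) - margin, float(llrs.max()) + margin


def draw_nbe_curve(ax, log_thresholds, nbe):
    """Draw -log10(nbe) against log10 thresholds on an Axes-like object."""
    log_thresholds = np.asarray(log_thresholds, dtype=float)
    curve = -np.log10(np.asarray(nbe, dtype=float))

    ax.plot(log_thresholds, curve)
    # Zero line: above it the system does better than a neutral one.
    ax.axhline(0.0, linestyle="--")
    ax.set_xlabel("log10 threshold LR")  # label text is an assumption
    ax.set_ylabel("-log10 NBE")          # label text is an assumption
    return curve
```

With matplotlib installed, an axis from `fig, ax = plt.subplots()` can be passed as ax; the thresholds at which the curve crosses zero correspond to the empirical bounds.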