lir.lrsystems package

Submodules

lir.lrsystems.binary_lrsystem module

class lir.lrsystems.binary_lrsystem.BinaryLRSystem(pipeline: Transformer)

Bases: LRSystem

LR system for binary data and a linear pipeline.

This may be used in specific-source, feature-based LR systems.

In this strategy, a set of instances, captured in the feature vector X, and a set of ground-truth labels are used to train the system, which afterward calculates the corresponding LLRs for given feature vectors.

apply(instances: InstanceData) → LLRData

Use LR system to calculate the LLR data from the instance data.

Applies the specific source LR system on a set of instances, optionally with corresponding labels, and returns a representation of the calculated LLR data through the LLRData tuple.

The returned set of LLRs has the same order as the set of input instances, and the returned labels are unchanged from the input labels.

fit(instances: InstanceData) → Self

Fit the model on the given instance data.
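The fit/apply contract described above can be illustrated with a minimal sketch that does not use lir itself. The class ToyBinaryLRSystem, its plain-list inputs, the univariate normal model per hypothesis, and the base-10 logarithm are all assumptions made for illustration; the real BinaryLRSystem wraps a Transformer pipeline and works on InstanceData.

```python
import math
from statistics import NormalDist


class ToyBinaryLRSystem:
    """Hypothetical stand-in for illustration; this is not lir's BinaryLRSystem."""

    def fit(self, features, labels):
        # Model the feature under each hypothesis with a univariate normal distribution.
        h1_values = [x for x, label in zip(features, labels) if label == 1]
        h2_values = [x for x, label in zip(features, labels) if label == 0]
        self.h1_dist = NormalDist.from_samples(h1_values)
        self.h2_dist = NormalDist.from_samples(h2_values)
        return self

    def apply(self, features):
        # One log10 likelihood ratio per input instance, in input order.
        return [
            math.log10(self.h1_dist.pdf(x) / self.h2_dist.pdf(x))
            for x in features
        ]


# Training data: one feature per instance, with a binary ground-truth label.
train_features = [0.1, 0.4, 0.3, 0.2, 2.1, 1.8, 2.4, 2.0]
train_labels = [0, 0, 0, 0, 1, 1, 1, 1]

system = ToyBinaryLRSystem().fit(train_features, train_labels)
llrs = system.apply([2.2, 0.25, 1.1])
```

Each element of llrs corresponds to the input at the same position, which mirrors the ordering guarantee stated for apply.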

lir.lrsystems.lrsystems module

class lir.lrsystems.lrsystems.LRSystem

Bases: Transformer, ABC

General representation of an LR system.

abstractmethod apply(instances: InstanceData) → LLRData

Use the LR system to calculate the LLR data from the instances.

Applies the LR system on a set of instances, optionally with corresponding labels, and returns a representation of the calculated LLR data through the LLRData tuple.

fit(instances: InstanceData) → Self

Fit the LR system on a set of features and corresponding labels.

The number of labels must be equal to the number of instances.

lir.lrsystems.score_based module

class lir.lrsystems.score_based.ScoreBasedSystem(preprocessing_pipeline: Transformer | None, pairing_function: PairingMethod, evaluation_pipeline: Transformer | None)

Bases: LRSystem

Provide a representation of a common source, score-based LR system.

In this strategy, the data can be prepared in a preprocessing_pipeline, pairs of instances are created by the pairing_function, and the final evaluation_pipeline calculates scores and transforms them into LLRs.

apply(instances: InstanceData) → LLRData

Use LR system to calculate LLR data from the instances.

Applies the score-based LR system on a set of instances, optionally with corresponding labels, and returns a representation of the calculated LLR data through the LLRData tuple.

The system takes instances as input and calculates LLRs for pairs of instances. Each output LLR therefore corresponds to two input instances, so there is a 2-to-1 relation between input and output data.

fit(instances: InstanceData) → Self

Fit the model on the instance data.
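The pair-then-score flow can be sketched without lir as follows. Everything here is an assumption for illustration: the helper names make_pairs, score and score_to_llr, the negative absolute difference as score, and the normal distributions fitted to same-source and different-source scores. The actual pairing_function and evaluation_pipeline are configurable and may work quite differently.

```python
import math
from itertools import combinations
from statistics import NormalDist

# Toy instances as (source_id, feature) tuples; two measurements per source.
instances = [("a", 1.0), ("a", 1.6), ("b", 2.0), ("b", 2.9), ("c", 3.5), ("c", 3.0)]


def make_pairs(items):
    # Hypothetical pairing step: every unordered pair of instances.
    return list(combinations(items, 2))


def score(pair):
    # Hypothetical similarity score: closer features give a higher score.
    (_, x1), (_, x2) = pair
    return -abs(x1 - x2)


pairs = make_pairs(instances)
scores = [score(pair) for pair in pairs]
same_source = [pair[0][0] == pair[1][0] for pair in pairs]

# Hypothetical score-to-LLR step: one normal distribution per hypothesis.
same_dist = NormalDist.from_samples(
    [s for s, same in zip(scores, same_source) if same]
)
diff_dist = NormalDist.from_samples(
    [s for s, same in zip(scores, same_source) if not same]
)


def score_to_llr(s):
    return math.log10(same_dist.pdf(s) / diff_dist.pdf(s))


# One LLR per pair, so each output value relates to two input instances.
llrs = [score_to_llr(s) for s in scores]
```

For brevity the sketch fits and applies on the same pairs; a real evaluation would keep training and test data apart.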

lir.lrsystems.two_level module

class lir.lrsystems.two_level.TwoLevelModelNormalKDE

Bases: object

Implement two-level model as outlined by Bolck et al.

An implementation of the two-level model as outlined by Bolck et al., “Different likelihood ratio approaches to evaluate the strength of evidence of MDMA tablet comparisons”, FSI 191 (2009) 42.

Model description:

Definitions:

X_ij = vector, measurement of reference j, ith repetition, with i=1..n

Y_kl = vector, measurement of trace l, kth repetition, with k=1..m

Model:

First level of variance:

X_ij ~ N(theta_j, sigma_within)

Y_kl ~ N(theta_k, sigma_within)

where theta_j is the true but unknown mean of the reference and theta_k the true but unknown mean of the trace. sigma_within is assumed equal for trace and reference (and for repeated measurements of some background data).

Second level of variance:

theta_j ~ theta_k ~ KDE(means background database, h)

with h the kernel bandwidth.

H1: theta_j = theta_k

H2: theta_j independent of theta_k

Numerator LR = Integral_theta N(X_mean|theta, sigma_within, n) * N(Y_mean|theta, sigma_within, m) * KDE(theta|means background database, h)

Denominator LR = Integral_theta N(X_mean|theta, sigma_within, n) * KDE(theta|means background database, h) * Integral_theta N(Y_mean|theta, sigma_within, m) * KDE(theta|means background database, h)

The appendix of Bolck et al. gives a closed-form solution for the evaluation of these integrals.

sigma_within and h (and other parameters) are estimated from repeated measurements of background data.
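To make the numerator and denominator concrete, the following sketch evaluates them for a single feature. It rests on several assumptions that are not taken from the implementation: the data are univariate, sigma_within and h are standard deviations, N(X_mean|theta, sigma_within, n) is read as a normal density with variance sigma_within^2 / n, and the KDE uses Gaussian kernels centred on the background means. Under those assumptions each integral has a simple closed form, which the sketch checks against numerical integration. The function names are hypothetical, and the multivariate solution in the paper is more general.

```python
import math


def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


def bivariate_normal_pdf(x, y, mean, var_x, var_y, cov):
    # Density of (x, y) where both components share the same mean.
    det = var_x * var_y - cov ** 2
    dx, dy = x - mean, y - mean
    quad = (var_y * dx * dx - 2 * cov * dx * dy + var_x * dy * dy) / det
    return math.exp(-0.5 * quad) / (2 * math.pi * math.sqrt(det))


def lr_numerator(x_mean, n, y_mean, m, background_means, sigma_within, h):
    a, b, c = sigma_within ** 2 / n, sigma_within ** 2 / m, h ** 2
    # Integrating theta out per kernel leaves a bivariate normal in (x_mean, y_mean).
    return sum(
        bivariate_normal_pdf(x_mean, y_mean, mu, a + c, b + c, c)
        for mu in background_means
    ) / len(background_means)


def marginal(mean_value, reps, background_means, sigma_within, h):
    var = sigma_within ** 2 / reps + h ** 2
    return sum(normal_pdf(mean_value, mu, var) for mu in background_means) / len(
        background_means
    )


def two_level_lr(x_mean, n, y_mean, m, background_means, sigma_within, h):
    numerator = lr_numerator(x_mean, n, y_mean, m, background_means, sigma_within, h)
    denominator = marginal(x_mean, n, background_means, sigma_within, h) * marginal(
        y_mean, m, background_means, sigma_within, h
    )
    return numerator / denominator


def numerator_by_quadrature(
    x_mean, n, y_mean, m, background_means, sigma_within, h,
    lo=-20.0, hi=20.0, steps=20000,
):
    # Midpoint-rule check of the numerator integral over theta.
    a, b = sigma_within ** 2 / n, sigma_within ** 2 / m
    width = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        theta = lo + (i + 0.5) * width
        kde = sum(normal_pdf(theta, mu, h ** 2) for mu in background_means) / len(
            background_means
        )
        total += normal_pdf(x_mean, theta, a) * normal_pdf(y_mean, theta, b) * kde * width
    return total


background = [0.0, 2.0, 5.0, 9.0]
lr_close = two_level_lr(2.1, 3, 2.0, 3, background, sigma_within=0.5, h=1.0)
lr_far = two_level_lr(0.0, 3, 5.0, 3, background, sigma_within=0.5, h=1.0)
```

With these toy numbers, a trace and reference with nearly equal means give an LR above 1 and clearly separated means give an LR below 1.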

fit_on_unpaired_instances(X: ndarray, y: ndarray) → TwoLevelModelNormalKDE

Fit the model on unpaired instances.

X: np.ndarray of measurements; rows are sources/repetitions, columns are features.

y: one-dimensional np.ndarray of labels. Each source has a unique identifier (label); repetitions of the same source share its label.

Construct the necessary matrices, scores and other parameters from the data (X) so that a score can be predicted later on. Any calculated parameters are stored in self.
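As an illustration of the kind of statistics that can be derived from unpaired, labelled measurements, the sketch below groups rows by source label and computes per-source means and a pooled within-source variance per feature. This is a simplification under stated assumptions: it uses plain lists instead of np.ndarray, treats features independently rather than estimating a full covariance matrix, and the helper names are hypothetical. The estimators used by the actual class may differ.

```python
from collections import defaultdict
from statistics import fmean

# Rows are repeated measurements; y gives the source label of each row.
X = [[1.0, 10.0], [1.2, 10.4], [3.0, 20.0], [3.4, 19.6], [5.0, 30.0], [5.2, 30.2]]
y = ["s1", "s1", "s2", "s2", "s3", "s3"]


def group_by_source(rows, labels):
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[label].append(row)
    return dict(groups)


def source_means(groups):
    # Mean of every feature column, per source.
    return {
        label: [fmean(column) for column in zip(*rows)]
        for label, rows in groups.items()
    }


def pooled_within_variance(groups, means):
    # Sum of squared deviations from each source mean, divided by the
    # total degrees of freedom (number of rows minus number of sources).
    n_features = len(next(iter(means.values())))
    sums = [0.0] * n_features
    dof = 0
    for label, rows in groups.items():
        for row in rows:
            for f, value in enumerate(row):
                sums[f] += (value - means[label][f]) ** 2
        dof += len(rows) - 1
    return [s / dof for s in sums]


groups = group_by_source(X, y)
means = source_means(groups)
within_var = pooled_within_variance(groups, means)
```

The per-source means play the role of the background means that feed the KDE, and the pooled variance plays the role of sigma_within squared.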

predict_proba(X_trace: ndarray, X_ref: ndarray) → ndarray

Predict probability scores, using the fitted model.

Predict probability scores, making use of the parameters constructed during fitting (which should now be stored in self).

X_trace: measurements of the trace object; np.ndarray of shape (instances, repetitions_trace, features).

X_ref: measurements of the reference object; np.ndarray of shape (instances, repetitions_ref, features).

returns: probabilities for same source and different source: np.ndarray with shape (instances, 2)

transform(X_trace: ndarray, X_ref: ndarray) → ndarray

Transform the input data using the fitted model.

Predict odds scores, making use of the parameters constructed during fitting (which should now be stored in self).

X_trace: measurements of the trace object; np.ndarray of shape (instances, repetitions_trace, features).

X_ref: measurements of the reference object; np.ndarray of shape (instances, repetitions_ref, features).

returns: odds of same source / different source: one-dimensional np.ndarray with one element per instance
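The two return types are related by the usual conversion between odds and probabilities. The sketch below shows that relation only. It assumes the column order (same source, different source) and that the probabilities sum to one, and the function names are hypothetical; whether the class derives one output from the other in this way is not stated here.

```python
def odds_to_probabilities(odds):
    # odds = p_same / p_diff, with p_same + p_diff = 1
    p_same = odds / (1.0 + odds)
    return (p_same, 1.0 - p_same)


def probabilities_to_odds(p_same, p_diff):
    return p_same / p_diff


probabilities = [odds_to_probabilities(o) for o in [3.0, 1.0, 0.25]]
```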

class lir.lrsystems.two_level.TwoLevelSystem(preprocessing_pipeline: Transformer | None, pairing_function: PairingMethod, postprocessing_pipeline: Transformer | None, n_trace_instances: int, n_ref_instances: int)

Bases: LRSystem

Implement a two-level model in a common-source, feature-based LR system architecture.

During the training phase, the system calculates statistics on the unpaired instances. On application, it calculates LRs for same-source and different-source pairs. Each side of the pair may consist of multiple instances.
