lir.lrsystems.two_level module
- class lir.lrsystems.two_level.TwoLevelModelNormalKDE[source]
Bases: object
Implement the two-level model as outlined by Bolck et al.
- An implementation of the two-level model as outlined in FSI191(2009)42 by Bolck et al. “Different likelihood
ratio approaches to evaluate the strength of evidence of MDMA tablet comparisons”.
Model description:
Definitions:
X_ij = vector, measurement of reference j, ith repetition, with i=1..n
Y_kl = vector, measurement of trace l, kth repetition, with k=1..m
Model:
First level of variance:
X_ij ~ N(theta_j, sigma_within)
Y_kl ~ N(theta_k, sigma_within)
where theta_j is the true but unknown mean of the reference and theta_k the true but unknown mean of the trace. sigma_within is assumed equal for trace and reference (and for repeated measurements of some background data).
Second level of variance:
theta_j ~ theta_k ~ KDE(means background database, h), with h the kernel bandwidth.
H1: theta_j = theta_k
H2: theta_j independent of theta_k
Numerator LR = Integral_theta N(X_mean|theta, sigma_within, n) * N(Y_mean|theta, sigma_within, m) * KDE(theta|means background database, h)
Denominator LR = Integral_theta N(X_mean|theta, sigma_within, n) * KDE(theta|means background database, h) * Integral_theta N(Y_mean|theta, sigma_within, m) * KDE(theta|means background database, h)
The appendix of Bolck et al. gives a closed-form solution for these integrals.
sigma_within and h (and other parameters) are estimated from repeated measurements of background data.
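As an illustration of the numerator and denominator above, here is a minimal univariate sketch that assumes a Gaussian kernel for the KDE, in which case both integrals reduce to sums of normal densities. The names `normal_pdf` and `two_level_lr_1d` are hypothetical and not part of this module; the class itself works on feature vectors and follows the closed form in Bolck et al., which this sketch does not reproduce.

```python
import numpy as np


def normal_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x (broadcasts over arrays)."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)


def two_level_lr_1d(x_mean, y_mean, n, m, sigma_within, background_means, h):
    """Univariate two-level LR with a Gaussian-kernel KDE (illustrative only).

    x_mean, y_mean   : means of the n reference and m trace repetitions
    sigma_within     : within-source standard deviation
    background_means : source means of the background database
    h                : kernel bandwidth (standard deviation of each kernel)
    """
    mu = np.asarray(background_means, dtype=float)
    var_x = sigma_within**2 / n  # variance of the reference mean
    var_y = sigma_within**2 / m  # variance of the trace mean

    # Denominator: each integral is a mixture of normals centred on the
    # background means, with variance (var of the mean) + h^2.
    p_x = normal_pdf(x_mean, mu, var_x + h**2).mean()
    p_y = normal_pdf(y_mean, mu, var_y + h**2).mean()

    # Numerator: N(x|theta) * N(y|theta) factorises into a term in (x - y)
    # and a normal in theta centred on the precision-weighted mean w.
    var_w = 1.0 / (1.0 / var_x + 1.0 / var_y)
    w = var_w * (x_mean / var_x + y_mean / var_y)
    p_xy = (
        normal_pdf(x_mean - y_mean, 0.0, var_x + var_y)
        * normal_pdf(w, mu, var_w + h**2).mean()
    )
    return p_xy / (p_x * p_y)
```

With reference and trace means that lie close together relative to sigma_within, the numerator dominates and the LR exceeds 1; for means that are far apart the LR falls below 1.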
- fit_on_unpaired_instances(X: ndarray, y: ndarray) TwoLevelModelNormalKDE[source]
Fit the model on unpaired instances.
X: np.ndarray of measurements; rows are sources/repetitions, columns are features.
y: one-dimensional np.ndarray of labels. Each source has a unique identifier (label); repetitions of the same source share that label.
Construct the necessary matrices/scores/etc. based on the background data (X) so that a score can be predicted later on. Any calculated parameters are stored in self.
- Parameters:
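X (np.ndarray) – Measurements; rows are sources/repetitions, columns are features.
y (np.ndarray) – One-dimensional array of source labels; repetitions of the same source share a label.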
- Returns:
Fitted two-level KDE model instance.
- Return type:
‘TwoLevelModelNormalKDE’
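The kind of statistics that can be derived from unpaired instances is sketched below: per-source means and a pooled within-source covariance computed from X and y. The helper `summarize_background` is hypothetical and is not claimed to match what `fit_on_unpaired_instances` computes internally; in particular, it does not estimate the kernel bandwidth h.

```python
import numpy as np


def summarize_background(X, y):
    """Per-source means and pooled within-source covariance (illustrative).

    X : array of shape (rows, features), one row per repetition
    y : array of shape (rows,), source label of each row
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y).ravel()

    # Map every row to the index of its source.
    labels, inverse = np.unique(y, return_inverse=True)
    inverse = np.asarray(inverse).ravel()
    counts = np.bincount(inverse)

    # Sum the rows per source, then divide by the number of repetitions.
    means = np.zeros((len(labels), X.shape[1]))
    np.add.at(means, inverse, X)
    means /= counts[:, None]

    # Pool the residuals around each source mean over all sources.
    residuals = X - means[inverse]
    dof = len(X) - len(labels)
    within_cov = residuals.T @ residuals / dof
    return labels, means, within_cov
```

Every source needs at least two repetitions somewhere in the data for the pooled covariance to be defined (dof must be positive).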
- predict_proba(X_trace: ndarray, X_ref: ndarray) ndarray[source]
Predict probability scores, using the fitted model.
Uses the parameters constructed during fitting (see fit_on_unpaired_instances()), which are stored in self.
X_trace: measurements of trace object. np.ndarray of shape (instances, repetitions_trace, features)
X_ref: measurements of reference object. np.ndarray of shape (instances, repetitions_ref, features)
returns: probabilities for same source and different source: np.ndarray with shape (instances, 2)
- Parameters:
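X_trace (np.ndarray) – Measurements of the trace objects, shape (instances, repetitions_trace, features).
X_ref (np.ndarray) – Measurements of the reference objects, shape (instances, repetitions_ref, features).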
- Returns:
Two-column probability matrix for Hd and Hp.
- Return type:
np.ndarray
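The three-dimensional input layout described above can be illustrated with a small sketch. It builds arrays of shape (instances, repetitions, features) and reduces them to one mean vector per instance, which corresponds to X_mean and Y_mean in the model description. The helper `repetition_means` is hypothetical, and the data are random placeholders.

```python
import numpy as np


def repetition_means(X):
    """Average over the repetition axis of an (instances, repetitions, features) array."""
    X = np.asarray(X, dtype=float)
    if X.ndim != 3:
        raise ValueError("expected shape (instances, repetitions, features)")
    return X.mean(axis=1)


rng = np.random.default_rng(0)
n_instances, n_features = 4, 3

# Trace and reference may have different numbers of repetitions,
# but must agree on the number of instances and features.
X_trace = rng.normal(size=(n_instances, 2, n_features))
X_ref = rng.normal(size=(n_instances, 5, n_features))

trace_means = repetition_means(X_trace)  # shape (instances, features)
ref_means = repetition_means(X_ref)      # shape (instances, features)
```

Row i of X_trace is compared with row i of X_ref, so the two arrays describe the same number of comparisons.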
- transform(X_trace: ndarray, X_ref: ndarray) ndarray[source]
Transform the input data using the fitted model.
Predict odds scores, making use of the parameters constructed during fitting (see fit_on_unpaired_instances()), which are stored in self.
X_trace: measurements of trace object. np.ndarray of shape (instances, repetitions_trace, features)
X_ref: measurements of reference object. np.ndarray of shape (instances, repetitions_ref, features)
returns: odds of same source / different source: one-dimensional np.ndarray with one element per instance
- Parameters:
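X_trace (np.ndarray) – Measurements of the trace objects, shape (instances, repetitions_trace, features).
X_ref (np.ndarray) – Measurements of the reference objects, shape (instances, repetitions_ref, features).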
- Returns:
Log10 LR scores for each trace/reference pair.
- Return type:
np.ndarray
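The descriptions above mention odds, log10 LRs and probabilities. The sketch below shows how these scales relate, assuming equal prior odds so that the LR equals the posterior odds. The function names are hypothetical, and the column order (different source first, same source second) is an assumption made for this example rather than a statement about what predict_proba returns.

```python
import numpy as np


def log10_lr_to_odds(log10_lr):
    """Convert log10 likelihood ratios to odds (same source / different source)."""
    return 10.0 ** np.asarray(log10_lr, dtype=float)


def odds_to_probabilities(odds):
    """Convert odds to a two-column matrix of probabilities.

    Column 0: different source, column 1: same source (assumed order).
    """
    odds = np.asarray(odds, dtype=float)
    p_same = odds / (1.0 + odds)
    return np.column_stack([1.0 - p_same, p_same])


log10_lrs = np.array([-1.0, 0.0, 2.0])
odds = log10_lr_to_odds(log10_lrs)
probabilities = odds_to_probabilities(odds)
```

A log10 LR of 0 corresponds to odds of 1 and, under the equal-prior assumption, to a probability of 0.5 for each hypothesis.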
- class lir.lrsystems.two_level.TwoLevelSystem(preprocessing_pipeline: Transformer | None, pairing_function: PairingMethod, postprocessing_pipeline: Transformer | None, n_trace_instances: int, n_ref_instances: int)[source]
Bases: LRSystem
Implement two-level model, common-source feature-based LR system architecture.
During the training phase, the system calculates statistics on the unpaired instances. On application, it calculates LRs for same-source and different-source pairs. Each side of the pair may consist of multiple instances.
See also: TwoLevelModelNormalKDE
- Parameters:
preprocessing_pipeline (Transformer | None) – Pipeline that preprocesses instances before pairing and evaluation.
pairing_function (PairingMethod) – Pairing method used to construct trace/reference comparisons.
postprocessing_pipeline (Transformer | None) – Pipeline applied as a postprocessing step; may be None.
n_trace_instances (int) – Number of trace instances to include in each pairing.
n_ref_instances (int) – Number of reference instances to include in each pairing.
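To illustrate what n_trace_instances and n_ref_instances mean for a pairing, the sketch below builds trace/reference pairs in which each side holds several instances. The function `make_pairs` is hypothetical and only demonstrates the resulting array shapes; it is not the PairingMethod used by this class, whose selection strategy may differ.

```python
import numpy as np


def make_pairs(X, y, n_trace, n_ref):
    """Pair every source's trace block with every source's reference block.

    X : array of shape (rows, features); y : source label per row.
    Sources with fewer than n_trace + n_ref rows are skipped.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y).ravel()

    traces, refs, sources = [], [], []
    for label in np.unique(y):
        rows = X[y == label]
        if len(rows) < n_trace + n_ref:
            continue
        traces.append(rows[:n_trace])                 # trace side
        refs.append(rows[n_trace:n_trace + n_ref])    # reference side
        sources.append(label)

    X_trace, X_ref, same_source = [], [], []
    for i, trace in enumerate(traces):
        for j, ref in enumerate(refs):
            X_trace.append(trace)
            X_ref.append(ref)
            same_source.append(sources[i] == sources[j])

    # Shapes: (pairs, n_trace, features), (pairs, n_ref, features), (pairs,)
    return np.array(X_trace), np.array(X_ref), np.array(same_source)
```

The returned arrays have the (instances, repetitions, features) layout that TwoLevelModelNormalKDE.predict_proba and transform expect.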
- apply(instances: InstanceData) LLRData[source]
Apply this LR system on a set of instances and return LLR data.
Applies the two level LR system on a set of instances, and returns a representation of the calculated LLR data through the LLRData tuple.
- Parameters:
instances (InstanceData) – Input instances to be processed by this method.
- Returns:
Likelihood-ratio data produced by applying the LR system.
- Return type:
LLRData
- fit(instances: InstanceData) Self[source]
Fit the model based on the instance data.
- Parameters:
instances (InstanceData) – Input instances to be processed by this method.
- Returns:
This LR system instance after fitting all components.
- Return type:
Self