Calculation of LRs from scores
==============================

A collection of scripts is provided to aid the calibration, calculation, and evaluation of likelihood ratios (LRs).

A simple score-based LR system
------------------------------

A score-based LR system needs a scorer and a calibrator. The most basic setup uses a training set and a test set. Both the scorer and the calibrator are fitted on the training set.

.. code-block:: python

    import lir
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    # generate some data randomly from two normal distributions, one per class
    X = np.concatenate([np.random.normal(loc=0, size=(100, 1)),
                        np.random.normal(loc=1, size=(100, 1))])
    y = np.concatenate([np.zeros(100), np.ones(100)])

    # split the data into a training set and a test set
    X_train, X_test, y_train, y_test = train_test_split(X, y)

    # initialize a scorer and a calibrator
    scorer = LogisticRegression(solver='lbfgs')  # choose any sklearn-style classifier
    calibrator = lir.KDECalibrator()  # use plain KDE for calibration
    calibrated_scorer = lir.CalibratedScorer(scorer, calibrator)

    # fit on the training set and predict LRs for the test set
    calibrated_scorer.fit(X_train, y_train)
    lrs_test = calibrated_scorer.predict_lr(X_test)

    # report the quality of the system as log likelihood ratio cost (Cllr)
    print('The log likelihood ratio cost is', lir.metrics.cllr(lrs_test, y_test), '(lower is better)')
    print('The discriminative power is', lir.metrics.cllr_min(lrs_test, y_test), '(lower is better)')

    # plot the calibration of the system as a PAV plot
    import lir.plotting
    with lir.plotting.show() as ax:
        ax.pav(lrs_test, y_test)

The log likelihood ratio cost (Cllr) may be used as a metric of performance. In this case it should yield a value of around 0.8, although it is highly variable due to the small number of samples. Increase the sample size, as in the sketch below, to get more stable results.
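The sketch below repeats the same pipeline with a larger sample to illustrate this; the choice of 2,000 samples per class is arbitrary, not a recommendation from the library.

.. code-block:: python

    import lir
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    # same pipeline as above, but with 2,000 samples per class instead of 100;
    # the larger test set makes the Cllr estimate much less noisy
    n = 2000
    X = np.concatenate([np.random.normal(loc=0, size=(n, 1)),
                        np.random.normal(loc=1, size=(n, 1))])
    y = np.concatenate([np.zeros(n), np.ones(n)])
    X_train, X_test, y_train, y_test = train_test_split(X, y)

    calibrated_scorer = lir.CalibratedScorer(LogisticRegression(solver='lbfgs'),
                                             lir.KDECalibrator())
    calibrated_scorer.fit(X_train, y_train)
    lrs_test = calibrated_scorer.predict_lr(X_test)

    # with n = 2000 per class, repeated runs should give similar Cllr values
    print('Cllr at n =', n, 'is', lir.metrics.cllr(lrs_test, y_test))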
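The KDE calibrator is only one option; just as any sklearn-style classifier can serve as the scorer, the calibrator can be swapped out as well. The sketch below compares two calibrators side by side; it assumes your version of the package also exposes ``lir.LogitCalibrator`` (a logistic-regression-based calibrator). If it does not, substitute any calibrator the package provides.

.. code-block:: python

    import lir
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    X = np.concatenate([np.random.normal(loc=0, size=(100, 1)),
                        np.random.normal(loc=1, size=(100, 1))])
    y = np.concatenate([np.zeros(100), np.ones(100)])
    X_train, X_test, y_train, y_test = train_test_split(X, y)

    # assumption: lir.LogitCalibrator is available in your version of the package;
    # fit each calibrator with the same scorer and compare the resulting Cllr
    for calibrator in [lir.KDECalibrator(), lir.LogitCalibrator()]:
        calibrated_scorer = lir.CalibratedScorer(LogisticRegression(solver='lbfgs'),
                                                 calibrator)
        calibrated_scorer.fit(X_train, y_train)
        lrs_test = calibrated_scorer.predict_lr(X_test)
        print(type(calibrator).__name__, 'Cllr:', lir.metrics.cllr(lrs_test, y_test))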