lir.config.substitution module

Substitution module.

This module provides utility functions for replacing or modifying components of an LR Benchmark pipeline at runtime. Typical use cases include comparing different modelling approaches (e.g. logistic regression versus support vector machines) or optimising system hyperparameters.

For example, the hyperparameters section of the model_selection_run experiment can define a path (comparing.clf) whose value is replaced by each entry listed in the options field. Each option updates the comparing component in the LR system configuration used by the pipeline.

experiments:
  - name: model_selection_run
    lr_system: ...
    ...
    hyperparameters:
      - path: comparing.clf
        options:
          - name: logit
            method: logistic_regression
            C: 1
          - name: svm
            method: svm
            probability: True
class lir.config.substitution.CategoricalHyperparameter(name: str, options: list[HyperparameterOption])[source]

Bases: Hyperparameter

A categorical hyperparameter.

In a YAML configuration, a categorical hyperparameter has the following fields:

  • path: The path of this hyperparameter in the LR system configuration.

  • options: A list of options.

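Parameters:
  • name (str) – Hyperparameter name.

  • options (list[HyperparameterOption]) – Configured categorical options.
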
options() list[HyperparameterOption][source]

Provide API access to the options for the hyperparameter.

Returns:

Configured categorical options.

Return type:

list[HyperparameterOption]
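
The following is a minimal sketch of how YAML option entries such as those in the module example could be turned into named options. It does not use the library itself: Option is a hypothetical stand-in for HyperparameterOption, categorical_options is a hypothetical helper, and the rule that every field except name becomes the value substituted at path is an assumption, not confirmed behaviour.

```python
from typing import Any, Mapping, NamedTuple


class Option(NamedTuple):
    """Hypothetical stand-in for HyperparameterOption."""

    name: str
    substitutions: Mapping[str, Any]


def categorical_options(path: str, raw_options: list[dict[str, Any]]) -> list[Option]:
    """Turn YAML-style option entries into named options (assumed mapping)."""
    options = []
    for raw in raw_options:
        # Assumption: everything except `name` is the value placed at `path`.
        value = {key: val for key, val in raw.items() if key != "name"}
        options.append(Option(name=raw["name"], substitutions={path: value}))
    return options


options = categorical_options(
    "comparing.clf",
    [
        {"name": "logit", "method": "logistic_regression", "C": 1},
        {"name": "svm", "method": "svm", "probability": True},
    ],
)
for option in options:
    print(option.name, dict(option.substitutions))
```
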

class lir.config.substitution.FloatHyperparameter(path: str, low: float, high: float, step: float | None, log: bool)[source]

Bases: Hyperparameter

Floating-point hyperparameter.

In a YAML configuration, this hyperparameter supports the following fields:

  • path: Path to the hyperparameter in the LR system configuration.

  • low: Lower bound of the search range.

  • high: Upper bound of the search range.

  • step (optional): Step size for a linear grid search.

  • log (optional): If True, search in logarithmic space instead of linear space. Cannot be combined with step. Defaults to False.

Parameters:
  • path (str) – Configuration path to substitute.

  • low (float) – Lower bound.

  • high (float) – Upper bound.

  • step (float | None) – Optional step size for grid options.

  • log (bool) – Whether to sample in log space.

options() list[HyperparameterOption][source]

Provide API access to the options for the hyperparameter.

Returns:

Enumerated hyperparameter options.

Return type:

list[HyperparameterOption]
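
As an illustration of the two search modes, the sketch below enumerates a linear grid from low, high and step, and a logarithmically spaced grid between low and high. Both functions are hypothetical and independent of the library. Including the upper bound in the linear grid is an assumption, and the count argument of log_grid is an assumption as well, since the documentation does not state how many points are drawn in log space.

```python
import math


def linear_grid(low: float, high: float, step: float) -> list[float]:
    """Enumerate low, low + step, ... up to and including high (assumed)."""
    if step <= 0 or high < low:
        raise ValueError("expected step > 0 and low <= high")
    # Count the steps first so that rounding errors do not accumulate.
    count = math.floor((high - low) / step + 1e-9) + 1
    return [low + i * step for i in range(count)]


def log_grid(low: float, high: float, count: int) -> list[float]:
    """Hypothetical: `count` points evenly spaced in log10 space."""
    if low <= 0 or high <= low or count < 2:
        raise ValueError("expected 0 < low < high and count >= 2")
    log_low, log_high = math.log10(low), math.log10(high)
    width = (log_high - log_low) / (count - 1)
    return [10 ** (log_low + i * width) for i in range(count)]


print(linear_grid(0.0, 1.0, 0.25))
print(log_grid(0.01, 100.0, 5))
```

A logarithmic grid is the usual choice for values such as a regularisation strength, where candidates span several orders of magnitude.
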

class lir.config.substitution.FolderHyperparameter(path: str, folder: str, ignore_files: list[str] | None = None)[source]

Bases: Hyperparameter

Hyperparameter that enumerates all files in a given folder as options.

This hyperparameter reads the contents of a specified folder and generates one option per file. Each option uses the file’s full path as both its name and its value.

In a YAML configuration, a folder hyperparameter supports the following fields:

  • folder: Path to the folder containing the candidate files.

  • ignore_files: Optional list of file patterns to ignore.

Example configuration:

hyperparameters:
- path: data.provider.path
  type: folder
  folder: project_files/my_dataset/
  ignore_files:  # Optional list of file patterns to ignore.
   - '*.tmp'
   - 'ignore_this_file.csv'
Parameters:
  • path (str) – Configuration path to substitute.

  • folder (str) – Folder containing candidate files.

  • ignore_files (list[str] | None, optional) – Filename patterns to exclude.

Raises:
  • ValueError – If the specified folder does not exist (during initialisation).

  • ValueError – If no valid files are found in the folder after applying the ignore patterns (when calling options()).

options() list[HyperparameterOption][source]

Generate options by walking over the folder.

Returns:

File-based options discovered in the folder.

Return type:

list[HyperparameterOption]
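
The behaviour described above can be approximated with the standard library alone. In this sketch, folder_options is a hypothetical function, not the library's implementation. It assumes that the folder is walked recursively, that ignore patterns are shell-style globs matched against the file name, and that results are sorted. It returns (name, value) pairs in which both elements are the file path, and raises ValueError in the two situations listed under Raises.

```python
from __future__ import annotations

import fnmatch
import tempfile
from pathlib import Path


def folder_options(folder: str, ignore_files: list[str] | None = None) -> list[tuple[str, str]]:
    """Return one (name, value) pair per file, both set to the file path."""
    root = Path(folder)
    if not root.is_dir():
        raise ValueError(f"folder not found: {folder}")
    patterns = ignore_files or []
    options = []
    # Assumption: recurse into subfolders and sort for a stable order.
    for file in sorted(p for p in root.rglob("*") if p.is_file()):
        # Assumption: patterns are matched against the bare file name.
        if any(fnmatch.fnmatch(file.name, pattern) for pattern in patterns):
            continue
        options.append((str(file), str(file)))
    if not options:
        raise ValueError(f"no valid files found in folder: {folder}")
    return options


with tempfile.TemporaryDirectory() as tmp:
    for filename in ("a.csv", "b.csv", "scratch.tmp", "ignore_this_file.csv"):
        (Path(tmp) / filename).write_text("")
    found = folder_options(tmp, ignore_files=["*.tmp", "ignore_this_file.csv"])
    names = [Path(name).name for name, _ in found]
    print(names)
```
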

class lir.config.substitution.Hyperparameter(name: str)[source]

Bases: ABC

Base class for all hyperparameters.

Parameters:

name (str) – Hyperparameter name.

abstractmethod options() list[HyperparameterOption][source]

Get a list of values that a hyperparameter can take in the context of a particular experiment.

Returns:

List of options for this hyperparameter.

Return type:

list[HyperparameterOption]

class lir.config.substitution.HyperparameterOption(name: str, substitutions: Mapping[str, Any])[source]

Bases: NamedTuple

An option for a value of a hyperparameter.

A HyperparameterOption is a named tuple with two fields:

  • name: A descriptive name of this option.

  • substitutions: A mapping of configuration paths to values.

name: str

Alias for field number 0

substitutions: Mapping[str, Any]

Alias for field number 1
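
Because the class is a named tuple, its fields can be read by attribute, by index, or by unpacking, which is what the "Alias for field number" entries refer to. The sketch below uses a hypothetical stand-in class with the same two fields rather than importing the library, and the substitution value shown is an illustrative assumption.

```python
from typing import Any, Mapping, NamedTuple


class Option(NamedTuple):
    """Hypothetical stand-in with the same fields as HyperparameterOption."""

    name: str
    substitutions: Mapping[str, Any]


option = Option(
    name="svm",
    substitutions={"comparing.clf": {"method": "svm", "probability": True}},
)

# Field 0 is `name` and field 1 is `substitutions`.
name, substitutions = option
print(option[0] == option.name, option[1] is option.substitutions)
```
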

lir.config.substitution.parse_parameter(spec: ContextAwareDict, output_dir: Path) Hyperparameter[source]

Parse one parameter specification into a hyperparameter object.

Parameters:
  • spec (ContextAwareDict) – Parameter specification.

  • output_dir (Path) – Output directory used by nested parser calls.

Returns:

Parsed hyperparameter object.

Return type:

Hyperparameter

lir.config.substitution.substitute_parameters(base_config: ContextAwareDict, hyperparameters: Mapping[str, Any], context: list[str]) ContextAwareDict[source]

Substitute parameters in an LR system configuration and return the updated configuration.

Parameters:
  • base_config (ContextAwareDict) – Original LR system configuration.

  • hyperparameters (Mapping[str, Any]) – Hyperparameters and their replacement values.

  • context (list[str]) – Context path of the augmented configuration.

Returns:

Augmented LR system configuration.

Return type:

ContextAwareDict
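
To make the idea of path-based substitution concrete, here is a minimal sketch on plain dictionaries. It is not the library's implementation: substitute is a hypothetical function, a plain dict stands in for ContextAwareDict, and the context argument is omitted. It assumes that paths are dot-separated (as in comparing.clf), that the base configuration is left unmodified, and that missing intermediate keys are created.

```python
import copy
from typing import Any, Mapping


def substitute(base_config: dict[str, Any], substitutions: Mapping[str, Any]) -> dict[str, Any]:
    """Return a copy of base_config with each dotted path set to its value."""
    # Assumption: work on a deep copy so the original stays untouched.
    config = copy.deepcopy(base_config)
    for path, value in substitutions.items():
        *parents, leaf = path.split(".")
        node = config
        for key in parents:
            # Assumption: create intermediate sections that do not exist yet.
            node = node.setdefault(key, {})
        node[leaf] = value
    return config


base = {
    "comparing": {"clf": {"method": "logistic_regression", "C": 1}},
    "output": "results",
}
updated = substitute(base, {"comparing.clf": {"method": "svm", "probability": True}})
print(updated["comparing"]["clf"])
print(base["comparing"]["clf"])
```

Copying first means the same base configuration can be reused for every option of a hyperparameter without one run leaking into the next.
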