lir.transform package

class lir.transform.BinaryClassifierTransformer(estimator: SKLearnPipelineModule)[source]

Bases: Transformer

Implementation of a binary classifier as a scikit-learn Pipeline step.

Parameters:

estimator (SKLearnPipelineModule) – Estimator used to produce transformed or scored outputs.

apply(instances: InstanceData) InstanceData[source]

Convert instances by applying the fitted model.

Parameters:

instances (InstanceData) – Input instances to be processed by this method.

Returns:

Instance data object produced by this operation.

Return type:

InstanceData

fit(instances: InstanceData) Self[source]

Fit the model on the provided instances.

Parameters:

instances (InstanceData) – Input instances to be processed by this method.

Returns:

This transformer instance after fitting.

Return type:

Self
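The fit/apply contract described above can be sketched without the library. The names ToyInstances, ThresholdClassifier and ClassifierStep are hypothetical stand-ins, not part of lir, and keeping only the second probability column is an assumption made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ToyInstances:
    """Hypothetical stand-in for InstanceData: feature rows plus optional labels."""

    features: list[list[float]]
    labels: list[int] | None = None


class ThresholdClassifier:
    """Toy estimator exposing fit() and predict_proba()."""

    def fit(self, X, y=None):
        # Use the mean of the first feature as the decision threshold.
        self.threshold_ = sum(row[0] for row in X) / len(X)
        return self

    def predict_proba(self, X):
        # One [P(class 0), P(class 1)] pair per row.
        return [[0.1, 0.9] if row[0] > self.threshold_ else [0.9, 0.1] for row in X]


class ClassifierStep:
    """Illustrative adapter; not the library's BinaryClassifierTransformer."""

    def __init__(self, estimator):
        self.estimator = estimator

    def fit(self, instances):
        self.estimator.fit(instances.features, instances.labels)
        return self  # returning self allows chaining

    def apply(self, instances):
        proba = self.estimator.predict_proba(instances.features)
        # Assumption: keep only the probability of the second class.
        scores = [[p[1]] for p in proba]
        return ToyInstances(features=scores, labels=instances.labels)


train = ToyInstances(features=[[0.0], [1.0], [2.0], [3.0]], labels=[0, 0, 1, 1])
step = ClassifierStep(ThresholdClassifier()).fit(train)
scored = step.apply(ToyInstances(features=[[0.5], [2.5]]))
```

In this sketch, fit() trains the wrapped estimator and apply() replaces the features with the estimator's scores.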

class lir.transform.CsvWriter(path: Path, include: list[str] | None = None, header: list[str] | None = None, include_labels: bool = False, include_meta: bool = False, include_input: bool = True, include_batch: bool = False)[source]

Bases: Transformer

Implementation of a transformation step in a scikit-learn Pipeline that writes to CSV.

This might be used to obtain temporary or intermediate results for logging or debugging purposes.

Parameters:
  • path (Path) – Path of the CSV file to write.

  • include (list[str] | None) – Value passed via include.

  • header (list[str] | None) – Optional column names for the CSV header.

  • include_labels (bool) – Whether to include labels in logged output.

  • include_meta (bool) – Whether to include metadata in logged output.

  • include_input (bool) – Whether to include original inputs in logged output.

  • include_batch (bool) – Value passed via include_batch.

apply(instances: InstanceData) FeatureData[source]

Write numpy feature vector to CSV output file.

Parameters:

instances (InstanceData) – Input instances to be processed by this method.

Returns:

FeatureData object produced by this operation.

Return type:

FeatureData

fit_apply(instances: InstanceDataType) InstanceDataType[source]

Provide required fit_apply() and return all instances.

Since the CsvWriter is implemented as a step (Transformer) in the pipeline, it should support the fit_apply method which is called on all transformers of the pipeline.

We don’t need to actually fit or transform anything, so we simply return the instances (as is).

Parameters:

instances (InstanceDataType) – Input instances to be processed by this method.

Returns:

Instance data object produced by this operation.

Return type:

InstanceDataType
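A pass-through logging step of this kind might look like the following sketch. CsvLoggingStep is a hypothetical name, plain lists stand in for the instance data, and an in-memory stream replaces the file path. That fit_apply() writes nothing is one reading of the description above, not confirmed behaviour.

```python
import csv
import io


class CsvLoggingStep:
    """Illustrative pass-through step; not the library's CsvWriter."""

    def __init__(self, stream, header=None):
        # A fixed line terminator keeps the output easy to compare.
        self._writer = csv.writer(stream, lineterminator="\n")
        self._header = header
        self._header_written = False

    def apply(self, rows):
        # Write the header once, then one CSV line per feature row.
        if self._header is not None and not self._header_written:
            self._writer.writerow(self._header)
            self._header_written = True
        self._writer.writerows(rows)
        return rows  # downstream steps see the data unchanged

    def fit_apply(self, rows):
        # Nothing to fit or transform: return the instances as is.
        return rows


buffer = io.StringIO()
step = CsvLoggingStep(buffer, header=["x", "y"])
rows = [[1, 2], [3, 4]]
same_rows = step.apply(rows)
```

Because apply() returns its input, such a step can sit anywhere in a pipeline without affecting the steps after it.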

class lir.transform.DataWriter(*args, **kwargs)[source]

Bases: Protocol

Representation of a data writer and necessary methods.

writerow(row: Any) None[source]

Write row to output.

Parameters:

row (Any) – Row to write to the output.
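Because DataWriter is a Protocol, any object with a matching writerow() method should satisfy it structurally. The sketch below uses a hypothetical RowWriter protocol that mirrors the signature above, together with a made-up ListWriter class and the standard library's csv.writer.

```python
import csv
import io
from typing import Any, Protocol


class RowWriter(Protocol):
    """Hypothetical mirror of the DataWriter protocol: anything with writerow()."""

    def writerow(self, row: Any) -> None: ...


class ListWriter:
    """Collects rows in memory; satisfies RowWriter without inheriting from it."""

    def __init__(self) -> None:
        self.rows: list[Any] = []

    def writerow(self, row: Any) -> None:
        self.rows.append(row)


def dump(rows: list[list[Any]], writer: RowWriter) -> None:
    # Only writerow() is used, so any conforming writer works here.
    for row in rows:
        writer.writerow(row)


memory = ListWriter()
dump([[1, 2], [3, 4]], memory)

buffer = io.StringIO()
dump([[1, 2], [3, 4]], csv.writer(buffer, lineterminator="\n"))
```

The same dump() function serves both the in-memory writer and the CSV writer, which is the point of typing against a protocol rather than a concrete class.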

class lir.transform.FunctionTransformer(func: Callable)[source]

Bases: Transformer

Implementation of a transformer function as scikit-learn Pipeline step.

Parameters:

func (Callable) – Callable used to transform input instances.

apply(instances: InstanceData) FeatureData[source]

Call the custom defined function on the feature data instances and use output as features.

Parameters:

instances (InstanceData) – Input instances to be processed by this method.

Returns:

FeatureData object produced by this operation.

Return type:

FeatureData
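As a rough illustration of wrapping a function as a step, the sketch below uses a hypothetical FunctionStep class and nested lists in place of numpy feature arrays. It shows the idea only and does not reproduce the library's implementation.

```python
import math
from typing import Callable


class FunctionStep:
    """Illustrative stand-in for a function-based transformer step."""

    def __init__(self, func: Callable):
        self.func = func

    def fit(self, features):
        return self  # a plain function has no state to fit

    def apply(self, features):
        # The function's output becomes the new feature data.
        return self.func(features)


def log10_features(features):
    """Example function: element-wise base-10 logarithm."""
    return [[math.log10(value) for value in row] for row in features]


step = FunctionStep(log10_features)
result = step.fit([[1.0]]).apply([[1.0, 100.0], [10.0, 1000.0]])
```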

class lir.transform.Identity[source]

Bases: Transformer

Represent the Identity function of a transformer.

When apply() is called on such a transformer, it simply returns the instances.

apply(instances: InstanceDataType) InstanceDataType[source]

Return the instances unchanged.

Parameters:

instances (InstanceDataType) – Input instances to be processed by this method.

Returns:

Instance data object produced by this operation.

Return type:

InstanceDataType

class lir.transform.NumpyTransformer(transformer: Transformer, header: list[str] | None)[source]

Bases: TransformerWrapper

Transformer wrapper that extends the instances with header data before delegating to the wrapped transformer.

Parameters:
  • transformer (Transformer) – Transformer instance wrapped by this adapter.

  • header (list[str] | None) – Header data with which the instances are extended.

apply(instances: InstanceData) InstanceData[source]

Extend the instances with the desired header data, call base apply.

Parameters:

instances (InstanceData) – Input instances to be processed by this method.

Returns:

Instance data object produced by this operation.

Return type:

InstanceData

fit_apply(instances: InstanceData) InstanceData[source]

Extend the instances with the desired header data, call base fit_apply.

Parameters:

instances (InstanceData) – Input instances to be processed by this method.

Returns:

Instance data object produced by this operation.

Return type:

InstanceData

class lir.transform.SKLearnPipelineModule(*args, **kwargs)[source]

Bases: Protocol

Representation of the interface required for estimators by the scikit-learn Pipeline.

fit(X: ndarray, y: ndarray | None) Self[source]

predict_proba(X: ndarray) Any[source]

transform(X: ndarray) Any[source]

class lir.transform.SklearnTransformer(transformer: SklearnTransformerType)[source]

Bases: Transformer

Implementation of a scikit-learn style transformer as a Pipeline step.

Parameters:

transformer (SklearnTransformerType) – Transformer instance wrapped by this adapter.

apply(instances: InstanceData) InstanceData[source]

Convert instances by applying the fitted model.

Parameters:

instances (InstanceData) – Input instances to be processed by this method.

Returns:

Instance data object produced by this operation.

Return type:

InstanceData

fit(instances: InstanceData) Self[source]

Fit the model on the provided instances.

Parameters:

instances (InstanceData) – Input instances to be processed by this method.

Returns:

This transformer instance after fitting.

Return type:

Self

fit_apply(instances: InstanceData) FeatureData[source]

Combine call to .fit() followed by .apply().

Parameters:

instances (InstanceData) – Input instances to be processed by this method.

Returns:

FeatureData object produced by this operation.

Return type:

FeatureData

class lir.transform.SklearnTransformerType(*args, **kwargs)[source]

Bases: Protocol

Representation of the interface required for transformers by the scikit-learn Pipeline.

fit(features: ndarray, labels: ndarray | None) Self[source]

fit_transform(features: ndarray, labels: ndarray | None) ndarray[source]

transform(features: ndarray) Any[source]

class lir.transform.Tee(transformers: list[Transformer])[source]

Bases: Transformer

Implementation of a custom transformer that allows separate tasks to be performed on the same input.

Parameters:

transformers (list[Transformer]) – Transformers to which fit() and apply() are delegated.

apply(instances: InstanceData) InstanceData[source]

Delegate apply() to all specified transformers.

Parameters:

instances (InstanceData) – Input instances to be processed by this method.

Returns:

Instance data object produced by this operation.

Return type:

InstanceData

fit(instances: InstanceData) Self[source]

Delegate fit() to all specified transformers.

Parameters:

instances (InstanceData) – Input instances to be processed by this method.

Returns:

This tee transformer instance after delegating fit.

Return type:

Self
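The delegation pattern can be sketched as follows. TeeStep and Recorder are hypothetical names, and the choice to discard the branch outputs and pass the input through unchanged is an assumption, since the description above does not state what apply() returns.

```python
class Recorder:
    """Toy side-branch step that records what it saw."""

    def __init__(self):
        self.fitted_on = None
        self.applied_to = []

    def fit(self, instances):
        self.fitted_on = instances
        return self

    def apply(self, instances):
        self.applied_to.append(instances)
        return instances


class TeeStep:
    """Illustrative stand-in for Tee; the return value of apply() is an assumption."""

    def __init__(self, transformers):
        self.transformers = transformers

    def fit(self, instances):
        # Every branch is fitted on the same input.
        for transformer in self.transformers:
            transformer.fit(instances)
        return self

    def apply(self, instances):
        # Every branch sees the same input.
        for transformer in self.transformers:
            transformer.apply(instances)
        # Assumption: branch outputs are discarded and the input passes through.
        return instances


first, second = Recorder(), Recorder()
tee = TeeStep([first, second]).fit([1, 2, 3])
output = tee.apply([4, 5])
```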

class lir.transform.Transformer[source]

Bases: ABC

Transformer module which is compatible with the scikit-learn Pipeline.

A transformer should provide an apply() method. By default, transformers are not fitted to the data, so fit() simply returns the object it was called on, without side effects.

abstractmethod apply(instances: InstanceData) InstanceData[source]

Convert the instance data based on the (optionally fitted) model.

Parameters:

instances (InstanceData) – Input instances to be processed by this method.

Returns:

Instance data object produced by this operation.

Return type:

InstanceData

fit(instances: InstanceData) Self[source]

Perform (optional) fitting of the instance data.

Parameters:

instances (InstanceData) – Input instances to be processed by this method.

Returns:

This transformer instance after fitting.

Return type:

Self

fit_apply(instances: InstanceData) InstanceData[source]

Call fit() and then immediately apply() on the same instances.

Parameters:

instances (InstanceData) – Input instances to be processed by this method.

Returns:

Instance data object produced by this operation.

Return type:

InstanceData
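The three-method contract above (abstract apply(), a no-op fit(), and fit_apply() as their combination) can be sketched with a small abstract base class. BaseStep, Center and Double are hypothetical names, and plain lists of floats stand in for InstanceData.

```python
from abc import ABC, abstractmethod


class BaseStep(ABC):
    """Illustrative stand-in for the Transformer base class."""

    @abstractmethod
    def apply(self, instances):
        """Subclasses must implement the actual conversion."""

    def fit(self, instances):
        # Default: nothing to fit, no side effects.
        return self

    def fit_apply(self, instances):
        # fit() returns self, so the two calls can be chained.
        return self.fit(instances).apply(instances)


class Center(BaseStep):
    """Step that needs fitting: subtracts the mean seen during fit()."""

    def fit(self, instances):
        self.mean_ = sum(instances) / len(instances)
        return self

    def apply(self, instances):
        return [x - self.mean_ for x in instances]


class Double(BaseStep):
    """Stateless step: relies on the inherited no-op fit()."""

    def apply(self, instances):
        return [2 * x for x in instances]


centered = Center().fit_apply([1.0, 2.0, 3.0])
doubled = Double().fit_apply([1, 2])
```

A stateless step only has to implement apply(), while a step with state overrides fit() as well and still gets fit_apply() for free.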

class lir.transform.TransformerWrapper(wrapped_transformer: Transformer)[source]

Bases: Transformer

Base class for a transformer wrapper.

This class is derived from Transformer and has a default implementation of all functions, which forwards each call to the wrapped transformer. A subclass may add or change functionality by overriding functions.

Parameters:

wrapped_transformer (Transformer) – Transformer instance to which calls are forwarded.

apply(instances: InstanceData) InstanceData[source]

Delegate the call to the underlying wrapped transformer and return its result.

Parameters:

instances (InstanceData) – Input instances to be processed by this method.

Returns:

Instance data object produced by this operation.

Return type:

InstanceData

fit(instances: InstanceData) Self[source]

Delegate calls to underlying wrapped transformer but return the Wrapper instance.

Parameters:

instances (InstanceData) – Input instances to be processed by this method.

Returns:

This wrapper instance after delegating fit.

Return type:

Self
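A forwarding wrapper of this kind can be sketched as below. WrapperStep, RoundingWrapper and Scale are hypothetical names. The sketch assumes that fit() returns the wrapper so that chained calls keep going through it, while apply() returns whatever the wrapped object produces.

```python
class Scale:
    """Toy wrapped transformer: multiplies every value by a factor."""

    def __init__(self, factor):
        self.factor = factor

    def fit(self, instances):
        return self

    def apply(self, instances):
        return [self.factor * x for x in instances]


class WrapperStep:
    """Illustrative forwarding wrapper; not the library's TransformerWrapper."""

    def __init__(self, wrapped_transformer):
        self.wrapped_transformer = wrapped_transformer

    def fit(self, instances):
        self.wrapped_transformer.fit(instances)
        return self  # the wrapper, not the wrapped object

    def apply(self, instances):
        return self.wrapped_transformer.apply(instances)


class RoundingWrapper(WrapperStep):
    """Subclass that changes behaviour by overriding a single method."""

    def apply(self, instances):
        return [round(x, 1) for x in super().apply(instances)]


wrapper = RoundingWrapper(Scale(0.333))
fitted = wrapper.fit([1.0, 2.0])
result = fitted.apply([1.0, 2.0])
```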

lir.transform.as_transformer(transformer_like: Any) Transformer[source]

Return a Transformer instance for the given transformer-like input.

For any transformer-like object, wrap if necessary, and return a Transformer.

The transformer-like object may be one of the following:

  • an instance of Transformer, which is returned as-is;

  • a scikit-learn style transformer which implements transform() and optionally fit() and/or fit_transform();

  • a scikit-learn style estimator, which implements fit() and predict_proba(); or

  • a callable which takes an np.ndarray argument and returns another np.ndarray.

Parameters:

transformer_like (Any) – Object to convert to the internal Transformer interface.

Returns:

Equivalent object adapted to the internal Transformer interface.

Return type:

Transformer
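The four cases described above suggest a duck-typing dispatch, which might be sketched as follows. All class names and to_step() itself are hypothetical. The order of the checks and the TypeError for unsupported input are assumptions, not documented behaviour.

```python
class Step:
    """Hypothetical minimal stand-in for the Transformer base class."""

    def fit(self, data):
        return self

    def apply(self, data):
        raise NotImplementedError


class TransformAdapter(Step):
    """Wraps an object with transform() and, optionally, fit()."""

    def __init__(self, wrapped):
        self.wrapped = wrapped

    def fit(self, data):
        if hasattr(self.wrapped, "fit"):
            self.wrapped.fit(data)
        return self

    def apply(self, data):
        return self.wrapped.transform(data)


class ProbaAdapter(Step):
    """Wraps an estimator with fit() and predict_proba()."""

    def __init__(self, wrapped):
        self.wrapped = wrapped

    def fit(self, data):
        self.wrapped.fit(data)
        return self

    def apply(self, data):
        return self.wrapped.predict_proba(data)


class CallableAdapter(Step):
    """Wraps a plain function."""

    def __init__(self, func):
        self.func = func

    def apply(self, data):
        return self.func(data)


def to_step(obj):
    """Check the four cases in the order they are described (an assumption)."""
    if isinstance(obj, Step):
        return obj
    if hasattr(obj, "transform"):
        return TransformAdapter(obj)
    if hasattr(obj, "fit") and hasattr(obj, "predict_proba"):
        return ProbaAdapter(obj)
    if callable(obj):
        return CallableAdapter(obj)
    raise TypeError(f"not transformer-like: {obj!r}")  # assumed error handling


class Negate:
    """Toy scikit-learn style transformer with only transform()."""

    def transform(self, data):
        return [-x for x in data]


existing = CallableAdapter(sorted)
negated = to_step(Negate()).fit([1, 2]).apply([1, 2])
doubled = to_step(lambda data: [2 * x for x in data]).apply([1, 2])
```

Testing for an existing Step first ensures that already-adapted objects are never wrapped twice.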

Submodules