hansken_extraction_plugin.api.extraction_trace
This module contains the different Trace APIs.
Note
There are several kinds of traces:
- ExtractionTrace and MetaExtractionTrace, which are offered to the process function.
- ExtractionTraceBuilder, a trace that can be built; it does not exist in Hansken yet, but is added after building.
- SearchTrace, an immutable trace that is returned when searching for traces.
Classes
ExtractionTrace – Trace offered to be processed.
ExtractionTraceBuilder – ExtractionTrace that can be built.
MetaExtractionTrace – MetaExtractionTraces contain only metadata.
SearchTrace – SearchTraces represent traces returned when searching for traces.
Trace – All trace classes should be able to return values.
- class ExtractionTraceBuilder[source]
Bases:
ABC
ExtractionTrace that can be built.
Represents child traces.
- abstract update(key_or_updates: Mapping | str | None = None, value: Any | None = None, data: Mapping[str, bytes] | None = None) ExtractionTraceBuilder [source]
Update or add metadata properties for this ExtractionTraceBuilder.
Can be used to update the name of the Trace represented by this builder, if not already set.
- Parameters:
key_or_updates – either a str (the metadata property to be updated) or a mapping supplying both keys and values to be updated
value – the value to update metadata property key to (used only when key_or_updates is a str; an exception is thrown if a value is supplied while key_or_updates is a mapping)
data – a dict mapping data type / stream name to bytes to be added to the trace
- Returns:
this ExtractionTraceBuilder
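The two calling forms of update can be sketched with a small in-memory stand-in. FakeBuilder is hypothetical and not part of hansken_extraction_plugin; the property names are illustrative, and the exception type raised for a mapping combined with a value is an assumption, since the documentation only says that an exception is thrown.

```python
from typing import Any, Dict, Mapping, Optional, Union


class FakeBuilder:
    """Hypothetical in-memory stand-in; not the SDK's implementation."""

    def __init__(self) -> None:
        self.properties: Dict[str, Any] = {}
        self.data: Dict[str, bytes] = {}

    def update(self,
               key_or_updates: Union[Mapping, str, None] = None,
               value: Any = None,
               data: Optional[Mapping[str, bytes]] = None) -> 'FakeBuilder':
        if isinstance(key_or_updates, str):
            # form 1: a single property name plus its value
            self.properties[key_or_updates] = value
        elif key_or_updates is not None:
            if value is not None:
                # assumption: the real exception type may differ
                raise ValueError('value is only valid when key_or_updates is a str')
            # form 2: a mapping supplying both keys and values
            self.properties.update(key_or_updates)
        if data is not None:
            # stream name -> bytes to attach to the trace
            self.data.update(data)
        return self  # returning the builder is what allows chaining


builder = FakeBuilder()
builder.update('name', 'notes.txt').update({'file.extension': 'txt'},
                                           data={'text': b'hello'})
```

Because every call returns the builder itself, several updates can be chained in one expression.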
- abstract add_tracelet(tracelet: Tracelet | str, value: Mapping[str, Any] | None = None) ExtractionTraceBuilder [source]
Add a Tracelet to this ExtractionTraceBuilder.
- Parameters:
tracelet – the Tracelet or tracelet type (supplied as a str) to add
value – the tracelet properties to add (only applicable when tracelet is a str)
- Returns:
this ExtractionTraceBuilder
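A minimal sketch of the two ways to call add_tracelet, either with a ready-made Tracelet or with a tracelet type and a mapping of properties. The Tracelet dataclass below is a stand-in whose field names are assumptions, TraceletSketch is hypothetical, and the tracelet type 'prediction' with its properties is purely illustrative.

```python
from dataclasses import dataclass
from typing import Any, List, Mapping, Optional, Union


@dataclass
class Tracelet:
    """Stand-in for the SDK's Tracelet; the field names are assumptions."""
    name: str
    value: Mapping[str, Any]


class TraceletSketch:
    """Hypothetical stand-in that only collects the tracelets added to it."""

    def __init__(self) -> None:
        self.tracelets: List[Tracelet] = []

    def add_tracelet(self,
                     tracelet: Union[Tracelet, str],
                     value: Optional[Mapping[str, Any]] = None) -> 'TraceletSketch':
        if isinstance(tracelet, str):
            # a type name was supplied, so the properties come from value
            tracelet = Tracelet(tracelet, value or {})
        self.tracelets.append(tracelet)
        return self


sketch = TraceletSketch()
sketch.add_tracelet('prediction', {'label': 'cat'})
sketch.add_tracelet(Tracelet('prediction', {'label': 'cat'}))
```

In this sketch both calls end up storing an equal Tracelet; value is ignored when a Tracelet object is passed.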
- abstract add_transformation(data_type: str, transformation: Transformation) ExtractionTraceBuilder [source]
Update or add transformations for this ExtractionTraceBuilder.
- Parameters:
data_type – data type of the Transformation
transformation – the Transformation to add
- Returns:
this ExtractionTraceBuilder
- abstract child_builder(name: str | None = None) ExtractionTraceBuilder [source]
Create a new ExtractionTraceBuilder to build a child trace of the trace represented by this builder.
Note
Traces should be created and built in depth first order, parent before child (pre-order).
- Returns:
an ExtractionTraceBuilder set up to save a new trace as the child trace of this builder
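The pre-order rule can be illustrated with a stand-in that records the order in which traces are built. SketchBuilder is hypothetical, the trace names are made up, and its build method only appends to a list; how the real builder saves a trace is not shown here.

```python
from typing import List, Optional


class SketchBuilder:
    """Hypothetical stand-in that records the order in which traces are built."""

    def __init__(self, name: Optional[str], log: List[Optional[str]]) -> None:
        self.name = name
        self._log = log

    def child_builder(self, name: Optional[str] = None) -> 'SketchBuilder':
        # the child shares the log so the overall build order stays visible
        return SketchBuilder(name, self._log)

    def build(self) -> None:
        # assumption: building saves the trace; here it is only logged
        self._log.append(self.name)


built: List[Optional[str]] = []
root = SketchBuilder('archive.zip', built)
root.build()                                # parent first
folder = root.child_builder('docs')
folder.build()                              # then its child
leaf = folder.child_builder('readme.txt')
leaf.build()                                # then the grandchild
sibling = root.child_builder('images')
sibling.build()                             # siblings only after the subtree is done
```

Each parent is built before any of its children, and a subtree is finished before the next sibling starts, which is what depth-first pre-order means.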
- add_data(stream: str, data: bytes) ExtractionTraceBuilder [source]
Add data to this trace as a named stream.
- Parameters:
stream – name of the data stream to be added
data – data to be attached
- Returns:
this ExtractionTraceBuilder
- abstract open(data_type: str | None = None, offset: int = 0, size: int | None = None, mode: Literal['rb', 'wb', 'w', 'wt'] = 'rb', encoding='utf-8', buffer_size: int | None = None) BufferedReader | BufferedWriter | TextIOBase [source]
Open a data stream to read or write data from or to the ExtractionTrace.
- Parameters:
data_type – the data type of the data stream, ‘raw’ by default
offset – byte offset to start the stream on when reading
size – the number of bytes to make available when reading
mode – ‘rb’ for reading bytes, ‘wb’ for writing bytes, ‘w’ or ‘wt’ for writing text
encoding – encoding for writing text, used to convert str values to bytes, only valid for modes ‘w’ and ‘wt’
buffer_size – buffer size for reading (cache read back/ahead) or writing (cache for flush) data
- Returns:
a file-like object to read or write bytes from the named stream
- class SearchTrace[source]
Bases:
Trace
SearchTraces represent traces returned when searching for traces.
- abstract open(stream: str = 'raw', offset: int = 0, size: int | None = None, buffer_size: int | None = None) BufferedReader [source]
Open a data stream of the data that is being processed.
- Parameters:
stream – data stream of the trace to open; defaults to raw. Other examples are html and text.
offset – byte offset to start the stream on
size – the number of bytes to make available
buffer_size – buffer size for reading data
- Returns:
a file-like object to read bytes from the named stream
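How offset and size narrow the readable window can be sketched with the standard io module. open_stream is a hypothetical helper operating on a plain dict of streams, not the SDK's implementation, and the stream contents are made up.

```python
import io
from typing import Dict, Optional


def open_stream(streams: Dict[str, bytes],
                stream: str = 'raw',
                offset: int = 0,
                size: Optional[int] = None,
                buffer_size: Optional[int] = None) -> io.BufferedReader:
    """Hypothetical helper: expose bytes [offset, offset + size) of a named stream."""
    data = streams[stream]
    end = len(data) if size is None else offset + size
    window = io.BytesIO(data[offset:end])
    return io.BufferedReader(window, buffer_size or io.DEFAULT_BUFFER_SIZE)


streams = {'raw': b'%PDF-1.7 example', 'text': b'example'}
with open_stream(streams, offset=1, size=3) as reader:
    header = reader.read()
with open_stream(streams, stream='text') as reader:
    text = reader.read()
```

In this sketch the reader only ever sees the requested window, so read() without arguments stops at offset + size rather than at the end of the stream.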
- class MetaExtractionTrace[source]
Bases:
Trace
MetaExtractionTraces contain only metadata.
This class represents traces during the extraction of an extraction plugin without a data stream.
- abstract update(key_or_updates: Mapping | str | None = None, value: Any | None = None, data: Mapping[str, bytes] | None = None) None [source]
Update or add metadata properties for this ExtractionTrace.
- Parameters:
key_or_updates – either a str (the metadata property to be updated) or a mapping supplying both keys and values to be updated
value – the value to update metadata property key to (used only when key_or_updates is a str; an exception is thrown if a value is supplied while key_or_updates is a mapping)
data – a dict mapping data type / stream name to bytes to be added to the trace
- abstract add_tracelet(tracelet: Tracelet | str, value: Mapping[str, Any] | None = None) None [source]
Add a Tracelet to this MetaExtractionTrace.
- Parameters:
tracelet – the Tracelet or tracelet type to add
value – the tracelet properties to add (only applicable when tracelet is a tracelet type)
- abstract add_transformation(data_type: str, transformation: Transformation) None [source]
Update or add transformations for this MetaExtractionTrace.
- Parameters:
data_type – data type of the Transformation
transformation – the Transformation to add
- abstract child_builder(name: str | None = None) ExtractionTraceBuilder [source]
Create an ExtractionTraceBuilder to build a trace to be saved as a child of this Trace.
A new trace will only be added to the index once explicitly saved (e.g. through ExtractionTraceBuilder.build).
Note
Traces should be created and built in depth first order, parent before child (pre-order).
- Parameters:
name – the name for the trace being built
- Returns:
an ExtractionTraceBuilder set up to create a child trace of this MetaExtractionTrace
- class ExtractionTrace[source]
Bases:
MetaExtractionTrace
Trace offered to be processed.
- abstract open(data_type: str | None = None, offset: int = 0, size: int | None = None, mode: Literal['rb', 'wb', 'w', 'wt'] = 'rb', encoding='utf-8', buffer_size: int | None = None) BufferedReader | BufferedWriter | TextIOBase [source]
Open a data stream to read or write data from or to the ExtractionTrace.
- Parameters:
data_type – the data type of the data stream, ‘raw’ by default
offset – byte offset to start the stream on when reading
size – the number of bytes to make available when reading
mode – ‘rb’ for reading bytes, ‘wb’ for writing bytes, ‘w’ or ‘wt’ for writing text
encoding – encoding for writing text, used to convert str values to bytes, only valid for modes ‘w’ and ‘wt’
buffer_size – buffer size for reading (cache read back/ahead) or writing (cache for flush) data
- Returns:
a file-like object to read or write bytes from the named stream
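The relationship between mode, encoding and the returned object type can be sketched with the standard io classes named in the signature. open_sketch and _Sink are hypothetical helpers backed by a plain dict; they only mirror the documented parameters and say nothing about how the SDK stores data.

```python
import io
from typing import Dict, Optional


class _Sink(io.BytesIO):
    """Collects written bytes and stores them under a stream name on close."""

    def __init__(self, store: Dict[str, bytes], key: str) -> None:
        super().__init__()
        self._store = store
        self._key = key

    def close(self) -> None:
        if not self.closed:
            self._store[self._key] = self.getvalue()
        super().close()


def open_sketch(store: Dict[str, bytes],
                data_type: Optional[str] = None,
                offset: int = 0,
                size: Optional[int] = None,
                mode: str = 'rb',
                encoding: str = 'utf-8'):
    """Hypothetical helper mirroring the documented parameters of open."""
    data_type = data_type or 'raw'  # 'raw' by default, as documented
    if mode == 'rb':
        data = store[data_type]
        end = len(data) if size is None else offset + size
        return io.BufferedReader(io.BytesIO(data[offset:end]))
    if mode not in ('wb', 'w', 'wt'):
        raise ValueError(f'unsupported mode: {mode}')
    writer = io.BufferedWriter(_Sink(store, data_type))
    if mode == 'wb':
        return writer  # accepts bytes
    # 'w' and 'wt' accept str values, converted to bytes using encoding
    return io.TextIOWrapper(writer, encoding=encoding)


store = {'raw': b'0123456789'}
with open_sketch(store, offset=2, size=3) as reader:
    chunk = reader.read()
with open_sketch(store, data_type='text', mode='w') as text_writer:
    text_writer.write('caf\u00e9')
with open_sketch(store, data_type='copy', mode='wb') as byte_writer:
    byte_writer.write(chunk)
```

Using the returned object as a context manager ensures buffered data is flushed when the block exits; in this sketch that is also the moment the written bytes appear in the store.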