Packaging

Extraction plugins are packaged as OCI images (also known as Docker images). The OCI images are labeled with the PluginInfo. To automate packaging of a Python plugin and labeling the OCI image, the Extraction Plugins SDK comes with three utility applications: label_plugin, build_plugin and build_plugin_ci.

To package a plugin, make sure that both the Extraction Plugins SDK and Docker are installed. Next, build and label your plugin as described in the following sections.

To verify that the image has been built, use the following command to view all local images:

docker images

Once your plugin is packaged and labeled, it can be published ('uploaded') to Hansken. See "Upload the plugin to Hansken" for instructions.

label_plugin

label_plugin is a utility that adds labels to an extraction plugin image. To label a plugin, first build the plugin image with docker build, for example with one of the following commands (the second one passes a proxy to the build):

docker build . -t my_plugin
docker build . -t my_plugin --build-arg https_proxy=http://your_proxy:8080

Next, run the label_plugin utility to label the built plugin image:

label_plugin my_plugin

This utility briefly starts your plugin using Docker and requests the PluginInfo from the plugin. The information from the PluginInfo is then added as labels to the plugin image. The result of label_plugin is a plugin image that can be published to a Docker/OCI image registry.
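To check which labels ended up on the image, the JSON printed by docker image inspect can be read back. The following is a minimal Python sketch, assuming the image is named my_plugin as in the example above. The helper names labels_from_inspect and read_image_labels are hypothetical and not part of the SDK, and since the exact label keys written by label_plugin are not listed here, the sketch simply returns whatever labels it finds.

```python
import json
import subprocess


def labels_from_inspect(inspect_json: str) -> dict:
    """Return the labels found in the JSON output of `docker image inspect`."""
    images = json.loads(inspect_json)  # docker prints a JSON array, one entry per image
    if not images:
        return {}
    # Labels are stored under Config.Labels; the value is null when none are set.
    return (images[0].get("Config") or {}).get("Labels") or {}


def read_image_labels(image: str) -> dict:
    """Run `docker image inspect IMAGE` and return the labels of that image."""
    result = subprocess.run(
        ["docker", "image", "inspect", image],
        check=True,
        capture_output=True,
        text=True,
    )
    return labels_from_inspect(result.stdout)


# Example (requires Docker and a built, labeled image):
#   for key, value in sorted(read_image_labels("my_plugin").items()):
#       print(f"{key}={value}")
```

If the returned dictionary is empty after running label_plugin, the labeling step most likely did not complete.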

build_plugin

The build_plugin utility extends label_plugin by also taking care of the docker build command. Use it as a one-liner to both build and label your plugin image.

To build your plugin container image you can use the following command:

build_plugin DOCKER_FILE_DIRECTORY [DOCKER_IMAGE_NAME] [DOCKER_ARGS]

For example:

build_plugin .

To pass proxy configuration to Docker:

build_plugin . --build-arg http_proxy="$http_proxy" --build-arg https_proxy="$https_proxy"

This will generate a plugin image:

  • The extraction plugin is added to your local image registry (docker images).

  • The image is tagged with two tags: latest, and your plugin version.

Note that the variables $http_proxy and $https_proxy are put in quotes; this is needed in case their values contain spaces.

Arguments:

  • DOCKER_FILE_DIRECTORY: Path to the directory containing the Dockerfile of the plugin.

  • (Optional) DOCKER_IMAGE_NAME: Name of the docker image without tag. Note that docker image names cannot start with a period or dash. If it starts with a dash, it will be interpreted as an additional docker argument (see DOCKER_ARGS). If no name is given, the name defaults to extraction-plugin/PLUGINID, e.g. extraction-plugin/nfi.nl/extract/chat/whatsapp.

  • (Optional) DOCKER_ARGS: Additional arguments for the docker command; you can pass as many arguments as you like.
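The naming rules above can be illustrated with a short Python sketch. It is not the SDK's implementation, only a restatement of the documented behavior: the function names are hypothetical, and the plugin version "1.0.0" in the usage note is an invented example value.

```python
def default_image_name(plugin_id: str) -> str:
    """Image name used when DOCKER_IMAGE_NAME is omitted."""
    return f"extraction-plugin/{plugin_id}"


def split_arguments(args: list) -> tuple:
    """Split the arguments that follow DOCKER_FILE_DIRECTORY.

    Returns (image_name_or_None, docker_args).
    """
    if args and not args[0].startswith("-"):
        name = args[0]
        if name.startswith("."):
            raise ValueError("docker image names cannot start with a period")
        return name, args[1:]
    # No name given, or the first argument starts with a dash:
    # everything is treated as additional docker arguments.
    return None, args


def image_tags(image_name: str, plugin_version: str) -> list:
    """The two tags described above: latest and the plugin version."""
    return [f"{image_name}:latest", f"{image_name}:{plugin_version}"]
```

For instance, split_arguments(["--build-arg", "http_proxy=..."]) yields no image name, so the default name applies, and image_tags(default_image_name("nfi.nl/extract/chat/whatsapp"), "1.0.0") lists the two tags the image would carry under these assumptions.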

build_plugin_ci

The build_plugin_ci utility performs the same tasks as build_plugin, but uses a different approach for labeling the plugin. To label the plugin with the plugin info specified in the plugin, build_plugin has to start a container and connect to it, and not all CI/CD pipelines allow this. build_plugin_ci instead exports the image and parses the plugin info from the exported image, so no container has to be started.

The advantage of build_plugin_ci is that it builds plugins more reliably on CI systems. The downside is that it is slower than build_plugin. It is therefore advised to use build_plugin when developing locally, to speed up the development process, and to use build_plugin_ci when building plugins in CI/CD pipelines.
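A build script shared between local development and a pipeline could pick the utility automatically. The sketch below is one possible approach, not something the SDK provides: it assumes the pipeline sets the CI environment variable to a non-empty value such as "true" (as GitHub Actions and GitLab CI do), and the function name choose_build_tool is hypothetical.

```python
import os


def choose_build_tool(environ=None) -> str:
    """Return the name of the utility to use for the current environment."""
    env = os.environ if environ is None else environ
    # ASSUMPTION: CI systems set CI to a non-empty, non-false value.
    in_ci = env.get("CI", "").lower() not in ("", "0", "false")
    # build_plugin_ci does not need to start a container, build_plugin is faster.
    return "build_plugin_ci" if in_ci else "build_plugin"
```

If your CI system does not set CI, replace the check with whichever variable your pipeline defines.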

Requirements

To build your plugin container image with build_plugin_ci, add the following line to your Containerfile (Dockerfile), after the step that copies your plugin into the image. build_plugin_ci requires this line to find the plugin information.

RUN plugin_info "/app/plugin.py"
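Because a misplaced or missing plugin_info line is easy to overlook, a pipeline could check the Containerfile before building. The following Python sketch is a rough heuristic and not part of the SDK: the function name is hypothetical, and it assumes the plugin is copied with a COPY or ADD instruction and that the last such instruction is the relevant one, which may not hold for multi-stage builds.

```python
def plugin_info_after_copy(containerfile_text: str) -> bool:
    """True if a `RUN plugin_info ...` line follows the last COPY/ADD instruction."""
    last_copy = -1
    plugin_info_line = -1
    for number, line in enumerate(containerfile_text.splitlines()):
        words = line.split()
        if not words:
            continue  # skip blank lines
        instruction = words[0].upper()
        if instruction in ("COPY", "ADD"):
            last_copy = number
        elif instruction == "RUN" and len(words) > 1 and words[1] == "plugin_info":
            plugin_info_line = number
    # Both instructions must be present, with plugin_info coming later.
    return plugin_info_line > last_copy >= 0
```

A pipeline step could read the Containerfile, call this function, and stop with a clear message before the slower image build starts.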

Command usage

build_plugin_ci can be invoked as follows:

build_plugin_ci DOCKER_FILE_DIRECTORY [--target_name DOCKER_IMAGE_NAME] [--build_agent BUILD_AGENT] [BUILD_AGENT_ARGS]

In addition to the options provided by build_plugin, build_plugin_ci provides a --build_agent flag, which lets you choose the build agent used to build and label your plugin. Currently only docker, podman and buildah are supported. For example:

build_plugin_ci . --target_name my-image:1.3.5 --build_agent "podman"

Proxy configuration can be passed as follows:

build_plugin_ci . --build-arg http_proxy="$http_proxy" --build-arg https_proxy="$https_proxy"

This will generate a plugin image:

  • The extraction plugin is added to your local image registry (docker images).

  • The image is tagged with two tags: latest, and your plugin version.

Note that the variables $http_proxy and $https_proxy are put in quotes; this is needed in case their values contain spaces.

Arguments:

  • DOCKER_FILE_DIRECTORY: Path to the directory containing the Dockerfile of the plugin.

  • (Optional) DOCKER_IMAGE_NAME: Name of the docker image without tag. Note that docker image names cannot start with a period or dash. If it starts with a dash, it will be interpreted as an additional docker argument (see DOCKER_ARGS). If no name is given, the name defaults to extraction-plugin/PLUGINID, e.g. extraction-plugin/nfi.nl/extract/chat/whatsapp.

  • (Optional) BUILD_AGENT: The build agent that should be used to build the plugin. Either docker, podman or buildah. Docker is the default.

  • (Optional) BUILD_AGENT_ARGS: Additional arguments for the build agent command; you can pass as many arguments as you like.
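When build_plugin_ci is called from a script, the argument list can be assembled programmatically. The sketch below is a hedged illustration: it assumes the executable is named build_plugin_ci, uses only the --target_name and --build_agent flags described above, and the function name build_plugin_ci_command is hypothetical.

```python
SUPPORTED_BUILD_AGENTS = ("docker", "podman", "buildah")


def build_plugin_ci_command(directory, target_name=None,
                            build_agent="docker", build_agent_args=()):
    """Assemble the argument list for a build_plugin_ci invocation."""
    if build_agent not in SUPPORTED_BUILD_AGENTS:
        raise ValueError(f"unsupported build agent: {build_agent!r}")
    command = ["build_plugin_ci", directory]
    if target_name is not None:
        command += ["--target_name", target_name]
    command += ["--build_agent", build_agent]
    # Remaining arguments are handed to the build agent unchanged.
    command += list(build_agent_args)
    return command


# Example (requires the SDK and the chosen build agent to be installed):
#   import subprocess
#   subprocess.run(build_plugin_ci_command(".", build_agent="podman"), check=True)
```

Passing the command to subprocess.run as a list avoids shell word splitting, so proxy values that contain spaces need no extra quoting.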