Packaging

Extraction plugins are packaged as OCI images (also known as Docker images). The OCI images are labeled with the PluginInfo. To automate packaging a Python plugin and labeling the OCI image, the Extraction Plugin SDK comes with two utility applications: label_plugin and the deprecated build_plugin.

To package a plugin, make sure that both the Extraction Plugin SDK and Docker are installed. Next, build and label your plugin as described in the following sections.

To verify that the image has been built, use the following command to view all local images:

docker images
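
For scripted checks, the same verification can be done from Python. The sketch below is an illustration and not part of the SDK: the helper name image_exists and its injectable run parameter are assumptions made here. It relies only on docker image inspect, which exits with a non-zero status when the named image is not available locally.

```python
import subprocess


def image_exists(image, run=subprocess.run):
    """Return True if `image` is present in the local Docker image store.

    Hypothetical helper, not part of the Extraction Plugin SDK. `run`
    defaults to subprocess.run and can be replaced to exercise the helper
    without a Docker daemon.
    """
    result = run(
        ["docker", "image", "inspect", image],
        stdout=subprocess.DEVNULL,  # the JSON description is not needed here
        stderr=subprocess.DEVNULL,  # hide the "No such image" message
        check=False,                # a non-zero exit status means "not found"
    )
    return result.returncode == 0
```

For example, image_exists("my_plugin") is expected to return True once the build step described in the next section has completed.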

Once your plugin is packaged and labeled, it can be published (‘uploaded’) to Hansken. See “Upload the plugin to Hansken” for instructions.

label_plugin

label_plugin is a utility to add labels to an extraction plugin image. To label a plugin, first build the plugin image with docker build, for example with one of the following commands (the second passes a proxy to the build as a build argument):

docker build . -t my_plugin
docker build . -t my_plugin --build-arg https_proxy=http://your_proxy:8080

Next, run the label_plugin utility to label the built plugin image:

label_plugin my_plugin

This utility briefly starts your plugin using Docker and requests the PluginInfo from it. The information from the PluginInfo is then added as labels to the plugin image. The result of label_plugin is a plugin image that can be published to a Docker/OCI image registry.
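
The build and label steps can be chained in a small script. The following is a minimal sketch, not an SDK feature: the function names build_command, label_command, parse_labels and package are hypothetical, and the script only wraps the docker build and label_plugin invocations shown above. To check the result it reads the image labels with docker inspect; the exact label keys that label_plugin writes are not assumed here.

```python
import json
import subprocess


def build_command(image, context=".", build_args=None):
    """Assemble the `docker build` argv, with optional --build-arg pairs."""
    cmd = ["docker", "build", context, "-t", image]
    for key, value in (build_args or {}).items():
        cmd += ["--build-arg", f"{key}={value}"]
    return cmd


def label_command(image):
    """Assemble the `label_plugin` argv for an already built image."""
    return ["label_plugin", image]


def parse_labels(inspect_output):
    """Parse the output of
    `docker inspect --format '{{json .Config.Labels}}' IMAGE`.

    An image without labels may be reported as `null`; that is mapped
    to an empty dict.
    """
    return json.loads(inspect_output) or {}


def package(image, build_args=None, run=subprocess.run):
    """Build the image, label it, and return its labels as a dict.

    `run` is injectable (an assumption of this sketch) so the flow can be
    exercised without Docker or the SDK installed.
    """
    run(build_command(image, build_args=build_args), check=True)
    run(label_command(image), check=True)
    result = run(
        ["docker", "inspect", "--format", "{{json .Config.Labels}}", image],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_labels(result.stdout)
```

Calling package("my_plugin", build_args={"https_proxy": "http://your_proxy:8080"}) would run the second build command above, then label_plugin my_plugin, and return the labels found on the image. Because every step uses check=True, a failing build stops the script before labeling is attempted.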

build_plugin

Warning

This method is deprecated! label_plugin is preferred over build_plugin, as it does not require a full (virtual) environment with all plugin dependencies and resources. This matters most when the plugin uses (big) data models or (external) dependencies.

The build_plugin script takes your plugin file and the directory containing your Dockerfile as input, builds the plugin image, and labels the generated image.

To build your plugin container image, you can use the following command:

build_plugin PLUGIN_FILE DOCKER_FILE_DIRECTORY [DOCKER_IMAGE_NAME] [DOCKER_ARGS]

For example:

build_plugin chatplugin.py . chatplugin --build-arg http_proxy="$http_proxy" --build-arg https_proxy="$https_proxy"

This will generate a plugin image:

  • The extraction plugin image is added to your local image registry (see docker images).

  • The image is tagged with two tags: latest, and your plugin version.

  • Note that the variables $http_proxy and $https_proxy are put in quotes; this is needed in case their values contain spaces.

Arguments:

  • PLUGIN_FILE: Path to the Python file of the plugin.

  • DOCKER_FILE_DIRECTORY: Path to the directory containing the Dockerfile of the plugin.

  • (Optional) DOCKER_IMAGE_NAME: Name of the Docker image without a tag. Note that Docker image names cannot start with a period or dash. If the name starts with a dash, it will be interpreted as an additional Docker argument (see DOCKER_ARGS). If no name is given, the name defaults to extraction-plugin/PLUGINID, e.g. extraction-plugin/nfi.nl/extract/chat/whatsapp.

  • (Optional) DOCKER_ARGS: Additional arguments for the docker command; any number of arguments may be given.
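
The rules for the image name and the need to quote values containing spaces can be illustrated with a short sketch. The helper build_plugin_command below is hypothetical and not part of the SDK; it only assembles a build_plugin command line in the argument order documented above, rejects image names that start with a period or dash, and uses shlex.join from the Python standard library (Python 3.8 or later) to quote arguments for a POSIX shell.

```python
import shlex


def build_plugin_command(plugin_file, dockerfile_dir, image_name=None, docker_args=()):
    """Return a shell-safe `build_plugin` command line as a string.

    Hypothetical helper for illustration only. Argument order follows
    PLUGIN_FILE DOCKER_FILE_DIRECTORY [DOCKER_IMAGE_NAME] [DOCKER_ARGS].
    """
    argv = ["build_plugin", plugin_file, dockerfile_dir]
    if image_name is not None:
        # A leading dash would be read as a Docker argument, and a leading
        # period is not a valid start of an image name.
        if image_name.startswith((".", "-")):
            raise ValueError(f"invalid Docker image name: {image_name!r}")
        argv.append(image_name)
    argv.extend(docker_args)
    # shlex.join quotes only the arguments that need it, e.g. values with spaces.
    return shlex.join(argv)
```

For instance, passing docker_args=["--build-arg", "http_proxy=http://proxy host:8080"] yields a command in which that value is wrapped in single quotes, so the shell hands it to build_plugin as one argument.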