Why !podman-py

This was supposed to be the sixth week of my libvirt project, but we had to make an unexpected U-turn, consider our options, and take an alternative path towards completing the project. I have decided to write a post on why that happened, and more.

podman-py is a Python library that provides a Pythonic way to perform container operations on Podman. In other words, it exposes native Python interfaces that wrap the official Podman REST API internally, giving access to operations such as building images, running containers, and many of the other operations listed by podman --help on the CLI.

The original goal of the internship was to add a podman-py wrapper module to lcitool, a tool used internally by libvirt developers to prepare and manage virtualized test environments, as well as to generate templates for containerized test environments. This wrapper module would provide APIs such as build and run, which would be used to build container images and run workloads in containers.

The first month of the internship was mostly research. We (my mentor and I) were trying to understand the podman-py module and test some of the functionality it provides. The aim was to get an idea of the library's state of development, and of how feasible it was to achieve our desired solution by wrapping the module.

After weeks of research, we reached the conclusion that the library is not production-ready, at least for libvirt's use case, and decided to wrap the podman CLI commands directly instead.

Below are some of the reasons:

Real-time logging: Real-time logging is simply the ability to see the current state of a running process as it happens. The podman-py library doesn't expose real-time logs of the underlying processes, with the exception of the "pull" method. Endpoints like build, create, and run currently have no way to provide real-time logs. Although this isn't a major blocker, it is a hassle.

For example, let's say there is a need to build an image that is about 4 GB, or to run workloads that take about 10 to 20 minutes. Using the podman-py library, there is currently no way to know what's going on during these processes, i.e. whether the process is simply taking a long time to finish or has hung somewhere.

This is one of the sore points, and it actually happened to me when I spun up a virtual machine (VM) to test some cases using podman-py. I had to build an Ubuntu image prepared with the requirements for the libvirt-python module. The process ran for well over 10 minutes, and I was left wondering what was happening because I couldn't get any logs to figure it out. I ended up cancelling the execution (only to find out that the process had encountered an error and hung), and had to finish the build manually. This was a very frustrating experience.

The lack of real-time output from podman-py's background processes could potentially be resolved by, e.g., creating an asynchronous event loop that allows clients to register for individual log events. However, this would take a couple of months, since it isn't a small project: one would need to understand the architecture of the podman-py library and the workings of the podman service REST API.
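
By comparison, real-time output comes almost for free when wrapping the CLI, because podman writes its progress to its output streams as it goes. The sketch below is only an illustration under that assumption, not the code that went into lcitool: the stream_command helper and its sink parameter are names made up for this example. It uses Python's subprocess module to start a command and hand over each line of output as soon as it appears.

```python
import subprocess


def stream_command(argv, sink=print):
    """Run argv and pass each line of output to sink as soon as it is produced.

    Hypothetical helper for illustration; returns the command's exit code.
    """
    proc = subprocess.Popen(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge errors into the same stream
        text=True,
        bufsize=1,  # line-buffered on our side of the pipe
    )
    with proc.stdout:
        # Iterating over the pipe yields lines as the child writes them,
        # instead of waiting for the whole process to finish.
        for line in proc.stdout:
            sink(line.rstrip("\n"))
    return proc.wait()
```

Calling stream_command(["podman", "build", "--tag", "ci-ubuntu", "."]) would then print each build step as it happens, which should make a hung or failed build visible much sooner. The image name here is just a placeholder.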

Access to an interactive shell: With the ability to run workloads in a container, there should also be the ability to access an interactive shell in order to inspect processes and perform custom actions. The podman-py library does not provide the necessary endpoints to get access to a shell. Since our use case will definitely involve going into a shell to perform operations, this was another reason to abandon the idea of using podman-py.
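
With the CLI, an interactive shell is mostly a matter of letting podman exec talk to the terminal directly. Here is a minimal sketch, assuming the container is already running; shell_cmd and open_shell are hypothetical helper names, not podman-py or lcitool APIs.

```python
import subprocess
import sys


def shell_cmd(container, shell="/bin/sh", tty=None, binary="podman"):
    """Build the argv for an interactive shell inside a running container."""
    if tty is None:
        # Only ask for a pseudo-terminal when we are attached to one ourselves.
        tty = sys.stdin.isatty()
    cmd = [binary, "exec", "--interactive"]
    if tty:
        cmd.append("--tty")
    cmd += [container, shell]
    return cmd


def open_shell(container, shell="/bin/sh"):
    """Hand the current terminal over to a shell in the container."""
    # stdin, stdout and stderr are not redirected, so the child process
    # inherits them and the user can type into the shell directly.
    return subprocess.run(shell_cmd(container, shell)).returncode
```

Keeping the argv construction separate from the call that runs it makes the wrapper easy to check without a container runtime installed.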

Responsiveness: Unfortunately, the responsiveness of the project team wasn't where we'd hoped it would be. Given the short time frame of the GSoC project I'm working on, the project would eventually have ended in complete failure had my mentor and I not decided to take an alternative approach.

Errors in the documentation: The documentation is almost a carbon copy of the docker-py library's. docker-py is a more seasoned project (i.e. much of the functionality has been implemented), so its documentation corresponds to what has been implemented. The podman-py docs, being a copy of docker-py's, describe a lot of functionality that hasn't been implemented, which makes the documentation hard to trust and follow.

For example, in the method used for creating containers, the documentation claims that there is a volume argument that takes either a path or a named volume as the key in a dictionary. In the actual source code, only the named volume works.

Because of the documentation, I kept trying to use a directory path and kept getting errors (it was a very frustrating experience).
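
For comparison, the podman CLI's --volume option accepts either a host directory or a named volume as the source, in the form SOURCE:TARGET. Below is a small sketch of how a wrapper might turn a dictionary into those flags. The volume_flags and run_cmd helpers, the image name, and the paths are all made up for illustration.

```python
def volume_flags(mounts):
    """Translate {source: target} into podman --volume flags.

    source may be a host directory or a named volume; target is the
    path inside the container.
    """
    flags = []
    for source, target in mounts.items():
        flags += ["--volume", f"{source}:{target}"]
    return flags


def run_cmd(image, command, mounts=None, binary="podman"):
    """Build the argv for a throwaway container with optional mounts."""
    return [binary, "run", "--rm", *volume_flags(mounts or {}), image, *command]


# A host directory and a named volume are handled the same way.
example = run_cmd(
    "ci-ubuntu",
    ["make", "check"],
    mounts={"/home/user/src": "/src", "ccache": "/root/.ccache"},
)
```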

Although the library has a lot going for it and is on the way up, there are still a few issues that make it unsuitable for our use case (right now).

If you are reading this article, please consider working on the podman-py library as a developer or as a technical writer.

Bazzan's Blog


Decision to wrap podman CLI utility instead of using the podman-py module