
tl;dr
Ubuntu 22.04 LTS was released on April 21st, 2022, and it just so happened that I had a few machine upgrades to do a few days later. So I thought I’d jump on board early and figure out what problems other members of the team will encounter in the near future.
My primary job responsibility has me integrating software solutions developed between four different research institutions into a single deliverable for our project sponsor. As this is primarily a machine learning/CV-based project, Linux is up front and center and Ubuntu is the near universal choice among our researchers. The project may run for up to 4 years so it pays to get on the latest LTS release early in the project to help ensure stability down the road. It was with this goal in mind that I upgraded my workstations to 22.04, hoping to provide some guidance to the rest of the team about what they are likely to run into.
I have been largely impressed so far. Particularly notable is the almost intangible yet material improvement in UI aesthetics and responsiveness. But the trials have not been without issue. NVIDIA’s decision to revert to Xorg at the last minute was a little disappointing. After a little experimentation on an NVIDIA 2080 Ti, however, Xorg does seem to have something of an edge at this point. I expect that some of the issues I have run into may in fact be due to this last-minute change. There have also been a surprising number of breaking changes that have gone mostly unpublicized thus far, though I’m a little earlier in the uptake cycle than I have been for past releases. Commenting on these issues responsibly would require a depth of analysis that I have yet to conduct, so I’ll refrain from doing so until I can accurately determine on which side of the keyboard each of these issues truly resides. More is certain to follow in this regard.
From the machine learning perspective, though, the biggest issue that I’ve run into is that Ubuntu 22.04 ships with Python 3.10.4 as the default. Normally this is something that I would wholeheartedly welcome, if not marvel at, given that Python 3.10.4 was released just a month before the latest Ubuntu offering. The issue for us here is our reliance on PyTorch for this project, and the fact that PyTorch does not yet support Python 3.10.
Unfortunately there is no obviously simple (and easily maintainable) way to install Python 3.8 from the apt repositories. I am loath to go outside the standard repositories for something so core to our operation unless forced, and even the successful use of a PPA or backport more often than not leads to bigger issues down the road. Anyone who has ever messed with the default Python install on Ubuntu is undoubtedly aware of what dragons that way there be. Fortunately there are a few options available to address the issue, each with different pros and cons.
The first approach is to download Python 3.8 directly (or build from source for the sadists among us) and change PATH and/or other environment variables at runtime to use the correct version. To me this tends to be a little cumbersome and error-prone, but it is really dependent on the control you have over your development environments and the composition of your team, so YMMV. For me it is the method of last resort.
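For illustration only, here is a minimal sketch of what that PATH switching might look like. The install prefix ~/opt/python-3.8 and the function name use_python38 are hypothetical; substitute wherever your downloaded or self-built interpreter actually lives.

```shell
# Hypothetical install prefix for a separately downloaded or built Python 3.8;
# adjust to wherever the interpreter actually lives.
PY38_PREFIX="${PY38_PREFIX:-$HOME/opt/python-3.8}"

# Prepend the 3.8 bin directory to PATH for the current shell only.
# Refuses to touch PATH if no executable python3.8 is found there.
use_python38() {
    if [ ! -x "$PY38_PREFIX/bin/python3.8" ]; then
        echo "no python3.8 found under $PY38_PREFIX" >&2
        return 1
    fi
    PATH="$PY38_PREFIX/bin:$PATH"
    export PATH
}

# Usage: source this file, then run
#   use_python38 && python3.8 --version
```

Because the change only lives in the current shell, every script, cron job, and IDE launcher has to remember to do the same thing, which is where the cumbersome and error-prone part tends to come from.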
If you are using Anaconda as your virtual environment tooling and that system integrates with your development processes, then you can just create a Python 3.8-based environment and be good to go. We have utilized Anaconda as our environment manager in past projects, and there is a lot going for it in terms of control, but it does tend to be a little on the girthy side and can be a downright PITA when trying to deploy on containers, so for me this too was not the optimal solution. If Conda works in your environment, however, read no further as that’s all you really need.
If, like me, you are Conda averse, you probably prefer to keep the tooling as close to Python/OS core as possible and rely on Python’s built-ins for environment management. This has given us the most dependable results for distributing code that can build its own runtime environment on the fly without much need for manual intervention. This tends to be more reproducible on remotely located and controlled development environments in my experience, and also deploys to container images easily without a lot of environment workarounds, all while keeping image sizes to a minimum. The challenge with Python’s built-in environment management is the inability to create environments for other Python versions without first having those Python versions installed outside of the environment tooling ecosystem, which as previously stated is not possible out-of-the-box with 22.04. So is there a happy, or even palatable, medium?
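To make that limitation concrete: venv builds an environment for whichever interpreter runs it, so the first question is whether the interpreter you want is on the machine at all. A small sketch of such a check follows; the function name have_python is made up for this example.

```shell
# Report whether an interpreter named python<version> is on PATH.
# Prints its location and returns 0 if found, returns non-zero otherwise.
have_python() {
    command -v "python$1"
}

# Example: decide whether a workaround is needed before calling venv.
#   if ! have_python 3.8 >/dev/null; then
#       echo "python3.8 is not installed; venv cannot target it directly" >&2
#   fi
```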
My Approach
The method I have employed to tide us over until PyTorch achieves Python 3.10 support is a bit of a hybrid. Call it a best-of-both-worlds approach, or worst-of-both-worlds depending on your view of water glass capacity measurements. The basic idea is to leverage Anaconda to create a Python 3.8 environment, and from that environment we can create Python 3.8 venv environment(s). The main benefit to this approach is that the Conda environment is completely disposable once it is no longer needed and it doesn’t require root privileges or core OS package management changes. Once we have created the necessary venv environment, Conda is no longer required in the loop, allowing us to separate it from our distribution with the expectation that, eventually, the workaround itself will not be needed as we can migrate directly to Python 3.10.
A Few Simple Steps
Step 1: Download and install Anaconda
bash ~/Downloads/Anaconda3-2021.11-Linux-x86_64.sh
When prompted, accept the license (after reading it of course) and allow it to initialize the environment. Note that the installation ostensibly works with sh instead of bash, but it will likely throw a few errors due to syntax incompatibilities. While these errors are, at the time of this writing, inconsequential for our purposes, I would highly recommend using bash to install it if you intend to use Conda beyond the purposes of setting up our 3.8 venv environment.
Step 2: Create a Python 3.8 Conda environment
Note: You may have to restart your shell after installation to make the conda command available
conda create --name py38 python=3.8
From this environment we will be able to create a 3.8 venv environment that is isolated from any Conda dependency.
Step 3: Activate Conda environment and use it to create venv Python 3.8 environment
conda activate py38
python -m venv --copies ~/py38venv
Here ~/py38venv is just the path to our target Python 3.8 venv environment; substitute as you see fit. It may be worthwhile keeping an environment designated just for the creation of other environments so that you don’t have to reintroduce Conda to the process in the future.
Edit: I originally forgot to include the --copies option here, which is OK if you don’t plan to get rid of Conda once your environment is set up. The default behavior of the venv command on Linux, however, is to symlink the version of Python that was used to create the environment. This would leave the environment unusable after deleting Anaconda. Using the --copies argument isolates the two.
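If you are unsure whether an existing environment was created with copies or symlinks, one way to check is to resolve the interpreter path and see whether it still lands inside the venv directory. This is a sketch only: the function name venv_python_is_local is invented for this post, and it assumes GNU readlink as shipped with Ubuntu.

```shell
# Succeeds (returns 0) if <venv>/bin/python resolves to a file inside the
# venv itself, returns 1 if it resolves somewhere outside (for example a
# symlink into the Conda install), and 2 if the path cannot be resolved.
venv_python_is_local() {
    vpl_dir="$(readlink -f "$1")" || return 2
    vpl_resolved="$(readlink -f "$vpl_dir/bin/python")" || return 2
    [ -e "$vpl_resolved" ] || return 2
    case "$vpl_resolved" in
        "$vpl_dir"/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage:
#   venv_python_is_local ~/py38venv && echo "interpreter lives inside the venv"
```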
Step 4: Deactivate Conda environments and activate venv environment
conda deactivate
conda deactivate
source ~/py38venv/bin/activate
Note: The duplicate deactivate commands are not a typo. The first takes you out of the Conda 3.8 environment, and the second takes you out of Conda altogether.
And there you have it, a venv-based Python 3.8 environment. From here you can disable/delete Anaconda and install packages in your venv environment as you see fit.
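Before deleting Anaconda for good, a quick sanity check may be worthwhile. The venv documentation describes a home key in pyvenv.cfg that points at the Python installation the command was run from, and a venv does not carry its own copy of the standard library, so it seems prudent to see what the environment still references. The helper below is a sketch; the name venv_home is made up for this example.

```shell
# Print the "home" directory recorded in a venv's pyvenv.cfg, i.e. the
# directory of the interpreter that was used to create the environment.
venv_home() {
    sed -n 's/^home[[:space:]]*=[[:space:]]*//p' "$1/pyvenv.cfg"
}

# Usage:
#   venv_home ~/py38venv
```

If the printed path sits inside the Anaconda install, one cautious option is to move that directory aside temporarily and confirm that ~/py38venv/bin/python still starts and can import your packages before removing anything permanently.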
Granted, this is a somewhat convoluted methodology to get a Python 3.8 venv environment installed on Ubuntu 22.04, but it’s the mechanism with the smallest permanent footprint for our purposes and will serve well as a workaround until PyTorch updates finally land.
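For anyone repeating this on several machines, steps 2 and 3 could be wrapped in a single function along these lines. This is a sketch rather than the exact procedure above: it swaps the interactive conda activate for conda run so that it also works from a script, and the function name make_py38_venv and its default arguments are illustrative. It assumes Anaconda is already installed and conda is on PATH.

```shell
# Create a Conda environment with Python 3.8 and use its interpreter to
# build a venv. Both defaults are illustrative; pass your own names.
#   $1: Conda environment name (default: py38)
#   $2: target venv directory  (default: ~/py38venv)
make_py38_venv() {
    mpv_env="${1:-py38}"
    mpv_target="${2:-$HOME/py38venv}"

    # Step 2: create the throwaway Conda environment (--yes skips the prompt).
    conda create --yes --name "$mpv_env" python=3.8 || return 1

    # Step 3: run venv with that environment's interpreter. `conda run`
    # avoids having to activate the environment in the current shell.
    conda run --name "$mpv_env" python -m venv --copies "$mpv_target" || return 1

    echo "created $mpv_target; activate with: . $mpv_target/bin/activate"
}

# Usage:
#   make_py38_venv py38 ~/py38venv
```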
Wrapping It Up
By all accounts Ubuntu’s 22.04 LTS release has been met largely with praise, and I join in this sentiment. As with any new release, however, there are challenges. In this post I provide a mechanism for installing venv-based Python 3.8 environments on this new release without having to contaminate any operating system components or be bound to using Anaconda as your environment manager.
It is worth stating for the record here that while I am clearly eschewing the Anaconda route for our deployment, this shouldn’t be taken as any sort of contempt for the tool. Overall I think Anaconda is an amazing set of tooling, but one that has a few shortcomings with respect to my particular use case. For standalone projects or projects where I am the sole developer, I am more than happy to use Anaconda and its more advanced capabilities and more expansive repository options. It just isn’t the right tool for me in this instance.
-Kip