Python Poetry mono repo without limitations

Gerben Oostra
Jan 9, 2023

Poetry is great at managing dependencies in the modern pyproject.toml, but it doesn’t provide mono repo features (at least not yet, see issues 2270 and 936). Fortunately, adapting poetry to manage a mono repo is quite easy. I’ve made an example available here.

Update 2024: Creating artifacts with named dependencies is now possible with the Poetry Plugin poetry-plugin-mono-repo-deps.

A mono repo is a single (git) repository containing multiple related components. Think about all the packages and micro-services belonging to a single bounded context. For example, suppose you developed a mathematical library with a few thin AWS Lambda wrappers to serve different REST endpoints. You also extracted some AWS utility code to a separate library. As deployment and development of these are related, putting them in a mono repo can make life easier. Note that this doesn’t mean you’re building a monolith. It’s still helpful to adhere to S.O.L.I.D. principles and the clean architecture.

Development is easier because one can modify one of the core libraries and verify its behavior in the entire application without having to release the new version and subsequently update the dependency reference to it everywhere.

Deployment is easier because a new version of the core libraries typically requires an update of the full application. Having the full service architecture in one repo allows everything to be integrated and deployed together. This makes both deployments and reverts easier.

I’ll demonstrate how to use poetry to manage the dependencies of each package in the mono repo. Specifically, I’ll show you how to:

  1. Develop and run code locally using local development (aka `--dev`) dependencies.
  2. Release all packages to a PyPI repo, such that they can be pip installed & reused by other projects.
  3. Build a Docker image with all packages within the CI/CD pipeline (locally, before publishing the artifacts to a PyPI repo).

Releasing packages vs. local development.

For local development and installation, we want to depend on the packages in the mono repo using path dependencies. However, when we build the Python wheels, they should depend on the name and the version instead. If a wheel contained the path dependency, installing it would require the dependency to live at the same relative path.

A short note on the version of the dependency. If we release two mono repo components that depend on each other, their versions should match. Therefore, if we release the mono repo as version 1.2.3, the internal dependencies need to become ^1.2.3 (any compatible release: at least 1.2.3, but below 2.0.0). Note that in Python’s PEP 440 version specifiers this would be ~=1.2,>=1.2.3 (which is what poetry generates when building a wheel with a ^1.2.3 dependency).
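To make the translation concrete, here is a minimal Python sketch that derives the PEP 440 range from a release version. The function name `caret_to_pep440` is hypothetical, and the sketch assumes a plain MAJOR.MINOR.PATCH version with a major version of at least 1 (the caret operator behaves differently for 0.x versions).

```python
import re


def caret_to_pep440(version: str) -> str:
    """Translate a Poetry caret constraint on `version` into a PEP 440 specifier.

    Hypothetical helper: assumes MAJOR.MINOR.PATCH with MAJOR >= 1.
    """
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"expected MAJOR.MINOR.PATCH, got {version!r}")
    major, minor, _patch = match.groups()
    if major == "0":
        raise ValueError("caret semantics differ for 0.x versions")
    # ~=1.2 allows any 1.x release from 1.2 onwards; >=1.2.3 raises the lower bound.
    return f"~={major}.{minor},>={version}"


print(caret_to_pep440("1.2.3"))  # ~=1.2,>=1.2.3
```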

Solving path dependencies in released wheels with a Poetry plugin.

(Added in 2024). Instead of using the script-based approaches described below, the behavior of Poetry can be modified using a plugin. The poetry-plugin-mono-repo-deps plugin produces the same result as the next two approaches.

Solving path dependencies in released wheels with just-in-time modification.

We keep the path dependency committed to git. Before building the wheel, we modify the path dependency in pyproject.toml to a name+version dependency, without updating the lockfile. A poetry build of the wheel will use the dependency definition as it exists in the pyproject.toml, not as it might be defined in the lock file. You will get a warning on the lock-file not matching the project definition, but as that is on purpose, you can ignore the warning.

I’ve provided a bash build script that does this. The core part is to replace the path dependencies with the version of the current project (all packages in the mono repo have the same version):

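# BSD sed (macOS) expects a separate, empty backup suffix after -i; GNU sed does not
if [ "$(uname)" = "Darwin" ]; then export SEP=" "; else SEP=""; fi
# `poetry version` prints "<name> <version>"; keep only the version
VERSION=$(poetry version | awk '{print $2}')
# replace every inline-table path dependency with a caret constraint on the repo version
sed -i$SEP'' "s|{.*path.*|\"^$VERSION\"|" pyproject.toml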
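For readers who prefer not to depend on sed dialects, the same substitution can be sketched in Python. This is an illustration only, not the script from the repo: `pin_path_dependencies` is a hypothetical name, and it operates on the file contents as a string. Like the sed expression, it replaces any inline table that contains the word "path", so it assumes your other dependencies don't use inline tables mentioning that word.

```python
import re

# Same pattern as the sed expression: from the first "{" to the end of the line,
# as long as "path" occurs somewhere after the brace.
PATH_DEP = re.compile(r"\{.*path.*")


def pin_path_dependencies(pyproject_text: str, version: str) -> str:
    """Replace inline-table path dependencies with a caret constraint on `version`."""
    replacement = f'"^{version}"'
    # "." does not match newlines, so each match stays within a single line.
    return PATH_DEP.sub(lambda _match: replacement, pyproject_text)


example = (
    'package-a = {path = "../package-a", develop = true}\n'
    'requests = "^2.28"\n'
)
print(pin_path_dependencies(example, "1.2.3"))
```

The first line becomes `package-a = "^1.2.3"`, while the `requests` line is left untouched.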

Solving path dependencies in released wheels with post-build modifications.

An alternative solution is to replace the path dependencies in the built wheel & tar.gz artifacts. The advantage is that you don’t have to modify the poetry files and don’t risk committing those changes to git. The disadvantage is that the replacement is more complicated, and we must create the PEP 440 compliant version range ourselves.

I’ve provided a bash script that updates the tar.gz and wheel files. It first determines the minor version:

VERSION=$(poetry version | awk '{print $2}')
VERSION_MINOR=$(echo $VERSION | sed -E "s/^([0-9]*\.[0-9]*).*/\1/")

It then replaces the path dependencies in the extracted tar.gz file with a PEP 440 compliant named version range:

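# assumes the current directory holds only the extracted sdist folder
FOLDER=$(ls)
sed -i$SEP'' "s|^Requires-Dist: \(.*\) @ \.\./.*|Requires-Dist: \1 (~=$VERSION_MINOR,>=$VERSION)|" "$FOLDER/PKG-INFO"
# the "-" goes last in the bracket expression so it is matched as a literal hyphen
sed -i$SEP'' "s| @ \.\.[a-zA-Z_/-]*|~=$VERSION_MINOR,>=$VERSION|" "$FOLDER/setup.py"
sed -i$SEP'' "s|{.*path.*\.\..*|\"~$VERSION\"|" "$FOLDER/pyproject.toml"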

Similarly, it also updates the path dependencies in the wheel file:

FOLDER=$(ls -d *.dist-info)
sed -i$SEP'' "s|^Requires-Dist: \(.*\) @ \.\./.*|Requires-Dist: \1 (~=$VERSION_MINOR,>=$VERSION)|" "$FOLDER/METADATA"
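The METADATA rewrite is easy to check in isolation. Below is a minimal Python sketch of the same replacement, working on the file contents as a string. The function name `rewrite_requires_dist` is hypothetical, and it assumes a MAJOR.MINOR.PATCH version and path dependencies written as `name @ ../folder`, as in the sed expressions above.

```python
import re

# A Requires-Dist line whose requirement is a relative path reference.
REQUIRES_PATH = re.compile(r"^Requires-Dist: (.*) @ \.\./.*$", re.MULTILINE)


def rewrite_requires_dist(metadata: str, version: str) -> str:
    """Swap path-based Requires-Dist entries for a named PEP 440 version range."""
    major, minor = version.split(".")[:2]
    spec = f"~={major}.{minor},>={version}"
    return REQUIRES_PATH.sub(
        lambda match: f"Requires-Dist: {match.group(1)} ({spec})", metadata
    )


metadata = (
    "Name: package-b\n"
    "Version: 1.2.3\n"
    "Requires-Dist: package-a @ ../package-a\n"
    "Requires-Dist: requests (>=2.28,<3.0)\n"
)
print(rewrite_requires_dist(metadata, "1.2.3"))
```

Only the `package-a` line changes, to `Requires-Dist: package-a (~=1.2,>=1.2.3)`; regular named dependencies are left as they are.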

Building docker files

To build Docker images with the exact versions of your local checkout, we will:

  • Build the wheel files, with name+version instead of path dependencies, as described above.
  • Create a requirements.txt that represents the poetry lock file’s contents, and add the service itself to it.
  • Build the Docker image in two stages, so the final image doesn’t contain all the wheels.

The build stage of the Dockerfile (provided here) then looks as follows:

FROM public.ecr.aws/lambda/python:3.8 AS builder
WORKDIR ${LAMBDA_TASK_ROOT}
# all the packages of the monorepo we could depend on:
COPY dist/*.whl /tmp/
# the specific versions service-a depends on, based on the lockfile:
COPY info/requirements.txt /tmp/
# we append ourselves to the requirements.txt, to also install service-a
# hadolint ignore=SC2086
RUN echo "service-a" >> /tmp/requirements.txt && \
python3 -m pip install --pre --no-cache-dir --find-links=/tmp/ -r /tmp/requirements.txt -t "${LAMBDA_TASK_ROOT}"

This:

  • Uses --pre to allow any (local) version of our mono repo dependencies.
  • Uses --find-links to specify the directory with our wheels.
  • Uses the requirements.txt generated from service-a’s lock file. That file lists only its dependencies, so we add service-a itself to get it installed as well. Note that service-a’s wheel also contains version ranges. However, we use the pinned versions from the lock file to make the build deterministic.

Some caveats

Updating transitive lock files.

If you add a dependency to one of the packages, you also want to update the lock files of the packages in the mono repo that depend on it. For example, if package B depends on A, and A gets an additional dependency, we should update both lock files. Unfortunately, poetry will only update the lock file of A.

Thus we need to update all lock files ourselves. In other words, if B depends on A, and A gets a new dependency, running a poetry lock --no-update on B will not be sufficient, as that doesn’t inspect transitive dependencies. Ideally, poetry would make an exception for local path dependencies, but unfortunately, it doesn’t. Thus we need to run poetry lock or poetry update, which as a side effect, also updates all other transitive dependencies. To make this easy, we can introduce a short shell script that runs poetry lock on all packages in the project, which you can find here.
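To decide which lock files need a refresh after a change, you can walk the internal dependency graph. The following Python sketch is an illustration under assumptions: the function name `packages_to_relock` and the example graph (service-c depends on package-b, which depends on package-a) are hypothetical and not taken from the scripts in the repo.

```python
def packages_to_relock(changed: str, depends_on: dict[str, set[str]]) -> set[str]:
    """Return `changed` plus every package that depends on it, directly or transitively.

    `depends_on` maps each package to the mono repo packages it depends on.
    """
    to_relock = {changed}
    grew = True
    while grew:
        grew = False
        for package, deps in depends_on.items():
            # A package needs a new lock file as soon as one of its
            # internal dependencies is already in the set.
            if package not in to_relock and deps & to_relock:
                to_relock.add(package)
                grew = True
    return to_relock


# Assumed example graph, for illustration only.
graph = {
    "package-a": set(),
    "package-b": {"package-a"},
    "service-c": {"package-b"},
}
print(sorted(packages_to_relock("package-a", graph)))
```

With this graph, a change in package-a marks all three packages, while a change in service-c only marks service-c itself.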

Sharing development utilities

Some development utilities, like dunamai, commitizen, and safety, are repo-wide dependencies. I add these to the dev group of the root folder’s pyproject.toml. The production dependencies are kept empty.

Each package in the mono repo declares all its own dependencies, both production and development. Your pytest dev dependency is thus added to each package, which allows you to run poetry run pytest in those packages. It also allows your IDE to use the versions used in your package.

Repository wide production group dependencies.

While I do add development dependencies to the mono repo’s root pyproject.toml, I don’t add any production dependencies.

There are two ways you could do this, but I personally don’t recommend them:

  • Have all repo packages depend on the root project, and place repo-wide dependencies there. This could simplify updating the versions of dependencies. My preference, however, is to keep dependencies explicit and limited. If a package requires a dependency, add it as a dependency to that package. That also limits the package’s dependencies to only those it uses.
  • Have the root project depend on all packages. This could simplify working & developing in a virtual environment, as you now have an environment that includes all modules. However, it introduces a risk of missing dependencies. Suppose a package relies on another package, but doesn’t include it in its dependency list. You wouldn’t discover that, because you’re working in an environment that includes both packages.

Keeping dependency versions synchronized

Different packages could (and should) list the dependencies they use in their own pyproject.toml file. The version constraints for the same dependency can differ between packages. When one package depends on another, Poetry has to resolve a production dependency version that satisfies both sets of constraints. Thus if the constraints conflict, you’ll know right away.

Development dependencies are different. If each package in the mono repo has a pytest dependency, the versions might differ between the packages (because a project doesn’t depend on its dependency’s dev dependencies). The different versions can be intentional but are not preferred.

For both cases, I’d recommend using a dependency updater like renovate or dependabot to update dependencies automatically across the whole mono repo.
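A quick way to spot drift in development dependencies is to compare the constraints each package declares. The sketch below is a hypothetical helper, not part of the repo. It assumes you have already read each package’s dev dependencies into a dictionary, and the package names and constraints in the example are made up.

```python
from collections import defaultdict


def find_version_drift(dev_deps: dict[str, dict[str, str]]) -> dict[str, dict[str, str]]:
    """Return the dependencies whose constraint differs between packages.

    `dev_deps` maps package name -> {dependency name: version constraint}.
    The result maps dependency name -> {package name: version constraint}.
    """
    by_dependency: dict[str, dict[str, str]] = defaultdict(dict)
    for package, deps in dev_deps.items():
        for name, constraint in deps.items():
            by_dependency[name][package] = constraint
    # Keep only dependencies that are declared with more than one distinct constraint.
    return {
        name: per_package
        for name, per_package in by_dependency.items()
        if len(set(per_package.values())) > 1
    }


# Made-up constraints, for illustration only.
declared = {
    "package-a": {"pytest": "^7.2", "mypy": "^1.0"},
    "package-b": {"pytest": "^7.1", "mypy": "^1.0"},
}
print(find_version_drift(declared))
```

Here only pytest is reported, because mypy is declared identically in both packages.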

Versioning your mono repo

If you version your packages, it’s best to use good version numbers. Instead of relying on some emotional way of choosing whether a release is major, minor or a fix (aka sentimental versioning), I’d recommend using semantic versioning.

The great thing is that if you use conventional commit messages, one can automatically bump versions (and create tags) with commitizen.

If you build wheels in your CI/CD pipeline, they will be versioned as your latest released version, even though they include new commits. To prevent this, I use dunamai to create a local version (as a valid semantic version).

When the version is bumped, this shell script updates all occurrences of the version across the mono repo. The downside is that it needs to know the files to edit, like the pyproject.toml files and the __version__ identifiers in Python files. There is a promising dunamai poetry plugin that could potentially simplify this process.

Bringing it all together

Combining all the above results in https://gitlab.com/gerbenoostra/poetry-monorepo/, which has the following files:

.
├── .gitlab-ci.yml
├── .python-version
├── CHANGELOG.md
├── README.md
├── VERSION
├── package-a
│ ├── README.md
│ ├── VERSION
│ ├── package_a
│ │ └── __init__.py
│ ├── poetry.lock
│ ├── poetry.toml
│ └── pyproject.toml
├── package-b
│ ├── README.md
│ ├── VERSION
│ ├── package_b
│ │ └── __init__.py
│ ├── poetry.lock
│ ├── poetry.toml
│ ├── pyproject.toml
│ └── requirements.txt
├── poetry.lock
├── poetry.toml
├── pyproject.toml
├── scripts
│ ├── create_local_version.sh
│ ├── poetry_build.sh
│ ├── poetry_install.sh
│ ├── poetry_update.sh
│ ├── projects.sh
│ ├── replace_path_deps.sh
│ └── run_on_each.sh
└── service-c
  ├── Dockerfile
  ├── README.md
  ├── VERSION
  ├── app.py
  ├── poetry.lock
  ├── poetry.toml
  ├── pyproject.toml
  └── service_c
    └── __init__.py

To create all virtual environments, just run poetry_install.sh.

If you want to run poetry update on all projects, just run poetry_update.sh.

When bumping versions, commitizen is used to determine the correct version and update the versions across the repo. It also updates the repo-wide changelog.

You can use poetry_build.sh to build all the wheels. However, note that this changes pyproject.toml files, which you don’t want to commit.

When building wheels in a CI/CD pipeline, the pipeline first creates a local version (and updates all version references across the repo) using create_local_version.sh, and then builds the wheels with poetry_build.sh. But, again, this touches many files you don’t want to commit.

If you’ve changed a dependency of, say, package-a, you’ll need to update package-b and also service-c. In that case, just run poetry_update.sh, which does exactly this (in the correct order).

The .gitlab-ci.yml file shows how to use these tools in a CI/CD environment. As it mainly utilizes shell scripts, it’s easy to switch to any CI/CD platform.

Adapting to your repo

When using the scripts, you’ll need to update projects.sh to list all the poetry packages in the mono repo.

The list should be in topological order: it should list a package’s dependencies before the depending package.
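If the dependency graph grows beyond a handful of packages, the order can be computed rather than maintained by hand. Here is a minimal sketch using Python’s standard-library graphlib (Python 3.9+). The function name `lock_order` and the example graph are assumptions for illustration; projects.sh itself simply contains a hand-written list.

```python
from graphlib import TopologicalSorter


def lock_order(depends_on: dict[str, set[str]]) -> list[str]:
    """Order packages so that each one comes after the packages it depends on.

    `depends_on` maps each package to the mono repo packages it depends on.
    """
    # TopologicalSorter expects {node: predecessors}; static_order() yields
    # predecessors first and raises graphlib.CycleError on circular dependencies.
    return list(TopologicalSorter(depends_on).static_order())


# Assumed example graph, for illustration only.
graph = {
    "service-c": {"package-b"},
    "package-b": {"package-a"},
    "package-a": set(),
}
print(lock_order(graph))  # ['package-a', 'package-b', 'service-c']
```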

Also, note that both create_local_version.sh and the [tool.commitizen] section in the root folder’s pyproject.toml contain a list of files where the repo version is specified.

Further work

In the spirit of dunamai’s dynamic versioning plugin, perhaps I (or someone else) could implement the dependency updates as a poetry plugin. At its core, it just replaces path dependencies with a name and the (correct) version.

(2024 Update). This is exactly what I finally did with the poetry-plugin-mono-repo-deps Poetry plugin.
