Rethinking requirements.txt
What is requirements.txt?
This should be familiar to most Python programmers, but here’s a brief summary
anyway: a requirements file contains a list of dependencies for an application
(not a library), often with specific version information. Many requirements
files are generated with commands like pip freeze > requirements.txt [1].
Here’s an example:
$ cat requirements.txt
blessings==1.6
bpython==0.14.2
curtsies==0.1.19
flake8==2.4.1
greenlet==0.4.7
jedi==0.9.0
mccabe==0.3.1
msgpack-python==0.4.6
neovim==0.0.38
pep257==0.6.0
pep8==1.5.7
pyflakes==0.8.1
Pygments==2.0.2
requests==2.7.0
six==1.9.0
Most importantly, requirements files serve several purposes:
- They ensure that our environments are consistent across different machines. Reproducible environments are crucial to preventing bugs, incompatibilities, and other breakage introduced by changes between versions of libraries.
- They communicate to our fellow developers what the code we’ve written relies on. Given the requirements file of a project, we can generally guess the kinds of things it’s going to do before we read one line of code. For example, requests suggests that an application is going to communicate with HTTP servers, and msgpack-python tells us that we’ll probably be using msgpack as an interchange format.
- They allow for neat things like automated version update auditing from requires.io and (maybe one day) security alerts from Is it vulnerable?.
The Problem
Refer again to the file above. Where did all of those things come from? Here’s the command that created that environment:
pip install bpython flake8 jedi neovim pep257
Notice any differences? We asked for five packages and got fifteen back. This is actually exactly what we want for purposes #1 and #3. For both of those use cases, we want to know exactly what library versions our application is being deployed against. Our update notifications and security alerts are only any good when the auditing services are checking the versions running on our servers. However, it does very little to address point #2: relaying information from one person to another.
Here’s how that pip install
resulted in that list of requirements:
$ pipdeptree
bpython==0.14.2
  - Pygments [installed: 2.0.2]
  - requests [installed: 2.7.0]
  - curtsies [required: >=0.1.18, installed: 0.1.19]
    - blessings [required: >=1.5, installed: 1.6]
  - greenlet [installed: 0.4.7]
  - six [required: >=1.5, installed: 1.9.0]
flake8==2.4.1
  - pyflakes [required: >=0.8.1, installed: 0.8.1]
  - mccabe [required: >=0.2.1, installed: 0.3.1]
  - pep8 [required: >=1.5.7, installed: 1.5.7]
jedi==0.9.0
neovim==0.0.38
  - msgpack-python [required: >=0.4.0, installed: 0.4.6]
  - greenlet [installed: 0.4.7]
pep257==0.6.0
It turns out that bpython, flake8, and neovim required a bunch of libraries.
For a brand new virtualenv like this one, this is all pretty readable. Once we
start looking at projects that have a lot of high-level dependencies, making
sense of this gets a lot harder. Additionally, when updates to libraries like
bpython happen, e.g. when it drops support for old, outdated versions of
Python, dependencies like six will be left over, unused by any libraries but
still sticking around with each new deployment.
The Solution
Here’s where I say something controversial within the Python community: Ruby got it right [2].
Bundler is a tool for managing Ruby environments; it can be seen as Ruby’s answer to virtualenv. In addition to isolating environments and installing dependencies, it provides separate files (Gemfile and Gemfile.lock) for human-readable and machine-readable [3] dependency specifications. My proposal: the Python community needs to take a similar approach.
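To make the split concrete, here’s a minimal sketch of the Bundler layout. The file names are real Bundler conventions, but the gem choices below are hypothetical, picked purely for illustration:

```shell
# A Gemfile lists only the top-level, human-chosen dependencies.
# (Hypothetical contents for illustration.)
cat > Gemfile <<'EOF'
source 'https://rubygems.org'

gem 'sinatra'
gem 'rake'
EOF

# Running `bundle install` would generate Gemfile.lock next to this
# file, pinning exact versions of sinatra, rake, and every transitive
# dependency. Humans edit the Gemfile; Bundler owns the lock file.
cat Gemfile
```

The key design point is that each file has exactly one audience: the Gemfile stays short enough to read at a glance, while the lock file carries the full pinned tree for reproducible installs.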
While we need better tools [4] to address this problem, it’s largely a
social one. Before we can solve this, we need to agree that the situation needs
improvement. For now, I’m using the pip-compile command from
pip-tools (this lets me keep separate requirements.in
and requirements.txt files for my applications), and I think
you should, too.
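Concretely, the pip-tools workflow looks something like this. The package names are taken from the example earlier in this post; the exact pins pip-compile would produce will vary over time:

```shell
# requirements.in holds only the dependencies we chose directly --
# this is the human-readable file, analogous to a Gemfile.
cat > requirements.in <<'EOF'
bpython
flake8
jedi
neovim
pep257
EOF

# pip-compile (from the pip-tools package) resolves the full dependency
# tree and writes a pinned requirements.txt, analogous to Gemfile.lock:
#
#   $ pip install pip-tools
#   $ pip-compile requirements.in
#
# The generated requirements.txt pins every package with ==, much like
# the pip freeze output at the top of this post, but requirements.in
# remains the short file that humans read and edit.
cat requirements.in
```

When bpython later drops a dependency like six, rerunning pip-compile regenerates requirements.txt without the leftover package, which is exactly the cleanup that a hand-maintained freeze file never gets.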
Footnotes:
Big thanks to @dirn, @PaulLogston, and @mattupstate for reviewing early versions of this post.
1. Please curate your requirements files with more care than this. Simply dumping the output from pip freeze will likely lead to packages that are meant solely for development becoming permanent members of your deployment environments. ↩
2. While I’m at it, so did node.js. NPM includes a command called shrinkwrap that produces a full, version-locked list of dependencies based on package.json. Because of how Python’s import system works, this would be incredibly difficult (if not impossible) to pull off. ↩
3. Both of these files are actually in machine-readable formats, but only the Gemfile addresses purpose #2 above. ↩
4. There are several tools aside from pip-compile available. I considered each of these before finding and deciding on pip-tools. This list is here mainly as a reference for why each tool was not right for me.
   - pbundler: a clone of Bundler, last updated in 2012.
   - pundle: a clone of Bundler; reimplements standard Python tooling instead of working with it.
   - pundler: looks interesting; my second choice, but not as mature as pip-tools, and it was broken with the latest versions of pip when I last tried it (it tends to be difficult to use, write, or maintain software as a library when it was intended to be an application). ↩