The Problem
Not many things are more annoying than a build failure that is nobody’s fault.
The code is fine, but our build depends on a website that we don’t control!
We use Jenkins for our automated builds, and it runs Buildout to assemble our project for deployment. Buildout checks the PyPI index for any new packages that are needed and for new versions of any packages that are not version-pinned. So if PyPI is unavailable (or very slow) for a while, our build fails. That sucks because Jenkins won’t try another build until it detects a new changelist in our repository.
The Solution
I started looking into how we could cache PyPI locally to avoid this problem altogether. I found several ways to achieve this, but finally settled on collective.eggproxy. I chose it mainly for two reasons:
1. It doesn’t cache/sync all 30+ gigs of PyPI
2. We already have an Apache instance, and it can be run as a mod_python module
I ran into a couple of installation and configuration problems, so I thought I’d share our setup.
Installation
Eggy Parts
BeautifulSoup is an awesome HTML/XML parser, but whoever manages its index on PyPI has the wrong links. I’ve seen a couple of packages that rely on BeautifulSoup versions <= 3.09, but those installs seem to fail. I finally figured out that giving easy_install a find-links hint (its -f / --find-links option) fixed me right up.
Note that I’m just installing it in the system Python site-packages. You could use a virtualenv, but this is just our CI server.
mod_python
I use Ubuntu, so I got mod_python via apt; the package is called libapache2-mod-python.
Configuration
collective.eggproxy
collective.eggproxy reads its settings from /etc/eggproxy.conf, so that is where I put ours.
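For reference, here is a minimal sketch of such a file. The option names are assumed from the package’s sample configuration, so check them against the README of the version you install. The cache directory is a hypothetical path, and the values are placeholders rather than our real settings:

```ini
[eggproxy]
# Hypothetical cache location; packages are stored here as they are requested
eggs_directory = /var/www/pypi
# Upstream index the proxy falls back to when something is not cached yet
index = https://pypi.org/simple
# Assumed to be the number of hours before a cached index page is refreshed
update_interval = 24
```

Whatever directory you pick needs to be writable by the user Apache runs as, since the proxy fills the cache on demand.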
Apache
I added a dedicated virtualhost to Apache for the proxy, with mod_python handling the requests.
Buildout
In our project’s buildout.cfg I simply had to add one line that points Buildout at the proxy instead of PyPI.
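A sketch of what that addition could look like, assuming the proxy virtualhost answers at the hypothetical hostname pypi.example.com. The index option itself is Buildout’s standard setting for choosing the package index:

```ini
[buildout]
# Hypothetical proxy URL; replace it with the address of your own virtualhost
index = http://pypi.example.com/
```

With this in place, Buildout asks the proxy for every package lookup, and only the proxy needs to reach PyPI.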
Conclusion
This has worked out great for us. Not only are our builds more stable, but there’s a noticeable speed improvement. Totally worth it!