The Binarium: How to set up a scientific virtual environment for python

Virtual environments are a way to partition your dependencies into different directories. So, for example, if program 1 needs pandas 0.14 and program 2 needs pandas 0.15, you can put programs 1 and 2 in separate directories, create a virtual environment in each folder, and install the packages specific to each program in their respective places without conflict. Note that a typical python installation takes about 282 MB of disk space.
This guide applies to Mac OS X and maybe Linux. First, make sure pip is installed. Python 2.7.9 and later 2.7 releases come with pip preinstalled. See this page for installation. You can check your version of pip with the following command:

$ pip -V
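
If you would rather check from Python whether your interpreter should already have pip, here is a minimal sketch. The helper name bundles_pip is made up for illustration, and it assumes the stock python.org installers, where pip ships with 2.7.9 and later on the 2.x line and with 3.4 and later on the 3.x line. Packages from your OS vendor may be built differently.

```python
import sys


def bundles_pip(version):
    """Return True if a python.org build of this version ships with pip.

    `version` is a (major, minor, micro) tuple such as (2, 7, 9).
    """
    version = tuple(version[:3])
    if version[0] == 2:
        # On the 2.x line, pip is bundled starting with 2.7.9.
        return version >= (2, 7, 9)
    # On the 3.x line, pip is bundled starting with 3.4.
    return version >= (3, 4, 0)


# Report on the interpreter that is running this script.
print(sys.version_info[:3], bundles_pip(sys.version_info))
```

If this says False for your interpreter, you will need to install pip separately before continuing.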

Second, install the virtual environment package using pip:

$ pip install virtualenv

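This may take a moment. Note that this installs only the virtualenv tool itself; each environment you create with it gets its own copy of (or link to) a python interpreter.
Now make a directory for your project and create the virtual environment inside it: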

$ mkdir myproject
$ cd myproject
$ virtualenv vrtenv

You can specify a python interpreter using the -p option. The following will create a virtual environment with the version in /usr/bin/python2.7.

$ virtualenv -p /usr/bin/python2.7 vrtenv

If you want python 3.3, it must already be installed on your system. Pass its path to -p in the same way:

$ virtualenv -p /usr/bin/python3.3 vrtenv
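
The paths above are only examples, and the interpreter may live somewhere else on your machine. Here is a small sketch for finding a path to hand to -p. The helper name find_interpreter and the candidate names are assumptions for illustration, and shutil.which needs Python 3.3 or newer.

```python
import shutil
import sys


def find_interpreter(candidates):
    """Return the full path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


# The interpreter running this script is always a valid choice for -p.
print("running interpreter:", sys.executable)
# Look for a specific version first, then fall back to any python3.
print("found on PATH:", find_interpreter(["python3.3", "python3"]))
```

Whatever path it prints is what you would pass after -p. If it prints None, that version is not installed (or not on your PATH).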

vrtenv is the directory where the new environment's python interpreter and installed packages reside. You can name this folder anything, but vrtenv is intuitive enough.
To start using the virtual environment, activate it:

$ source vrtenv/bin/activate
(vrtenv)$

Notice the change in the prompt. The name of the virtual environment should appear in parentheses on the left. Now proceed to install your python packages:

$ pip install numpy
$ pip install scipy
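
If you are ever unsure whether the environment is really active (say, in a script where you cannot see the prompt), you can ask the interpreter. This is a minimal sketch; the function name in_virtualenv is made up. It relies on older virtualenv releases setting sys.real_prefix, and on newer virtualenv releases and the standard venv module making sys.base_prefix differ from sys.prefix.

```python
import sys


def in_virtualenv():
    """Return True if this interpreter is running inside a virtual environment."""
    # Older virtualenv releases record the original prefix in sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # Newer virtualenv and venv leave sys.base_prefix pointing at the
    # system installation while sys.prefix points at the environment.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


print("prefix:", sys.prefix)
print("inside a virtual environment:", in_virtualenv())
```

Run it once before and once after activating, and the answer should flip.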

It can be tedious to manually install everything you need. Luckily, pip provides the functionality to install packages from a file. Begin by collecting the packages from your existing setup (run this where your old packages are installed, not in the fresh environment):

$ pip freeze > requirements.txt

Here is an example of what might be in requirements.txt:

backports.ssl-match-hostname==3.4.0.2
certifi==14.5.14
decorator==3.4.0
gnureadline==6.3.3
ipython==2.3.1
Jinja2==2.7.3
MarkupSafe==0.23
matplotlib==1.4.2
mock==1.0.1
networkx==1.9.1
nose==1.3.4
numpy==1.9.1
pandas==0.15.2
Pygments==2.0.2
pyparsing==2.0.3
python-dateutil==2.4.0
pytz==2014.10
pyzmq==14.4.1
requests==2.5.1
scikit-learn==0.15.1
scipy==0.14.1
six==1.9.0
tornado==4.0.2
twitter==1.15.0
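
Each line in the file pins one package to an exact version with ==. If you want to inspect the file from Python, the sketch below reads those pins into a dictionary. The name parse_pins is hypothetical, and it only understands the simple name==version lines shown above, not the full requirements syntax that pip accepts.

```python
def parse_pins(text):
    """Return {package name: version} for every name==version line in text."""
    pins = {}
    for raw in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if "==" not in line:
            continue  # blank line, comment, or a form this sketch ignores
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


sample = """\
numpy==1.9.1
pandas==0.15.2
scipy==0.14.1
"""
print(parse_pins(sample))
```

To use it on a real file, read the file's contents into a string and pass that string in.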

Once you have frozen your old package list and redirected it to a file, just copy requirements.txt to your new directory, myproject, and call

$ pip install -r requirements.txt
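
To double-check that the install gave you the versions you asked for, you can compare the pins against what is actually installed. This is a rough sketch: unmet_pins is a made-up name, the two pins in the example are taken from the sample file above, and importlib.metadata requires Python 3.8 or newer (much newer than the interpreters used in this post).

```python
from importlib import metadata  # standard library from Python 3.8


def unmet_pins(pins):
    """Return {name: (wanted, installed)} for every pin that is not satisfied.

    `installed` is None when the package is not installed at all.
    """
    unmet = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            unmet[name] = (wanted, installed)
    return unmet


# An empty dictionary means every pin matched.
print(unmet_pins({"numpy": "1.9.1", "pandas": "0.15.2"}))
```

Run it with the environment activated, otherwise it reports on your system-wide packages instead.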

All that remains is to copy pre-existing code into myproject or to create new code.
Nota Bene: before installing everything from requirements.txt, go through it and try to remove packages that are superfluous to your new directory. If you accidentally remove a package or two, that's okay, because when you try to run your python script, if you forgot anything, the program will raise an ImportError. Use pip to install whatever is missing, and then update your requirements.txt file using pip freeze again.
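
Rather than waiting for an ImportError halfway through a long run, you can check up front which modules the environment is missing. A minimal sketch, with a hypothetical helper name and an example module list, follows. Keep in mind that the import name is not always the package name: scikit-learn is imported as sklearn, for instance.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that cannot be found for import."""
    missing = []
    for name in names:
        try:
            # find_spec locates a top-level module without importing it.
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            found = False
        if not found:
            missing.append(name)
    return missing


# List the import names your scripts rely on.
print(missing_modules(["numpy", "scipy", "pandas", "sklearn", "networkx"]))
```

Anything it prints is a candidate for pip install, followed by a fresh pip freeze.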
Using the requirements.txt above will give you all you need to run networkx for social network analysis, sklearn for machine learning, pandas for data analysis, and ipython notebooks for fun.
Finally, when you are done working in your new environment, deactivate it:

$ deactivate

Tuesday, January 13, 2015


kjh