Anaconda Python Distribution

Photo Credit  via Flickr / Creative Commons

A few days ago I tweeted about the Anaconda distribution of Python from Continuum Analytics. I expressed surprise that I'd never heard of it before. For my needs, I believe it's better than either Enthought's Canopy distribution (formerly "Enthought Python Distribution" or "EPD") or "rolling my own." To my further surprise, David Cournapeau from Enthought responded to my tweet, asking if I could expand on my comments.

So, here is an expanded answer. I certainly don't mean any disrespect to Enthought, who pretty much led the charge in the field of scientific Python distributions. I saw Continuum's CEO, Travis Oliphant, speak at PyCon 2008 in Chicago... when he was working for Enthought!

Both Anaconda and Canopy are Python distributions that run on Windows, OSX and Linux. Generally, both provide a fully integrated environment for scientific analysis with SciPy and NumPy. The selling point of both, I'd say, is that all the compilation and integration details are handled for you. I suspect anybody who's tried to get SciPy, NumPy, the latest IPython and all the various libraries and dependencies to compile and install successfully would agree.

My comments on Enthought are based on my attempts to find a distro for Windows that installed and worked easily; I wasn't having very good luck getting some of the tools, especially easy_install and pip, working properly with the vanilla installer from Python.org. In general I find Python on Windows to be a pain in the neck, so my analysis may be colored by that. In much the same way, it's not that I don't like my wife's lamb recipe... it's just that I don't care for lamb.

Here are the differences as I see them.

  • Product Mix: With Anaconda, there's pretty much just one product, and it's free. Continuum's add-on product offerings are mostly specialized compilers and libraries for allowing various parts of Anaconda to take better advantage of hardware (with "NumbaPro" and Intel's "MKL Optimizations", collectively "Anaconda Accelerate") and to more efficiently interface with data stores ("IOPro"). But except for performance differences, the free edition of the distribution basically works the same. With Enthought's Canopy distribution, in order to de-cripple the distribution and get the package installer working, it seems that you have to purchase the "Pro" version -- effectively you've got to purchase services to go along with the code. That might be OK for a Fortune 500 corporation, but not for me.
  • Platform Availability: Anaconda offers 32-bit and 64-bit installs for Linux and Windows, and 64-bit only installs for OSX. With Enthought, 64-bit support for Windows or OSX is only available in the Pro version. Since most scientific/technical users are probably running newer machines with lots of RAM, most of them are probably running 64-bit operating systems.
  • Licensing: Anaconda's licensing means that if I wanted to deploy it on a temporary EC2 cluster, or redistribute it with a virtual machine image, I can.  (This would not be true of the add-on products, naturally.) While Enthought does permit redistribution of the installer in unaltered form, a lot of its features are only available with a paid license, so redistributing it isn't very practical.
  • Python Versions: I like that Anaconda lets me easily work with multiple Python versions, and multiple environments, with just a command line switch. It's like a virtualenv on steroids that also lets you test your code with several different versions of Python. This is huge for developers who need to test in a variety of environments. One of the reasons I hadn't yet started using Python 3.x was that I was afraid of borking my standard system Python. Enthought, as far as I can tell, only includes Python 2.7.3.
  • Package Management: The Anaconda package manager, in my testing thus far, pretty much just works. It also provides a correctly configured C compiler, which allows you to use pip to install things that are not included in Anaconda's repositories. I like that Anaconda warned me when I tried to install virtualenv, because conda does a lot of similar things and they would conflict. I had a lot of trouble with the graphical Enthought package manager, on the other hand. Why would I need a graphical tool for this? It's a programming environment for scientific and technical users; command line tools are perfectly acceptable, and in many cases preferable. I signed up for an Enthought account, hoping that it would make the package manager work, but no such luck.
  • Packages Included: Enthought lacks database drivers, sphinx, docutils, reportlab, pep8 and hdf5.  These seem like obvious omissions.
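
If you want to compare what a given install or environment actually provides, a small script like the one below can help. This is only a rough sketch: the function names (check_packages, report) and the package list are my own illustrative choices, not an inventory of what either distribution ships, and it assumes Python 3.4 or newer because it relies on importlib.util.find_spec.

```python
import importlib.util
import sys

# Import names for some packages mentioned in this post. The list is
# illustrative only; edit it to match what you care about.
PACKAGES = ["numpy", "scipy", "IPython", "sphinx", "docutils", "reportlab"]


def check_packages(names):
    """Map each top-level import name to True if the interpreter can find it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


def report(names=PACKAGES):
    """Return a short text report for the interpreter running this script."""
    lines = ["Python %d.%d.%d" % sys.version_info[:3]]
    for name, found in sorted(check_packages(names).items()):
        lines.append("  %-10s %s" % (name, "ok" if found else "MISSING"))
    return "\n".join(lines)


if __name__ == "__main__":
    print(report())
```

Running the same file with each environment's interpreter gives a quick side-by-side view of which Python version is active and which libraries are importable there. It only checks that a package can be located, not that it works.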

I don't want to say that Enthought doesn't have its uses. I've worked in University computing environments before. It would be nice to have a site license for a set of graphical tools that is pretty much the same on Windows, OSX and Linux. But for me, I just want something that works with a minimum of fuss on my own machines, which include Windows, OSX and Linux. And I want to be able to install it on as many machines as I want, including potentially hundreds of temporary VMs, without any licensing hassles. If I get to the point where I need services or consulting, I'm happy to pay for it ... I just don't want to be forced to in order to use the base distro.

For me, the new scientific Python distro of choice is Anaconda.

That may not be a very satisfying answer for Enthought, as it doesn't really answer David's question, but those are my impressions.