How do I install Python packages on the research computing systems at IU?

Python packages are collections of modules (reusable code) that extend and enhance the functionality of the core Python language. Python developers contribute to the official Python Package Index (PyPI) repository, making their packages available to the Python community under open source license terms. The Python Packaging Authority (PyPA) manages the repository, and maintains a standard set of tools for building, distributing, and installing Python packages.

On Indiana University's research computing systems, many third-party packages are already installed to supplement commonly used Python builds. If you have a unique need for a third-party Python package that is not already installed, you can use pip or setup.py to install the package in your home directory. If you know several researchers are interested in using a Python package that is not already installed, you can request to have it installed as a system-wide site package.

Installing Python packages for personal use

Setting up your user environment

To install Python packages, you must have Python added to your user environment. On Karst and Carbonate, Python is added to your user environment by default. On Big Red II, or if you previously removed Python, use the following instructions to add Python to your user environment:

  1. Check which modules are currently loaded; on the command line, enter:
      module list
    
  2. If Python is not among the list of currently loaded modules, use the module load command to add it; for example:
    • To add the default version, on the command line, enter:
        module load python
      
    • To add a non-default version:
      1. Check which versions are available; on the command line, enter:
          module avail python
        
      2. Load the preferred version; on the command line, enter (replace version_number with the preferred version number):
          module load python/version_number
        
  3. If Python is listed among the currently loaded modules, but you prefer or need to use another version, you must remove the currently loaded module before loading the other version. To do this with one command, use module switch; for example, on the command line, enter (replace current_version with the version number of the currently loaded python module and new_version with the preferred version number):
      module switch python/current_version python/new_version
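
After loading or switching modules, it can help to confirm which interpreter is actually running. The following is a minimal sketch that works with Python 2 or 3; the describe_interpreter name is hypothetical and not part of any IU tooling:

```python
import sys

def describe_interpreter():
    """Return the path and X.Y version of the running Python interpreter."""
    version = "%d.%d" % (sys.version_info[0], sys.version_info[1])
    return sys.executable, version

executable, version = describe_interpreter()
print("Running Python %s from %s" % (version, executable))
```

If the reported version is not the one you loaded, check the output of module list again.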
    

To make permanent changes to your environment, edit your ~/.modules file. For more, see In Modules, how do I save my environment with a .modules file?

For more about using Modules to configure your user environment, see On the research computing systems at IU, how do I use Modules to manage my software environment?

Installing a package using pip

The pip package management tool, one of the standard tools maintained by the Python Packaging Authority (PyPA), is the recommended tool for installing packages from the Python Package Index (PyPI) repository.

To install a package from the PyPI repository (e.g., foo), use the pip install command with the --user flag; for example:

  To install:                              Use the command:
  The latest version                       pip install foo --user
  A particular version (e.g., foo 1.0.3)   pip install foo==1.0.3 --user
  A minimum version (e.g., foo 2.0)        pip install 'foo>=2.0' --user

The --user option directs pip to install your package (e.g., foo) in the user site-packages directory for the running Python; for example, for Python 2.7:

  ~/.local/lib/python2.7/site-packages/foo

Python automatically searches this directory for modules, so prepending this path to the PYTHONPATH environment variable is not necessary.
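
To see where the user site-packages directory is for the Python you are running, you can ask the standard site module. This is a minimal sketch, assuming Python 3 for the print calls; the directory appears on sys.path only if it exists and user site-packages are enabled:

```python
import site
import sys

# Path of the per-user site-packages directory for the running Python
user_site = site.getusersitepackages()

print("User site-packages directory:", user_site)
print("User site-packages enabled:", site.ENABLE_USER_SITE)
print("Currently on sys.path:", user_site in sys.path)
```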

If you omit the --user option, pip will try to install your package in the global site-packages directory (where you do not have the necessary permissions); as a result, the installation will fail.

For more about using pip, see the pip install page in the pip User Guide.
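
After an installation, one way to confirm that a package is importable, and to see which copy Python would load, is to look up its import spec. The sketch below assumes Python 3; the locate helper is a hypothetical name, and the standard library module json stands in for your package:

```python
import importlib.util

def locate(module_name):
    """Return the file a top-level module would be imported from, or None if it cannot be found."""
    spec = importlib.util.find_spec(module_name)
    return None if spec is None else spec.origin

# json is a stand-in; replace it with the name of the package you installed
print(locate("json"))
```

A result of None means Python cannot find the package on its current search path.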

Installing a package using its setup.py script

To install a Python package from a source other than the PyPI repository, you can download and unpack the source distribution yourself, and then use its setup.py script to install the package in the user site-packages directory:

  1. Set up your user environment (as described in the previous section).
  2. Use the wget command to download the distribution archive (e.g., foo-1.0.3.gz) from the source (e.g., http://pythonfoo.org); for example:
      wget http://pythonfoo.org/foo-1.0.3.gz
    
  3. Use tar to unpack the archive (e.g., foo-1.0.3.gz); for example:
      tar -xzf foo-1.0.3.gz
    

    The distribution should unpack into a similarly named directory in the current working directory (e.g., ~/foo-1.0.3 if you downloaded the archive to your home directory).

  4. Change (cd) to the new directory, and then, on the command line, enter:
      python setup.py install --user
    

The --user option directs setup.py to install the package (e.g., foo) in the user site-packages directory for the running Python; for example:

  ~/.local/lib/pythonX.Y/site-packages/foo

Python automatically searches this directory for modules, so prepending this path to the PYTHONPATH environment variable is not necessary.

If you omit the --user option, setup.py will try to install the package in the global site-packages directory (where you do not have the necessary permissions); as a result, the installation will fail.

Alternatively, you can use the --home or --prefix option to install your package in a different location (where you have the necessary permissions); for example, to install your package in a subdirectory (e.g., python-pkgs):

  • Within your home directory, enter:
      python setup.py install --home=~/python-pkgs
    
  • In your Data Capacitor II scratch space, enter:
      python setup.py install --prefix=/N/dc2/scratch/$USER/python-pkgs
    
Note:
If you install your package to a location other than the user site-packages directory, you will need to prepend the path to that directory to your PYTHONPATH environment variable. For more about PYTHONPATH, see Understanding the module search order below.
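
The directory to add to PYTHONPATH depends on which option you used. The following sketch computes the library directory for each case, assuming the default distutils layouts on a Unix-like system (some builds place compiled packages under lib64 instead of lib); both function names are hypothetical:

```python
import os
import sys

def home_install_dir(home):
    """Library directory assumed for 'python setup.py install --home=<home>'."""
    return os.path.join(os.path.expanduser(home), "lib", "python")

def prefix_install_dir(prefix):
    """Library directory assumed for 'python setup.py install --prefix=<prefix>'."""
    version_dir = "python%d.%d" % (sys.version_info[0], sys.version_info[1])
    return os.path.join(prefix, "lib", version_dir, "site-packages")

print(home_install_dir("~/python-pkgs"))
print(prefix_install_dir("/N/dc2/scratch/username/python-pkgs"))  # username is a placeholder
```

Confirm that the printed directory actually contains your package before adding it to PYTHONPATH.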

For more on using setup.py to install packages, see Installing Python Modules (Legacy version).

Requesting a system-wide package installation

If you know of a Python package that is not already available on the IU research computing systems, and you believe several other researchers would find it useful, contact the UITS Scientific Applications and Performance Tuning (SciAPT) team to request a system-wide installation.

In your request:

  • Specify the package and version number you want installed.
  • Describe what the package does and/or how it's used.
  • Include the URL to the package's website or project home page.
  • Include an estimate of how many other researchers will use the package.

The SciAPT team considers these requests on a case-by-case basis. Although system-wide installations are practicable in some cases, SciAPT limits the number of system-wide installations to avoid potential conflicts and keep Python load times minimal.

Understanding the module search order

Knowing how the Python interpreter responds to import statements can help you determine why a particular module or package isn't loading, or why an unexpected version of a package is loading, even though the correct version is installed and the path to its location is listed in your PYTHONPATH environment variable.

When Python executes an import statement, it searches the paths found in sys.path, a list of directories that determines the interpreter's search path for modules. The sys.path variable is initialized at startup from the following locations, in this order:

  1. The directory containing the script used to invoke the Python interpreter (if the interpreter is invoked interactively or the script is read from standard input, this first item, sys.path[0], remains an empty string, which directs Python to search for modules in the current working directory first)
  2. The directories listed in PYTHONPATH
  3. The version-specific site-packages directory for the running Python installation; for example:
      <sys.prefix>/lib/pythonX.Y/site-packages
    

    In this example, <sys.prefix> is the path to the running Python installation; X.Y is the version number (e.g., 2.7) of the running Python installation.
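
The ordering above can be checked with a small experiment. This sketch, assuming Python 3, starts a child interpreter with PYTHONPATH set to a temporary directory and compares that directory's position in the child's sys.path with the positions of any site-packages entries; all variable names are illustrative:

```python
import os
import subprocess
import sys
import tempfile

# Temporary directory that stands in for a PYTHONPATH entry
extra_dir = os.path.realpath(tempfile.mkdtemp())

# Run a child interpreter with PYTHONPATH set and capture its sys.path
env = dict(os.environ, PYTHONPATH=extra_dir)
output = subprocess.check_output(
    [sys.executable, "-c", "import sys; print('\\n'.join(sys.path))"],
    env=env,
)
child_path = output.decode().splitlines()

position = child_path.index(extra_dir)
site_positions = [i for i, p in enumerate(child_path) if p.endswith("site-packages")]

print("PYTHONPATH entry is at index", position)
print("site-packages entries are at indexes", site_positions)

os.rmdir(extra_dir)  # clean up the temporary directory
```

On a typical installation, the PYTHONPATH entry has a lower index than every site-packages entry, which is why a package found through PYTHONPATH takes precedence over a copy in site-packages.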

By default, Python also imports the site.py module upon initialization, which adds site-specific paths to the module search path (sys.path), including the path to your user site-packages directory within your home directory; for example (X.Y will be the version number of the running Python installation):

  ~/.local/lib/pythonX.Y/site-packages

As site.py adds paths to sys.path, it scans them for path configuration (.pth) files, which contain additional directories that are added to sys.path. If a directory contains multiple .pth files, site.py processes them in alphabetical order.
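
The effect of .pth files can be reproduced in a scratch directory. The sketch below, assuming Python 3, creates two .pth files in a temporary directory and uses site.addsitedir to process that directory the same way site.py processes a site-packages directory; the directory and file names are made up for the illustration:

```python
import os
import site
import sys
import tempfile

# Scratch directory that plays the role of a site-packages directory
site_dir = os.path.realpath(tempfile.mkdtemp())
first = os.path.join(site_dir, "first")
second = os.path.join(site_dir, "second")
os.mkdir(first)
os.mkdir(second)

# Each line of a .pth file names a directory relative to the .pth file's location.
# b.pth is written before a.pth to show that processing order follows the file names.
with open(os.path.join(site_dir, "b.pth"), "w") as handle:
    handle.write("second\n")
with open(os.path.join(site_dir, "a.pth"), "w") as handle:
    handle.write("first\n")

# Adds site_dir to sys.path, then processes its .pth files in alphabetical order
site.addsitedir(site_dir)

print(sys.path.index(first) < sys.path.index(second))
```

Directories named in a .pth file are added only if they exist, which is why the sketch creates them first. The temporary directory is left in place and can be deleted afterward.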

However, some .pth files contain embedded commands that insert directory entries at the beginning of the module search path (ahead of the standard library path). As a result, a module from one of the inserted directories will load instead of the module of the same name from the standard library directory. This can be undesired and confusing behavior unless such a replacement is intended.

Note:

If your import requests are consistently disrupted by site.py and .pth files, try invoking the Python interpreter with the -S (uppercase "S") option:

  python -S

This disables the automatic import of site.py and, as a result, prevents it from manipulating sys.path. However, it also prevents site.py from adding your user site-packages directory to sys.path. To import site.py without adding your user site-packages directory to sys.path, invoke Python with the -s (lowercase "s") option:

  python -s
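
To check what each option changes, you can inspect sys.flags in a child interpreter. This is a minimal sketch assuming Python 3; the child_flags helper is a hypothetical name:

```python
import subprocess
import sys

def child_flags(*options):
    """Run a child interpreter with the given options and return (no_site, no_user_site)."""
    code = "import sys; print(sys.flags.no_site, sys.flags.no_user_site)"
    output = subprocess.check_output([sys.executable] + list(options) + ["-c", code])
    no_site, no_user_site = output.decode().split()
    return int(no_site), int(no_user_site)

print("-S:", child_flags("-S"))  # no_site is set: site.py is not imported
print("-s:", child_flags("-s"))  # no_user_site is set: user site-packages is skipped
```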

To see which directories Python scans when you issue import commands, on the command line, enter:

  python -c "import sys; print('\n'.join(sys.path))"

Alternatively, launch Python in interactive mode, and then invoke the same commands in this order (>>> is the Python primary prompt):

   >>> import sys
   >>> print('\n'.join(sys.path))

Note:
The sys.path variable is an ordinary list of strings that you can edit like any other Python list. Avoid editing the first item in the list (sys.path[0]), because many packages assume it refers to the directory containing the script used to invoke the Python interpreter.
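
If you do need to adjust sys.path from within a script, one cautious approach is to insert the new directory after the first entry rather than in front of it. The following is a sketch only; add_search_dir and the python-pkgs directory name are hypothetical:

```python
import os
import sys

def add_search_dir(directory):
    """Insert directory into sys.path just after the first entry, leaving sys.path[0] untouched."""
    directory = os.path.abspath(directory)
    if directory not in sys.path:
        sys.path.insert(1, directory)
    return directory

script_entry = sys.path[0]
added = add_search_dir("python-pkgs")  # hypothetical directory name

print(sys.path[0] == script_entry)
print(added in sys.path)
```

Changes made this way last only for the current interpreter session; to make a directory available in every session, set PYTHONPATH instead.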

This is document acey in the Knowledge Base.
Last modified on 2018-05-03 15:10:42.
