Install Python packages on the research supercomputers at IU
On this page:
- Overview
- Install Python packages for personal use
- Understand the module search order
- Supported packages
Overview
Python packages are collections of modules (reusable code) that extend and enhance the functionality of the core Python language. Python developers contribute to the official Python Package Index (PyPI) repository, making their packages available to the Python community under open source license terms. The Python Packaging Authority (PyPA) manages the repository, and maintains a standard set of tools for building, distributing, and installing Python packages.
On Indiana University's research supercomputers, many third-party packages are already installed to supplement commonly used Python builds. For the most commonly used packages that are supported at IU, see the Supported packages section; for a complete list, load the Python module, and then enter pip freeze to see what packages are installed. If you have a unique need for a third-party Python package that is not already installed, you can use pip or setup.py to install the package in your home directory or in your Slate storage space.
Space on Slate is available to all IU research supercomputer users. To create a Slate account, follow the instructions in Get additional IU computing accounts.
If you know several researchers are interested in using a Python package that is not already installed, you can request to have it installed as a system-wide site package.
Install Python packages for personal use
Set up your user environment
The IU research supercomputers use module-based environment management systems that provide a convenient method for dynamically customizing your software environment.
To install Python packages, you must have Python added to your user environment. To check which modules are currently loaded, on the command line, enter:
module list
If Python is not among the currently loaded modules, use the module load command to add it; for example:
- To add the default version, on the command line, enter:
module load python
- To add a non-default version:
- Check which versions are available; on the command line, enter:
module avail python
- Load the preferred version; on the command line, enter (replace version_number with the preferred version number):
module load python/version_number
If Python is listed among the currently loaded modules, but you prefer or need to use another version, you must remove the currently loaded module before loading the other version. To do this with one command, use module switch; for example, on the command line, enter (replace current_version with the version number of the currently loaded python module and new_version with the preferred version number):
module switch python/current_version python/new_version
You can save your customized user environment so that it loads every time you start a new session; for instructions, see Use modules to manage your software environment on IU research supercomputers.
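After loading or switching modules, it can help to confirm which interpreter the python command now resolves to. The following is a minimal sketch using only the standard library; the function name describe_interpreter is illustrative and not part of any IU tooling.

```python
import sys


def describe_interpreter():
    """Return the running interpreter's path and its major.minor version."""
    version = f"{sys.version_info.major}.{sys.version_info.minor}"
    return sys.executable, version


if __name__ == "__main__":
    path, version = describe_interpreter()
    print(f"Interpreter: {path}")
    print(f"Version:     {version}")
```

If the reported version does not match the module you meant to load, check the output of module list again before installing anything.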
Install a package using pip
The pip package management tool, one of the standard tools maintained by the Python Packaging Authority (PyPA), is the recommended tool for installing packages from the Python Package Index (PyPI) repository.
To install a package from the PyPI repository (for example, foo), use the pip install command with the --user flag; for example:
To install: | Use the command: |
---|---|
The latest version | pip install foo --user |
A particular version (for example, foo 1.0.3) | pip install foo==1.0.3 --user |
A minimum version (for example, foo 2.0) | pip install 'foo>=2.0' --user |
The --user option directs pip to download your package (for example, foo) and install it in the user site-packages directory for the running Python (for example, ~/.local/lib/python3.6/site-packages/foo). Python automatically searches this directory for modules, so prepending this path to the PYTHONPATH environment variable is not necessary. If you omit the --user option, pip will try to install your package in the global site-packages directory (where you do not have the necessary permissions); as a result, the installation will fail.
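To see where a --user installation will land for the Python you have loaded, you can ask the standard library's site module. This is a minimal sketch; user_site_info is an illustrative name. Note that Python adds the user site-packages directory to sys.path only if the directory already exists, so the second value may be False before your first --user install.

```python
import site
import sys


def user_site_info():
    """Return the user site-packages path and whether it is on sys.path."""
    user_site = site.getusersitepackages()
    return user_site, user_site in sys.path


if __name__ == "__main__":
    path, on_path = user_site_info()
    print(f"User site-packages: {path}")
    print(f"Currently on sys.path: {on_path}")
```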
Alternatively, you can use the --prefix option to install your package in your Slate storage space; for example, to install your package (package_name) in an existing subdirectory (python-pkgs) in your Slate space, enter:
pip install --prefix=/N/slate/$USER/python-pkgs package_name
If you install your package to a location other than the user site-packages directory, you will need to prepend the path to that directory to your PYTHONPATH environment variable; for example:
export PYTHONPATH=/N/slate/$USER/python-pkgs/lib/python3.x/site-packages:$PYTHONPATH
- In the above example, 3.x represents the first two elements of the Python version number you are using. For example, if your Python version is 3.10.5, replace 3.x with 3.10.
- This line can be added to your .bashrc file to avoid having to add it every time you log in.
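If you are unsure which value to substitute for the version part of the path, you can let the running interpreter build it. The sketch below assumes the lib/pythonX.Y/site-packages layout shown above; some installations use a different layout (for example, lib64), so treat the result as a starting point and confirm that the directory exists. The function name prefix_site_packages and the example prefix are illustrative.

```python
import os
import sys


def prefix_site_packages(prefix):
    """Build <prefix>/lib/pythonX.Y/site-packages for the running Python.

    Assumes the layout described in this document; it is not verified
    against the file system.
    """
    version_dir = f"python{sys.version_info.major}.{sys.version_info.minor}"
    return os.path.join(prefix, "lib", version_dir, "site-packages")


if __name__ == "__main__":
    # Hypothetical prefix; replace "username" with your own.
    print(prefix_site_packages("/N/slate/username/python-pkgs"))
```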
For more about using pip, see the pip install page in the pip User Guide.
Install a package using its setup.py script
To install a Python package from a source other than the PyPI repository, you can download and unpack the source distribution yourself, and then use its setup.py script to install the package in the user site-packages directory:
- Set up your user environment (as described in the previous section).
- Use the wget command to download the distribution archive (for example, foo-1.0.3.gz) from the source (for example, http://pythonfoo.org); for example:
wget http://pythonfoo.org/foo-1.0.3.gz
- Use tar to unpack the archive (for example, foo-1.0.3.gz); for example:
tar -xzf foo-1.0.3.gz
The distribution should unpack into a similarly named directory in your home directory (for example, ~/foo-1.0.3).
- Change (cd) to the new directory, and then, on the command line, enter:
python setup.py install --user
The --user option directs setup.py to install the package (for example, foo) in the user site-packages directory for the running Python (for example, ~/.local/lib/pythonX.Y/site-packages/foo).
Python automatically searches this directory for modules, so prepending this path to the PYTHONPATH environment variable is not necessary.
If you omit the --user option, setup.py will try to install the package in the global site-packages directory (where you do not have the necessary permissions); as a result, the installation will fail.
Alternatively, you can use the --home or --prefix option to install your package in a different location (where you have the necessary permissions); for example, to install your package in a subdirectory (for example, python-pkgs):
- Within your home directory, enter:
python setup.py install --home=~/python-pkgs
- In your Slate storage space, enter:
python setup.py install --prefix=/N/slate/$USER/python-pkgs
If you install your package to a location other than the user site-packages directory, you will need to prepend the path to that directory to your PYTHONPATH environment variable. For more about PYTHONPATH, see Understand the module search order below.
For more on using setup.py to install packages, see Installing Python Modules (Alternate Installation).
Understand the module search order
Knowing how the Python interpreter responds to import statements can help you determine why a particular module or package isn't loading, or why an unexpected version of a package is loading, even though the correct version is installed and the path to its location is listed in your PYTHONPATH environment variable.
When you import a module, Python searches the paths found in sys.path, a list of directories that determines the interpreter's search path for modules. The sys.path variable is initialized when Python launches, from the following locations, in this order:
- The directory containing the script used to invoke the Python interpreter (if the interpreter is invoked interactively or the script is read from standard input, this first item, path[0], remains an empty string, which directs Python to search for modules in the current working directory first)
- The directories listed in PYTHONPATH
- The version-specific site-packages directory for the running Python installation; for example, <sys.prefix>/lib/pythonX.Y/site-packages, in which <sys.prefix> represents the path to the running Python installation and X.Y represents the version number of the running Python installation
By default, Python also imports the site.py module upon initialization, which adds site-specific paths to the module search path (sys.path), including the path to your user site-packages directory within your home directory (for example, ~/.local/lib/pythonX.Y/site-packages).
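To relate the entries in sys.path to the sources described above, you can label each one. The following is a rough heuristic sketch, not an exact reconstruction of how the interpreter built the list: it marks entries that match a directory in PYTHONPATH or the user site-packages directory, and labels everything else (script directory, standard library, global site-packages) as "other". The function name classify_sys_path is illustrative.

```python
import os
import site
import sys


def classify_sys_path():
    """Label each sys.path entry by its likely origin (a heuristic)."""
    from_env = {
        os.path.abspath(p)
        for p in os.environ.get("PYTHONPATH", "").split(os.pathsep)
        if p
    }
    user_site = os.path.abspath(site.getusersitepackages())
    rows = []
    for entry in sys.path:
        # An empty string stands for the current working directory.
        full = os.path.abspath(entry) if entry else os.getcwd()
        if full in from_env:
            label = "PYTHONPATH"
        elif full == user_site:
            label = "user site-packages"
        else:
            label = "other"
        rows.append((label, entry))
    return rows


if __name__ == "__main__":
    for label, entry in classify_sys_path():
        print(f"{label:20} {entry!r}")
```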
As site.py adds paths to sys.path, it scans them for path configuration (.pth) files, which contain additional directories that are added to sys.path. If a directory contains multiple .pth files, site.py processes them in alphabetical order.
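To find out which .pth files are present in your site directories, you can list them in sorted order. This is a minimal sketch; find_pth_files is an illustrative name, and the getattr fallback is there because some older virtual environments do not provide site.getsitepackages().

```python
import os
import site


def find_pth_files():
    """Map each existing site directory to its .pth files, sorted by name."""
    get_global = getattr(site, "getsitepackages", lambda: [])
    directories = list(get_global()) + [site.getusersitepackages()]
    found = {}
    for directory in directories:
        if os.path.isdir(directory):
            found[directory] = sorted(
                name for name in os.listdir(directory) if name.endswith(".pth")
            )
    return found


if __name__ == "__main__":
    for directory, names in find_pth_files().items():
        print(directory)
        for name in names:
            print(f"  {name}")
```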
However, some .pth files contain embedded commands that insert directory entries at the beginning of the module search path (ahead of the standard library path). As a result, a module from one of the inserted directories will load instead of the module of the same name from the standard library directory. This can be undesired and confusing behavior unless such a replacement is intended.
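When you suspect that an unexpected copy of a module is being loaded, you can ask Python where it would load a given name from without importing it. The sketch below uses importlib.util.find_spec from the standard library; module_origin is an illustrative name, and json is only an example module.

```python
import importlib.util


def module_origin(name):
    """Return the location a top-level module would load from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    # Built-in modules report "built-in" rather than a file path.
    for name in ("json", "sys"):
        print(f"{name}: {module_origin(name)}")
```

If the reported path is not in the directory you expect, compare it against the order of entries in sys.path.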
If your import requests are consistently disrupted by site.py and .pth files, try invoking the Python interpreter with the -S (uppercase "S") option:
python -S
This disables the automatic import of site.py and, as a result, prevents it from manipulating sys.path. However, it also prevents site.py from adding your user site-packages directory to sys.path. To import site.py without adding your user site-packages directory to sys.path, invoke Python with the -s (lowercase "s") option:
python -s
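To confirm from inside a session whether either option is in effect, you can inspect sys.flags. This is a minimal sketch; site_flags is an illustrative name.

```python
import sys


def site_flags():
    """Report whether -S (no site.py) or -s (no user site) is active."""
    return {
        "no_site": bool(sys.flags.no_site),            # set by -S
        "no_user_site": bool(sys.flags.no_user_site),  # set by -s
    }


if __name__ == "__main__":
    for flag, active in site_flags().items():
        print(f"{flag}: {active}")
```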
To see which directories Python scans when you issue import commands, on the command line, enter:
python -c "import sys; print('\n'.join(sys.path))"
Alternatively, launch Python in interactive mode, and then invoke the same commands in this order (>>> is the Python primary prompt):
>>> import sys
>>> print('\n'.join(sys.path))
The sys.path variable is simply a list of strings that you can edit like any other Python list. Avoid editing the first item in the list (path[0]), because many packages assume it refers to the directory containing the script used to invoke the Python interpreter.
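As a sketch of editing sys.path at run time while leaving path[0] alone, the helper below either appends a directory or inserts it at index 1. The name add_search_dir and the example path are hypothetical; changes made this way last only for the current Python process.

```python
import sys


def add_search_dir(directory, front=False):
    """Add a directory to sys.path without touching sys.path[0]."""
    if directory in sys.path:
        return
    if front:
        # Index 1 gives the directory priority while keeping path[0] intact.
        sys.path.insert(1, directory)
    else:
        sys.path.append(directory)


if __name__ == "__main__":
    # Hypothetical location; substitute your own directory.
    add_search_dir("/N/slate/username/python-pkgs/lib/python3.10/site-packages")
    print("\n".join(sys.path))
```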
Supported packages
Supported packages in the Python modules
Following are the most commonly used packages available in the Python modules. To see a full list of available packages, after loading a Python module, type:
pip list
or
pip freeze
- astropy
- biopython
- cryptography
- cutadapt
- cycler
- Cython
- dask
- deeptools
- h5py
- idna
- igraph
- jupyterlab
- kiwisolver
- MACS2
- matplotlib
- mpi4py
- nose
- notebook
- numba
- numpy
- pandas
- Pillow
- plotly
- pycosat
- pysam
- PySocks
- pytz
- scikit-learn
- scipy
- seaborn
- tensorflow
- tornado
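If you want the installed-package list from within a script rather than from pip list, the standard library can provide a similar view. This is a minimal sketch that assumes Python 3.8 or later (importlib.metadata is not available in older versions such as 3.6); installed_packages is an illustrative name, and the output may differ slightly from what pip reports.

```python
from importlib import metadata


def installed_packages():
    """Return {name: version} for distributions visible on sys.path."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip distributions with missing metadata
            packages[name] = dist.version
    return dict(sorted(packages.items(), key=lambda item: item[0].lower()))


if __name__ == "__main__":
    for name, version in installed_packages().items():
        print(f"{name}=={version}")
```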
Supported packages in the python/gpu modules
Following are the most commonly used packages available in the python/gpu modules. To see a full list of available packages, after loading a python/gpu module, type:
pip list
or
pip freeze
- accelerate
- bs4
- causality
- cupy-cuda117
- dask
- editdistance
- GraphViz
- guppy3
- h5py
- imbalanced_learn
- imutils
- jax
- Jinja2
- keras
- lightgbm
- Markov
- matplotlib
- mxnet-cu112*
- nibabel
- nltk
- numba
- numpy
- pandas
- Pillow
- sasa
- scikit-learn
- sds
- seaborn
- tensorflow*
- torch*
- vaderSentiment
- Werkzeug
* Specifically GPU-capable packages
This is document acey in the Knowledge Base.
Last modified on 2024-03-19 14:16:41.