Install packages in a conda environment on IU's high performance computers

Overview

Conda is an open source package manager similar to pip that makes installing packages and their dependencies easier. Unlike pip, conda is also an environment manager similar to virtualenv.

The following instructions cover creating and activating a conda environment and installing packages in your Slate directory space from any of the research supercomputers at Indiana University.

Note:
Package managers are helpful because they let you install packages and their dependencies locally with a single command. However, UITS Research Technologies already installs and maintains many software packages on IU's research supercomputers, which you can access with module commands. The python and python/gpu modules include many commonly used Python packages, such as numpy, pandas, tensorflow, pytorch, and jupyter, so check those modules before you install a Python package yourself.

Before you begin

Conda installs virtually all dependencies, even system libraries that are a part of Research Technologies operating system installations. As a result, conda environments take up a large amount of disk space. Named environments (those created with the -n option) are placed in a subdirectory of the .conda directory in your home directory. In order to avoid exceeding your home directory disk quota, Research Technologies recommends you use the -p option to build your environments in your Slate directory, rather than your home directory tree. For information on how to create a Slate account, see About Slate high performance storage for research computation at IU.
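
To see how much space your environments already occupy, a small helper around du can be useful. The following is a minimal sketch, not an official Research Technologies tool: the function name env_usage is hypothetical, ~/.conda/envs is assumed to be the subdirectory where named environments are stored, and /N/slate/username/env_name is a placeholder path.

```shell
# Report the disk space used by a conda environment directory.
# env_usage is a hypothetical helper name used only for this example.
env_usage() {
    # $1: directory to measure
    if [ -d "$1" ]; then
        du -sh "$1"
    else
        echo "no such directory: $1"
    fi
}

# Named environments (created with -n); assumed default location
env_usage "$HOME/.conda/envs"
# An environment created with -p in Slate (placeholder path)
env_usage "/N/slate/username/env_name"
```

If the first number is a large share of your home directory quota, that is a sign the environment belongs in Slate instead.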

Create a conda environment and install packages

Follow the steps below to create a conda environment. Within this environment, you can install and delete as many conda packages as you like without making any changes to the system-wide miniconda module.

Conda will attempt to resolve any conflicting dependencies between software packages and install all dependencies in the environment. For this reason, conda environments can be large.

  1. Check whether your user environment has a version of Python loaded already; on the command line, enter:
    module list

    This displays the modules that are already loaded to your environment; for example:

    [bkyloren@h1 ~]$ module list
    Currently Loaded Modules:
      1) gnu/9.3.0   2) quota/1.8   3) xalt/2.10.34   4) mvapich/2.3.5   5) StdEnv
  2. Miniconda uses Python but prefers its own installation; consequently, if your user environment already has Python added, you first must unload that Python module and then load a miniconda module:
    • To unload the Python module, on the command line, enter:
      module unload python
    • To load a miniconda module, on the command line, enter:
      module load miniconda
  3. Create the environment. Research Technologies recommends creating it in your Slate space: use the -p option to specify the path of the environment, followed by the packages to install; for example (replace username with your IU username, env_name with a name for the environment, and pkg1 pkg2 pkg3 with the packages you want):
    conda create -p /N/slate/username/env_name pkg1 pkg2 pkg3
  4. Activate your conda environment; on the command line, enter (replace /N/slate/username/env_name with the path of the environment you created):
    source activate /N/slate/username/env_name

    Upon activation, the environment's path (or its name, for an environment created with the -n option) is prepended to the command prompt; for example:

    (/N/slate/username/env_name)[bkyloren@h1 ~]$
    Note:

    If you have installed your own local version of miniconda, issuing the conda activate command may prompt you to issue the conda init command. To avoid having to issue the conda init command, use the source activate command instead. UITS Research Technologies recommends not issuing the conda init command, because when it runs, it adds commands to your .bashrc file that stop certain things from working on IU's research supercomputers; in particular, it may break the conda activate command itself. It also makes it impossible to log in to Research Desktop (RED). For more about this issue and a workaround for local miniconda installations, see the Workaround for the conda init command below.

    In 2022, UITS Research Technologies updated the miniconda modules available on IU's research supercomputers to correctly manage conda initialization and base environment activation. Consequently, you can use the conda activate command when one of the UITS-installed miniconda modules is loaded.

  5. Once your conda environment is activated, you can download and install additional packages. The exact command will depend on the programs you are installing (consult the program's documentation); for example:
    • To download and install the latest release of ipyrad, enter:
      conda install -c ipyrad ipyrad
    • To download and install spaCy, enter:
      conda install -c conda-forge spacy

    The -c flag tells conda to install the package from the specified channel. Check your program's documentation to determine the appropriate channel to use. Whenever possible, decide up front which software you want in an environment and list all of the packages when you create it. This lets conda resolve the dependencies of all packages together and ensure there are no conflicts.

    If you need a Python package that is not available through conda, once the conda environment is activated, provided Python was one of the dependencies installed into your environment (which is usually the case), you can use pip to install Python packages in your conda environment:

    pip install python-package-name
  6. To confirm that all of your conda packages are installed, enter:
    conda list

    The packages you installed using conda and all their dependencies should be listed. Depending on your conda version, packages installed with pip may be missing from this list or may appear with pypi in the channel column. To list the packages that pip sees, enter:

    pip list
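
Before running pip install, it can help to confirm that the pip on your PATH belongs to the activated environment and not to another Python installation. The following is a minimal sketch: the function name in_env is hypothetical, and it relies on the CONDA_PREFIX variable, which conda sets to the active environment's path on activation.

```shell
# Succeed only if a command resolves to a file under the given directory.
# in_env is a hypothetical helper name used only for this example.
in_env() {
    # $1: command name, $2: directory the command should live under
    tool_path=$(command -v "$1") || return 1
    case "$tool_path" in
        "$2"/*) return 0 ;;
        *) return 1 ;;
    esac
}

# With an environment active, warn if pip comes from somewhere else.
if [ -n "${CONDA_PREFIX:-}" ] && ! in_env pip "$CONDA_PREFIX"; then
    echo "warning: pip is not inside $CONDA_PREFIX"
fi
```

If the warning appears, Python may not have been installed into the environment, and pip would install packages elsewhere.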

You should now be able to run your program within your conda environment. Enter your program's commands on the conda environment's command line. Remember, you should see your conda environment's path (or name) prepended to the command prompt; for example:

(/N/slate/username/env_name)[bkyloren@h1 ~]$

If you don't see it, most likely you did not activate the environment (see step 4, above).

When you are finished running your program, deactivate your conda environment; enter:

conda deactivate

The command prompt will no longer have your conda environment's name prepended; for example:

[bkyloren@h1 ~]$
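
The whole sequence, from loading the module to deactivating the environment, can be sketched as a single script. This is an illustration under stated assumptions, not a supported tool: SLATE_USER, ENV_NAME, the run helper, and the DRY_RUN switch are all hypothetical names, and pkg1 pkg2 pkg3 are placeholders. With DRY_RUN left at 1, the script only prints the commands so you can review them; the -y option answers conda's confirmation prompt automatically.

```shell
# Preview (or run) the steps above in order.
# All variable names and the run helper are hypothetical.
SLATE_USER="${SLATE_USER:-username}"     # your IU username
ENV_NAME="${ENV_NAME:-env_name}"         # name for the environment
ENV_PREFIX="/N/slate/${SLATE_USER}/${ENV_NAME}"
DRY_RUN="${DRY_RUN:-1}"                  # 1 = print only, 0 = execute

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

run module unload python
run module load miniconda
run conda create -y -p "$ENV_PREFIX" pkg1 pkg2 pkg3
run source activate "$ENV_PREFIX"
run conda list
run conda deactivate
```

To execute the commands instead of printing them, source the script from a bash login session with DRY_RUN=0, so that the module and source commands act on your current shell.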

Activate a previously created conda environment

To run a program you installed in a previously created conda environment:

  1. Activate the conda environment (see step 4, above).
  2. Run your program's commands. (You won't have to install the package each time you activate your environment; it should be installed already.)
  3. When you're finished, deactivate the environment; enter:
    conda deactivate

Alternatively, you can add these commands to a job script and submit them as a batch job; for help writing and submitting job scripts, see Use Slurm to submit and manage jobs on IU's research computing systems.
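
As an illustration of the batch approach, the following sketch writes a small job script to a file. Everything specific in it is an assumption: the file name conda_job.sh, the environment path, the program my_analysis.py, and the #SBATCH values are placeholders, and the account and partition options your system may require are omitted.

```shell
# Write a minimal Slurm job script that runs a program in a conda environment.
# ENV_PREFIX, JOB_SCRIPT, and my_analysis.py are placeholders.
ENV_PREFIX="/N/slate/username/env_name"
JOB_SCRIPT="conda_job.sh"

cat > "$JOB_SCRIPT" <<EOF
#!/bin/bash
#SBATCH --job-name=conda_job
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --time=01:00:00

# Load miniconda, activate the environment, run the program, clean up
module load miniconda
source activate "$ENV_PREFIX"
python my_analysis.py
conda deactivate
EOF

echo "Wrote $JOB_SCRIPT; submit it with: sbatch $JOB_SCRIPT"
```

Review the generated file and adjust the resource requests before submitting it.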

Workaround for the conda init command

The conda init command places code in your .bashrc file that modifies, among other things, the PATH environment variable by prepending to it the path of the base conda environment. The .bashrc file is executed before the default system modules are loaded. When those modules (or any other modules that are loaded at login) are loaded, libraries can be loaded that hide miniconda's libraries. Subsequently, this can cause errors when you use the conda command. On RED, the base miniconda environment has commands or libraries that hide some of those needed to run the RED session, usually causing a bus error when you try to log in after conda init modifies your .bashrc file. In general, it is rarely a good practice to modify PATH in your .bashrc file.

To work around this in local miniconda installations:

  1. Run conda init, and then immediately open .bashrc with a file editor.
  2. Remove the code that was added by conda init and place it in another script file (for example, conda_init.sh).
  3. After the login process completes, run the code in the script file:
    source conda_init.sh

    You should now be able to use conda activate.
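
Step 2 can also be done with sed instead of a file editor. The following is a minimal sketch, assuming the code added by conda init is enclosed between the marker comments "# >>> conda initialize >>>" and "# <<< conda initialize <<<"; check your .bashrc to confirm this before relying on it. The function name extract_conda_init is hypothetical, and the function keeps a backup copy of the original file with a .bak suffix.

```shell
# Move the block written by conda init out of a bashrc file.
# extract_conda_init is a hypothetical helper name used only for this example.
extract_conda_init() {
    bashrc="$1"    # file to clean, for example "$HOME/.bashrc"
    target="$2"    # script that receives the block, for example conda_init.sh
    start='# >>> conda initialize >>>'
    end='# <<< conda initialize <<<'
    # Copy the delimited block (markers included) to the target script
    sed -n "/^${start}/,/^${end}/p" "$bashrc" > "$target"
    # Keep a backup, then rewrite the file without the block
    cp "$bashrc" "${bashrc}.bak"
    sed "/^${start}/,/^${end}/d" "${bashrc}.bak" > "$bashrc"
}

# Example call (not executed here):
#   extract_conda_init "$HOME/.bashrc" "$HOME/conda_init.sh"
```

After running it, confirm that .bashrc no longer contains the conda block and that conda_init.sh does.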

Additional useful conda commands

  • To list the packages installed in the currently active environment (the miniconda module's base environment, if you have not activated one of your own), enter:
    conda list
  • To list all the conda environments you have created, enter:
    conda info --envs
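  • To delete a conda environment, use the following command (replace env_name with the name of the conda environment you want to delete; for an environment created with the -p option, use -p followed by the environment's full path in place of --name env_name):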
    conda env remove --name env_name
  • To share your conda environment with collaborators:
    1. Create and activate your conda environment, and install your package(s).
    2. In your conda environment, run the following command:
      conda env export > environment.yml

      This exports a list of your environment's dependencies to the file environment.yml.

    3. Email the environment.yml file to your collaborators, and direct them to upload the file and run the following command:
      conda env create -f environment.yml

      This downloads the packages listed in the file and builds a matching conda environment in your collaborators' own directory space. From there, they can activate the environment and start running their analyses.
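
The exported file normally ends with a prefix: line that records the path of your own environment, which is of no use to a collaborator. The following sketch filters that line out; the function name strip_prefix and the collaborator's path are hypothetical.

```shell
# Remove the machine-specific "prefix:" line from an exported environment file.
# strip_prefix is a hypothetical helper name used only for this example.
strip_prefix() {
    grep -v '^prefix:'
}

# Usage inside an activated environment (requires conda):
#   conda env export | strip_prefix > environment.yml
# A collaborator could then build the environment in their own Slate space:
#   conda env create -p /N/slate/their_username/env_name -f environment.yml
```

Passing -p to conda env create keeps the new environment out of the collaborator's home directory, consistent with the recommendation in Before you begin.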

Get help

For more about conda, see the conda User Guide.

If you have questions or need help, contact the UITS Research Applications and Deep Learning team.

This is document axgp in the Knowledge Base.
Last modified on 2023-12-17 07:03:11.