Install packages in a conda environment on IU's high performance computers

Overview

Conda is an open source package manager, similar to pip, that makes installing packages and their dependencies easier. Unlike pip, conda is also an environment manager, similar to virtualenv. Package managers are especially helpful in high performance computing settings because they let users install packages and their dependencies locally with a single command.

The following instructions describe how to create and activate a conda environment and install packages in your home directory space on any of the research supercomputers at Indiana University.

Create a conda environment and install packages

Follow the steps below to create a conda environment. Within this environment, you can install and delete as many conda packages as you like without making any changes to the system-wide Anaconda module.

Conda will attempt to resolve any conflicting dependencies between software packages and install all dependencies in the environment. For this reason, conda environments can be large.

  1. Check whether your user environment has a version of Python loaded already; on the command line, enter:
    module list
    

    This displays the modules that are already loaded to your environment; for example:

    [bkyloren@h1 ~]$ module list
    Currently Loaded Modulefiles:
    1) gcc/6.3.0      3) quota/1.6    5) xalt/2.10.30   7) python/3.8.2
    2) intel/19.0.5   4) git/2.13.0   6) core
    
  2. Anaconda uses Python but prefers its own installation; consequently, if a Python module is already loaded in your user environment, you must first unload that Python module and then load an Anaconda module:
    • To unload the Python module, on the command line, enter:
      module unload python 
      
    • To load an Anaconda module, on the command line, enter:
      module load anaconda
      
  3. Create a conda environment using one of the following commands. Replace env_name with any name you want for the environment, and replace pkg1 pkg2 pkg3 with the name(s) of the package(s) you want to install.
    • Create the environment in your home directory: The following saves the conda environment in your home directory space; for example, on Carbonate, it will be saved to ~/.conda/envs.
      conda create -y -n env_name pkg1 pkg2 pkg3
      
    • Create the environment in another directory: If you are using multiple conda environments, consider installing them in your Slate (scratch space) directory. To install the conda environment to another directory or project space, use the -p option instead of -n to specify the file path; for example:
      conda create -y -p /filepath/env_name pkg1 pkg2 pkg3
      
  4. Activate your conda environment; on the command line, enter (replace env_name with the name of the environment you created or, if you created it in a non-default location, with its full path):
    source activate env_name
    

    Upon activation, the environment name (for example, env_name) will be prepended to the command prompt; for example:

    (env_name)[bkyloren@h1 ~]$
    
    Note:

    More recent Anaconda distributions will tell you to use the command conda activate instead of source activate to activate your newly created environment. If you use conda activate, you will be prompted to run conda init. Do not run conda init. Loading one of our Anaconda modules already gives you the base conda environment for that Anaconda version, whereas conda init tries to manage activation of the base environment itself by modifying your .bashrc file.

    Even if you have installed your own local version of Anaconda or miniconda, do not use conda init. When conda init runs, it places commands into your .bashrc file that will stop certain things from working on the system; in particular, it will break the conda activate command itself. It will also make it impossible to log into Research Desktop (RED). For more about why this happens and a workaround for local Anaconda or miniconda installations, see Workaround for the conda init command.

  5. Once your conda environment is activated, you can download and install additional packages. The exact command will depend on the programs you are installing (consult the program's documentation); for example:
    • To download and install the latest release of ipyrad, enter:
      conda install -c ipyrad ipyrad
      
    • To download and install spaCy, enter:
      conda install -c conda-forge spacy
      

    The -c flag tells conda to install the package from the specified channel. Check your program's documentation to determine the appropriate channel. When possible, decide on all the software you want in an environment ahead of time and list every package when you create the environment; this lets conda resolve dependencies for all of the packages together and avoid conflicts.

    If you need a Python package that is not available through conda, you can install it with pip after activating your conda environment, provided Python was installed into the environment as a dependency (which is usually the case):

    pip install python-package-name
    
  6. To confirm that all of your conda packages are installed, enter:
    conda list
    

    The packages you installed using conda and all their dependencies should be listed. Any packages installed with pip will not be included. To list packages installed with pip, enter:

    pip list

You now should be able to run your program within your conda environment. Enter your program's commands on the conda environment's command line. Remember, you should see your conda environment's name prepended to the command prompt; for example:

(env_name)[bkyloren@h1 ~]$

If you don't see your conda environment's name, most likely you did not activate the environment (see step 4, above).

When you are finished running your program, deactivate your conda environment; enter:

source deactivate

The command prompt will no longer have your conda environment's name prepended; for example:

[bkyloren@h1 ~]$
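
Steps 1 through 4 above can be collected into a single shell function. The following is a minimal sketch, not an official IU script: the names has_python_module, setup_conda_env, and my_env are hypothetical, and it assumes the module and conda commands behave as described in the steps above.

```shell
#!/bin/bash
# Sketch: create and activate a conda environment following steps 1-4.
# has_python_module and setup_conda_env are illustrative names, not conda commands.

# Succeed if the given `module list` output mentions a python module.
has_python_module() {
    printf '%s\n' "$1" | grep -Eq '(^|[[:space:]])python/'
}

setup_conda_env() {
    local env_name="$1"
    shift   # remaining arguments are package names

    # Step 1: `module list` writes to stderr on many systems, so capture both streams.
    local loaded
    loaded="$(module list 2>&1)"

    # Step 2: unload a preloaded Python module, then load Anaconda.
    if has_python_module "$loaded"; then
        module unload python
    fi
    module load anaconda

    # Step 3: create the environment in the default location.
    conda create -y -n "$env_name" "$@"

    # Step 4: activate with source activate (not conda activate).
    source activate "$env_name"
}

# Example (on a system where the module command is available):
#   setup_conda_env my_env pkg1 pkg2 pkg3
```

Listing every package in the one call to setup_conda_env lets conda resolve all dependencies together when the environment is created.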

Activate a previously created conda environment

To run a program you installed in a previously created conda environment:

  1. Activate the conda environment (see step 4, above).
  2. Run your program's commands. (You won't have to install the package each time you activate your environment; it should be installed already.)
  3. When you're finished, deactivate the environment; enter:
    source deactivate
    

Alternatively, you can add these commands to a job script and submit them as a batch job; for help writing and submitting job scripts, see Use Slurm to submit and manage jobs on high performance computing systems.
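
A job script along these lines wraps the same activate, run, and deactivate sequence. This is a sketch under stated assumptions: the job name, resource requests, time limit, environment name my_env, and program my_analysis.py are all placeholders, and your system may require additional directives (for example, an account or partition).

```shell
#!/bin/bash
#SBATCH --job-name=conda_job
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --time=00:30:00
#SBATCH --output=conda_job_%j.log

# Placeholder names; replace with your own environment and program.
ENV_NAME="my_env"
PROGRAM=(python my_analysis.py)

run_in_env() {
    # Swap the default Python module for Anaconda, as in the interactive steps.
    module unload python
    module load anaconda

    source activate "$ENV_NAME"
    "${PROGRAM[@]}"
    local status=$?
    source deactivate
    return "$status"
}

# Only run where the module command exists (that is, on the cluster).
if type module >/dev/null 2>&1; then
    run_in_env
else
    echo "module command not found; submit this script with sbatch on the cluster" >&2
fi
```

Saving the program's exit status before deactivating means the job still reports a failure if the program itself fails.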

Workaround for the conda init command

The conda init command places code in your .bashrc file that, among other things, modifies the PATH environment variable by prepending the path of the base conda environment to it. This occurs before the default system modules are loaded. IU's HPC systems often load a Python module as one of the default modules; after you log in, this default module's Python and site packages will hide those in the Anaconda module and cause dependency version conflict errors.

Other modules may also have libraries that will hide Anaconda libraries and cause errors. On Research Desktop (RED), the base Anaconda environment has commands or libraries that hide some of those needed to run the Research Desktop session, usually causing a bus error when you try to log in after conda init modifies your .bashrc file.

To work around this in local Anaconda or miniconda installations:

  1. Run conda init, and then immediately open .bashrc with a file editor.
  2. Remove the code that was added by conda init and place it in another script file (for example, conda_init.sh).
  3. After the login process completes, run the code in the script file:
    source conda_init.sh
    

    You should now be able to use conda activate.
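
Steps 1 and 2 can also be done with sed rather than by hand. The sketch below assumes that conda init wrapped its additions between the marker comments "# >>> conda initialize >>>" and "# <<< conda initialize <<<"; confirm that your .bashrc contains these markers before relying on it. The function name move_conda_init is hypothetical.

```shell
#!/bin/bash
# Sketch: move the block that conda init added to a startup file into a
# separate script. Assumes the marker comments shown below are present.

move_conda_init() {
    local rc_file="${1:-$HOME/.bashrc}"
    local out_file="${2:-$HOME/conda_init.sh}"
    local start='# >>> conda initialize >>>'
    local end='# <<< conda initialize <<<'

    # Do nothing if the opening marker is missing.
    if ! grep -qF "$start" "$rc_file"; then
        echo "No conda init block found in $rc_file" >&2
        return 1
    fi

    # Copy the marked block into the new script.
    sed -n "/$start/,/$end/p" "$rc_file" > "$out_file"

    # Delete the block from the startup file, keeping a backup with a .bak suffix.
    sed -i.bak "/$start/,/$end/d" "$rc_file"
}

# Example:
#   move_conda_init ~/.bashrc ~/conda_init.sh
#   source ~/conda_init.sh    # after the login process completes
```

The backup copy lets you restore the original .bashrc if the result is not what you expected.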

Additional useful conda commands

  • To check which packages are available in an Anaconda module, enter:
    conda list
    
  • To list all the conda environments you have created, enter:
    conda info --envs
    
  • To delete a conda environment, use (replace env_name with the name of the conda environment you want to delete):
    conda env remove --name env_name 
    
  • To share your conda environment with collaborators:
    1. Create and activate your conda environment, and install your package(s).
    2. In your conda environment, run the following command:
      conda env export > environment.yml 
      

      This exports a list of your environment's dependencies to the file environment.yml.

    3. Email the environment.yml file to your collaborators, and direct them to upload the file and run the following command:
      conda env create -f environment.yml
      

      This recreates the conda environment in their own directory space by downloading and installing the listed packages. From there, they can activate the environment and start running their analyses.
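
To activate a shared environment, collaborators need its name, which the exported file records in a top-level name: key. The following sketch reads that key, creates the environment, and activates it; env_name_from_yml and recreate_env are hypothetical helper names, and the sketch assumes an Anaconda module is already loaded.

```shell
#!/bin/bash
# Sketch: recreate a shared environment and activate it by the name stored
# in the exported file. Both function names are illustrative.

env_name_from_yml() {
    # Print the value of the first top-level "name:" key.
    sed -n 's/^name:[[:space:]]*//p' "$1" | head -n 1
}

recreate_env() {
    local yml="${1:-environment.yml}"
    local name
    name="$(env_name_from_yml "$yml")"

    # Activate only if the environment was created successfully.
    conda env create -f "$yml" && source activate "$name"
}

# Example (after loading the Anaconda module):
#   recreate_env environment.yml
```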

Get help

For more about conda, see the conda User Guide.

If you have questions or need help, email NCGAS.

This is document axgp in the Knowledge Base.
Last modified on 2021-11-08 17:23:37.