Recommendation for the safe use of conda
Problem
Anaconda, Inc. changed the terms of service (TOS) for the `defaults` channel in 2020, and did so rather silently. This means that institutions with more than 200 employees (including academic ones) have to pay license fees if they make use of the `defaults` channel (in the worst case this might be $50 per month).
This raised some uncertainty regarding the use of conda.
This document tries to give the needed background knowledge and a guideline that allows you to easily and safely use the free and open part of the conda ecosystem (i.e. to avoid the `defaults` channel).
Solution
The bottom line is: make sure that the `defaults` channel is not used, and use other (equivalent) channels like `conda-forge` instead.
Double check which conda packages are installed
Likely the simplest advice is to double check which packages are installed. This will protect you even from misconfigurations.
- Before even trying to install the software, `conda`/`mamba` update the channel information (`conda` even explicitly shows the channels in use as its first output). Check that the `defaults` channel is not among these channels.
- During installation, the conda packages that will be installed are listed. The second column of this listing gives detailed information in the form `CHANNEL/PLATFORM::PACKAGE`. It is good practice to double check this listing. Alternatively, one can use the `--dry-run` argument.
- `conda` environments can be created from `environment.yaml` files. These files include a `channels` list. Even though a properly configured conda installation (or the right arguments) should ensure that the `defaults` channel is not used, it is advisable to remove the `defaults` channel from this list if it is included.
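Removing `defaults` from the `channels` list of an `environment.yaml` file can be sketched in Python as follows. This is a minimal sketch, not a full YAML parser: it assumes a block-style list (one `- name` item per line) under a top-level `channels:` key, and the function name `drop_defaults_channel` is made up for this illustration.

```python
def drop_defaults_channel(yaml_text: str) -> str:
    """Return the text without `- defaults` entries in the top-level `channels:` list.

    Assumption: the list is written in block style, one `- name` item per line.
    """
    kept = []
    in_channels = False
    for line in yaml_text.splitlines():
        stripped = line.strip()
        if line.startswith("channels:") and stripped == "channels:":
            # Start of the top-level channels block.
            in_channels = True
        elif in_channels and stripped.startswith("-"):
            # List item: drop trailing comments and quotes before comparing.
            item = stripped[1:].split("#", 1)[0].strip().strip("'\"")
            if item == "defaults":
                continue  # skip this line, i.e. remove the entry
        elif in_channels and stripped and not stripped.startswith("#"):
            # Any other non-empty, non-comment line ends the channels block.
            in_channels = False
        kept.append(line)
    trailing_newline = "\n" if yaml_text.endswith("\n") else ""
    return "\n".join(kept) + trailing_newline
```

Entries in other lists (for example a dependency that happens to be called `defaults`) are left untouched, since only lines inside the `channels:` block are inspected.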
If an installation still uses the `defaults` channel, double check your configuration.
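Checking the `CHANNEL/PLATFORM::PACKAGE` entries of an installation listing can also be automated. The following Python sketch assumes that the listing has been copied into a string; the helper names and the package names in the usage example are made up, and the subchannel list is the one given in the background section of this document.

```python
# Subchannels of the defaults channel as listed in this document.
DEFAULTS_SUBCHANNELS = (
    "pkgs/main", "pkgs/r", "pkgs/pro", "pkgs/msys2", "pkgs/free", "pkgs/archive",
)


def channel_of(spec):
    """Extract CHANNEL from a CHANNEL/PLATFORM::PACKAGE string (None if no `::`)."""
    location, separator, _package = spec.partition("::")
    if not separator:
        return None
    # The platform is the last path component; the channel may itself contain "/".
    channel, _slash, _platform = location.rpartition("/")
    return channel or location


def packages_from_defaults(listing):
    """Return all CHANNEL/PLATFORM::PACKAGE tokens that point at the defaults channel."""
    flagged = []
    for line in listing.splitlines():
        for token in line.split():
            channel = channel_of(token)
            if channel == "defaults" or channel in DEFAULTS_SUBCHANNELS:
                flagged.append(token)
    return flagged
```

For a listing with one (hypothetical) entry `conda-forge/linux-64::foo-1.0-0` and one entry `pkgs/main/linux-64::bar-2.0-0`, only the second one is returned by `packages_from_defaults`.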
Which conda distribution should be used
- The recommendation is to use `miniforge`, which uses the `conda-forge` channel by default. This channel is community maintained, free to use and only contains open source software.
- `micromamba` is a good alternative for users that prefer `mamba` (but `mamba` is also included in `miniforge`).
- We advise against the use of `miniconda`. Even if `conda` installed via `miniconda` can be configured to ignore the `defaults` channel, the `defaults` channel will be used for the construction of the `base` environment.
- Using `Anaconda` is clearly not an option since all software comes from the `defaults` channel.
Switching conda distribution
In order to remove an old conda distribution (like miniconda/anaconda) you need to:
- Remove the installation directory (e.g. `~/miniconda3/`). Note that this typically includes your environments, which you might want to keep (see below).
- Make sure that your shell's configuration file does not contain any traces of the old conda distribution, i.e. a block starting with `# >>> conda initialize >>>` and ending with `# <<< conda initialize <<<`, or any `export` statements or modifications of the `PATH` variable (only those that add conda paths). Potentially this can be done automatically with `conda init --reverse`.
- The official documentation also contains a guide to transitioning from `defaults` to `conda-forge`.
Check existing `conda` environments
- Activate the conda environment and execute `conda list`. The last column shows the `conda` channel for each installed package. If this column does not include the `defaults` channel or any of its subchannels (see below), the environment is not affected.
- Alternatively, a more programmatic approach is to check `grep "\"url\": \"https://repo.anaconda.com" $ENVDIR/conda-meta/*json` (where `$ENVDIR` is the directory where the environment is installed).
- If the `defaults` channel has been used, the environment should be re-created using only packages from proper channels.
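The `grep` check above can also be written as a small Python function, which is convenient if many environments have to be inspected. This is a sketch under the assumption that each file in `conda-meta/` is a JSON object with a `url` field (which is what the `grep` pattern relies on) and a `name` field; the function name is made up for this example.

```python
import json
from pathlib import Path

# Base URL matched by the grep command in the text above.
ANACONDA_REPO = "https://repo.anaconda.com"


def packages_from_anaconda_repo(env_dir):
    """Return (name, url) pairs for packages whose url starts with ANACONDA_REPO."""
    flagged = []
    for meta_file in sorted(Path(env_dir, "conda-meta").glob("*.json")):
        with meta_file.open(encoding="utf-8") as handle:
            record = json.load(handle)
        url = record.get("url") or ""
        if url.startswith(ANACONDA_REPO):
            # Fall back to the file name if the record has no name field.
            flagged.append((record.get("name", meta_file.stem), url))
    return flagged
```

An empty result for an environment directory means that none of the recorded package URLs point to `https://repo.anaconda.com`. A non-empty result lists the packages that should be replaced when the environment is re-created.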
`conda` vs `mamba`
- Both are free to use (of course only if installed from a free source).
- `mamba` does not default to the `defaults` channel.
- `conda` does not default to the `defaults` channel when installed via miniforge.
- Independent of the choice, you need to double check the configuration if you have previously used `conda`/`mamba` or if you install conda environments using exported yaml files.
- Nowadays most features of `mamba` (in particular the solver) have been integrated in `conda`.
- Nowadays `conda` and `mamba` are nearly equivalent, i.e. `mamba` covers most of `conda`'s functionality and `conda` uses `mamba`'s solver by default. Some aspects of `mamba` might be slightly faster. You can use aliases to retain script compatibility.
Check your configuration
- Make sure that `conda config --show-sources` / `conda config --show channels` does not show the `defaults` channel.
- If you find the `defaults` channel, remove it by executing `conda config --remove channels defaults`. If desired, you can add `conda-forge` like so: `conda config --add channels conda-forge`
- Just to be sure, one can explicitly forbid the usage of these channels: `conda config --append denylist_channels defaults`, `conda config --append denylist_channels main`, `conda config --append denylist_channels r`
- Make sure that `override_channels_enabled` is enabled: `conda config --show override_channels_enabled` should show `True`. If needed, set it with `conda config --set override_channels_enabled True`.
- It might also be a good idea to add `nodefaults` to the channel list, even if it is documented as equivalent to `override_channels_enabled`.
- If preferred, one can specify `--override-channels --channel conda-forge` for `conda install`/`conda create` to achieve the same effect. Additional channels can be added by listing more `--channel ZYZ` arguments.
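The first check of this list can be scripted, e.g. for use in a CI job. The Python sketch below rests on an assumption that should be verified for your conda version: that `conda config --show channels --json` prints a JSON object with a `channels` list. The function names are made up for this illustration, and `read_channel_config` is only defined here, not called.

```python
import json
import subprocess


def read_channel_config():
    """Return the output of `conda config --show channels --json`.

    Assumption: conda is on the PATH and supports `--json` for this command.
    """
    result = subprocess.run(
        ["conda", "config", "--show", "channels", "--json"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def defaults_entries(config_json):
    """Return the configured channels that refer to the defaults channel."""
    channels = json.loads(config_json).get("channels", [])
    return [
        channel for channel in channels
        # Either the channel alias or an explicit URL of Anaconda's repository.
        if channel == "defaults" or channel.startswith("https://repo.anaconda.com")
    ]
```

Calling `defaults_entries(read_channel_config())` should then return an empty list for a properly configured installation. Note that this only inspects the configuration. It does not replace the check of the packages that are actually installed.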
Details and background
Software (and in particular scientific software) is often difficult to install. Additionally, reproducible science requires independent installations of multiple versions of a software. `anaconda` aims to provide easy installation of important scientific software with proper version management. To this end the `conda` package manager has been created. It provides the possibility to install software independent of the programming language (it covers much more than only Python and R) and operating system (it covers Linux, Mac and Windows).
At the source of the conda ecosystem there are conda recipes, which contain the metadata, requirements and installation instructions that are needed to build a package. Packages are pre-built archives containing software, metadata, and information on dependencies, ready to be installed into a conda environment. These packages are stored in channels, i.e. repositories (online or local) where conda packages are stored and made available for installation. Many channels are free and contain only free and open source software, most importantly `conda-forge` and `bioconda`.
The `defaults` channel consists of conda packages that are maintained by Anaconda, Inc. It consists of several subchannels: `pkgs/main`, `pkgs/r`, `pkgs/pro`, `pkgs/msys2`, `pkgs/free`, `pkgs/archive`.
Most conda channels are currently hosted by Anaconda Inc (if one is looking for risks in using `conda`, this might be one, since it is a single point of failure).
Locally, conda packages can be installed into environments, which then contain the package and its requirements.
A `conda` distribution is an installable software that provides `conda` (and all the requirements needed to run conda), e.g. `miniforge`.
Further discussion
Other channels
Software can be dangerous. Hence, besides licensing questions, installing software requires trust in the source of the software. Conda's search functionality allows you to find software in various channels. It is unwise to blindly install software from all these channels.
It seems to be a good idea to rely on community maintained channels that include only free software. Examples of such channels are `conda-forge` and `bioconda`.
The larger picture
Beyond the problem with Anaconda's `defaults` channel, it is worth mentioning that reproducible science currently relies to a significant extent on the ability to freely use services offered by companies, see discussion here.
It is easy to blame Anaconda Inc for the TOS change. A bit of appreciation might be appropriate, although the communication certainly could have been better.
Also note that Anaconda Inc. continues to provide substantial resources for free to everyone (and in particular the scientific community).
- The software `conda` (which is also maintained by Anaconda, Inc) is still (and will always be) free to use and open source.
- Hosting of the packages of all `conda` channels. The open source community currently has no possibility to host these packages elsewhere (see).
- Also note that part of the revenues are invested back in open source projects.
Note that Helmholtz (HiFiS) made some important steps toward scientific infrastructure, e.g. by providing container registries. Probably more investments of governmental and scientific bodies in open source and scientific infrastructure are needed.
Discussion of other documents
Recently UFZ and HiFiS published recommendations.
While these documents are certainly important to raise awareness, they contain a few points that are worth discussing:
The company Anaconda provides very convenient software development environments and libraries for Python and R, especially for beginners. Due to their many years of free availability for research and educational institutions, they are also very popular in training and in the discussion on StackOverflow.
Partially correct, since the software is not provided by Anaconda, but only distributed. All (or almost all) software that is distributed in the `defaults` channel is FOSS software and is distributed in many other (usually free) ways, e.g. `conda-forge`.
Furthermore, it's important to highlight that these statements only apply to the software in the `defaults` channel. The possibilities offered by `conda` and channels like `conda-forge` go far beyond this.
- Proper versioned software environments for all sorts of interpreted and compiled software.
- An easy solution for reproducible science that can be used in large scale production ready deployments.
These fees are also due if employees of the respective institutions access the Anaconda software directories.
This could have been more precise, i.e. this only applies to the `defaults` channel (and its subchannels) and not (in general) to the other channels hosted by Anaconda Inc.
mamba instead of conda
As discussed above, with minimal effort, both can be used without using the problematic `defaults` channel.
Miniforge instead of conda-forge
This mixes two different concepts. `miniforge` is a conda distribution and `conda-forge` is a conda channel. Both are part of the solution.
All other recommendations only apply to python (and are therefore no full alternative) or are unrelated to the actual question.
The Helmholtz Open Science newsletter also recently talked about this topic:
Users working in non-commercial or scientific organizations were able to use the software free of charge and were convinced that Anaconda was FOSS – Free and Open Source Software. In fact, this is not the case.
In fact, all software that is distributed in Anaconda's `defaults` channel is FOSS software. The channel is only a non-free means of distributing the software in a way that is much more convenient than anything that existed before `conda` was available.
Nowadays, with `conda-forge`, equally convenient and free ways are available.
The Jülich Supercomputing Centre's RSE Team has also uploaded a recommendation. Notably, the JSC typically encourages the use of virtual environments that are implemented in Python rather than a package manager such as `conda`, since this approach is more compatible with EasyBuild (including user modules).