Environment types
One of the least intuitive things to grasp when beginning to work with conda recipes is what the
roles of the different environments are. Questions such as "does it go into build: or host:?"
or "what's the difference between these two?" are very common. This page provides a high-level
summary of the environment types and what distinguishes their different roles.
For compiled packages
Although cross-compilation is a somewhat advanced topic, the constraints it imposes are very instructive about why things are as they are. For the purpose of this discussion, the only relevant thing to know is that there are two different platforms in play: the one we're building for, and the one we're building on.
In almost all cases, the difference in platforms is actually "only" a difference in CPU architectures.
For example, we currently do not have native linux-aarch64 builders, so we have to cross-compile packages for this platform from linux-64. That said, it is possible
(both conceptually and in terms of tooling) to do full cross-platform compilation, e.g. building for
osx-arm64 or win-64 from linux-64, though this is very rarely necessary.
In general, packages compiled for one architecture can only run on that CPU architecture (e.g. a
package built for linux-aarch64 can only be executed on that type of machine, but not on linux-64
or linux-ppc64le; for more details see compilation concepts), so we need
to be very precise about separating the necessary ingredients for building a package.
Let's take a simple recipe for mypkg and annotate what happens when we build it for linux-aarch64
(we will say: host_platform: linux-aarch64) on a linux-64 machine (build_platform: linux-64).
requirements:
  build:   # [build time] `linux-64`; where compilers and other tools get executed
    - ${{ stdlib("c") }}     # - translates to `sysroot_linux-aarch64`; mostly the C standard library
    - ${{ compiler("cxx") }} # - translates to `gxx_linux-aarch64`; the actual compiler
    - cmake                  # - regular build tools
    - ninja
  host:    # [build time] `linux-aarch64`, where compile-time dependencies are placed
    - zlib                   # - libraries `libz.so` & `libzstd.so` (and their headers), which are
    - zstd                   #   necessary for compilation (i.e. the linker to find the right symbols)
    - libboost-headers       # - a header-only dependency (may still be architecture-dependent through
                             #   values that got hard-coded during its build)
  run:     # [runtime] `linux-aarch64`; what will be installed alongside mypkg
    # - libzlib              # - dependencies that get injected by "run-exports" (see below);
    # - zstd                 #   note also that the header-only dependency did not inject anything
Let us unpack what is happening here. During the build, there are two environments in play, as
indicated by the presence of two occurrences of the [build time] marker. As a rule of thumb, the
build: environment is where we place binaries that need to be executed (rather than just
present); it is a conda environment, whose path is reachable during the
build using $BUILD_PREFIX.
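As a rough sketch of how a build script sees the two environments (the fallback paths below are made up for illustration; the real values are only set by conda-build / rattler-build during a build):

```shell
# Sketch only: $BUILD_PREFIX and $PREFIX are provided by the build tool;
# the default values here are hypothetical stand-ins.
BUILD_PREFIX=${BUILD_PREFIX:-/opt/conda/envs/mypkg_build_env}
PREFIX=${PREFIX:-/opt/conda/envs/mypkg_host_env}

# Tools (compilers, cmake, ninja, ...) are executed from the build environment:
echo "tools run from:     $BUILD_PREFIX/bin"
# Headers and libraries are found (but not executed) in the host environment:
echo "headers/libs under: $PREFIX/include and $PREFIX/lib"
```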
The host: environment (under $PREFIX) contains dependencies that will be necessary for
building mypkg, for example because we need to find the correct header files when compiling, and
symbols when linking. With very few exceptions, things in host: cannot be executed during the
build phase, because binaries compiled for a different architecture (here linux-aarch64) cannot
run on our linux-64 build machine. An important special case here is python, which is explained
further down.
Finally, the run: environment does not have any role at build time. It specifies which
dependencies need to be present for mypkg to be functional once installed. In many cases, libraries from
the host: environment will inject dependencies into the run: environment. This is a consequence of
the fact that if mypkg depends on a shared library (here: zlib & zstd), these libraries need
to be present both at build time (for the linker to find the symbols therein, and register where to
find them), as well as at runtime (when the dynamic linker goes looking for the symbols that have
been marked as externally provided during the build of mypkg).
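The shared libraries a binary has recorded as externally provided (and which the dynamic linker will therefore go looking for) can be listed with `ldd`; shown here on a system binary for illustration, whereas in a real build one would inspect the artefacts under `$PREFIX`:

```shell
# Lists the shared libraries recorded as needed in the binary; each entry
# must be resolvable at runtime, which is what run: dependencies ensure.
ldd /bin/sh
```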
It's worth noting that the run: environment is never actually created during the build of mypkg.
It is, however, created at test time as part of the check stages in conda-build and rattler-build.
That's why testing packages is essential, even if only to verify that the resulting artefact can actually
be installed as-is. Here, the recipe formats v0 (meta.yaml) and v1 (recipe.yaml) behave slightly
differently:
- v0 (meta.yaml)
- v1 (recipe.yaml)
requirements: # build/host/run as above
  [...]
test:
  requires:   # [runtime] linux-aarch64; the package we built above
    - pytest  # ... plus (optionally) additional test-only dependencies
    - coverage
requirements: # build/host/run as above
  [...]
tests:
  - requirements:
      run:         # [runtime] linux-aarch64; the package we built above
        - pytest   # ... plus (optionally) additional test-only dependencies
        - coverage
      # build:     # not currently necessary in conda-forge, but future use-cases
      #   - ...    # may need this (e.g., if conda-forge ever builds for emscripten)
Native builds
When the architectures between build: and host: match, the situation is simpler, because in that
case, both environments are able to execute code on the machine. Do not be confused by this additional
degree of freedom; the separation into the different roles of build: and host: remains
exactly the same as for cross-compiled builds: things that are only necessary to be executed
(without otherwise affecting the result) are in build:, while compile-time dependencies go into
host:. We will explore the latter some more below, but first need to introduce another mechanism.
Run-exports
As described above, run-exports ensure that shared libraries in host: are also present in run:.
In addition to the mere presence, the ABI tracking will often imply concrete version constraints
based on the version of the library that was present in the host: environment at build time. For
example, zlib has a run-export:
requirements:
  run_exports:
    - ${{ pin_subpackage('libzlib', max_pin='x') }}
For zlib 1.3.1 in our host: environment, pin_subpackage translates to libzlib >=1.3.1,<2.0a0,
which is what packages building against zlib in host: will thus inherit as a run: dependency.
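To illustrate the effect on the first example above (the version bounds here are hypothetical), the rendered metadata of mypkg would end up containing something like:

```yaml
# Hypothetical rendered run: requirements of mypkg; these constraints were
# injected by the run-exports of zlib and zstd in host:, not written by hand.
requirements:
  run:
    - libzlib >=1.3.1,<2.0a0
    - zstd >=1.5.6,<1.6.0a0   # illustrative bounds only
```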
As an aside, to explain why zstd run-exports itself while zlib exports libzlib:
libraries can generally be split into development and runtime components. For example, headers,
package metadata, etc. are not necessary (and may be unwanted) at runtime. In this case, zlib
corresponds to the complete set of files (for development), whereas libzlib contains only the
library itself (i.e. all that's necessary at runtime). Not every library in conda-forge currently
follows such a stringent split though; in particular, zstd doesn't. Therefore, it has to run-export
the only thing available to ensure the library is present at runtime, which is again zstd.
Some select packages (especially compilers and the C standard library) may also contribute to run:
dependencies from the build: environment; these are so-called "strong" run-exports. These have not
been added in the above example for brevity, but look like libgcc / libstdcxx / __glibc etc.
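In a recipe's own run_exports: section, the two flavors can be declared explicitly. A minimal v1-style sketch, with hypothetical package names (note that "weak" run-exports apply when the package sits in host:, while "strong" ones also apply from build:):

```yaml
# Hypothetical v1 recipe fragment; `libmypkg` and `mypkg-runtime` are
# made-up output names used only for illustration.
requirements:
  run_exports:
    weak:
      - ${{ pin_subpackage('libmypkg', max_pin='x') }}
    strong:
      - ${{ pin_subpackage('mypkg-runtime', max_pin='x') }}
```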
ABI tracking
In many ways, the host: environment is the "heart" of a package's dependencies. While the compilers and
build tools in build: (and their versions) can often be changed relatively freely, the packages
in host: imply a much tighter contract, i.e. mypkg depends on the Application Binary Interface
(ABI) of that host dependency, and if this ABI changes, we need to rebuild
mypkg.
Addressing this problem is one of the core goals of conda-forge's infrastructure, as we continuously rebuild
feedstocks if any one of their dependencies releases a version with a new ABI. In particular,
any name-only host: dependency (i.e. without any version or build constraints) that matches one of
the packages in the global pinning will participate in this
orchestration.
This is essential, because otherwise different packages would eventually end up depending on incompatible versions of the same dependencies, preventing them from being installed simultaneously.
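An entry in the global pinning is, in essence, just a version in a conda_build_config.yaml. An illustrative excerpt in that style (the versions shown are examples, not the actual current pins):

```yaml
# Illustrative excerpt in the style of conda-forge-pinning's
# conda_build_config.yaml; versions are examples only.
zlib:
  - "1.3"
zstd:
  - "1.5"
```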
Note that in contrast to the usual way dependencies work transitively (if one installs foo that
depends on bar which depends on baz, then any environment with foo must have baz too), the
ABI tracking in host: is not transitive: if foo declares a host: dependency only on bar,
it is not assumed to depend on the ABI of baz (and would not be rebuilt if baz releases a new
ABI-changing version; only bar would be rebuilt in that case).
This is why the link check at the end of the build plays an essential role. It warns if the package has not declared all of its dependencies (in terms of the libraries the final artefact links against). Undeclared dependencies mean that changes to the ABI of those libraries are not tracked, which may lead to ABI breaks (crashes, etc.) down the line. In conda-build's terminology, this is called "overlinking". You should always address these warnings.
On the other hand, the link check will also warn you if you are "overdepending" on libraries, which
is the case if your package has host: dependencies that aren't actually used. This is less severe
than overlinking, because it "just" means that your package has unnecessarily tight constraints and
may be rebuilt more often than strictly necessary.
Note also that the overdepending warning can have false positives, because the link check cannot
statically determine all the ways that a given library may be loaded. In particular, things that are
only loaded at runtime cannot be determined ahead of time (numpy is an example of this). As a rule
of thumb, if removing the dependency causes the build to break (e.g. because the build process
expects to find the library), you may keep it in host: but try ignoring its run-exports:
- v0 (meta.yaml)
- v1 (recipe.yaml)
build: # top-level key per output, not under `requirements:`!
  ignore_run_exports_from:
    - zlib
  # and / or
  ignore_run_exports:
    - libzlib
requirements:
  ignore_run_exports:
    from_package:
      - zlib
    # and / or
    by_name:
      - libzlib
If this breaks the package (e.g. the tests fail), then you have found a false positive of the overdepending warning, and you should simply ignore the warning.
Because the link check cannot capture all relevant scenarios (also around meta-packages,
compiler infrastructure, etc.), please do not add ignore_run_exports: excessively. In case of doubt,
start a discussion on Zulip.
Interpreted languages
Many packages in conda-forge are aimed at python or R. These languages have an interpreter that
has itself been compiled (e.g. from C/C++), but allows other code (in python/R) to run without
compilation. For packages (like numpy) that have both compiled code that interacts directly with
the python runtime (using python like a library), as well as code that passes through the
interpreter, we are in the situation that:
- the package is exposed to python's ABI because we're compiling against it.
- python gets run during the build (e.g. python -m pip install ...).
So here the situation shifts a little from purely compiled languages. Let's look at numpy's recipe
(slightly simplified):
requirements:
  build:
    - ${{ stdlib('c') }}
    - ${{ compiler('c') }}
    - ${{ compiler('cxx') }} # compilers as usual
    - ninja
    - pkg-config
  host:
    # ABI-relevant
    - libblas
    - libcblas
    - liblapack
    - python
    # interpreted py-libs used during installation
    - cython
    - meson-python
    - pip
    - python-build
  run:
    - python # not shown here: run-export from python that
             # enforces matching minor version as in `host:`
    # - libblas >=3.9.0,<4.0a0
    # - libcblas >=3.9.0,<4.0a0   # run-exports from BLAS/LAPACK packages
    # - liblapack >=3.9.0,<4.0a0
  run_exports:
    - numpy >=${{ default_abi_level }},<3
You can see how the host: section effectively splits into two; the ABI-tracking aspect remains as
above, but we need to put python packages themselves next to their interpreter, otherwise we would
not be able to actually run anything once the build process wants to call into python.
The fact that python is arguably both a host: as well as a build: dependency creates some
obvious issues for cross-compilation. This is explained in
details about cross-compiled Python packages.
What about target_platform?
There is a long history of ambiguous use of terminology related to cross-compilation. From the point-of-view of compiler authors, there's a third architecture that becomes relevant:
- the platform where the artefact is being built ("build")
- the platform where the built artefact will be executed ("host")
- the platform that the built artefact will generate binaries for ("target")
The third point is relevant almost exclusively when building a cross-compiler, because said compiler may have the parameters of the target platform (for which it generates binaries) baked into its own executable. Other cross-compilers leave the target platform as a runtime property that can be configured, but the fact remains that there is potentially a third platform in play.
This most general case is also commonly known as a "Canadian Cross". Over many years, the predominant naming pattern that emerged matches the naming presented next to the bullet points above (e.g. GCC, meson, Debian). However, many toolchains have (either historically or presently) had less of a need or focus for cross-compilation, and may use the same names in different ways.
Even the v0 recipe format falls prey to some inconsistencies, in the sense that the variable
{{ target_platform }} (or $target_platform in build scripts) actually represents the host:.
The v1 recipe format has fixed this, but still allows the old, less accurate, naming for reasons of
compatibility. That said, v1 recipes should always prefer to use host_platform instead of
target_platform. Coming back to the bits from the numpy example (related to python cross-compilation)
that we omitted above, this is how the formulation differs between v0 and v1:
- v0 (meta.yaml)
- v1 (recipe.yaml)
requirements:
  build:
    - python                              # [build_platform != target_platform]
    - cross-python_{{ target_platform }}  # [build_platform != target_platform]
    - cython                              # [build_platform != target_platform]
    - {{ stdlib('c') }}                   # and so forth
requirements:
  build:
    - if: build_platform != host_platform
      then:
        - python
        - cross-python_${{ host_platform }}
        - cython
    - ${{ stdlib('c') }} # and so forth