Clang
Clang is a C/C++/Objective-C/Objective-C++, CUDA, and RenderScript language front-end for the LLVM project with a focus on standards, code correctness, and useful diagnostic messages to ease compile-time troubleshooting.
Clang can be used as an alternative to GCC. Simple usage is calling Clang instead of GCC, despite the latter providing more than a simple compiler front-end. The GCC toolchain also provides the system's C++ library, unwinder, OpenMP, and runtime libraries and sanitizers. It's possible to use Clang with LLVM's alternatives in place of the GCC provided libraries. These are part of the fundamental building blocks of a Linux system and toolchain.
Furthermore, it's possible to use a fully self contained LLVM toolchain by calling the LLVM alternatives to GNU binutils, thus bypassing nearly all the GNU toolchain with the exception of the C library which is usually provided by GNU glibc.
Compatibility details for C, Objective-C, C++, and Objective-C++ are documented on the Clang compatibility website.
Prerequisites
One of the goals of the Clang project is to be compatible with code written with GCC as the target compiler. Using Clang system wide is experimental and comes with associated risks, such as packages requiring GCC specific functions, failures to build correctly, or successful compilations but having issues when executed. These can be reported in the system wide Clang bugtracker. In these events, it's necessary to use GCC as a fallback.
How one compiler toolchain functions compared to another is well beyond the scope of this article, but understanding a few key differences is important to a Gentoo user wanting to daily drive the Clang and/or LLVM toolchain system wide.
Important differences when compared to GCC
- Clang doesn't support some GCC extensions like nested functions. This is the main reason Clang is not able to compile sys-libs/glibc although currently a lot of work is happening to make glibc alternate toolchain friendly.
- GCC defaults to -ftrapping-math, while Clang defaults to -fno-trapping-math.
- GCC doesn't need a separate library installed for PGO and is self sufficient. Clang requires llvm-runtimes/compiler-rt-sanitizers[profile orc] before enabling the pgo USE flag for packages.
- GCC interposes code via -fsemantic-interposition by default. Clang does interprocedural optimizations by default, but -fno-semantic-interposition allows further interprocedural optimizations if the code allows it.[1]
Minor differences to GCC
- GCC defaults to -ffp-contract=fast while Clang defaults to -ffp-contract=on. Unless wanting to match GCC's slightly riskier behavior, there should be no issue with Clang's safer default.
- Until version 12, GCC didn't run vector optimizations at -O2 or lower. Clang runs vector optimizations at all levels greater than -O1 except level -Oz, which only runs the SLP vectorizer.[2] Though unlikely to cause issues, this is currently relevant since sys-devel/gcc:11 is still in ::gentoo.
- The LTO phases of the two compilers function drastically differently, and detailing them is outside the scope of this article. What works with GCC may or may not work for Clang, and vice versa.
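The floating-point defaults listed above can be spelled out explicitly when a package turns out to be sensitive to them. The following is a minimal sketch, assuming the flags live in /etc/portage/make.conf and that matching GCC's behavior is actually wanted; the -O2 -march=native baseline is simply the one used in this article's other examples.

```shell
# Sketch: make Clang mimic the GCC floating-point defaults described above.
# GCC defaults:   -ftrapping-math    -ffp-contract=fast
# Clang defaults: -fno-trapping-math -ffp-contract=on
COMMON_FLAGS="-O2 -march=native -ftrapping-math -ffp-contract=fast"
CFLAGS="${COMMON_FLAGS}"
CXXFLAGS="${COMMON_FLAGS}"
```

Leaving these flags out keeps Clang's safer defaults, which is usually the better choice.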
Clang USE flags
USE flags for llvm-core/clang C language family frontend for LLVM
+debug | Enable extra debug codepaths, like asserts and extra output. If you want to get meaningful backtraces see https://wiki.gentoo.org/wiki/Project:Quality_Assurance/Backtraces
+extra | Build extra tools (clangd, clang-tidy and a few more)
+pie | Build programs as Position Independent Executables (a security hardening technique)
+static-analyzer | Install the Clang static analyzer
doc | Add extra documentation (API, Javadoc, etc). It is recommended to enable per package instead of globally
ieee-long-double | Use accelerated 128-bit IEEE long double ABI (ppc64le only)
test | Enable dependencies and/or preparations necessary to run tests (usually controlled by FEATURES=test but can be toggled independently)
verify-sig | Verify upstream signatures on distfiles
xml | Add support for XML files
Installation
After choosing USE flags, run:
root # emerge --ask --update --deep --changed-use llvm-core/clang
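USE flags for a single package are usually chosen in package.use before running the command above. The following is a minimal sketch, assuming /etc/portage/package.use is a directory; the function name clang_package_use and the particular flag selection are only illustrative.

```shell
# clang_package_use: print a package.use entry for llvm-core/clang.
# The flags are examples taken from the USE flag table in this article;
# a leading "-" disables a flag.
clang_package_use() {
    printf '%s\n' 'llvm-core/clang extra static-analyzer -doc'
}
```

The output can be redirected into place, for example with `clang_package_use > /etc/portage/package.use/clang`.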
Configuration
There are many possible configurations. This section assumes the user wants to use Clang to compile some packages. It is unrelated to packages using LLVM or Clang for their libraries.
Users may wish to default to Clang and selectively use GCC or vice-versa.
There are two ways to do this:
- System wide using /etc/portage/make.conf or,
- via environment variables like the one(s) created for the GCC fallback.
Basic setup
This example is for defaulting to Clang but using GCC per-package for those which fail to build with Clang.
Clang environment
package.env method
Clang versions prior to 14.0.0 did not have a default-pie option similar to gcc. Prior versions would need -fPIC in CFLAGS and -pie in LDFLAGS. Users should clean up any stale references in *FLAGS.
Users will still need GCC to build some packages like glibc or wine. Gentoo maintains a tracker bug (bug #408963) for packages that fail to build with Clang.
Configuring Gentoo to use Clang system wide is simple. Change the CC and CXX variables in /etc/portage/make.conf to reference the Clang equivalents. No further configuration is necessary.
# Normal settings here
COMMON_FLAGS="-O2 -march=native"
CFLAGS="${COMMON_FLAGS}"
CXXFLAGS="${COMMON_FLAGS}"
CC="clang"
CPP="clang-cpp" # necessary for xorg-server and possibly other packages
CXX="clang++"
AR="llvm-ar"
NM="llvm-nm"
RANLIB="llvm-ranlib"
Alternatively, the same contents could be put in e.g. /etc/portage/env/compiler-clang. This would allow using Clang on a per package basis by invoking the compiler-clang environment file if desired:
app-foo/bar compiler-clang
app-bar/baz compiler-clang
GCC fallback environment
Create a configuration file with a set of environment variables using Portage's built in /etc/portage/env directory. This will override any defaults for any packages that fail to compile with clang.
The name used below is just an example, so feel free to choose whatever name is desired for the fallback environment. Be sure to substitute the chosen name for the example names used in this article.
The most basic example is:
COMMON_FLAGS="-O2 -march=native"
CFLAGS="${COMMON_FLAGS}"
CXXFLAGS="${COMMON_FLAGS}"
LDFLAGS="-Wl,--as-needed"
CC="gcc"
CXX="g++"
CPP="gcc -E"
AR="ar"
NM="nm"
RANLIB="ranlib"
In the event GCC should be used as the fallback environment, set the appropriate flags in the /etc/portage/package.env file:
# Compiled using GCC with no link-time optimization since package baz fails using lto
app-bar/baz compiler-gcc
# Compiled using GCC with link-time optimization since package bar compiles using lto
app-foo/bar compiler-gcc-lto
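Entries like these accumulate as more packages need a fallback. The following is a minimal sketch of a helper that appends an entry only when it is not already present, assuming /etc/portage/package.env is a single file rather than a directory; the function name add_pkg_env is hypothetical.

```shell
# add_pkg_env FILE ATOM ENV_NAME
# Append "ATOM ENV_NAME" to FILE unless that exact line already exists.
add_pkg_env() {
    file=$1
    line="$2 $3"
    # -x matches whole lines only, -F treats the entry as a fixed string
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}
```

For example, `add_pkg_env /etc/portage/package.env app-bar/baz compiler-gcc` adds the first entry shown above and does nothing when run a second time.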
Extended examples
Adjust the following /etc/portage/env entries to suit the desired needs, such as enabling/disabling link-time optimizations, alternative AR, NM, RANLIB, and so on.
LTO
For enabling LTO:
# $N is the number of threads used during LTO; one option is to set it to the output of nproc
CFLAGS="-flto=$N -march=native -O2 -pipe"
CXXFLAGS="${CFLAGS}"
CC="gcc"
CXX="g++"
CPP="gcc -E"
AR="ar"
NM="nm"
RANLIB="ranlib"
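The $N placeholder above has to end up as a concrete number in the file. The following is a minimal sketch, assuming the env file is generated from a shell rather than edited by hand; the function name gen_gcc_lto_env is hypothetical, and the settings are copied from the example above.

```shell
# gen_gcc_lto_env [THREADS]
# Print the body of a GCC LTO env file with the thread count filled in.
# Falls back to the output of nproc when THREADS is omitted.
gen_gcc_lto_env() {
    threads=${1:-$(nproc)}
    # \${CFLAGS} is escaped so it reaches the file unexpanded
    cat <<EOF
CFLAGS="-flto=${threads} -march=native -O2 -pipe"
CXXFLAGS="\${CFLAGS}"
CC="gcc"
CXX="g++"
CPP="gcc -E"
AR="ar"
NM="nm"
RANLIB="ranlib"
EOF
}
```

Running `gen_gcc_lto_env > /etc/portage/env/compiler-gcc-lto` would write the file with the local CPU count.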
GCC fallback option (without LTO)
CFLAGS="-march=native -O2 -pipe"
CXXFLAGS="${CFLAGS}"
CC="gcc"
CXX="g++"
CPP="gcc -E"
AR="ar"
NM="nm"
RANLIB="ranlib"
This copies the current working GCC config from make.conf, in the event it needs to be used as a fallback.
When choosing to use LLVM's implementation of AR, NM, and RANLIB as detailed later in the article, be sure to set them back to the GNU versions for the GCC fallback environments as shown in the above example.
When not using LTO, the AR, NM, and RANLIB variables can be ignored. When continuing to use link-time optimization, it's a good idea to have two separate environments like the above examples.
In the event the GCC fallback environment is needed, set the appropriate flags in the /etc/portage/package.env file:
# Compiled using GCC with link-time optimization since package bar compiles using lto
app-foo/bar compiler-gcc-lto
# Compiled using GCC with no link-time optimization since package baz fails using lto
app-bar/baz compiler-gcc
Advanced usage
This section discusses more advanced configurations.
Bootstrapping the Clang toolchain
For a "pure" Clang toolchain, one can build the whole LLVM stack using itself. This is unnecessary but users may choose to do it for fun. This is detailed in a subpage: Clang/Bootstrapping.
LLVM profiles
The LLVM profiles in Gentoo are experimental and intended for playing around with pure-LLVM systems (no GCC).
Most people do not want these even if choosing to use Clang to build most packages.
They come with no guarantees of support or stability. They are not simply the same as setting CC and CXX for Clang. The LLVM profiles use libcxx which means they're ABI-incompatible with the regular profiles using libstdc++.
See also the following bugs:
- LLVM profiles: rename libcxx-using profiles to include libcxx in the name - bug #944478
- LLVM profiles: add separate libstdc++ profiles - bug #944482
- LLVM profile links should have a warning above them - bug #944483.
Desktop LLVM profiles
Desktop profiles for LLVM can be created by following Clang/Desktop_profile at one's own risk.
Using libcxx
Using libcxx / libc++ breaks ABI. Doing so means that GCC cannot be used as a fallback. See the LLVM profile section for more.
Loop nest optimization
Clang can use an integer polyhedra model on LLVM intermediate representation to perform loop nest optimization. This requires building the external package Polly from the xarblu-overlay. First enable the repository using app-eselect/eselect-repository, then sync the ebuilds. Afterwards, emerge Polly and add its arguments, whose syntax is largely undocumented, to the compiler flags. Currently, not all packages compile with Polly:
root # emerge --ask app-eselect/eselect-repository
root # eselect repository enable xarblu-overlay
root # emerge --sync
root # emerge --ask sys-devel/polly
COMMON_FLAGS="${COMMON_FLAGS} -fplugin=LLVMPolly.so -mllvm=-polly -mllvm=-polly-vectorizer=stripmine -mllvm=-polly-omp-backend=LLVM -mllvm=-polly-parallel -mllvm=-polly-num-threads=9 -mllvm=-polly-scheduling=dynamic"
CFLAGS="${COMMON_FLAGS}"
CXXFLAGS="${COMMON_FLAGS}"
FCFLAGS="${COMMON_FLAGS}"
FFLAGS="${COMMON_FLAGS}"
Link-time optimizations with Clang
Static libraries built with Clang using full LTO or ThinLTO cannot be read by GCC, and vice versa: static libraries built by GCC with LTO cannot be read by Clang. Possible solutions are:
- Disable the static-libs USE flag.
- Turn off LTO for packages built with the static-libs USE flag.
- Use only one compiler for all packages that use static libraries.
The link-time optimization feature defers optimizing the resulting executables to the linking phase. This can result in better optimization of packages but isn't standard behavior in Gentoo yet. Clang uses the lld linker for LTO.
Environment
Clang supports two types of link time optimization:
- Full LTO, which is the traditional approach also used by gcc where the whole link unit is analyzed at once. Using it is no longer recommended.
- ThinLTO, where the link unit is scanned and split up into multiple parts.[3] With ThinLTO, the final compilation units only contain the code that is relevant to the current scope, thus speeding up compilation, lowering footprint and allowing for more parallelism at (mostly) no cost. ThinLTO is the recommended LTO mode when using Clang.
For full LTO, replace -flto=thin with -flto in the following examples. There should be no compatibility differences between full LTO and thin LTO. Additionally, if Clang was not built with the default-lld USE flag, add -fuse-ld=lld to the following LDFLAGS.
CFLAGS="${CFLAGS} -flto=thin"
CXXFLAGS="${CXXFLAGS} -flto=thin"
# -O2 in LDFLAGS refers to binary size optimization during linking, it is NOT related to the -O levels of the compiler
LDFLAGS="${LDFLAGS} -Wl,-O2 -Wl,--as-needed"
CC="clang"
CXX="clang++"
CPP="clang-cpp" # necessary for xorg-server and possibly other packages
As an alternative, LLVM provides its own ar, nm, and ranlib implementations. Feel free to use them, though mileage may vary compared with the standard ar, nm, and ranlib; the LLVM versions are intended to handle the LLVM bitcode that Clang produces when using the -flto flag.
CFLAGS="${CFLAGS} -flto=thin"
CXXFLAGS="${CXXFLAGS} -flto=thin"
# -O2 in LDFLAGS refers to binary size optimization during linking, it is NOT related to the -O levels of the compiler
LDFLAGS="${LDFLAGS} -Wl,-O2 -Wl,--as-needed"
CC="clang"
CXX="clang++"
CPP="clang-cpp" # necessary for xorg-server and possibly other packages
AR="llvm-ar"
NM="llvm-nm"
RANLIB="llvm-ranlib"
Now set /etc/portage/package.env overrides using Clang with LTO enabled:
app-foo/bar compiler-clang-lto
app-bar/baz compiler-clang-lto
Global configuration
Similar to what was covered earlier in the article, a system wide Clang with LTO enabled can be done by changing the /etc/portage/make.conf file:
CFLAGS="${CFLAGS} -flto=thin"
CXXFLAGS="${CXXFLAGS} -flto=thin"
# -O2 in LDFLAGS refers to binary size optimization during linking, it is NOT related to the -O levels of the compiler
LDFLAGS="${LDFLAGS} -Wl,-O2 -Wl,--as-needed"
CC="clang"
CXX="clang++"
CPP="clang-cpp" # necessary for xorg-server and possibly other packages
AR="llvm-ar"
NM="llvm-nm"
RANLIB="llvm-ranlib"
Again, it is possible to set the AR, NM, and RANLIB to the LLVM implementations. Since earlier in the article compiler environments were set up using Clang without LTO, GCC without LTO, and GCC with LTO, it is now possible to pick and choose which is best on a per package basis. Since the goal is to compile packages system wide with Clang using LTO and not every package will successfully compile using it, fall back to Clang with LTO disabled or GCC. The /etc/portage/package.env may look like the following:
# Compiled using Clang with no link-time optimization since package bar fails using flto
app-foo/bar compiler-clang
# Compiled using GCC with no link-time optimization since package baz fails using flto
app-bar/baz compiler-gcc
# Compiled using GCC with link-time optimization since package foo compiles using flto
app-baz/foo compiler-gcc-lto
PGO
Set USE="profile orc" in make.conf or locally via package.use.
Also, ensure USE="sanitize" is enabled for llvm-core/clang-runtime. This will install sys-libs/compiler-rt-sanitizers as its dependency.
root # emerge --ask --update --deep --changed-use llvm-core/clang-runtime
Add USE="pgo" globally or locally and update the new use flags.
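The USE flag settings described in this section can be collected in one package.use file. The following is a minimal sketch, assuming /etc/portage/package.use is a directory; the function name pgo_package_use is hypothetical, and the compiler-rt-sanitizers category follows the "Important differences" list in this article, so it may need adjusting to match the tree in use.

```shell
# pgo_package_use: print package.use entries for building with Clang PGO.
pgo_package_use() {
    cat <<'EOF'
# runtime pieces Clang needs for PGO
llvm-runtimes/compiler-rt-sanitizers profile orc
llvm-core/clang-runtime sanitize
# example consumer of the pgo flag
dev-lang/python pgo
EOF
}
```

The output could be written with `pgo_package_use > /etc/portage/package.use/pgo`, followed by the usual `emerge --ask --update --deep --changed-use @world`.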
Not having the USE flags profile and orc enabled will cause packages such as dev-lang/python[pgo] to fail in the compile phase. build.log will report the following errors if compiler-rt-sanitizers is not installed:
ld.lld: error: cannot open /usr/lib/llvm/18/bin/../../../../lib/clang/18/lib/linux/libclang_rt.profile-x86_64.a: No such file or directory
clang: error: linker command failed with exit code 1 (use -v to see invocation)
Example: bug #917263
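Before enabling the pgo USE flag, it can help to check whether the profile runtime named in the error above is installed at all. The following is a minimal sketch, assuming the runtime lives under /usr/lib/clang/<version>/lib/linux/, which is where the path in the error message resolves to; the function name has_profile_rt is hypothetical.

```shell
# has_profile_rt [BASE_DIR]
# Succeed if any libclang_rt.profile-*.a exists under BASE_DIR/<version>/lib/linux.
has_profile_rt() {
    base=${1:-/usr/lib/clang}
    for f in "$base"/*/lib/linux/libclang_rt.profile-*.a; do
        # an unmatched glob stays literal, so test that the file really exists
        [ -e "$f" ] && return 0
    done
    return 1
}
```

For example, `has_profile_rt || echo "profile runtime missing"` gives a quick answer without starting a build.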
distcc
In order to use Clang on a distcc client, additional symlinks must be created in /usr/lib*/distcc/bin:
root # ln -s /usr/bin/distcc /usr/lib/distcc/bin/clang
root # ln -s /usr/bin/distcc /usr/lib/distcc/bin/clang++
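Since the directory is given as /usr/lib*/distcc/bin, more than one location may need the links. The following is a minimal sketch that creates both symlinks in every directory passed to it; the function name link_distcc_clang is hypothetical.

```shell
# link_distcc_clang DISTCC_BINARY DIR...
# Create clang and clang++ symlinks to DISTCC_BINARY in each existing DIR.
link_distcc_clang() {
    distcc_bin=$1
    shift
    for dir in "$@"; do
        # skip paths that are not directories (for example an unmatched glob)
        [ -d "$dir" ] || continue
        for name in clang clang++; do
            ln -sf "$distcc_bin" "$dir/$name"
        done
    done
}
```

As root, this could be called as `link_distcc_clang /usr/bin/distcc /usr/lib*/distcc/bin`.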
ccache
ccache support is automatic once Clang is emerged.
Kernel
The Linux kernel can be compiled with Clang and the LLVM toolchain by defining a kernel environment variable.
root # LLVM=1
To configure Clang specific kernel options such as link-time optimizations or control flow integrity, run the following command:
root # LLVM=1 make menuconfig
The above example demonstrates using menuconfig. Other options are nconfig and xconfig. Next, compile the kernel as normal.
root # LLVM=1 make -j$N
In the past, it was necessary to pass LLVM_IAS=1 to use the Clang internal assembler for a complete LLVM toolchain built kernel. This is no longer required since LLVM=1 now defaults to include the Clang internal assembler. Use LLVM_IAS=0 to disable the internal assembler if desired, otherwise stick to the default behavior.
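Whether a kernel tree was actually configured for Clang can be checked from its .config. The following is a minimal sketch, assuming the kernel's Kconfig records the compiler as CONFIG_CC_IS_CLANG (the case for recent kernels, but worth verifying for the version in use); the function name kernel_uses_clang is hypothetical.

```shell
# kernel_uses_clang CONFIG_FILE
# Succeed when the kernel .config records Clang as the C compiler.
kernel_uses_clang() {
    grep -q '^CONFIG_CC_IS_CLANG=y$' "$1"
}
```

For example, `kernel_uses_clang /usr/src/linux/.config && echo "configured for Clang"` can be run after LLVM=1 make menuconfig.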
Genkernel
These steps are mentioned in bug #786405.
When using genkernel, edit the /etc/genkernel.conf by substituting the following "Low Level Compile Settings" and adding the additional MAKEOPTS:
# =========Low Level Compile Settings=========
#
# Additional make options
MAKEOPTS="LLVM=1 LLVM_IAS=1 ${MAKEOPTS}"
# Assembler to use for the kernel. See also the --kernel-as command line
# option.
KERNEL_AS="clang -c"
# Archiver to use for the kernel. See also the --kernel-ar command line
# option.
KERNEL_AR="llvm-ar"
# Compiler to use for the kernel (e.g. distcc). See also the --kernel-cc
# command line option.
KERNEL_CC="clang"
# Linker to use for the kernel. See also the --kernel-ld command line option.
KERNEL_LD="ld.lld"
# NM utility to use for the kernel. See also the --kernel-nm command line option.
KERNEL_NM="llvm-nm"
# GNU Make to use for kernel. See also the --kernel-make command line option.
#KERNEL_MAKE="make"
# not exposed in default config
KERNEL_OBJCOPY="llvm-objcopy"
KERNEL_OBJDUMP="llvm-objdump"
KERNEL_READELF="llvm-readelf"
KERNEL_STRIP="llvm-strip"
KERNEL_RANLIB="llvm-ranlib"
# Assembler to use for the utilities. See also the --utils-as command line
# option.
UTILS_AS="clang -c"
# Archiver to use for the utilities. See also the --utils-ar command line
# option.
UTILS_AR="llvm-ar"
# C Compiler to use for the utilities (e.g. distcc). See also the --utils-cc
# command line option.
UTILS_CC="clang"
# C++ Compiler to use for the utilities (e.g. distcc). See also the --utils-cxx
# command line option.
UTILS_CXX="clang++"
# Linker to use for the utilities. See also the --utils-ld command line
# option.
UTILS_LD="ld.lld"
# NM utility to use for the utilities. See also the --utils-nm command line option.
UTILS_NM="llvm-nm"
# GNU Make to use for the utilities. See also the --utils-make command line
# option.
#UTILS_MAKE="make"
# not exposed in default config
UTILS_OBJCOPY="llvm-objcopy"
UTILS_OBJDUMP="llvm-objdump"
UTILS_READELF="llvm-readelf"
UTILS_STRIP="llvm-strip"
UTILS_RANLIB="llvm-ranlib"
# Target triple (i.e. aarch64-linux-gnu) to build for. If you do not
# cross-compile, leave blank for auto detection.
#CROSS_COMPILE=""
# Override default make target (bzImage). See also the --kernel-target
# command line option. Useful to build a uImage on arm.
#KERNEL_MAKE_DIRECTIVE_OVERRIDE="fooImage"
# Override default kernel binary path. See also the --kernel-binary
# command line option. Useful to install a uImage on arm.
#KERNEL_BINARY_OVERRIDE="arch/foo/boot/bar"
After that use genkernel as usual:
root # genkernel all
Additionally, the same options will have to be provided for any kernel modules:
# Compiled using clang like kernel itself
app-foo/bar compiler-clang
# This is added to make options by linux-mod.eclass
BUILD_FIXES="LLVM=1 LLVM_IAS=1"
# CC/CXX and other tools must match genkernel config
Further, once clang becomes the default compiler, it might be possible to use portageq envvar and make things DRY.
Distribution Kernel
Compile the distribution kernels (for clarity's sake, not including the binary kernel) with LLVM using the following configs in addition to the values in package.env or make.conf as described in Clang#Basic_setup:
LLVM=1
and
sys-kernel/gentoo-kernel llvm-kernel
Troubleshooting
The main place for looking up known failures with Clang is the tracker bug #408963. If hitting an issue not reported on Gentoo's Bugzilla already, please open a new bug report and make it block the linked tracker.
Compile errors when using Clang with -flto
If the packages being installed are failing, check the logs. Often, packages with errors like the following will need to disable LTO by invoking the compiler-clang environment.
/usr/bin/x86_64-pc-linux-gnu-ld: error: version.o:1:3: invalid character
/usr/bin/x86_64-pc-linux-gnu-ld: error: version.o:1:3: syntax error, unexpected $end
/usr/bin/x86_64-pc-linux-gnu-ld: error: version.o: not an object or archive
The following error may be seen in every LTO failure case:
x86_64-pc-linux-gnu-clang-3.8: error: linker command failed with exit code 1 (use -v to see invocation)
Simply add the failing package to /etc/portage/package.env. In this case, it's the sys-apps/less package, so apply the proper override:
# Compiled using Clang with no link-time optimization since the package 'less' fails using lto
sys-apps/less compiler-clang
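Errors such as "not an object or archive" from the GNU linker can mean that it was handed LLVM bitcode instead of a native object file. The following is a minimal sketch for checking a single file, assuming raw (unwrapped) LLVM bitcode, which begins with the bytes 'B', 'C', 0xC0, 0xDE; the function name is_llvm_bitcode is hypothetical.

```shell
# is_llvm_bitcode FILE
# Succeed if FILE starts with the raw LLVM bitcode magic number.
is_llvm_bitcode() {
    # first four bytes as lowercase hex with separators stripped
    magic=$(head -c 4 -- "$1" | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "4243c0de" ]
}
```

For example, `is_llvm_bitcode version.o && echo "LTO bitcode"` can be run in the package's build directory. Members of a static library would first need to be extracted with ar x.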
Sometimes a package will fail to compile even when disabling LTO because it requires another package which was compiled using -flto and works incorrectly. Something like the following error may be seen:
/usr/lib64/libatomic_ops.a: error adding symbols: Archive has no index; run ranlib to add one
In this case libatomic_ops is causing boehm-gc to fail compiling. Recompile the program causing the failure using the non-LTO environment and then recompile the new program. In this case, boehm-gc fails when using LTO, so add both of them to the /etc/portage/package.env file to build them without LTO:
dev-libs/boehm-gc compiler-clang
dev-libs/libatomic_ops compiler-clang
Use of GNU extensions without proper -std=
Some packages tend to use GNU extensions in their code without specifying -std= appropriately. GCC allows that usage, yet Clang disables some of the more specific GNU extensions by default.
If a particular package relies on such extensions being available, then append the correct -std= flag to it:
- -std=gnu89 for C89/C90 with GNU extensions,
- -std=gnu99 for C99 with GNU extensions,
- -std=gnu++98 for C++:1998 with GNU extensions.
A common symptom of this problem is multiple definitions of inline functions like this:
/usr/bin/x86_64-pc-linux-gnu-ld: error: ../mpi/.libs/libmpi.a(mpi-bit.o): multiple definition of '_gcry_mpih_add'
/usr/bin/x86_64-pc-linux-gnu-ld: ../mpi/.libs/libmpi.a(mpi-add.o): previous definition here
/usr/bin/x86_64-pc-linux-gnu-ld: error: ../mpi/.libs/libmpi.a(mpi-bit.o): multiple definition of '_gcry_mpih_add_1'
/usr/bin/x86_64-pc-linux-gnu-ld: ../mpi/.libs/libmpi.a(mpi-add.o): previous definition here
This is because Clang uses C99 inline rules by default, which do not work with gnu89 code. To work around it, it is likely necessary to pass -std=gnu89, or to set one of the environmental overrides to use GCC to compile the failing package if passing the right -std= flag doesn't work.
Since both current (2020) GCC and Clang default to -std=gnu17 with C99 inline rules, chances are the problems have already been spotted by a GCC user.
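The -std= workaround fits the per-package environment mechanism used throughout this article. The following is a minimal sketch of such an environment file, assuming the hypothetical file name /etc/portage/env/compiler-clang-gnu89 and the -O2 -march=native baseline from the earlier examples; -std=gnu89 is a C dialect, so it is added to CFLAGS only.

```shell
# /etc/portage/env/compiler-clang-gnu89 (hypothetical file name)
# Build with Clang, but with gnu89 semantics for C sources.
CFLAGS="-O2 -march=native -std=gnu89"
CC="clang"
CXX="clang++"
# Matching /etc/portage/package.env entry, with a placeholder atom:
#   app-foo/bar compiler-clang-gnu89
```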
sudo: clang: command not found
Clang is not added to /usr/bin and instead lives in a separate path that is added to the PATH variable. Sudo has a whitelisted PATH variable that is baked in at compile time. So when a new version of clang is installed, it will not be added to sudo's PATH until sudo is re-emerged.
Compiling with GCC on a Clang profile
/usr/src/debug/sys-libs/glibc-2.37-r3/glibc-2.37/csu/../sysdeps/x86_64/start.S:103: undefined reference to `main'
Use the bfd linker. Add -fuse-ld=bfd to CFLAGS, CXXFLAGS and LDFLAGS in the /etc/portage/env/compiler-gcc-lto or /etc/portage/env/compiler-gcc configuration files.
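Applied to the non-LTO GCC fallback file from earlier in this article, the change could look like the following. This is a sketch only: apart from the added -fuse-ld=bfd, the values are copied from the earlier fallback examples and should be adapted to the local configuration.

```shell
# /etc/portage/env/compiler-gcc with the bfd linker forced
COMMON_FLAGS="-march=native -O2 -pipe -fuse-ld=bfd"
CFLAGS="${COMMON_FLAGS}"
CXXFLAGS="${COMMON_FLAGS}"
LDFLAGS="-Wl,--as-needed -fuse-ld=bfd"
CC="gcc"
CXX="g++"
CPP="gcc -E"
AR="ar"
NM="nm"
RANLIB="ranlib"
```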
ld.lld: error: undefined symbol: ... std::__1::basic_string
This means that libc++ has been enabled instead of libstdc++ as the default C++ standard library for Clang, either by switching to the LLVM profile, by installing llvm-core/clang-common with USE=default-libcxx, or by adding --stdlib=libc++ directly to CXXFLAGS. Switching to libc++ breaks ABI compatibility for libraries with a C++ public interface (for example, libLLVM), because libc++ uses the std::__1 namespace. To use libc++, it is necessary to recompile such libraries with emerge -av1 llvm-core/llvm && emerge @preserved-rebuild before installing other software.
See also
- GCC: among the most widely used compiler toolchains in the world with official support for: C, C++, Objective-C, Objective-C++, Modula-2, Fortran, Ada, Go, and D
References
- [1] https://github.com/llvm/llvm-project/blob/f24c443e8241df7df1d5152c45636c76b682a043/clang/lib/Driver/ToolChains/Clang.cpp#L5349
- [2] https://github.com/llvm/llvm-project/blob/f24c443e8241df7df1d5152c45636c76b682a043/clang/lib/Driver/ToolChains/Clang.cpp#L645
- [3] https://blog.llvm.org/2016/06/thinlto-scalable-and-incremental-lto.html