Talk:Distcc

From Gentoo Wiki

DistCC Needs a Tuning Section

Talk status
This discussion is done.

If I'm not mistaken, DistCC pretty much ignores the MAKEOPTS variable in "/etc/make.conf" or "/etc/portage/make.conf", and spawns jobs as needed until they fall back to the local box (not via the network) for compiling. (So tinkering with the MAKEOPTS variable is likely not needed, as this variable is usually already set up on most Gentoo boxes.)

I think I figured out how to properly tune DistCC. Add the following "--jobs" option to the server or helper box performing the compiling. Typically the "--jobs" value will mirror the MAKEOPTS variable within the "/etc/portage/make.conf" file, or the number of CPUs + 1. Notice that the DistCC server (or helper box) has now spawned ten distccd threads (check with ps ax | grep distccd).

/etc/conf.d/distccd:

DISTCCD_OPTS="--port 3632 --log-level warning --log-file /var/log/distccd.log -N 15 --allow 127.0.0.1 --allow 192.168.1.0/24 --jobs 10"

On the client box, "/etc/distcc/hosts" designates the DistCC server (or helper box). There the number of jobs can be limited, but apparently not raised?

/etc/distcc/hosts:

192.168.1.5/5

Per the DistCC manual, by default DistCC spawns four threads or four jobs (plus one) for each host listed within the /etc/distcc/hosts file.

I'm still slightly confused here. The "--jobs 10" option likely needs to be integrated into both the client and server/helper boxes, and /etc/distcc/hosts then limits the job number as necessary via the /num suffix on the server address?

--Roger (talk) 22:26, 1 May 2014 (UTC)

MAKEOPTS is passed to make and -j tells make how many parallel jobs it can run. Each job that tries to invoke a compiler will invoke distcc, which will use your settings in /etc/distcc/hosts to determine which machine should run that compilation. I don't know exactly what DISTCCD_OPTS does, but MAKEOPTS gets the job done.
So while distcc itself ignores MAKEOPTS, make's --jobs (-j) setting is still in effect for distcc jobs, in the same way that it's in effect for compilers, even though they don't understand --jobs either.
Waldo Lemmer 03:47, 23 May 2024 (UTC)
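
To make the relationship between the two settings concrete, here is a minimal sketch (not part of distcc) that totals the /LIMIT suffixes of a hosts specification. The total could serve as a starting point when picking the -j value for MAKEOPTS. The function name sum_host_limits is made up for illustration, and the fallback of 4 jobs for an entry without a suffix is an assumption based on the manual excerpt quoted above.

```shell
#!/bin/sh
# Hypothetical helper: add up the job limits in a distcc hosts specification.
# Assumption: an entry without a "/LIMIT" suffix counts as 4 jobs.
sum_host_limits() {
    total=0
    for entry in $1; do
        host=${entry%%,*}              # drop options such as ",cpp,lzo"
        case $host in
            */*) limit=${host##*/} ;;  # text after the last "/" is the limit
            *)   limit=4 ;;            # assumed default, see lead-in
        esac
        total=$((total + limit))
    done
    echo "$total"
}

# One helper limited to 5 jobs plus 2 local jobs
sum_host_limits "192.168.1.5/5 localhost/2"    # prints 7
```

The sketch ignores special entries such as --randomize, so it only suits simple host lists like the one in this discussion.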

DistCC Pump Mode

Talk status
This discussion is done as of 2024-05-23.

The article jumps right into pump mode. It might be nice to have a very brief, one-sentence explanation of pump mode, e.g. "Distcc's pump mode accelerates remote compilation with distcc by also distributing preprocessing to the servers." (see man pump) --Roger (talk) 19:05, 21 January 2015 (UTC)

I didn't read the discussion page before changing the article, and added a quick note about pump mode in the configuration section. --Totktonada (talk) 03:24, 19 February 2016 (UTC)

The only other noteworthy item within the pump manual is: "Note that distcc's pump-mode assumes that sources files will not be modified during the lifetime of the include server, so modifying source files during a build may cause inconsistent results." I don't think I've ever personally seen source files modified during compilation, unless somebody was remotely logged into a version control system while the sources were being recompiled. Since this is a source-based Linux distribution, I still don't think the latter is worth much concern? --Roger (talk) 18:15, 21 January 2015 (UTC)

As I understand it, it's about situations where different macros are defined before including a header in different compilation units. Or is distcc able to check such things, making this off-topic? Maybe your understanding is right. Anyway, I think it's unlikely to be common, and a concerned user will surely look into the man page, at least briefly. I guess the article shouldn't describe every kind of uncommon, weird thing that may lead to a broken build process or a broken executable. --Totktonada (talk) 03:24, 19 February 2016 (UTC)

Also, there is no mention of the requirement to add ",cpp,lzo" (e.g. 192.168.1.2/10,cpp,lzo) to the host lines within the /etc/distcc/hosts file! This is a requisite for DistCC pump mode! The example 192.168.1.2/10,cpp,lzo first specifies a remote compiler host, then limits it to at most 10 jobs, and finally enables sending cpp jobs with lzo compression; the latter two options (cpp and lzo) are required for pump mode to work. Otherwise, the user will see warnings and errors upon pump emerge! --Roger (talk) 19:05, 21 January 2015 (UTC)

Thanks! I’m going to try pump mode this week and will return with some thoughts here. --Totktonada (talk) 03:24, 19 February 2016 (UTC)
You're right. I updated the article: expanded the configuration section and the usage example. --Totktonada (talk) 01:03, 24 February 2016 (UTC)

The user should be warned that distcc pump mode is really buggy and causes "random" build failures. See the following bugs:

--Netfab (talk) 09:20, 8 December 2016 (UTC)

Pump mode is not supported in Portage anymore. See bug #702146.
Waldo Lemmer 03:53, 23 May 2024 (UTC)

Command for obtaining -march=native options

Talk status
This discussion is done as of 2024-05-23.

Previously, this command was listed as the mechanism for obtaining the CFLAGS equivalent to -march=native:

user $gcc -Q -march=native --help=target

However, it has some flaws. Something valuable that the -march=native option obtains is information about the processor's cache (including the L1 line size). However, I can't promise that the one-liner I've replaced it with will be 100% accurate on all archs, since I'm filtering out the hordes of -mno-* flags that gcc has put in there since 4.7 or so. I'm betting on gcc using these flags to add features to a base march, but if it uses a -mno-* flag to remove a feature from any arch, then this will be broken.

Maybe what we really need is a mechanism for the distcc client to insert these flags and relieve the user from having to fish them out.

Daniel Santos (talk) 22:42, 26 January 2015 (UTC)

I don’t understand the concern about 'all archs'. C[XX]FLAGS should fit a particular machine and be equivalent to '-march=native' on *that* machine (with some additions, modifications, or optimization flags of your choosing). There is a good article about filtering redundant flags from gcc’s output, 'Inlining -march=native for distcc'; it’s mentioned in the configuration section and can be found in 'External resources' too. But maybe scripting around gcc’s output is a good idea. --Totktonada (talk) 03:41, 19 February 2016 (UTC)
Well, there are a great many problems with the approach laid out in that article. I went through those sources (for the i386 back end) to see if I could implement a gcc -Q --help=target-distcc functionality and I gave up and filed a report (PR 81519) instead. Another developer who knows gcc better came along and tried and said "no way", but he has fixed PR 39851, so this will now make it possible to script it without having to decipher the gcc sources. This does still have flaws related to brevity, since some -m<isa> flags implicitly include others, so it still won't remove all redundant flags, but it can be fully automated, will never remove any -mno-<isa> flags that should not be removed (some of them shouldn't), will be very close to as brief as possible, and will continue to work on newer architectures as they emerge in the market and support is added to gcc.
Honestly, I think that the entire mechanism gcc is currently using for machine flags needs to be completely redone so that all data structures used are defined in the middle-end and populated by each back-end so that arch-generic processing can be done on them. But at least when this patch is backported, we'll have a better way to get C(XX)FLAGS. Daniel Santos (talk) 23:44, 29 August 2017 (UTC)


Update: I've added a new section for the CFLAGS and CXXFLAGS with yet another script. This one isn't too useful until the next release of the GCC 6, 7 or 8 branch, but it boils down the machine flags quite dramatically. As it turns out, the answer for my Phenom machine was just -march=amdfam10. Also, --param l1-cache-size et al. aren't needed because that information is embedded in struct processor_costs, which is dictated by the specific value of -march. The script doesn't extrapolate which ISA flags are implied by other flags, however, so for example:

 $ ./distccflags -march=native -mavx
 CFLAGS="-march=amdfam10 -mavx -msse4 -msse4.1 -msse4.2 -mssse3 -mxsave"

Here, everything in the output after -mavx is implied and not actually needed. It's also fun because you can pass it different flags and it can tell you what they translate into, but I'm only processing machine flags. Also, this IS broken if you try to use both Windows and non-Windows machines in your farm because of the -mabi defaults and a lot of other things (SEH, etc.). Daniel Santos (talk) 08:28, 16 September 2017 (UTC)

Suggestion

Hello, I'm not sure whether the following algorithm is correct enough to include in the wiki, so I'll add it here.

I'm currently using three steps:

  • gcc -### -march=native -mtune=native -x c - to get the full list, including the -mno-* options.
    • Remove the supposedly redundant entries.
  • LANG=C gcc -Q -Ofast -march=native -mtune=native --help=target to get a reference list for comparison.
    • LANG=C gcc -Q -Ofast $(cat first_list) --help=target to get a new list.
  • Compare the results.

- Ladon (talk) 07:14, 17 December 2020 (UTC)
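
The final "compare the results" step could be sketched as below, assuming both --help=target reports have first been saved to files. The function names normalise_report and compare_target_reports are made up for illustration; only standard tools (grep, sed, sort, diff) are used, and gcc itself is not invoked by the sketch.

```shell
#!/bin/sh
# Hypothetical helpers for comparing two saved `gcc -Q ... --help=target` reports.

# Keep only the option lines (they start with blanks followed by "-"),
# squeeze runs of whitespace and sort, so layout differences do not matter.
normalise_report() {
    grep -E '^[[:space:]]*-' "$1" \
        | sed -E 's/[[:space:]]+/ /g; s/^ //; s/ $//' \
        | sort
}

# compare_target_reports REFERENCE CANDIDATE
# Prints a diff and returns non-zero when the effective settings differ.
compare_target_reports() {
    ref=$(mktemp) || return 2
    cand=$(mktemp) || { rm -f "$ref"; return 2; }
    normalise_report "$1" > "$ref"
    normalise_report "$2" > "$cand"
    diff "$ref" "$cand"
    status=$?
    rm -f "$ref" "$cand"
    return "$status"
}
```

For example, redirect the output of the reference command from step three to reference.txt and the output of the command using the reduced flag list to candidate.txt (both file names are placeholders), then run compare_target_reports reference.txt candidate.txt. No output and a zero exit status would suggest the reduced list resolves to the same target settings.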

Resolution

I've changed it to use resolve-march-native (Special:Diff/1299850).

Waldo Lemmer 03:57, 23 May 2024 (UTC)

No Real Mention of DistCC Required Packages with Versions Synced

Talk status
This discussion is done as of 2024-05-23.

I always thought there were a few packages whose versions need to be synchronized across servers and clients for DistCC to work flawlessly. The packages I had in mind were: distcc, glibc, gcc, binutils, ... ? Or is GCC the only package that requires a similar major version? The wiki page currently doesn't seem to mention or strongly warn about this version synchronization. There were strong warnings elsewhere in the past about such package version synchronization, without which users would risk major system breakage. I should also ask whether the aforementioned packages need to be built with similar USE flags (i.e. configure --help) as well? --Roger (talk) 11:00, 4 October 2015 (UTC)

It's mentioned in sections Requirements across all hosts and Mixed GCC versions that the GCC versions must be the same.
Waldo Lemmer 04:00, 23 May 2024 (UTC)

Testing

Talk status
This discussion is done as of 5 August 2022.

When testing main.o with command './main.o' I get message:

-bash: ./main.o: Permission denied

After I chmod main.o with command 'chmod +x main.o' I get message:

-bash: ./main.o: cannot execute binary file: Exec format error

I need an explanation why can't I run main.o?

--Best, Pál (talk) 07:00, 6 February 2016 (UTC)

The example was partially broken. It was missing the linking step and suggested running an object file instead of an executable. Now it should be fine. --Totktonada (talk) 04:06, 19 February 2016 (UTC)
A fix was provided (Special:Diff/450814). --Blacki (talk) 18:55, 5 August 2022 (UTC)

Another SSH usage example

Talk status
This discussion is done as of 2024-05-23.

Suppose that:

  • you are on a powerful system (system A) and you want to help a tiny device (system B) to compile (or cross-compile).
  • you already can connect from A to B with ssh.

On the system A:

  • install the distcc server
  • configure it to allow connections only from localhost:
FILE /etc/conf.d/distccd
DISTCCD_OPTS="${DISTCCD_OPTS} --allow 127.0.0.1"
  • run the distcc server

On the system B:

  • install distcc
  • set participating hosts, here 16 jobs for the helper, 2 jobs for local cores:
root #distcc-config --set-hosts "127.0.0.1/16 localhost/2"

From the system A:

  • Open a reverse SSH tunnel to the system B:
user $ssh -R 3632:localhost:3632 systemB

You can now update your tiny system; jobs will be sent to the distcc server through the reverse SSH tunnel. --Netfab (talk) 08:00, 4 September 2016 (UTC)
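
To avoid retyping the -R option, the same forwarding could be kept in the SSH client configuration on system A. The sketch below only prints such a stanza and changes nothing on disk. The function name tunnel_stanza is hypothetical; the host alias systemB and port 3632 are the values from the example above, and RemoteForward is the ssh_config counterpart of -R.

```shell
#!/bin/sh
# Hypothetical helper: print an ssh_config stanza that sets up the reverse
# tunnel from the example whenever a connection to HOST is opened.
# usage: tunnel_stanza HOST [PORT]
tunnel_stanza() {
    host=$1
    port=${2:-3632}    # 3632 is the distccd port used in the example
    printf 'Host %s\n    RemoteForward %s localhost:%s\n' "$host" "$port" "$port"
}

tunnel_stanza systemB
```

After reviewing the output, it could be appended to ~/.ssh/config on system A, so that a plain ssh systemB opens the tunnel as well.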

Feel free to add this to the article. It will get lost here.
Waldo Lemmer 04:02, 23 May 2024 (UTC)

Bootstrapping Error

Talk status
This discussion is done as of 2024-05-23.

When doing Step 2 in the Bootstrapping section:

root #USE='-*' emerge --nodeps sys-devel/distcc

I got:

!!! Problem resolving dependencies for sys-devel/distcc

!!! The ebuild selected to satisfy "sys-devel/distcc" has unmet requirements.
- sys-devel/distcc-3.2_rc1-r4::gentoo USE="-crossdev -gnome -gssapi -gtk -hardened -ipv6 (-selinux) -xinetd -zeroconf" ABI_X86="(64)" PYTHON_TARGETS="-python2_7"

  The following REQUIRED_USE flag constraints are unsatisfied:
    python_targets_python2_7

Fixed by doing:

root #USE='-* python_targets_python2_7' emerge --nodeps sys-devel/distcc

since the -* was blowing away the PYTHON_TARGETS expanded flag

Huntersan9 (talk) 05:03, 22 July 2017 (UTC)

This is very likely no longer relevant. Closing.
Waldo Lemmer 04:03, 23 May 2024 (UTC)

Why we cannot strip -mno-* flags

Talk status
This discussion is done as of 2024-05-23.

When I originally added the snippet for extracting machine flags from gcc, I stripped out all of the -mno-* flags because they seemed like garbage. I was later told (in IRC) that GCC represents some CPUs with an -march value combined with -mno-<isa> flags, but that this was for very old CPUs. Out of caution I modified the snippet to not strip them, but then another editor reverted that change, citing breakage in the gimp ebuild, where it enables ISAs for specific .c files for functions that test the ISA availability before executing it. This is of course the most common case in the world -- most people don't custom compile all of their software specifically for their CPU.

But it is now known that at least one recent CPU in the westmere family requires -mno-* flags. As such, it is simply wrong to remove them.

Thankfully, as of GCC 7.3, my distccflags script will work to determine the most succinct and correct machine flags. Distcc is interested in this upstream if anybody wants to convert this to C. This script will also work on GCC 6.5 if that is ever released as well as the upcoming GCC 8. Daniel Santos (talk) 18:14, 1 March 2018 (UTC)

The article now uses resolve-march-native, which includes these flags.
Waldo Lemmer 04:04, 23 May 2024 (UTC)

Outdated content removed

Talk status
This discussion is done as of 2024-05-23.

I've just removed the following three-year-old content, which I think is outdated; it refers to an old bug in an old version of gcc:

"

A GCC bug has been fixed in the 8.0 dev tree which facilitates a more reliable and succinct mechanism for extrapolating appropriate machine flags. The fix has been backported to the 6 and 7 branches and should be released fairly soon. Some processing is still required and a script can be found in the distccflags repo, or via wget:

Warning
Downloading scripts and executing them without any validation is a security risk. Before executing such scripts, take a good look at what they want to accomplish and refrain from executing it when the content or behavior is not clear and purposeful.
user $chmod +x distccflags
user $./distccflags -march=native

"

Akar (talk) 19:19, 15 July 2020 (UTC)

Sounds good. The article now uses resolve-march-native anyway.
Waldo Lemmer 04:05, 23 May 2024 (UTC)

Reduce to just using distcc with Portage?

Talk status
This discussion is done as of 2024-05-23.

Distcc is complicated in itself. I think readers would be better served restricting this page to just using distcc with Portage. Os360 (talk) 15:44, 10 January 2021 (UTC)

I disagree:

The Wiki project is a collaborative effort to distill the collective knowledge of the Gentoo community at large into clear, concise, useful documentation, for the benefit of all.

Project:Wiki:

The goal of the Gentoo wiki is to become the best Linux-related resource on the web through contributions from the Gentoo community.

Instead, I added a todo item to move Portage stuff to a different page.
Waldo Lemmer 04:21, 23 May 2024 (UTC)

Configuration of environment variables

Talk status
This discussion needs help as of 2024-05-23.
Tip: To get this fixed sooner, use {{Proposal}}.

The wiki page proposes configuring environment variables in /etc/systemd/system/distccd.service.d/00gentoo.conf. However, in my experience this configuration is overridden by the configuration in /etc/env.d/02distcc, which is modified by distcc-config. I do not know whether the wiki should be adapted to the ebuild or the ebuild to the wiki. I commented on a bug about this: https://bugs.gentoo.org/show_bug.cgi?id=570924 --Sylvm (talk) 13:56, 14 October 2021 (UTC)

Section systemd is a mess in general.
Waldo Lemmer 04:24, 23 May 2024 (UTC)

Warn not to use DISTCC_VERBOSE=1 for emerge as it breaks many packages

Talk status
This discussion is done as of 2024-05-23.

See bugs https://bugs.gentoo.org/818148 and https://bugs.gentoo.org/816480. I was tempted to use it to see whether distcc was performing well (as shown in the example on the wiki page), kept it enabled, and got many merge problems.

Also warn that DISTCC_SAVE_TEMPS=1 makes emerge use lots of space in /var/tmp/portage (more than 30 GB), which broke many emerges I ran. Sylvm (talk) 13:07, 15 October 2021 (UTC)

Section Portage build failing with errors that are apparently not connected with distcc at all addresses this. But if you think a warning should be placed higher up, feel free to make this change.
Waldo Lemmer 04:26, 23 May 2024 (UTC)

Some improvements regarding security

Talk status
This discussion is done as of 2024-05-23.

I'd like to incorporate above discussions in a few edits:

  • focus on distcc with portage, move the more general usage to a sub page
    I've already created a todo item to do the opposite: Move Portage stuff to Portage with Distcc, matching the theme of Portage with Git. But the name is up for debate.
    Waldo Lemmer 04:32, 23 May 2024 (UTC)
  • point out a multi-level strategy upon errors: try pump mode around emerge first; if that fails, plain distcc; only if that fails, plain emerge. Do distinguish failure sources (needs to mention which packages have distcc disabled)
    Now that distcc-pump support has been removed, there are only two levels. The note at the top of the page covers this.
    Waldo Lemmer 04:32, 23 May 2024 (UTC)
  • harden usage of distcc through scenarios:
    • local network, dedicated machine, only up for compiling, dedicated client hosts (through IPs), laptop+desktop
    • local network, always on server, in a chroot environment, listening on only a single interface, maybe even on Btrfs/ ZFS to revert to snapshot after usage (I'd like to use pivot_root instead or jchroot, but until now I couldn't get emerge to work in pivot_root due to my paranoid namespaces not only for mounts and users, yet it works with emerge in regular chroot and start in pivot_root/ jchroot)
    • only as a sketch with container instead of chroot, even more read-only
    • only as a sketch: outbound server connected via VPN (Wireguard, OpenVPN), distcc itself suggested this (back in 2014) without too many details

To my future self regarding »pump mode not necessary for scenario 1&2«: https://github.com/distcc/distcc/issues/332, with explicit mention of Gentoo's bug #702146, and regarding the archeology of what pump was for: https://opensource.googleblog.com/2008/08/distccs-pump-mode-new-design-for.html. Check with some not so trivial packages (mupdf, ffmpeg) how build time differs with and without pump; assumption: negligible.

--Onkobu (talk) 19:22, 27 December 2021 (UTC)

Feel free to add Distcc hardening advice to the article, even if it's just a stub.
Waldo Lemmer 04:32, 23 May 2024 (UTC)

Feature to automatically resolve -march/mtune=native flags

Talk status
This discussion is done as of 2024-05-23.

Please watch the ongoing patch on https://github.com/distcc/distcc/pull/350 which will make -march=native be resolved automatically by distcc... --Massimo B. (talk) 10:10, 20 January 2022 (UTC)

The article now covers resolve-march-native, which is also used by the Handbook and therefore better supported by Gentoo.
Waldo Lemmer 04:34, 23 May 2024 (UTC)

Please provide example of package.env for packages that don't like distcc

Talk status
This discussion is done as of 19 November 2022.

The section Some_packages_do_not_use_distcc in Troubleshooting describes providing a "package.env" so that distcc-unfriendly packages can compile. Please provide a working example of this, plus any other configuration needed to implement it successfully.

Thanks in advance. Russelld (talk) 09:36, 12 November 2022 (UTC)

I added configuration examples for this. Thank you N (talk) 18:50, 19 November 2022 (UTC)
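
For readers landing here, a minimal sketch of the general package.env mechanism follows; it is not copied from the article. The function name disable_distcc_for, the file name no-distcc.conf and the atom category/package are all hypothetical placeholders. It also assumes that /etc/portage/package.env is a plain file rather than a directory, and that FEATURES="-distcc" in a per-package environment file is enough to switch distcc off for that package. The first argument is a root directory, so the sketch can be tried in a scratch directory before touching a real system.

```shell
#!/bin/sh
# Hypothetical helper: write a per-package environment file that turns the
# distcc feature off, and map one package atom to it in package.env.
# usage: disable_distcc_for ROOT ATOM
disable_distcc_for() {
    root=$1
    atom=$2
    mkdir -p "$root/etc/portage/env" || return 1
    # Environment file (name is a placeholder): drop distcc from FEATURES
    printf 'FEATURES="-distcc"\n' > "$root/etc/portage/env/no-distcc.conf"
    # package.env line format: <atom> <file name under /etc/portage/env>
    printf '%s no-distcc.conf\n' "$atom" >> "$root/etc/portage/package.env"
}

# Dry run in a scratch directory with a placeholder atom
demo_root=$(mktemp -d)
disable_distcc_for "$demo_root" "category/package"
cat "$demo_root/etc/portage/package.env"
```

Once the generated files look right, the same two files can be created under the real /etc/portage with the actual package atom.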

SSH, which user on the distccd server?

Talk status
This discussion is still ongoing as of 2024-06-13.

In the section SSH for communication, which user should be used as UserName in .ssh/config to log in to the distccd server, which is also running Gentoo? root, distcc, portage, or another separate ssh-distcc user? --Massimo B. (talk) 08:53, 13 June 2024 (UTC)