Are there any risks or disadvantages to building software from source, compared to installing a package? Can it mess with my system in any way?

I usually avoid it because I’ve found it to be a faff, and it often doesn’t work anyway, but in a couple of cases it has been necessary.

  • balsoft@lemmy.ml

    All x86_64 CPUs support a certain “base” set of instructions. But most of them also support some additional instruction sets: SIMD (single instruction, multiple data - operations on vectors and matrices), crypto (encryption/hashing), virtualization (for running VMs), etc. Each of those instructions replaces dozens or hundreds of “base” instructions, dramatically speeding up certain operations.

    When compiling source code into binary form (which is basically a bunch of CPU instructions plus extra fluff), you have to choose which instructions to use for certain operations. E.g. if you want to multiply a vector by a matrix (which is a very common operation in like a dozen branches of computer science), you can either do the multiplication one operation at a time (almost as you would when doing it by hand), or just call a single instruction which “just does it” in hardware.

    The problem is choosing which instruction sets to use. If you use none, your resulting binary will be dogshit slow (by modern standards). If you use all of them, it will likely not run at all on most CPUs, because few CPUs support every extension. There are workarounds, though. The main one is shipping two versions of your code - one that uses the extensions and one that doesn’t - and choosing between them at runtime by detecting whether the CPU supports the extension. This roughly doubles the size of the affected code and has other drawbacks too. So, in most cases, it falls on whoever packages the software for your distro to choose which instruction sets to use. Typically the packager will be conservative so that the binary runs on most CPUs, at the expense of some slowdown. But when you, the user, compile the source code yourself, you can tell the compiler to use whatever instruction sets your CPU supports, to get the fastest possible binary (which might not run on other computers).

    In the past this all was very important because many SIMD extensions weren’t as common as they are today, and most distros didn’t enable them when compiling. But nowadays the instruction sets on most CPUs are mostly similar with minor exceptions, and so distro packagers enable most of them, and the benefits you get when compiling yourself are minor. Expect a speed improvement in the range of 0%-5%, with 0% being the most common outcome for most software.

    TL;DR it used to matter a lot in the past; today it’s not worth bothering unless you’re compiling everything anyway for other reasons.