docs/9 - Build procedure overview.txt
changeset 2322 892351110ce8
parent 2321 d896b85e8738
child 2323 ed0725c61fe6
     1.1 --- a/docs/9 - Build procedure overview.txt	Thu Feb 24 22:38:08 2011 +0100
     1.2 +++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
     1.3 @@ -1,257 +0,0 @@
     1.4 -File.........: 9 - Build procedure overview.txt
     1.5 -Copyright....: (C) 2011 Yann E. MORIN <yann.morin.1998@anciens.enib.fr>
     1.6 -License......: Creative Commons Attribution Share Alike (CC-by-sa), v2.5
     1.7 -
     1.8 -
     1.9 -How is a toolchain constructed? /
    1.10 -_______________________________/
    1.11 -
    1.12 -This is the result of a discussion with Francesco Turco <mail@fturco.org>:
    1.13 -  http://sourceware.org/ml/crossgcc/2011-01/msg00060.html
    1.14 -
    1.15 -Francesco has a nice tutorial for beginners, along with a sample, step-by-
    1.16 -step procedure to build a toolchain for an ARM target from an x86_64 Debian
    1.17 -host:
    1.18 -  http://fturco.org/wiki/doku.php?id=debian:cross-compiler
    1.19 -
    1.20 -Thank you Francesco for initiating this!
    1.21 -
    1.22 -
    1.23 -I want a cross-compiler! What is this toolchain you're speaking about? |
    1.24 ------------------------------------------------------------------------+
    1.25 -
    1.26 -A cross-compiler is in fact a collection of different tools set up to
    1.27 -tightly work together. The tools are arranged in a way that they are
    1.28 -chained, in a kind of cascade, where the output from one becomes the
    1.29 -input to another one, to ultimately produce the actual binary code that
    1.30 -runs on a machine. So, we call this arrangement a "toolchain". When
    1.31 -a toolchain is meant to generate code for a machine different from the
    1.32 -machine it runs on, this is called a cross-toolchain.
    1.33 -
    1.34 -
    1.35 -So, what are those components in a toolchain? |
    1.36 -----------------------------------------------+
    1.37 -
    1.38 -The components that play a role in the toolchain are first and foremost
    1.39 -the compiler itself. The compiler turns source code (in C, C++, whatever)
    1.40 -into assembly code. The compiler of choice is the GNU compiler collection,
    1.41 -well known as 'gcc'.
    1.42 -
    1.43 -The assembly code is translated by the assembler into object code.
    1.44 -This is done by the binary utilities, such as the GNU 'binutils'.
    1.45 -
    1.46 -Once the different object code files have been generated, they have to be
    1.47 -aggregated together to form the final executable binary. This is called
    1.48 -linking, and is achieved with the use of a linker. The GNU 'binutils' also
    1.49 -come with a linker.
    1.50 -
    1.51 -So far, we get a complete toolchain that is capable of turning source code
    1.52 -into actual executable code. Depending on the Operating System, or the lack
    1.53 -thereof, running on the target, we also need the C library. The C library
    1.54 -provides a standard abstraction layer that performs basic tasks (such as
    1.55 -allocating memory, printing output on a terminal, managing file access...).
    1.56 -There are many C libraries, each targeted to different systems. For the
    1.57 -Linux /desktop/, there is glibc or eglibc or even uClibc; for embedded Linux,
    1.58 -you have a choice of eglibc or uClibc, while for systems without an Operating
    1.59 -System, you may use newlib, dietlibc, or even none at all. There are a few other
    1.60 -C libraries, but they are not as widely used, and/or are targeted to very
    1.61 -specific needs (e.g. klibc is a very small subset of the C library aimed at
    1.62 -building constrained initial ramdisks).
    1.63 -
    1.64 -Under Linux, the C library needs to know the API to the kernel to decide
    1.65 -what features are present, and if needed, what emulation to include for
    1.66 -missing features. That API is provided by the kernel headers. Note: this
    1.67 -is Linux-specific (and potentially applies to a very few others); the C
    1.68 -libraries on other OSes do not need the kernel headers.
    1.69 -
    1.70 -
    1.71 -And now, how are all these components chained together? |
    1.72 ---------------------------------------------------------+
    1.73 -
    1.74 -So far, all major components have been covered, but there is a specific order
    1.75 -in which they need to be built. Here we see what the dependencies are, starting
    1.76 -with the compiler we want to ultimately use. We call that compiler the
    1.77 -'final compiler'.
    1.78 -
    1.79 -  - the final compiler needs the C library, to know how to use it,
    1.80 -but:
    1.81 -  - building the C library requires a compiler
    1.82 -
    1.83 -A needs B which needs A. This is the classic chicken'n'egg problem... This
    1.84 -is solved by building a stripped-down compiler that does not need the C
    1.85 -library, but is capable of building it. We call it a bootstrap, initial, or
    1.86 -core compiler. So here is the new dependency list:
    1.87 -
    1.88 -  - the final compiler needs the C library, to know how to use it,
    1.89 -  - building the C library requires a core compiler
    1.90 -but:
    1.91 -  - the core compiler needs the C library headers and start files, to know
    1.92 -    how to use the C library
    1.93 -
    1.94 -B needs C which needs B. Chicken'n'egg, again. To solve this one, we will
    1.95 -need to build a C library that will only install its headers and start
    1.96 -files. The start files are a very few files that gcc needs to be able to
    1.97 -turn on thread local storage (TLS) on an NPTL system. So now we have:
    1.98 -
    1.99 -  - the final compiler needs the C library, to know how to use it,
   1.100 -  - building the C library requires a core compiler
   1.101 -  - the core compiler needs the C library headers and start files, to know
   1.102 -    how to use the C library
   1.103 -but:
   1.104 -  - building the start files requires a compiler
   1.105 -
   1.106 -Geez... C needs D which needs C, yet again. So we need to build a yet
   1.107 -simpler compiler, that needs neither the headers nor the start
   1.108 -files. This compiler is also a bootstrap, initial or core compiler. In order
   1.109 -to differentiate the two core compilers, let's call that one "core pass 1",
   1.110 -and the former one "core pass 2". The dependency list becomes:
   1.111 -
   1.112 -  - the final compiler needs the C library, to know how to use it,
   1.113 -  - building the C library requires a compiler
   1.114 -  - the core pass 2 compiler needs the C library headers and start files,
   1.115 -    to know how to use the C library
   1.116 -  - building the start files requires a compiler
   1.117 -  - we need a core pass 1 compiler
   1.118 -
   1.119 -And as we said earlier, the C library also requires the kernel headers.
   1.120 -The kernel headers themselves have no prerequisite, so end of story in this
   1.121 -case:
   1.122 -
   1.123 -  - the final compiler needs the C library, to know how to use it,
   1.124 -  - building the C library requires a core compiler
   1.125 -  - the core pass 2 compiler needs the C library headers and start files,
   1.126 -    to know how to use the C library
   1.127 -  - building the start files requires a compiler and the kernel headers
   1.128 -  - we need a core pass 1 compiler
   1.129 -
   1.130 -We need to add a few new requirements. The moment we compile code for the
   1.131 -target, we need the assembler and the linker. Such code is, of course,
   1.132 -built from the C library, so we need to build the binutils before both the
   1.133 -C library start files and the complete C library itself. Also, some code
   1.134 -in gcc will turn out to run on the target as well. Luckily, the binutils
   1.135 -themselves have no prerequisite. So, our dependency chain is as follows:
   1.136 -
   1.137 -  - the final compiler needs the C library, to know how to use it, and the
   1.138 -    binutils
   1.139 -  - building the C library requires a core pass 2 compiler and the binutils
   1.140 -  - the core pass 2 compiler needs the C library headers and start files,
   1.141 -    to know how to use the C library, and the binutils
   1.142 -  - building the start files requires a compiler, the kernel headers and the
   1.143 -    binutils
   1.144 -  - the core pass 1 compiler needs the binutils
   1.145 -
   1.146 -Which translates into this order for building the components:
   1.147 -
   1.148 -  1 binutils
   1.149 -  2 core pass 1 compiler
   1.150 -  3 kernel headers
   1.151 -  4 C library headers and start files
   1.152 -  5 core pass 2 compiler
   1.153 -  6 complete C library
   1.154 -  7 final compiler
   1.155 -
   1.156 -Yes! :-) But are we done yet?
   1.157 -
   1.158 -In fact, no, there are still missing dependencies. As far as the tools
   1.159 -themselves are concerned, we do not need anything else.
   1.160 -
   1.161 -But gcc has a few pre-requisites. It relies on a few external libraries to
   1.162 -perform some non-trivial tasks (such as handling complex numbers in
   1.163 -constants...). There are a few options to build those libraries. First, one
   1.164 -may think of relying on a Linux distribution to provide those libraries. Alas,
   1.165 -they were not widely available until very, very recently. So, if the distro
   1.166 -is not too recent, chances are that we will have to build those libraries
   1.167 -(which we do below). The affected libraries are:
   1.168 -
   1.169 -  - the GNU Multiple Precision Arithmetic Library, GMP
   1.170 -  - the C library for multiple-precision floating-point computations with
   1.171 -    correct rounding, MPFR
   1.172 -  - the C library for the arithmetic of complex numbers, MPC
   1.173 -
   1.174 -The dependencies for those libraries are:
   1.175 -
   1.176 -  - MPC requires GMP and MPFR
   1.177 -  - MPFR requires GMP
   1.178 -  - GMP has no pre-requisite
   1.179 -
   1.180 -So, the build order becomes:
   1.181 -
   1.182 -  1 GMP
   1.183 -  2 MPFR
   1.184 -  3 MPC
   1.185 -  4 binutils
   1.186 -  5 core pass 1 compiler
   1.187 -  6 kernel headers
   1.188 -  7 C library headers and start files
   1.189 -  8 core pass 2 compiler
   1.190 -  9 complete C library
   1.191 - 10 final compiler
   1.192 -
   1.193 -Yes! Or yet some more?
   1.194 -
   1.195 -This is now sufficient to build a functional toolchain. So if you've had
   1.196 -enough for now, you can stop here. Or if you are curious, you can continue
   1.197 -reading.
   1.198 -
   1.199 -gcc can also make use of a few other external libraries. These additional,
   1.200 -optional libraries are used to enable advanced features in gcc, such as
   1.201 -loop optimisation (GRAPHITE) and Link Time Optimisation (LTO). If you want
   1.202 -to use these, you'll need three additional libraries:
   1.203 -
   1.204 -To enable GRAPHITE:
   1.205 -  - the Parma Polyhedra Library, PPL
   1.206 -  - the Chunky Loop Generator, using the PPL backend, CLooG/PPL
   1.207 -
   1.208 -To enable LTO:
   1.209 -  - the ELF object file access library, libelf
   1.210 -
   1.211 -The dependencies for those libraries are:
   1.212 -
   1.213 -  - PPL requires GMP
   1.214 -  - CLooG/PPL requires GMP and PPL
   1.215 -  - libelf has no pre-requisites
   1.216 -
   1.217 -The list now looks like (optional libs with a *):
   1.218 -
   1.219 -  1 GMP
   1.220 -  2 MPFR
   1.221 -  3 MPC
   1.222 -  4 PPL *
   1.223 -  5 CLooG/PPL *
   1.224 -  6 libelf *
   1.225 -  7 binutils
   1.226 -  8 core pass 1 compiler
   1.227 -  9 kernel headers
   1.228 - 10 C library headers and start files
   1.229 - 11 core pass 2 compiler
   1.230 - 12 complete C library
   1.231 - 13 final compiler
   1.232 -
   1.233 -This list is now complete! Wouhou! :-)
   1.234 -
   1.235 -
   1.236 -So the list is complete. But why does crosstool-NG have more steps? |
   1.237 ---------------------------------------------------------------------+
   1.238 -
   1.239 -The thirteen steps above are the necessary steps, from a theoretical point
   1.240 -of view. In reality, though, there are small differences; there are three
   1.241 -different reasons for the additional steps in crosstool-NG.
   1.242 -
   1.243 -First, the GNU binutils do not support some kinds of output. It is not possible
   1.244 -to generate 'flat' binaries with binutils, so we have to use another component
   1.245 -that adds this support: elf2flt. Another binary utility called sstrip has been
   1.246 -added. It allows for super-stripping the target binaries, although it is not
   1.247 -strictly required.
   1.248 -
   1.249 -Second, some C libraries require another step after the compiler is built, to
   1.250 -install additional stuff. This is the case for mingw and newlib. Hence the
   1.251 -libc_finish step.
   1.252 -
   1.253 -Third, crosstool-NG can also build some additional debug utilities to run on
   1.254 -the target. This is where we build, for example, the cross-gdb, the gdbserver
   1.255 -and the native gdb (the last two run on the target, the first runs on the
   1.256 -same machine as the toolchain). The others (strace, ltrace, DUMA and dmalloc)
   1.257 -are absolutely not related to the toolchain, but are nice-to-have stuff that
   1.258 -can greatly help when developing, so are included as goodies (and they are
   1.259 -quite easy to build, so it's OK; more complex stuff is not worth the effort
   1.260 -to include in crosstool-NG).
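
The dependency reasoning that leads to the ten-step build order can also be
checked mechanically with a topological sort. The following is only a sketch,
not something crosstool-NG does: the DEPENDS_ON table is one reading of the
dependency lists in the text, the optional libraries are left out, and the
edges from the core pass 1 compiler to GMP, MPFR and MPC are an assumption
drawn from the statement that gcc relies on those libraries. The function
names are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each component maps to the components that must be built before it.
# This table is an interpretation of the text, not crosstool-NG data.
DEPENDS_ON = {
    "GMP": set(),
    "MPFR": {"GMP"},
    "MPC": {"GMP", "MPFR"},
    "binutils": set(),
    # Assumption: every gcc pass needs the three companion libraries.
    "core pass 1 compiler": {"binutils", "GMP", "MPFR", "MPC"},
    "kernel headers": set(),
    "C library headers and start files": {
        "core pass 1 compiler", "kernel headers", "binutils",
    },
    "core pass 2 compiler": {"C library headers and start files", "binutils"},
    "complete C library": {"core pass 2 compiler", "binutils"},
    "final compiler": {"complete C library", "binutils"},
}


def build_order(depends_on):
    """Return one build order in which every dependency comes first."""
    return list(TopologicalSorter(depends_on).static_order())


def respects_dependencies(order, depends_on):
    """Check that each component appears after all of its dependencies."""
    position = {name: index for index, name in enumerate(order)}
    return all(
        position[dep] < position[name]
        for name, deps in depends_on.items()
        for dep in deps
    )


if __name__ == "__main__":
    for step, name in enumerate(build_order(DEPENDS_ON), start=1):
        print(f"{step:3d} {name}")
```

Several orders satisfy the same constraints (the kernel headers, for example,
can be installed at any point before the C library headers and start files),
so the sorter may print a sequence that differs from the list in the text
while being just as valid.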