1.1 --- a/docs/9 - Build procedure overview.txt Thu Feb 24 22:38:08 2011 +0100
1.2 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000
1.3 @@ -1,257 +0,0 @@
1.4 -File.........: 9 - Build procedure overview.txt
1.5 -Copyrigth....: (C) 2011 Yann E. MORIN <yann.morin.1998@anciens.enib.fr>
1.6 -License......: Creative Commons Attribution Share Alike (CC-by-sa), v2.5
1.7 -
1.8 -
1.9 -How is a toolchain constructed? /
1.10 -_______________________________/
1.11 -
1.12 -This is the result of a discussion with Francesco Turco <mail@fturco.org>:
1.13 - http://sourceware.org/ml/crossgcc/2011-01/msg00060.html
1.14 -
1.15 -Francesco has a nice tutorial for beginners, along with a sample, step-by-
1.16 -step procedure to build a toolchain for an ARM target from an x86_64 Debian
1.17 -host:
1.18 - http://fturco.org/wiki/doku.php?id=debian:cross-compiler
1.19 -
1.20 -Thank you Francesco for initiating this!
1.21 -
1.22 -
1.23 -I want a cross-compiler! What is this toolchain you're speaking about? |
1.24 ------------------------------------------------------------------------+
1.25 -
1.26 -A cross-compiler is in fact a collection of different tools set up to
1.27 -tightly work together. The tools are arranged in a way that they are
1.28 -chained, in a kind of cascade, where the output from one becomes the
1.29 -input to another one, to ultimately produce the actual binary code that
1.30 -runs on a machine. So, we call this arrangement a "toolchain". When
1.31 -a toolchain is meant to generate code for a machine different from the
1.32 -machine it runs on, this is called a cross-toolchain.
1.33 -
1.34 -
1.35 -So, what are those components in a toolchain? |
1.36 -----------------------------------------------+
1.37 -
1.38 -The components that play a role in the toolchain are first and foremost
1.39 -the compiler itself. The compiler turns source code (in C, C++, whatever)
1.40 -into assembly code. The compiler of choice is the GNU compiler collection,
1.41 -well known as 'gcc'.
1.42 -
1.43 -The assembly code is interpreted by the assembler to generate object code.
1.44 -This is done by the binary utilities, such as the GNU 'binutils'.
1.45 -
1.46 -Once the different object code files have been generated, they got to get
1.47 -aggregated together to form the final executable binary. This is called
1.48 -linking, and is achieved with the use of a linker. The GNU 'binutils' also
1.49 -come with a linker.
1.50 -
1.51 -So far, we get a complete toolchain that is capable of turning source code
1.52 -into actual executable code. Depending on the Operating System, or the lack
1.53 -thereof, running on the target, we also need the C library. The C library
1.54 -provides a standard abstraction layer that performs basic tasks (such as
1.55 -allocating memory, printing output on a terminal, managing file access...).
1.56 -There are many C libraries, each targetted to different systems. For the
1.57 -Linux /desktop/, there is glibc or eglibc or ven uClibc, for embeded Linux,
1.58 -you have a choice of eglibc or uClibc, while for system without an Operating
1.59 -System, you may use newlib, dietlibc, or even none at all. There a few other
1.60 -C libraries, but they are not as widely used, and/or are targetted to very
1.61 -specific needs (eg. klibc is a very small subset of the C library aimed at
1.62 -building contrained initial ramdisks).
1.63 -
1.64 -Under Linux, the C library needs to know the API to the kernel to decide
1.65 -what features are present, and if needed, what emulation to include for
1.66 -missing features. That API is provided by the kernel headers. Note: this
1.67 -is Linux-specific (and potentially a very few others), the C library on
1.68 -other OSes do not need the kernel headers.
1.69 -
1.70 -
1.71 -And now, how do all these components chained together? |
1.72 --------------------------------------------------------+
1.73 -
1.74 -So far, all major components have been covered, but yet there is a specific
1.75 -order they need to be built. Here we see what the dependencies are, starting
1.76 -with the compiler we want to ultimately use. We call that compiler the
1.77 -'final compiler'.
1.78 -
1.79 - - the final compiler needs the C library, to know how to use it,
1.80 -but:
1.81 - - building the C library requires a compiler
1.82 -
1.83 -A needs B which needs A. This is the classic chicken'n'egg problem... This
1.84 -is solved by building a stripped-down compiler that does not need the C
1.85 -library, but is capable of building it. We call it a bootstrap, initial, or
1.86 -core compiler. So here is the new dependency list:
1.87 -
1.88 - - the final compiler needs the C library, to know how to use it,
1.89 - - building the C library requires a core compiler
1.90 -but:
1.91 - - the core compiler needs the C library headers and start files, to know
1.92 - how to use the C library
1.93 -
1.94 -B needs C which needs B. Chicken'n'egg, again. To solve this one, we will
1.95 -need to build a C library that will only install its headers and start
1.96 -files. The start files are a very few files that gcc needs to be able to
1.97 -turn on thread local storage (TLS) on an NPTL system. So now we have:
1.98 -
1.99 - - the final compiler needs the C library, to know how to use it,
1.100 - - building the C library requires a core compiler
1.101 - - the core compiler needs the C library headers and start files, to know
1.102 - how to use the C library
1.103 -but:
1.104 - - building the start files require a compiler
1.105 -
1.106 -Geez... C needs D which needs C, yet again. So we need to build a yet
1.107 -simpler compiler, that does not need the headers and does need the start
1.108 -files. This compiler is also a bootstrap, initial or core compiler. In order
1.109 -to differentiate the two core compilers, let's call that one "core pass 1",
1.110 -and the former one "core pass 2". The dependency list becomes:
1.111 -
1.112 - - the final compiler needs the C library, to know how to use it,
1.113 - - building the C library requires a compiler
1.114 - - the core pass 2 compiler needs the C library headers and start files,
1.115 - to know how to use the C library
1.116 - - building the start files requires a compiler
1.117 - - we need a core pass 1 compiler
1.118 -
1.119 -And as we said earlier, the C library also requires the kernel headers.
1.120 -There is no requirement for the kernel headers, so end of story in this
1.121 -case:
1.122 -
1.123 - - the final compiler needs the C library, to know how to use it,
1.124 - - building the C library requires a core compiler
1.125 - - the core pass 2 compiler needs the C library headers and start files,
1.126 - to know how to use the C library
1.127 - - building the start files requires a compiler and the kernel headers
1.128 - - we need a core pass 1 compiler
1.129 -
1.130 -We need to add a few new requirements. The moment we compile code for the
1.131 -target, we need the assembler and the linker. Such code is, of course,
1.132 -built from the C library, so we need to build the binutils before the C
1.133 -library start files, and the complete C library itself. Also, some code
1.134 -in gcc will turn to run on the target as well. Luckily, there is no
1.135 -requirement for the binutils. So, our dependency chain is as follows:
1.136 -
1.137 - - the final compiler needs the C library, to know how to use it, and the
1.138 - binutils
1.139 - - building the C library requires a core pass 2 compiler and the binutils
1.140 - - the core pass 2 compiler needs the C library headers and start files,
1.141 - to know how to use the C library, and the binutils
1.142 - - building the start files requires a compiler, the kernel headers and the
1.143 - binutils
1.144 - - the core pass 1 compiler needs the binutils
1.145 -
1.146 -Which turns in this order to build the components:
1.147 -
1.148 - 1 binutils
1.149 - 2 core pass 1 compiler
1.150 - 3 kernel headers
1.151 - 4 C library headers and start files
1.152 - 5 core pass 2 compiler
1.153 - 6 complete C library
1.154 - 7 final compiler
1.155 -
1.156 -Yes! :-) But are we done yet?
1.157 -
1.158 -In fact, no, there are still missing dependencies. As far as the tools
1.159 -themselves are involved, we do not need anything else.
1.160 -
1.161 -But gcc has a few pre-requisites. It relies on a few external libraries to
1.162 -perform some non-trivial tasks (such as handling complex numbers in
1.163 -constants...). There are a few options to build those libraries. First, one
1.164 -may think to rely on a Linux distribution to provide those libraries. Alas,
1.165 -they were not widely available until very, very recently. So, if the distro
1.166 -is not too recent, chances are that we will have to build those libraries
1.167 -(which we do below). The affected libraries are:
1.168 -
1.169 - - the GNU Multiple Precision Arithmetic Library, GMP
1.170 - - the C library for multiple-precision floating-point computations with
1.171 - correct rounding, MPFR
1.172 - - the C library for the arithmetic of complex numbers, MPC
1.173 -
1.174 -The dependencies for those liraries are:
1.175 -
1.176 - - MPC requires GMP and MPFR
1.177 - - MPFR requires GMP
1.178 - - GMP has no pre-requisite
1.179 -
1.180 -So, the build order becomes:
1.181 -
1.182 - 1 GMP
1.183 - 2 MPFR
1.184 - 3 MPC
1.185 - 4 binutils
1.186 - 5 core pass 1 compiler
1.187 - 6 kernel headers
1.188 - 7 C library headers and start files
1.189 - 8 core pass 2 compiler
1.190 - 9 complete C library
1.191 - 10 final compiler
1.192 -
1.193 -Yes! Or yet some more?
1.194 -
1.195 -This is now sufficient to build a functional toolchain. So if you've had
1.196 -enough for now, you can stop here. Or if you are curious, you can continue
1.197 -reading.
1.198 -
1.199 -gcc can also make use of a few other external libraries. These additional,
1.200 -optional libraries are used to enable advanced features in gcc, such as
1.201 -loop optimisation (GRAPHITE) and Link Time Optimisation (LTO). If you want
1.202 -to use these, you'll need three additional libraries:
1.203 -
1.204 -To enable GRAPHITE:
1.205 - - the Parma Polyhedra Library, PPL
1.206 - - the Chunky Loop Generator, using the PPL backend, CLooG/PPL
1.207 -
1.208 -To enable LTO:
1.209 - - the ELF object file access library, libelf
1.210 -
1.211 -The depencies for those libraries are:
1.212 -
1.213 - - PPL requires GMP
1.214 - - CLooG/PPL requires GMP and PPL
1.215 - - libelf has no pre-requisites
1.216 -
1.217 -The list now looks like (optional libs with a *):
1.218 -
1.219 - 1 GMP
1.220 - 2 MPFR
1.221 - 3 MPC
1.222 - 4 PPL *
1.223 - 5 CLooG/PPL *
1.224 - 6 libelf *
1.225 - 7 binutils
1.226 - 8 core pass 1 compiler
1.227 - 9 kernel headers
1.228 - 10 C library headers and start files
1.229 - 11 core pass 2 compiler
1.230 - 12 complete C library
1.231 - 13 final compiler
1.232 -
1.233 -This list is now complete! Wouhou! :-)
1.234 -
1.235 -
1.236 -So the list is complete. But why does crosstool-NG have more steps? |
1.237 ---------------------------------------------------------------------+
1.238 -
1.239 -The already thirteen steps are the necessary steps, from a theorical point
1.240 -of view. In reality, though, there are small differences; there are three
1.241 -different reasons for the additional steps in crosstool-NG.
1.242 -
1.243 -First, the GNU binutils do not support some kinds of output. It is not possible
1.244 -to generate 'flat' binaries with binutils, so we have to use another component
1.245 -that adds this support: elf2flt. Another binary utility called sstrip has been
1.246 -added. It allows for super-stripping the target binaries, although it is not
1.247 -strictly required.
1.248 -
1.249 -Second, some C libraries require another step after the compiler is built, to
1.250 -install additional stuff. This is the case for mingw and newlib. Hence the
1.251 -libc_finish step.
1.252 -
1.253 -Third, crosstool-NG can also build some additional debug utilities to run on
1.254 -the target. This is where we build, for example, the cross-gdb, the gdbserver
1.255 -and the native gdb (the last two run on the target, the furst runs on the
1.256 -same machine as the toolchain). The others (strace, ltrace, DUMA and dmalloc)
1.257 -are absolutely not related to the toolchain, but are nice-to-have stuff that
1.258 -can greatly help when developping, so are included as goodies (and they are
1.259 -quite easy to build, so it's OK; more complex stuff is not worth the effort
1.260 -to include in crosstool-NG).
2.1 --- /dev/null Thu Jan 01 00:00:00 1970 +0000
2.2 +++ b/docs/9 - How is a toolchain constructed.txt Sun Feb 27 15:27:54 2011 +0100
2.3 @@ -0,0 +1,257 @@
2.4 +File.........: 9 - How is a toolchain constructed.txt
2.5 +Copyright....: (C) 2011 Yann E. MORIN <yann.morin.1998@anciens.enib.fr>
2.6 +License......: Creative Commons Attribution Share Alike (CC-by-sa), v2.5
2.7 +
2.8 +
2.9 +How is a toolchain constructed? /
2.10 +_______________________________/
2.11 +
2.12 +This is the result of a discussion with Francesco Turco <mail@fturco.org>:
2.13 + http://sourceware.org/ml/crossgcc/2011-01/msg00060.html
2.14 +
2.15 +Francesco has a nice tutorial for beginners, along with a sample, step-by-
2.16 +step procedure to build a toolchain for an ARM target from an x86_64 Debian
2.17 +host:
2.18 + http://fturco.org/wiki/doku.php?id=debian:cross-compiler
2.19 +
2.20 +Thank you Francesco for initiating this!
2.21 +
2.22 +
2.23 +I want a cross-compiler! What is this toolchain you're speaking about? |
2.24 +-----------------------------------------------------------------------+
2.25 +
2.26 +A cross-compiler is in fact a collection of different tools set up to
2.27 +work tightly together. The tools are arranged in a way that they are
2.28 +chained, in a kind of cascade, where the output from one becomes the
2.29 +input to another one, to ultimately produce the actual binary code that
2.30 +runs on a machine. So, we call this arrangement a "toolchain". When
2.31 +a toolchain is meant to generate code for a machine different from the
2.32 +machine it runs on, this is called a cross-toolchain.
2.33 +
2.34 +
2.35 +So, what are those components in a toolchain? |
2.36 +----------------------------------------------+
2.37 +
2.38 +The components that play a role in the toolchain are first and foremost
2.39 +the compiler itself. The compiler turns source code (in C, C++, whatever)
2.40 +into assembly code. The compiler of choice is the GNU Compiler Collection,
2.41 +well known as 'gcc'.
2.42 +
2.43 +The assembly code is translated by the assembler to generate object code.
2.44 +The assembler is part of the binary utilities, such as the GNU 'binutils'.
2.45 +
2.46 +Once the different object code files have been generated, they have to be
2.47 +aggregated together to form the final executable binary. This is called
2.48 +linking, and is achieved with the use of a linker. The GNU 'binutils' also
2.49 +come with a linker.
2.50 +
2.51 +So far, we get a complete toolchain that is capable of turning source code
2.52 +into actual executable code. Depending on the Operating System, or the lack
2.53 +thereof, running on the target, we also need the C library. The C library
2.54 +provides a standard abstraction layer that performs basic tasks (such as
2.55 +allocating memory, printing output on a terminal, managing file access...).
2.56 +There are many C libraries, each targeted at different systems. For the
2.57 +Linux /desktop/, there is glibc or eglibc or even uClibc; for embedded Linux,
2.58 +you have a choice of eglibc or uClibc; and for systems without an Operating
2.59 +System, you may use newlib, dietlibc, or even none at all. There are a few
2.60 +other C libraries, but they are less widely used, and/or are targeted at very
2.61 +specific needs (e.g. klibc is a very small subset of the C library aimed at
2.62 +building constrained initial ramdisks).
2.63 +
2.64 +Under Linux, the C library needs to know the API to the kernel to decide
2.65 +what features are present, and if needed, what emulation to include for
2.66 +missing features. That API is provided by the kernel headers. Note: this
2.67 +is specific to Linux (and potentially a very few others); the C libraries on
2.68 +other OSes do not need the kernel headers.
2.69 +
2.70 +
2.71 +And now, how are all these components chained together? |
2.72 +--------------------------------------------------------+
2.73 +
2.74 +So far, all major components have been covered, but there is a specific
2.75 +order in which they need to be built. Here we see what the dependencies
2.76 +are, starting with the compiler we want to ultimately use. We call that
2.77 +compiler the 'final compiler'.
2.78 +
2.79 + - the final compiler needs the C library, to know how to use it,
2.80 +but:
2.81 + - building the C library requires a compiler
2.82 +
2.83 +A needs B which needs A. This is the classic chicken'n'egg problem... This
2.84 +is solved by building a stripped-down compiler that does not need the C
2.85 +library, but is capable of building it. We call it a bootstrap, initial, or
2.86 +core compiler. So here is the new dependency list:
2.87 +
2.88 + - the final compiler needs the C library, to know how to use it,
2.89 + - building the C library requires a core compiler
2.90 +but:
2.91 + - the core compiler needs the C library headers and start files, to know
2.92 + how to use the C library
2.93 +
2.94 +B needs C which needs B. Chicken'n'egg, again. To solve this one, we will
2.95 +need to build a C library that will only install its headers and start
2.96 +files. The start files are a handful of files that gcc needs to be able to
2.97 +turn on thread local storage (TLS) on an NPTL system. So now we have:
2.98 +
2.99 + - the final compiler needs the C library, to know how to use it,
2.100 + - building the C library requires a core compiler
2.101 + - the core compiler needs the C library headers and start files, to know
2.102 + how to use the C library
2.103 +but:
2.104 + - building the start files requires a compiler
2.105 +
2.106 +Geez... C needs D which needs C, yet again. So we need to build a yet
2.107 +simpler compiler, that needs neither the headers nor the start files.
2.108 +This compiler is also a bootstrap, initial or core compiler. In order
2.109 +to differentiate the two core compilers, let's call that one "core pass 1",
2.110 +and the former one "core pass 2". The dependency list becomes:
2.111 +
2.112 + - the final compiler needs the C library, to know how to use it,
2.113 + - building the C library requires a core pass 2 compiler
2.114 + - the core pass 2 compiler needs the C library headers and start files,
2.115 + to know how to use the C library
2.116 + - building the start files requires a core pass 1 compiler
2.117 + - we need a core pass 1 compiler
2.118 +
2.119 +And as we said earlier, the C library also requires the kernel headers.
2.120 +The kernel headers themselves have no prerequisite, so end of story in
2.121 +this case:
2.122 +
2.123 + - the final compiler needs the C library, to know how to use it,
2.124 + - building the C library requires a core pass 2 compiler
2.125 + - the core pass 2 compiler needs the C library headers and start files,
2.126 + to know how to use the C library
2.127 + - building the start files needs a core pass 1 compiler and the kernel headers
2.128 + - we need a core pass 1 compiler
2.129 +
2.130 +We need to add a few new requirements. The moment we compile code for the
2.131 +target, we need the assembler and the linker. Such code is, of course,
2.132 +built from the C library, so we need to build the binutils before the C
2.133 +library start files, and the complete C library itself. Also, some code
2.134 +in gcc will run on the target as well. Luckily, the binutils themselves
2.135 +have no prerequisite. So, our dependency chain is as follows:
2.136 +
2.137 + - the final compiler needs the C library, to know how to use it, and the
2.138 + binutils
2.139 + - building the C library requires a core pass 2 compiler and the binutils
2.140 + - the core pass 2 compiler needs the C library headers and start files,
2.141 + to know how to use the C library, and the binutils
2.142 + - building the start files requires a core pass 1 compiler, the kernel
2.143 + headers and the binutils
2.144 + - the core pass 1 compiler needs the binutils
2.145 +
2.146 +Which turns into this order to build the components:
2.147 +
2.148 + 1 binutils
2.149 + 2 core pass 1 compiler
2.150 + 3 kernel headers
2.151 + 4 C library headers and start files
2.152 + 5 core pass 2 compiler
2.153 + 6 complete C library
2.154 + 7 final compiler
2.155 +
2.156 +Yes! :-) But are we done yet?
2.157 +
2.158 +In fact, no, there are still missing dependencies. As far as the tools
2.159 +themselves are concerned, we do not need anything else.
2.160 +
2.161 +But gcc has a few pre-requisites. It relies on a few external libraries to
2.162 +perform some non-trivial tasks (such as handling complex numbers in
2.163 +constants...). There are a few options to get those libraries. First, one
2.164 +may think of relying on a Linux distribution to provide them. Alas,
2.165 +they were not widely available until very, very recently. So, if the distro
2.166 +is not recent enough, chances are that we will have to build those libraries
2.167 +(which we do below). The affected libraries are:
2.168 +
2.169 + - the GNU Multiple Precision Arithmetic Library, GMP
2.170 + - the C library for multiple-precision floating-point computations with
2.171 + correct rounding, MPFR
2.172 + - the C library for the arithmetic of complex numbers, MPC
2.173 +
2.174 +The dependencies for those libraries are:
2.175 +
2.176 + - MPC requires GMP and MPFR
2.177 + - MPFR requires GMP
2.178 + - GMP has no pre-requisite
2.179 +
2.180 +So, the build order becomes:
2.181 +
2.182 + 1 GMP
2.183 + 2 MPFR
2.184 + 3 MPC
2.185 + 4 binutils
2.186 + 5 core pass 1 compiler
2.187 + 6 kernel headers
2.188 + 7 C library headers and start files
2.189 + 8 core pass 2 compiler
2.190 + 9 complete C library
2.191 + 10 final compiler
2.192 +
2.193 +Yes! Or yet some more?
2.194 +
2.195 +This is now sufficient to build a functional toolchain. So if you've had
2.196 +enough for now, you can stop here. Or if you are curious, you can continue
2.197 +reading.
2.198 +
2.199 +gcc can also make use of a few other external libraries. These additional,
2.200 +optional libraries are used to enable advanced features in gcc, such as
2.201 +loop optimisation (GRAPHITE) and Link Time Optimisation (LTO). If you want
2.202 +to use these, you'll need three additional libraries:
2.203 +
2.204 +To enable GRAPHITE:
2.205 + - the Parma Polyhedra Library, PPL
2.206 + - the Chunky Loop Generator, using the PPL backend, CLooG/PPL
2.207 +
2.208 +To enable LTO:
2.209 + - the ELF object file access library, libelf
2.210 +
2.211 +The dependencies for those libraries are:
2.212 +
2.213 + - PPL requires GMP
2.214 + - CLooG/PPL requires GMP and PPL
2.215 + - libelf has no pre-requisites
2.216 +
2.217 +The list now looks like (optional libs with a *):
2.218 +
2.219 + 1 GMP
2.220 + 2 MPFR
2.221 + 3 MPC
2.222 + 4 PPL *
2.223 + 5 CLooG/PPL *
2.224 + 6 libelf *
2.225 + 7 binutils
2.226 + 8 core pass 1 compiler
2.227 + 9 kernel headers
2.228 + 10 C library headers and start files
2.229 + 11 core pass 2 compiler
2.230 + 12 complete C library
2.231 + 13 final compiler
2.232 +
2.233 +This list is now complete! Wouhou! :-)
2.234 +
2.235 +
2.236 +So the list is complete. But why does crosstool-NG have more steps? |
2.237 +--------------------------------------------------------------------+
2.238 +
2.239 +The thirteen steps above are the necessary steps, from a theoretical point
2.240 +of view. In reality, though, there are small differences; there are three
2.241 +different reasons for the additional steps in crosstool-NG.
2.242 +
2.243 +First, the GNU binutils do not support some kinds of output. It is not possible
2.244 +to generate 'flat' binaries with binutils, so we have to use another component
2.245 +that adds this support: elf2flt. Another binary utility called sstrip has been
2.246 +added. It allows for super-stripping the target binaries, although it is not
2.247 +strictly required.
2.248 +
2.249 +Second, some C libraries require another step after the compiler is built, to
2.250 +install additional stuff. This is the case for mingw and newlib. Hence the
2.251 +libc_finish step.
2.252 +
2.253 +Third, crosstool-NG can also build some additional debug utilities to run on
2.254 +the target. This is where we build, for example, the cross-gdb, the gdbserver
2.255 +and the native gdb (the last two run on the target, the first runs on the
2.256 +same machine as the toolchain). The others (strace, ltrace, DUMA and dmalloc)
2.257 +are absolutely not related to the toolchain, but are nice-to-have stuff that
2.258 +can greatly help when developing, so are included as goodies (and they are
2.259 +quite easy to build, so it's OK; more complex stuff is not worth the effort
2.260 +to include in crosstool-NG).