Embedded Linux Booting Process (Multi-Stage Bootloaders, Kernel, Filesystem)
Jun 10, 2021
Hello everyone, welcome to the Embedded Linux IoT course for Red-Blue Teams at Pentester Academy. My name is Vivek Ramachandran and I am the course instructor. In this video we will understand the boot process of an embedded device in great detail, so let's get started.

Now, if we were to look at the architecture of an embedded system, we will have the SoC, which is the system on chip, or the microcontroller. This is really the brain of the system. The SoC will require some kind of persistent storage, such as flash storage, where the OS or the programs you have to run will survive reboots. It will also require RAM, where you can load programs at runtime and run them. Apart from this, depending on the functionality of the embedded system, there may be different peripheral ports, debug ports, GPIO pins, etc. available.

It's important to note that most modern SoCs generally support dozens of different external peripheral devices; however, the number of pins they have is sometimes not enough to interact with all the devices they support. This is where pin multiplexing comes in: we configure the SoC to use certain pins to talk only to certain peripheral devices.

Okay, now let's move ahead and understand the Linux boot process from the perspective of embedded devices. The first stage is actually the ROM bootloader.
Now, when we go ahead and power up a board, control goes to a location called the reset vector. This is something the manufacturer has already pre-programmed. At the reset vector, usually the SoC's ROM bootloader exists, and at that point the ROM bootloader starts running. Normally the task of the ROM bootloader is to configure some basic hardware and then fetch the first-stage bootloader from a boot device. This boot device could be on a network, could be over a memory bus, could be USB, could be an SD card. Once the ROM bootloader has gone through the different boot devices and located the first-stage bootloader, it loads the first-stage bootloader, normally into the internal memory of the SoC, and then passes control to it. The first-stage bootloader will then search for the second-stage bootloader, which is loaded into RAM, and the first-stage bootloader passes control to the second-stage bootloader, which then starts running.

Now hold on. Note that I'm always saying "pass control" instead of "call", and this is very important to distinguish. When you say pass control, what it really means is that the previous program that was running ceases to exist after that. When you say call, it means that when the next stage finishes running, control goes back to the previous stage, which doesn't happen at all with bootloaders. So when we say that the ROM bootloader passes control to the first-stage bootloader, that means the ROM bootloader will not run again until the next reboot.
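The difference between passing control and calling can be modeled in a few lines of Python. This is only a toy sketch: the stage functions and the `run_chain` helper are made up for illustration, and a real bootloader jumps to a load address instead of returning a function. The point it shows is that each stage hands over to the next one and is never resumed.

```python
# Toy model of "pass control": each stage hands over to the next one and is
# never resumed. Stage names and run_chain are illustrative, not a real API.

def rom_bootloader(trace):
    trace.append("ROM bootloader")
    return first_stage          # hand over; nothing runs here afterwards

def first_stage(trace):
    trace.append("first-stage bootloader")
    return second_stage

def second_stage(trace):
    trace.append("second-stage bootloader")
    return kernel

def kernel(trace):
    trace.append("kernel")
    return None                 # end of the chain in this toy model

def run_chain(entry):
    """Run stages one after another; no stage ever regains control."""
    trace = []
    stage = entry
    while stage is not None:
        stage = stage(trace)
    return trace

boot_trace = run_chain(rom_bootloader)
print(" -> ".join(boot_trace))
```

With a call-style design, `rom_bootloader` would still be waiting on the stack when the kernel runs; here it is long gone by then.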
Similarly, the first-stage bootloader passes control to the second stage and then ceases to exist. Depending on the embedded system, there could be just one stage or multiple stages. This all depends on the architecture of that embedded system and how many stages it requires to load a powerful bootloader like U-Boot. We will get into that in the context of the BeagleBone Black in a moment.

Now, the second-stage bootloader's primary responsibility is to load the kernel and the device tree into RAM. It looks at its own configuration, which could be statically embedded or could be some kind of external configuration file, and uses that to start the kernel and pass it the boot arguments. That's it: at that point the second-stage bootloader ceases to exist and the Linux kernel takes over. The kernel in turn will initialize the different hardware components, then look at the boot arguments passed by the bootloader to find the location of the root filesystem, and then mount the root filesystem. Once it does so successfully, it looks for the init process in that filesystem, which is the first userspace process, with PID 1. Once invoked, the init process in turn looks at its configuration files and starts other userspace processes.

I think this is a very good summary of how the boot process works for most Linux systems. Now we will focus on understanding how this works for the BeagleBone Black. As we discussed in the previous video, the BeagleBone Black is based on a Texas Instruments SoC from the AM335x series, so let's go to their website and see what it is. I'm first going to go to the BeagleBone website, where they have a link to the AM335x. You can open it in a new tab, scroll down, and then go to the whitepapers page.
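As an aside on the boot arguments mentioned above: the kernel receives them as a single command-line string, and the root filesystem location is one of the entries. Here is a minimal sketch of pulling it out. The example string is hypothetical and not taken from a real board, and `parse_cmdline` is an illustrative helper, not how the kernel itself is written.

```python
# Minimal sketch: pull the root filesystem location out of a kernel command
# line. The example string is hypothetical, not taken from a real board.

def parse_cmdline(cmdline):
    """Split boot arguments into a dict; bare flags map to True."""
    args = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        args[key] = value if sep else True
    return args

example = "console=ttyS0,115200n8 root=/dev/mmcblk0p3 rootwait ro"
args = parse_cmdline(example)
root_device = args.get("root")
# If no init= argument is given, /sbin/init is a common default.
init_path = args.get("init", "/sbin/init")
print(root_device, init_path)
```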
There we are: the AM3358 Sitara ARM Cortex-A8. Great. Open the datasheet in a new tab. There is a ton of documentation here, both for people who build hardware with this SoC and for people who write software for it, so depending on which category you're in, different documents may be relevant to you. The one we're really interested in, from a software perspective, is what they call the TRM, or Technical Reference Manual, and you can clearly see that it is quite large in size. Let's open it up. It is almost all text, so you might expect 2 to 3 MB, right? It is 23 MB. That may not sound that big, but for a document that is all text it is huge: around 5,000 pages. So you can clearly appreciate how complicated it is to write software for such a device, and the size of the team you would need to build an embedded system, its software, etc.

Okay, this is the datasheet. Let's keep both open in tabs and go back to our slides. What I did was copy the functional block diagram of the AM335x from the datasheet. Don't worry: some of these things are going to seem complicated at first glance, but let me assure you that it is actually very simple.
Five thousand pages is intimidating, but most of what's written there is beautiful, very crisp technical documentation. I would love to encourage you to read all of it, but at least try to review some of the relevant parts. SoC documentation is some of the most thorough and beautiful you will ever read in the technical business, simply because if Texas Instruments or any of the other vendors don't document this properly, no one is going to build a board and write software for these SoCs and microcontrollers. So they do an amazing job of documenting.
So we see that this is an ARM Cortex-A8 that can go up to 1 GHz. The reason they say "up to" 1 GHz is that most of the time this is hardware configurable, so depending on certain pin configurations you can tell the processor what clock speed it needs to use. It has an L1 and an L2 cache, but the key thing that catches my eye is of course the ROM and the RAM. Keep in mind that this is inside the SoC, inside that little chip. It has only 176 KB of ROM, the read-only memory, then a 64 KB RAM section, and also a 64 KB shared RAM section. So the net amount of RAM is around 128 KB and the ROM is 176 KB. Clearly not much. This almost takes you back to the days when personal computers ran with 64 KB. Of course, this is more than enough for the SoC to do its job, and we will talk about that in just a moment.

Apart from that, you have a lot of peripheral drivers. You have the display drivers here, and then all the external ports: serial ports such as UART, SPI, I2C, CAN bus, USB, and many others. You also have MMC, SD card, etc., then the GPIO ports, then the memory interface controllers and all that. So this is an extremely complicated device. Note that it says six UART ports, but you will have to multiplex the pins if you really want all six exposed. Keep this in mind: on the BeagleBone Black you may only find one or two UARTs, because that is what the board designers decided to expose using pin multiplexing. Some of this can be reconfigured very easily, but if things are already connected to those pins, you might not want to play around too much.

So the key takeaway is that we have a very small ROM and a very small internal RAM. Now, if we look at the Cortex-A8 memory map, we find the boot ROM we were just talking about. On the previous slide it said 176 KB of ROM, and on this slide you see 128 KB plus 48 KB, which is 176 KB. All of this is the boot ROM, and that is where execution begins after the processor resets. Now, this little boot code we are talking about, which is only 176 KB: how advanced is it, really?
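Going back to the pin multiplexing point about the UARTs for a moment, the constraint can be modeled as a small table. This is a toy sketch: the pin names and the functions listed for each pin are hypothetical, and the real assignments come from the SoC datasheet. It only shows why not every supported peripheral can be exposed at once.

```python
# Toy pin-multiplexing model. Pin names and function assignments are
# hypothetical; the real ones come from the SoC datasheet.

PIN_OPTIONS = {
    "PIN_A": {"uart1_rxd", "i2c1_sda", "gpio_14"},
    "PIN_B": {"uart1_txd", "i2c1_scl", "gpio_15"},
}

def assign(mux, pin, function):
    """Route one function to a pin; a pin carries one function at a time."""
    if function not in PIN_OPTIONS[pin]:
        raise ValueError(f"{pin} cannot carry {function}")
    if pin in mux:
        raise ValueError(f"{pin} already used for {mux[pin]}")
    mux[pin] = function

mux = {}
assign(mux, "PIN_A", "uart1_rxd")
assign(mux, "PIN_B", "uart1_txd")

try:
    assign(mux, "PIN_A", "i2c1_sda")   # same pin, second function
    conflict = False
except ValueError:
    conflict = True
print(mux, conflict)
```

Once the two pins carry the UART, the I2C function that shares them is simply not available on this board configuration.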
I mean, what is its architecture, and what can it really do for us? If this is the first time you are dealing with an SoC at such a low level, you will be completely surprised: the boot ROM is a reasonably complicated, and actually pretty well done, piece of software. If you look at the public ROM code architecture that is available in the Technical Reference Manual, you will see that there is library support. My bad, it is not library support, but support for UART, USB, SPI, and XIP. XIP is execute in place, so these are devices from which you can pretty much run code without having to copy it to RAM first. And of course we have support for MMC/SD cards, which is important to us. They also support FAT, which is interesting, and BOOTP and TFTP, which means you could do a network boot. There's a lot of stuff here.

I would just like to draw your attention to the FAT part, the UART part, and the MMC/SD card part. The key takeaway is that although the boot ROM code is only 176 KB, and it probably won't take up all that space with actual code, it is quite sophisticated: the boot ROM code already supports communication with MMC/SD cards and also supports the FAT filesystem. These are two important things to remember. Let's move on.

If you open up the SoC manuals, and this might be a good training ground in case you're interested later on, you will find all of this. I opened up the Technical Reference Manual, and what I showed you a couple of minutes ago was the memory map here. The boot ROM exists right here at this memory location. You'll find a bunch of other things in the memory map too, so when you hear about memory-mapped I/O, you can now understand how all that works.
So now let's get to the initialization part, which is what happens when the system initializes. This is documented in chapter 26. Now, the AM335x SoC actually supports two different types of device: one is the high-security or HS device, and the other is the general-purpose or GP device. We will use the GP device and not the high-security device. The high-security device, if you notice, does the trusted boot stuff: it requires a digitally signed piece of firmware, or else it won't boot it. So we'll look at the general-purpose device, and in a later series we might look at devices that have trusted boot, the different ways they have been compromised, what the attack vectors are, etc. In this course I'm only going to focus on the general-purpose device.

The SoC manual has a ton of detail on how all of this works, probably a couple hundred pages. What I've done is take the big ideas and put them in the presentation, so all of this comes from the Technical Reference Manual. Your kind instructor Vivek Ramachandran has gone through the pain of reading it. Not the entire TRM, of course, I'm not crazy, but the important, relevant parts, and I've included all that juicy stuff on the slides. The slides should be more than enough, but if you're curious, you can always go back and consult the Technical Reference Manual.
I have also mentioned, on the next slides, the page numbers where I got these flowcharts and diagrams from, so let's continue. The system boots up and there is some basic initialization. Of course, all of this is the SoC ROM code running. After that we go into the main routine, where the stack is set up, and then the ROM bootloader sets the watchdog timer. If you look at the documentation, it states that the watchdog timer gives the ROM bootloader code about three minutes to find a boot device and load the next stage. If it doesn't do this within three minutes, the watchdog will reset the board. After the watchdog timer has been set, the next job is to set up the clocks. This involves the PLL, or phase-locked loop. PLLs, or phase-locked loops (it's almost a tongue twister), are very fundamental, and using them you basically set the system clock.

Once that is done, the ROM bootloader tries to figure out how to start the next stage. The logical question is: where is the next stage? It is definitely not inside the SoC, because the SoC ROM was only 176 KB, so it has to be external to the SoC. So how can the ROM bootloader figure out where to find the next stage?
Once again, the order in which the ROM bootloader searches for a boot device is something the hardware designers have decided by tying different pins high or low, which tells the SoC the order to follow. So here we come to booting: the ROM code sets up the device list based on the boot configuration pins, and for each of those devices it figures out whether it is a peripheral boot or a memory boot. In our case we are going to boot from the SD card, which is the memory boot path. So from here we go down the memory boot branch: if it succeeds, we continue; if not, we retry with the next device; and finally, I think, if everything fails it goes into a dead loop. Then, MMC/SD boot: how is it going to work?
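Before looking at MMC/SD specifically, the device-list walk just described can be sketched as a simple loop. This is a rough model under stated assumptions: the device names, the probe callbacks, and the `find_next_stage` helper are illustrative, and the real ROM code is of course not written in Python. The three-minute figure is the watchdog budget mentioned earlier.

```python
# Sketch of the ROM bootloader's search loop as described above. Device names,
# the probe callbacks, and the timings are illustrative assumptions.
import time

WATCHDOG_SECONDS = 180  # "about three minutes" per the description above

def find_next_stage(device_list, probes, now=time.monotonic):
    """Try each boot device in order; return (device, image) or None."""
    deadline = now() + WATCHDOG_SECONDS
    for device in device_list:
        if now() > deadline:
            return None            # on hardware the watchdog would reset the board
        image = probes[device]()   # peripheral boot or memory boot attempt
        if image is not None:
            return device, image   # success: load it and pass control
    return None                    # nothing bootable: the ROM code ends in a dead loop

# Hypothetical boot order selected by the boot pins.
order = ["uart", "mmc_sd", "usb"]
probes = {
    "uart": lambda: None,              # nothing arrives on the serial port
    "mmc_sd": lambda: b"MLO image",    # SD card holds the first stage
    "usb": lambda: None,
}
result = find_next_stage(order, probes)
print(result)
```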
Assuming the MMC/SD card is in the device list, the ROM bootloader initializes the driver and then determines whether the next-stage bootloader is on the SD card in raw mode or is there as a boot file. Now this gets interesting. The file name of the next-stage bootloader is basically MLO. In raw mode we just place the next-stage bootloader directly on the card at offset 0, or at 128 KB, 256 KB, or 384 KB. Alternatively, the ROM code can read it from a FAT partition on the SD card. So we can choose one of these options for where to place the first-stage bootloader.

What we're going to do is use the FAT option throughout this series, but I will also show you how to boot in raw mode, where we use dd to write the first-stage and second-stage bootloaders to these exact locations and then boot the device. Since FAT is supported, I prefer it, because it is very easy to move files around and play with them while you learn different things in this course. So keep in mind that we will place the first-stage bootloader, a file named MLO, on the FAT partition of our SD card. Okay?
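To make raw mode a little more concrete, here is a small sketch that works on an in-memory "SD card" instead of a real device. The four offsets are the ones mentioned above. Everything else is an assumption for illustration: the `MAGIC` marker is a made-up stand-in for whatever check the ROM code really performs, and `write_raw` only imitates what a dd command with a seek would do.

```python
# Sketch of raw-mode placement and lookup on an in-memory "SD card".
# The four offsets come from the description above; the MAGIC marker used to
# recognise an image is a made-up stand-in for the ROM code's real check.

KB = 1024
RAW_OFFSETS = [0, 128 * KB, 256 * KB, 384 * KB]
MAGIC = b"FAKE-MLO"

def write_raw(card, offset, image):
    """Copy an image to a byte offset, like dd with a seek would."""
    card[offset:offset + len(image)] = image

def find_raw_image(card):
    """Return the first raw offset that holds an image, else None."""
    for offset in RAW_OFFSETS:
        if card[offset:offset + len(MAGIC)] == MAGIC:
            return offset
    return None   # fall back to looking for a file named MLO on a FAT partition

card = bytearray(512 * KB)                 # blank card image
write_raw(card, 128 * KB, MAGIC + b"...loader code...")
found = find_raw_image(card)
print(found)
```

The FAT option avoids all of this offset bookkeeping, which is one reason it is easier to experiment with.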
Let's do a quick summary of what we've learned so far and then add to it. The SoC ROM bootloader is inside the SoC and is of course very small, 176 KB. Its job is to load the first-stage bootloader, which is the SPL or MLO. SPL means secondary program loader, and MLO stands for memory loader. The first-stage bootloader then loads the second-stage bootloader. In the case of the BeagleBone Black, we will use a very popular bootloader called U-Boot, along with its configuration file, uEnv.txt. U-Boot is open source and has massive adoption in the embedded market.
I would even hazard a guess that almost 90% of the embedded devices I've seen lately use U-Boot as the default bootloader. U-Boot, the second-stage bootloader, then loads the Linux kernel, which can be a zImage or a uImage, along with the device tree, and the Linux kernel in turn mounts the root filesystem. Fantastic. So now you understand how this process is going to work. I know this is a long video. In the next part we summarize everything, and then in the next video we do our first hands-on demo.
Okay, so how does this actually work? Let's say this is the AM335x with its 176 KB ROM, remember. Externally you have the SD card slot, where you are going to plug the card in. Right now there is no card there. You also have RAM in the system. Now we insert our SD card, and on that SD card we are going to have three partitions. The first partition is a FAT32 partition, and it contains the first-stage bootloader, which is MLO; the second-stage bootloader, which is U-Boot; and the second-stage bootloader's configuration file, which is uEnv.txt. It holds environment variables more than actual configuration, but we'll get to that. Let's keep it simple for now. The second partition is an ext4 partition, which contains the Linux kernel and the device tree binary. The device tree is basically nothing more than an enumeration of all the non-discoverable devices, how to access them, how to power them on, their power profiles, etc. We'll talk about it later. The third partition is the one that contains our familiar root filesystem, that whole directory structure we are all so used to seeing.

You can partition the SD card differently, but for this video series this is how I will do it, because many times I want to quickly change my kernel or change uEnv.txt, so I prefer to have these on separate partitions.

Okay, how would this work? Let's summarize. In the first step we power on the SoC and the ROM bootloader starts running. The ROM bootloader, in the way we are going to use it, looks for a FAT32 partition on the SD card. So the ROM bootloader gets to the SD card in its boot sequence, finds the FAT32 partition on it, sees that there is a file called MLO inside, and loads it into its internal RAM.
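The three-partition layout described above can be written down as plain data, which makes it easy to see which stage looks where. This is only a sketch: apart from MLO and uEnv.txt, the file names are placeholders, the filesystem type of the third partition is left unset because it is not stated above, and `find_file` is an illustrative helper.

```python
# The SD card layout used in this series, written down as data.
# File names other than MLO and uEnv.txt are placeholders.

SD_CARD = [
    {"number": 1, "fs": "fat32", "files": ["MLO", "u-boot.img", "uEnv.txt"]},
    {"number": 2, "fs": "ext4", "files": ["zImage", "board.dtb"]},
    {"number": 3, "fs": None, "files": ["rootfs"]},   # fs type not stated above
]

def find_file(card, name, fs=None):
    """Return the partition number holding a file, optionally limited by fs type."""
    for part in card:
        if fs is not None and part["fs"] != fs:
            continue
        if name in part["files"]:
            return part["number"]
    return None

# The ROM bootloader only looks for MLO on a FAT partition.
mlo_partition = find_file(SD_CARD, "MLO", fs="fat32")
kernel_partition = find_file(SD_CARD, "zImage")
print(mlo_partition, kernel_partition)
```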
If MLO does not exist, then of course booting from the SD card fails, and the ROM bootloader moves on to the next device in the list. So, to summarize step one: the ROM bootloader gets MLO, puts it in internal RAM, and passes control to it. MLO in turn looks for U-Boot and copies U-Boot into RAM. This distinction is very critical: notice that the ROM bootloader loaded MLO into internal RAM, while MLO loaded U-Boot into external RAM. This is important, and I will get to a couple of implications in just a bit. U-Boot in turn checks whether uEnv.txt is present and copies it to RAM.

U-Boot then looks for the Linux kernel and the device tree binary. How does it know where to find them, that there is an ext4 partition, that it is the second partition, etc.? All of that information is in uEnv.txt. So U-Boot fetches the Linux kernel and the device tree binary, loads them into RAM, passes the boot arguments, and passes control to the kernel. The kernel in turn mounts the root filesystem and runs the init process from the filesystem, and the init process in turn starts other userspace programs. Fantastic. If you understood this, you now understand how the boot process works on the BeagleBone Black and, indeed, on most embedded devices.
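Since uEnv.txt carries the information about where the kernel and device tree live, here is a minimal sketch of reading such a file, assuming the simple one `name=value` per line layout. The variable names and values in the sample are invented for illustration and are not the ones used later in the course; `parse_env` is a hypothetical helper, not part of U-Boot.

```python
# Minimal sketch of reading a uEnv.txt-style file: one name=value per line.
# The variable names and values below are invented for illustration.

def parse_env(text):
    """Return a dict of variables, skipping blank lines and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")   # split on the first '=' only
        if sep:
            env[name.strip()] = value.strip()
    return env

sample = """
# hypothetical settings
kernel_file=zImage
fdt_file=board.dtb
kernel_part=0:2
bootargs=console=ttyS0,115200n8 root=/dev/mmcblk0p3 rootwait
"""
env = parse_env(sample)
print(env["kernel_part"], "|", env["bootargs"])
```

Splitting on the first `=` only matters here, because the boot arguments themselves contain `=` signs.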
The key variation on other devices is really just the first part: how the ROM bootloader, through one or two more stages or whatever it takes, finally manages to put U-Boot into memory and pass control to it. Only that part is very specific to the board. Once U-Boot is in memory, the rest will always be the same.

Of course, you may be asking some very intuitive questions. Why the heck do we need the first-stage bootloader? Why can't the ROM bootloader just find U-Boot and run it from internal RAM? Remember that the internal RAM is only 128 KB, and if you look at the size of U-Boot with all the features it has built in, it will never fit inside the internal RAM of the SoC. RAM inside an SoC is very expensive, so the only way SoC vendors manage to keep their cost down is to make sure that the ROM, the RAM, and other really expensive memory are kept to a minimum. This is why you will find that a first, a second, and sometimes even n stages may be necessary. It is a very, very important thing to keep in mind.
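The size argument comes down to one comparison. In the sketch below, the 128 KB figure is the internal RAM total quoted above, while the two image sizes are hypothetical numbers chosen only to illustrate the point. It is also a simplification: it treats all of the internal RAM as available for the image, which may not be true in practice.

```python
# Why a small first stage is needed: a size check against internal RAM.
# Image sizes are hypothetical; 128 KB is the internal RAM total quoted above.

KB = 1024
INTERNAL_RAM = 128 * KB      # 64 KB + 64 KB from the block diagram

def fits_internal_ram(image_size):
    """True if an image of this size could be loaded into internal RAM."""
    return image_size <= INTERNAL_RAM

mlo_size = 100 * KB          # assumed size of a small first-stage loader
uboot_size = 400 * KB        # assumed size of a full-featured U-Boot build

print("MLO fits:", fits_internal_ram(mlo_size))
print("U-Boot fits:", fits_internal_ram(uboot_size))
```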
U-Boot is so big that it can't fit inside the internal RAM. Your next logical question is: fine, why doesn't the ROM bootloader load U-Boot directly into external RAM? No need to load it into internal RAM; it could load it straight into external RAM. This is a very interesting question. The answer is that the SoC vendor, at the time they create the SoC, knows very little about the external devices you are going to connect to it. DRAM controllers depend a lot on the manufacturer of the memory and how they built it, and a lot goes into configuring the controller so that the RAM functions. So unfortunately the ROM bootloader has no idea how to set up this RAM and therefore cannot load U-Boot directly into it.

MLO, on the other hand, understands how the specific RAM controller works and configures everything to make the RAM functional. That is why we require MLO in the middle, and why the ROM bootloader can't just take U-Boot and put it in RAM. I hope I have been very clear with this explanation. It is very interesting, and once you know this, you understand the purpose of first-stage bootloaders. In many systems there may be even more stages. Really, all we are trying to do is get from the ROM bootloader to a configured RAM, and in between we may need more or fewer stages depending on the SoC architecture. Once the RAM is configured and the DRAM controller is functional, we can load U-Boot and all the programs we know and love into RAM. So this part is very important to understand and digest.

Okay, this was a long video, guys. I hate making videos that go beyond 10 minutes, because I know it's extremely difficult to stay focused and sharp, but I really wanted to do this one in one sitting. I promise I won't exceed 10 to 15 minutes in a video, and I hope you enjoyed this session.
I'm not kidding, it took me two days to create the slides just for this video. I had to look through the TRM docs and a bunch of other stuff, so I really hope you enjoyed it. Thank you so much. If you are not a student of Pentester Academy, please consider joining us; we have a lot of courses there. And if you are, I hope you enjoy it and tell your friends and colleagues. Thank you, and have a great day.