Best Practices in Boot Loader Design

By

Sumanth Vidyadhara

(Technical Architect, Wipro Technologies) sumanth.vidyadhara@wipro.com

And

Arun Patil

(Project Lead, Wipro Technologies) arun.patil@wipro.com

#364 at the Embedded Systems Conference – San Jose 2006

Overview:
Assumptions:
Definitions:
Boot Process:
Initial startup:
  Issues during startup:
  Suggestions:
  Tips for initial boot code debugging:
Different approaches to boot loader Design:
  Generic boot loader:
  EFI based boot loader:
Good boot loader traits:
  Network Boot:
  File System Support:
  Multilayered boot loader approach:
  Command Line interpreter:
  Loadable Module Support:
  Decompression support:
  Asynchronous Interrupt support:
  Firmware Upgrade Support:
  Cache Enabled address Space:
References

Overview:

A good boot loader is an essential component of any embedded system, and deficiencies in its design can lead to a system that boots slowly and performs poorly. This paper covers best practices in boot loader design. The Assumptions section summarizes what the paper assumes about the reader. Definitions describes the terms used in the context of this paper.

Boot Process gives an overview of the boot procedure in an embedded system. Initial startup explains the behavior of the CPU after power-on reset, the issues faced during startup, and some suggested solutions to those issues. Good boot loader traits explains the features or enhancements that can be added to make a boot loader more efficient.

Assumptions:

This paper assumes that the reader has a good understanding of embedded systems. A good knowledge of the hardware platform and processor architecture is also necessary. The paper aims to give the reader a good starting point for designing an effective boot loader for various target platforms. The design alternatives presented here are not exhaustive; other approaches can also lead to an effective design.

Definitions:

Boot Device: This is typically a flash device that holds the firmware image. Ex: Compact flash, NAND flash.

Boot Window: After reset, the CPU jumps to the reset vector, which is usually mapped to a boot device. The CPU fetches and executes code from this boot device. This initial execution environment is termed the boot window.

JTAG: Joint Test Action Group, the name commonly used for the IEEE 1149.1 test access port and boundary-scan standard. In addition to board-level testing, JTAG allows the internal components of the device (the CPU, for example) to be scanned. This means JTAG can be used to debug embedded devices: it gives access to any part of the device that is accessible via the CPU while the device still runs at full speed. It has since become a standard emulation debug method used by silicon vendors. JTAG can also provide system-level debug capability; extra pins on a device provide additional system integration capabilities for benchmarking, profiling, and system-level breakpoints.

MIU: The memory interface unit is the interface between the CPU and memory. It controls data transfers to and from memory.

PLL: Phase locked loop, an electronic circuit that controls an oscillator so that it maintains a constant phase angle (i.e., lock) on the frequency of an input, or reference, signal. A PLL ensures that a communication signal is locked on a specific frequency and can also be used to generate, modulate and demodulate a signal and divide a frequency.

UART: A universal asynchronous receiver/transmitter handles asynchronous serial communication.

Boot Process:

[Figure 1: flowchart. Steps, in order: Power on Reset; Jump to reset vector; Copy image to RAM; Execute from RAM; Hardware Initialization, Cache Initialization; Load and Jump to OS. Stage labels: Boot Media, RAM, OS.]

Figure 1 – Boot process

After reset, the CPU fetches instructions from the reset vector. The reset vector is the default location where a CPU looks for the first instruction to execute after a reset; that is, it is the address at which the CPU always begins as soon as it is able to execute instructions. The reset vector is usually mapped to the first sector of a boot device. The reset vector code performs minimal hardware initialization and copies the rest of the firmware image to RAM. Once the image is copied, the CPU executes instructions from RAM.

The boot loader code then initializes the ASIC’s hardware modules and the processor’s caches. It then loads the OS image at the OS base address and hands control of execution over to the operating system.
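The copy step of this flow can be sketched in C. This is a minimal illustration rather than the layout of any particular boot loader: the `boot_image` header, its field names, and `stage_image` are all hypothetical, and the RAM region is passed in as a plain buffer so that the sketch can also be exercised on a host machine.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical image descriptor; the field names are illustrative only. */
struct boot_image {
    uint32_t load_offset;    /* where in RAM the payload is placed */
    uint32_t length;         /* payload size in bytes */
    uint32_t entry_offset;   /* entry point, relative to load_offset */
    const uint8_t *payload;  /* payload location in the boot device */
};

/*
 * Copy the payload from the boot device into RAM and return the address
 * at which execution should continue, or NULL if the image does not fit
 * or its entry point lies outside the payload.
 */
uint8_t *stage_image(uint8_t *ram, size_t ram_size,
                     const struct boot_image *img)
{
    if (img->load_offset > ram_size ||
        img->length > ram_size - img->load_offset ||
        img->entry_offset >= img->length) {
        return NULL;
    }
    memcpy(ram + img->load_offset, img->payload, img->length);
    return ram + img->load_offset + img->entry_offset;
}
```

On a real target the returned address would typically be converted to a function pointer and called to hand over control; that step is architecture specific and is left out here.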

Initial startup:

After power-on reset, the CPU fetches and executes instructions from the reset vector. The first few instructions executed from the reset vector usually:

• Set up the PLLs for the ASIC

• Initialize the MIU of the ASIC

• Copy the image to RAM

Issues during startup:

The issues seen during the initial startup process are:

• Execution from the boot device (flash) after power-on reset is usually slow.

• The size of the boot window is limited.

• Boot code written entirely in assembly is harder to understand and is not portable.

• Setting up a “C” environment requires access to a contiguous area of memory, but the system RAM is not yet initialized at this point.

Suggestions:

A few suggestions, or design proposals, that can be followed to overcome the issues during the initial boot:

• Keep the code that executes from flash to a minimum.

• Set up a ‘C’ environment in the initial boot code. This makes the code easier to understand and more portable. To set up a ‘C’ environment, the stack pointer needs to be initialized to a contiguous memory area.

• Because the MIU is not initialized at power-on reset, the stack pointer can be set to the processor’s secondary RAM. If no processor RAM is available, the stack can be placed in the processor’s data cache or in contiguous unused memory space in the ASIC.

• The space available for a stack in data cache or processor RAM is usually small, so care should be taken not to exceed it. One way to cross-check this is to disassemble the compiled initial boot code and check whether the stack pointer can grow beyond the available size.

• If the initial code contains a repetitive loop, fetching the same instructions from flash on every iteration is time consuming. To improve boot performance, enable the instruction cache and traverse the initial boot code once so that it is loaded into the instruction cache. Subsequent fetches of those instructions are then served from the instruction cache.
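The stack size suggestion above relies on inspecting the disassembly. A complementary run-time check, often called stack painting, is sketched below; it is a different technique from the one described in the text. The sketch assumes a descending stack and 32-bit words, and the fill pattern and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL 0xA5A5A5A5u  /* arbitrary pattern, assumed unlikely as real data */

/* Fill the whole stack region with the pattern before it is first used. */
void stack_paint(uint32_t *base, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        base[i] = STACK_FILL;
    }
}

/*
 * Return how many words have been overwritten so far, assuming the stack
 * starts at base + words and grows down towards base.
 */
size_t stack_high_water(const uint32_t *base, size_t words)
{
    size_t untouched = 0;

    while (untouched < words && base[untouched] == STACK_FILL) {
        untouched++;
    }
    return words - untouched;
}
```

If `stack_high_water` ever reports the full region size, the stack has reached (and may have passed) the end of the space set aside for it.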

Tips for initial boot code debugging:

• If a UART is available, set it up to monitor code progress by printing messages at various checkpoints in the code.

• Different blink patterns of the LEDs on the ASIC can also indicate code progress.

• JTAG allows the internal components of the device (the CPU, for example) to be scanned. Ex: it can be used to verify the correctness of memory contents.
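The UART checkpoint tip can be sketched as a small formatter that needs no C library, since early boot code often runs before one is usable. The message format and the name `boot_checkpoint_format` are assumptions made for illustration. On a target, each byte of the result would be written to the UART transmit register once the transmitter is ready; that device-specific access is not shown.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Format "[BOOT xx]\n" into out, where xx is the checkpoint number in
 * hexadecimal. Returns the message length without the terminating zero,
 * or 0 if the buffer is too small.
 */
size_t boot_checkpoint_format(char *out, size_t cap, uint8_t code)
{
    static const char hex[] = "0123456789ABCDEF";
    const char msg[] = {
        '[', 'B', 'O', 'O', 'T', ' ',
        hex[code >> 4], hex[code & 0x0F], ']', '\n'
    };

    if (cap < sizeof msg + 1) {
        return 0;
    }
    for (size_t i = 0; i < sizeof msg; i++) {
        out[i] = msg[i];
    }
    out[sizeof msg] = '\0';
    return sizeof msg;
}
```

The same checkpoint number could also drive an LED blink pattern on boards without a UART.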

Different approaches to boot loader Design:

Generic boot loader:

In systems where BIOS firmware is available, as in x86 based systems, the BIOS configures the hardware in the system and then transfers control of execution to the boot loader. The boot loader then bootstraps the OS. In non-x86 systems where no BIOS is available, the boot loader detects and configures all the hardware and then bootstraps the OS. The advantages are:

• The code is minimal.

• It can be tuned for faster boot and performance, since it is tightly coupled to the hardware.

Disadvantages are:

• A change in hardware involves significant changes to the boot loader code.

EFI based boot loader:

The Extensible Firmware Interface (EFI) specification describes an interface between the operating system and the platform firmware. The interface is in the form of data tables that contain platform related information, and boot and runtime service calls that are available to the OS loader and the OS. Together, these provide a standard environment for booting an OS. The EFI specification is designed as a pure interface specification. As such, the specification defines the set of interfaces and structures that platform firmware must implement. Similarly, the specification defines the set of interfaces and structures that the OS may use in booting. The advantages of an EFI based boot service are:

• A modular approach to boot loader design; modules can be loaded independently.

• A simple OS loader can be designed.

• It saves time in firmware development.

Good boot loader traits:

This section highlights a few boot loader features that help make it more efficient.

Network Boot:

To fetch a firmware/OS image over the network, all that is needed is a DHCP server and a TFTP server; the boot loader relies only on these standard Internet protocols. When the appliance is powered on, the boot loader makes a DHCP request. The DHCP server, recognizing the appliance as a network-booting client, returns the location of a TFTP server and the name of the file that the client should download from it. The advantage of network boot is that a new development run-time firmware image can be downloaded quickly to the target device.
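As an illustration of the TFTP side of this exchange, the sketch below builds a read request packet as defined in RFC 1350: a two-byte opcode followed by the file name and the transfer mode, each terminated by a zero byte. The function name is hypothetical; in practice the file name would come from the DHCP reply, and sending the packet over UDP is outside the scope of the sketch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TFTP_OP_RRQ 1  /* read request opcode, RFC 1350 */

/*
 * Build a TFTP read request in pkt. The opcode is stored in network
 * byte order. Returns the packet length, or 0 if pkt is too small.
 */
size_t tftp_build_rrq(uint8_t *pkt, size_t cap, const char *filename)
{
    static const char mode[] = "octet";       /* binary transfer mode */
    size_t name_len = strlen(filename) + 1;   /* include the terminator */
    size_t total = 2 + name_len + sizeof mode;

    if (total > cap) {
        return 0;
    }
    pkt[0] = 0;
    pkt[1] = TFTP_OP_RRQ;
    memcpy(pkt + 2, filename, name_len);
    memcpy(pkt + 2 + name_len, mode, sizeof mode);
    return total;
}
```

The server answers a valid request with the first data block, which the boot loader acknowledges before the next one is sent.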

File System Support:

File system support in a boot loader can be used to traverse files and execute different utilities at run time.

Multilayered boot loader approach:

The boot loader code is separated into a hardware dependent layer and a hardware independent layer. The advantage of this approach is that bringing up new hardware takes less development time, because only the hardware dependent layer has to change.
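One common way to express this split in C is a table of function pointers that each board fills in. The sketch below is only one possible arrangement; `board_ops`, `boot_common_init`, and the demo board are hypothetical names, and the demo functions merely record the order in which they were called so the sketch can be tried on a host.

```c
#include <stdint.h>

/* Hardware dependent layer: one table of operations per board. */
struct board_ops {
    void (*clock_init)(void);
    void (*memory_init)(void);
    uint32_t (*ram_size)(void);
};

/* Hardware independent layer: written only against board_ops. */
uint32_t boot_common_init(const struct board_ops *ops)
{
    ops->clock_init();
    ops->memory_init();
    return ops->ram_size();
}

/* Hypothetical board used to exercise the sketch on a host machine. */
static int demo_steps;  /* records call order as decimal digits */

static void demo_clock_init(void)   { demo_steps = demo_steps * 10 + 1; }
static void demo_memory_init(void)  { demo_steps = demo_steps * 10 + 2; }
static uint32_t demo_ram_size(void) { return 64u * 1024u * 1024u; }

const struct board_ops demo_board = {
    demo_clock_init, demo_memory_init, demo_ram_size
};
```

Supporting a new board then means supplying another `board_ops` table, while `boot_common_init` stays unchanged.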

Command Line interpreter:

Users can use the CLI to issue commands, for example to configure hardware, upgrade firmware, or run diagnostic checks. This requires the user to know the names of the commands and their parameters, and the syntax of the command line that is interpreted. Nevertheless, command line interpreters remain widely used in conjunction with GUIs. One command line interpreter that can be used is the Hush shell; Das U-Boot, the Universal boot loader, uses the Hush shell for command line parsing.
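A much simpler alternative to a full shell is a table-driven dispatcher. The sketch below is not how the Hush shell works; it only illustrates the idea of splitting a line into arguments and looking up the first one in a command table. All names, including the two demo commands, are hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define CLI_MAX_ARGS 8

typedef int (*cli_handler)(int argc, char **argv);

struct cli_command {
    const char *name;
    cli_handler run;
};

/* Split a line in place on spaces and tabs; returns the number of tokens. */
int cli_tokenize(char *line, char **argv, int max_args)
{
    int argc = 0;
    char *p = line;

    while (*p != '\0' && argc < max_args) {
        while (*p == ' ' || *p == '\t') p++;
        if (*p == '\0') break;
        argv[argc++] = p;
        while (*p != '\0' && *p != ' ' && *p != '\t') p++;
        if (*p != '\0') *p++ = '\0';
    }
    return argc;
}

/* Look up argv[0] in the table and run it; -1 means empty or unknown. */
int cli_execute(const struct cli_command *table, size_t count, char *line)
{
    char *argv[CLI_MAX_ARGS];
    int argc = cli_tokenize(line, argv, CLI_MAX_ARGS);

    if (argc == 0) return -1;
    for (size_t i = 0; i < count; i++) {
        if (strcmp(argv[0], table[i].name) == 0) {
            return table[i].run(argc, argv);
        }
    }
    return -1;
}

/* Hypothetical commands for illustration. */
static int cmd_version(int argc, char **argv) { (void)argc; (void)argv; return 0; }
static int cmd_count(int argc, char **argv)   { (void)argv; return argc - 1; }

const struct cli_command demo_commands[] = {
    { "version", cmd_version },
    { "count",   cmd_count },
};
```

New commands are added by appending rows to the table, which keeps the parser itself untouched.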

Loadable Module Support:

The boot loader can provide support for loading modules dynamically, so that modules are loaded only as needed. A typical example is an I/O module: when a firmware download is requested, the boot loader brings the I/O module into memory and executes it to download the firmware into the flash. The advantages of loadable modules are:

• New features can be added without changing the boot loader code.

• The base memory footprint of the main boot loader code is kept to a minimum.

• Not all features are used on every boot, so only the required modules are loaded, which speeds up boot time.

Decompression support:

To reduce the size of the firmware in flash, the image can be compressed. The boot loader then has to support decompression: it reads the compressed firmware from flash and decompresses it into RAM. The advantage of a compressed image is that less flash is required to store the firmware image, which can reduce cost significantly.

Note: The compression/decompression algorithm used by boot loader needs to be a lossless one.
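To make the idea of lossless decompression concrete, the sketch below decodes a simple run-length encoding in which the input is a sequence of (count, value) byte pairs. Run-length encoding is chosen here only because it fits in a few lines; it is an assumption for illustration, and real boot loaders typically use stronger general-purpose algorithms. The function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Decode (count, value) byte pairs from in into out. Returns the number
 * of bytes written, or 0 if the input is malformed or out is too small.
 */
size_t rle_decode(const uint8_t *in, size_t in_len,
                  uint8_t *out, size_t out_cap)
{
    size_t written = 0;

    if (in_len % 2 != 0) {
        return 0;                       /* pairs must be complete */
    }
    for (size_t i = 0; i < in_len; i += 2) {
        size_t count = in[i];
        uint8_t value = in[i + 1];

        if (count == 0 || count > out_cap - written) {
            return 0;                   /* invalid run or output overflow */
        }
        for (size_t k = 0; k < count; k++) {
            out[written++] = value;
        }
    }
    return written;
}
```

Because every run is reproduced byte for byte, the decoded image is identical to the original, which is the property the note above requires.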

Asynchronous Interrupt support:

The boot loader I/O modules can execute in either polled or interrupt mode, and each mode has advantages and disadvantages. Polled mode is simple to implement but lags in performance. Interrupt mode performs better but can be tricky to implement. To enable interrupt support, the boot loader needs to provide interfaces for interrupt registration.
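A registration interface of the kind mentioned above could look like the following sketch. The table size, the function names, and the demo handler are hypothetical; reading the pending interrupt number from an interrupt controller and acknowledging it are hardware specific and are not shown.

```c
#include <stddef.h>

#define IRQ_COUNT 32  /* hypothetical number of interrupt lines */

typedef void (*irq_handler)(void *ctx);

static struct {
    irq_handler fn;
    void *ctx;
} irq_table[IRQ_COUNT];

/* Register a handler for one line; returns 0 on success, -1 on error. */
int irq_register(unsigned irq, irq_handler fn, void *ctx)
{
    if (irq >= IRQ_COUNT || fn == NULL || irq_table[irq].fn != NULL) {
        return -1;
    }
    irq_table[irq].fn = fn;
    irq_table[irq].ctx = ctx;
    return 0;
}

/*
 * Called from the low-level interrupt entry code with the pending line
 * number. Returns 1 if a handler ran, 0 if the interrupt was unexpected.
 */
int irq_dispatch(unsigned irq)
{
    if (irq >= IRQ_COUNT || irq_table[irq].fn == NULL) {
        return 0;
    }
    irq_table[irq].fn(irq_table[irq].ctx);
    return 1;
}

/* Hypothetical handler for illustration: counts received events. */
void demo_rx_handler(void *ctx)
{
    (*(int *)ctx)++;
}
```

An I/O module would call `irq_register` during its own initialization, so the core boot loader does not need to know which devices use interrupts.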

Firmware Upgrade Support:

Updating firmware on embedded systems can be painful if it requires programming the flash on a flash burner. The boot loader can support upgrading firmware at run time. To do so, it needs to initialize an I/O device; the I/O interfaces that can be used for firmware upgrade include USB, parallel, serial, and network. Note: if network upgrade is supported in the boot loader, socket APIs and network application clients must also be supported. The main advantage of network download is that the firmware can also be upgraded from a remote site.

Cache Enabled address Space:

The boot loader code executes faster when caches are enabled, so it should be linked and executed in a cache enabled address space. The cache enabled address space is specific to each processor platform.

Some processor architectures have separate instruction and data caches. In such an architecture the caches do not need to be clever, but they must be transparent to application software. Conceptually, a cache is an associative memory: a chunk of storage where data is written together with an arbitrary data pattern that serves as a key. In a cache, the key is the full memory address. Present the same key to an associative memory again and you get the same data back.

Write Through Caches: The CPU’s data is always written to main memory; if a copy of that memory location is resident in the cache, the cached copy is updated too. If we always do this, then any data in the cache is known to be in memory too, so we can discard the contents of a cache line any time we need a cache location and lose nothing but time. Waiting for every write to complete would slow the processor drastically, so processors usually keep writes destined for main memory in a buffer while the memory controller gets ready and completes the write. The place where writes are kept temporarily is called a write buffer.

Write Back Caches: Later CPUs are so fast that, with write through caches, they would swamp their memory systems with writes and slow to a crawl. The solution is to retain write data in the cache. Write data goes into the cache only, and the cache line is marked to make sure it is written back to memory at some point (a line that needs writing back is called dirty). There are sub-variants here: if the addressed data is not currently in the processor’s cache, the write can either go to main memory and bypass the cache, or the data can first be brought into the cache so that the write happens there. The latter is called write allocate.
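The difference between the two policies can be made visible with a toy model that counts how many writes reach main memory. The sketch below simulates a tiny direct-mapped cache with one word per line and write allocate on a miss; the sizes and all names are assumptions chosen for brevity, and the model tracks only bookkeeping, not data.

```c
#include <stdint.h>
#include <string.h>

#define LINES 4  /* tiny direct-mapped cache, one word per line */

enum write_policy { WRITE_THROUGH, WRITE_BACK };

struct cache_sim {
    enum write_policy policy;
    uint32_t tag[LINES];     /* full word address used as the key */
    int valid[LINES];
    int dirty[LINES];
    unsigned mem_writes;     /* writes that reached main memory */
};

void cache_sim_init(struct cache_sim *c, enum write_policy policy)
{
    memset(c, 0, sizeof *c);
    c->policy = policy;
}

/* Simulate a store to a word address, allocating the line on a miss. */
void cache_sim_store(struct cache_sim *c, uint32_t addr)
{
    unsigned idx = addr % LINES;
    int miss = !c->valid[idx] || c->tag[idx] != addr;

    if (miss) {
        if (c->valid[idx] && c->dirty[idx]) {
            c->mem_writes++;         /* evicted dirty line is written back */
        }
        c->dirty[idx] = 0;
    }
    c->valid[idx] = 1;
    c->tag[idx] = addr;
    if (c->policy == WRITE_THROUGH) {
        c->mem_writes++;             /* every store also goes to memory */
    } else {
        c->dirty[idx] = 1;           /* memory write is deferred */
    }
}

/* Write back every dirty line, leaving the cache clean. */
void cache_sim_flush(struct cache_sim *c)
{
    for (unsigned i = 0; i < LINES; i++) {
        if (c->valid[i] && c->dirty[i]) {
            c->mem_writes++;
            c->dirty[i] = 0;
        }
    }
}
```

In this model, repeated stores to one address cost one memory write each under write through, but only a single write back under write back, once the line is evicted or flushed.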

With this background on cache policies, here are a few suggestions that make a program behave better in a cache.

• Make it smaller: Use modest compiler optimization (exotic optimization often makes programs larger).

• Make everything cacheable except hardware registers: Hardware registers are volatile in nature and their contents change dynamically, so a cached entry would not give the correct value.

• Make the heavily used portion of the program smaller: Access density in programs is not at all uniformly distributed. There is often a significant amount of code that is almost never used (error handling, obscure system management) or used only once (initialization code). If the rarely used code can be separated off, the remainder may get better cache hit rates.
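The point about hardware registers has a counterpart in C source code. The sketch below polls a status register through a `volatile` pointer so that the compiler performs a real read on every iteration. Note that `volatile` only constrains the compiler; the register must still be mapped through an uncached address range, which is platform specific and not shown. The function name and the bounded poll count are assumptions.

```c
#include <stdint.h>

/*
 * Poll a status register until any bit in mask is set, giving up after
 * max_polls reads. Returns 1 when a bit was seen, 0 on timeout.
 */
int reg_wait_bits(const volatile uint32_t *reg, uint32_t mask,
                  unsigned max_polls)
{
    while (max_polls-- > 0) {
        if ((*reg & mask) != 0) {
            return 1;
        }
    }
    return 0;
}
```

Bounding the number of polls keeps the boot loader from hanging forever if the device never becomes ready.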

References

Das U-Boot – Universal boot loader

http://sourceforge.net/projects/u-boot

The GNU GRUB Boot Loader

http://www.linuxgazette.com/issue64/kohli.html

How to Develop a Boot Loader (Microsoft Windows CE 5.0)

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/wcehardware5/html/wce50howhowtodevelopabootloader.asp