This section gives a "very brief" and "introduction" to some of the Linux Kernel System. If you have time you can give one reading.
Caution: You should be extra careful about these Kernel Files and you must not edit or touch or move/delete/rename them.
The "vm" stands for "Virtual Memory". Linux supports virtual memory unlike old operating systems like DOS which have a hard limit of 640KB memory. Linux can use the hard disk space as "virtual memory" and hence the name "vm". The vmlinuz is the Linux kernel executable. This is located at /boot/vmlinuz. This can be a soft link to something like /boot/vmlinuz-2.4.18-19.8.0. The vmlinuz is created by "make zImage" and you do "cp /usr/src/linux/arch/i386/linux/boot/zImage /boot/vmlinuz". The vmlinuz is a compressed file of vmlinux. The zImage is there for backward compatibility (smaller kernels). Note that, in near future zImage may be dropped in favor of 'make bzImage' (big zImage). The zImage (vmlinuz) is not just a compressed image, but has gzip decompressor code built into it as well (at the beginning). So you cannot use gunzip or gzip -dc to unpack vmlinuz.
Both zImage and bzImage are compressed with gzip. The kernel includes a mini-gunzip to uncompress the kernel and boot into it. The difference is that the old zImage uncompresses the kernel into low memory (the first 640k), and bzImage uncompresses the kernel into high memory (over 1M).
The vmlinux is the uncompressed built kernel, vmlinuz is the compressed one, that has been made bootable. (Note both names vmlinux and vmlinuz look same except for last letter z). Generally, you don't need to worry about vmlinux, it is just an intermediate step.
The kernel usually makes a bzImage, and stores it in arch/i386/boot, and it is up to the user to copy it to /boot and configure GRUB or LILO.
The .b files are "bootloader" files. they are part of the dance required to get a kernel into memory to begin with. You should NOT touch them.
ls -l /boot/*.b -rw-r--r-- 1 root root 5824 Sep 5 2002 /boot/boot.b -rw-r--r-- 1 root root 612 Sep 5 2002 /boot/chain.b -rw-r--r-- 1 root root 640 Sep 5 2002 /boot/os2_d.b
The 'message' file contains the message your bootloader will display, prompting you to choose an OS. So DO NOT touch it.
ls -l /boot/message* -rw-r--r-- 1 root root 23108 Sep 6 2002 /boot/message -rw-r--r-- 1 root root 21282 Sep 6 2002 /boot/message.ja
See the Appendix A at Section 15, “ Appendix A - Creating initrd.img file ” .
The bzImage is the compressed kernel image created with command 'make bzImage' during kernel compile. It important to note that bzImage is not compressed with bzip2 !! The name bz in bzImage is misleading!! It stands for "Big Zimage". The "b" in bzImage is "big". Both zImage and bzImage are compressed with gzip. The kernel includes a mini-gunzip to uncompress the kernel and boot into it. The difference is that the old zImage uncompresses the kernel into low memory (the first 640k), and bzImage uncompresses the kernel into high memory (over 1M). The only problem is that there are a very few machines where bzImage is known to have problems (because the machines are buggy). The bzImage actually boots faster, but other than that, there's no difference in the way the system *runs*. The rule is that if all drivers cannot fit into the zImage, then you need to modularize more.
If the kernel is small, it will work as both a zImage and a bzImage, and the booted system runs the same way. A big kernel will work as a bzImage, but not as a zImage. Both bootimages are gzipped, (bzImage is not bzipped as the name would suggest), but are put together and loaded differently, that allows the kernel to load in higher address space, that does not limit it to lower memory in the pathetic intel architecture. So why have both? Backward compatability. Some older lilos and loadlins don't handle the bzImage format. Note, they *boot* differently, but *run* the same. There is a lot of misinformation given out about what a bzImage file is (mostly about it being bzip2ed).
This file "module-info" is a symbolic link:
$ uname -r 2.4.18-19.8.0custom # ls -l /boot/module-info* lrwxrwxrwx 1 root root 25 Jan 26 10:44 /boot/module-info -> module-info-2.4.18-19.8.0 -rw-r--r-- 1 root root 15436 Sep 4 2002 /boot/module-info-2.4.18-14 -rw-r--r-- 1 root root 15436 Jan 26 01:29 /boot/module-info-2.4.18-19.8.0
Note above that it is not a requirement is make module-info symlink to a kernel specific file like the way System-map and vmlinuz is required. It's just a text file so as long as module-info list is recent, you'll be fine. Before removing all the stock RH kernels from your system, you should make a backup of this file:
# cp /boot/module-info-2.4.20-19.9 /boot/module-info-2.4.20-19.9.backup
It should be safe to do this as module-info rarely changes within the same point release of RH kernels.
This file 'module-info' is created by anaconda/utils/modlist (specific to Redhat Linux Anaconda installer). Other Linux distributions may be having equivalent command. Refer to your Linux distributor's manual pages.
See this script and search for "module-info" updmodules .
Below is a cut from this script:
#!/bin/bash # updmodules.sh MODLIST=$PWD/../anaconda/utils/modlist MODINFO=$KERNELROOT/boot/module-info-$version -- snip cut blah blah blah -- snip cut # create the module-info file $MODLIST --modinfo-file $MODINFO --ignore-missing --modinfo \ $(ls *.o | sed 's/\.o$//') > ../modinfo
The program anaconda/utils/modlist is located in anaconda-runtime*.rpm on the Redhat CDROM
cd /mnt/cdrom/RedHat/RPMS rpm -i anaconda-8.0-4.i386.rpm rpm -i anaconda-runtime-8.0-4.i386.rpm ls -l /usr/lib/anaconda-runtime/modlist
Get the source code for anaconda/utils/modlist.c from anaconda*.src.rpm at "http://www.rpmfind.net/linux/rpm2html/search.php?query=anaconda" Read online at modlist.c . .
The file 'module-info' is generated during the compile. It is an information file that is at least used during filing proper kernel OOPS reports. It is a list of the module entry points. It may also be used by depmod in building the tables that are used by insmod and its kith and kin. This includes dependancy information for other modules needed to be loaded before any other given module, etc.
Bottom line is: "Don't remove this file module-info."
Some points about module-info:
Is provided by the kernel rpms (built by anaconda-runtime*.rpm)
Is a link to module-info-{kernel-version}
Contains information about all available modules (at least those included in the default kernel config.)
Important to anaconda - in anaconda/utils/modlist command.
Might be used by kudzu to determine default parameters for modules when it creates entries in /etc/modules.conf. If you move module-info out of the way, shut down, install a new network card, and re-boot then kudzu would complain loudly. Look at the kudzu source code.
Everytime you compile and install the kernel image in /boot, you should also copy the corresponding config file to /boot area, for documentation and future reference. Do NOT touch or edit these files!!
ls -l /boot/config-* -rw-r--r-- 1 root root 42111 Sep 4 2002 /boot/config-2.4.18-14 -rw-r--r-- 1 root root 42328 Jan 26 01:29 /boot/config-2.4.18-19.8.0 -rw-r--r-- 1 root root 51426 Jan 25 22:21 /boot/config-2.4.18-19.8.0BOOT -rw-r--r-- 1 root root 52328 Jan 28 03:22 /boot/config-2.4.18-19.8.0-26mar2003
If you are using GRUB, then there will be 'grub' directory.
ls /boot/grub device.map ffs_stage1_5 menu.lst reiserfs_stage1_5 stage2 e2fs_stage1_5 grub.conf minix_stage1_5 splash.xpm.gz vstafs_stage1_5 fat_stage1_5 jfs_stage1_5 stage1 xfs_stage1_5
See also Section 17, “ Appendix C - GRUB Details And A Sample grub.conf ” file.
System.map is a "phone directory" list of function in a particular build of a kernel. It is typically a symlink to the System.map of the currently running kernel. If you use the wrong (or no) System.map, debugging crashes is harder, but has no other effects. Without System.map, you may face minor annoyance messages.
Do NOT touch the System.map files.
ls -ld /boot/System.map* lrwxrwxrwx 1 root root 30 Jan 26 19:26 /boot/System.map -> System.map-2.4.18-19.8.0custom -rw-r--r-- 1 root root 501166 Sep 4 2002 /boot/System.map-2.4.18-14 -rw-r--r-- 1 root root 510786 Jan 26 01:29 /boot/System.map-2.4.18-19.8.0 -rw-r--r-- 1 root root 331213 Jan 25 22:21 /boot/System.map-2.4.18-19.8.0BOOT -rw-r--r-- 1 root root 503246 Jan 26 19:26 /boot/System.map-2.4.18-19.8.0custom
How The Kernel Symbol Table Is Created ? System.map is produced by 'nm vmlinux' and irrelevant or uninteresting symbols are grepped out, When you compile the kernel, this file 'System.map' is created at /usr/src/linux/System.map. Something like below:
nm /boot/vmlinux-2.4.18-19.8.0 > System.map # Below is the line from /usr/src/linux/Makefile nm vmlinux | grep -v '\(compiled\)\|\(\.o$$\)\|\( [aUw] \)\|\(\.\.ng$$\)\|\(LASH[RL]DI\)' | sort > System.map cp /usr/src/linux/System.map /boot/System.map-2.4.18-14 # For v2.4.18
From "http://www.dirac.org/linux/systemmap.html"
There seems to be a dearth of information about the System.map file. It's really nothing mysterious, and in the scheme of things, it's really not that important. But a lack of documentation makes it shady. It's like an earlobe; we all have one, but nobody really knows why. This is a little web page I cooked up that explains the why.
Note, I'm not out to be 100[percnt] correct. For instance, it's possible for a system to not have /proc filesystem support, but most systems do. I'm going to assume you "go with the flow" and have a fairly typical system.
Some of the stuff on oopses comes from Alessandro Rubini's "Linux Device Drivers" which is where I learned most of what I know about kernel programming.
In the context of programming, a symbol is the building block of a program: it is a variable name or a function name. It should be of no surprise that the kernel has symbols, just like the programs you write. The difference is, of course, that the kernel is a very complicated piece of coding and has many, many global symbols.
The kernel doesn't use symbol names. It's much happier knowing a variable or function name by the variable or function's address. Rather than using size_t BytesRead, the kernel prefers to refer to this variable as (for example) c0343f20.
Humans, on the other hand, do not appreciate names like c0343f20. We prefer to use something like size_t BytesRead. Normally, this doesn't present much of a problem. The kernel is mainly written in C, so the compiler/linker allows us to use symbol names when we code and allows the kernel to use addresses when it runs. Everyone is happy.
There are situations, however, where we need to know the address of a symbol (or the symbol for an address). This is done by a symbol table, and is very similar to how gdb can give you the function name from a address (or an address from a function name). A symbol table is a listing of all symbols along with their address. Here is an example of a symbol table:
c03441a0 B dmi_broken c03441a4 B is_sony_vaio_laptop c03441c0 b dmi_ident c0344200 b pci_bios_present c0344204 b pirq_table c0344208 b pirq_router c034420c b pirq_router_dev c0344220 b ascii_buffer c0344224 b ascii_buf_bytes
You can see that the variable named dmi_broken is at the kernel address c03441a0.
There are 2 files that are used as a symbol table:
/proc/ksyms
System.map
There. You now know what the System.map file is.
Every time you compile a new kernel, the addresses of various symbol names are bound to change.
/proc/ksyms is a "proc file" and is created on the fly when a kernel boots up. Actually, it's not really a file; it's simply a representation of kernel data which is given the illusion of being a disk file. If you don't believe me, try finding the filesize of /proc/ksyms. Therefore, it will always be correct for the kernel that is currently running..
However, System.map is an actual file on your filesystem. When you compile a new kernel, your old System.map has wrong symbol information. A new System.map is generated with each kernel compile and you need to replace the old copy with your new copy.
What is the most common bug in your homebrewed programs? The segfault. Good ol' signal 11.
What is the most common bug in the Linux kernel? The segfault. Except here, the notion of a segfault is much more complicated and can be, as you can imagine, much more serious. When the kernel dereferences an invalid pointer, it's not called a segfault -- it's called an "oops". An oops indicates a kernel bug and should always be reported and fixed.
Note that an oops is not the same thing as a segfault. Your program cannot recover from a segfault. The kernel doesn't necessarily have to be in an unstable state when an oops occurs. The Linux kernel is very robust; the oops may just kill the current process and leave the rest of the kernel in a good, solid state.
An oops is not a kernel panic. In a panic, the kernel cannot continue; the system grinds to a halt and must be restarted. An oops may cause a panic if a vital part of the system is destroyed. An oops in a device driver, for example, will almost never cause a panic.
When an oops occurs, the system will print out information that is relevent to debugging the problem, like the contents of all the CPU registers, and the location of page descriptor tables. In particular, the contents of the EIP (instruction pointer) is printed. Like this:
EIP: 0010:[<00000000>] Call Trace: [<c010b860>]
You can agree that the information given in EIP and Call Trace is not very informative. But more importantly, it's really not informative to a kernel developer either. Since a symbol doesn't have a fixed address, c010b860 can point anywhere.
To help us use this cryptic oops output, Linux uses a daemon called klogd, the kernel logging daemon. klogd intercepts kernel oopses and logs them with syslogd, changing some of the useless information like c010b860 with information that humans can use. In other words, klogd is a kernel message logger which can perform name-address resolution. Once klogd tranforms the kernel message, it uses whatever logger is in place to log system wide messages, usually syslogd.
To perform name-address resolution, klogd uses System.map. Now you know what an oops has to do with System.map.
Fine print: There are actually two types of address resolution are performed by klogd.
Static translation, which uses the System.map file.
Dynamic translation which is used with loadable modules, doesn't use
System.map and is therefore not relevant to this discussion, but I'll describe it briefly anyhow.
Klogd Dynamic Translation
Suppose you load a kernel module which generates an oops. An oops message is generated, and klogd intercepts it. It is found that the oops occured at d00cf810. Since this address belongs to a dynamically loaded module, it has no entry in the System.map file. klogd will search for it, find nothing, and conclude that a loadable module must have generated the oops. klogd then queries the kernel for symbols that were exported by loadable modules. Even if the module author didn't export his symbols, at the very least, klogd will know what module generated the oops, which is better than knowing nothing about the oops at all.
There's other software that uses System.map, and I'll get into that shortly.
System.map should be located wherever the software that uses it looks for it. That being said, let me talk about where klogd looks for it. Upon bootup, if klogd isn't given the location of System.map as an argument, it will look for System.map in 3 places, in the following order:
/boot/System.map
/System.map
/usr/src/linux/System.map
System.map also has versioning information, and klogd intelligently searches for the correct map file. For instance, suppose you're running kernel 2.4.18 and the associated map file is /boot/System.map. You now compile a new kernel 2.5.1 in the tree /usr/src/linux. During the compiling process, the file /usr/src/linux/System.map is created. When you boot your new kernel, klogd will first look at /boot/System.map, determine it's not the correct map file for the booting kernel, then look at /usr/src/linux/System.map, determine that it is the correct map file for the booting kernel and start reading the symbols.
A few nota bene's:
Somewhere during the 2.5.x series, the Linux kernel started to untar into linux-version, rather than just linux (show of hands -- how many people have been waiting for this to happen?). I don't know if klogd has been modified to search in /usr/src/linux-version/System.map yet. TODO: Look at the klogd srouce. If someone beats me to it, please email me and let me know if klogd has been modified to look in the new directory name for the linux source code.
The man page doesn't tell the whole the story. Look at this:
# strace -f /sbin/klogd | grep 'System.map' 31208 open("/boot/System.map-2.4.18", O_RDONLY|O_LARGEFILE) = 2
Apparently, not only does klogd look for the correct version of the map in the 3 klogd search directories, but klogd also knows to look for the name "System.map" followed by "-kernelversion", like System.map-2.4.18. This is undocumented feature of klogd.
A few drivers will need System.map to resolve symbols (since they're linked against the kernel headers instead of, say, glibc). They will not work correctly without the System.map created for the particular kernel you're currently running. This is NOT the same thing as a module not loading because of a kernel version mismatch. That has to do with the kernel version, not the kernel symbol table which changes between kernels of the same version!
Don't think that System.map is only useful for kernel oopses. Although the kernel itself doesn't really use System.map, other programs such as klogd, lsof,
satan# strace lsof 2>&1 1> /dev/null | grep System readlink("/proc/22711/fd/4", "/boot/System.map-2.4.18", 4095) = 23
and ps :
satan# strace ps 2>&1 1> /dev/null | grep System open("/boot/System.map-2.4.18", O_RDONLY|O_NONBLOCK|O_NOCTTY) = 6
and many other pieces of software like dosemu require a correct System.map.
Suppose you have multiple kernels on the same machine. You need a separate System.map files for each kernel! If boot a kernel that doesn't have a System.map file, you'll periodically see a message like: System.map does not match actual kernel Not a fatal error, but can be annoying to see everytime you do a ps ax. Some software, like dosemu, may not work correctly (although I don't know of anything off the top of my head). Lastly, your klogd or ksymoops output will not be reliable in case of a kernel oops. See the manual pages by giving commands 'man ksymoops' and 'man klogd'.
The solution is to keep all your System.map files in /boot and rename them with the kernel version. Suppose you have multiple kernels like:
/boot/vmlinuz-2.2.14
/boot/vmlinuz-2.2.13
Then just rename your map files according to the kernel version and put them in /boot, like:
/boot/System.map-2.2.14 /boot/System.map-2.2.13
Now what if you have two copies of the same kernel? Like:
/boot/vmlinuz-2.2.14
/boot/vmlinuz-2.2.14.nosound
The best answer would be if all software looked for the following files:
/boot/System.map-2.2.14 /boot/System.map-2.2.14.nosound
You can also use symlinks:
System.map-2.2.14 System.map-2.2.14.sound ln -s System.map-2.2.14.sound System.map # Here System.map -> System.map-2.2.14.sound