The /etc/profile file contains system-wide environment settings and startup programs. Any customization you put in this file applies to the environment variables of every user on your system, so it is a convenient place for compiler optimization flags. To squeeze the most performance from your x86 programs, you can compile with full optimization using the -O9 flag. Many programs specify -O2 in the Makefile. The highest optimization level GCC implements is -O3, and it treats any higher number, including -O9, as -O3. This level tends to increase the size of the generated code, but the code usually runs faster.
Please note that the -O9 flag does not always give the best performance for your processor. On an i686 or later processor it generally does; on anything below i686, not necessarily.
When compiling, use the -fomit-frame-pointer switch for any kind of processor you may have. With it, the compiler addresses local variables through the stack pointer instead of a dedicated frame pointer. Unfortunately, debugging is almost impossible with this option. You can also use the -mcpu=cpu_type and -march=cpu_type switches to optimize the program for the named CPU to the best of GCC's ability. The -mcpu switch only tunes instruction scheduling, whereas -march also lets GCC use instructions specific to that CPU, so the resulting code will be runnable only on the indicated CPU or higher.
The optimization options apply only when we compile and install a new program on our server. They play no role in the Linux base system that is already installed; they simply tell the compiler to build the new programs we install with the optimization flags we have specified in the /etc/profile file.
Below are the optimization flags that we recommend you put in your /etc/profile file, depending on your CPU architecture.
Procedure 6.1. Recommended optimization flags
For CPU i686 or PentiumPro, Pentium II, Pentium III:
In the /etc/profile file, put this line for the PentiumPro, Pentium II and Pentium III processor family:
CFLAGS='-O9 -funroll-loops -ffast-math -malign-double -mcpu=pentiumpro -march=pentiumpro -fomit-frame-pointer -fno-exceptions'
For CPU i586 or Pentium:
In the /etc/profile file, put this line for the Pentium processor family:
CFLAGS='-O3 -march=pentium -mcpu=pentium -ffast-math -funroll-loops -fomit-frame-pointer -fforce-mem -fforce-addr -malign-double -fno-exceptions'
For CPU i486:
In the /etc/profile file, put this line for the i486 processor family:
CFLAGS='-O3 -funroll-all-loops -malign-double -mcpu=i486 -march=i486 -fomit-frame-pointer -fno-exceptions'
Now, after selecting the settings for your CPU (i686, i586, or i486), go a bit further down in the /etc/profile file and add CFLAGS LANG LESSCHARSET to the export line:
export PATH PS1 HOSTNAME HISTSIZE HISTFILESIZE USER LOGNAME MAIL INPUTRC CFLAGS LANG LESSCHARSET
Log out and log back in. After this, the new CFLAGS environment variable is set, and configure scripts and other build tools will recognize it.
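One way to check the result is to confirm that a child process can see the variable, since that is how configure scripts and make receive it. The sketch below sets an example value inline so that it stands alone; this value is an assumption for illustration, and on a real system it would come from /etc/profile at login.

```shell
# Example value only; on a real system /etc/profile sets this at login.
CFLAGS='-O3 -march=pentium -fomit-frame-pointer'
export CFLAGS

# A child process sees the variable only because it was exported.
# `sh -c` stands in here for a build tool started from the login shell.
seen_by_child=$(sh -c 'printf "%s" "$CFLAGS"')
echo "child sees: $seen_by_child"
```

If the export line were missing CFLAGS, the child shell would print an empty value even though the variable is set in the login shell.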
Pentium Pro/II/III optimizations will only work with the egcs or pgcc compilers. The egcs compiler is already installed on your server by default, so you don't need to worry about it.
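The per-CPU choice in the procedure above could also be made automatically. The following is only a sketch: pick_cflags is a hypothetical helper name, the machine names are those printed by `uname -m`, and the -O2 fallback for any other machine is our assumption, chosen because it is the level many Makefiles already use.

```shell
# Hypothetical helper: map a machine name (as printed by `uname -m`)
# to the matching CFLAGS line from the procedure above.
pick_cflags() {
    case "$1" in
        i686) echo '-O9 -funroll-loops -ffast-math -malign-double -mcpu=pentiumpro -march=pentiumpro -fomit-frame-pointer -fno-exceptions' ;;
        i586) echo '-O3 -march=pentium -mcpu=pentium -ffast-math -funroll-loops -fomit-frame-pointer -fforce-mem -fforce-addr -malign-double -fno-exceptions' ;;
        i486) echo '-O3 -funroll-all-loops -malign-double -mcpu=i486 -march=i486 -fomit-frame-pointer -fno-exceptions' ;;
        *)    echo '-O2' ;;  # assumption: conservative fallback for other machines
    esac
}

CFLAGS=$(pick_cflags "$(uname -m)")
export CFLAGS
```

Keeping the three flag sets in one function makes it easy to see which line a given machine would receive, for example by running `pick_cflags i586` by hand.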
Below is the explanation of the different optimization options we use:
-funroll-loops
The -funroll-loops optimization option performs loop unrolling, and does so only for loops whose number of iterations can be determined at compile time or run time.
-funroll-all-loops
The -funroll-all-loops optimization option also performs loop unrolling, but does so for all loops.
-ffast-math
The -ffast-math optimization option allows the GCC compiler, in the interest of optimizing code for speed, to violate some ANSI or IEEE rules and specifications for floating-point arithmetic.
-malign-double
The -malign-double optimization option controls whether the GCC compiler aligns double, long double, and long long variables on a two-word boundary or a one-word boundary. Aligning them on a two-word boundary produces code that runs somewhat faster on a Pentium at the expense of more memory.
-mcpu=cpu_type
The -mcpu=cpu_type optimization option sets the CPU type that GCC assumes by default when scheduling instructions.
-fforce-mem
The -fforce-mem optimization option produces better code by forcing memory operands to be copied into registers before doing arithmetic on them and by making all memory references potential common subexpressions.
-fforce-addr
The -fforce-addr optimization option produces better code by forcing memory address constants to be copied into registers before doing arithmetic on them.
-fomit-frame-pointer
The -fomit-frame-pointer optimization option, one of the most interesting, lets the program avoid keeping the frame pointer in a register for functions that don't need one. This avoids the instructions to save, set up, and restore frame pointers, and it makes an extra register available in many functions. It also makes debugging impossible on most machines.
All future optimizations that we describe in this book refer by default to the Pentium II/III CPU family. If required, you must therefore adjust the compilation flags for your specific processor type in the /etc/profile file and also at compile time.
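To adjust the flags for a single build without editing /etc/profile, the shell lets you prefix one command with a temporary assignment. The sketch below uses example values of our own choosing, and `sh -c` stands in for a real build command such as ./configure or make.

```shell
# Example login-wide value (assumption; normally exported by /etc/profile).
CFLAGS='-O9 -mcpu=pentiumpro -march=pentiumpro'
export CFLAGS

# Prefixing a command with an assignment changes CFLAGS for that command only.
# `sh -c` stands in here for a real build command such as ./configure or make.
one_build=$(CFLAGS='-O3 -mcpu=i486 -march=i486' sh -c 'printf "%s" "$CFLAGS"')

echo "this build:  $one_build"
echo "login value: $CFLAGS"
```

The login-wide value is untouched afterwards, so later builds in the same session go back to the flags from /etc/profile.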