From: eLinux.org

Application Init Optimizations

Description

This page describes optimizations to a large application and to the
kernel, to shorten the time required to load and execute an application.

Two main techniques are described here: 1) use of mmap vs. read and 2)
control over page mapping characteristics. These techniques are
discussed below.

Rationale

Kernel bootup time has improved drastically through recent efforts,
including CELF activities. As a next step, application bootup time
should be reduced in order to cut total system bootup time. The
techniques described here apply to the many embedded systems that are
built around a single large application.

Application Tuning

Using mmap() instead of read() for initial application data load

An application may load a large amount of data when it is first
initialized. This can result in a long delay as the file data is read
into memory. It is possible to avoid the initial cost of this read, by
using mmap() instead of read().

Instead of loading all of the data into memory with the read system
call, the file can be mapped into memory with the mmap system call. Once
the data file is mapped, individual pages are demand loaded during
execution, when the application first accesses them. Depending on the
initial working set size of the data in the file, this can result in
significant time savings. For example, if an application initially uses
only 50% of the data in the file, then only 50% of the data is read into
memory from persistent storage. There is extra overhead from the page
faults incurred in loading the pages on demand. However, when the
initial working set is small, this page-fault overhead is offset by the
savings in the number of page reads compared to the read() case.
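The difference between the two loading strategies can be sketched as
follows. This is only an illustration, not the code of the measured
application: the helper names open_sized, load_with_read and
load_with_mmap are hypothetical, and error handling is kept minimal.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Open `path` read-only and report its size; returns the fd or -1. */
static int open_sized(const char *path, size_t *len_out)
{
    assert(path != NULL && len_out != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size <= 0) {
        close(fd);
        return -1;
    }
    *len_out = (size_t)st.st_size;
    return fd;
}

/* Eager load: every byte is copied into a heap buffer up front.
 * Returns the buffer (release with free) or NULL on failure. */
static void *load_with_read(const char *path, size_t *len_out)
{
    int fd = open_sized(path, len_out);
    if (fd < 0)
        return NULL;
    char *buf = malloc(*len_out);
    size_t done = 0;
    while (buf != NULL && done < *len_out) {
        ssize_t n = read(fd, buf + done, *len_out - done);
        if (n <= 0) {               /* error or unexpected end of file */
            free(buf);
            buf = NULL;
        } else {
            done += (size_t)n;
        }
    }
    close(fd);
    return buf;
}

/* Lazy load: the file is only mapped here; each page is brought in
 * by a page fault the first time the application touches it.
 * Returns the mapping (release with munmap) or NULL on failure. */
static void *load_with_mmap(const char *path, size_t *len_out)
{
    int fd = open_sized(path, len_out);
    if (fd < 0)
        return NULL;
    void *p = mmap(NULL, *len_out, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE, fd, 0);
    close(fd);                      /* the mapping stays valid */
    return p == MAP_FAILED ? NULL : p;
}
```

Both functions give the caller a writable, process-private view of the
same bytes. The read() version pays the whole I/O cost before it
returns, while the mmap() version defers that cost to first access.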

Customizing file cache control in the Kernel

To further improve this method, the kernel can be modified to reduce
page copying and page faults.

Eliminating redundant page copies

When pages are demand loaded from a memory-mapped file, the pages are
kept in memory as part of the kernel “file cache” and mapped into the
requesting process’s address space. If a page of a private mapping is
accessed via a write operation, then the page in the file system cache
is copied to a newly allocated memory page. (This is referred to as
“copy-on-write”.) The copied page can then be freely modified by the
process which maps it.

Suppose, however, that a file is mapped or accessed by only one process.
Then, copying the page is redundant. In this case, we can convert the
page in the file cache to a private page immediately. By utilizing this
assumption (only one user for the page), the cost of the copy can be
eliminated. This has the side benefit of reducing memory consumption as
well.
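The copy-on-write behaviour that this optimization targets can be
observed on a stock kernel with a private mapping. The sketch below is
not the takeover patch (no patch is available); it only shows that a
write through a MAP_PRIVATE mapping lands in a process-private copy and
leaves the file contents untouched. The function name
file_byte_after_private_write is hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map the first `len` bytes of `path` privately, overwrite byte 0
 * through the mapping, then re-read byte 0 from the file itself.
 * Returns the byte as stored in the file, or -1 on error. */
static int file_byte_after_private_write(const char *path, size_t len,
                                         char new_value)
{
    assert(path != NULL && len > 0);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }

    /* Write access: the kernel gives this process its own copy of
     * the page instead of modifying the cached file page. */
    p[0] = new_value;
    int mapping_changed = (p[0] == new_value);

    /* mmap does not move the file offset, so this reads byte 0. */
    char in_file = 0;
    ssize_t n = read(fd, &in_file, 1);

    munmap(p, len);
    close(fd);
    if (n != 1 || !mapping_changed)
        return -1;
    return (unsigned char)in_file;
}
```

If the file starts with 'A' and the function is called with 'Z', it
returns 'A': the modification exists only in the private copy. When the
file has a single user, that copy duplicates a cache page nobody else
needs, which is the redundancy described above.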

Reducing page faults

In some cases, an individual page in the process address space is
accessed first with a read operation, then with a write operation. This
results in two page faults for the same page: one to load the page and
move it “through” the file cache, and the other to get a local copy of
the page. By eliminating the page copy, and making the page private on
the first access (whether read or write), the second page fault can be
avoided.
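The read-then-write pattern can be observed from user space by sampling
the process fault counters around each pass. The following is a rough
measurement sketch for a stock kernel, not part of the described kernel
change; the names faults_so_far and count_read_then_write_faults are
hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/* Minor plus major page faults taken by this process so far. */
static long faults_so_far(void)
{
    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    return ru.ru_minflt + ru.ru_majflt;
}

/* Touch every page of a private file mapping twice: first with a
 * read, then with a write. Stores the number of faults seen during
 * each pass. Returns 0 on success, -1 on error. */
static int count_read_then_write_faults(const char *path, size_t len,
                                        long *read_faults,
                                        long *write_faults)
{
    assert(path != NULL && len > 0);
    assert(read_faults != NULL && write_faults != NULL);

    long page = sysconf(_SC_PAGESIZE);
    int fd = open(path, O_RDONLY);
    if (fd < 0 || page <= 0) {
        if (fd >= 0)
            close(fd);
        return -1;
    }
    volatile char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE, fd, 0);
    close(fd);
    if (p == MAP_FAILED)
        return -1;

    long t0 = faults_so_far();
    char sink = 0;
    for (size_t off = 0; off < len; off += (size_t)page)
        sink ^= p[off];             /* first access: read */
    long t1 = faults_so_far();
    for (size_t off = 0; off < len; off += (size_t)page)
        p[off] = sink;              /* second access: write */
    long t2 = faults_so_far();

    munmap((void *)p, len);
    if (t0 < 0 || t1 < 0 || t2 < 0)
        return -1;
    *read_faults = t1 - t0;
    *write_faults = t2 - t1;
    return 0;
}
```

The exact counts depend on the kernel version and on what is already in
the file cache, so no particular numbers are claimed here. The write
pass is the one the takeover approach aims to make fault-free.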

Controlling API

The current implementation is experimental in the way it selects the
files affected by this caching/virtual memory customization. It would be
better to control this mechanism per file or per virtual memory area.
The fcntl and mmap system calls are candidates where this control could
be introduced.

Resources

Projects

None.

Specifications

Boot Time

Downloads

Patch

Sorry but there is no available patch at this time.

Sample Results

Case Study 1

Hardware

Software Kernel

  • 2.4.27 kernel.

Target application

Methods explanation

  1. read(CF/ext3): The data file is loaded using the read system call
    from an ext3 file system on CompactFlash memory.
  2. mmap(CF/ext3): The data file is mapped into the process virtual
    address space using the mmap system call.
  3. takeover(CF/ext3): The data file is mapped, and each page in the
    file system cache (which is created during page fault handling) is
    converted to a private page immediately.
  4. takeover(CF/squash): Same as No.3, except using the SquashFS file
    system.
  5. takeover(RD/squash): Same as No.4, except that the file system is
    on a RAM disk instead of CompactFlash.

Results

No.  Method           Media  FS      Ave.   1st    2nd    3rd    Diff.
1    read             CF     ext3    4.420  4.418  4.420  4.421  -
2    mmap             CF     ext3    3.995  3.995  3.995  3.996  -0.424
3    takeover         CF     ext3    3.959  3.959  3.958  3.966  -0.461
4    takeover         CF     squash  4.002  4.000  4.000  4.007  -0.417
5    takeover(total)  RD     squash  4.588  4.579  4.590  4.595  0.168
     dd(CF -> RD)     RD     squash  1.212  1.209  1.209  1.217
     mount            RD     squash  0.041  0.040  0.041  0.041
     takeover         RD     squash  3.336  3.330  3.340  3.337
  • UNIT: sec
  • CF: CompactFlash / RD: RAM Disk

Chart1.png

  1. Using the mmap system call reduces bootup time by about 400 msec
    (about 10% of total init time).
  2. With the takeover method, the number of page faults drops to 317,
    versus 496 with the mmap method, and redundant page copies are
    eliminated. As a result, about a further 40 msec is saved.
  3. SquashFS is a compressed read-only file system, so accessing data
    carries extra costs such as decompression. Even so, its performance
    is close to that of ext3 (4.002 sec versus 3.959 sec with takeover),
    which makes SquashFS a good choice for reducing storage consumption.
  4. Using a file system on a RAM disk gives the fastest file access
    (3.336 sec for the takeover step alone). In this measurement,
    however, copying the image from CompactFlash to the RAM disk took
    1.212 sec, so the total was 0.168 sec slower than the read case. If
    the storage device that holds the file system image is fast enough
    and the extra RAM usage is affordable, a RAM disk might be a good
    choice for reducing bootup time.
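The figures quoted in these notes can be rechecked from the averages in
the results table. The small sketch below does that arithmetic; the
constant and function names are made up for illustration, and the
values are copied from the table (in seconds).

```c
#include <assert.h>

/* Averages from the results table, in seconds. */
static const double READ_CF_EXT3     = 4.420;
static const double MMAP_CF_EXT3     = 3.995;
static const double TAKEOVER_CF_EXT3 = 3.959;
static const double DD_CF_TO_RD      = 1.212;
static const double MOUNT_RD         = 0.041;
static const double TAKEOVER_RD      = 3.336;

/* Time saved by `measured` relative to `baseline`, in percent. */
static double saving_percent(double baseline, double measured)
{
    assert(baseline > 0.0);
    return (baseline - measured) / baseline * 100.0;
}

/* Total for the RAM disk case: copy the image, mount it, then run
 * the takeover load. */
static double ramdisk_total(void)
{
    return DD_CF_TO_RD + MOUNT_RD + TAKEOVER_RD;
}

/* Extra saving of takeover over plain mmap, in seconds. */
static double takeover_gain_over_mmap(void)
{
    return MMAP_CF_EXT3 - TAKEOVER_CF_EXT3;
}
```

saving_percent(READ_CF_EXT3, MMAP_CF_EXT3) comes to roughly 9.6%, which
matches the "about 10%" above. ramdisk_total() is about 4.589 sec,
within rounding of the 4.588 sec reported for case 5, and
takeover_gain_over_mmap() is 0.036 sec, consistent with the "about 40
msec" figure.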

Case Study 2

  • Status: measured
  • Architecture Support:
      • i386: unknown
      • ARM: unknown
      • PPC: unknown
      • MIPS: unknown
      • SH: works on SH3

Future Work/Action Items

Here is a list of things that could be worked on for this feature:

  • I am considering implementing similar file cache control using the
    fadvise system call under the 2.6 kernel.

Other resources

This project was demonstrated at the 2005 CELF Technical Conference. A
picture of the poster is here:

Celf-demo-poster-fujitsu.jpg
