From: eLinux.org

Data Read In Place

This page has information about “Data Read-In-Place” (DRIP), which is of
interest to CE Linux Forum members because it allows data pages to
reside in ROM or flash until they are written to. This is essentially a
form of XIP, with copy-on-write, for data pages. XIP keeps text
segment pages in flash permanently; DRIP keeps data pages in flash
until they are first written. Since many application data pages are
never written, the net effect is a substantial reduction in RAM usage
for application data segments. This feature was previously called
“Data Allocate On Write”, but the name “Data Read In Place” is closer
to the already-existing term for text (Execute In Place) and is now
preferred.

The total effect for one system measured by Panasonic was a reduction of
26% of the page cache allocated to processes, when the product was in
the stand-by state.

The technique was described by Masashige Mizuyama, Chief Architect in
the System Architecture Development Group, Base System Development
Center, Panasonic Mobile Communications Co., Ltd.

Description

There is no need to change kernel code for this feature; we changed
only the dynamic linker (in glibc of MVL CEE3.1). This was used with a
2.4.x Linux kernel.

Usually, the dynamic linker maps each ELF segment to the virtual address
space of the process, using mmap.

We change it as follows:

  if (segment includes a .data section) {
      Do mmap(), forcing the PROT_WRITE bit off.  ------(1)
      Set the PROT_WRITE bit on, with mprotect(). ------(2)
  } else {
      Do mmap() as usual.
  }

This is very simple.

Below is a description of how it works.

When the XIP ELF shared library is dynamically linked at runtime, the
PROT_WRITE bit is off ((1) above) when the segment is mmap'ed, so the
kernel assumes the linker is mapping an XIP text segment. The kernel
therefore builds page tables that map every physical ROM page of the
segment into the process virtual address space, with each page table
entry (PTE) write-protected.

Then, because the mprotect() call sets PROT_WRITE on the mapped area,
the virtual memory area for the segment has write permission (in the
kernel vm_area_struct). This combination of a write-protected PTE and
a writable vm_area_struct is identical to a page that is set up for
“copy on write”.

So the pages in the segment are mapped directly to ROM pages until
they are written to.

This is a kind of “fake” approach that supports the feature with
minimal changes, so it has some pitfalls. One problem we have already
noticed is that get_user_pages() does not work on a segment mapped
like this.

The get_user_pages() function is used by the kernel for mlock, ptrace
and core dumps, so these do not work correctly for such a segment with
the current implementation.

However, the advantage was great enough that we decided to use it. I
think the implementation needs to be cleaned up by adding direct kernel
support for this type of page mapping.

Documents

draft patch

This patch applies to dl-load.c in glibc, the source from which the
runtime dynamic linker (ld-linux.so) is built.

*** dl-load7.c  Mon Jul 11 21:26:47 2005
--- dl-load.c   Sat Jan 8 11:37:38 2005
***************
*** 801,819 ****
--- 801,849 ----
      if (! (locked_load_mode & (RTLD_LOCK_DEPENDENT_LIB_PAGES
                                 | RTLD_LOCK_LIB_PAGES)))
        {
+         if ((prot & PROT_WRITE) != 0) {
+           prot = (prot & ~PROT_WRITE);
+           mapat = __mmap ((caddr_t) mapstart, len, prot,
+                           fixed|MAP_COPY|MAP_FILE,
+                           fd, offset);
+           if (mapat != MAP_FAILED) {
+             prot = (prot | PROT_WRITE);
+             if (__mprotect (mapat, len, prot) == -1) {
+               return N_("failed to map segment from shared object");
+             }
+           } else {
+             return N_("failed to map segment from shared object");
+           }
+         } else {
            mapat = __mmap ((caddr_t) mapstart, len, prot,
                            fixed|MAP_COPY|MAP_FILE,
                            fd, offset);
            if (mapat == MAP_FAILED)
              return N_("failed to map segment from shared object");
+         }
        }
      else if (locked_load_mode & RTLD_LOCK_MLOCK)
        {
+         if ((prot & PROT_WRITE) != 0) {
+           prot = (prot & ~PROT_WRITE);
+           mapat = __mmap ((caddr_t) mapstart, len, prot,
+                           fixed|MAP_COPY|MAP_FILE,
+                           fd, offset);
+           if (mapat != MAP_FAILED) {
+             prot = (prot | PROT_WRITE);
+             if (__mprotect (mapat, len, prot) == -1) {
+               return N_("failed to map segment from shared object");
+             }
+           } else {
+             return N_("failed to map segment from shared object");
+           }
+         } else {
            mapat = __mmap ((caddr_t) mapstart, len, prot,
                            fixed|MAP_COPY|MAP_FILE,
                            fd, offset);
            if (mapat == MAP_FAILED)
              return N_("failed to map segment from shared object");
+         }
          if (mlock((caddr_t) mapat, len) != 0)
            {
              return N_("failed to memory lock segment from shared object");

Category: