Garbage collector documentation and configuration

Incminimark

PyPy’s default garbage collector is called incminimark - it’s an incremental,generational moving collector. Here we hope to explain a bit how it worksand how it can be tuned to suit the workload.

Incminimark first allocates objects in so called nursery - place for youngobjects, where allocation is very cheap, being just a pointer bump. The nurserysize is a very crucial variable - depending on your workload (one or manyprocesses) and cache sizes you might want to experiment with it viaPYPY_GC_NURSERY environment variable. When the nursery is full, there isperformed a minor collection. Freed objects are no longer referencable andjust die, just by not being referenced any more; on the other hand, objectsfound to still be alive must survive and are copied from the nurseryto the old generation. Either to arenas, which are collectionsof objects of the same size, or directly allocated with malloc if they’relarger. (A third category, the very large objects, are initially allocatedoutside the nursery and never move.)

Since Incminimark is an incremental GC, the major collection is incremental:the goal is not to have any pause longer than 1ms, but in practice it dependson the size and characteristics of the heap: occasionally, there can be pausesbetween 10-100ms.

Semi-manual GC management

If there are parts of the program where it is important to have a low latency,you might want to control precisely when the GC runs, to avoid unexpectedpauses. Note that this has effect only on major collections, while minorcollections continue to work as usual.

As explained above, a full major collection consists of N steps, whereN depends on the size of the heap; generally speaking, it is not possibleto predict how many steps will be needed to complete a collection.

gc.enable() and gc.disable() control whether the GC runs collectionsteps automatically. When the GC is disabled the memory usage will growindefinitely, unless you manually call gc.collect() andgc.collect_step().

gc.collect() runs a full major collection.

gc.collect_step() runs a single collection step. It returns an object oftype GcCollectStepStats, the same which is passed to the corresponding GCHooks. The following code is roughly equivalent to a gc.collect():

  1. while True:
  2. if gc.collect_step().major_is_done:
  3. break

For a real-world example of usage of this API, you can look at the 3rd-partymodule pypytools.gc.custom, which also provides a with customgc.nogc()context manager to mark sections where the GC is forbidden.

Fragmentation

Before we discuss issues of “fragmentation”, we need a bit of precision.There are two kinds of related but distinct issues:

  • If the program allocates a lot of memory, and then frees it all bydropping all references to it, then we might expect to see the RSSto drop. (RSS = Resident Set Size on Linux, as seen by “top”; it is anapproximation of the actual memory usage from the OS’s point of view.)This might not occur: the RSS may remain at its highest value. Thisissue is more precisely caused by the process not returning “free”memory to the OS. We call this case “unreturned memory”.
  • After doing the above, if the RSS didn’t go down, then at least futureallocations should not cause the RSS to grow more. That is, the processshould reuse unreturned memory as long as it has got some left. If thisdoes not occur, the RSS grows even larger and we have real fragmentationissues.

gc.get_stats

There is a special function in the gc module calledget_stats(memory_pressure=False).

memory_pressure controls whether or not to report memory pressure fromobjects allocated outside of the GC, which requires walking the entire heap,so it’s disabled by default due to its cost. Enable it when debuggingmysterious memory disappearance.

Example call looks like that:

  1. >>> gc.get_stats(True)
  2. Total memory consumed:
  3. GC used: 4.2MB (peak: 4.2MB)
  4. in arenas: 763.7kB
  5. rawmalloced: 383.1kB
  6. nursery: 3.1MB
  7. raw assembler used: 0.0kB
  8. memory pressure: 0.0kB
  9. -----------------------------
  10. Total: 4.2MB
  11.  
  12. Total memory allocated:
  13. GC allocated: 4.5MB (peak: 4.5MB)
  14. in arenas: 763.7kB
  15. rawmalloced: 383.1kB
  16. nursery: 3.1MB
  17. raw assembler allocated: 0.0kB
  18. memory pressure: 0.0kB
  19. -----------------------------
  20. Total: 4.5MB

In this particular case, which is just at startup, GC consumes relativelylittle memory and there is even less unused, but allocated memory. In casethere is a lot of unreturned memory or actual fragmentation, the “allocated”can be much higher than “used”. Generally speaking, “peak” will more closelyresemble the actual memory consumed as reported by RSS. Indeed, returningmemory to the OS is a hard and not solved problem. In PyPy, it occurs only ifan arena is entirely free—a contiguous block of 64 pages of 4 or 8 KB each.It is also rare for the “rawmalloced” category, at least for common systemimplementations of malloc().

The details of various fields:

  • GC in arenas - small old objects held in arenas. If the amount “allocated”is much higher than the amount “used”, we have unreturned memory. It ispossible but unlikely that we have internal fragmentation here. However,this unreturned memory cannot be reused for any malloc(), including thememory from the “rawmalloced” section.
  • GC rawmalloced - large objects allocated with malloc. This is gives thecurrent (first block of text) and peak (second block of text) memoryallocated with malloc(). The amount of unreturned memory orfragmentation caused by malloc() cannot easily be reported. Usuallyyou can guess there is some if the RSS is much larger than the totalmemory reported for “GC allocated”, but do keep in mind that this totaldoes not include malloc’ed memory not known to PyPy’s GC at all. If youguess there is some, consider using jemalloc as opposed to system malloc.
  • nursery - amount of memory allocated for nursery, fixed at startup,controlled via an environment variable
  • raw assembler allocated - amount of assembler memory that JIT feelsresponsible for
  • memory pressure, if asked for - amount of memory we think got allocatedvia external malloc (eg loading cert store in SSL contexts) that is keptalive by GC objects, but not accounted in the GC

GC Hooks

GC hooks are user-defined functions which are called whenever a specific GCevent occur, and can be used to monitor GC activity and pauses. You caninstall the hooks by setting the following attributes:

  • gc.hook.on_gc_minor
  • Called whenever a minor collection occurs. It corresponds togc-minor sections inside PYPYLOG.
  • gc.hook.on_gc_collect_step
  • Called whenever an incremental step of a major collection occurs. Itcorresponds to gc-collect-step sections inside PYPYLOG.
  • gc.hook.on_gc_collect
  • Called after the last incremental step, when a major collection is fullydone. It corresponds to gc-collect-done sections inside PYPYLOG.

To uninstall a hook, simply set the corresponding attribute to None. Toinstall all hooks at once, you can call gc.hooks.set(obj), which will lookfor methods ongc* on obj. To uninstall all the hooks at once, youcan call gc.hooks.reset().

The functions called by the hooks receive a single stats argument, whichcontains various statistics about the event.

Note that PyPy cannot call the hooks immediately after a GC event, but it hasto wait until it reaches a point in which the interpreter is in a known stateand calling user-defined code is harmless. It might happen that multipleevents occur before the hook is invoked: in this case, you can inspect thevalue stats.count to know how many times the event occurred since the lasttime the hook was called. Similarly, stats.duration contains thetotal time spent by the GC for this specific event since the last time thehook was called.

On the other hand, all the other fields of the stats object are relativeonly to the last event of the series.

The attributes for GcMinorStats are:

  • count
  • The number of minor collections occurred since the last hook call.
  • duration
  • The total time spent inside minor collections since the last hookcall, in seconds.
  • duration_min
  • The duration of the fastest minor collection since the last hook call.
  • duration_max
The duration of the slowest minor collection since the last hook call.
  • total_memory_used
  • The amount of memory used at the end of the minor collection, inbytes. This include the memory used in arenas (for GC-managed memory) andraw-malloced memory (e.g., the content of numpy arrays).
    • pinned_objects
    • the number of pinned objects.

The attributes for GcCollectStepStats are:

  • count, duration, duration_min, duration_max
  • See above.
  • oldstate, newstate
  • Integers which indicate the state of the GC before and after the step.
  • major_is_done
  • Boolean which indicate whether this was the last step of the majorcollection

The value of oldstate and newstate is one of these constants, definedinside gc.GcCollectStepStats: STATE_SCANNING, STATE_MARKING,STATE_SWEEPING, STATE_FINALIZING, STATE_USERDEL. It is possibleto get a string representation of it by indexing the GC_STATES tuple.

The attributes for GcCollectStats are:

  • count
  • See above.
  • num_major_collects
  • The total number of major collections which have been done since thestart. Contrarily to count, this is an always-growing counter and it’snot reset between invocations.
  • arenas_count_before, arenas_count_after
  • Number of arenas used before and after the major collection.
  • arenas_bytes
  • Total number of bytes used by GC-managed objects.
  • rawmalloc_bytes_before, rawmalloc_bytes_after
  • Total number of bytes used by raw-malloced objects, before and after themajor collection.

Note that GcCollectStats has not got a duration field. This isbecause all the GC work is done inside gc-collect-step:gc-collect-done is used only to give additional stats, but doesn’t do anyactual work.

Here is an example of GC hooks in use:

  1. import sys
  2. import gc
  3.  
  4. class MyHooks(object):
  5. done = False
  6.  
  7. def on_gc_minor(self, stats):
  8. print 'gc-minor: count = %02d, duration = %d' % (stats.count,
  9. stats.duration)
  10.  
  11. def on_gc_collect_step(self, stats):
  12. old = gc.GcCollectStepStats.GC_STATES[stats.oldstate]
  13. new = gc.GcCollectStepStats.GC_STATES[stats.newstate]
  14. print 'gc-collect-step: %s --> %s' % (old, new)
  15. print ' count = %02d, duration = %d' % (stats.count,
  16. stats.duration)
  17.  
  18. def on_gc_collect(self, stats):
  19. print 'gc-collect-done: count = %02d' % stats.count
  20. self.done = True
  21.  
  22. hooks = MyHooks()
  23. gc.hooks.set(hooks)
  24.  
  25. # simulate some GC activity
  26. lst = []
  27. while not hooks.done:
  28. lst = [lst, 1, 2, 3]

Environment variables

PyPy’s default incminimark garbage collector is configurable throughseveral environment variables:

  • PYPY_GC_NURSERY
  • The nursery size.Defaults to 1/2 of your last-level cache, or 4M if unknown.Small values (like 1 or 1KB) are useful for debugging.
  • PYPY_GC_NURSERY_DEBUG
  • If set to non-zero, will fill nursery with garbage, to helpdebugging.
  • PYPY_GC_INCREMENT_STEP
  • The size of memory marked during the marking step. Default is size ofnursery times 2. If you mark it too high your GC is not incremental atall. The minimum is set to size that survives minor collection times1.5 so we reclaim anything all the time.
  • PYPY_GC_MAJOR_COLLECT
  • Major collection memory factor.Default is 1.82, which means trigger a major collection when thememory consumed equals 1.82 times the memory really used at the endof the previous major collection.
  • PYPY_GC_GROWTH
  • Major collection threshold’s max growth rate.Default is 1.4.Useful to collect more often than normally on sudden memory growth,e.g. when there is a temporary peak in memory usage.
  • PYPY_GC_MAX
  • The max heap size.If coming near this limit, it will first collect more often, thenraise an RPython MemoryError, and if that is not enough, crash theprogram with a fatal error.Try values like 1.6GB.
  • PYPY_GC_MAX_DELTA
  • The major collection threshold will never be set to more thanPYPY_GC_MAX_DELTA the amount really used after a collection.Defaults to 1/8th of the total RAM size (which is constrained to beat most 2/3/4GB on 32-bit systems).Try values like 200MB.
  • PYPY_GC_MIN
  • Don’t collect while the memory size is below this limit.Useful to avoid spending all the time in the GC in very smallprograms.Defaults to 8 times the nursery.
  • PYPY_GC_DEBUG
  • Enable extra checks around collections that are too slow for normaluse.Values are 0 (off), 1 (on major collections) or 2 (alsoon minor collections).
  • PYPY_GC_MAX_PINNED
  • The maximal number of pinned objects at any point in time. Defaultsto a conservative value depending on nursery size and maximum objectsize inside the nursery. Useful for debugging by setting it to 0.