Periodic Executors

PyMongo implements a PeriodicExecutor for two purposes: as the background thread for Monitor, and to regularly check if there are OP_KILL_CURSORS messages that must be sent to the server.

Killing Cursors

An incompletely iterated Cursor on the client represents an open cursor object on the server. In code like this, we lose a reference to the cursor before finishing iteration:

    for doc in collection.find():
        raise Exception()

We try to send an OP_KILL_CURSORS to the server to tell it to clean up the server-side cursor. But we must not take any locks directly from the cursor’s destructor (see PYTHON-799), so we cannot safely use the PyMongo data structures required to send a message. The solution is to add the cursor’s id to an array on the MongoClient without taking any locks.

Each client has a PeriodicExecutor devoted to checking the array for cursor ids. Any it sees are the result of cursors that were freed while the server-side cursor was still open. The executor can safely take the locks it needs in order to send the OP_KILL_CURSORS message.
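The hand-off can be sketched as follows. This is a minimal illustration, assuming CPython, where a single list.append or list.pop is atomic under the global interpreter lock. The names SketchClient, SketchCursor, close_cursor_soon and process_kill_cursors are hypothetical and are not PyMongo's actual internals; sending the message is replaced by recording the id in a list.

```python
import threading


class SketchClient:
    """Hypothetical stand-in for MongoClient; not PyMongo's real class."""

    def __init__(self):
        self._lock = threading.Lock()  # stands in for the locks needed to send
        self._kill_cursors_queue = []  # the lock-free "array" of cursor ids
        self.killed = []               # records what would have been sent

    def close_cursor_soon(self, cursor_id):
        # Safe to call from a destructor: one list.append, no lock taken.
        self._kill_cursors_queue.append(cursor_id)

    def process_kill_cursors(self):
        # Meant to run on the periodic executor's thread, where locking is safe.
        while True:
            try:
                cursor_id = self._kill_cursors_queue.pop()
            except IndexError:
                break  # queue drained
            with self._lock:
                # Real code would send OP_KILL_CURSORS here.
                self.killed.append(cursor_id)


class SketchCursor:
    """Hypothetical cursor that reports its id when it is freed."""

    def __init__(self, client, cursor_id):
        self._client = client
        self._id = cursor_id

    def __del__(self):
        self._client.close_cursor_soon(self._id)
```

Freeing a SketchCursor only appends to the list; the id is handled later, when the executor's thread calls process_kill_cursors().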

Stopping Executors

Just as Cursor must not take any locks from its destructor, neither can MongoClient and Topology. Thus, although the client calls close() on its kill-cursors thread, and the topology calls close() on all its monitor threads, the close() method cannot actually call wake() on the executor, since wake() takes a lock.

Instead, executors wake periodically to check if self.close is set, and if so they exit.

A thread can log spurious errors if it wakes late in the Python interpreter’s shutdown sequence, so we try to join threads before then. Each periodic executor (either a monitor or a kill-cursors thread) adds a weakref to itself to a set called _EXECUTORS, in the periodic_executor module.

An exit handler runs on shutdown and tells all executors to stop, then tries (with a short timeout) to join all executor threads.

Monitoring

For each server in the topology, Topology uses a periodic executor to launch a monitor thread. This thread must not prevent the topology from being freed, so it weakrefs the topology. Furthermore, it uses a weakref callback to terminate itself soon after the topology is freed.
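The weakref arrangement can be illustrated with a small sketch. The classes below are hypothetical stand-ins, not PyMongo's real Monitor, Topology or PeriodicExecutor, and the executor is reduced to its stop flag so that no thread is needed.

```python
import weakref


class SketchExecutor:
    """Hypothetical executor reduced to its stop flag."""

    def __init__(self):
        self.stopped = False

    def close(self, dummy=None):
        # Accepts one optional argument so it can double as a weakref
        # callback, which is always passed the dead reference.
        self.stopped = True


class SketchTopology:
    """Hypothetical stand-in for Topology."""


class SketchMonitor:
    """Hypothetical monitor holding only a weak proxy to its topology."""

    def __init__(self, topology):
        self.executor = SketchExecutor()
        # The proxy never keeps the topology alive, and its callback stops
        # the executor once the topology has been freed.
        self._topology = weakref.proxy(topology, self.executor.close)
```

In this sketch, dropping the last strong reference to the topology sets the executor's stop flag through the callback, without taking any lock.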

Solid lines represent strong references, dashed lines weak ones:

[Figure: ../_images/periodic-executor-refs.png]

See Stopping Executors above for an explanation of the _EXECUTORS set.

It is a requirement of the Server Discovery And Monitoring Spec that a sleeping monitor can be awakened early. Aside from infrequent wakeups to do their appointed chores, and occasional interruptions, periodic executors also wake periodically to check if they should terminate.

Our first implementation of this idea was the obvious one: use the Python standard library’s threading.Condition.wait with a timeout. Another thread wakes the executor early by signaling the condition variable.
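A sketch of that approach, using only threading.Condition from the standard library. The class ConditionSleeper and its _wake_requested flag are hypothetical; the flag is an addition that keeps a wake-up from being lost if it arrives before the sleeper starts waiting.

```python
import threading


class ConditionSleeper:
    """Hypothetical helper: sleep up to `timeout` seconds, or until wake()."""

    def __init__(self):
        self._condition = threading.Condition()
        self._wake_requested = False

    def sleep(self, timeout):
        """Return True if woken early, False if the timeout expired."""
        with self._condition:
            if not self._wake_requested:
                self._condition.wait(timeout)
            woke_early = self._wake_requested
            self._wake_requested = False
            return woke_early

    def wake(self):
        # Signaling a condition variable requires holding its lock.
        with self._condition:
            self._wake_requested = True
            self._condition.notify()
```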

A topology cannot signal the condition variable to tell the executor to terminate, because it would risk a deadlock in the garbage collector: no destructor or weakref callback can take a lock to signal the condition variable (see PYTHON-863); thus the only way for a dying object to terminate a periodic executor is to set its “stopped” flag and let the executor see the flag next time it wakes.

We erred on the side of prompt cleanup, and set the check interval at 100ms. We assumed that checking a flag and going back to sleep 10 times a second was cheap on modern machines.

Starting in Python 3.2, the builtin C implementation of lock.acquire takes a timeout parameter, so Python 3.2+ Condition variables sleep simply by calling lock.acquire; they are implemented as efficiently as expected.

But in Python 2, lock.acquire has no timeout. To wait with a timeout, a Python 2 condition variable sleeps a millisecond, tries to acquire the lock, sleeps twice as long, and tries again. This exponential backoff reaches a maximum sleep time of 50ms.

If PyMongo calls the condition variable’s “wait” method with a short timeout, the exponential backoff is restarted frequently. Overall, the condition variable is not waking a few times a second, but hundreds of times. (See PYTHON-983.)
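The cost of restarting the backoff can be estimated with an idealized model of the loop described above. The function below is a hypothetical simulation, not Python 2's actual code: it does no real sleeping, ignores scheduling overhead and sleep overshoot, and counts wakeups for a single waiting thread only. Times are integer microseconds to avoid floating-point rounding.

```python
def count_wakeups(timeout_us):
    """Count the sleeps in one timed wait under the backoff model.

    The first sleep is 1 ms; each later sleep doubles, capped at 50 ms
    and at the time remaining before the timeout.
    """
    elapsed = 0
    delay = 500  # doubled before first use, giving a 1 ms first sleep
    wakeups = 0
    while elapsed < timeout_us:
        delay = min(delay * 2, timeout_us - elapsed, 50_000)
        elapsed += delay
        wakeups += 1
    return wakeups


# Ten 100 ms waits per second versus a single 1 s wait.
short_waits_per_second = 10 * count_wakeups(100_000)  # 70
one_long_wait = count_wakeups(1_000_000)              # 25
```

Under this model, each 100ms wait spends most of its time in the short early sleeps of the backoff, so one thread waking ten times a second sleeps about 70 times per second instead of the 25 a single one-second wait would need. The figure is per waiting thread, and a process runs one executor for each monitor plus one for kill-cursors.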

Thus the current design of periodic executors is surprisingly simple: they call time.sleep for half a second, check if it is time to wake or terminate, and sleep again.
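A minimal sketch of such a loop follows. SleepingExecutor and its attribute names are hypothetical rather than PyMongo's actual class. It also assumes that both close() and wake() only set a flag, so neither takes a lock.

```python
import threading
import time


class SleepingExecutor:
    """Hypothetical sketch of a sleep-based periodic executor."""

    def __init__(self, interval, min_interval, target):
        self._interval = interval          # seconds between scheduled runs
        self._min_interval = min_interval  # length of each short sleep
        self._target = target
        self._stopped = False
        self._wake_requested = False
        self._thread = threading.Thread(target=self._run, daemon=True)

    def open(self):
        self._thread.start()

    def close(self, dummy=None):
        self._stopped = True  # no lock: usable from a destructor or callback

    def wake(self):
        self._wake_requested = True  # noticed after the current short sleep

    def join(self, timeout=None):
        self._thread.join(timeout)

    def _run(self):
        while not self._stopped:
            self._target()
            deadline = time.monotonic() + self._interval
            # Sleep in short slices, checking the flags after each one.
            while not self._stopped and time.monotonic() < deadline:
                time.sleep(self._min_interval)
                if self._wake_requested:
                    break  # early wake
            self._wake_requested = False
```

Passing min_interval=0.5 would correspond to the half-second sleep described above; the trade-off is that a stop or wake request may take up to that long to be noticed.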