Connections / Engines

How do I configure logging?

See Configuring Logging.

How do I pool database connections? Are my connections pooled?

SQLAlchemy performs application-level connection pooling automatically in most cases. With the exception of SQLite, an Engine object refers to a QueuePool as a source of connectivity.
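
As a rough sketch, the pool can be tuned through create_engine() parameters such as pool_size and max_overflow (the values shown are illustrative only):

    from sqlalchemy import create_engine

    engine = create_engine(
        "mysql://scott:tiger@localhost/test",
        pool_size=10,       # connections kept persistently in the pool
        max_overflow=20,    # additional connections allowed beyond pool_size
    )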

For more detail, see Engine Configuration and Connection Pooling.

How do I pass custom connect arguments to my database API?

The create_engine() call accepts additional arguments either directly via the connect_args keyword argument:

    e = create_engine("mysql://scott:tiger@localhost/test",
                      connect_args={"encoding": "utf8"})

Or for basic string and integer arguments, they can usually be specified in the query string of the URL:

    e = create_engine("mysql://scott:tiger@localhost/test?encoding=utf8")

See also

Custom DBAPI connect() arguments

“MySQL Server has gone away”

The primary cause of this error is that the MySQL connection has timed out and has been closed by the server. The MySQL server closes connections which have been idle for a period of time which defaults to eight hours. To accommodate this, the immediate setting is to enable the create_engine.pool_recycle setting, which will ensure that a connection which is older than a set amount of seconds will be discarded and replaced with a new connection when it is next checked out.
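
For example, a minimal sketch that recycles connections after one hour (the interval is illustrative; choose a value below the server's timeout):

    from sqlalchemy import create_engine

    # connections older than 3600 seconds are replaced on the next checkout
    engine = create_engine("mysql://scott:tiger@localhost/test", pool_recycle=3600)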

For the more general case of accommodating database restarts and other temporary loss of connectivity due to network issues, connections that are in the pool may be recycled in response to more generalized disconnect detection techniques. The section Dealing with Disconnects provides background on both “pessimistic” (e.g. pre-ping) and “optimistic” (e.g. graceful recovery) techniques. Modern SQLAlchemy tends to favor the “pessimistic” approach.
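
Assuming a SQLAlchemy version that supports the create_engine.pool_pre_ping parameter, the “pessimistic” approach can be enabled with a single flag; a rough sketch:

    from sqlalchemy import create_engine

    # each pooled connection is tested with a lightweight ping before use
    engine = create_engine("mysql://scott:tiger@localhost/test", pool_pre_ping=True)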

See also

Dealing with Disconnects

“Commands out of sync; you can’t run this command now” / “This result object does not return rows. It has been closed automatically”

The MySQL drivers have a fairly wide class of failure modes whereby the state of the connection to the server is in an invalid state. Typically, when the connection is used again, one of these two error messages will occur. The reason is that the state of the server has been changed to one which the client library does not expect, such that when the client library emits a new statement on the connection, the server does not respond as expected.

In SQLAlchemy, because database connections are pooled, the issue of the messaging being out of sync on a connection becomes more important: when an operation fails and the connection itself is left in an unusable state, if it goes back into the connection pool, it will malfunction when checked out again. The mitigation for this issue is that the connection is invalidated when such a failure mode occurs so that the underlying database connection to MySQL is discarded. This invalidation occurs automatically for many known failure modes and can also be called explicitly via the Connection.invalidate() method.
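
As a rough sketch of explicit invalidation (the table name is illustrative), an application might discard a connection it suspects is broken:

    from sqlalchemy import create_engine, exc, text

    engine = create_engine("mysql://scott:tiger@localhost/test")

    conn = engine.connect()
    try:
        conn.execute(text("SELECT * FROM some_table"))
    except exc.DBAPIError:
        # discard the underlying DBAPI connection; the pool opens a
        # fresh connection on the next checkout
        conn.invalidate()
        raise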

There is also a second class of failure modes within this category where a context manager such as with session.begin_nested(): wants to “roll back” the transaction when an error occurs; however within some failure modes of the connection, the rollback itself (which can also be a RELEASE SAVEPOINT operation) also fails, causing misleading stack traces.

Originally, the cause of this error used to be fairly simple: it meant that a multithreaded program was invoking commands on a single connection from more than one thread. This applied to the original “MySQLdb” native-C driver that was pretty much the only driver in use. However, with the introduction of pure Python drivers like PyMySQL and MySQL-connector-Python, as well as increased use of tools such as gevent/eventlet, multiprocessing (often with Celery), and others, there is a whole series of factors that has been known to cause this problem, some of which have been improved across SQLAlchemy versions but others which are unavoidable:

  • Sharing a connection among threads - This is the original reason these kinds of errors occurred. A program used the same connection in two or more threads at the same time, meaning multiple sets of messages got mixed up on the connection, putting the server-side session into a state that the client no longer knows how to interpret. However, other causes are usually more likely today.

  • Sharing the filehandle for the connection among processes - This usually occurs when a program uses os.fork() to spawn a new process, and a TCP connection that is present in the parent process gets shared into one or more child processes. As multiple processes are now emitting messages to essentially the same filehandle, the server receives interleaved messages and breaks the state of the connection.

This scenario can occur very easily if a program uses Python’s “multiprocessing” module and makes use of an Engine that was created in the parent process. It’s common that “multiprocessing” is in use when using tools like Celery. The correct approach should be either that a new Engine is produced when a child process first starts, discarding any Engine that came down from the parent process; or, the Engine that’s inherited from the parent process can have its internal pool of connections disposed by calling Engine.dispose().

  • Greenlet Monkeypatching w/ Exits - When using a library like gevent or eventlet that monkeypatches the Python networking API, libraries like PyMySQL are now working in an asynchronous mode of operation, even though they are not developed explicitly against this model. A common issue is that a greenthread is interrupted, often due to timeout logic in the application. This results in the GreenletExit exception being raised, and the pure-Python MySQL driver is interrupted from its work, which may have been that it was receiving a response from the server or preparing to otherwise reset the state of the connection. When the exception cuts all that work short, the conversation between client and server is now out of sync and subsequent usage of the connection may fail. SQLAlchemy as of version 1.1.0 knows how to guard against this: if a database operation is interrupted by a so-called “exit exception”, which includes GreenletExit and any other subclass of Python BaseException that is not also a subclass of Exception, the connection is invalidated.

  • Rollbacks / SAVEPOINT releases failing - Some classes of error cause the connection to be unusable within the context of a transaction, as well as when operating in a “SAVEPOINT” block. In these cases, the failure on the connection has rendered any SAVEPOINT as no longer existing, yet when SQLAlchemy, or the application, attempts to “roll back” this savepoint, the “RELEASE SAVEPOINT” operation fails, typically with a message like “savepoint does not exist”. In this case, under Python 3 there will be a chain of exceptions output, where the ultimate “cause” of the error will be displayed as well. Under Python 2, there are no “chained” exceptions, however recent versions of SQLAlchemy will attempt to emit a warning illustrating the original failure cause, while still throwing the immediate error which is the failure of the ROLLBACK.

Why does SQLAlchemy issue so many ROLLBACKs?

SQLAlchemy currently assumes DBAPI connections are in “non-autocommit” mode - this is the default behavior of the Python database API, meaning it must be assumed that a transaction is always in progress. The connection pool issues connection.rollback() when a connection is returned. This is so that any transactional resources remaining on the connection are released. On a database like PostgreSQL or MSSQL where table resources are aggressively locked, this is critical so that rows and tables don’t remain locked within connections that are no longer in use. An application can otherwise hang. It’s not just for locks, however, and is equally critical on any database that has any kind of transaction isolation, including MySQL with InnoDB. Any connection that is still inside an old transaction will return stale data, if that data was already queried on that connection within isolation. For background on why you might see stale data even on MySQL, see http://dev.mysql.com/doc/refman/5.1/en/innodb-transaction-model.html

I’m on MyISAM - how do I turn it off?

The connection pool’s connection-return behavior can be configured using reset_on_return, available from create_engine() as the pool_reset_on_return argument:

    from sqlalchemy import create_engine

    engine = create_engine(
        'mysql://scott:tiger@localhost/myisam_database',
        pool_reset_on_return=False
    )

I’m on SQL Server - how do I turn those ROLLBACKs into COMMITs?

reset_on_return accepts the values commit, rollback in addition to True, False, and None. Setting to commit will cause a COMMIT as any connection is returned to the pool:

    engine = create_engine('mssql://scott:tiger@mydsn', pool_reset_on_return='commit')

I am using multiple connections with a SQLite database (typically to test transaction operation), and my test program is not working!

If using a SQLite :memory: database, or a version of SQLAlchemy prior to version 0.7, the default connection pool is the SingletonThreadPool, which maintains exactly one SQLite connection per thread. So two connections in use in the same thread will actually be the same SQLite connection. Make sure you’re not using a :memory: database and use NullPool, which is the default for non-memory databases in current SQLAlchemy versions.
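
A minimal sketch, assuming a file-based database and an explicit NullPool selection (the filename is illustrative):

    from sqlalchemy import create_engine
    from sqlalchemy.pool import NullPool

    # a file-based database with no connection pooling; each checkout
    # opens a distinct SQLite connection
    engine = create_engine("sqlite:///some_test.db", poolclass=NullPool)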

See also

Threading/Pooling Behavior - info on PySQLite’s behavior.

How do I get at the raw DBAPI connection when using an Engine?

With a regular SA engine-level Connection, you can get at a pool-proxied version of the DBAPI connection via the Connection.connection attribute on Connection, and for the really-real DBAPI connection you can call the ConnectionFairy.connection attribute on that - but there should never be any need to access the non-pool-proxied DBAPI connection, as all methods are proxied through:

    engine = create_engine(...)
    conn = engine.connect()
    conn.connection.<do DBAPI things>
    cursor = conn.connection.cursor(<DBAPI specific arguments..>)

You must ensure that you revert any isolation level settings or other operation-specific settings on the connection back to normal before returning it to the pool.

As an alternative to reverting settings, you can call the Connection.detach() method on either Connection or the proxied connection, which will de-associate the connection from the pool such that it will be closed and discarded when Connection.close() is called:

    conn = engine.connect()
    conn.detach()  # detaches the DBAPI connection from the connection pool
    conn.connection.<go nuts>
    conn.close()  # connection is closed for real, the pool replaces it with a new connection

How do I use engines / connections / sessions with Python multiprocessing, or os.fork()?

The key goal with multiple python processes is to prevent any database connections from being shared across processes. Depending on specifics of the driver and OS, the issues that arise here range from non-working connections to socket connections that are used by multiple processes concurrently, leading to broken messaging (the latter case is typically the most common).

The SQLAlchemy Engine object refers to a connection pool of existing database connections. So when this object is replicated to a child process, the goal is to ensure that no database connections are carried over. There are three general approaches to this:

  • Disable pooling using NullPool. This is the most simplistic, one-shot system that prevents the Engine from using any connection more than once.

  • Call Engine.dispose() on any given Engine as soon as one is within the new process. In Python multiprocessing, constructs such as multiprocessing.Pool include “initializer” hooks which are a place where this can be performed; otherwise at the top of where os.fork() or where the Process object begins the child fork, a single call to Engine.dispose() will ensure any remaining connections are flushed (a sketch appears at the end of this section).

  • An event handler can be applied to the connection pool that tests for connections being shared across process boundaries, and invalidates them. This looks like the following:

    import os
    import warnings

    from sqlalchemy import event
    from sqlalchemy import exc


    def add_engine_pidguard(engine):
        """Add multiprocessing guards.

        Forces a connection to be reconnected if it is detected
        as having been shared to a sub-process.

        """

        @event.listens_for(engine, "connect")
        def connect(dbapi_connection, connection_record):
            connection_record.info['pid'] = os.getpid()

        @event.listens_for(engine, "checkout")
        def checkout(dbapi_connection, connection_record, connection_proxy):
            pid = os.getpid()
            if connection_record.info['pid'] != pid:
                # substitute log.debug() or similar here as desired
                warnings.warn(
                    "Parent process %(orig)s forked (%(newproc)s) with an open "
                    "database connection, "
                    "which is being discarded and recreated." %
                    {"newproc": pid, "orig": connection_record.info['pid']})
                connection_record.connection = connection_proxy.connection = None
                raise exc.DisconnectionError(
                    "Connection record belongs to pid %s, "
                    "attempting to check out in pid %s" %
                    (connection_record.info['pid'], pid)
                )

These events are applied to an Engine as soon as it is created:

    engine = create_engine("...")

    add_engine_pidguard(engine)

The above strategies will accommodate the case of an Engine being shared among processes. However, for the case of a transaction-active Session or Connection being shared, there’s no automatic fix for this; an application needs to ensure a new child process only initiates new Connection objects and transactions, as well as new ORM Session objects. For a Session object, technically this is only needed if the session is currently transaction-bound; however, the scope of a single Session is in any case intended to be kept within a single call stack (e.g. not a global object, not shared between processes or threads).
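
For the Engine.dispose() approach listed above, a minimal sketch using multiprocessing’s “initializer” hook might look like the following (the worker-setup function name and pool size are illustrative):

    import multiprocessing

    from sqlalchemy import create_engine

    engine = create_engine("mysql://scott:tiger@localhost/test")

    def init_worker():
        # discard connections inherited from the parent process as soon as
        # the child process starts; new connections are then opened lazily
        engine.dispose()

    pool = multiprocessing.Pool(processes=4, initializer=init_worker)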