What’s New in SQLAlchemy 0.6?

About this Document

This document describes changes between SQLAlchemy version 0.5, last released January 16, 2010, and SQLAlchemy version 0.6, last released May 5, 2012.

Document date: June 6, 2010

This guide documents API changes which affect users migrating their applications from the 0.5 series of SQLAlchemy to 0.6. Note that SQLAlchemy 0.6 removes some behaviors which were deprecated throughout the span of the 0.5 series, and also deprecates more behaviors specific to 0.5.

Platform Support

  • cPython versions 2.4 and upwards throughout the 2.xx series

  • Jython 2.5.1 - using the zxJDBC DBAPI included with Jython.

  • cPython 3.x - see [source:sqlalchemy/trunk/README.py3k] for information on how to build for Python 3.

New Dialect System

Dialect modules are now broken up into distinct subcomponents, within the scope of a single database backend. Dialect implementations are now in the sqlalchemy.dialects package. The sqlalchemy.databases package still exists as a placeholder to provide some level of backwards compatibility for simple imports.

For each supported database, a sub-package exists within sqlalchemy.dialects where several files are contained. Each package contains a module called base.py which defines the specific SQL dialect used by that database. It also contains one or more “driver” modules, each one corresponding to a specific DBAPI - these files are named corresponding to the DBAPI itself, such as pysqlite, cx_oracle, or pyodbc. The classes used by SQLAlchemy dialects are first declared in the base.py module, defining all behavioral characteristics defined by the database. These include capability mappings, such as “supports sequences”, “supports returning”, etc., type definitions, and SQL compilation rules. Each “driver” module in turn provides subclasses of those classes as needed which override the default behavior to accommodate the additional features, behaviors, and quirks of that DBAPI. For DBAPIs that support multiple backends (pyodbc, zxJDBC, mxODBC), the dialect module will use mixins from the sqlalchemy.connectors package, which provide functionality common to that DBAPI across all backends, most typically dealing with connect arguments. This means that connecting using pyodbc, zxJDBC or mxODBC (when implemented) is extremely consistent across supported backends.

The URL format used by create_engine() has been enhanced to handle any number of DBAPIs for a particular backend, using a scheme that is inspired by that of JDBC. The previous format still works, and will select a “default” DBAPI implementation, such as the PostgreSQL URL below that will use psycopg2:

    create_engine('postgresql://scott:tiger@localhost/test')

However to specify a specific DBAPI backend such as pg8000, add it to the “protocol” section of the URL using a plus sign “+”:

    create_engine('postgresql+pg8000://scott:tiger@localhost/test')

Other notes regarding dialects:

  • the type system has been changed dramatically in SQLAlchemy 0.6. This has an impact on all dialects regarding naming conventions, behaviors, and implementations. See the section on “Types” below.

  • the ResultProxy object now offers a 2x speed improvement in some cases thanks to some refactorings.

  • the RowProxy, i.e. individual result row object, is now directly pickleable.

  • the setuptools entrypoint used to locate external dialects is now called sqlalchemy.dialects. An external dialect written against 0.4 or 0.5 will need to be modified to work with 0.6 in any case so this change does not add any additional difficulties (a setup.py sketch follows this list).

  • dialects now receive an initialize() event on initial connection to determine connection properties.

  • Functions and operators generated by the compiler now use (almost) regular dispatch functions of the form “visit_<opname>” and “visit_<funcname>_fn” to provide customized processing. This replaces the need to copy the “functions” and “operators” dictionaries in compiler subclasses with straightforward visitor methods, and also allows compiler subclasses complete control over rendering, as the full _Function or _BinaryExpression object is passed in.
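
An external dialect advertises itself through the new entrypoint in its setup.py. As a minimal sketch (the package, module, and class names here are hypothetical):

    from setuptools import setup

    setup(
        name='sqlalchemy-mydb',
        packages=['sqlalchemy_mydb'],
        entry_points={
            # the entrypoint group is now "sqlalchemy.dialects"
            'sqlalchemy.dialects': [
                'mydb = sqlalchemy_mydb.base:MyDBDialect',
            ]
        },
    )

Similarly, a minimal sketch of the new visitor-style dispatch, assuming a hypothetical compiler subclass; the method name follows the “visit_<funcname>_fn” convention described above:

    from sqlalchemy.sql.compiler import SQLCompiler

    class MyCompiler(SQLCompiler):
        # receives the full _Function construct rather than a string
        # fragment, so rendering is completely under our control
        def visit_now_fn(self, fn, **kw):
            return "CURRENT_TIMESTAMP"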

Dialect Imports

The import structure of dialects has changed. Each dialect now exports its base “dialect” class as well as the full set of SQL types supported on that dialect via sqlalchemy.dialects.<name>. For example, to import a set of PG types:

    from sqlalchemy.dialects.postgresql import INTEGER, BIGINT, SMALLINT, \
        VARCHAR, MACADDR, DATE, BYTEA

Above, INTEGER is actually the plain INTEGER type from sqlalchemy.types, but the PG dialect makes it available in the same way as those types which are specific to PG, such as BYTEA and MACADDR.

Expression Language Changes

An Important Expression Language Gotcha

There’s one quite significant behavioral change to the expression language which may affect some applications. The boolean value of Python boolean expressions, i.e. ==, !=, and similar, now evaluates accurately with regards to the two clause objects being compared.

As we know, comparing a ClauseElement to any other object returns another ClauseElement:

    >>> from sqlalchemy.sql import column
    >>> column('foo') == 5
    <sqlalchemy.sql.expression._BinaryExpression object at 0x1252490>

This is so that Python expressions produce SQL expressions when converted to strings:

    >>> str(column('foo') == 5)
    'foo = :foo_1'

But what happens if we say this?

    >>> if column('foo') == 5:
    ...     print("yes")
    ...

In previous versions of SQLAlchemy, the returned _BinaryExpression was a plain Python object which evaluated to True. Now it evaluates to whether or not the actual ClauseElement should have the same hash value as the one it’s being compared to. Meaning:

    >>> bool(column('foo') == 5)
    False
    >>> bool(column('foo') == column('foo'))
    False
    >>> c = column('foo')
    >>> bool(c == c)
    True

That means code such as the following:

    if expression:
        print("the expression is:", expression)

Would not evaluate if expression was a binary clause. Since the above pattern should never be used, the base ClauseElement now raises an exception if called in a boolean context:

    >>> bool(c)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      ...
      raise TypeError("Boolean value of this clause is not defined")
    TypeError: Boolean value of this clause is not defined

Code that wants to check for the presence of a ClauseElement expression should instead say:

    if expression is not None:
        print("the expression is:", expression)

Keep in mind, this applies to Table and Column objects too.

The rationale for the change is twofold:

  • Comparisons of the form if c1 == c2: <do something> can actually be written now

  • Support for correct hashing of ClauseElement objects now works on alternate platforms, namely Jython. Up until this point SQLAlchemy relied heavily on the specific behavior of cPython in this regard (and still had occasional problems with it).

Stricter “executemany” Behavior

An “executemany” in SQLAlchemy corresponds to a call to execute(), passing along a collection of bind parameter sets:

    connection.execute(table.insert(), {'data':'row1'}, {'data':'row2'}, {'data':'row3'})

When the Connection object sends off the given insert() construct for compilation, it passes to the compiler the keynames present in the first set of binds passed along to determine the construction of the statement’s VALUES clause. Users familiar with this construct will know that additional keys present in the remaining dictionaries don’t have any impact. What’s different now is that all subsequent dictionaries need to include at least every key that is present in the first dictionary. This means that a call like this no longer works:

    connection.execute(table.insert(),
        {'timestamp':today, 'data':'row1'},
        {'timestamp':today, 'data':'row2'},
        {'data':'row3'})

Because the third row does not specify the ‘timestamp’ column. Previous versions of SQLAlchemy would simply insert NULL for these missing columns. However, if the timestamp column in the above example contained a Python-side default value or function, it would not be used. This is because the “executemany” operation is optimized for maximum performance across huge numbers of parameter sets, and does not attempt to evaluate Python-side defaults for those missing keys. Because defaults are often implemented either as SQL expressions which are embedded inline with the INSERT statement, or are server-side expressions which again are triggered based on the structure of the INSERT string, which by definition cannot fire off conditionally based on each parameter set, it would be inconsistent for Python-side defaults to behave differently vs. SQL/server-side defaults. (SQL expression based defaults are embedded inline as of the 0.5 series, again to minimize the impact of huge numbers of parameter sets.)

SQLAlchemy 0.6 therefore establishes predictable consistency by forbidding any subsequent parameter sets from leaving any fields blank. That way, there’s no more silent failure of Python-side default values and functions, which additionally are allowed to remain consistent in their behavior versus SQL and server-side defaults.
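
As a sketch of the corrected call under the new rule, every parameter set supplies the full key set established by the first dictionary, passing None explicitly where a NULL is actually intended:

    connection.execute(table.insert(),
        {'timestamp':today, 'data':'row1'},
        {'timestamp':today, 'data':'row2'},
        # the key must be present; an explicit None inserts NULL
        {'timestamp':None, 'data':'row3'})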

UNION and other “compound” constructs parenthesize consistently

A rule that was designed to help SQLite has been removed, that of the first compound element within another compound (such as a union() inside of an except_()) wouldn’t be parenthesized. This is inconsistent and produces the wrong results on PostgreSQL, which has precedence rules regarding INTERSECTION, and it’s generally a surprise. When using complex composites with SQLite, you now need to turn the first element into a subquery (which is also compatible on PG). A new example is in the SQL expression tutorial at the end of [http://www.sqlalchemy.org/docs/06/sqlexpression.html#unions-and-other-set-operations]. See #1665 and r6690 for more background.
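
A minimal sketch of the SQLite workaround, assuming s1, s2, s3 are select() constructs:

    from sqlalchemy import union, except_

    u = union(s1, s2)

    # on backends with full compound support, nesting now parenthesizes:
    stmt = except_(u, s3)

    # for SQLite, turn the first compound into a subquery instead
    # (this form is also compatible with PG):
    stmt_sqlite = except_(u.alias().select(), s3)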

C Extensions for Result Fetching

The ResultProxy and related elements, including most common “row processing” functions such as unicode conversion, numerical/boolean conversions and date parsing, have been re-implemented as optional C extensions for the purposes of performance. This represents the beginning of SQLAlchemy’s path to the “dark side” where we hope to continue improving performance by reimplementing critical sections in C. The extensions can be built by specifying --with-cextensions, i.e. python setup.py --with-cextensions install.

The extensions have the most dramatic impact on result fetching using direct ResultProxy access, i.e. that which is returned by engine.execute(), connection.execute(), or session.execute(). Within results returned by an ORM Query object, result fetching is not as high a percentage of overhead, so ORM performance improves more modestly, and mostly in the realm of fetching large result sets. The performance improvements highly depend on the dbapi in use and on the syntax used to access the columns of each row (eg row['name'] is much faster than row.name). The current extensions have no impact on the speed of inserts/updates/deletes, nor do they improve the latency of SQL execution, that is, an application that spends most of its time executing many statements with very small result sets will not see much improvement.

Performance has been improved in 0.6 versus 0.5 regardless of the extensions. A quick overview of what connecting and fetching 50,000 rows looks like with SQLite, using mostly direct SQLite access, a ResultProxy, and a simple mapped ORM object:

    sqlite select/native: 0.260s

    0.6 / C extension

    sqlalchemy.sql select: 0.360s
    sqlalchemy.orm fetch: 2.500s

    0.6 / Pure Python

    sqlalchemy.sql select: 0.600s
    sqlalchemy.orm fetch: 3.000s

    0.5 / Pure Python

    sqlalchemy.sql select: 0.790s
    sqlalchemy.orm fetch: 4.030s

Above, the ORM fetches the rows 33% faster than 0.5 due to in-python performance enhancements. With the C extensions we get another 20%. However, ResultProxy fetches improve by 67% with the C extension versus not. Other tests report as much as a 200% speed improvement for some scenarios, such as those where lots of string conversions are occurring.

New Schema Capabilities

The sqlalchemy.schema package has received some long-needed attention. The most visible change is the newly expanded DDL system. In SQLAlchemy, it was possible since version 0.5 to create custom DDL strings and associate them with tables or metadata objects:

    from sqlalchemy.schema import DDL

    DDL('CREATE TRIGGER users_trigger ...').execute_at('after-create', metadata)

Now the full suite of DDL constructs are available under the same system, including those for CREATE TABLE, ADD CONSTRAINT, etc.:

    from sqlalchemy.schema import CheckConstraint, AddConstraint

    AddConstraint(CheckConstraint("value > 5")).execute_at('after-create', mytable)

Additionally, all the DDL objects are now regular ClauseElement objects just like any other SQLAlchemy expression object:

    from sqlalchemy.schema import CreateTable

    create = CreateTable(mytable)

    # dumps the CREATE TABLE as a string
    print(create)

    # executes the CREATE TABLE statement
    engine.execute(create)

and using the sqlalchemy.ext.compiler extension you can make your own:

    from sqlalchemy.schema import DDLElement
    from sqlalchemy.ext.compiler import compiles

    class AlterColumn(DDLElement):

        def __init__(self, column, cmd):
            self.column = column
            self.cmd = cmd

    @compiles(AlterColumn)
    def visit_alter_column(element, compiler, **kw):
        return "ALTER TABLE %s ALTER COLUMN %s %s ..." % (
            element.column.table.name,
            element.column.name,
            element.cmd
        )

    engine.execute(AlterColumn(table.c.mycolumn, "SET DEFAULT 'test'"))

Deprecated/Removed Schema Elements

The schema package has also been greatly streamlined. Many options and methods which were deprecated throughout 0.5 have been removed. Other little-known accessors and methods have also been removed.

  • the “owner” keyword argument is removed from Table. Use “schema” to represent any namespaces to be prepended to the table name.

  • deprecated MetaData.connect() and ThreadLocalMetaData.connect() have been removed - send the “bind” attribute to bind a metadata.

  • deprecated metadata.table_iterator() method removed (use sorted_tables)

  • the “metadata” argument is removed from DefaultGenerator and subclasses, but remains locally present on Sequence, which is a standalone construct in DDL.

  • deprecated PassiveDefault - use DefaultClause.

  • Removed public mutability from Index and Constraint objects:

    • ForeignKeyConstraint.append_element()

    • Index.append_column()

    • UniqueConstraint.append_column()

    • PrimaryKeyConstraint.add()

    • PrimaryKeyConstraint.remove()

These should be constructed declaratively (i.e. in one construction); a sketch follows at the end of this list.

  • Other removed things:

    • Table.key (no idea what this was for)

    • Column.bind (get via column.table.bind)

    • Column.metadata (get via column.table.metadata)

    • Column.sequence (use column.default)
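
A minimal sketch of the declarative style of construction, using hypothetical table and column names; constraints and indexes receive all of their columns up front rather than via append methods:

    from sqlalchemy import (MetaData, Table, Column, Integer,
                            String, UniqueConstraint, Index)

    metadata = MetaData()
    mytable = Table('mytable', metadata,
        Column('id', Integer, primary_key=True),
        Column('name', String(50)),
        Column('email', String(100)),
        # column names or Column objects are both accepted
        UniqueConstraint('name', 'email', name='uq_name_email')
    )
    Index('ix_mytable_name', mytable.c.name)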

Other Behavioral Changes

  • UniqueConstraint, Index, PrimaryKeyConstraint all accept lists of column names or column objects as arguments.

  • The use_alter flag on ForeignKey is now a shortcut option for operations that can be hand-constructed using the DDL() event system. A side effect of this refactor is that ForeignKeyConstraint objects with use_alter=True will *not* be emitted on SQLite, which does not support ALTER for foreign keys. This has no effect on SQLite’s behavior since SQLite does not actually honor FOREIGN KEY constraints.

  • Table.primary_key is not assignable - use table.append_constraint(PrimaryKeyConstraint(…))

  • A Column definition with a ForeignKey and no type, e.g. Column(name, ForeignKey(sometable.c.somecol)) used to get the type of the referenced column. Now support for that automatic type inference is partial and may not work in all cases.

Logging opened up

At the expense of a few extra method calls here and there, you can set log levels for INFO and DEBUG after an engine, pool, or mapper has been created, and logging will commence. The isEnabledFor(INFO) method is now called per-Connection and isEnabledFor(DEBUG) per-ResultProxy if already enabled on the parent connection. Pool logging sends to log.info() and log.debug() with no check - note that pool checkout/checkin is typically once per transaction.
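
For example, to turn on engine logging after the fact, set the level on the standard “sqlalchemy.engine” logger (a sketch using the stdlib logging module):

    import logging

    logging.basicConfig()
    # takes effect for engines already created; Connection checks
    # isEnabledFor(INFO) on each new connection
    logging.getLogger('sqlalchemy.engine').setLevel(logging.INFO)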

Reflection/Inspector API

The reflection system, which allows reflection of table columns via Table('sometable', metadata, autoload=True) has been opened up into its own fine-grained API, which allows direct inspection of database elements such as tables, columns, constraints, indexes, and more. This API expresses return values as simple lists of strings, dictionaries, and TypeEngine objects. The internals of autoload=True now build upon this system such that the translation of raw database information into sqlalchemy.schema constructs is centralized and the contract of individual dialects greatly simplified, vastly reducing bugs and inconsistencies across different backends.

To use an inspector:

    from sqlalchemy.engine.reflection import Inspector
    insp = Inspector.from_engine(my_engine)

    print(insp.get_schema_names())

the from_engine() method will in some cases provide a backend-specific inspector with additional capabilities, such as that of PostgreSQL which provides a get_table_oid() method:

    my_engine = create_engine('postgresql://...')
    pg_insp = Inspector.from_engine(my_engine)

    print(pg_insp.get_table_oid('my_table'))

RETURNING Support

The insert(), update() and delete() constructs now support a returning() method, which corresponds to the SQL RETURNING clause as supported by PostgreSQL, Oracle, MS-SQL, and Firebird. It is not supported for any other backend at this time.

Given a list of column expressions in the same manner as that of a select() construct, the values of these columns will be returned as a regular result set:

    result = connection.execute(
        table.insert().values(data='some data').returning(table.c.id, table.c.timestamp)
    )
    row = result.first()
    print("ID:", row['id'], "Timestamp:", row['timestamp'])

The implementation of RETURNING across the four supported backends varies wildly, in the case of Oracle requiring an intricate usage of OUT parameters which are re-routed into a “mock” result set, and in the case of MS-SQL using an awkward SQL syntax. The usage of RETURNING is subject to limitations:

  • it does not work for any “executemany()” style of execution. This is a limitation of all supported DBAPIs.

  • Some backends, such as Oracle, only support RETURNING that returns a single row - this includes UPDATE and DELETE statements, meaning the update() or delete() construct must match only a single row, or an error is raised (by Oracle, not SQLAlchemy).

RETURNING is also used automatically by SQLAlchemy, when available and when not otherwise specified by an explicit returning() call, to fetch the value of newly generated primary key values for single-row INSERT statements. This means there’s no more “SELECT nextval(sequence)” pre-execution for insert statements where the primary key value is required. Truth be told, the implicit RETURNING feature does incur more method overhead than the old “select nextval()” system, which used a quick and dirty cursor.execute() to get at the sequence value, and in the case of Oracle requires additional binding of out parameters. So if method/protocol overhead is proving to be more expensive than additional database round trips, the feature can be disabled by specifying implicit_returning=False to create_engine().
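
For example, disabling the feature engine-wide is a single flag (the URL here is illustrative):

    from sqlalchemy import create_engine

    engine = create_engine('postgresql://scott:tiger@localhost/test',
                           implicit_returning=False)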

Type System Changes

New Architecture

The type system has been completely reworked behind the scenes to provide two goals:

  • Separate the handling of bind parameters and result row values, typically a DBAPI requirement, from the SQL specification of the type itself, which is a database requirement. This is consistent with the overall dialect refactor that separates database SQL behavior from DBAPI.

  • Establish a clear and consistent contract for generating DDL from a TypeEngine object and for constructing TypeEngine objects based on column reflection.

Highlights of these changes include:

  • The construction of types within dialects has been totally overhauled. Dialects now define publicly available types as UPPERCASE names exclusively, and internal implementation types using underscore identifiers (i.e. are private). The system by which types are expressed in SQL and DDL has been moved to the compiler system. This has the effect that there are far fewer type objects within most dialects. A detailed document on this architecture for dialect authors is in [source:/lib/sqlalchemy/dialects/type_migration_guidelines.txt].

  • Reflection of types now returns the exact UPPERCASE type within types.py, or the UPPERCASE type within the dialect itself if the type is not a standard SQL type. This means reflection now returns more accurate information about reflected types.

  • User defined types that subclass TypeEngine and wish to provide get_col_spec() should now subclass UserDefinedType (see the sketch after this list).

  • The result_processor() method on all type classes now accepts an additional argument coltype. This is the DBAPI type object attached to cursor.description, and should be used when applicable to make better decisions on what kind of result-processing callable should be returned. Ideally result processor functions would never need to use isinstance(), which is an expensive call at this level.
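
A minimal sketch of a user-defined type under the new contract; the type name rendered here is hypothetical:

    from sqlalchemy.types import UserDefinedType

    class MyType(UserDefinedType):

        def get_col_spec(self):
            # emitted verbatim in CREATE TABLE
            return "MYTYPE"

        def bind_processor(self, dialect):
            def process(value):
                # convert the Python value for the DBAPI here
                return value
            return process

        def result_processor(self, dialect, coltype):
            # coltype is the DBAPI type object from cursor.description
            def process(value):
                return value
            return process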

Native Unicode Mode

As more DBAPIs support returning Python unicode objects directly, the base dialect now performs a check upon the first connection which establishes whether or not the DBAPI returns a Python unicode object for a basic select of a VARCHAR value. If so, the String type and all subclasses (i.e. Text, Unicode, etc.) will skip the “unicode” check/conversion step when result rows are received. This offers a dramatic performance increase for large result sets. The “unicode mode” currently is known to work with:

  • sqlite3 / pysqlite

  • psycopg2 - SQLA 0.6 now uses the “UNICODE” type extension by default on each psycopg2 connection object

  • pg8000

  • cx_oracle (we use an output processor - nice feature!)

Other types may choose to disable unicode processing as needed, such as the NVARCHAR type when used with MS-SQL.

In particular, if porting an application based on a DBAPI that formerly returned non-unicode strings, the “native unicode” mode has a plainly different default behavior - columns that are declared as String or VARCHAR now return unicode by default whereas they would return strings before. This can break code which expects non-unicode strings. The psycopg2 “native unicode” mode can be disabled by passing use_native_unicode=False to create_engine().

A more general solution for string columns that explicitly do not want a unicode object is to use a TypeDecorator that converts unicode back to utf-8, or whatever is desired:

    import sqlalchemy as sa
    from sqlalchemy.types import TypeDecorator

    class UTF8Encoded(TypeDecorator):
        """Unicode type which coerces to utf-8."""

        impl = sa.VARCHAR

        def process_result_value(self, value, dialect):
            if isinstance(value, unicode):
                value = value.encode('utf-8')
            return value

Note that the assert_unicode flag is now deprecated. SQLAlchemy allows the DBAPI and backend database in use to handle Unicode parameters when available, and does not add operational overhead by checking the incoming type; modern systems like sqlite and PostgreSQL will raise an encoding error on their end if invalid data is passed. In those cases where SQLAlchemy does need to coerce a bind parameter from Python Unicode to an encoded string, or when the Unicode type is used explicitly, a warning is raised if the object is a bytestring. This warning can be suppressed or converted to an exception using the Python warnings filter documented at: http://docs.python.org/library/warnings.html

Generic Enum Type

We now have an Enum in the types module. This is a string type that is given a collection of “labels” which constrain the possible values given to those labels. By default, this type generates a VARCHAR using the size of the largest label, and applies a CHECK constraint to the table within the CREATE TABLE statement. When using MySQL, the type by default uses MySQL’s ENUM type, and when using PostgreSQL the type will generate a user defined type using CREATE TYPE <mytype> AS ENUM. In order to create the type using PostgreSQL, the name parameter must be specified to the constructor. The type also accepts a native_enum=False option which will issue the VARCHAR/CHECK strategy for all databases. Note that PostgreSQL ENUM types currently don’t work with pg8000 or zxjdbc.
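
A minimal sketch, using hypothetical table and label names; the “name” parameter shown is the one required for PostgreSQL’s CREATE TYPE form:

    from sqlalchemy import MetaData, Table, Column, Enum

    metadata = MetaData()
    shirts = Table('shirts', metadata,
        Column('size', Enum('small', 'medium', 'large', name='shirt_sizes'))
    )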

Reflection Returns Dialect-Specific Types

Reflection now returns the most specific type possible from the database. That is, if you create a table using String, then reflect it back, the reflected column will likely be VARCHAR. For dialects that support a more specific form of the type, that’s what you’ll get. So a Text type would come back as oracle.CLOB on Oracle, a LargeBinary might be a mysql.MEDIUMBLOB, etc. The obvious advantage here is that reflection preserves as much information as possible from what the database had to say.

Some applications that deal heavily in table metadata may wish to compare types across reflected tables and/or non-reflected tables. There’s a semi-private accessor available on TypeEngine called _type_affinity and an associated comparison helper _compare_type_affinity. This accessor returns the “generic” types class which the type corresponds to:

    >>> String(50)._compare_type_affinity(postgresql.VARCHAR(50))
    True
    >>> Integer()._compare_type_affinity(mysql.REAL)
    False

Miscellaneous API Changes

The usual “generic” types are still the general system in use, i.e. String, Float, DateTime. There are a few changes there:

  • Types no longer make any guesses as to default parameters. In particular, Numeric, Float, as well as subclasses NUMERIC, FLOAT, DECIMAL don’t generate any length or scale unless specified. This also continues to include the controversial String and VARCHAR types (although the MySQL dialect will pre-emptively raise when asked to render VARCHAR with no length). No defaults are assumed, and if they are used in a CREATE TABLE statement, an error will be raised if the underlying database does not allow non-lengthed versions of these types.

  • the Binary type has been renamed to LargeBinary, for BLOB/BYTEA/similar types. For BINARY and VARBINARY, those are present directly as types.BINARY, types.VARBINARY, as well as in the MySQL and MS-SQL dialects.

  • PickleType now uses == for comparison of values when mutable=True, unless the “comparator” argument with a comparison function is specified to the type. If you are pickling a custom object you should implement an __eq__() method so that value-based comparisons are accurate.

  • The default “precision” and “scale” arguments of Numeric and Float have been removed and now default to None. NUMERIC and FLOAT will be rendered with no numeric arguments by default unless these values are provided.

  • DATE, TIME and DATETIME types on SQLite can now take optional “storage_format” and “regexp” arguments. “storage_format” can be used to store those types using a custom string format. “regexp” allows using a custom regular expression to match string values from the database.

  • __legacy_microseconds__ on SQLite Time and DateTime types is not supported anymore. You should use the new “storage_format” argument instead.

  • DateTime types on SQLite now use by default a stricter regular expression to match strings from the database. Use the new “regexp” argument if you are using data stored in a legacy format.

ORM Changes

Upgrading an ORM application from 0.5 to 0.6 should require little to no changes, as the ORM’s behavior remains almost identical. There are some default argument and name changes, and some loading behaviors have been improved.

New Unit of Work

The internals for the unit of work, primarily topological.py and unitofwork.py, have been completely rewritten and are vastly simplified. This should have no impact on usage, as all existing behavior during flush has been maintained exactly (or at least, as far as it is exercised by our testsuite and the handful of production environments which have tested it heavily). The performance of flush() now uses 20-30% fewer method calls and should also use less memory. The intent and flow of the source code should now be reasonably easy to follow, and the architecture of the flush is fairly open-ended at this point, creating room for potential new areas of sophistication. The flush process no longer has any reliance on recursion so flush plans of arbitrary size and complexity can be flushed. Additionally, the mapper’s “save” process, which issues INSERT and UPDATE statements, now caches the “compiled” form of the two statements so that callcounts are further dramatically reduced with very large flushes.

Any changes in behavior observed with flush versus earlier versions of 0.6 or 0.5 should be reported to us ASAP - we’ll make sure no functionality is lost.

Changes to query.update() and query.delete()

  • the ‘expire’ option on query.update() has been renamed to ‘fetch’, thus matching that of query.delete()

  • query.update() and query.delete() both default to ‘evaluate’ for the synchronize strategy.

  • the ‘synchronize’ strategy for update() and delete() raises an error on failure. There is no implicit fallback onto “fetch”. Failure of evaluation is based on the structure of criteria, so success/failure is deterministic based on code structure.
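
A sketch of a bulk update with an explicit synchronize strategy, assuming a mapped User class:

    # evaluates the criteria in memory against objects in the session;
    # raises on failure rather than falling back to 'fetch'
    session.query(User).filter(User.name == 'ed').update(
        {'name': 'edward'}, synchronize_session='evaluate')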

relation() is officially named relationship()

This is to solve the long-running issue that “relation” means a “table or derived table” in relational algebra terms. The relation() name, which is less typing, will hang around for the foreseeable future so this change should be entirely painless.

Subquery eager loading

A new kind of eager loading is added called “subquery” loading. This is a load that emits a second SQL query immediately after the first which loads full collections for all the parents in the first query, joining upwards to the parent using INNER JOIN. Subquery loading is used similarly to the current joined-eager loading, using the subqueryload() and subqueryload_all() options as well as the lazy='subquery' setting on relationship(). The subquery load is usually much more efficient for loading many larger collections as it uses INNER JOIN unconditionally and also doesn’t re-load parent rows.
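
A minimal sketch, assuming a Parent class with a “children” collection:

    from sqlalchemy.orm import subqueryload

    # one SELECT for the parents, then one additional SELECT which
    # loads all of their 'children' collections at once
    session.query(Parent).options(subqueryload(Parent.children)).all()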

eagerload(), eagerload_all() is now joinedload(), joinedload_all()

To make room for the new subquery load feature, the existing eagerload()/eagerload_all() options are now superseded by joinedload() and joinedload_all(). The old names will hang around for the foreseeable future just like relation().

lazy=False|None|True|'dynamic' now accepts lazy='noload'|'joined'|'subquery'|'select'|'dynamic'

Continuing on the theme of loader strategies opened up, the standard keywords for the lazy option on relationship() are now select for lazy loading (via a SELECT issued on attribute access), joined for joined-eager loading, subquery for subquery-eager loading, noload to indicate no loading should occur, and dynamic for a “dynamic” relationship. The old True, False, None arguments are still accepted with the identical behavior as before.

innerjoin=True on relation, joinedload

Joined-eagerly loaded scalars and collections can now be instructed to use INNER JOIN instead of OUTER JOIN. On PostgreSQL this is observed to provide a 300-600% speedup on some queries. Set this flag for any many-to-one which is on a NOT NULLable foreign key, and similarly for any collection where related items are guaranteed to exist.

At mapper level:

    mapper(Child, child)
    mapper(Parent, parent, properties={
        'child': relationship(Child, lazy='joined', innerjoin=True)
    })

At query time level:

    session.query(Parent).options(joinedload(Parent.child, innerjoin=True)).all()

The innerjoin=True flag at the relationship() level will also take effect for any joinedload() option which does not override the value.

Many-to-one Enhancements

  • many-to-one relations now fire off a lazyload in fewer cases, including that in most cases they will not fetch the “old” value when a new one is replaced.

  • many-to-one relation to a joined-table subclass now uses get() for a simple load (known as the “use_get” condition), i.e. Related->Sub(Base), without the need to redefine the primaryjoin condition in terms of the base table. [ticket:1186]

  • specifying a foreign key with a declarative column, i.e. ForeignKey(MyRelatedClass.id) doesn’t break the “use_get” condition from taking place [ticket:1492]

  • relationship(), joinedload(), and joinedload_all() now feature an option called “innerjoin”. Specify True or False to control whether an eager join is constructed as an INNER or OUTER join. Default is False as always. The mapper options will override whichever setting is specified on relationship(). Should generally be set for many-to-one, not nullable foreign key relations to allow improved join performance. [ticket:1544]

  • the behavior of joined eager loading such that the main query is wrapped in a subquery when LIMIT/OFFSET are present now makes an exception for the case when all eager loads are many-to-one joins. In those cases, the eager joins are against the parent table directly along with the limit/offset without the extra overhead of a subquery, since a many-to-one join does not add rows to the result.

For example, in 0.5 this query:

    session.query(Address).options(eagerload(Address.user)).limit(10)

would produce SQL like:

    SELECT * FROM
      (SELECT * FROM addresses LIMIT 10) AS anon_1
      LEFT OUTER JOIN users AS users_1 ON users_1.id = anon_1.addresses_user_id

This is because the presence of any eager loaders suggests that some or all of them may relate to multi-row collections, which would necessitate wrapping any kind of rowcount-sensitive modifiers like LIMIT inside of a subquery.

In 0.6, that logic is more sensitive and can detect if all eager loaders represent many-to-ones, in which case the eager joins don’t affect the rowcount:

    SELECT * FROM addresses LEFT OUTER JOIN users AS users_1 ON users_1.id = addresses.user_id LIMIT 10

Mutable Primary Keys with Joined Table Inheritance

A joined table inheritance config where the child table has a PK that foreign keys to the parent PK can now be updated on a CASCADE-capable database like PostgreSQL. mapper() now has an option passive_updates=True which indicates this foreign key is updated automatically. If on a non-cascading database like SQLite or MySQL/MyISAM, set this flag to False. A future feature enhancement will try to get this flag to be auto-configuring based on dialect/table style in use.
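
A sketch with hypothetical joined-table inheritance mappings; on SQLite or MySQL/MyISAM there is no ON UPDATE CASCADE, so the ORM is told to update the dependent foreign key itself:

    from sqlalchemy.orm import mapper

    mapper(Person, person_table)
    mapper(Engineer, engineer_table, inherits=Person,
           passive_updates=False)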

Beaker Caching

A promising new example of Beaker integration is in examples/beaker_caching. This is a straightforward recipe which applies a Beaker cache within the result-generation engine of Query. Cache parameters are provided via query.options(), and allows full control over the contents of the cache. SQLAlchemy 0.6 includes improvements to the Session.merge() method to support this and similar recipes, as well as to provide significantly improved performance in most scenarios.

Other Changes

  • the “row tuple” object returned by Query when multiple column/entities are selected is now picklable as well as higher performing.

  • query.join() has been reworked to provide more consistent behavior and more flexibility (includes [ticket:1537])

  • query.select_from() accepts multiple clauses to produce multiple comma separated entries within the FROM clause. Useful when selecting from multiple-homed join() clauses.

  • the “dont_load=True” flag on Session.merge() is deprecated and is now “load=False”.

  • added “make_transient()” helper function which transforms a persistent/detached instance into a transient one (i.e. deletes the instance_key and removes from any session). [ticket:1052]

  • the allow_null_pks flag on mapper() is deprecated and has been renamed to allow_partial_pks. It is turned “on” by default. This means that a row which has a non-null value for any of its primary key columns will be considered an identity. The need for this scenario typically only occurs when mapping to an outer join. When set to False, a PK that has NULLs in it will not be considered a primary key - in particular this means a result row will come back as None (or not be filled into a collection), and new in 0.6 also indicates that session.merge() won’t issue a round trip to the database for such a PK value. [ticket:1680]

  • the mechanics of “backref” have been fully merged into the finer grained “back_populates” system, and take place entirely within the _generate_backref() method of RelationProperty. This makes the initialization procedure of RelationProperty simpler and allows easier propagation of settings (such as from subclasses of RelationProperty) into the reverse reference. The internal BackRef() is gone and backref() returns a plain tuple that is understood by RelationProperty.

  • the keys attribute of ResultProxy is now a method, so references to it (result.keys) must be changed to method invocations (result.keys())

  • ResultProxy.last_inserted_ids is now deprecated, use ResultProxy.inserted_primary_key instead.

Deprecated/Removed ORM Elements

Most elements that were deprecated throughout 0.5 and raised deprecation warnings have been removed (with a few exceptions). All elements that were marked “pending deprecation” are now deprecated and will raise a warning upon use.

  • ‘transactional’ flag on sessionmaker() and others is removed. Use ‘autocommit=True’ to indicate ‘transactional=False’.

  • ‘polymorphic_fetch’ argument on mapper() is removed. Loading can be controlled using the ‘with_polymorphic’ option.

  • ‘select_table’ argument on mapper() is removed. Use ‘with_polymorphic=(“*”, <some_selectable>)’ for this functionality.

  • ‘proxy’ argument on synonym() is removed. This flag did nothing throughout 0.5, as the “proxy generation” behavior is now automatic.

  • Passing a single list of elements to joinedload(), joinedload_all(), contains_eager(), lazyload(), defer(), and undefer() instead of multiple positional *args is deprecated.

  • Passing a single list of elements to query.order_by(), query.group_by(), query.join(), or query.outerjoin() instead of multiple positional *args is deprecated.

  • query.iterate_instances() is removed. Use query.instances().

  • Query.query_from_parent() is removed. Use the sqlalchemy.orm.with_parent() function to produce a “parent” clause, or alternatively query.with_parent().

  • query._from_self() is removed, use query.from_self() instead.

  • the “comparator” argument to composite() is removed. Use “comparator_factory”.

  • RelationProperty._get_join() is removed.

  • the ‘echo_uow’ flag on Session is removed. Use logging on the “sqlalchemy.orm.unitofwork” name.

  • session.clear() is removed. Use session.expunge_all().

  • session.save(), session.update(), session.save_or_update() are removed. Use session.add() and session.add_all().

  • the “objects” flag on session.flush() remains deprecated.

  • the “dont_load=True” flag on session.merge() is deprecated in favor of “load=False”.

  • ScopedSession.mapper remains deprecated. See the usage recipe at http://www.sqlalchemy.org/trac/wiki/UsageRecipes/SessionAwareMapper

  • passing an InstanceState (internal SQLAlchemy state object) to attributes.init_collection() or attributes.get_history() is deprecated. These functions are public API and normally expect a regular mapped object instance.

  • the ‘engine’ parameter to declarative_base() is removed. Use the ‘bind’ keyword argument.

Extensions

SQLSoup

SQLSoup has been modernized and updated to reflect common 0.5/0.6 capabilities, including well-defined session integration. Please read the new docs at [http://www.sqlalchemy.org/docs/06/reference/ext/sqlsoup.html].

Declarative

The DeclarativeMeta (default metaclass for declarative_base) previously allowed subclasses to modify dict_ to add class attributes (e.g. columns). This no longer works; the DeclarativeMeta constructor now ignores dict_. Instead, the class attributes should be assigned directly, e.g. cls.id = Column(…), or the MixIn class approach should be used instead of the metaclass approach.
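
A minimal sketch of the direct-assignment approach, using hypothetical names:

    from sqlalchemy import Column, Integer, String
    from sqlalchemy.ext.declarative import declarative_base

    Base = declarative_base()

    class MyClass(Base):
        __tablename__ = 'mytable'
        id = Column(Integer, primary_key=True)

    # assign directly onto the class; declarative instruments
    # the new Column as usual
    MyClass.data = Column(String(50))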