19. Multitenancy

19.1. What is multitenancy?

The term multitenancy, in general, is applied to software development to indicate an architecture in which a single running instance of an application simultaneously serves multiple clients (tenants). This architecture is very common in SaaS solutions. Isolating the information (data, customizations, etc.) that pertains to each tenant is a particular challenge in these systems. This includes the data owned by each tenant and stored in the database. It is this last piece, sometimes called multitenant data, that we will focus on.

19.2. Multitenant data approaches

There are three main approaches to isolating information in these multitenant systems, each of which goes hand in hand with a different database schema definition and JDBC setup.

Each multitenancy strategy has pros and cons as well as specific techniques and considerations. Such topics are beyond the scope of this documentation.

19.2.1. Separate database

Each tenant’s data is kept in a physically separate database instance. JDBC Connections would point specifically to each database so any pooling would be per-tenant. A general application approach, here, would be to define a JDBC Connection pool per-tenant and to select the pool to use based on the tenant identifier associated with the currently logged in user.
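
The pool-per-tenant selection described above can be sketched in plain Java. This is only an illustration under stated assumptions: the TenantPoolRegistry class and its method names are hypothetical, and the pool type is left generic (in practice it would typically be a javax.sql.DataSource).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical registry that keeps one connection pool per tenant. */
final class TenantPoolRegistry<P> {

    private final Map<String, P> poolsByTenant = new ConcurrentHashMap<>();

    /** Registers the pool that points at the given tenant's database. */
    void register(String tenantIdentifier, P pool) {
        poolsByTenant.put(tenantIdentifier, pool);
    }

    /** Selects the pool for the tenant and fails fast for unknown tenants. */
    P poolFor(String tenantIdentifier) {
        P pool = poolsByTenant.get(tenantIdentifier);
        if (pool == null) {
            throw new IllegalArgumentException("Unknown tenant: " + tenantIdentifier);
        }
        return pool;
    }
}
```

Failing fast on an unknown tenant identifier is a design choice of this sketch; it avoids silently handing out a connection that belongs to another tenant.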

19.2.2. Separate schema

Each tenant’s data is kept in a distinct database schema on a single database instance. There are two different ways to define JDBC Connections here:

  • Connections could point specifically to each schema as we saw with the Separate database approach. This is an option provided that the driver supports naming the default schema in the connection URL or if the pooling mechanism supports naming a schema to use for its Connections. Using this approach, we would have a distinct JDBC Connection pool per-tenant where the pool to use would be selected based on the “tenant identifier” associated with the currently logged in user.

  • Connections could point to the database itself (using some default schema) but the Connections would be altered using the SQL SET SCHEMA (or similar) command. Using this approach, we would have a single JDBC Connection pool to service all tenants, but before using the Connection, it would be altered to reference the schema named by the “tenant identifier” associated with the currently logged in user.
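
The second option (one shared pool, with the schema switched before each use) might look like the following sketch. The SchemaSwitcher class is hypothetical, and the exact command is database-specific, so "SET SCHEMA" is an assumption that must be adapted to the target database. Because a schema name cannot be bound as a statement parameter, the sketch validates the identifier before concatenating it into the SQL text.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.regex.Pattern;

/** Hypothetical helper that points a pooled Connection at a tenant's schema. */
final class SchemaSwitcher {

    // Only plain identifiers are accepted, because the schema name is
    // concatenated into the SQL text and cannot be bound as a parameter.
    private static final Pattern SAFE_IDENTIFIER =
            Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    /** Builds the schema-switching statement (the syntax is an assumption). */
    static String setSchemaStatement(String tenantIdentifier) {
        if (tenantIdentifier == null
                || !SAFE_IDENTIFIER.matcher(tenantIdentifier).matches()) {
            throw new IllegalArgumentException("Illegal schema name: " + tenantIdentifier);
        }
        return "SET SCHEMA " + tenantIdentifier;
    }

    /** Alters the Connection so that it references the tenant's schema. */
    static void useTenantSchema(Connection connection, String tenantIdentifier)
            throws SQLException {
        try (Statement statement = connection.createStatement()) {
            statement.execute(setSchemaStatement(tenantIdentifier));
        }
    }
}
```

With a shared pool, the Connection should also be reset to the default schema before it is returned to the pool, otherwise the next borrower may see the previous tenant's schema. The standard JDBC method Connection.setSchema is an alternative to issuing the SQL command, where the driver supports it.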

19.2.3. Partitioned (discriminator) data

All data is kept in a single database schema. The data for each tenant is partitioned by the use of a partition value or discriminator. The complexity of this discriminator might range from a simple column value to a complex SQL formula. Again, this approach would use a single Connection pool to service all tenants. However, in this approach, the application needs to alter each and every SQL statement sent to the database to reference the “tenant identifier” discriminator.
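
To illustrate what "altering every SQL statement" means in the simplest case (a plain column discriminator), here is a minimal sketch. The TenantScopedSql class and the tenant_id column name are assumptions made for this example only; the tenant identifier is bound as the first statement parameter rather than concatenated into the SQL text.

```java
/** Hypothetical builder that scopes every query to one tenant. */
final class TenantScopedSql {

    /** Assumed name of the discriminator column; not taken from the source. */
    static final String TENANT_COLUMN = "tenant_id";

    /**
     * Builds a SELECT whose first bind parameter is the tenant identifier.
     * An optional extra predicate is combined with AND.
     */
    static String select(String table, String extraPredicate) {
        StringBuilder sql = new StringBuilder("SELECT * FROM ")
                .append(table)
                .append(" WHERE ")
                .append(TENANT_COLUMN)
                .append(" = ?");
        if (extraPredicate != null && !extraPredicate.trim().isEmpty()) {
            sql.append(" AND (").append(extraPredicate).append(")");
        }
        return sql.toString();
    }
}
```

For example, TenantScopedSql.select( "person", "name = ?" ) yields a statement in which the tenant predicate always comes first, so a forgotten application-level filter cannot leak another tenant's rows.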

19.4. Multitenancy in Hibernate

Using Hibernate with multitenant data involves two parts: an API and one or more integration pieces. As usual, Hibernate strives to keep the API simple and isolated from any underlying integration complexities. The API itself consists of passing the tenant identifier when opening a session.

Example 637. Specifying tenant identifier from SessionFactory

    private void doInSession(String tenant, Consumer<Session> function) {
        Session session = null;
        Transaction txn = null;
        try {
            session = sessionFactory
                .withOptions()
                .tenantIdentifier( tenant )
                .openSession();
            txn = session.getTransaction();
            txn.begin();
            function.accept(session);
            txn.commit();
        } catch (Throwable e) {
            if ( txn != null ) txn.rollback();
            throw e;
        } finally {
            if (session != null) {
                session.close();
            }
        }
    }

Additionally, when specifying the configuration, an org.hibernate.MultiTenancyStrategy should be named using the hibernate.multiTenancy setting. Hibernate will perform validations based on the type of strategy you specify. The strategy here correlates with the isolation approach discussed above.

NONE

(the default) No multitenancy is expected. In fact, it is considered an error if a tenant identifier is specified when opening a session using this strategy.

SCHEMA

Correlates to the separate schema approach. It is an error to attempt to open a session without a tenant identifier using this strategy. Additionally, a MultiTenantConnectionProvider must be specified.

DATABASE

Correlates to the separate database approach. It is an error to attempt to open a session without a tenant identifier using this strategy. Additionally, a MultiTenantConnectionProvider must be specified.

DISCRIMINATOR

Correlates to the partitioned (discriminator) approach. It is an error to attempt to open a session without a tenant identifier using this strategy. This strategy is not yet implemented and you can follow its progress via the HHH-6054 Jira issue.
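
The tenant identifier rules stated for each strategy can be summarized in a small sketch. The enum below is a local stand-in that merely reuses the strategy names listed above (it is not org.hibernate.MultiTenancyStrategy), and the exception type and messages are assumptions; Hibernate's own validation may report these errors differently.

```java
/** Hypothetical summary of the tenant identifier rules described above. */
final class TenantIdentifierRules {

    /** Local stand-in for the strategy names; not the Hibernate enum. */
    enum Strategy { NONE, SCHEMA, DATABASE, DISCRIMINATOR }

    /** Throws if the tenant identifier does not fit the chosen strategy. */
    static void check(Strategy strategy, String tenantIdentifier) {
        boolean hasTenant = tenantIdentifier != null;
        if (strategy == Strategy.NONE && hasTenant) {
            throw new IllegalStateException(
                    "A tenant identifier was given, but multitenancy is not enabled");
        }
        if (strategy != Strategy.NONE && !hasTenant) {
            throw new IllegalStateException(
                    "Strategy " + strategy + " requires a tenant identifier");
        }
    }
}
```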

19.4.1. MultiTenantConnectionProvider

When using either the DATABASE or SCHEMA approach, Hibernate needs to be able to obtain Connections in a tenant-specific manner.

That is the role of the MultiTenantConnectionProvider contract. Application developers will need to provide an implementation of this contract.

Most of its methods are self-explanatory. The only ones which might not be are getAnyConnection and releaseAnyConnection. Note that these two methods do not accept the tenant identifier. Hibernate uses them during startup to perform various configuration tasks, mainly via the java.sql.DatabaseMetaData object.

The MultiTenantConnectionProvider to use can be specified in a number of ways:

  • Use the hibernate.multi_tenant_connection_provider setting. It could name a MultiTenantConnectionProvider instance, a MultiTenantConnectionProvider implementation class reference or a MultiTenantConnectionProvider implementation class name.

  • Passed directly to the org.hibernate.boot.registry.StandardServiceRegistryBuilder.

  • If none of the above options match, but the settings do specify a hibernate.connection.datasource value, Hibernate will assume it should use the specific DataSourceBasedMultiTenantConnectionProviderImpl implementation, which works on a number of reasonable assumptions when running inside an app server and using one javax.sql.DataSource per tenant. See its Javadocs for more details.
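
As a small illustration of the first option, the settings could be assembled as a plain map before being handed to the bootstrap code. This is a sketch: the MultiTenancySettings helper is hypothetical, the setting keys are the ones named in this chapter, and naming the strategy by the string "SCHEMA" assumes the setting accepts the strategy name as text.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper that assembles the multitenancy settings. */
final class MultiTenancySettings {

    /** Settings for the separate schema approach, naming the provider by class name. */
    static Map<String, Object> schemaPerTenant(String providerClassName) {
        Map<String, Object> settings = new HashMap<>();
        // Strategy name, matching one of the strategies listed earlier.
        settings.put("hibernate.multiTenancy", "SCHEMA");
        // The provider is given here as a fully qualified implementation class name.
        settings.put("hibernate.multi_tenant_connection_provider", providerClassName);
        return settings;
    }
}
```

A call such as MultiTenancySettings.schemaPerTenant( "com.example.TenantConnectionProvider" ) uses a made-up class name; an instance or a class reference could be stored under the same key instead.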

The following example portrays a MultiTenantConnectionProvider implementation that handles multiple ConnectionProviders.

Example 638. A MultiTenantConnectionProvider implementation

    public class ConfigurableMultiTenantConnectionProvider
            extends AbstractMultiTenantConnectionProvider {

        private final Map<String, ConnectionProvider> connectionProviderMap =
            new HashMap<>( );

        public ConfigurableMultiTenantConnectionProvider(
                Map<String, ConnectionProvider> connectionProviderMap) {
            this.connectionProviderMap.putAll( connectionProviderMap );
        }

        @Override
        protected ConnectionProvider getAnyConnectionProvider() {
            return connectionProviderMap.values().iterator().next();
        }

        @Override
        protected ConnectionProvider selectConnectionProvider(String tenantIdentifier) {
            return connectionProviderMap.get( tenantIdentifier );
        }
    }

The ConfigurableMultiTenantConnectionProvider can be set up as follows:

Example 639. A MultiTenantConnectionProvider usage example

    private void init() {
        registerConnectionProvider( FRONT_END_TENANT );
        registerConnectionProvider( BACK_END_TENANT );

        Map<String, Object> settings = new HashMap<>( );
        settings.put( AvailableSettings.MULTI_TENANT, multiTenancyStrategy() );
        settings.put( AvailableSettings.MULTI_TENANT_CONNECTION_PROVIDER,
            new ConfigurableMultiTenantConnectionProvider( connectionProviderMap ) );

        sessionFactory = sessionFactory(settings);
    }

    protected void registerConnectionProvider(String tenantIdentifier) {
        Properties properties = properties();
        properties.put( Environment.URL,
            tenantUrl(properties.getProperty( Environment.URL ), tenantIdentifier) );

        DriverManagerConnectionProviderImpl connectionProvider =
            new DriverManagerConnectionProviderImpl();
        connectionProvider.configure( properties );
        connectionProviderMap.put( tenantIdentifier, connectionProvider );
    }

When using multitenancy, it’s possible to save an entity with the same identifier across different tenants:

Example 640. An example of saving entities with the same identifier across different tenants

    doInSession( FRONT_END_TENANT, session -> {
        Person person = new Person( );
        person.setId( 1L );
        person.setName( "John Doe" );
        session.persist( person );
    } );

    doInSession( BACK_END_TENANT, session -> {
        Person person = new Person( );
        person.setId( 1L );
        person.setName( "John Doe" );
        session.persist( person );
    } );

19.4.2. CurrentTenantIdentifierResolver

org.hibernate.context.spi.CurrentTenantIdentifierResolver is a contract for Hibernate to be able to resolve what the application considers the current tenant identifier. The implementation to use can be either passed directly to Configuration via its setCurrentTenantIdentifierResolver method, or be specified via the hibernate.tenant_identifier_resolver setting.

There are two situations where CurrentTenantIdentifierResolver is used:

  • The first situation is when the application is using the org.hibernate.context.spi.CurrentSessionContext feature in conjunction with multitenancy. In the case of the current-session feature, Hibernate will need to open a session if it cannot find an existing one in scope. However, when a session is opened in a multitenant environment, the tenant identifier has to be specified. This is where the CurrentTenantIdentifierResolver comes into play; Hibernate will consult the implementation you provide to determine the tenant identifier to use when opening the session. In this case, it is required that a CurrentTenantIdentifierResolver is supplied.

  • The other situation is when you do not want to explicitly specify the tenant identifier all the time. If a CurrentTenantIdentifierResolver has been specified, Hibernate will use it to determine the default tenant identifier to use when opening the session.

Additionally, if the CurrentTenantIdentifierResolver implementation returns true for its validateExistingCurrentSessions method, Hibernate will make sure any existing sessions that are found in scope have a matching tenant identifier. This capability is only pertinent when the CurrentTenantIdentifierResolver is used in current-session settings.
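
One common way for an application to track "the current tenant" is a thread-bound holder that is set at the start of a request and cleared at the end. The sketch below shows only that holder in plain Java; the TenantContext class and its default tenant value are assumptions. A CurrentTenantIdentifierResolver implementation could then return TenantContext.get() from its resolveCurrentTenantIdentifier method.

```java
/** Hypothetical thread-bound holder for the current tenant identifier. */
final class TenantContext {

    /** Assumed fallback used when no tenant is bound to the thread. */
    static final String DEFAULT_TENANT = "default";

    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    private TenantContext() {
    }

    /** Binds the tenant identifier to the calling thread. */
    static void set(String tenantIdentifier) {
        CURRENT.set(tenantIdentifier);
    }

    /** Returns the bound tenant identifier, or the default if none is bound. */
    static String get() {
        String tenant = CURRENT.get();
        return tenant != null ? tenant : DEFAULT_TENANT;
    }

    /** Unbinds the tenant; call this when the request or unit of work ends. */
    static void clear() {
        CURRENT.remove();
    }
}
```

Clearing the holder in a finally block matters when threads are pooled, since a stale value would otherwise carry one tenant's identifier into the next request handled by the same thread.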

19.4.3. Caching

Multitenancy support in Hibernate works seamlessly with the Hibernate second level cache. The key used to cache data encodes the tenant identifier.
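
The idea of a cache key that encodes the tenant identifier can be illustrated with a small value class. This is not Hibernate's actual key implementation; the TenantAwareCacheKey class and its fields are assumptions chosen to show why two tenants that store an entity with the same identifier do not collide in the cache.

```java
import java.util.Objects;

/** Illustrative cache key; Hibernate's own key classes may differ. */
final class TenantAwareCacheKey {

    private final String tenantIdentifier;
    private final String entityName;
    private final Object id;

    TenantAwareCacheKey(String tenantIdentifier, String entityName, Object id) {
        this.tenantIdentifier = tenantIdentifier;
        this.entityName = entityName;
        this.id = id;
    }

    // Two keys are equal only if tenant, entity name and identifier all match.
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof TenantAwareCacheKey)) {
            return false;
        }
        TenantAwareCacheKey that = (TenantAwareCacheKey) other;
        return Objects.equals(tenantIdentifier, that.tenantIdentifier)
                && Objects.equals(entityName, that.entityName)
                && Objects.equals(id, that.id);
    }

    @Override
    public int hashCode() {
        return Objects.hash(tenantIdentifier, entityName, id);
    }
}
```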

Currently, schema export will not really work with multitenancy.

The JPA expert group is in the process of defining multitenancy support for an upcoming version of the specification.

19.4.4. Multitenancy Hibernate Session configuration

When using multitenancy, you might want to configure each tenant-specific Session differently. For instance, each tenant could specify a different time zone configuration.

Example 641. Registering the tenant-specific time zone information

    registerConnectionProvider( FRONT_END_TENANT, TimeZone.getTimeZone( "UTC" ) );
    registerConnectionProvider( BACK_END_TENANT, TimeZone.getTimeZone( "CST" ) );

The registerConnectionProvider method is used to define the tenant-specific context.

Example 642. The registerConnectionProvider method used for defining the tenant-specific context

    protected void registerConnectionProvider(String tenantIdentifier, TimeZone timeZone) {
        Properties properties = properties();
        properties.put(
            Environment.URL,
            tenantUrl( properties.getProperty( Environment.URL ), tenantIdentifier )
        );

        DriverManagerConnectionProviderImpl connectionProvider =
            new DriverManagerConnectionProviderImpl();
        connectionProvider.configure( properties );

        connectionProviderMap.put( tenantIdentifier, connectionProvider );
        timeZoneTenantMap.put( tenantIdentifier, timeZone );
    }

For our example, the tenant-specific context is held in the connectionProviderMap and timeZoneTenantMap.

    private Map<String, ConnectionProvider> connectionProviderMap = new HashMap<>();
    private Map<String, TimeZone> timeZoneTenantMap = new HashMap<>();

Now, when building the Hibernate Session, aside from passing the tenant identifier, we could also configure the Session to use the tenant-specific time zone.

Example 643. The Hibernate Session can be configured using the tenant-specific context

    private void doInSession(String tenant, Consumer<Session> function, boolean useTenantTimeZone) {
        Session session = null;
        Transaction txn = null;
        try {
            SessionBuilder sessionBuilder = sessionFactory
                .withOptions()
                .tenantIdentifier( tenant );
            if ( useTenantTimeZone ) {
                sessionBuilder.jdbcTimeZone( timeZoneTenantMap.get( tenant ) );
            }
            session = sessionBuilder.openSession();
            txn = session.getTransaction();
            txn.begin();
            function.accept( session );
            txn.commit();
        }
        catch (Throwable e) {
            if ( txn != null ) {
                txn.rollback();
            }
            throw e;
        }
        finally {
            if ( session != null ) {
                session.close();
            }
        }
    }

So, if we set the useTenantTimeZone parameter to true, Hibernate will persist the Timestamp properties using the tenant-specific time zone. As you can see in the following example, the Timestamp is successfully retrieved even if the currently running JVM uses a different time zone.

Example 644. The useTenantTimeZone allows you to persist a Timestamp in the provided time zone

    doInSession( FRONT_END_TENANT, session -> {
        Person person = new Person();
        person.setId( 1L );
        person.setName( "John Doe" );
        person.setCreatedOn( LocalDateTime.of( 2018, 11, 23, 12, 0, 0 ) );
        session.persist( person );
    }, true );

    doInSession( BACK_END_TENANT, session -> {
        Person person = new Person();
        person.setId( 1L );
        person.setName( "John Doe" );
        person.setCreatedOn( LocalDateTime.of( 2018, 11, 23, 12, 0, 0 ) );
        session.persist( person );
    }, true );

    doInSession( FRONT_END_TENANT, session -> {
        Timestamp personCreationTimestamp = (Timestamp) session
            .createNativeQuery(
                "select p.created_on " +
                "from Person p " +
                "where p.id = :personId" )
            .setParameter( "personId", 1L )
            .getSingleResult();

        assertEquals(
            Timestamp.valueOf( LocalDateTime.of( 2018, 11, 23, 12, 0, 0 ) ),
            personCreationTimestamp
        );
    }, true );

    doInSession( BACK_END_TENANT, session -> {
        Timestamp personCreationTimestamp = (Timestamp) session
            .createNativeQuery(
                "select p.created_on " +
                "from Person p " +
                "where p.id = :personId" )
            .setParameter( "personId", 1L )
            .getSingleResult();

        assertEquals(
            Timestamp.valueOf( LocalDateTime.of( 2018, 11, 23, 12, 0, 0 ) ),
            personCreationTimestamp
        );
    }, true );

However, behind the scenes, we can see that Hibernate has saved the created_on property in the tenant-specific time zone. The following example shows you that the Timestamp was saved in the UTC time zone, hence the offset displayed in the test output.

Example 645. With the useTenantTimeZone property set to false, the Timestamp is fetched in the tenant-specific time zone

    doInSession( FRONT_END_TENANT, session -> {
        Timestamp personCreationTimestamp = (Timestamp) session
            .createNativeQuery(
                "select p.created_on " +
                "from Person p " +
                "where p.id = :personId" )
            .setParameter( "personId", 1L )
            .getSingleResult();

        log.infof(
            "The created_on timestamp value is: [%s]",
            personCreationTimestamp
        );

        long timeZoneOffsetMillis =
            Timestamp.valueOf( LocalDateTime.of( 2018, 11, 23, 12, 0, 0 ) ).getTime() -
            personCreationTimestamp.getTime();

        assertEquals(
            TimeZone.getTimeZone(ZoneId.systemDefault()).getRawOffset(),
            timeZoneOffsetMillis
        );

        log.infof(
            "For the current time zone: [%s], the UTC time zone offset is: [%d]",
            TimeZone.getDefault().getDisplayName(), timeZoneOffsetMillis
        );
    }, false );

    SELECT
        p.created_on
    FROM
        Person p
    WHERE
        p.id = ?

    -- binding parameter [1] as [BIGINT] - [1]
    -- extracted value ([CREATED_ON] : [TIMESTAMP]) - [2018-11-23 10:00:00.0]
    -- The created_on timestamp value is: [2018-11-23 10:00:00.0]
    -- For the current time zone: [Eastern European Time], the UTC time zone offset is: [7200000]

Notice that for the Eastern European Time time zone, the time zone offset was 2 hours when the test was executed.
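
The 7200000 millisecond offset in the output can be reproduced with plain java.time arithmetic, independent of Hibernate. The sketch below assumes the JVM ran in the Europe/Bucharest zone, which is one zone that reports "Eastern European Time"; the source output only shows the display name, so the zone identifier is an assumption, as is the TimeZoneOffsetDemo class.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

/** Hypothetical helper that reproduces the offset arithmetic of the test. */
final class TimeZoneOffsetDemo {

    /**
     * Milliseconds between reading the same wall-clock value as UTC
     * and reading it in the given zone.
     */
    static long offsetMillis(LocalDateTime wallClock, ZoneId zone) {
        Instant asUtc = wallClock.toInstant(ZoneOffset.UTC);
        Instant inZone = wallClock.atZone(zone).toInstant();
        return Duration.between(inZone, asUtc).toMillis();
    }
}
```

For the wall-clock value 2018-11-23 12:00:00 and the Europe/Bucharest zone, the method returns 7200000, that is 2 hours, which matches the difference between the 12:00 value that was persisted and the 10:00 value read back without the tenant time zone.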