Wednesday, March 18, 2009

Apache Commons DBCP and Tomcat JDBC Connection Pool

Introduction

The SpringSource tomcat architects are coming up with a new connection pool called Tomcat JDBC Connection Pool. This poses many questions to so many people like why do we need another connection pool when there is already an established Apache Commons DBCP pool available. In this article, I would like to point out some drastic differences between Apache DBCP and the new Tomcat JDBC Connection Pool. I strongly feel Tomcat JDBC Connection Pool is much superior to DBCP and highly recommend all users to check it out.

Database connection

A database connection is a facility in computer science that allows the client software to talk to database server. The database server could be running on the same machine where the client software runs or not. A database connection is required to send commands and receive answers via result set.

Connection Pool

Database connections are finite and expensive and can take disproportionately long time to create relative to the operations performed on them. It is very inefficient for an application to create and close a database connection whenever it needs to send a read request or update request to the database.

Connection Pooling is a technique designed to alleviate the above mentioned problem. A pool of database connections can be created and then shared among the applications that need to access the database. When an application needs database access, it requests a connection from the pool. When it is finished, it returns the connection to the pool, where it becomes available for use by other applications.

Apache Commons DBCP Connection Pool

There are several Database Connection Pools already available, both within Jakarta products and elsewhere. This Commons package provides an opportunity to coordinate the efforts required to create and maintain an efficient, feature-rich package under the ASF license.
Applications can use the commons-dbcp component directly or through the existing interface of their container / supporting framework.

Jakarta Tomcat the leading application server is also packaged with DBCP Datasource as the JNDI Datasource. The beauty of DBCP is that it can be used with so many applications or frameworks and it works with almost all databases in the market.

However, DBCP presents some challenges or concerns as well in spite of its popularity. Some of the challenges or concerns with DBCP are given below.

  1. commons-dbcp is single threaded. In order to be thread safe commons-dbcp locks the entire pool, even during query validation.
  2. commons-dbcp is slow - As the number of logical CPUs grow, the performance suffers, the above point shows that there is no support for high concurrency even with the enormous optimizations of the synchronized statement in Java 6, commons-dbcp still suffers in speed and concurrency.
  3. commons-dbcp is complex, over 60 classes. tomcat-jdbc-pool, is 8 classes, hence modifications for future requirement will require much less changes.
  4. commons-dbcp uses static interfaces. This means you can't compile it with JDK 1.6, or if you run on JDK 1.6/1.7 you will get NoSuchMethodException for all the methods not implemented, even if the driver supports it.
  5. The commons-dbcp has become fairly stagnant. Sparse updates, releases, and new feature support.
  6. It's not worth rewriting over 60 classes, when something as a connection pool can be accomplished with a much simpler implementation.
Some of the benefits of Tomcat JDBC Connection Pool are given below.
  1. Tomcat jdbc pool implements a fairness option not available in commons-dbcp and still performs faster than commons-dbcp.
  2. Tomcat jdbc pool implements the ability to retrieve a connection asynchronously, without adding additional threads to the library itself.
  3. Tomcat jdbc pool is a Tomcat module, it depends on Tomcat JULI, a simplified logging framework used in Tomcat.
  4. Support for highly concurrent environments and multi core/cpu systems.
  5. Dynamic implementation of interface, will support java.sql and javax.sql interfaces for your runtime environment (as long as your JDBC driver does the same), even when compiled with a lower version of the JDK.
  6. Validation intervals - we don't have to validate every single time we use the connection, we can do this when we borrow or return the connection, just not more frequent than an interval we can configure.
  7. Run-Once query, a configurable query that will be run only once, when the connection to the database is established. Very useful to setup session settings, that you want to exist during the entire time the connection is established.
  8. Ability to configure custom interceptors. This allows you to write custom interceptors to enhance the functionality. You can use interceptors to gather query stats, cache session states, reconnect the connection upon failures, retry queries, cache query results, and so on. Your options are endless and the interceptors are dynamic, not tied to a JDK version of a java.sql/javax.sql interface.
  9. High performance
  10. Extremely simple, due to the very simplified implementation, the line count and source file count are very low, compare with c3p0 that has over 200 source files. Tomcat jdbc has a core of 8 files, the connection pool itself is about half that.
  11. Asynchronous connection retrieval - you can queue your request for a connection and receive a Future back.
The usage of Tomcat JDBC Connection Pool is very simple and for people who are already familiar with DBCP, the transistion is very simple.

The Tomcat connection pool offers a few additional features over what most other pools let you do:

  • initSQL - the ability to run a SQL statement exactly once, when the connection is created.
  • validationInterval - in addition to running validations on connections, avoid running them too frequently.
  • jdbcInterceptors - flexible and pluggable interceptors to create any customizations around the pool, the query execution and the result set handling.
  • fairQueue - Set the fair flag to true to achieve thread fairness or to use asynchronous connection retrieval.
JNDI Factory and Type

Most attributes are same and have the same meaning as DBCP.

  • factory - factory is required, and the value should be org.apache.tomcat.jdbc.pool.DataSourceFactory
  • type - type should always be javax.sql.DataSource

Common Attributes

The following attributes are shared between commons-dbcp and tomcat-jdbc-pool, in some cases default values are different.

  • defaultAutoCommit - (boolean) The default auto-commit state of connections created by this pool. If not set, default is JDBC driver default (If not set then the setAutoCommit method will not be called.)
  • defaultReadOnly - (boolean) The default read-only state of connections created by this pool. If not set then the setReadOnly method will not be called. (Some drivers don't support read only mode, ex: Informix)
  • defaultTransactionIsolation - (String) The default TransactionIsolation state of connections created by this pool. One of the following: (see javadoc )
  1. NONE
  2. READ_COMMITTED
  3. READ_UNCOMMITTED
  4. REPEATABLE_READ
  5. SERIALIZABLE

If not set, the method will not be called and it defaults to the JDBC driver.

  • defaultCatalog - (String) The default catalog of connections created by this pool.
  • driverClassName - (String) The fully qualified Java class name of the JDBC driver to be used. The driver has to be accessible from the same classloader as tomcat-jdbc.jar
  • username -(String) The connection username to be passed to our JDBC driver to establish a connection. Note, at this point, DataSource.getConnection(username,password) is not using the credentials passed into the method.
  • password - (String) The connection password to be passed to our JDBC driver to establish a connection. Note, at this point, DataSource.getConnection(username,password) is not using the credentials passed into the method.
  • maxActive - (int) The maximum number of active connections that can be allocated from this pool at the same time. The default value is 100.
  • maxIdle - (int) The maximum number of connections that should be kept in the pool at all times. Default value is maxActive:100 Idle connections are checked periodically (if enabled) and connections that been idle for longer than minEvictableIdleTimeMillis will be released. (also see testWhileIdle)
  • minIdle - (int) The minimum number of established connections that should be kept in the pool at all times. The connection pool can shrink below this number if validation queries fail. Default value is derived from initialSize:10 (also see testWhileIdle)
  • initialSize - (int)The initial number of connections that are created when the pool is started. Default value is 10
  • maxWait - (long) The maximum number of milliseconds that the pool will wait (when there are no available connections) for a connection to be returned before throwing an exception. Default value is 30000 (30 seconds)
  • testOnBorrow - (boolean) The indication of whether objects will be validated before being borrowed from the pool. If the object fails to validate, it will be dropped from the pool, and we will attempt to borrow another. NOTE - for a true value to have any effect, the validationQuery parameter must be set to a non-null string. Default value is false
  • testOnReturn - (boolean) The indication of whether objects will be validated before being returned to the pool. NOTE - for a true value to have any effect, the validationQuery parameter must be set to a non-null string. The default value is false.
  • testWhileIdle - (boolean) The indication of whether objects will be validated by the idle object evictor (if any). If an object fails to validate, it will be dropped from the pool. NOTE - for a true value to have any effect, the validationQuery parameter must be set to a non-null string. The default value is false and this property has to be set in order for the pool cleaner/test thread is to run (also see timeBetweenEvictionRunsMillis)
  • validationQuery - (String) The SQL query that will be used to validate connections from this pool before returning them to the caller. If specified, this query does not have to return any data, it just can't throw a SQLException. The default value is null. Example values are SELECT 1(mysql), select 1 from dual(oracle), SELECT 1(MS Sql Server)
  • timeBetweenEvictionRunsMillis - (long) The number of milliseconds to sleep between runs of the idle connection validation/cleaner thread. This value should not be set under 1 second. It dictates how often we check for idle, abandoned connections, and how often we validate idle connections. The default value is 5000 (5 seconds).
  • numTestsPerEvictionRun - (int) Property not used in tomcat-jdbc-pool.
  • minEvictableIdleTimeMillis - (long) The minimum amount of time an object may sit idle in the pool before it is eligible for eviction. The default value is 60000 (60 seconds).
  • accessToUnderlyingConnectionAllowed - (boolean) Property not used. Access can be achieved by calling unwrap on the pooled connection. see javax.sql.DataSource interface, or call getConnection through reflection.
  • removeAbandoned - (boolean) Flag to remove abandoned connections if they exceed the removeAbandonedTimout. If set to true a connection is considered abandoned and eligible for removal if it has been in use longer than the removeAbandonedTimeout Setting this to true can recover db connections from applications that fail to close a connection. See also logAbandoned The default value is false.
  • removeAbandonedTimeout - (long) Timeout in seconds before an abandoned(in use) connection can be removed. The default value is 60 (60 seconds). The value should be set to the longest running query your applications might have.
  • logAbandoned - (boolean) Flag to log stack traces for application code which abandoned a Connection. Logging of abandoned Connections adds overhead for every Connection borrow because a stack trace has to be generated. The default value is false.
  • connectionProperties - (String) The connection properties that will be sent to our JDBC driver when establishing new connections. Format of the string must be [propertyName=property;]* NOTE - The "user" and "password" properties will be passed explicitly, so they do not need to be included here. The default value is null.
  • poolPreparedStatements - (boolean) Property not used. The default value is false.
  • maxOpenPreparedStatements - (int) Property not used. The default value is false.
Tomcat JDBC Connection Pool Specific Properties

  • initSQL - (String) A custom query to be run when a connection is first created. The default value is null.
  • jdbcInterceptors (String) A semicolon separated list of classnames extending org.apache.tomcat.jdbc.pool.JdbcInterceptor class. These interceptors will be inserted as an interceptor into the chain of operations on a java.sql.Connection object. The default value is null.
    Predefined interceptors:
    1. org.apache.tomcat.jdbc.pool.interceptor.ConnectionState - keeps track of auto commit, read only, catalog and transaction isolation level.
    2. org.apache.tomcat.jdbc.pool.interceptor.StatementFinalizer - keeps track of opened statements, and closes them when the connection is returned to the pool.
  • validationInterval - (long) avoid excess validation, only run validation at most at this frequency - time in milliseconds. If a connection is due for validation, but has been validated previously within this interval, it will not be validated again. The default value is 30000 (30 seconds).
  • jmxEnabled - (boolean) Register the pool with JMX or not. The default value is true.
  • fairQueue - (boolean) Set to true if you wish that calls to getConnection should be treated fairly in a true FIFO fashion. This uses the org.apache.tomcat.jdbc.pool.FairBlockingQueue implementation for the list of the idle connections. The default value is false. This flag is required when you want to use asynchronous connection retrieval.
  • useEquals - (boolean) Set to true if you wish the ProxyConnection class to use String.equals instead of == when comparing method names. This property does not apply to added interceptors as those are configured individually. The default value is false.
As a Resource

Please see the example below as to how to configure the Tomcat JDBC DataSource as a resource. I am using DB2 as the sample database for this example.


<Resource
auth="Container"
defaultAutoCommit="true"
defaultReadOnly="false"
defaultTransactionIsolation="READ_COMMITTED"
driverClassName="com.ibm.db2.jcc.DB2Driver"
factory="org.apache.tomcat.jdbc.pool.DataSourceFactory"
fairQueue="false"
initSQL="SELECT DTS FROM DT_TM_TS FOR READ ONLY WITH UR"
initialSize="10"
jdbcInterceptors="org.apache.tomcat.jdbc.pool.interceptor.ConnectionState;org.apache.tomcat.jdbc.pool.interceptor.StatementFinalizer"
jmxEnabled="true"
logAbandoned="true"
maxActive="100"
maxIdle="100"
maxWait="6000"
minEvictableIdleTimeMillis="30000"
minIdle="10"
name="jdbc/sampleDS"
password="xxxxxxxxxxx"
removeAbandoned="true"
removeAbandonedTimeout="60"
testOnBorrow="true"
testOnReturn="false"
testWhileIdle="false"
timeBetweenEvictionRunsMillis="30000"
type="javax.sql.DataSource"
url="jdbc:db2://os01.in.vbose.com:4745/DSNQ:currentFunctionPath=SAMPLE;IGNORE_DONE_IN_PROC=true;currentSchema=SAMPLE;"
useEquals="false"
username="abcdefg"
validationInterval="1800000" validationQuery="SELECT DTS FROM DT_TM_TS FOR READ ONLY WITH UR"/>


Wiring Tomcat JDBC DataSource using Spring Application Context

The DataSource class available within Tomcat JDBC Pool can also be instantiated through IoC and implements the DataSource interface since the DataSourceProxy is used as a generic proxy. The following is an example of using Spring application context to wire the DataSource dependency. The example database used is DB2 9 running on mainframe z/OS.


<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:p="http://www.springframework.org/schema/p"
xmlns:context="http://www.springframework.org/schema/context"
xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-2.5.xsd
http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context-2.5.xsd">

<context:property-placeholder location="classpath:sample/jdbc.properties"/>
<bean id="datasource" class="org.apache.tomcat.jdbc.pool.DataSource" destroy-method="close"
p:driverClassName="${jdbc.sample.db.driverClassName}"
p:url="${jdbc.sample.db.url}"
p:username="${jdbc.sample.db.username}"
p:password="${jdbc.sample.db.password}"
p:initialSize="10"
p:initSQL="SELECT DTS FROM DT_TM_TS FOR READ ONLY WITH UR"
p:minIdle="10"
p:maxIdle="100"
p:maxActive="100"
p:maxWait="6000"
p:jmxEnabled="true"
p:jdbcInterceptors="org.apache.tomcat.jdbc.pool.interceptor.ConnectionState;org.apache.tomcat.jdbc.pool.interceptor.StatementFinalizer"
p:removeAbandoned="true"
p:removeAbandonedTimeout="60"
p:logAbandoned="true"
p:testOnBorrow="true"
p:testOnReturn="false"
p:testWhileIdle="false"
p:useEquals="false"
p:fairQueue="false"
p:timeBetweenEvictionRunsMillis="30000"
p:minEvictableIdleTimeMillis="30000"
p:validationInterval="1800000"
p:validationQuery="SELECT DTS FROM DT_TM_TS FOR READ ONLY WITH UR"
/>


In order to keep track of query performance and issues log entries when queries exceed a time threshold of fail, you may also use another built-in interceptor called org.apache.tomcat.jdbc.pool.interceptor.SlowQueryReport(threshold=10000). In this case, the log level used is WARN.

Additionally, if used within tomcat, you can add a JMX enabled interceptor called org.apache.tomcat.jdbc.pool.interceptor.SlowQueryReportJmx(threshold=10000). This class uses Tomcat's JMX engine so it wont work outside of the Tomcat container.

Conclusion

The tomcat JDBC connection pool (tomcat-jdbc.jar) is currently available as part of SpringSource tc Server, the enterprise version of Tomcat Server. In tc server, both DBCP and Tomcat JDBC are available and it is upto the system architect to decide which option is best to use in their applications.

The tomcat JDBC also depends on Tomcat JULI, a simplified logging framework used in Tomcat. So you may need tomcat-juli.jar if you want to use it outside tomcat container.