Written by Marek Smęt
Software Engineer
Published June 11, 2015

Groovy SQL – an easy way to database scripting

If you have to write some database script at SQL level then try out Groovy SQL. Why? This story is the answer.

Story

Some time ago I was assigned to a task, which was strictly connected with database. After some migration processes it turned out that we had many duplicates among the authors of articles. The main goal of that task was to check the integrity, correct it if we encountered any deviation, and of course remove those duplicates. It was not a very urgent issue, therefore I decided to look around and find a new tool and/or programming language to accomplish that task. At the end I thought it would be a good chance to learn something new and have fun too. I chose to select from a rich JVM-based languages family and picked up Groovy language. As a natural choice I decided to use its SQL package as a vehicle to access and modify database content.

Objectives

Here is a short description of what had to be done in my task. First thing to do was to find all article entities that met certain criteria. Then, for all of these entities, find previously mentioned authors and make a backup of their names in a new column. Replace current author(s) with a new generic one. At the end, find and remove all authors which were involved in this process. As you can see, it looked like a series of SELECT, UPDATE and DELETE statements. Nothing fancy and the amount of data was not intimidating either – roughly 100k articles to process.

Script

For me it was the first time when I was using Groovy language in the production code. I decided to use the script and did not bother with classes, main method and so on.
First step: connect to the database and run a simple query. It is easy, look:

def db = [url: "jdbc:mysql://$props.host_and_port/$props.dbname?useUnicode=true&characterEncoding=UTF-8",
          user: props.dbuser, password: props.dbpassword, driver: 'com.mysql.jdbc.Driver']
def sql = Sql.newInstance(db.url, db.user, db.password, db.driver)
println "DB connection ready"

Thanks to Groovy string interpolation we can easily replace placeholders in the connection string.
Actual parameters such as: host, database name, user and password can be taken from a command line or, as in my case, they come from the property file. Do not forget to place JDBC driver in the classpath.

Simple operations

Now it is time for a simple query. Let’s print a list of articles’ IDs:

sql.eachRow('''SELECT * FROM ArticleTable WHERE some_criteria''', { article ->
    println article.articleID
})

It is really straightforward. Each row is represented as the article object where columns’ names become attributes. Compare it to the typical Java structure such as this one:

...
ResultSet rs = stmt.executeQuery("SELECT * FROM ArticleTable WHERE some_criteria");
while (rs.next()) {
    System.out.println("Article ID: " + rs.getString("articleID"));
}
...

Updating and deleting is easy too. We can use executeUpdate and execute methods like in the examples below. String interpolation is used again and provide more human readable SQL statements.

sql.executeUpdate("UPDATE ArticleTable SET authorID=$newAuthor where ID=$id")
sql.execute("DELETE FROM AuthorTable WHERE authorID=$id")

The execute variant that accepts GString constructs PrepareStatement with placeholders under the covers and helps us protect against SQL injection.

Batch processing

For the performance reason it would be wise to submit UPDATE statements in chunks. Groovy SQL comes in handy in that area too. There are few variants of withBatch method. In my example processArticle method produces update statement for each article. Those updates are aggregated for future execution.

sql.withBatch({ stmt ->
    sql.eachRow("SELECT * FROM ArticleTable WHERE some_criteria ", { article ->
        // Process Article and add update statement to batch 
        def updateStmts = processArticle(article)
        stmt.addBatch(updateStmts)
    })
})

Method executeBatch(), which submits the batch, will be called automatically after the withBatch closure has finished but may be called earlier if we want to.
For integrity reasons all batch operation can be nested inside a transaction, see example below.

sql.withTransaction {
     def result = sql.withBatch({ ... })
     ...
}

Clean up

At the end of the script, when all is done, do not forget about closing the connection:

sql.close()
println "Connection closed. End"

Summary

In this short post I wanted to convey how easy and pleasant a database scripts writing can be. Groovy SQL provides with a nice facade over the JDBC API and allows developers to focus more on business logic instead of dealing with cumbersome JDBC constructs. Combining Groovy SQL with reactive approach may be another interesting subject to explore. If you have free time, please give it a try.

Written by Marek Smęt
Software Engineer
Published June 11, 2015