
Advanced Databases

Object Relational Mapping Tools

1. Introduction
Whether we develop a small or a large application, we always deal with data in different
forms. It is not enough to keep the data in the structures provided by the programming
language, because these exist only while the application is running. There are times when
the application is shut down, hangs, or needs maintenance, and the data held in those
structures must be stored somewhere else until the system is restarted.
This is where the need for persistence arises.
Implementing persistence by hand is tedious, repetitive work, and errors can easily creep
into the application. Object Relational Mapping tools were developed as a result. Their
goal is to simplify the creation of the data access layer, to automate data access, or to
generate data access code.
The principle of object-relational mapping is to delegate the management of persistence
to tools, and to work at code level with objects representing a domain model rather than
with data structures that mirror the format of the relational database. Object-relational
mapping tools establish a bidirectional link between data in a relational database and
objects in code, based on a configuration and by executing SQL queries (dynamic most of
the time) on the database. Each tool has its pros and cons, as does the mapping approach
itself.
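
The bidirectional link described above can be illustrated with a hand-written mapper. This
is only a minimal sketch of the two directions that a mapping tool automates; the Customer
class, the CustomerMapper class and the column names ID and NAME are hypothetical and do
not come from the tested application.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical domain object; class and column names are illustrative only.
final class Customer {
    final long id;
    final String name;

    Customer(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

// Hand-written mapper showing the two directions an ORM tool automates.
final class CustomerMapper {
    // Row -> object: what happens after a SELECT.
    static Customer fromRow(Map<String, Object> row) {
        long id = ((Number) row.get("ID")).longValue();
        return new Customer(id, (String) row.get("NAME"));
    }

    // Object -> row: what happens before an INSERT or UPDATE.
    static Map<String, Object> toRow(Customer c) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("ID", c.id);
        row.put("NAME", c.name);
        return row;
    }
}
```

A mapping tool derives this kind of code from its configuration instead of requiring it to
be written for every class.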
The criteria we considered are presented briefly below:
- Customization of queries. We often need to go beyond what is possible with the
provided query language. In these cases, we need to be able to provide custom
SQL queries. HQL, which is a strong point of Hibernate/NHibernate, allows for
this. It is also desirable that the results of developer-provided SQL queries can be
mapped dynamically.
- Support for any type of SQL join (inner join, outer join).
- Concurrency management (support for optimistic and pessimistic approaches).
- The ability to map a single object to data coming from multiple tables (joins, views).
Most of the tools handle a direct mapping of a class to one table. We often need
more.
- The ability to dispatch the data from a single table to multiple objects.
- Global performance (good implementation of the object-relational mapping
concept, ease of use, flexibility).
- Lazy loading (the loading of some data is deferred until it is needed):
o for the data reached through relations;
o for some columns. When we want to display just a list of names, we do not
need all the columns of a table to be loaded. We may need the blob fields
only at a certain point, under certain conditions, and so it is better to load
them only at that time.
- Caching of dynamically generated queries, so that they do not get rebuilt at each call.
- Caching of some data, to avoid too many calls to the data source.
- Bulk updates or deletions. When we want to update or delete thousands of
records at a time, it is not practical to load all the objects into memory, while this
can be done easily and quickly with a SQL query (DELETE FROM Customer WHERE
Balance < 0). Support from the tool is welcome to handle such massive
operations without having to deal with SQL. Hibernate is not very good on this
point, for example.
- Supported databases.
- Query language. There is sometimes the need to execute dynamic queries.
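
The lazy loading criterion from the list above can be sketched in plain Java. This is a
minimal illustration under assumptions, not the mechanism of any of the tested tools: the
Lazy holder and the Document entity are hypothetical names, and the loader stands in for
the query that would fetch a blob column.

```java
import java.util.function.Supplier;

// Memoizing holder: the loader runs on first access only.
final class Lazy<T> {
    private Supplier<T> loader;
    private T value;
    private boolean loaded;

    Lazy(Supplier<T> loader) {
        this.loader = loader;
    }

    T get() {
        if (!loaded) {
            value = loader.get();
            loaded = true;
            loader = null; // the loader is no longer needed once the value is cached
        }
        return value;
    }

    boolean isLoaded() {
        return loaded;
    }
}

// Hypothetical entity: the name is loaded eagerly, the blob content lazily.
final class Document {
    final String name;
    final Lazy<byte[]> content;

    Document(String name, Supplier<byte[]> contentLoader) {
        this.name = name;
        this.content = new Lazy<>(contentLoader);
    }
}
```

Listing the names of many Document objects never triggers the loader; the blob is fetched
once, the first time content.get() is called.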

1.1 The Database


The application used for this analysis implements an invoice management system. An invoice
contains one or more products, and a product can appear on one or more invoices (the
relation is based on the product ID). Each invoice has a Content, which also serves as the
linking table of this many-to-many relation. A user can perform the following operations
on his invoices: add or delete an invoice, list the contents of an invoice, list all of
his invoices, and delete or modify the content of an invoice.
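
The many-to-many relation described above can be pictured with a small in-memory sketch.
The class names ContentLine and InvoiceContent and all method names are assumptions made
for illustration; the actual schema of the test application is not given in this paper.

```java
import java.util.ArrayList;
import java.util.List;

// One row of the linking table: product productId appears on invoice invoiceId.
final class ContentLine {
    final int invoiceId;
    final int productId;

    ContentLine(int invoiceId, int productId) {
        this.invoiceId = invoiceId;
        this.productId = productId;
    }
}

// In-memory stand-in for the Content table.
final class InvoiceContent {
    private final List<ContentLine> lines = new ArrayList<>();

    void add(int invoiceId, int productId) {
        lines.add(new ContentLine(invoiceId, productId));
    }

    // Lists the contents of one invoice.
    List<Integer> productsOn(int invoiceId) {
        List<Integer> result = new ArrayList<>();
        for (ContentLine line : lines) {
            if (line.invoiceId == invoiceId) {
                result.add(line.productId);
            }
        }
        return result;
    }

    // Lists the invoices on which one product appears.
    List<Integer> invoicesContaining(int productId) {
        List<Integer> result = new ArrayList<>();
        for (ContentLine line : lines) {
            if (line.productId == productId) {
                result.add(line.invoiceId);
            }
        }
        return result;
    }

    // Deleting an invoice removes all of its content lines.
    void deleteInvoice(int invoiceId) {
        lines.removeIf(line -> line.invoiceId == invoiceId);
    }
}
```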
1.2 The tested ORMs
- ActiveJDBC
- Carbonado
- Hibernate
-

2. ActiveJDBC
ActiveJDBC is a Java implementation of the Active Record design pattern. It was inspired
by the ActiveRecord ORM from Ruby on Rails.
Supported Databases:
- MySQL
- PostgreSQL
- Oracle
- H2
- MS SQL Server
Support for other databases can also be added, but this requires running certain build
scripts.
Advantages:
- No implementation is needed for the models; they only have to extend the Model class;
- ActiveJDBC has an inference mechanism through which it derives table names and
properties from the models (e.g. class User -> table Users);
- For every model, the ActiveJDBC framework assumes that there is an id field/db
property;
- Primary keys and foreign keys can also be redefined through annotations, e.g.:
    @Table("Facturi")
    @IdName("pk_IDFactura")
    @BelongsTo(parent = User.class, foreignKeyName = "IDUser")
    public class Facturi extends Model { }
- When a database is modelled, there is no need for an external mapping file,
but an Instrumentation script must be run before run time in order to
generate the bytecode of the models. Although no extra files are needed for
mapping, this can become annoying if it is not integrated into the build script;
- The basic CRUD operations behave as expected;
- The user can also pass an SQL statement to a find() call, as the code can sometimes
become hard to read if only methods are used;
- ActiveJDBC can export a result of any type (List, Map, Object) to JSON or XML;
- All generated SQL statements can be parameterized;
- You can extract all the columns or only a certain one;
- Caching is not turned on by default; even if you use the @Cached annotation, the
correct entry must still be made in the activejdbc.properties file.
The framework takes care of purging the cache for tables that have been modified;
- Manual caching is available, but the user must then take care of stale data;
- All types of relations are supported;
- Batch processing is limited to updates and deletes.
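
The idea of inferring a table name from a class name can be sketched as follows. This is a
deliberately naive English pluralizer written for illustration; it is not ActiveJDBC's
actual inflection code, and the TableNames class is a hypothetical name.

```java
// Naive table-name inference based on English plural rules (illustrative only).
final class TableNames {
    static String infer(String className) {
        int n = className.length();
        // consonant + "y" -> "ies" (Company -> Companies)
        if (n > 1 && className.endsWith("y")
                && "aeiou".indexOf(className.charAt(n - 2)) < 0) {
            return className.substring(0, n - 1) + "ies";
        }
        // sibilant endings take "es" (Box -> Boxes)
        if (className.endsWith("s") || className.endsWith("x")
                || className.endsWith("ch") || className.endsWith("sh")) {
            return className + "es";
        }
        // default rule: append "s" (User -> Users)
        return className + "s";
    }
}
```

With these rules a class named Factura is mapped to Facturas, while the Romanian plural is
Facturi, so a model for a table with a non-English name needs an explicit table name such
as the @Table annotation used in the example above.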

Disadvantages:
- The constant need to run the Instrumentation script is a drawback in terms of
usability;
- The inference mechanism works well only if the tables are named in English, as it
relies on the plural forms of English nouns;
- In order to get a value from a result row, the user must know the name of the
column and pass it as a parameter to, for example, the getString(String col_name)
function;
- The caching framework is a separate open source framework, so it may have its own
bugs, although it may also be well documented;
- The use of multiple keys is not that easy;
- Multiple foreign keys per model can be annotated, but only one of them will be used;
- Multiple primary keys can be used, but workaround code is necessary, which makes
the code a bit heavy;
- The native query cannot be inspected on request; it is only sometimes shown in the
error stack trace;
- The only way to get to a parent is through a method call, and the code becomes a bit
too heavy, e.g.:
    System.out.println(fact.parent(Facturi.class).parent(User.class).getString("nume"));
- You can only navigate from the child to the parent, not the other way round, which is
a bit counter-intuitive;
- The framework does not generate getters/setters for the properties;
- It needs other jars in order to work;
- The documentation does not cover many use cases.

In conclusion, ActiveJDBC is a framework that is easy to integrate into a project or
application, and it offers basically all you need in order to execute CRUD operations and
more. Like many other ORM frameworks for Java applications, it depends on several other
jar files, which the documentation does not tell you about. The documentation is also a
bit sparse. A big minus for ActiveJDBC is the fact that it does not properly support
extensions.

3. Carbonado
Defining new types in Carbonado involves creating an interface or abstract class which
follows Java bean conventions. Additional information is specified by inserting special
annotations. At the very least, an annotation is required to specify the primary key.
Annotations are a feature first available in Java 5, and as a result, Carbonado depends on
Java 5. Carbonado achieves high performance by imposing very low overhead when
accessing the actual storage. The low overhead is achieved in part by auto-generating
performance-critical code via the Cojen library.
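
The style of type definition described above, an interface that follows Java bean
conventions and carries an annotation naming its primary key, can be sketched with plain
Java. The KeyProperty annotation, the InvoiceRecord interface and the KeyReader helper are
hypothetical stand-ins defined here for illustration; they are not Carbonado's own types.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical annotation defined only for this sketch; it is not Carbonado's.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface KeyProperty {
    String value();
}

// A type declared as an interface that follows Java bean conventions.
@KeyProperty("invoiceId")
interface InvoiceRecord {
    long getInvoiceId();
    void setInvoiceId(long id);

    String getCustomerName();
    void setCustomerName(String name);
}

final class KeyReader {
    // Returns the declared key property, or fails if the annotation is missing.
    static String keyOf(Class<?> type) {
        KeyProperty key = type.getAnnotation(KeyProperty.class);
        if (key == null) {
            throw new IllegalArgumentException(type.getName() + " declares no key");
        }
        return key.value();
    }
}
```

A framework built this way reads such metadata at run time and generates the
implementation of the interface itself, which is why no hand-written class appears here.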
Supported databases:
- Any SQL relational database accessible through JDBC
- Berkeley DB
Advantages:
- Supports transactions, optimistic locking, joins and LOBs;
- Has wrapper classes for different database structures;
- The documentation is strongly oriented towards Berkeley DB;
- If transactions are not used, persist operations use auto-commit;
- All Storables must have a primary key;
- The annotations let you set all the properties that a db has;
- It allows you to set details such as text format, length constraints and integer
constraints;
- You can print the query plan, which gives details of how the query will be
executed;
- Supports joins for all relations;
- Supports different isolation levels for transactions;
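
Optimistic locking, listed among the advantages above, can be illustrated with a version
number that is checked at update time. The following is a minimal in-memory sketch and
not Carbonado's API; VersionedStore and its methods are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory stand-in for a table with a version column (illustrative only).
final class VersionedStore {
    static final class Row {
        final String value;
        final int version;

        Row(String value, int version) {
            this.value = value;
            this.version = version;
        }
    }

    private final Map<Integer, Row> rows = new HashMap<>();

    void insert(int id, String value) {
        rows.put(id, new Row(value, 1));
    }

    Row load(int id) {
        return rows.get(id);
    }

    // Succeeds only if the row still has the version the caller read earlier.
    // Mirrors: UPDATE t SET value = ?, version = version + 1 WHERE id = ? AND version = ?
    boolean update(int id, String newValue, int expectedVersion) {
        Row current = rows.get(id);
        if (current == null || current.version != expectedVersion) {
            return false; // stale read: the caller must reload and retry
        }
        rows.put(id, new Row(newValue, expectedVersion + 1));
        return true;
    }
}
```

No lock is held between load and update; a conflicting writer is detected only when the
second update finds a version number it did not expect.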

Disadvantages:
- Documentation examples are not completely explicit;
- Much more support for Berkeley DB than for JDBC-dependent databases;
- Indexing can be applied from classes only to Berkeley DB;
- The behaviour when working through JDBC is not consistent;
- The Load operation sometimes deletes records while fetching them;
- The code is a bit too tangled;
- A complete implementation must be given to the Storables;

4. Hibernate
Hibernate is regarded as a high-performance object-relational persistence and query
service, distributed under the GNU Lesser General Public License (LGPL). Like many Java
ORM frameworks, Hibernate takes care of the mapping between classes and relational
databases, and also provides data query and retrieval facilities.
Supported Databases:
- HSQL database engine
- DB2/NT
- MySQL
- PostgreSQL
- FrontBase
- Oracle
- MS SQL
- Sybase SQL Server

Advantages:
- Hibernate takes care of mapping Java classes to database tables using XML files,
without writing any line of code;
- It abstracts away the unfamiliar SQL types and lets us work with familiar Java
objects;
- It does not require an application server to operate;
- It minimizes database access with smart fetching strategies;
- It provides simple querying of data;
- It has dedicated objects for transactions, queries and sessions; these objects take care
almost entirely of the db interaction that they represent;
- A major plus for Hibernate is the fact that it is quite easy to install;
- It benefits from lots of documentation and comes with a well-structured start-up guide
and manual;
- As it is widely used on projects of different sizes, the product is kept up to date;
- As mentioned, Hibernate offers its own programmatic query language, HQL, but the user
can also use native SQL when needed, for instance when specific features of the
connected database are required (for example, Oracle's CONNECT BY);
- HQL supports all types of CRUD operations, as well as parameterized queries;
- Through the XML mapping files, table columns are mapped to Java primitives or
wrapper classes;
- Hibernate queries operate on persistent objects, not directly on tables;
- The API provides the possibility to commit or roll back;
- Hibernate has different types and levels of caching;
- In order to avoid an out-of-memory exception, a batch size can be set. The batch size
determines how many of the objects held in the session cache are inserted or updated in
one Insert/Update;
- The 3 levels of caching are: first-level cache, second-level cache and query-level
cache;
- The first-level cache is mandatory; it works at the level of a session object; an
object is cached before being committed to the database, and multiple updates are
delayed as long as possible;
- For the second-level cache, a cache provider must be set and available, and
Hibernate supports 4 types; this level of caching is used for caching objects between
sessions.
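
The interplay between the session cache and the batch size described above can be
sketched with a toy session. This is an illustration of the idea only and not Hibernate's
Session; ToySession, its methods and the flushedBatches list (which stands in for the
database) are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy session: delays writes and sends them to the "database" in batches.
final class ToySession {
    private final int batchSize;
    // Session-level cache of objects that are not yet written, keyed by id.
    private final Map<Integer, String> pending = new LinkedHashMap<>();
    // Stands in for the database: each element is one batch of written ids.
    final List<List<Integer>> flushedBatches = new ArrayList<>();

    ToySession(int batchSize) {
        this.batchSize = batchSize;
    }

    // Saving only registers the object; repeated saves of one id are merged.
    void save(int id, String entity) {
        pending.put(id, entity);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    // Writes all pending objects in one batch and empties the session cache.
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        flushedBatches.add(new ArrayList<>(pending.keySet()));
        pending.clear();
    }

    int pendingCount() {
        return pending.size();
    }
}
```

Because the cache is emptied at every flush, the number of objects held in memory never
exceeds the batch size, which is the point of setting it.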

Disadvantages:
- Although Hibernate is very well documented, implementation is still challenging for
certain features (e.g. mapping many tables to one object);
- Besides the XML mapping that you have to write, a full implementation of the classes
that model the tables is needed;
- Annotations can be used, but the code can become hard to read;
- Enabling features requires knowledge of certain XML configuration files;
- It has no visual helper tool;
- The SQL statements are generated at run time, so it is sometimes slow;
- There is a lot of documentation to read when you have to work with it in different
circumstances;
- Its default caching technique uses local memory, so a complex application may slow
down your computer;
- You cannot access parents or children directly;
- You might end up with a code overhead.

As a brief conclusion, Hibernate offers a lot of good tools and APIs, but it will be easier
for you if someone who knows it well is there to answer your questions.

5. Conclusions
After exploring these four frameworks, we can say that they all have their good and bad
parts. Most of them spare the user from hard coding, as they let him interact with the
database through a dynamic language. The fact that tables are mapped to objects also
benefits the programmer, as it brings the data into his area of expertise and gives him
the ability to manipulate it in different ways.
A general observation about the Java ORMs is that they all need some complementary jar
files, so the user has to keep all the dependencies of the chosen ORM up to date where
necessary. These tools take care of a lot of code generation, so that the developer can
focus more on the application's logic.
When choosing a framework, one must consider criteria such as those mentioned at the
beginning of the paper, and must also consider whether the developers will have to go
through a time-consuming process in order to get accustomed to something new.