Free IT / Luis Fernando Planella Gonzalez: jpa

Mostrando postagens com marcador jpa. Mostrar todas as postagens

quarta-feira, 12 de novembro de 2014

Generating DDL with EclipseLink JPA and PostgreSQL

I normally say that either the project I work on (http://www.cyclos.org) is too special or we're just unlucky with the default operation in most libraries we use.
As we need streaming BLOBs (we don't want to load entire images into memory), and EclipseLink by default doesn't handle streaming.

So I had to do a subclass of org.eclipse.persistence.platform.database.PostgreSQLPlatform. The following methods were implemented:

    @Override
    public Object getObjectFromResultSet(ResultSet resultSet, int columnNumber, int type, AbstractSession session) throws SQLException {
        String name;
        if (type == Types.BIGINT) {
            // May be a number or an OID
            name = resultSet.getMetaData().getColumnTypeName(columnNumber);
            if ("OID".equalsIgnoreCase(name)) {
                return resultSet.getBlob(columnNumber);
            }
        }
        return super.getObjectFromResultSet(resultSet, columnNumber, type, session);
    }

    @Override
    public void setParameterValueInDatabaseCall(Object parameter, PreparedStatement statement, int index, AbstractSession session) throws SQLException {
        if (parameter instanceof DatabaseField) {
            DatabaseField field = (DatabaseField) parameter;
            if (Blob.class.equals(field.getType())) {
                statement.setBlob(index, (Blob) null);
            } else {
                super.setParameterValueInDatabaseCall(parameter, statement, index, session);
            }
        } else if (parameter instanceof Blob) {
            statement.setBlob(index, ((Blob) parameter));
        } else {
            super.setParameterValueInDatabaseCall(parameter, statement, index, session);
        }
    }

    @Override
    public boolean shouldUseCustomModifyForCall(DatabaseField field) {
        if (Blob.class.equals(field.getType())) {
            return true;
        }
        return super.shouldUseCustomModifyForCall(field);
    }

    @Override
    @SuppressWarnings({ "rawtypes", "unchecked" })
    protected Hashtable buildFieldTypes() {
        Hashtable types = super.buildFieldTypes();
        types.put(Blob.class, new FieldTypeDefinition("OID", false));
        return types;
    }

This way we can control: small binary data is mapped in entities via byte[]. Large binary data, via java.sql.Blob.

Then, to generate the schema:

    EntityManagerFactoryImpl emf = (EntityManagerFactoryImpl) realEMF;
    DatabaseSessionImpl databaseSession = emf.getDatabaseSession();

    StringWriter sw = new StringWriter();
    SchemaManager schemaManager = new SchemaManager(databaseSession);
    schemaManager.outputDDLToWriter(sw);

    DefaultTableGenerator tableGenerator = new DefaultTableGenerator(databaseSession.getProject()) {
        @Override
        protected void resetFieldTypeForLOB(DirectToFieldMapping mapping) {
            // Hack to avoid the workaround for oracle 4k thin driver bug
        }
    };
    TableCreator tableCreator = tableGenerator.generateDefaultTableCreator();
    tableCreator.createTables(databaseSession, schemaManager);

    String script = sw.toString();

That DefaultTableGenerator inner class took me some hours debugging EclipseLink to figure out. The method comments says it is there to fix issues with oracle 4k thin driver. And it messed up the other use cases, as Blob was being handled as Byte[], and we want OID type specifically for Blobs.

Congratulations, Oracle! (facepalm)

sábado, 2 de fevereiro de 2013

The beauty of Querydsl: calling database functions

It's a common pattern to have a self-referencing table, in order to model an hierarchy tree. Now, sorting results by hierarchy, is an entirely different subject. Some databases, like Oracle, has the start with / connect by clauses. But to bring that to the JPA world is another different story. And imagine that with the über ugly JPA 2 criteria queries. I was already using Querydsl with JPA (using EclipseLink), and it allowed a very clean solution.

First, I brought database functions to the rescue, and wrote a function which receives an entity id, a table name, the name column and the parent id column. So, the function can be reused on any type of entity. As the DB is Postgres, here comes the code:

create or replace function name_hierarchy
    (p_id bigint, 
     p_table varchar, 
     p_name_col varchar, 
     p_parent_id_col varchar)
    returns varchar
    as $$
        declare
            sql text;
            current_id bigint;
            v_name varchar;
            v_parent_id bigint;
            path varchar[];
        begin
            current_id := p_id;
            while current_id is not null loop
                sql := 'select id, ' 
                     || p_name_col || ', ' 
                     || p_parent_id_col
                     || ' from ' || p_table 
                     || ' where id = ' || current_id;
                execute sql into current_id, v_name, v_parent_id;
                path := array_prepend(v_name, path);
                current_id := v_parent_id;
            end loop;
            return array_to_string(path, ' > ');
        end;
    $$ language plpgsql
    stable;

Then, I needed to create a Querydsl operator to represent the function. It contains a name and the argument types.

public class CustomOperators {
    public static final Operator<String>
        NAME_HIERARCHY = new OperatorImpl<String>(
            "name_hierarchy", 
            Long.class, String.class, String.class, String.class);
}

There was also a custom Querydsl templates class to actually convert that operator into JPQL code (note that EclipseLink uses FUNC('name', args...) to invoke native database functions):

    public static class CustomTemplates
        extends EclipseLinkTemplates {

        private static final CustomTemplates INSTANCE = 
            new CustomTemplates();

        public static CustomTemplates getInstance() {
            return INSTANCE;
        }

        private CustomTemplates() {
            add(CustomOperators.NAME_HIERARCHY, 
                "FUNC('name_hierarchy', {0}, {1}, {2}, {3})");
        }
    }

As the example entity is Configuration, the final plumbing needed is a method annotated with @QueryDelegate(Configuration.class), so the extension method can be created in any class:

    @QueryDelegate(Configuration.class)
    public static StringExpression 
        nameHierarchy(EntityPath<Configuration> configuration) {
        NumberPath<Long> id = (NumberPath<Long>)
            FieldUtils.readField(configuration, "id");
        return StringOperation.create(
            CustomOperators.NAME_HIERARCHY, 
            id,
            StringTemplate.create("'configurations'"),
            StringTemplate.create("'name'"),
            StringTemplate.create("'parent_id'"));
    }

Finally, I can use the nameHierarchy method anywhere on queries, like:

QConfiguration c = QConfiguration.configuration;
List<Configuration> configs = new JPAQuery()
    .from(c)
    .orderBy(c.nameHierarchy.asc())
    .list(c);

Looks like that nameHierarchy method was always there, doesn't it? And the same idea can be reused on any other function, seamlessly blending them on the query metamodel. Now try to make something similar with JPA's criteria api!

domingo, 20 de janeiro de 2013

Replaced Hibernate as JPA provider... To never look back!!!

Hibernate is probably the most well-known ORM tool for Java. I first used it on version 1.X back on 2002. It even influenced the JPA (Java Persistence API), which is a standard ORM API.
The problem is: the application (has about 200 entities) was taking up +- 350MB of heap size on startup right after forcing a garbage collect (using jvisualvm).
That was too much. But things would improve. There was a setting which I've always mislooked as a batch size equivalent, called hibernate.default_batch_fetch_size, which we had with value 20.
After some investigation, I found it was used to load several records at once, at the expense of memory. So, just to test out, I changed it to 1 and, surprise... The same application was now taking up +- 150MB! What a change for something misunderstood!
But I was not satisfied, and decided to try another JPA provider. From some researches, I decided to go with EclipseLink. Result? The same application now starts up (after a garbage collection) with +- 35 MB!!!
Ok, what a huge difference! But performance should be worst, shouldn't it? No!!! On the load tests I did, EclipseLink was actually 2.5x faster than Hibernate!
There were some bumps, several queries were done with some non-standard (from JPA's point of view) elements, and so on. But they all could be resolved, one at a time.
Conclusion: After being a loyal Hibernate user for several years (well, not that loyal, as some projects I did with plain JDBC using Querydsl SQL module), I'll now try to avoid it as much as possible, and use EclipseLink instead. Even a future possibility is Batoo JPA, which claims to be 15-20x faster than Hibernate. However, as it cannot be used with Spring's LocalContainerEntityManagerFactoryBean (at least for now, as it requires a persistence.xml, and I like bootstrapping things programmatically), I'll stick with EclipseLink for now.

quarta-feira, 12 de maio de 2010

JPA 2 Criteria

One of the most expected features in JPA 2 is a Criteria API. Something that Hibernate has had for ages, but a notable absence in JPA 1.

Even better, JPA 2 criteria is compiled (generated from the source code) and type safe. So, for example, whenever an attribute is removed or changed on the entity, the queries stop compiling immediately, instead of having to wait until the application is running to detect errors. Neat, huh?

However, there's a problem. The way it is, queries are unusable. Well, usable, but very, VERY hard to code, read and maintain. Not for the JSR 317 expert group, of course, but everyone I've asked, has the same opinion as me.

Take a look (example extracted from this link, with little changes):

CriteriaBuilder builder = em.getCriteriaBuilder();
CriteriaQuery<Person> criteria = builder.createQuery(
    Person.class);
Root<Person> personRoot = criteria.from(Person.class);
criteria.select(personRoot);
ParameterExpression<String> eyeColorParam = builder.
    parameter(String.class);
criteria.where(builder.equal(personRoot.get(
    Person_.eyeColor), eyeColorParam));
TypedQuery<Person> query = em.createQuery(criteria);
query.setParameter(eyeColorParam, "brown");
List<Person> people = query.getResultList();

Is this example anything close to 'easy'? The very same query in JPQL would be:

String jpql = 
    "select p from Person p where p.eyeColor = :eyeColor";
TypedQuery<Person> query =
    em.createQuery(jpql, Person.class);
query.setParameter("eyeColor", "brown");
List<Person> people = query.getResultList();

To make things a bit worse, the Query object returned from the em.createQuery(criteria) never has parameters already set. And parameters are only used when a ParameterExpression is created. Otherwise, the values are passed as literals (so, subject to things like SQL injection). Yikes! There's absolutely no reason for this. Even the plain old Hibernate criteria already converted given literals to bind parameters...

C'mon, how could an expert group do such terrible decisions, impacting the lives of thousands Java programmers out there having to live with this abomination?

Thanks God, there is a very nice solution. It's Querydsl. It has the main advantage of JPA 2 criteria: being type safe (an annotation processor is used to generate a meta model which is used on queries), uses fluent interfaces (code is very readable) and generates queries with bind parameters on all expressions. The Querydsl metamodel has a Q prefix, for example, QEntity, instead of JPA's Entity_. So, let's take a look on the same previous example in Querydsl:

JPAQuery query = new JPAQuery(em);
QPerson person = QPerson.person;
List<Person> people = query.from(person)
  .where(person.eyeColor.eq("brown"))
  .list(person);

Now, that's readable!!! Also, in the project I'm working, I've also extended the query (actually, extending AbstractJPAQuery) and added other useful methods, like page(currentPage, pageSize). Such things can't be done in JPA because all objects (Query, CriteriaQuery, CriteriaBuilder) are interfaces given by the JPA provider, and can't be easily extended.

So, here is my tip to anyone thinking about using a Criteria API: Give Querydsl a try! By the way, did I mention that it can also be used with JDO, Lucene, JDBC and even plain collections?

Syntax coloring