Free IT / Luis Fernando Planella Gonzalez: eclipselink

quarta-feira, 12 de novembro de 2014

Generating DDL with EclipseLink JPA and PostgreSQL

I normally say that either the project I work on (http://www.cyclos.org) is too special or we're just unlucky with the default operation in most libraries we use.
As we need streaming BLOBs (we don't want to load entire images into memory), and EclipseLink by default doesn't handle streaming.

So I had to do a subclass of org.eclipse.persistence.platform.database.PostgreSQLPlatform. The following methods were implemented:

    @Override
    public Object getObjectFromResultSet(ResultSet resultSet, int columnNumber, int type, AbstractSession session) throws SQLException {
        String name;
        if (type == Types.BIGINT) {
            // May be a number or an OID
            name = resultSet.getMetaData().getColumnTypeName(columnNumber);
            if ("OID".equalsIgnoreCase(name)) {
                return resultSet.getBlob(columnNumber);
            }
        }
        return super.getObjectFromResultSet(resultSet, columnNumber, type, session);
    }

    @Override
    public void setParameterValueInDatabaseCall(Object parameter, PreparedStatement statement, int index, AbstractSession session) throws SQLException {
        if (parameter instanceof DatabaseField) {
            DatabaseField field = (DatabaseField) parameter;
            if (Blob.class.equals(field.getType())) {
                statement.setBlob(index, (Blob) null);
            } else {
                super.setParameterValueInDatabaseCall(parameter, statement, index, session);
            }
        } else if (parameter instanceof Blob) {
            statement.setBlob(index, ((Blob) parameter));
        } else {
            super.setParameterValueInDatabaseCall(parameter, statement, index, session);
        }
    }

    @Override
    public boolean shouldUseCustomModifyForCall(DatabaseField field) {
        if (Blob.class.equals(field.getType())) {
            return true;
        }
        return super.shouldUseCustomModifyForCall(field);
    }

    @Override
    @SuppressWarnings({ "rawtypes", "unchecked" })
    protected Hashtable buildFieldTypes() {
        Hashtable types = super.buildFieldTypes();
        types.put(Blob.class, new FieldTypeDefinition("OID", false));
        return types;
    }

This way we can control: small binary data is mapped in entities via byte[]. Large binary data, via java.sql.Blob.

Then, to generate the schema:

    EntityManagerFactoryImpl emf = (EntityManagerFactoryImpl) realEMF;
    DatabaseSessionImpl databaseSession = emf.getDatabaseSession();

    StringWriter sw = new StringWriter();
    SchemaManager schemaManager = new SchemaManager(databaseSession);
    schemaManager.outputDDLToWriter(sw);

    DefaultTableGenerator tableGenerator = new DefaultTableGenerator(databaseSession.getProject()) {
        @Override
        protected void resetFieldTypeForLOB(DirectToFieldMapping mapping) {
            // Hack to avoid the workaround for oracle 4k thin driver bug
        }
    };
    TableCreator tableCreator = tableGenerator.generateDefaultTableCreator();
    tableCreator.createTables(databaseSession, schemaManager);

    String script = sw.toString();

That DefaultTableGenerator inner class took me some hours debugging EclipseLink to figure out. The method comments says it is there to fix issues with oracle 4k thin driver. And it messed up the other use cases, as Blob was being handled as Byte[], and we want OID type specifically for Blobs.

Congratulations, Oracle! (facepalm)

domingo, 20 de janeiro de 2013

Replaced Hibernate as JPA provider... To never look back!!!

Hibernate is probably the most well-known ORM tool for Java. I first used it on version 1.X back on 2002. It even influenced the JPA (Java Persistence API), which is a standard ORM API.
The problem is: the application (has about 200 entities) was taking up +- 350MB of heap size on startup right after forcing a garbage collect (using jvisualvm).
That was too much. But things would improve. There was a setting which I've always mislooked as a batch size equivalent, called hibernate.default_batch_fetch_size, which we had with value 20.
After some investigation, I found it was used to load several records at once, at the expense of memory. So, just to test out, I changed it to 1 and, surprise... The same application was now taking up +- 150MB! What a change for something misunderstood!
But I was not satisfied, and decided to try another JPA provider. From some researches, I decided to go with EclipseLink. Result? The same application now starts up (after a garbage collection) with +- 35 MB!!!
Ok, what a huge difference! But performance should be worst, shouldn't it? No!!! On the load tests I did, EclipseLink was actually 2.5x faster than Hibernate!
There were some bumps, several queries were done with some non-standard (from JPA's point of view) elements, and so on. But they all could be resolved, one at a time.
Conclusion: After being a loyal Hibernate user for several years (well, not that loyal, as some projects I did with plain JDBC using Querydsl SQL module), I'll now try to avoid it as much as possible, and use EclipseLink instead. Even a future possibility is Batoo JPA, which claims to be 15-20x faster than Hibernate. However, as it cannot be used with Spring's LocalContainerEntityManagerFactoryBean (at least for now, as it requires a persistence.xml, and I like bootstrapping things programmatically), I'll stick with EclipseLink for now.

Syntax coloring

quarta-feira, 12 de novembro de 2014

Generating DDL with EclipseLink JPA and PostgreSQL

domingo, 20 de janeiro de 2013

Replaced Hibernate as JPA provider... To never look back!!!