Add search features to your application, try Elasticsearch part 3 : attaching indexation to events

Now that we are able to index, we should think about when we should trigger indexing tasks.

A simple answer would be: whenever some indexed data has changed. “Changed” means either a change in cardinality (a record added or removed) or a change to existing data.

Either we invoke indexing tasks explicitly in every action that changes data, or we use an event model that listens to precise events.
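
To make the second option concrete, here is a minimal plain-Java sketch of an event model, with no framework involved. All names (ChangeListener, InMemoryIndex, AdvertStore) are hypothetical and only serve the illustration; they are not taken from the project's source code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical listener contract: notified after data has changed.
interface ChangeListener {
    void onSaved(long id, String title);
    void onDeleted(long id);
}

// Hypothetical stand-in for the search index.
class InMemoryIndex implements ChangeListener {
    final Map<Long, String> documents = new HashMap<>();

    @Override
    public void onSaved(long id, String title) {
        documents.put(id, title); // add or overwrite the indexed document
    }

    @Override
    public void onDeleted(long id) {
        documents.remove(id);
    }
}

// Hypothetical data store that fires the events itself, so callers
// never have to remember to call the index.
class AdvertStore {
    private final Map<Long, String> rows = new HashMap<>();
    private final List<ChangeListener> listeners = new ArrayList<>();

    void addListener(ChangeListener listener) {
        listeners.add(listener);
    }

    void save(long id, String title) {
        rows.put(id, title);
        for (ChangeListener l : listeners) l.onSaved(id, title);
    }

    void delete(long id) {
        // Notify only when something was really removed.
        if (rows.remove(id) != null) {
            for (ChangeListener l : listeners) l.onDeleted(id);
        }
    }
}

class EventModelDemo {
    public static void main(String[] args) {
        InMemoryIndex index = new InMemoryIndex();
        AdvertStore store = new AdvertStore();
        store.addListener(index);

        store.save(1L, "Bike for sale");
        store.save(2L, "Flat to rent");
        store.delete(1L);

        System.out.println(index.documents); // only advert 2 is left in the index
    }
}
```

With the first option, every caller of save and delete would have to update the index itself. Here the store is the single place where events are fired, which is the property the solutions below try to obtain from the persistence layer.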

1 – JPA event model

If you use JPA as your persistence mechanism, you can take advantage of its elegant lifecycle callback mechanism. You can register listeners for an entity either at class level or at method level via annotations.
You can annotate an entity method as a listener for an event. The method will then be executed when the event defined by the annotation occurs.
If this solution seems too intrusive or too specific, you can externalize this behaviour in a separate class and reference that class from the entity with @EntityListeners.


Below, an example:

import javax.persistence.Entity;
import javax.persistence.EntityListeners;
import javax.persistence.Id;
import javax.persistence.PostLoad;
import javax.persistence.PostPersist;
import javax.persistence.PostUpdate;
import javax.persistence.PrePersist;
import javax.persistence.PreRemove;
import javax.persistence.PreUpdate;
import javax.persistence.Transient;

@Entity
@EntityListeners({EmployeeDebugListener.class, NameValidator.class})
public class Employee {
    @Id private int id;
    private String name;
    @Transient private long syncTime;

    @PostPersist
    @PostUpdate
    @PostLoad
    private void resetSyncTime() {
        syncTime = System.currentTimeMillis();
        System.out.println("Employee.resetSyncTime called on employee id: " + getId());
    }

    public long getCachedAge() {
        return System.currentTimeMillis() - syncTime;
    }

    public int getId() {
        return id;
    }

    public void setId(int id) {
        this.id = id;
    }

    public String toString() {
        return "Employee id: " + getId() ;
    }
}

That model, although very elegant, doesn’t suit you if you use Spring, because the persistence provider can’t use a bean instance as a listener. It creates its own listener instances, which goes against dependency injection. This post is a rather complete reference on the JPA-based solution.
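
To illustrate the instantiation problem, here is a plain-Java sketch that mimics a provider building the listener by reflection. The static holder used to reach the collaborator is one possible workaround, not something this post or its source code uses, and every name in the block (SearchIndexer, IndexerHolder, AdvertIndexingListener) is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical collaborator that a container such as Spring would normally inject.
class SearchIndexer {
    final List<Long> indexedIds = new ArrayList<>();

    void index(long id) {
        indexedIds.add(id);
    }
}

// Hypothetical static holder: the container fills it once at startup,
// and listeners created outside the container read from it.
final class IndexerHolder {
    private static volatile SearchIndexer indexer;

    private IndexerHolder() {}

    static void set(SearchIndexer value) {
        indexer = value;
    }

    static SearchIndexer get() {
        SearchIndexer current = indexer;
        if (current == null) {
            throw new IllegalStateException("SearchIndexer not registered yet");
        }
        return current;
    }
}

// Listener with a no-arg constructor only: nobody can inject anything into it.
class AdvertIndexingListener {
    void afterPersist(long advertId) {
        IndexerHolder.get().index(advertId);
    }
}

class ListenerInstantiationDemo {
    public static void main(String[] args) throws Exception {
        SearchIndexer indexer = new SearchIndexer();
        IndexerHolder.set(indexer); // what the container would do at startup

        // Mimics the persistence provider: it builds the listener itself, by reflection.
        AdvertIndexingListener listener =
                AdvertIndexingListener.class.getDeclaredConstructor().newInstance();
        listener.afterPersist(42L);

        System.out.println(indexer.indexedIds);
    }
}
```

The static lookup works, but it hides the dependency and makes the listener harder to test, which is why registering real bean instances (next section) feels more natural in a Spring application.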

2 – Hibernate event model

When using Hibernate without JPA, together with Spring, you can register listener instances, not only classes. Your application can listen to post-insert, post-update and post-delete events. This solution is my favorite if your application writes little and reads a lot.
You specify your listeners by setting the eventListeners property of the LocalSessionFactoryBean. It is a map which associates an event key with one or more listener instances.

    <bean name="sessionFactory" class="org.springframework.orm.hibernate3.LocalSessionFactoryBean">
        <property name="dataSource" ref="dataSource"/>
        <property name="mappingLocations" value="classpath:hibernate/mapping/*.xml"/>
        <property name="hibernateProperties">
            <props>
                <prop key="hibernate.dialect">${hibernate.dialect}</prop>
                <prop key="hibernate.hbm2ddl.auto">${hibernate.hbm2ddl.auto}</prop>
                <prop key="hibernate.show_sql">${hibernate.show_sql}</prop>
                <prop key="hibernate.connection.useUnicode">true</prop>
                <prop key="hibernate.connection.characterEncoding">UTF-8</prop>
            </props>
        </property>
        <property name="eventListeners">
            <map>
                <entry key="post-insert">
                    <ref bean="PostCommitInsertEventListener"/>
                </entry>
                <entry key="post-update">
                    <ref bean="PostCommitUpdateEventListener"/>
                </entry>
                <entry key="post-delete">
                    <ref bean="PostCommitDeleteEventListener"/>
                </entry>
            </map>
        </property>

    </bean>

PostCommitDeleteEventListener source code:

...
import org.hibernate.event.PostDeleteEvent;
import org.hibernate.event.PostDeleteEventListener;

public class PostCommitDeleteEventListener implements PostDeleteEventListener {
    public static final String BEAN_ID = "PostCommitDeleteEventListener";
    @Autowired
    private SearchEngine searchEngine;
    @Override
    public void onPostDelete(PostDeleteEvent event) {
        if (event == null) return;
        Object eventEntity = event.getEntity();
        if (!(eventEntity instanceof Advert)) return;
        Advert advert = (Advert) eventEntity;
        Long id = advert.getId();
        searchEngine.removeFromIndex(id);
    }
}
...

This is one of the least intrusive solutions. It also ensures that even if you add a new business method that updates the database state, the changes will automatically be reflected in the index. There is no need to call indexing tasks manually.

3 – Spring event model

When you’re stuck with JPA you can use the Spring event model. You use ApplicationEventPublisher to publish CRUD events, then implement ApplicationListener to react to them.
Parameterized types (generics) ensure your listener reacts to one event type only, which can be quite convenient: for example, reacting to Advert events but not to Job events.
That solution is not very resistant to change because if you forget to publish an event nothing will happen. It is equivalent to calling indexing tasks manually, but it is the only one available when using JPA.

Example of ApplicationEventPublisher call:

...
	@Autowired
	private ApplicationEventPublisher	eventPublisher;
	/**
	 * @see org.diveintojee.poc.jbehave.domain.business.Facade#deleteAdvert(java.lang.Long)
	 */
	@Override
	@Transactional(propagation = Propagation.REQUIRED)
	public void deleteAdvert(final Long advertId) {
		Preconditions.checkArgument(advertId != null,
                    "Illegal call to deleteAdvert, advert identifier is required");
		this.baseDao.delete(Advert.class, advertId);
		this.eventPublisher.publishEvent(new PostDeleteAdvertEvent(new Advert(advertId)));
	}
...

Example of event listener:

...
/**
 * @author louis.gueye@gmail.com
 */
@Component
public class PostDeleteAdvertEventListener implements ApplicationListener<PostDeleteAdvertEvent> {
	@Autowired
	private SearchEngine	searchEngine;
	/**
	 * @see org.springframework.context.ApplicationListener#onApplicationEvent(org.springframework.context.ApplicationEvent)
	 */
	@Override
	public void onApplicationEvent(PostDeleteAdvertEvent event) {
		if (event == null) return;
		final Advert entity = event.getSource();
		if (entity == null || entity.getId() == null) return;
		this.searchEngine.removeFromIndex(Advert.class, entity.getId());
	}
}
...
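
The type-based filtering mentioned above can be sketched without Spring. The block below is a simplified, hypothetical publisher (TypedEventBus, AdvertDeleted and JobDeleted are made-up names), not Spring's actual implementation; it only shows that a listener registered for one event type never sees the others.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical event types, one per entity kind.
class AdvertDeleted {
    final long advertId;
    AdvertDeleted(long advertId) { this.advertId = advertId; }
}

class JobDeleted {
    final long jobId;
    JobDeleted(long jobId) { this.jobId = jobId; }
}

// Hypothetical, simplified publisher: each listener is registered for one event type.
class TypedEventBus {
    private static final class Registration<E> {
        final Class<E> type;
        final Consumer<? super E> listener;

        Registration(Class<E> type, Consumer<? super E> listener) {
            this.type = type;
            this.listener = listener;
        }

        void deliverIfMatching(Object event) {
            // Only events of the registered type reach the listener.
            if (type.isInstance(event)) listener.accept(type.cast(event));
        }
    }

    private final List<Registration<?>> registrations = new ArrayList<>();

    <E> void subscribe(Class<E> type, Consumer<? super E> listener) {
        registrations.add(new Registration<>(type, listener));
    }

    void publish(Object event) {
        for (Registration<?> r : registrations) r.deliverIfMatching(event);
    }
}

class TypedDispatchDemo {
    public static void main(String[] args) {
        List<Long> removedFromIndex = new ArrayList<>();
        TypedEventBus bus = new TypedEventBus();
        bus.subscribe(AdvertDeleted.class, e -> removedFromIndex.add(e.advertId));

        bus.publish(new AdvertDeleted(7L));
        bus.publish(new JobDeleted(8L)); // ignored: no listener for this type

        System.out.println(removedFromIndex);
    }
}
```

The weakness discussed above is visible here too: if a business method never calls publish, the list stays empty and the index silently drifts away from the database.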

4 – Elasticsearch river

An Elasticsearch river is a mechanism which pulls data from a datasource (CouchDB, Twitter, Wikipedia, RabbitMQ, RSS) on a regular basis (every 500 ms for example) and updates the index based on what changed since the last refresh. The idea is really nice but there are too few plugins yet. Elasticsearch provides only 4 river plugins, but contributions are more than welcome :).
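
The pull principle itself can be sketched in a few lines of plain Java. This is not how Elasticsearch implements rivers and it uses no Elasticsearch API; ChangeFeed, Change and PollingIndexer are hypothetical names, and the sequence-number feed is an assumption about how the datasource exposes its changes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical change record: every change in the source gets an increasing sequence number.
class Change {
    final long sequence;
    final long id;
    final String title; // null means the document was deleted

    Change(long sequence, long id, String title) {
        this.sequence = sequence;
        this.id = id;
        this.title = title;
    }
}

// Hypothetical datasource exposing its changes as an append-only feed.
class ChangeFeed {
    private final List<Change> changes = new ArrayList<>();

    void record(long id, String title) {
        changes.add(new Change(changes.size() + 1, id, title));
    }

    List<Change> since(long lastSequence) {
        List<Change> result = new ArrayList<>();
        for (Change c : changes) {
            if (c.sequence > lastSequence) result.add(c);
        }
        return result;
    }
}

// Pull-based synchroniser: each poll applies only what changed since the previous poll.
class PollingIndexer {
    final Map<Long, String> index = new HashMap<>();
    private final ChangeFeed feed;
    private long lastSequence = 0;

    PollingIndexer(ChangeFeed feed) {
        this.feed = feed;
    }

    // Returns the number of changes applied during this poll.
    int poll() {
        List<Change> pending = feed.since(lastSequence);
        for (Change c : pending) {
            if (c.title == null) index.remove(c.id);
            else index.put(c.id, c.title);
            lastSequence = c.sequence;
        }
        return pending.size();
    }
}

class RiverSketchDemo {
    public static void main(String[] args) {
        ChangeFeed feed = new ChangeFeed();
        PollingIndexer indexer = new PollingIndexer(feed);

        feed.record(1L, "Bike for sale");
        feed.record(2L, "Flat to rent");
        System.out.println(indexer.poll() + " change(s) applied");

        feed.record(1L, null); // advert 1 deleted in the source
        System.out.println(indexer.poll() + " change(s) applied");
        System.out.println(indexer.index);
    }
}
```

In a real setup poll() would be called on a timer, for instance through a ScheduledExecutorService with a fixed delay. The application code then publishes nothing at all, at the price of an index that lags behind the datasource by up to one polling interval.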

So far we have become familiar with search engine concepts, then had a first contact with Elasticsearch by writing CRUD tests.
We have just discussed several solutions to trigger indexing. Now that we know how to index data, we can finally focus on the search business itself, which is what the next post will try to present.

The source code hasn’t moved, it is still on GitHub. Feel free to explore it.

4 thoughts on “Add search features to your application, try Elasticsearch part 3 : attaching indexation to events”

    1. Hi Sebastian,
      The @Transient annotation indeed prevents any serialization technology to apply but my post is not about what gets serialized or not.
      The post is about finding a mechanism (preferably a simple yet reliable one) to keep in sync the persisted data and the indexed data.
      I hope it answers your question
