PostgreSQL Full Text Search capabilities are good enough, this guide will allow you to incorporate Full Text Search for your JPA entities using this.
Unaccent
Install unaccent extension in postgresl database.
postgres@pemehue:~$ psql mydatabase psql (9.6.5, servidor 9.5.9) Digite «help» para obtener ayuda. postgres=# create extension unaccent;
Create class that implementes a Custom Dialect
You must pass this class to hibernate with properties or xml, in
play frameworks 1.4.x it’s jpa.dialect=util.jpa.PgFullTextDialect
in application.conf
package util.jpa; import org.hibernate.dialect.PostgreSQL82Dialect; import org.hibernate.dialect.function.StandardSQLFunction; import org.hibernate.type.DoubleType; import org.hibernate.type.ObjectType; public class PgFullTextDialect extends PostgreSQL82Dialect { public PgFullTextDialect() { registerFunction("fts", new PgFullTextFunction()); registerFunction("ts_rank", new StandardSQLFunction("ts_rank", DoubleType.INSTANCE)); registerFunction("to_tsquery", new StandardSQLFunction("to_tsquery", ObjectType.INSTANCE)); }
Create a class that implements the function “fts” mentioned above.
package util.jpa; import java.util.List; import org.hibernate.QueryException; import org.hibernate.dialect.function.SQLFunction; import org.hibernate.engine.spi.Mapping; import org.hibernate.engine.spi.SessionFactoryImplementor; import org.hibernate.type.BooleanType; import org.hibernate.type.Type; public class PgFullTextFunction implements SQLFunction { @SuppressWarnings("unchecked") @Override public String render(Type type, List args, SessionFactoryImplementor sessionFactoryImplementor) throws QueryException { if (args.size() < 2) { throw new IllegalArgumentException("The function must be passed 2 arguments"); } String field = (String) args.get(0); String value = (String) args.get(1); value = value.replaceAll("\\s+", " & ").replaceAll("'", ""); value = org.apache.commons.lang3.StringUtils.stripAccents(value); boolean isLike = false; if (args.size() == 3) isLike = Boolean.valueOf(args.get(2).toString()); String fragment = ""; fragment += "to_tsvector(unaccent(" + field + ")) @@ "; if (isLike) { fragment += "to_tsquery('%" + value + "%')"; } else { fragment += "to_tsquery('" + value + "')"; } System.err.println("fragment:" + fragment); return fragment; } @Override public Type getReturnType(Type columnType, Mapping mapping) throws QueryException { return new BooleanType(); } @Override public boolean hasArguments() { return true; } @Override public boolean hasParenthesesIfNoArguments() { return false; } }
Using full text search samples
In this case using class User with property name
... @Test public void buscaConAcentos() { String jpql = "select c from User c where fts(name, 'Pérez López') = true "; Query q = JPA.em().createQuery(jpql); List list = q.getResultList(); assertThat(list.size(), is(1)); } @Test public void buscaSinAcentos() { String jpql = "select c from User c where fts(name, 'Perez Lopez') = true "; Query q = JPA.em().createQuery(jpql); List list = q.getResultList(); assertThat(list.size(), is(1)); } ...
Keep the text to index in a field
Sometimes is recommended to have a field with all the text to index, and keep it updated with JPA update an persist callbacks.
... @PrePersist @PreUpdate public void preUpdate() { String s = this.getNombre() + " " + nombreProfesionOVacio(); this.search = Normalizer.normalize(s.toLowerCase().replaceAll("\\s+", " "), Normalizer.Form.NFD); } ...
Maintain tsvector precalculated and indexed
Instead of calculating tsvector on the fly, you can add a tsvector field on the
table. This field can be updated with a trigger, this is the sql code.
ALTER TABLE ident.user ADD COLUMN fulltext tsvector; CREATE OR REPLACE FUNCTION public.user_vector_update() RETURNS trigger AS $BODY$ BEGIN IF TG_OP = 'INSERT' OR TG_OP = 'UPDATE' THEN new.fulltext = to_tsvector('pg_catalog.spanish', COALESCE(unaccent(NEW.search), '')); END IF; RETURN NEW; END $BODY$ LANGUAGE plpgsql VOLATILE COST 100; CREATE INDEX full_text_index_user ON ident.user USING GIN (fulltext); UPDATE ident.user SET fulltext = to_tsvector('pg_catalog.spanish', unaccent(search)); CREATE TRIGGER tsvectorupdate_user BEFORE INSERT OR UPDATE ON ident.user FOR EACH ROW EXECUTE PROCEDURE public.user_vector_update();
In this case the function PgFullTextFunction must be redefined to use this precalculated field called fulltext and
not to tsvector it every time.
Replicación Postgrsql 9.6 on Ubuntu 16.04
Run this script in the replica server, be careful this will drop the database:
postgres@europa:~$ cat replica.sh #!/bin/bash echo Stopping PostgreSQL sudo service postgresql stop echo Cleaning up old cluster directory sudo -u postgres rm -rf /var/lib/postgresql/9.6/main echo Starting base backup as replication sudo -u postgres pg_basebackup -h my.master.server -D /var/lib/postgresql/9.6/main -U replication -v -P echo Writing recovery.conf file sudo -u postgres bash -c "cat > /var/lib/postgresql/9.6/main/recovery.conf <<- _EOF1_ standby_mode = 'on' primary_conninfo = 'host=my.master.server port=5432 user=replication password=passreplicauser sslmode=require' trigger_file = '/tmp/postgresql.trigger' _EOF1_ " echo Starting PostgreSQL sudo service postgresql start
Add this lines to the postgresql.conf of the replica server
wal_level = hot_standby max_wal_senders = 3 wal_keep_segments = 8 hot_standby = on default_statistics_target = 50 maintenance_work_mem = 960MB constraint_exclusion = on checkpoint_completion_target = 0.9 effective_cache_size = 11GB work_mem = 40MB wal_buffers = 8MB shared_buffers = 3840MB max_connections = 600