[ACCEPTED]-Postgres accent insensitive LIKE search in Rails 3.1 on Heroku-unaccent

Accepted answer
Score: 33

Poor man's solution

If you are able to create a function, you 12 can use this one. I compiled the list starting 11 here and added to it over time. It is pretty 10 complete. You may even want to remove some 9 characters:

CREATE OR REPLACE FUNCTION lower_unaccent(text)
  RETURNS text AS
$func$
SELECT lower(translate($1
     , '¹²³áàâãäåāăąÀÁÂÃÄÅĀĂĄÆćčç©ĆČÇĐÐèéêёëēĕėęěÈÊËЁĒĔĖĘĚ€ğĞıìíîïìĩīĭÌÍÎÏЇÌĨĪĬłŁńňñŃŇÑòóôõöōŏőøÒÓÔÕÖŌŎŐØŒř®ŘšşșߊŞȘùúûüũūŭůÙÚÛÜŨŪŬŮýÿÝŸžżźŽŻŹ'
     , '123aaaaaaaaaaaaaaaaaaacccccccddeeeeeeeeeeeeeeeeeeeeggiiiiiiiiiiiiiiiiiillnnnnnnooooooooooooooooooorrrsssssssuuuuuuuuuuuuuuuuyyyyzzzzzz'
     ));
$func$ LANGUAGE sql IMMUTABLE;

Your query should work like that:

find(:all, :conditions => ["lower_unaccent(name) LIKE ?", "%#{search.downcase}%"])

For 8 left-anchored searches, you can utilize 7 an index on the function for very fast results:

CREATE INDEX tbl_name_lower_unaccent_idx
  ON fest (lower_unaccent(name) text_pattern_ops);

For 6 queries like:

SELECT * FROM tbl WHERE (lower_unaccent(name)) ~~ 'bob%'

Proper solution

In PostgreSQL 9.1+, with the necessary privileges, you 5 can just:

CREATE EXTENSION unaccent;

which provides a function unaccent(), doing 4 what you need (except for lower(), just use that 3 additionally if needed). Read the manual about this extension.
Also 2 available for PostgreSQL 9.0 but CREATE EXTENSION syntax is new in 9.1.

More 1 about unaccent and indexes:

Score: 23

For those like me who are having trouble 5 on add the unaccent extension for PostgreSQL and 4 get it working with the Rails application, here 3 is the migration you need to create:

class AddUnaccentExtension < ActiveRecord::Migration
  def up
    execute "create extension unaccent"
  end

  def down
    execute "drop extension unaccent"
  end
end

And, of 2 course, after rake db:migrate you will be able to use the 1 unaccent function in your queries: unaccent(column) similar to ... or unaccent(lower(column)) ...

Score: 3

First of all, you install postgresql-contrib. Then 17 you connect to your DB and execute:

CREATE EXTENSION unaccent;

to enable 16 the extension for your DB.

Depending on your 15 language, you might need to create a new 14 rule file (in my case greek.rules, located in /usr/share/postgresql/9.1/tsearch_data), or 13 just append to the existing unaccent.rules (quite straightforward).

In 12 case you create your own .rules file, you need 11 to make it default:

ALTER TEXT SEARCH DICTIONARY unaccent (RULES='greek');

This change is persistent, so 10 you need not redo it.

The next step would 9 be to add a method to a model to make use 8 of this function.

One simple solution would 7 be defining a function in the model. For 6 instance:

class Model < ActiveRecord::Base
    [...]
    def self.unaccent(column,value)
        a=self.where('unaccent(?) LIKE ?', column, "%value%")
        a
    end
    [...]
end

Then, I can simply invoke:

Model.unaccent("name","text")

Invoking 5 the same command without the model definition 4 would be as plain as:

Model.where('unaccent(name) LIKE ?', "%text%"

Note: The above example 3 has been tested and works for postgres9.1, Rails 2 4.0, Ruby 2.0.

UPDATE INFO
Fixed potential SQLi backdoor 1 thanks to @Henrik N's feedback

Score: 2

There are 2 questions related to your search 6 on the StackExchange: https://serverfault.com/questions/266373/postgresql-accent-diacritic-insensitive-search

But as you are on 5 Heroku, I doubt this is a good match (unless 4 you have a dedicated database plan).

There 3 is also this one on SO: Removing accents/diacritics from string while preserving other special chars.

But this assumes 2 that your data is stored without any accent.

I 1 hope it will point you in the right direction.

Score: 0

Assuming Foo is the model you are searching 3 against and name is the column. Combining Postgres 2 translate and ActiveSupport's transliterate. You can do something 1 like:

Foo.where(
  "translate(
    LOWER(name),
    'âãäåāăąÁÂÃÄÅĀĂĄèééêëēĕėęěĒĔĖĘĚìíîïìĩīĭÌÍÎÏÌĨĪĬóôõöōŏőÒÓÔÕÖŌŎŐùúûüũūŭůÙÚÛÜŨŪŬŮ',
    'aaaaaaaaaaaaaaaeeeeeeeeeeeeeeeiiiiiiiiiiiiiiiiooooooooooooooouuuuuuuuuuuuuuuu'
  )
  LIKE ?", "%#{ActiveSupport::Inflector.transliterate("%qué%").downcase}%"
)

More Related questions