[ACCEPTED]-Postgres accent insensitive LIKE search in Rails 3.1 on Heroku-unaccent
Poor man's solution
If you are able to create a function, you 12 can use this one. I compiled the list starting 11 here and added to it over time. It is pretty 10 complete. You may even want to remove some 9 characters:
CREATE OR REPLACE FUNCTION lower_unaccent(text)
RETURNS text AS
$func$
SELECT lower(translate($1
, '¹²³áàâãäåāăąÀÁÂÃÄÅĀĂĄÆćčç©ĆČÇĐÐèéêёëēĕėęěÈÊËЁĒĔĖĘĚ€ğĞıìíîïìĩīĭÌÍÎÏЇÌĨĪĬłŁńňñŃŇÑòóôõöōŏőøÒÓÔÕÖŌŎŐØŒř®ŘšşșߊŞȘùúûüũūŭůÙÚÛÜŨŪŬŮýÿÝŸžżźŽŻŹ'
, '123aaaaaaaaaaaaaaaaaaacccccccddeeeeeeeeeeeeeeeeeeeeggiiiiiiiiiiiiiiiiiillnnnnnnooooooooooooooooooorrrsssssssuuuuuuuuuuuuuuuuyyyyzzzzzz'
));
$func$ LANGUAGE sql IMMUTABLE;
Your query should work like that:
find(:all, :conditions => ["lower_unaccent(name) LIKE ?", "%#{search.downcase}%"])
For 8 left-anchored searches, you can utilize 7 an index on the function for very fast results:
CREATE INDEX tbl_name_lower_unaccent_idx
ON fest (lower_unaccent(name) text_pattern_ops);
For 6 queries like:
SELECT * FROM tbl WHERE (lower_unaccent(name)) ~~ 'bob%'
Proper solution
In PostgreSQL 9.1+, with the necessary privileges, you 5 can just:
CREATE EXTENSION unaccent;
which provides a function unaccent()
, doing 4 what you need (except for lower()
, just use that 3 additionally if needed). Read the manual about this extension.
Also 2 available for PostgreSQL 9.0 but CREATE EXTENSION
syntax is new in 9.1.
More 1 about unaccent and indexes:
For those like me who are having trouble 5 on add the unaccent
extension for PostgreSQL and 4 get it working with the Rails application, here 3 is the migration you need to create:
class AddUnaccentExtension < ActiveRecord::Migration
def up
execute "create extension unaccent"
end
def down
execute "drop extension unaccent"
end
end
And, of 2 course, after rake db:migrate
you will be able to use the 1 unaccent
function in your queries: unaccent(column) similar to ...
or unaccent(lower(column)) ...
First of all, you install postgresql-contrib. Then 17 you connect to your DB and execute:
CREATE EXTENSION unaccent;
to enable 16 the extension for your DB.
Depending on your 15 language, you might need to create a new 14 rule file (in my case greek.rules
, located in /usr/share/postgresql/9.1/tsearch_data
), or 13 just append to the existing unaccent.rules
(quite straightforward).
In 12 case you create your own .rules
file, you need 11 to make it default:
ALTER TEXT SEARCH DICTIONARY unaccent (RULES='greek');
This change is persistent, so 10 you need not redo it.
The next step would 9 be to add a method to a model to make use 8 of this function.
One simple solution would 7 be defining a function in the model. For 6 instance:
class Model < ActiveRecord::Base
[...]
def self.unaccent(column,value)
a=self.where('unaccent(?) LIKE ?', column, "%value%")
a
end
[...]
end
Then, I can simply invoke:
Model.unaccent("name","text")
Invoking 5 the same command without the model definition 4 would be as plain as:
Model.where('unaccent(name) LIKE ?', "%text%"
Note: The above example 3 has been tested and works for postgres9.1, Rails 2 4.0, Ruby 2.0.
UPDATE INFO
Fixed potential SQLi backdoor 1 thanks to @Henrik N's feedback
There are 2 questions related to your search 6 on the StackExchange: https://serverfault.com/questions/266373/postgresql-accent-diacritic-insensitive-search
But as you are on 5 Heroku, I doubt this is a good match (unless 4 you have a dedicated database plan).
There 3 is also this one on SO: Removing accents/diacritics from string while preserving other special chars.
But this assumes 2 that your data is stored without any accent.
I 1 hope it will point you in the right direction.
Assuming Foo
is the model you are searching 3 against and name
is the column. Combining Postgres 2 translate and ActiveSupport's transliterate. You can do something 1 like:
Foo.where(
"translate(
LOWER(name),
'âãäåāăąÁÂÃÄÅĀĂĄèééêëēĕėęěĒĔĖĘĚìíîïìĩīĭÌÍÎÏÌĨĪĬóôõöōŏőÒÓÔÕÖŌŎŐùúûüũūŭůÙÚÛÜŨŪŬŮ',
'aaaaaaaaaaaaaaaeeeeeeeeeeeeeeeiiiiiiiiiiiiiiiiooooooooooooooouuuuuuuuuuuuuuuu'
)
LIKE ?", "%#{ActiveSupport::Inflector.transliterate("%qué%").downcase}%"
)
More Related questions
We use cookies to improve the performance of the site. By staying on our site, you agree to the terms of use of cookies.