Accent-insensitive fulltext indexing with MySQL

by Bruno Pedro

For the last couple of days I've been trying, without success, to get accent-insensitive matching out of MySQL's full-text indexing.

As far as I know, when you're using accent-insensitive strings the case of the accented characters shouldn't matter. For example, café and CAFÉ (coffee, in Portuguese) should be considered the same.

Well, it just doesn't work that way. I'm using the utf8 charset and the utf8_unicode_ci collation. I created a fulltext index on two columns, and searches return different results depending on the case of the same accented character.

Does anybody have any clues? I'm about to create two lower-case columns specifically for indexing. I really don't like this solution, but I think it's the only way to make it work as intended.
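To make the lower-case column idea concrete, here is a minimal sketch in Python of how the values for such shadow columns might be produced before they are written to the table. The function name normalize_for_index and its strip_accents flag are made up for illustration, and the sketch assumes the application normalizes the text itself rather than relying on the collation.

```python
import unicodedata


def normalize_for_index(text, strip_accents=False):
    """Return a lower-cased copy of text for a shadow search column.

    Hypothetical helper: the name and the flag are illustrative only.
    """
    # Compose first, so an "e" followed by a combining accent becomes
    # the same single character as a precomposed "é".
    text = unicodedata.normalize("NFC", text).lower()
    if strip_accents:
        # Decompose, then drop the combining marks (the accents).
        decomposed = unicodedata.normalize("NFD", text)
        text = "".join(
            ch for ch in decomposed if not unicodedata.combining(ch)
        )
    return text
```

The same function would have to be applied to the search terms as well as to the stored text, otherwise the two sides would no longer compare equal.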

3 Comments

Dominic Mitchell
2006-06-20 08:37:11
In the past, I've found that the best way to deal with this is to have a second copy of the data, with all accents stripped off. It takes a lot of space, but it does work well.
Janos
2006-08-08 10:15:26
Hell-o


Add this to your query: CONVERT( _utf8 '$text' USING latin1 )


Bye
JanOS

Guzmán Brasó
2008-08-04 09:50:55
This is not working for me with MySQL 5 on a table with charset latin1 and collation latin1_swedish_ci.


Any ideas?