How to weight in full-text search?

MySQL Fulltext searching

  • Is it possible to implement full-text substring table searches in MySQL? I'm relatively new to MySQL, and have been developing a website that uses its FULLTEXT search function. This works okay, but the issue is that the builtin search only searches whole words, and doesn't perform a substring search. For example, if "Metafilter" was in the index, a search for "filter" won't return it. I've checked the MySQL manual to no avail. Is this possible to do this using MySQL, or will I have to switch DBMSes? This is pretty important to the functionality of the site.

  • Answer:

    That's not what fulltext indexing is for, and not the kind of query that can be easily optimized with an index, whatever the DBMS is. You obviously need to use WHERE text LIKE '%filter%', but that's going to be slow on a large table. My intuition is there's a better way to arrange your data in the database that wouldn't require searching this way at all. Can you give us some detail on why you need to search for partial words?

esch at Ask.Metafilter.Com Visit the source

Was this solution helpful to you?

Other answers

It's my understanding, cillit bang, that MySQL 4 has been in some way optimized for fulltext searching. It makes its own index or something, I suspect. I just made the 3->4 jump a week or so ago, and haven't looked into it yet. Perhaps somebody is familiar with this improvement, should it exist?

waldo

I believe you can use wildcards in certain ways in full text searches.. such as 'filter*'. At least, someone mentions it in the http://dev.mysql.com/doc/mysql/en/fulltext-search.html However, you can't use prefix wildcards, only suffix. Lots of good tips near the bottom of the comments too.

wackybrit

That's not what fulltext indexing is for, and not the kind of query that can be easily optimized with an index, whatever the DBMS is. Only partly true. You could easily do partial matches on the starts of words using a fulltext index. This is why, I would guess, MySQL fulltext searching only supports wildcards as suffixes, since you just match for the start of the word using the same indexing method as always. Of course, when it comes to matching the middles or ends of words, yeah, you're right.

wackybrit

Are you using something like this? SELECT id, MATCH (title,body) AGAINST ('Tutorial') FROM articles; You can get mysql to http://dev.mysql.com/doc/mysql/en/fulltext-search.html...

ph00dz

MySQL full text indexing is pretty good, but if you find it doesn't do what you need then consider using http://lucene.apache.org/java/docs/ to do your full text searching.

Nelson

We are currently using exactly what ph00dz said. (I'm working on this with esch, and it's actually really my question.) It works fine and getting the relevance is great and all, but I'd really like to be able to get partial matches (e.g. "meta" -> "Metafilter"). Using LIKE is an idea, but I have a feeling it'll be slower than using FULLTEXT, and I don't get relevance results with that. (And yes, we are using MySQL 4. We're hosting through Dreamhost.)

neckro23

You should do a MATCH(...) AGAINST (...) IN BOOLEAN MODE. That allows you to use * as a wildcard, albeit only at the end of the word -- ie you can search for meta*, but not *filter. Check out the MySQL documentation http://dev.mysql.com/doc/mysql/en/fulltext-boolean.html for more info.

littleme

Oops. Sorry; that should be AGAINST(... IN BOOLEAN MODE).

littleme

Yeah, they clearly missed when I said it, but you said it better ;-)

wackybrit

Related Q & A:

Just Added Q & A:

Find solution

For every problem there is a solution! Proved by Solucija.

  • Got an issue and looking for advice?

  • Ask Solucija to search every corner of the Web for help.

  • Get workable solutions and helpful tips in a moment.

Just ask Solucija about an issue you face and immediately get a list of ready solutions, answers and tips from other Internet users. We always provide the most suitable and complete answer to your question at the top, along with a few good alternatives below.