How does the Google search engine work?

What full-text search library supports the most modern (Google-like) search engine query syntax?

  • Lucene's query syntax seems to have been modeled on that of AltaVista, a very popular search engine at the time. Since then, most users have moved on to Google and Bing, whose query syntax is somewhat different. The difference springs in part from different approaches to things such as stopwords, so it is not a merely syntactic difference that can be translated mechanically. That modern query syntax, and the "how search works" model behind it, is what users have come to expect from web search. It is troublesome and perhaps naïve to try to teach users otherwise just to search a particular site (stuck-in-the-'90s Craigslist excepted). As a result, site search built on these libraries is often disappointing and seems to work poorly for most users (even though experts may be able to get good results), especially when compared with the Google and Bing site search products. Yet nearly every full-text search library still seems to be built on or modeled after Lucene and uses its query syntax. Has Lucene made significant changes that bring it closer to the expectations of modern web search users? Are there newer full-text search libraries, newer query syntax and indexing options for Lucene, or other options for a site that won't require users to search like it's 1999?

  • Answer:

    I believe there is a wrong premise at the core of this question. Lucene is a search library whose focus is on search implementation rather than on the search user interface. It has a default query parser, but that parser is very customizable. Most of Lucene's behavior, including its querying options, is customizable. Take stop words, the concrete example given in the question. You can index a text field with its stop words, without them, or use a middle ground: commongrams [1]. Solr [2] is a search server built on top of Lucene. It also has myriad customization options, while being much easier to set up than bare Lucene. Please see [3] for a specific example of customizing Lucene.

    [1] http://www.hathitrust.org/blogs/large-scale-search/slow-queries-and-common-words-part-2
    [2] http://lucene.apache.org/solr/
    [3] http://stackoverflow.com/questions/1977815/full-text-search-like-google/1996460#1996460
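
To make the commongrams "middle ground" more concrete, here is a minimal Python sketch of the idea. It is not Lucene's API (Lucene ships its own CommonGramsFilter in Java); the function name `common_grams`, the `COMMON_WORDS` list and the `_` separator are all assumptions made for illustration. Every token is kept, and a bigram is added wherever a common word sits next to another token, so phrases made mostly of stop words stay searchable.

```python
# Illustrative stop-word list; a real deployment would derive one from its own corpus.
COMMON_WORDS = frozenset({"the", "of", "to", "a", "in", "be", "or", "not"})


def common_grams(text, common=COMMON_WORDS, sep="_"):
    """Return index terms for `text`: every token, plus a bigram for each
    adjacent pair in which at least one token is a common word."""
    tokens = text.lower().split()  # naive whitespace tokenizer (assumption)
    terms = []
    for i, token in enumerate(tokens):
        terms.append(token)
        if i + 1 < len(tokens):
            following = tokens[i + 1]
            if token in common or following in common:
                terms.append(token + sep + following)
    return terms


if __name__ == "__main__":
    print(common_grams("the quick brown fox"))
    print(common_grams("To be or not to be"))
```

With this scheme a phrase query such as "to be or not to be" can be matched against the rarer bigram terms instead of the very frequent single words, which is the trade-off described in [1].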

Yuval Feinstein at Quora
