Setup an Open Source Search Engine with Sphinx

Written on November 25, 2006 in Tutorials - 6 Comments

Sphinx is a full-text search engine, distributed under GPL version 2. It is a standalone search engine, meant to provide fast, size-efficient and relevant full-text search functions to other applications. Sphinx was specially designed to integrate well with SQL databases and scripting languages. Currently, the built-in data source drivers support fetching data either via a direct connection to MySQL or PostgreSQL, or from a pipe in a custom XML format.
What is great about Sphinx is that it comes with a pure-PHP searchd client API, which makes integrating it with PHP applications much easier. Together with its many other features, this makes Sphinx a very interesting open source alternative to SQL full-text search.


Key features of Sphinx:

  • high indexing speed (up to 10 MB/sec on modern CPUs)
  • high search speed (avg query is under 0.1 sec on 2-4 GB text collections)
  • high scalability (up to 100 GB of text, up to 100 M documents on a single CPU)
  • supports distributed searching (since v.0.9.6)
  • supports MySQL natively (MyISAM and InnoDB tables are both supported)
  • supports phrase searching
  • supports phrase proximity ranking, providing good relevance
  • supports English and Russian stemming
  • supports any number of document fields (weights can be changed on the fly)
  • supports document groups
  • supports stopwords
  • supports different search modes (“match all”, “match phrase” and “match any” as of v.0.9.5)
  • generic XML interface which greatly simplifies custom integration
  • pure-PHP (i.e. NO module compiling etc.) searchd client API
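
To make the three search modes listed above more concrete, here is a minimal Python sketch of what “match all”, “match any” and “match phrase” mean on a tiny in-memory document set. This is not Sphinx code and does not use the Sphinx API: the function names, the tokenizer and the sample documents are all assumptions made up for illustration, and a real engine would work on a prebuilt index rather than scanning the text.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def match_all(query, doc):
    """True if every query word occurs somewhere in the document."""
    words = set(tokenize(doc))
    return all(term in words for term in tokenize(query))

def match_any(query, doc):
    """True if at least one query word occurs in the document."""
    words = set(tokenize(doc))
    return any(term in words for term in tokenize(query))

def match_phrase(query, doc):
    """True if the query words occur consecutively and in order."""
    q, d = tokenize(query), tokenize(doc)
    if not q:
        return False
    return any(d[i:i + len(q)] == q for i in range(len(d) - len(q) + 1))

# Hypothetical sample documents, keyed by document ID.
docs = {
    1: "Sphinx is a full-text search engine",
    2: "A search for the best engine",
    3: "Indexing text with MySQL",
}

def search(query, mode):
    """Return the sorted IDs of all documents matching the query in the given mode."""
    return sorted(doc_id for doc_id, text in docs.items() if mode(query, text))

print(search("search engine", match_all))     # [1, 2]
print(search("search engine", match_phrase))  # [1]
print(search("engine mysql", match_any))      # [1, 2, 3]
```

Documents 1 and 2 both contain the words “search” and “engine”, so both satisfy “match all”, but only document 1 contains them next to each other, so only it satisfies “match phrase”.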


The current Sphinx distribution includes the following software:

  • indexer: a utility to create fulltext indices;
  • search: a simple (test) utility to query fulltext indices from command line;
  • searchd: a daemon to search through fulltext indices from external software (such as Web scripts);
  • sphinxapi: a set of API libraries for popular Web scripting languages (currently, PHP).

Download
You can download Sphinx from http://www.sphinxsearch.com/downloads.html
Documentation is available at http://www.sphinxsearch.com/doc.html#installation
Sphinx is open source under the GPL, but a commercial license is also available for embedded use.

6 Comments on "Setup an Open Source Search Engine with Sphinx"

  1. JackB February 4, 2007 at 9:21 pm

    You know what? Correct me if I’m wrong, but this will never be something that could be compared to a custom solution…

  2. ituloy angsulong February 18, 2007 at 9:54 am

    I can’t think of any use for it since advanced search engines already satisfy all my needs.

  3. Hatem February 18, 2007 at 10:09 am

    This is not necessarily meant to compete with Google; there are lots of uses for such solutions, in Web marketing for example.

  4. Huan October 29, 2008 at 7:22 pm

    Sphinx is being used in many big projects with success. Sure it’s not a direct competitor of Google but think about more specific databases that need a search function. Take a look at a list of sites that use it: http://www.sphinxsearch.com/powered.html

  5. halfer December 4, 2008 at 4:07 pm

    In response to a previous comment – you’d be surprised what a generic custom search system can achieve (e.g. Zend_Lucene, and this one also). I dare say you might also be surprised how complex a search system can be – making a “custom solution” represent far more work than it is often worth.

  6. Ivan Jovanovic December 31, 2008 at 2:31 pm

    Sphinx is a great piece of software. I use it on a website that has millions of records to index and search within. Its performance is great, and so is the flexibility to set the weights of the fields dynamically. I’d also mention the possibility of setting sorting strategies externally.
    Also, support from the developer is of great help when things just won’t fix.
