Sphinx search introduction

507110_binocular

After reading my introduction to full text search or you have read article somewhere else and decided to go with full text search in your next project, but you still confuse what full text search engine to use. One implementation of full text search engine is Sphinx. And I’ll give you a short course on how you installing Sphinx for your full text search engine.

Sphinx is a full-text search engine, distributed under GPL version 2. It is not only fast in searching but it is also fast in indexing your data. Currently, Sphinx API has binding in PHP, Python, Perl, Ruby and Java.

Sphinx features

  • high indexing speed (upto 10 MB/sec on modern CPUs);
  • high search speed (avg query is under 0.1 sec on 2-4 GB text collections);
  • high scalability (upto 100 GB of text, upto 100 M documents on a single CPU);
  • provides good relevance ranking through combination of phrase proximity ranking and statistical (BM25) ranking;
  • provides distributed searching capabilities;
  • provides document exceprts generation;
  • provides searching from within MySQL through pluggable storage engine;
  • supports boolean, phrase, and word proximity queries;
  • supports multiple full-text fields per document (upto 32 by default);
  • supports multiple additional attributes per document (ie. groups, timestamps, etc);
  • supports stopwords;
  • supports both single-byte encodings and UTF-8;
  • supports English stemming, Russian stemming, and Soundex for morphology;
  • supports MySQL natively (MyISAM and InnoDB tables are both supported);
  • supports PostgreSQL natively.

There you go, so fire up your terminal or console, and let’s get thing done. Continue reading »

Full text search introduction

What is full text search?

Full text search is searching method that will use the entire words and find all match search against our database.

And according to Wikipedia

In text retrieval, full text search refers to a technique for searching a computer-stored document or database. In a full text search, the search engine examines all of the words in every stored document as it tries to match search words supplied by the user

In our traditional search method mostly we only search our criteria for against 1 column in our table.

Let’s say we have 1 million data about books, and we have title, author, and summary.

And now, we’re trying to create simple search against them. If I want to search book with title ‘PHP and MySQL Web Development’

forms01

We fill the form and click search button, and ta da, we got the result. But wait … Continue reading »