Pick Up SphinxSearch: Advanced Configurations For Indexing

The following are notes on several advanced settings in the "index" section of sphinx.conf.

1. morphology

By default, SphinxSearch indexes and searches for the exact word you type. Words that share a meaning but have different forms are not considered a match.

Examples of such pairs are "dog" and "dogs", "mouse" and "mice", "run" and "ran", "operation" and "operational".

SphinxSearch can apply morphology processing, namely stemming and lemmatization, during its indexing phase so that different forms of a word map to the same indexed term.

To turn on this feature, add the following line to the index section of sphinx.conf.

# file: sphinx.conf

# ... ...

index {
    ......
    
    morphology = stem_en, lemmatize_en
    
    ......
}
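
To make the idea of morphology processing more concrete, here is a minimal Python sketch. It is not SphinxSearch's stemmer or lemmatizer: the `BASE_FORMS` lookup table and the `normalize`, `build_index`, and `search` helpers are hypothetical names invented for illustration. It only shows the general principle that applying the same normalization at indexing time and at query time lets different word forms match.

```python
# Toy stand-in for a real stemmer/lemmatizer (an assumption for illustration only).
BASE_FORMS = {
    "dogs": "dog",
    "mice": "mouse",
    "ran": "run",
    "operational": "operation",
}


def normalize(word):
    """Map a word to its base form, or return it unchanged if unknown."""
    word = word.lower()
    return BASE_FORMS.get(word, word)


def build_index(docs):
    """Build an inverted index keyed by the normalized form of each token."""
    index = {}
    for doc_id, text in docs.items():
        for token in text.split():
            index.setdefault(normalize(token), set()).add(doc_id)
    return index


def search(index, word):
    """Normalize the query word the same way, then look it up."""
    return sorted(index.get(normalize(word), set()))


docs = {1: "two dogs ran", 2: "a dog", 3: "three mice"}
index = build_index(docs)

print(search(index, "dog"))    # [1, 2]
print(search(index, "run"))    # [1]
print(search(index, "mouse"))  # [3]
```

Without the normalization step, the query "dog" would only find document 2, which is the default exact-word behaviour described above.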

2. Enable Wildcard Search

If you want wildcard (partial-word) searching, such as the `*` operator in SphinxSearch's extended query syntax, you should enable either prefix indexing or infix indexing.

Infix indexing means SphinxSearch indexes every substring of a word that is at least min_infix_len characters long. For instance, indexing the keyword "cake" with min_infix_len=2 results in indexing the infixes "ca", "ak", "ke", "cak", and "ake" along with the original word itself.
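
The enumeration of infixes can be sketched in a few lines of Python. This is only an illustration of which substrings qualify, not how SphinxSearch stores them internally; the function name `infixes` is made up for this example.

```python
def infixes(word, min_infix_len):
    """List the distinct substrings of `word` that are at least
    `min_infix_len` characters long, excluding the word itself."""
    seen = set()
    result = []
    for length in range(min_infix_len, len(word)):
        for start in range(len(word) - length + 1):
            piece = word[start:start + length]
            if piece not in seen:  # skip repeats such as "an" in "banana"
                seen.add(piece)
                result.append(piece)
    return result


print(infixes("cake", 2))  # ['ca', 'ak', 'ke', 'cak', 'ake']
print(infixes("cake", 3))  # ['cak', 'ake']
```

Raising min_infix_len shrinks this list quickly, which is why a larger value keeps the index smaller at the cost of not matching very short fragments.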

Prefix indexing means SphinxSearch indexes every prefix of a keyword that is at least min_prefix_len characters long, in addition to the keyword itself. For example, indexing the keyword "cake" with min_prefix_len=2 results in indexing "ca", "cak", and "cake".
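
As a rough Python sketch of the same idea (the function name `indexed_prefix_forms` is hypothetical, and this says nothing about SphinxSearch's internal storage), the indexed forms under prefix indexing can be listed like this:

```python
def indexed_prefix_forms(word, min_prefix_len):
    """List the prefixes of `word` that are at least `min_prefix_len`
    characters long, followed by the keyword itself."""
    forms = [word[:n] for n in range(min_prefix_len, len(word))]
    forms.append(word)  # the keyword itself is indexed regardless of length
    return forms


print(indexed_prefix_forms("cake", 2))  # ['ca', 'cak', 'cake']
print(indexed_prefix_forms("cake", 3))  # ['cak', 'cake']
```

Note that a fragment from the middle of the word, such as "ak", never appears in this list, so a search for it can only match when infix indexing is used instead.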

These two indexing methods produce different sets of index entries, so you cannot enable both infix and prefix indexing at the same time. Choose one or the other and enable it by adding the following setting:

# file: sphinx.conf

# ... ...

index {
    ......
    
    min_infix_len = 3
    
    # OR
    # min_prefix_len = 3
    
    ......
}

After changing sphinx.conf, don't forget to reindex the data or restart the SphinxSearch service. The settings described above are applied at indexing time, so they need a reindex before they take effect.

$ sudo indexer -c /etc/sphinxsearch/sphinx.conf --rotate --all
# OR
$ sudo service sphinxsearch restart