Nixiesearch: Lucene Fine-Tuned Search

Discover Nixiesearch, a new Lucene-based hybrid search engine. Learn how it offers traditional search features, cloud-native deployment, and democratized fine-tuning for your data.

Overview

Building a new search engine in 2024 sounds like a stupid idea: a new vector search startup is created every week, so how can you be different from so many competitors?

But in practice, you quickly discover that putting vectors into an HNSW index is not the most challenging part of building a search application that your customers will use and like. Relevance tuning, multi-field search, facets, filters, autocomplete suggestions - the RAG-vector search crowd is still discovering all these “novel” things.

In this talk, we’re going to introduce Nixiesearch, an open-source hybrid search engine focused on solving typical search problems:

Based on Lucene. You get filters, facets, autocomplete, and complex queries out of the box with decent performance.
Cloud-native and serverless. Can use S3-compatible object storage for index persistence, so it can scale to zero.
Fine-tunable. You can fine-tune the underlying embedding model on your relevance labels (if you have them) or on LLM-generated synthetic labels.

Nixiesearch is still in an early development stage, so your opinion on “how to do a search engine right” is really important.

Links

https://github.com/nixiesearch/nixiesearch
Hybrid search engine built on Lucene, S3, and local ONNX embedding inference.

Tech stack