BM25

BM25 (Best Matching 25) is a probabilistic ranking function used in information retrieval to score and rank documents based on their relevance to a given search query. It builds upon the TF-IDF (Term Frequency-Inverse Document Frequency) model by incorporating document length normalization and term saturation to improve accuracy, particularly for full-text search in search engines and databases. The algorithm calculates a relevance score for each document by considering term frequency, inverse document frequency, and document length relative to the average document length in the collection.
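
To make the three ingredients concrete, here is a minimal sketch of one common formulation of the BM25 score in Python. It is an illustration under stated assumptions, not a reference implementation: the function name `bm25_scores` and the sample documents are hypothetical, documents are assumed to be pre-tokenized lists of words in a non-empty collection, the parameter values k1 = 1.5 and b = 0.75 are typical choices rather than required ones, and the smoothed IDF used here is only one of several variants in circulation.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document in `docs` against the tokenized `query`.

    k1 controls how quickly repeated terms saturate; b controls how strongly
    document length is normalized (0 = no normalization, 1 = full).
    Assumes `docs` is a non-empty list of token lists.
    """
    n_docs = len(docs)
    avgdl = sum(len(d) for d in docs) / n_docs  # average document length
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))

    scores = []
    for d in docs:
        tf = Counter(d)  # term frequencies within this document
        # Longer-than-average documents get a larger penalty in the denominator.
        length_norm = 1 - b + b * len(d) / avgdl
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed inverse document frequency (always positive).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Saturating term-frequency component.
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * length_norm)
        scores.append(score)
    return scores


# Hypothetical sample collection for illustration only.
docs = [
    "the quick brown fox".split(),
    "the lazy dog sleeps all day in the sun".split(),
    "quick quick quick fox".split(),
]
scores = bm25_scores("quick fox".split(), docs)
```

In this sample, the second document contains neither query term and scores zero, while the third document outranks the first because it repeats "quick". Thanks to saturation, tripling the term frequency yields well under three times the contribution for that term.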

Also known as: Best Matching 25, Okapi BM25, BM25 ranking algorithm, BM25 scoring. BM25F is a related variant that extends BM25 to documents with multiple weighted fields.

Why learn BM25?

Developers should learn BM25 when building or tuning search systems, such as search engines, recommendation systems, or full-text database queries. It is a robust, widely adopted relevance-ranking method that outperforms simpler models like TF-IDF in many real-world scenarios. It matters most in full-text search tools such as Elasticsearch and Apache Lucene, where large collections with widely varying document lengths and term distributions must still be ranked accurately.
