cachevector/fuzzybunny

FuzzyBunny

A fuzzy search tool written in C++ with Python bindings

Overview

FuzzyBunny is a lightweight, high-performance Python library for fuzzy string matching and ranking. It is implemented in C++ for speed and exposes a Pythonic API via pybind11. It supports several scoring algorithms (Levenshtein, Jaccard, and token sort) as well as partial (substring) matching.

Features

  • Fast C++ Core: Optimized string matching algorithms.
  • Multiple Scorers:
    • levenshtein: Standard edit distance ratio.
    • jaccard: Set-based similarity.
    • token_sort: Sorts tokens before comparing (good for "Apple Banana" vs "Banana Apple").
  • Ranking: Efficiently rank a list of candidates against a query.
  • Partial Matching: Support for substring matching via mode='partial'.
  • Unicode Support: Correctly handles UTF-8 input.
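
To make the three scorers concrete, here is a minimal pure-Python sketch of what each one typically computes. It is not FuzzyBunny's C++ implementation: the function names are hypothetical, the Levenshtein ratio assumes normalization by the longer string's length (which agrees with the ~0.57 kitten/sitting score in the Usage section), and the Jaccard sketch assumes whitespace-separated tokens, whereas the library may compare characters or n-grams instead.

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance, keeping one row at a time."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def levenshtein_ratio(a: str, b: str) -> float:
    """Similarity in [0, 1]; assumes normalization by the longer length."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - levenshtein_distance(a, b) / longest


def jaccard_similarity(a: str, b: str) -> float:
    """|A ∩ B| / |A ∪ B| over whitespace tokens (tokenization is an assumption)."""
    tokens_a, tokens_b = set(a.split()), set(b.split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def token_sort_ratio(a: str, b: str) -> float:
    """Sort the tokens of each string, then compare with the Levenshtein ratio."""
    sorted_a = " ".join(sorted(a.split()))
    sorted_b = " ".join(sorted(b.split()))
    return levenshtein_ratio(sorted_a, sorted_b)


print(round(levenshtein_ratio("kitten", "sitting"), 2))
print(token_sort_ratio("Apple Banana", "Banana Apple"))
```

Sorting tokens before comparing is what makes word order irrelevant: "Apple Banana" and "Banana Apple" both become "Apple Banana" before the edit distance is computed.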

Installation

Prerequisites

  • Python 3.8+
  • C++17 compatible compiler (GCC, Clang, MSVC)

Using uv (Recommended)

uv pip install .

Using pip

pip install .
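
After installing, a quick way to confirm that the extension module is importable from the active environment is sketched below. It uses only the standard library; the helper name `is_installed` is hypothetical and not part of FuzzyBunny.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


if is_installed("fuzzybunny"):
    print("fuzzybunny is importable")
else:
    print("fuzzybunny not found; check that the build succeeded and the right environment is active")
```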

Usage

import fuzzybunny

# Basic Levenshtein Ratio
score = fuzzybunny.levenshtein("kitten", "sitting")
print(f"Score: {score}")  # ~0.57

# Partial Matching
# "apple" is a perfect substring of "apple pie"
score = fuzzybunny.partial_ratio("apple", "apple pie")
print(f"Partial Score: {score}")  # 1.0

# Ranking Candidates
candidates = ["apple pie", "banana bread", "cherry tart", "apple crisp"]
results = fuzzybunny.rank(
    query="apple", 
    candidates=candidates, 
    scorer="levenshtein", 
    mode="partial", 
    top_n=2
)

for candidate, score in results:
    print(f"{candidate}: {score}")
# Output:
# apple pie: 1.0
# apple crisp: 1.0
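
For readers curious how partial matching and ranking can produce the output above, the following pure-Python sketch reproduces it under stated assumptions. It is not the library's code: `partial_ratio_sketch` and `rank_sketch` are hypothetical names, partial matching is assumed to slide a window the length of the shorter string across the longer one and keep the best score, and ties are assumed to keep the original candidate order.

```python
from typing import List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using a single rolling row."""
    row = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        new_row = [i]
        for j, ch_b in enumerate(b, start=1):
            new_row.append(min(row[j] + 1, new_row[j - 1] + 1,
                               row[j - 1] + (ch_a != ch_b)))
        row = new_row
    return row[-1]


def ratio(a: str, b: str) -> float:
    """Similarity in [0, 1], normalized by the longer length (an assumption)."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def partial_ratio_sketch(query: str, text: str) -> float:
    """Best ratio between the shorter string and any same-length window of the longer."""
    shorter, longer = (query, text) if len(query) <= len(text) else (text, query)
    width = len(shorter)
    return max(
        ratio(shorter, longer[start:start + width])
        for start in range(len(longer) - width + 1)
    )


def rank_sketch(query: str, candidates: List[str], top_n: int) -> List[Tuple[str, float]]:
    """Score every candidate and return the top_n (candidate, score) pairs."""
    scored = [(c, partial_ratio_sketch(query, c)) for c in candidates]
    # list.sort is stable, so equal scores keep their original relative order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


candidates = ["apple pie", "banana bread", "cherry tart", "apple crisp"]
for candidate, score in rank_sketch("apple", candidates, top_n=2):
    print(f"{candidate}: {score}")
```

Both "apple pie" and "apple crisp" contain "apple" as an exact substring, so one window matches with zero edits and each scores 1.0.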

Development

  1. Setup Environment:

    uv venv
    source .venv/bin/activate
  2. Install in Editable Mode:

    uv pip install -e .
  3. Run Tests:

    pytest

License

This project is licensed under the MIT License.
