A fuzzy search tool written in C++ with Python bindings
FuzzyBunny is a lightweight, high-performance Python library for fuzzy string matching and ranking. It is implemented in C++ for speed and exposes a Pythonic API via pybind11. It supports several scoring algorithms, including Levenshtein, Jaccard, and Token Sort, along with partial matching.
- Fast C++ Core: Optimized string matching algorithms.
- Multiple Scorers:
  - levenshtein: Standard edit distance ratio.
  - jaccard: Set-based similarity.
  - token_sort: Sorts tokens before comparing (good for "Apple Banana" vs "Banana Apple").
- Ranking: Efficiently rank a list of candidates against a query.
- Partial Matching: Support for substring matching via mode='partial'.
- Unicode Support: Correctly handles UTF-8 input.
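To make the three scorers concrete, here is a minimal pure-Python sketch of what each one typically computes. It is not FuzzyBunny's C++ implementation: the function names, the whitespace tokenization, the lowercasing, and the `1 - distance / max(len)` normalization are all assumptions made for illustration. That normalization was chosen because it gives roughly 0.57 for "kitten" vs "sitting".

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance over Unicode code points."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def levenshtein_ratio(a: str, b: str) -> float:
    """Assumed normalization: 1 - distance / length of the longer string."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein_distance(a, b) / longest


def jaccard_ratio(a: str, b: str) -> float:
    """Assumed tokenization: lowercase, whitespace-separated words."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


def token_sort_ratio(a: str, b: str) -> float:
    """Sort the tokens of each string, then compare with the edit-distance ratio."""
    sorted_a = " ".join(sorted(a.lower().split()))
    sorted_b = " ".join(sorted(b.lower().split()))
    return levenshtein_ratio(sorted_a, sorted_b)
```

With these definitions, `token_sort_ratio("Apple Banana", "Banana Apple")` is 1.0 because both strings sort to the same token sequence, while a plain edit-distance ratio on the unsorted strings would be much lower.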
- Python 3.8+
- C++17 compatible compiler (GCC, Clang, MSVC)
uv pip install .
pip install .

import fuzzybunny
# Basic Levenshtein Ratio
score = fuzzybunny.levenshtein("kitten", "sitting")
print(f"Score: {score}") # ~0.57
# Partial Matching
# "apple" is a perfect substring of "apple pie"
score = fuzzybunny.partial_ratio("apple", "apple pie")
print(f"Partial Score: {score}") # 1.0
# Ranking Candidates
candidates = ["apple pie", "banana bread", "cherry tart", "apple crisp"]
results = fuzzybunny.rank(
    query="apple",
    candidates=candidates,
    scorer="levenshtein",
    mode="partial",
    top_n=2
)
for candidate, score in results:
    print(f"{candidate}: {score}")
# Output:
# apple pie: 1.0
# apple crisp: 1.0
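For readers curious how partial matching and ranking might fit together, the following pure-Python sketch reproduces the behaviour shown above. It is only an illustration under stated assumptions, not the library's actual C++ code: the sliding-window strategy, the helper names, and the tie-breaking by input order are all hypothetical.

```python
from typing import Iterable, List, Optional, Tuple


def levenshtein_ratio(a: str, b: str) -> float:
    """Assumed scorer: 1 - edit_distance / length of the longer string."""
    if not a and not b:
        return 1.0
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,
                current[j - 1] + 1,
                previous[j - 1] + (ch_a != ch_b),
            ))
        previous = current
    return 1.0 - previous[-1] / max(len(a), len(b))


def partial_ratio(query: str, text: str) -> float:
    """Best score of the shorter string against every same-length
    window of the longer string (a simple sliding-window approach)."""
    short, long_ = (query, text) if len(query) <= len(text) else (text, query)
    if not short:
        return 1.0 if not long_ else 0.0
    window = len(short)
    return max(
        levenshtein_ratio(short, long_[i:i + window])
        for i in range(len(long_) - window + 1)
    )


def rank(query: str, candidates: Iterable[str],
         top_n: Optional[int] = None) -> List[Tuple[str, float]]:
    """Score every candidate and return the best top_n, highest first.
    sorted() is stable, so ties keep their input order."""
    scored = [(candidate, partial_ratio(query, candidate)) for candidate in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]


candidates = ["apple pie", "banana bread", "cherry tart", "apple crisp"]
results = rank("apple", candidates, top_n=2)
```

Here both "apple pie" and "apple crisp" contain "apple" as an exact substring, so each scores 1.0 and they are returned in input order. A full sort is used for clarity; for very large candidate lists, a heap-based top-n selection would avoid sorting everything.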
- Setup Environment:
  uv venv
  source .venv/bin/activate
- Install in Editable Mode:
  uv pip install -e .
- Run Tests:
  pytest
This project is licensed under the MIT License.
