@@ -11,92 +11,347 @@ Check out the configuration reference at https://huggingface.co/docs/hub/spaces-

 # 🧠 Unified Embedding API

-> 🧩 Unified API for all your Embedding & Sparse needs — plug and play with any model from Hugging Face or your own fine-tuned versions. This official repository from huggingface space
+> 🧩 Unified API for all your Embedding, Sparse & Reranking Models — plug and play with any model from Hugging Face or your own fine-tuned versions.

 ---

 ## 🚀 Overview

-**Unified Embedding API** is a modular and open-source **RAG-ready API** built for developers who want a simple, unified way to access **dense**, and **sparse** models.
+**Unified Embedding API** is a modular and open-source **RAG-ready API** built for developers who want a simple, unified way to access **dense**, **sparse**, and **reranking** models.

 It’s designed for **vector search**, **semantic retrieval**, and **AI-powered pipelines** — all controlled from a single `config.yaml` file.

 ⚠️ **Note:** This is a development API.
-For production deployment, host it on cloud platforms such as **Hugging Face TGI**, **AWS**, or **GCP**.
+For production deployment, host it on cloud platforms such as **Hugging Face TEI**, **AWS**, **GCP**, or any cloud provider of your choice.

 ---

 ## 🧩 Features

 - 🧠 **Unified Interface** — One API to handle dense, sparse, and reranking models.
-- ⚙️ **Configurable** — Switch models instantly via `config.yaml`.
+- 🔧 **Flexible Parameters** — Full control via kwargs and options
 - 🔍 **Vector DB Ready** — Easily integrates with FAISS, Chroma, Qdrant, Milvus, etc.
 - 📈 **RAG Support** — Perfect base for Retrieval-Augmented Generation systems.
 - ⚡ **Fast & Lightweight** — Powered by FastAPI and optimized with async processing.
-- 🧰 **Extendable** — Add your own models or pipelines effortlessly.
+- 🧰 **Extendable** — Switch models instantly via `config.yaml` and add your own models or pipelines effortlessly.

 ---
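
As a rough illustration of what the feature list means by dense and sparse models feeding a vector search, the sketch below scores toy vectors the way a retrieval step typically would: cosine similarity for dense embeddings and a weighted token overlap for sparse ones. The helper names (`cosine`, `sparse_dot`, `top_k`) and the vectors are hypothetical; they stand in for whatever the API returns and are not taken from the repository.

```python
import math


def cosine(a, b):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def sparse_dot(query, doc):
    """Dot product of two sparse vectors stored as {token_id: weight}."""
    return sum(weight * doc.get(token, 0.0) for token, weight in query.items())


def top_k(query_vec, doc_vecs, k=2):
    """Indices of the k documents most similar to the query (dense)."""
    scored = [(cosine(query_vec, vec), idx) for idx, vec in enumerate(doc_vecs)]
    scored.sort(reverse=True)
    return [idx for _, idx in scored[:k]]


# Toy data: assumed placeholders for embeddings returned by a dense model.
docs = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
query = [1.0, 0.0]

print(top_k(query, docs, k=2))                              # [0, 2]
print(sparse_dot({1: 0.5, 7: 2.0}, {7: 1.5, 9: 1.0}))       # 3.0
```

In a full pipeline, a reranking model would then rescore only the few candidates returned by `top_k`, which is why retrieval and reranking are usually separate stages.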

 ## 📁 Project Structure

 ```
-
 unified-embedding-api/
+├── src/
+│   ├── api/
+│   │   ├── dependencies.py
+│   │   └── routes/
+│   │       ├── embeddings.py    # endpoint sparse & dense
+│   │       ├── models.py
+│   │       ├── health.py
+│   │       └── rerank.py        # endpoint reranking
+│   ├── core/
+│   │   ├── base.py
+│   │   ├── config.py
+│   │   ├── exceptions.py
+│   │   └── manager.py
+│   ├── models/
+│   │   ├── embeddings/
+│   │   │   ├── dense.py         # dense model
+│   │   │   ├── sparse.py        # sparse model
+│   │   │   └── rank.py          # reranking model
+│   │   └── schemas/
+│   │       ├── common.py
+│   │       ├── requests.py
+│   │       └── responses.py
+│   ├── config/
+│   │   ├── settings.py
+│   │   └── models.yaml          # add/change models here
+│   └── utils/
+│       ├── logger.py
+│       └── validators.py
 │
-├── core/
-│   ├── embedding.py
-│   └── model_manager.py
-├── models/
-|   └──model.py
-├── app.py                 # Entry point (FastAPI server)
-|── config.yaml            # Model + system configuration
-├── Dockerfile
+├── app.py
 ├── requirements.txt
+├── LICENSE
+├── Dockerfile
 └── README.md
-
 ```
 ---
 ## 🧩 Model Selection

-Default configuration is optimized for **CPU 2vCPU / 16GB RAM**. See [MTEB Leaderboard](https://huggingface.co/spaces/mteb/leaderboard) for memory usage reference.
+Default configuration is optimized for **CPU 2vCPU / 16GB RAM**. See [MTEB Leaderboard](https://huggingface.co/spaces/mteb/leaderboard) for model recommendations and memory usage reference.
+
+**Add More Models:** Edit `src/config/models.yaml`
+
+```yaml
+models:
+  your-model-name:
+    name: "org/model-name"
+    type: "embeddings"  # or "sparse-embeddings" or "rerank"
+```

 ⚠️ If you plan to use larger models like `Qwen2-embedding-8B`, please upgrade your Space.

 ---
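
To make the `models.yaml` entry shape in this hunk concrete, here is a minimal sketch of how such entries could be checked once the file has been parsed. The function name `validate_models` and its error messages are hypothetical and not taken from the repository; the three allowed `type` values come from the comment in the YAML snippet. It works on a plain dict, which is what a YAML parser would typically hand back.

```python
# Hypothetical validator for entries shaped like src/config/models.yaml.
# A sketch under stated assumptions, not the repository's actual code.

ALLOWED_TYPES = {"embeddings", "sparse-embeddings", "rerank"}


def validate_models(config: dict) -> dict:
    """Return {model_id: type}, or raise ValueError on a malformed entry."""
    models = config.get("models")
    if not isinstance(models, dict) or not models:
        raise ValueError("config must contain a non-empty 'models' mapping")

    result = {}
    for model_id, entry in models.items():
        if not isinstance(entry, dict):
            raise ValueError(f"{model_id}: entry must be a mapping")
        name = entry.get("name")
        if not isinstance(name, str) or not name:
            raise ValueError(f"{model_id}: 'name' must be a non-empty string")
        model_type = entry.get("type")
        if model_type not in ALLOWED_TYPES:
            raise ValueError(f"{model_id}: unknown type {model_type!r}")
        result[model_id] = model_type
    return result


# Same shape as the YAML example above, already parsed into a dict.
example = {
    "models": {
        "your-model-name": {"name": "org/model-name", "type": "embeddings"},
    }
}
print(validate_models(example))
```

Failing early on an unknown `type` keeps a typo in the config from surfacing later as a confusing model-loading error.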

 ## ☁️ How to Deploy (Free 🚀)

-Deploy your **custom Embedding API** on **Hugging Face Spaces** — free, fast, and serverless.
+Deploy your **Custom Embedding API** on **Hugging Face Spaces** — free, fast, and serverless.
 - [Hugging Face Spaces](https://huggingface.co/docs/hub/spaces)
 - [Deploy Applications on Hugging Face Spaces (Official Guide)](https://huggingface.co/blog/HemanthSai7/deploy-applications-on-huggingface-spaces)
 - [How-to-Sync-Hugging-Face-Spaces-with-a-GitHub-Repository by Ruslanmv](https://github.com/ruslanmv/How-to-Sync-Hugging-Face-Spaces-with-a-GitHub-Repository?tab=readme-ov-file)
+- [Duplicate & Clone space to local machine](https://huggingface.co/docs/hub/spaces-overview#duplicating-a-space)
+---

 ---

+## 📝 License
+
+This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
+
+---

-## 🧑‍💻 Contributing
+## 🙏 Acknowledgments

-Contributions are welcome!
-Please open an issue or submit a pull request to discuss changes.
+- **Sentence Transformers** for the embedding models
+- **FastAPI** for the excellent web framework
+- **Hugging Face** for model hosting and Spaces
+- **Open Source Community** for inspiration and support