Commit 9847166

init README
1 parent 3b88f19 commit 9847166

8 files changed (+1014 / -596 lines)

API.md

Lines changed: 729 additions & 0 deletions
Large diffs are not rendered by default.

README.md

Lines changed: 285 additions & 30 deletions
@@ -11,92 +11,347 @@ Check out the configuration reference at https://huggingface.co/docs/hub/spaces-
1111

1212
# 🧠 Unified Embedding API
1313

14-
> 🧩 Unified API for all your Embedding & Sparse needs — plug and play with any model from Hugging Face or your own fine-tuned versions. This official repository from huggingface space
14+
> 🧩 Unified API for all your Embedding, Sparse & Reranking Models — plug and play with any model from Hugging Face or your own fine-tuned versions.
1515
1616
---
1717

1818
## 🚀 Overview
1919

20-
**Unified Embedding API** is a modular and open-source **RAG-ready API** built for developers who want a simple, unified way to access **dense**, and **sparse** models.
20+
**Unified Embedding API** is a modular and open-source **RAG-ready API** built for developers who want a simple, unified way to access **dense**, **sparse**, and **reranking** models.
2121

2222
It’s designed for **vector search**, **semantic retrieval**, and **AI-powered pipelines**, with models configured in a single `src/config/models.yaml` file.
2323

2424
⚠️ **Note:** This is a development API.
25-
For production deployment, host it on cloud platforms such as **Hugging Face TGI**, **AWS**, or **GCP**.
25+
For production deployment, serve models with **Hugging Face TEI** (Text Embeddings Inference) or host the API on **AWS**, **GCP**, or any cloud provider of your choice.
2626

2727
---
2828

2929
## 🧩 Features
3030

3131
- 🧠 **Unified Interface** — One API to handle dense, sparse, and reranking models.
32-
- ⚙️ **Configurable** — Switch models instantly via `config.yaml`.
32+
- **Batch Processing** — Automatically handles single and batch inputs.
33+
- 🔧 **Flexible Parameters** — Full control via kwargs and options.
3334
- 🔍 **Vector DB Ready** — Easily integrates with FAISS, Chroma, Qdrant, Milvus, etc.
3435
- 📈 **RAG Support** — Perfect base for Retrieval-Augmented Generation systems.
3536
- **Fast & Lightweight** — Powered by FastAPI and optimized with async processing.
36-
- 🧰 **Extendable**Add your own models or pipelines effortlessly.
37+
- 🧰 **Extendable** — Switch models instantly via `src/config/models.yaml` and add your own models or pipelines effortlessly.
3738

3839
---
3940

4041
## 📁 Project Structure
4142

4243
```
43-
4444
unified-embedding-api/
45+
├── src/
46+
│   ├── api/
47+
│   │   ├── dependencies.py
48+
│   │   └── routes/
49+
│   │       ├── embeddings.py    # endpoint sparse & dense
50+
│   │       ├── models.py
51+
│   │       ├── health.py
52+
│   │       └── rerank.py        # endpoint reranking
53+
│   ├── core/
54+
│   │   ├── base.py
55+
│   │   ├── config.py
56+
│   │   ├── exceptions.py
57+
│   │   └── manager.py
58+
│   ├── models/
59+
│   │   ├── embeddings/
60+
│   │   │   ├── dense.py         # dense model
61+
│   │   │   ├── sparse.py        # sparse model
62+
│   │   │   └── rank.py          # reranking model
63+
│   │   └── schemas/
64+
│   │       ├── common.py
65+
│   │       ├── requests.py
66+
│   │       └── responses.py
67+
│   ├── config/
68+
│   │   ├── settings.py
69+
│   │   └── models.yaml          # add/change models here
70+
│   └── utils/
71+
│       ├── logger.py
72+
│       └── validators.py
4573
46-
├── core/
47-
│ ├── embedding.py
48-
│ └── model_manager.py
49-
├── models/
50-
| └──model.py
51-
├── app.py # Entry point (FastAPI server)
52-
|── config.yaml # Model + system configuration
53-
├── Dockerfile
74+
├── app.py
5475
├── requirements.txt
76+
├── LICENSE
77+
├── Dockerfile
5578
└── README.md
56-
5779
```
5880
---
5981
## 🧩 Model Selection
6082

61-
Default configuration is optimized for **CPU 2vCPU / 16GB RAM**. See [MTEB Leaderboard](https://huggingface.co/spaces/mteb/leaderboard) for memory usage reference.
83+
The default configuration is optimized for a **2 vCPU / 16 GB RAM** CPU instance. See the [MTEB Leaderboard](https://huggingface.co/spaces/mteb/leaderboard) for model recommendations and memory usage reference.
84+
85+
**Add More Models:** Edit `src/config/models.yaml`
86+
87+
```yaml
88+
models:
89+
  your-model-name:
90+
    name: "org/model-name"
91+
    type: "embeddings"  # or "sparse-embeddings" or "rerank"
92+
```
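A minimal sketch of how an entry like the one above could be sanity-checked before pushing. The `validate_models` helper is hypothetical and not part of this repository; it assumes the YAML has already been parsed into a dict (for example with `yaml.safe_load`) and only checks the two fields and the three `type` values shown above.

```python
# Allowed values for the `type` field, taken from the YAML comment above.
ALLOWED_TYPES = {"embeddings", "sparse-embeddings", "rerank"}


def validate_models(config: dict) -> list:
    """Return a list of human-readable problems; an empty list means OK.

    Hypothetical helper: `config` is the already-parsed models.yaml content.
    """
    models = config.get("models")
    if not isinstance(models, dict) or not models:
        return ["top-level 'models' mapping is missing or empty"]

    problems = []
    for model_id, entry in models.items():
        if not isinstance(entry, dict):
            problems.append(f"{model_id}: entry must be a mapping")
            continue
        name = entry.get("name")
        if not isinstance(name, str) or not name.strip():
            problems.append(f"{model_id}: 'name' must be a non-empty string")
        if entry.get("type") not in ALLOWED_TYPES:
            problems.append(
                f"{model_id}: 'type' must be one of {sorted(ALLOWED_TYPES)}"
            )
    return problems


# Illustrative input: the second entry uses a type that is not listed above.
example = {
    "models": {
        "your-model-name": {"name": "org/model-name", "type": "embeddings"},
        "broken-model": {"name": "org/other-model", "type": "dense"},
    }
}
for problem in validate_models(example):
    print(problem)
```

Running a check like this locally catches a mistyped `type` before the Space rebuilds.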
6293
6394
⚠️ If you plan to use larger models like `Qwen3-Embedding-8B`, please upgrade your Space hardware.
6495

6596
---
6697

6798
## ☁️ How to Deploy (Free 🚀)
6899

69-
Deploy your **custom Embedding API** on **Hugging Face Spaces** — free, fast, and serverless.
100+
Deploy your **Custom Embedding API** on **Hugging Face Spaces** — free, fast, and serverless.
101+
102+
### **1️⃣ Deploy on Hugging Face Spaces (Free!)**
103+
104+
1. **Duplicate this Space:**
105+
👉 [fahmiaziz/api-embedding](https://huggingface.co/spaces/fahmiaziz/api-embedding)
106+
Click **⋯** (three dots) → **Duplicate this Space**
107+
108+
2. **Add an `HF_TOKEN` environment variable** and make sure your Space is public.
109+
110+
3. **Clone your Space locally:**
111+
Click **⋯** → **Clone repository**
112+
```bash
113+
git clone https://huggingface.co/spaces/YOUR_USERNAME/api-embedding
114+
cd api-embedding
115+
```
116+
117+
4. **Edit `src/config/models.yaml`** to customize models:
118+
```yaml
119+
models:
120+
  your-model:
121+
    name: "org/model-name"
122+
    type: "embeddings"  # or "sparse-embeddings" or "rerank"
123+
```
124+
125+
5. **Commit and push changes:**
126+
```bash
127+
git add src/config/models.yaml
128+
git commit -m "Update models configuration"
129+
git push
130+
```
131+
132+
6. **Access your API:**
133+
Click **⋯** → **Embed this Space** → copy **Direct URL**
134+
```
135+
https://YOUR_USERNAME-api-embedding.hf.space
136+
https://YOUR_USERNAME-api-embedding.hf.space/docs # Interactive docs
137+
```
70138
71-
### 🔧 Steps:
139+
That’s it! You now have a live embedding API endpoint powered by your models.
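As a small sketch, the Direct URL pattern from step 6 can be assembled in Python. The `space_urls` helper is hypothetical; it simply follows the `https://YOUR_USERNAME-api-embedding.hf.space` pattern shown above and assumes Hugging Face does not rewrite the username or Space name, so copy the real Direct URL from the Space page if yours differs.

```python
def space_urls(username: str, space_name: str = "api-embedding") -> dict:
    """Build the base and docs URLs following the pattern shown in step 6.

    Hypothetical helper: no normalization of unusual characters is attempted.
    """
    base = f"https://{username}-{space_name}.hf.space"
    return {"base": base, "docs": f"{base}/docs"}


print(space_urls("YOUR_USERNAME"))
```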
72140
73-
1. **Clone this Space Template:**
74-
👉 [Hugging Face Space — fahmiaziz/api-embedding](https://huggingface.co/spaces/fahmiaziz/api-embedding)
75-
2. **Edit `config.yaml`** to set your own model names and backend preferences.
76-
3. **Push your code** — Spaces will automatically rebuild and host your API.
141+
### **2️⃣ Run Locally (NOT RECOMMENDED)**
77142
78-
That’s it! You now have a live embedding API endpoint powered by your models.
143+
```bash
144+
# Clone repository
145+
git clone https://github.com/fahmiaziz98/unified-embedding-api.git
146+
cd unified-embedding-api
147+
148+
# Create virtual environment
149+
python -m venv venv
150+
source venv/bin/activate
151+
152+
# Install dependencies
153+
pip install -r requirements.txt
154+
155+
# Run server
156+
python app.py
157+
```
158+
159+
API available at: `http://localhost:7860`
160+
161+
### **3️⃣ Run with Docker**
162+
163+
```bash
164+
# Build and run
165+
docker-compose up --build
166+
167+
# Or with Docker only
168+
docker build -t embedding-api .
169+
docker run -p 7860:7860 embedding-api
170+
```
171+
172+
## 📖 Usage Examples
173+
174+
### **Python**
175+
176+
```python
177+
import requests
178+
179+
url = "http://localhost:7860/api/v1/embeddings/embed"
180+
181+
# Single embedding
182+
response = requests.post(url, json={
183+
    "texts": ["What is artificial intelligence?"],
184+
    "model_id": "qwen3-0.6b"
185+
})
186+
print(response.json())
187+
188+
# Batch embeddings
189+
response = requests.post(url, json={
190+
    "texts": [
191+
        "First document",
192+
        "Second document",
193+
        "Third document"
194+
    ],
195+
    "model_id": "qwen3-0.6b",
196+
    "options": {
197+
        "normalize_embeddings": True
198+
    }
199+
})
200+
embeddings = response.json()["embeddings"]
201+
```
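For larger corpora it may help to split the input into several requests. The sketch below is one possible way to do that, not something the API requires: `embed_all`, `chunked`, and the `chunk_size` default are hypothetical, and the only assumption about the response is the `"embeddings"` key used in the example above. The transport is passed in as a function so the helper can be tried without a running server.

```python
from typing import Callable, Iterable, List


def chunked(items: List[str], size: int) -> Iterable[List[str]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def embed_all(
    texts: List[str],
    post_json: Callable[[dict], dict],
    model_id: str = "qwen3-0.6b",
    chunk_size: int = 64,
) -> List[list]:
    """Embed `texts` in several requests and return vectors in input order.

    `post_json` sends one payload to /api/v1/embeddings/embed and returns
    the decoded JSON body.
    """
    vectors: List[list] = []
    for chunk in chunked(texts, chunk_size):
        body = post_json({"texts": chunk, "model_id": model_id})
        vectors.extend(body["embeddings"])  # key as used in the example above
    return vectors


def fake_post(payload: dict) -> dict:
    """Offline stand-in that returns one 2-d vector per input text."""
    return {"embeddings": [[float(len(t)), 0.0] for t in payload["texts"]]}


print(embed_all(["a", "bb", "ccc"], fake_post, chunk_size=2))
```

Against a real server, the transport could be `lambda payload: requests.post(url, json=payload).json()`, reusing `url` from the example above.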
202+
203+
### **cURL**
204+
205+
```bash
206+
# Single embedding (Dense)
207+
curl -X POST "http://localhost:7860/api/v1/embeddings/embed" \
208+
  -H "Content-Type: application/json" \
209+
  -d '{
210+
    "texts": ["Hello world"],
211+
    "prompt": "add instructions here",
212+
    "model_id": "qwen3-0.6b"
213+
  }'
214+
215+
# Batch embeddings (Sparse)
216+
curl -X POST "http://localhost:7860/api/v1/embeddings/embed" \
217+
  -H "Content-Type: application/json" \
218+
  -d '{
219+
    "texts": ["First doc", "Second doc", "Third doc"],
220+
    "model_id": "splade-pp-v2"
221+
  }'
222+
223+
# Reranking
224+
curl -X POST "http://localhost:7860/api/v1/rerank" \
225+
  -H "Content-Type: application/json" \
226+
  -d '{
227+
    "documents": [
228+
      "Python is a popular language for data science due to its extensive libraries.",
229+
      "R is widely used in statistical computing and data analysis.",
230+
      "Java is a versatile language used in various applications, including data science.",
231+
      "SQL is essential for managing and querying relational databases.",
232+
      "Julia is a high-performance language gaining popularity for numerical computing and data science."
233+
    ],
234+
    "model_id": "bge-v2-m3",
235+
    "query": "Python best programming languages for data science",
236+
    "top_k": 3
237+
  }'
238+
239+
# Query embedding with options
240+
curl -X POST "http://localhost:7860/api/v1/embeddings/query" \
241+
  -H "Content-Type: application/json" \
242+
  -d '{
243+
    "texts": ["What is machine learning?"],
244+
    "model_id": "qwen3-0.6b",
245+
    "options": {
246+
      "normalize_embeddings": true,
247+
      "batch_size": 32
248+
    }
249+
  }'
250+
```
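The reranking call above can also be prepared from Python using only the standard library. This is a sketch under stated assumptions: `build_rerank_request` is a hypothetical helper, the field names (`query`, `documents`, `model_id`, `top_k`) are taken from the cURL example, and the response format is not assumed here because it is not shown in this README.

```python
import json
import urllib.request


def build_rerank_request(base_url, query, documents, model_id="bge-v2-m3", top_k=3):
    """Build (but do not send) a POST request for /api/v1/rerank."""
    payload = {
        "query": query,
        "documents": list(documents),
        "model_id": model_id,
        "top_k": top_k,
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/rerank",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_rerank_request(
    "http://localhost:7860",
    "best programming languages for data science",
    [
        "Python is a popular language for data science.",
        "SQL is essential for querying relational databases.",
    ],
)
print(request.get_method(), request.full_url)

# To send it against a running server:
#   with urllib.request.urlopen(request) as response:
#       print(json.load(response))
```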
251+
252+
### **JavaScript/TypeScript**
253+
254+
```typescript
255+
const url = "http://localhost:7860/api/v1/embeddings/embed";
256+
257+
const response = await fetch(url, {
258+
  method: "POST",
258+
  headers: {
259+
    "Content-Type": "application/json",
260+
  },
261+
  body: JSON.stringify({
262+
    texts: ["Hello world"],
263+
    model_id: "qwen3-0.6b",
264+
  }),
266+
});
267+
268+
const data = await response.json();
269+
console.log(data.embedding);
270+
```
79271

80-
📘 **Tutorial Reference:**
272+
---
273+
274+
## 📊 API Endpoints
275+
276+
| Endpoint | Method | Description |
277+
|----------|--------|-------------|
278+
| `/api/v1/embeddings/embed` | POST | Generate document embeddings (single/batch) |
279+
| `/api/v1/embeddings/query` | POST | Generate query embeddings (single/batch) |
280+
| `/api/v1/rerank` | POST | Rerank documents based on a query |
281+
| `/api/v1/models` | GET | List available models |
282+
| `/api/v1/models/{model_id}` | GET | Get model information |
283+
| `/health` | GET | Health check |
284+
| `/` | GET | API information |
285+
| `/docs` | GET | Interactive API documentation |
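
The table above can be mirrored in a small client-side lookup so that paths are written once. This is only an illustrative sketch: the `ENDPOINTS` mapping copies the methods and paths from the table, while the short names (`embed`, `model_info`, and so on) and the `resolve` helper are hypothetical.

```python
from typing import Dict, Tuple

# (method, path) pairs copied from the endpoint table above.
ENDPOINTS: Dict[str, Tuple[str, str]] = {
    "embed": ("POST", "/api/v1/embeddings/embed"),
    "query": ("POST", "/api/v1/embeddings/query"),
    "rerank": ("POST", "/api/v1/rerank"),
    "models": ("GET", "/api/v1/models"),
    "model_info": ("GET", "/api/v1/models/{model_id}"),
    "health": ("GET", "/health"),
    "root": ("GET", "/"),
    "docs": ("GET", "/docs"),
}


def resolve(base_url: str, name: str, **path_params: str) -> Tuple[str, str]:
    """Return (method, full URL) for a named endpoint, filling path params."""
    method, path = ENDPOINTS[name]
    return method, base_url.rstrip("/") + path.format(**path_params)


print(resolve("http://localhost:7860", "model_info", model_id="qwen3-0.6b"))
```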
286+
287+
288+
## 🤝 Contributing
289+
290+
Contributions are welcome! Please:
291+
292+
1. Fork the repository
293+
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
294+
3. Commit your changes (`git commit -m 'Add amazing feature'`)
295+
4. Push to the branch (`git push origin feature/amazing-feature`)
296+
5. Open a Pull Request
297+
298+
**Development Setup:**
299+
300+
```bash
301+
git clone https://github.com/fahmiaziz/unified-embedding-api.git
302+
cd unified-embedding-api
303+
pip install -r requirements-dev.txt
304+
pre-commit install # (optional)
305+
```
306+
307+
---
308+
309+
## 📚 Resources
310+
311+
- [API Documentation](API.md)
312+
- [Sentence Transformers](https://www.sbert.net/)
313+
- [FastAPI Docs](https://fastapi.tiangolo.com/)
314+
- [MTEB Leaderboard](https://huggingface.co/spaces/mteb/leaderboard)
315+
- [Hugging Face Spaces](https://huggingface.co/docs/hub/spaces)
81316
- [Deploy Applications on Hugging Face Spaces (Official Guide)](https://huggingface.co/blog/HemanthSai7/deploy-applications-on-huggingface-spaces)
82317
- [How-to-Sync-Hugging-Face-Spaces-with-a-GitHub-Repository by Ruslanmv](https://github.com/ruslanmv/How-to-Sync-Hugging-Face-Spaces-with-a-GitHub-Repository?tab=readme-ov-file)
318+
- [Duplicate & Clone space to local machine](https://huggingface.co/docs/hub/spaces-overview#duplicating-a-space)
319+
---
83320

84321
---
85322

323+
## 📝 License
324+
325+
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
326+
327+
---
86328

87-
## 🧑‍💻 Contributing
329+
## 🙏 Acknowledgments
88330

89-
Contributions are welcome!
90-
Please open an issue or submit a pull request to discuss changes.
331+
- **Sentence Transformers** for the embedding models
332+
- **FastAPI** for the excellent web framework
333+
- **Hugging Face** for model hosting and Spaces
334+
- **Open Source Community** for inspiration and support
91335

92336
---
93337

94-
## ⚠️ License
338+
## 📞 Support
95339

96-
MIT License © 2025
97-
Developed with ❤️ by the Open-Source Community.
340+
- **Issues:** [GitHub Issues](https://github.com/fahmiaziz/unified-embedding-api/issues)
341+
- **Discussions:** [GitHub Discussions](https://github.com/fahmiaziz/unified-embedding-api/discussions)
342+
- **Hugging Face Space:** [fahmiaziz/api-embedding](https://huggingface.co/spaces/fahmiaziz/api-embedding)
98343

99344
---
100345

101346
> ✨ “Unify your embeddings. Simplify your AI stack.”
102347
348+
<div align="center">
349+
350+
**⭐ Star this repo if you find it useful!**
351+
352+
Made with ❤️ by the Open-Source Community
353+
354+
</div>
355+
356+
357+

core/__init__.py

Lines changed: 0 additions & 3 deletions
This file was deleted.
