GraphRAG-Local-Bridge is a deployment and adaptation toolkit for Microsoft GraphRAG. It enables seamless integration with local LLMs (e.g., SenseNova, Qwen) and embedding models (e.g., BGE-M3) by providing protocol translation and intelligent JSON error correction.
- Intelligent JSON Repairing: Automatically corrects illegal control characters and formatting errors in JSON outputs from local LLMs, preventing indexing failures (see the short sketch after this list).
- Protocol Bridging: Translates OpenAI-style requests into custom API formats required by local inference engines.
- BGE-M3 Optimization: Includes a dedicated proxy to make HuggingFace Text-Embeddings-Inference (TEI) compatible with GraphRAG.
- Long-Context Ready: Pre-configured for models with 64K+ context windows to maximize knowledge discovery.
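
To make the JSON-repair idea concrete, here is a minimal, hypothetical sketch; the actual cleaning rules live in bridge_server.py and may differ:

import json
import re

def repair_json_text(text: str) -> str:
    # Collapse raw newlines/tabs to spaces and drop other ASCII control
    # characters, which json.loads rejects as "Invalid control character".
    text = re.sub(r"[\r\n\t]+", " ", text)
    return re.sub(r"[\x00-\x1f]", "", text)

raw = '{"title": "Node A\ndescribes Node B"}'   # bare newline inside a JSON string
print(json.loads(repair_json_text(raw)))        # parses: {'title': 'Node A describes Node B'}
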
- OS: Ubuntu 22.04 or later
- Python: 3.11.5 (Recommended)
- GraphRAG Version: 2.7.0
- Docker: Required for running the embedding service.
# Install core libraries
pip install graphrag==2.7.0 fastapi uvicorn httpx modelscope

mkdir -p ./my_graphrag/input
# Place your .txt or .csv files into the input directory
# cp /your/source/data/*.txt ./my_graphrag/input/

Download the model using ModelScope:
cd ./my_graphrag
modelscope download BAAI/bge-m3 --cache_dir ./bge-m3-model

Start the TEI container (adjust device ID as needed):
docker run -d --gpus "device=0" -p 8001:80 \
--name bge-m3 \
-v $(pwd)/bge-m3-model/BAAI/bge-m3:/data \
--security-opt seccomp=unconfined \
--pull always ghcr.io/huggingface/text-embeddings-inference:1.5 \
--model-id /data \
--dtype float16 \
--port 80
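
Before wiring TEI into GraphRAG, you can optionally confirm the container is serving. The check below uses TEI's native /embed route on the mapped port 8001 (an assumption about the stock TEI API; adjust the URL and port to your deployment):

import httpx

# Quick sanity check against the TEI container started above.
resp = httpx.post("http://localhost:8001/embed",
                  json={"inputs": "GraphRAG smoke test"},
                  timeout=30)
resp.raise_for_status()
print(f"Embedding dimension: {len(resp.json()[0])}")  # bge-m3 should report 1024
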
The bridge services are designed to handle specific API schemas. If your backend differs, you may need to modify the proxy scripts.

The current bridge_server.py is tailored for backends (such as TGI or a custom vLLM deployment) that accept the following format:
# Example of the raw backend call the bridge makes:
curl -X POST http://10.119.70.11:8088/generate \
-H "Content-Type: application/json" \
-d '{
"inputs": "<|im_start|>user\nWhat is a Knowledge Graph?<|im_end|>\n<|im_start|>assistant\n",
"parameters": {
"max_new_tokens": 2048,
"temperature": 0.3,
"stop": ["<|im_end|>"],
"details": true
}
}'
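
For orientation, here is a minimal sketch of the kind of translation bridge_server.py performs, assuming the /generate schema above and a ChatML prompt wrapper; the field names, defaults, and cleaning rules in the real script may differ:

import re
import time

import httpx
from fastapi import FastAPI, Request

BACKEND_URL = "http://10.119.70.11:8088/generate"  # your local inference endpoint
app = FastAPI()

def to_prompt(messages: list[dict]) -> str:
    # Wrap OpenAI-style messages in the ChatML template the backend expects.
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>" for m in messages]
    return "\n".join(parts) + "\n<|im_start|>assistant\n"

def clean_json_text(text: str) -> str:
    # Illustrative cleanup: strip ASCII control characters (except \n)
    # that commonly trigger "Invalid control character" in GraphRAG.
    return re.sub(r"[\x00-\x09\x0b-\x1f]", "", text)

@app.post("/v1/chat/completions")
async def chat_completions(request: Request):
    body = await request.json()
    payload = {
        "inputs": to_prompt(body["messages"]),
        "parameters": {
            "max_new_tokens": body.get("max_tokens", 2048),
            "temperature": body.get("temperature", 0.3),
            "stop": ["<|im_end|>"],
            "details": True,
        },
    }
    async with httpx.AsyncClient(timeout=600) as client:
        backend_resp = await client.post(BACKEND_URL, json=payload)
        backend_resp.raise_for_status()
    # TGI-style response: {"generated_text": "...", "details": {...}}
    text = clean_json_text(backend_resp.json()["generated_text"])
    # Return an OpenAI-shaped response so GraphRAG can consume it unchanged.
    return {
        "id": f"chatcmpl-{int(time.time())}",
        "object": "chat.completion",
        "model": body.get("model", "local-llm"),
        "choices": [{
            "index": 0,
            "message": {"role": "assistant", "content": text},
            "finish_reason": "stop",
        }],
        "usage": {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0},
    }

# Run with: uvicorn bridge_sketch:app --host 0.0.0.0 --port 8900

If your backend already speaks the OpenAI protocol, most of this collapses to a pass-through plus the clean_json_text step.
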
The fix_tei_proxy.py ensures GraphRAG can communicate with the TEI service by forcing the encoding_format to float.

# Example of the call handled by the embedding proxy (port 8102):
curl -X POST http://localhost:8102/v1/embeddings \
-H "Content-Type: application/json" \
-d '{
"input": "GraphRAG is a powerful RAG technology",
"model": "bge-m3"
}'
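
As a rough illustration of what the embedding proxy does, assuming TEI also exposes an OpenAI-style /v1/embeddings route on port 8001 (the real fix_tei_proxy.py may differ in detail):

import httpx
from fastapi import FastAPI, Request

TEI_URL = "http://localhost:8001/v1/embeddings"  # assumed OpenAI-style TEI route
app = FastAPI()

@app.post("/v1/embeddings")
async def embeddings(request: Request):
    body = await request.json()
    # GraphRAG's OpenAI client may ask for base64-encoded vectors;
    # force plain float arrays so both sides agree.
    body["encoding_format"] = "float"
    async with httpx.AsyncClient(timeout=120) as client:
        resp = await client.post(TEI_URL, json=body)
        resp.raise_for_status()
    return resp.json()

# Run with: uvicorn tei_proxy_sketch:app --port 8102
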
Keep these two services running in the background during indexing and querying.

bridge_server.py translates GraphRAG requests and performs JSON sanitization.
# Start the LLM bridge (listens on port 8900)
python bridge_server.py
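
With the bridge up, a quick hand-rolled OpenAI-style request confirms it is reachable before you start indexing (this assumes the bridge serves the standard /v1/chat/completions route, which is what GraphRAG will call):

import httpx

resp = httpx.post(
    "http://localhost:8900/v1/chat/completions",
    json={"model": "local-llm",
          "messages": [{"role": "user", "content": "Reply with the word OK."}]},
    timeout=120,
)
print(resp.json()["choices"][0]["message"]["content"])
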
fix_tei_proxy.py fixes compatibility issues between GraphRAG and TEI.

# Start the Embedding proxy (listens on port 8102)
python fix_tei_proxy.py

Initialize the GraphRAG project:

graphrag init --root .

Overwrite the generated settings.yaml with the optimized version provided in this repository. Ensure the api_base values point to the bridge ports:
- LLM: http://localhost:8900/v1
- Embedding: http://localhost:8102/v1
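
For reference, the relevant fields look roughly like this (a sketch assuming the GraphRAG 2.x models layout; keep the repository's settings.yaml and only adjust endpoints and model names):

models:
  default_chat_model:
    type: openai_chat
    api_base: http://localhost:8900/v1    # LLM bridge
    api_key: dummy-key                    # the bridge ignores the key
    model: local-llm                      # placeholder model name
  default_embedding_model:
    type: openai_embedding
    api_base: http://localhost:8102/v1    # embedding proxy
    api_key: dummy-key
    model: bge-m3

With the configuration in place, clear any stale artifacts and run the indexing pipeline:
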
# Clear old cache
rm -rf output/* cache/* logs/*
# Run indexing with verbose logging
export LITELLM_LOG=DEBUG
graphrag index --root . --verbose

When indexing completes, run a global query:

graphrag query \
--root . \
--method global \
--query "In 2003, who was the top executive of DeepBlue Optoelectronics’ parent company, and what was his management style?"graphrag query \
--root . \
--method local \
--query "What was the direct trigger for Gu Changfeng leaving Tianqiong Group?"The bridge_server.py is a template. If your LLM uses a different API (e.g., different field names or prompt wrappers), you must modify the chat_completions function in bridge_server.py to match your model's requirements.
If your LLM is already 100% OpenAI-compatible, you can bypass bridge_server.py. However, if you encounter Invalid control character or Invalid JSON errors during indexing, use the bridge to benefit from its JSON cleaning logic.
GraphRAG currently supports .txt and .csv. For PDF, Word, or Excel files, please convert them to plain text before placing them in the input/ folder.
If community reports fail to generate, check the raw LLM responses:
cat cache/community_reporting/chat_create_community_report_* | less

This project is licensed under the MIT License.