Run the powerful Ministral-3-8B-Reasoning model with vision capabilities locally on consumer hardware (tested on an NVIDIA RTX 3060 Ti with 8GB VRAM).
This project demonstrates how to use the Unsloth 4-bit quantized version of the model with a modern Gradio frontend for image analysis and reasoning.
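As a rough illustration of what the backend does, the snippet below loads the pre-quantized 4-bit checkpoint with Hugging Face Transformers. The model class, dtype, and device placement here are assumptions for the sketch; the project's actual `backend/` code may differ.

```python
# Minimal loading sketch (assumed class/dtype/device_map, not verbatim project code).
import torch
from transformers import AutoModelForImageTextToText, AutoProcessor

MODEL_ID = "unsloth/Ministral-3-8B-Reasoning-2512-bnb-4bit"

processor = AutoProcessor.from_pretrained(MODEL_ID)
model = AutoModelForImageTextToText.from_pretrained(
    MODEL_ID,
    device_map="auto",          # map layers onto the available GPU
    torch_dtype=torch.bfloat16, # compute dtype for the non-quantized modules
)
```

Because the checkpoint is already quantized with BitsAndBytes, no extra quantization config should be needed at load time; the `bitsandbytes` package just has to be installed.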
- Local Inference: Runs entirely on your machine. No API keys, no data privacy concerns.
- Vision Capabilities: Upload images and ask complex reasoning questions.
- Efficient: Uses 4-bit quantization (BitsAndBytes) to fit within 8GB VRAM.
- Modern UI: Clean, vertical layout built with Gradio 6.x.
- GPU: NVIDIA GPU with at least 8GB VRAM (e.g., RTX 3060 Ti, 3070, 4060).
- OS: Linux (tested) or Windows (WSL2 recommended).
- Python: 3.10+
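If you already have PyTorch available, you can optionally sanity-check that it sees your GPU and how much VRAM it has. This is a generic check, not part of the project's scripts:

```python
# Optional: confirm CUDA is visible and report total VRAM.
import torch

assert torch.cuda.is_available(), "No CUDA-capable GPU detected"
props = torch.cuda.get_device_properties(0)
print(f"{props.name}: {props.total_memory / 1024**3:.1f} GB VRAM")
```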
1. Clone the repository:

   ```bash
   git clone https://github.com/yourusername/ministral-vision-local.git
   cd ministral-vision-local
   ```

2. Create a virtual environment (recommended):

   ```bash
   python -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   ```

3. Install dependencies:

   ```bash
   pip install -r requirements_ministral.txt
   ```

4. Run the application:

   ```bash
   python main.py
   ```

   The first run will download the model weights (~5GB) and tokenizer.

5. Open your browser: Navigate to `http://localhost:7860`.

6. Interact (a programmatic sketch of this flow follows these steps):
   - Upload an image.
   - Ask a question (e.g., "Describe the main object," "What is unusual about this scene?").
   - Click ✨ Analyze Image.
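For reference, here is a minimal sketch of what the Analyze step might do under the hood, continuing the loading sketch above. The chat message schema, generation settings, and file name are assumptions; the project's actual logic lives in `backend/`.

```python
# Hypothetical inference flow; assumes `processor` and `model` from the
# loading sketch above. Message schema and settings are illustrative.
import torch
from PIL import Image

image = Image.open("example.jpg")  # hypothetical input image
messages = [{
    "role": "user",
    "content": [
        {"type": "image"},
        {"type": "text", "text": "What is unusual about this scene?"},
    ],
}]

# Build the prompt with the model's chat template, then tokenize text + image.
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=prompt, images=[image], return_tensors="pt").to(model.device)

with torch.inference_mode():
    output_ids = model.generate(**inputs, max_new_tokens=512)

# Decode only the newly generated tokens, skipping the prompt.
answer = processor.decode(
    output_ids[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
)
print(answer)
```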
The project is structured for modularity and maintainability:
- `backend/`: Handles model loading and inference logic.
- `frontend/`: Manages the Gradio UI and styling.
- `config.py`: Centralizes configuration constants.
- `main.py`: The entry point that ties everything together (see the sketch below).
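As an illustration of how these pieces might fit together in `main.py`, here is a hedged sketch; `analyze_image` and its module path are hypothetical names, not the project's actual identifiers:

```python
# Hypothetical wiring of the Gradio frontend to the backend.
import gradio as gr

from backend.inference import analyze_image  # hypothetical: (image, question) -> str

with gr.Blocks(title="Ministral Vision") as demo:
    image = gr.Image(type="pil", label="Upload an image")
    question = gr.Textbox(label="Your question")
    analyze = gr.Button("✨ Analyze Image")
    answer = gr.Textbox(label="Answer")

    # Run inference when the button is clicked.
    analyze.click(fn=analyze_image, inputs=[image, question], outputs=answer)

demo.launch(server_port=7860)
```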
- Model Weights (4-bit): unsloth/Ministral-3-8B-Reasoning-2512-bnb-4bit
- Base Model: mistralai/Ministral-3-8B-Reasoning-2512
- `ImportError: libGL.so.1`: If you see this, you might need to install the system OpenCV dependency:

  ```bash
  sudo apt-get update && sudo apt-get install libgl1
  ```

- OOM (Out of Memory): If you have less than 8GB VRAM, try closing other applications. The 4-bit model requires roughly 6-7GB VRAM (a quick way to check usage is sketched below).
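To see how close you are to the limit while the model is loaded, here is a generic PyTorch check (not project-specific):

```python
# Generic VRAM usage check; run in the same process as the loaded model.
import torch

allocated = torch.cuda.memory_allocated() / 1024**3
reserved = torch.cuda.memory_reserved() / 1024**3
print(f"Allocated: {allocated:.1f} GB / reserved by PyTorch: {reserved:.1f} GB")
```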
- Model: Ministral-3-8B-Reasoning by Mistral AI.
- Quantization: Unsloth for the 4-bit BNB version.
- Frontend: Built with Gradio.
Created by CiscoPonce
