
RAG for RIOT OS

A Retrieval-Augmented Generation (RAG) toolchain for the RIOT operating system. This repository helps you build local vector search indices over RIOT's documentation and example code to power LLM-assisted development, code generation, and explanations.


Table of Contents

  1. Prerequisites

  2. Documentation RAG

  3. Autoencoder (Optional)

  4. Code RAG (Examples Directory)

  5. Combining the RAGs

  6. GUI


Prerequisites

  • Python 3.8+

  • Doxygen to generate API documentation

  • Required pip packages:

    pip install chromadb sentence-transformers torch numpy tqdm scikit-learn flask flask-cors beautifulsoup4 tiktoken langchain

Documentation RAG

Leverage RIOT's API docs for retrieval-augmented prompts.

1. Clone RIOT

git clone https://github.com/RIOT-OS/RIOT.git
cd RIOT

2. Generate Documentation

Use Doxygen to build HTML or XML docs locally:

doxygen Doxyfile
# Output will be in ./doc or ./html by default

3. Chunk & Embed Documentation

  1. Chunk the generated docs:

    python3 RIOTDocuChunker2.py path/to/RIOT/doc/html
    • Produces riot_chunks.json containing overlapping text chunks and metadata.
  2. Embed chunks into a vector database:

    python3 RIOTRRAGDocuDB3.py riot_chunks.json
    • Creates a ChromaDB at ./riot_vector_db (default path, configurable).
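The chunking step can be illustrated with a minimal sketch (the function below is hypothetical and not the actual `RIOTDocuChunker2.py` implementation, which additionally parses the Doxygen HTML): it splits plain text into overlapping word windows and records metadata for each chunk.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into overlapping word-window chunks with metadata."""
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, max(len(words) - overlap, 1), step):
        window = words[start:start + chunk_size]
        chunks.append({
            "text": " ".join(window),
            "start_word": start,
            "n_words": len(window),
        })
    return chunks

# 450 words with a 200-word window and 50-word overlap -> 3 chunks
chunks = chunk_text("word " * 450, chunk_size=200, overlap=50)
print(len(chunks))  # 3
```

The overlap ensures that a sentence falling on a chunk boundary still appears intact in at least one chunk, which matters for retrieval quality.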

4. Query the RAG

Retrieve relevant documentation snippets for any query:

python3 RIORDocuRAGRequest2.py "<your query>"

The script returns:

  • Your original user query
  • Top matching documentation chunks
  • A ready-to-use prompt template for your LLM
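The prompt template can be assembled from the retrieved chunks roughly like this (a sketch; the exact template produced by `RIORDocuRAGRequest2.py` may differ):

```python
def build_prompt(query, chunks):
    """Join retrieved documentation chunks into a single LLM prompt."""
    context = "\n\n".join(
        f"[Doc {i + 1}] {c}" for i, c in enumerate(chunks)
    )
    return (
        "You are a RIOT OS expert. Using only the documentation below, "
        "answer the query.\n\n"
        f"Documentation:\n{context}\n\n"
        f"Query: {query}\n"
    )

prompt = build_prompt(
    "How do I create a thread?",
    ["thread_create() starts a new thread ..."],
)
print(prompt)
```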

Autoencoder (Optional)

Compress embeddings to speed up search and potentially improve relevance.

  1. Standard Autoencoder:

    python3 AutoencoderRIOT2.py
  2. Triplet Autoencoder (margin-based grouping):

    python3 AutoencoderRIOTTriplet.py --epochs 100 --lambda-triplet 5.0 --margin 1.5

Note: Compare performance with the uncompressed RAG to evaluate impact.

To query with compressed vectors:

python3 RIORDocuRAGRequestCompressed.py "<your query>"
python3 RIORDocuRAGRequestCompressedTriplet.py "<your query>"
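The `--margin` and `--lambda-triplet` flags suggest a standard margin-based triplet objective on the compressed embeddings. A minimal numpy sketch of that loss (illustrative only, not the training code in `AutoencoderRIOTTriplet.py`):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.5):
    """max(0, d(a, p) - d(a, n) + margin) with squared Euclidean distances."""
    d_pos = np.sum((anchor - positive) ** 2)
    d_neg = np.sum((anchor - negative) ** 2)
    return max(0.0, d_pos - d_neg + margin)

a = np.array([0.0, 0.0])
p = np.array([0.1, 0.0])   # close to the anchor
n = np.array([2.0, 0.0])   # far from the anchor
print(triplet_loss(a, p, n))  # 0.0: the negative is already beyond the margin
```

Minimizing this term pulls related chunks together in the compressed space while pushing unrelated ones at least `margin` apart, which is why it can improve retrieval relevance over a plain reconstruction loss.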

Code RAG (Examples Directory)

Perform RAG over RIOT's examples/ codebase.

  1. Set examples directory in chunker.py (line 11):

    EXAMPLES_DIR = "/path/to/RIOT/examples"
  2. Chunk the examples:

    python3 chunker.py
  3. Embed the chunks:

    python3 embedder.py
  4. Query the example RAG:

    python3 request.py "<your query>"

Warning: If no example matches your query, results may be irrelevant. Use alongside Documentation RAG.
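Vector search always returns the nearest neighbors, even when nothing in `examples/` is actually relevant, which is why unrelated results can appear. One way to guard against this (a hypothetical sketch, not part of `request.py`) is a cosine-similarity threshold on the retrieved hits:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = (math.sqrt(sum(a * a for a in u)) *
            math.sqrt(sum(b * b for b in v)))
    return dot / norm

def filter_hits(query_vec, hits, min_sim=0.3):
    """Drop retrieved chunks whose embedding is too dissimilar to the query."""
    return [(doc, cosine(query_vec, vec))
            for doc, vec in hits
            if cosine(query_vec, vec) >= min_sim]

hits = [("gnrc_networking example", [0.9, 0.1]),
        ("unrelated timer example", [-0.5, 0.9])]
print(filter_hits([1.0, 0.0], hits))  # keeps only the first hit
```

The right threshold depends on the embedding model; it is worth calibrating it against a few known-irrelevant queries.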


Combining the RAGs

If you saved all Python files in one common directory (as in this repository), you can run a request that searches both RAGs and returns the combined results:

  python3 RIOTRequestCombined.py "<your query>"
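Merging two ranked result lists can be done by tagging each hit with its source and sorting by distance; a hypothetical sketch of what a combined request might do (not the actual `RIOTRequestCombined.py` logic):

```python
def combine_results(doc_hits, code_hits, top_k=5):
    """Merge (text, distance) hits from two RAGs; lower distance is better."""
    tagged = ([("doc", text, dist) for text, dist in doc_hits] +
              [("code", text, dist) for text, dist in code_hits])
    tagged.sort(key=lambda item: item[2])
    return tagged[:top_k]

merged = combine_results(
    doc_hits=[("thread_create API docs", 0.21), ("sched docs", 0.48)],
    code_hits=[("examples/default main.c", 0.35)],
    top_k=2,
)
print(merged)
```

Note that sorting raw distances only makes sense when both indices use the same embedding model; otherwise the scores should be normalized per source before merging.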

GUI

If you saved all Python files in one common directory (as in this repository), you can run the Python GUI. There you can easily choose which RAG to use, specify your query, and set parameters:

  python3 RAGSystemGUI.py

Alternatively, you can use the nicer-looking web GUI. First run

  python3 RAGGUIServer.py

Then open http://127.0.0.1:5000/ in your browser to see the web version of the GUI.
