Proxies API calls to VLLM Server.
API requests should include history, a list of tuples of strings, and the current
query
Example Request:
{ "history": [["user", "hello"], ["llm", "hi"]], "query": "how are you?" }
Responses will be returned as dictionaries. Responses should contain the following:
response- String LLM response to the query
When running this as a docker container, the XDG_CONFIG_HOME envvar is set to /config.
A configuration file at /config/neon/diana.yaml is required and should look like:
MQ:
port: <MQ Port>
server: <MQ Hostname or IP>
users:
neon_llm_vllm:
password: <neon_vllm user's password>
user: neon_vllm
LLM_VLLM:
key: ""
api_url: "http://localhost:5000"
role: "You are NeonLLM."
context_depth: 4
max_tokens: 100
num_parallel_processes: 2To add support for Chatbotsforum personas, a list of names and prompts can be added to configuration:
llm_bots:
vllm:
- name: tutor
description: |
You are an AI bot that specializes in tutoring and guiding learners.
Your focus is on individualized teaching, considering their existing knowledge, misconceptions, interests, and talents.
Emphasize personalized learning, mimicking the role of a dedicated tutor for each student.
You're attempting to provide a concise response within a 40-word limit.
vllmis the MQ service name for this service; each bot has anamethat is used to identify the persona in chats anddescriptionis the prompt passed to VLLM Server.
For example, if your configuration resides in ~/.config:
export CONFIG_PATH="/home/${USER}/.config"
docker run -v ${CONFIG_PATH}:/config neon_llm_vllmNote: If connecting to a local MQ server, you may need to specify
--network host