This video shows how to install RouteLLM locally, a framework for serving and evaluating LLM routers. It also includes a hands-on demo of routing model traffic between Ollama models and OpenAI models.
Code:
conda create -n rl python=3.11 -y && conda activate rl
git clone https://github.com/lm-sys/RouteLLM.git
cd RouteLLM
pip install -e .[serve,eval]
export OPENAI_API_KEY=sk-XXXXXX
python3 -m routellm.openai_server --routers mf --alt-base-url http://localhost:11434/v1 --config config.example.yaml --weak-model llama3
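If requests to the weak model fail, it is worth confirming that Ollama is actually running and that llama3 has been pulled. A minimal sanity-check sketch using only the Python standard library (the /api/tags listing endpoint and port 11434 are assumed to be the unchanged Ollama defaults):

import json
import urllib.request

# List the models Ollama has pulled locally (assumes the default port 11434).
with urllib.request.urlopen("http://localhost:11434/api/tags") as resp:
    models = [m["name"] for m in json.load(resp)["models"]]
# The weak model passed to the router must already be pulled, e.g. via "ollama pull llama3".
print("llama3 available:", any(name.startswith("llama3") for name in models))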
pip install openai
In the Python interpreter:
import openai
# Point the client at the local RouteLLM server (default port 6060) instead of api.openai.com.
client = openai.OpenAI(base_url="http://localhost:6060/v1", api_key="no_api_key")
# "router-mf-0.116" = the mf router with threshold 0.116; the router decides which model answers.
response = client.chat.completions.create(model="router-mf-0.116", messages=[{"role": "user", "content": "Hello!"}])
print(response.choices[0].message.content)
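To see routing in action, you can send prompts of different difficulty and inspect which underlying model answered. This is a sketch that assumes the server passes the upstream model's name through in the response's model field:

# The mf router should send trivial prompts to llama3 and harder ones to the OpenAI strong model
# (assuming response.model reflects the model that actually handled the request).
for prompt in ["Hello!", "Prove that the square root of 2 is irrational."]:
    r = client.chat.completions.create(model="router-mf-0.116", messages=[{"role": "user", "content": prompt}])
    print(prompt[:40], "->", r.model)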
In the CLI:
python3 -m routellm.calibrate_threshold --routers mf --strong-model-pct 0.5 --config config.example.yaml
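The calibration step prints the threshold at which roughly the requested fraction of queries (here 50%) goes to the strong model; that value is what gets embedded in the served model name used earlier. A small sketch, with 0.116 standing in as a hypothetical calibrated value:

# Hypothetical result of the calibration run above for --strong-model-pct 0.5.
router, threshold = "mf", 0.116
# The served model name follows the router-<router>-<threshold> pattern used in the Python demo.
model_name = f"router-{router}-{threshold}"
print(model_name)  # router-mf-0.116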