Friday, May 24, 2024

How to Install and Run a Model with Llama-cpp-Python Locally

This video is a step-by-step tutorial on locally installing the Alphex 118B model with llama-cpp-python. As per the model card, it beats GPT-4 on most benchmarks and performs close to GPT-4o.

Code:

conda create -n alphex python=3.11

conda activate alphex

pip install llama-cpp-python

wget <URL of the Alphex-118b.Q2_K.gguf file>

from llama_cpp import Llama

# Load the GGUF model.
llm = Llama(
      model_path="./Alphex-118b.Q2_K.gguf",
      n_gpu_layers=-1,   # offload all layers to the GPU; set 0 for CPU-only
      seed=1337,         # fixed seed for reproducible sampling
      n_ctx=2048,        # context window size in tokens
)
# Run a completion.
output = llm(
      "Q: What is the capital of Australia? A: ",
      max_tokens=120,      # cap on the number of generated tokens
      stop=["Q:", "\n"],   # stop before the model starts a new question
      echo=True            # include the prompt in the returned text
)
print(output['choices'][0])
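The call returns an OpenAI-style completion dict, so `output['choices'][0]` still carries metadata (index, finish reason, logprobs) alongside the text. A minimal sketch of pulling out just the generated string; the `sample` dict below only mirrors the shape llama-cpp-python returns, and its field values are illustrative, not real model output:

```python
def completion_text(output: dict) -> str:
    """Extract the generated text from a completion-style response dict."""
    return output["choices"][0]["text"]

# Illustrative response shaped like what llm(...) returns; values are made up.
sample = {
    "id": "cmpl-xyz",
    "object": "text_completion",
    "choices": [
        {
            "text": "Q: What is the capital of Australia? A: Canberra",
            "index": 0,
            "logprobs": None,
            "finish_reason": "stop",
        }
    ],
}

print(completion_text(sample))
```

With `echo=True`, the extracted text includes the prompt itself, so strip the prompt prefix if you only want the answer.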
