Monday, June 17, 2024

Convert Text to SQL with Ollama and Vanna with Any Database - Local and Free

 

This video is a step-by-step tutorial to install Vanna.ai locally with Ollama and with your own custom database to chat with database and convert text to SQL using AI.




Code:

import sqlite3
import pandas as pd

df=pd.read_csv('/home/Ubuntu/finance.csv')

conn=sqlite3.connect('financedb.db')

df = pd.read_sql_query("SELECT * FROM finance", conn)

# Print the DataFrame
print(df)

# Close the connection
conn.close()

%pip install 'vanna[chromadb,ollama]'
from vanna.ollama import Ollama
from vanna.chromadb import ChromaDB_VectorStore
class MyVanna(ChromaDB_VectorStore, Ollama):
    def __init__(self, config=None):
        ChromaDB_VectorStore.__init__(self, config=config)
        Ollama.__init__(self, config=config)

vn = MyVanna(config={'model': 'mistral'})

vn.connect_to_sqlite('financedb.db')

df_ddl = vn.run_sql("SELECT type, sql FROM sqlite_master WHERE sql is not null")

for ddl in df_ddl['sql'].to_list():
  vn.train(ddl=ddl)
 
# The following are methods for adding training data. Make sure you modify the examples to match your database.

# DDL statements are powerful because they specify table names, colume names, types, and potentially relationships
vn.train(ddl="""
    CREATE TABLE IF NOT EXISTS myproducts (
        id INT PRIMARY KEY,
        productname VARCHAR(100),
        Segment INT
    )
""")

# Sometimes you may want to add documentation about your business terminology or definitions.
vn.train(documentation="Our business defines deals with various products in various countries related to finance.")

# You can also add SQL queries to your training data. This is useful if you have some queries already laying around. You can just copy and paste those from your editor to begin generating new SQL.
vn.train(sql="SELECT * FROM finance WHERE Product='Montana'")  

# At any time you can inspect what training data the package is able to reference
training_data = vn.get_training_data()
training_data

Whenever you ask a new question, it will find the 10 most relevant pieces of training data and use it as part of the LLM prompt to generate the SQL.
```python
vn.ask(question="How many products are sold in Canada?")

from vanna.flask import VannaFlaskApp
app = VannaFlaskApp(vn)
app.run()




No comments: