
Vidura

Vidura is Quper's AI-powered analytics engine that combines a LangChain agent, AWS Athena cost intelligence queries, Redshift performance monitoring, and a RAG (Retrieval-Augmented Generation) system to deliver actionable FinOps insights.

What is Vidura?

Vidura is the AI service layer of Quper. It exposes 25 structured tools to a LangChain agent, enabling it to answer complex natural language questions by composing multiple data lookups across Redshift, Athena, and the RAG knowledge base.
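The agent-with-structured-tools pattern can be sketched without the LangChain dependency. The tool names and dispatch logic below are illustrative assumptions, not Vidura's actual implementation; they only show how a model-chosen tool name resolves to a callable.

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class Tool:
    """A structured tool: a name, a description the LLM sees, and a callable."""
    name: str
    description: str
    func: Callable[[str], str]

# Hypothetical tools standing in for Vidura's Redshift / CID / RAG tool sets.
def top_queries(arg: str) -> str:
    return f"top queries for: {arg}"

def monthly_cost(arg: str) -> str:
    return f"monthly cost for: {arg}"

REGISTRY: Dict[str, Tool] = {
    t.name: t
    for t in [
        Tool("redshift_top_queries", "Longest-running Redshift queries", top_queries),
        Tool("cid_monthly_cost", "Monthly cost by service from CID", monthly_cost),
    ]
}

def dispatch(tool_name: str, tool_input: str) -> str:
    """The agent loop resolves the model's chosen tool by name and invokes it."""
    tool = REGISTRY[tool_name]
    return tool.func(tool_input)

print(dispatch("cid_monthly_cost", "EC2"))  # monthly cost for: EC2
```

In the real service, LangChain generates the tool-selection step from the GPT-4o-mini output and the registry holds all 25 tools.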

Architecture Overview

System Architecture
┌─────────────────────────────────────────────────────────────┐
│                    FastAPI Server (main.py)                  │
│                                                              │
│  POST /ask ─────────────────────────────────────────────┐  │
│  GET  /health                                            │  │
│                                                          ▼  │
│  ┌────────────────────────────────────────────────────────┐ │
│  │              LangChain Agent (agent.py)                │ │
│  │                                                        │ │
│  │  Model: GPT-4o-mini (temperature=0)                    │ │
│  │  System Prompt: AWS FinOps Analyst persona             │ │
│  │                                                        │ │
│  │  Tools (25 total):                                     │ │
│  │  ┌──────────────────┐  ┌──────────────────┐           │ │
│  │  │ Redshift Tools   │  │   CID Tools      │           │ │
│  │  │ (19 tools)       │  │   (6 tools)      │           │ │
│  │  │ → psycopg2       │  │ → boto3 Athena   │           │ │
│  │  │ → Redshift SQL   │  │ → CID SQL views  │           │ │
│  │  └──────────────────┘  └──────────────────┘           │ │
│  │  ┌──────────────────┐                                  │ │
│  │  │   RAG Tool       │                                  │ │
│  │  │ (book_qa)        │                                  │ │
│  │  │ → Milvus VectorDB│                                  │ │
│  │  │ → MiniLM-L6-v2   │                                  │ │
│  │  └──────────────────┘                                  │ │
│  └────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘

Tech Stack

| Layer | Technology | Notes |
|---|---|---|
| AI Framework | LangChain 0.1+ | Agent orchestration |
| LLM | GPT-4o-mini | temperature=0, deterministic |
| Embedding | all-MiniLM-L6-v2 | 384-dim sentence embeddings |
| Vector DB | Milvus 2.3+ | COSINE similarity search |
| Cost DB | Amazon Athena | CID/CUR query execution |
| Perf DB | Amazon Redshift | System views & query history |
| API | FastAPI + Uvicorn | /ask and /health endpoints |
| AWS SDK | boto3 1.34+ | Athena, S3 access |
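The retrieval step behind the RAG tool can be illustrated with a plain-Python cosine similarity search. In Vidura this work is delegated to Milvus over 384-dim MiniLM embeddings; the tiny 3-dim vectors and chunk ids below are illustrative only.

```python
import math
from typing import List, Sequence, Tuple

def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity, the metric Milvus is configured with (COSINE)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query: Sequence[float],
          docs: List[Tuple[str, Sequence[float]]],
          k: int = 3) -> List[str]:
    """Return the ids of the k documents most similar to the query vector."""
    scored = sorted(docs, key=lambda d: cosine(query, d[1]), reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]

# Toy 3-dim vectors (real all-MiniLM-L6-v2 embeddings are 384-dim).
docs = [("chunk_a", [1.0, 0.0, 0.0]),
        ("chunk_b", [0.0, 1.0, 0.0]),
        ("chunk_c", [0.9, 0.1, 0.0])]
print(top_k([1.0, 0.0, 0.0], docs, k=2))  # ['chunk_a', 'chunk_c']
```

Milvus performs the same ranking at scale with an approximate index, and the top-k chunks are then passed to the LLM as context.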

API Endpoints

| Method | Path | Description |
|---|---|---|
| POST | /ask | Submit a natural language analytics query to the LangChain agent |
| GET | /health | Health check endpoint for deployment monitoring |

Request/Response Format

POST /ask
// Request
{
  "prompt": "What are my top 5 most expensive EC2 instance types this month?"
}

// Response
{
  "response": "## 💰 EC2 Cost Analysis\n\nYour top 5 most expensive EC2 instance types this month:\n\n| Instance Type | Monthly Cost | % of EC2 Total |\n|---|---|---|\n| m5.4xlarge | $12,400 | 18% |\n..."
}
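Calling the endpoint from Python needs nothing beyond the standard library. The host and port below are assumptions matching the Docker Compose mapping, and the helper names are illustrative, not part of Vidura's codebase.

```python
import json
from urllib import request

def build_payload(prompt: str) -> bytes:
    """Encode the /ask request body in the shape the API expects."""
    return json.dumps({"prompt": prompt}).encode("utf-8")

def ask(prompt: str, base_url: str = "http://localhost:8001") -> str:
    """POST a natural-language question to /ask and return the agent's answer."""
    req = request.Request(
        f"{base_url}/ask",
        data=build_payload(prompt),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# With the service running:
#   answer = ask("What are my top 5 most expensive EC2 instance types this month?")
#   print(answer)  # Markdown-formatted cost analysis, as in the example above
```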

Tool Categories

- Redshift Tools (19): query Redshift system views and query history via psycopg2
- CID Tools (6): execute CID SQL views against CUR data through boto3 Athena
- RAG Tool (1): book_qa, retrieves from the Milvus vector store using all-MiniLM-L6-v2 embeddings

Environment Configuration

.env Required Variables
# OpenAI
OPENAI_API_KEY=sk-...

# AWS Credentials
AWS_ACCESS_KEY_ID=AKIA...
AWS_SECRET_ACCESS_KEY=...
AWS_DEFAULT_REGION=ap-south-1

# Redshift Connection
REDSHIFT_HOST=my-cluster.abc123.us-east-1.redshift.amazonaws.com
REDSHIFT_PORT=5439
REDSHIFT_DB=dev
REDSHIFT_USER=admin
REDSHIFT_PASSWORD=...

# Athena (CID)
CID_DATABASE=cid_cur
CID_WORKGROUP=primary
CID_OUTPUT_LOCATION=s3://my-bucket/athena-results/

# Milvus Vector DB
MILVUS_HOST=localhost
MILVUS_PORT=19530
MILVUS_COLLECTION=redshift_books
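A service with this many required variables benefits from failing fast at startup when one is missing. The check below is a generic sketch, not Vidura's actual startup code; the function and variable names are assumptions.

```python
import os
from typing import Iterable, List, Mapping

# The variables listed in the .env section above.
REQUIRED_VARS = [
    "OPENAI_API_KEY",
    "AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_DEFAULT_REGION",
    "REDSHIFT_HOST", "REDSHIFT_PORT", "REDSHIFT_DB",
    "REDSHIFT_USER", "REDSHIFT_PASSWORD",
    "CID_DATABASE", "CID_WORKGROUP", "CID_OUTPUT_LOCATION",
    "MILVUS_HOST", "MILVUS_PORT", "MILVUS_COLLECTION",
]

def missing_vars(required: Iterable[str],
                 env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variables that are unset or empty."""
    return [name for name in required if not env.get(name)]

# Example: a partial environment is reported with the exact missing names.
partial_env = {"OPENAI_API_KEY": "sk-test", "MILVUS_HOST": "localhost"}
print(missing_vars(["OPENAI_API_KEY", "MILVUS_HOST", "REDSHIFT_HOST"], partial_env))
# ['REDSHIFT_HOST']
```

At startup, `missing_vars(REQUIRED_VARS)` against the real environment can raise a clear error before any AWS or Milvus client is constructed.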

Deployment

Docker Compose
version: '3.8'
services:
  vidura:
    build: .
    ports:
      - "8001:8000"
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - AWS_DEFAULT_REGION=${AWS_DEFAULT_REGION}
    env_file:
      - .env