QuperSync

QuperSync is an automated data synchronization service that periodically syncs data from Amazon Redshift to PostgreSQL. It follows Domain-Driven Design (DDD) principles with a clean separation between domain logic, infrastructure adapters, and the HTTP interface layer.

The service provides both scheduled cron jobs (via APScheduler) and REST API endpoints for on-demand synchronization triggers.
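The two trigger paths can share one entry point so that a scheduled run and an on-demand request never overlap. The sketch below illustrates that idea with the standard library only: plain asyncio stands in for APScheduler and for the FastAPI handler, and `SyncRunner`, `fake_sync`, and `demo` are hypothetical names that do not come from the QuperSync codebase.

```python
import asyncio


class SyncRunner:
    """Runs at most one sync at a time, whichever path triggered it."""

    def __init__(self, sync_once):
        self._sync_once = sync_once   # hypothetical coroutine function doing the real copy
        self._lock = asyncio.Lock()
        self.runs = []                # which trigger started each completed run

    async def trigger(self, source: str) -> bool:
        """Start a sync unless one is already running; return True if it ran."""
        if self._lock.locked():
            return False              # skip instead of queueing a second run
        async with self._lock:
            await self._sync_once()
            self.runs.append(source)
            return True

    async def run_periodically(self, interval_seconds: float, max_runs: int) -> None:
        """Stand-in for the scheduler's interval job (bounded for the demo)."""
        for _ in range(max_runs):
            await self.trigger("schedule")
            await asyncio.sleep(interval_seconds)


async def fake_sync() -> None:
    await asyncio.sleep(0.01)         # placeholder for the Redshift to PostgreSQL copy


async def demo() -> list:
    runner = SyncRunner(fake_sync)
    periodic = asyncio.create_task(runner.run_periodically(0.1, max_runs=2))
    await asyncio.sleep(0.05)         # land between two scheduled runs
    await runner.trigger("api")       # what a REST handler would call
    await periodic
    return runner.runs


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In a real deployment the periodic loop would be replaced by a scheduler job and `trigger("api")` by a route handler; returning `False` when a run is in progress is one possible policy, and queueing the request is another.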

Purpose

The primary use case for QuperSync is populating the Quper API's PostgreSQL database with Redshift system metrics. Without QuperSync, the API would need to query Redshift directly on every request, which would be slow and resource-intensive. QuperSync pre-materializes the data on a schedule so the API serves fast PostgreSQL reads.

Data Flow

Redshift (system views) → QuperSync → PostgreSQL (quper schema) → Quper API → Web Dashboard

Tech Stack

Component       Technology            Notes
Language        Python 3.13+          Latest stable Python
Web Framework   FastAPI 0.116.0       Async ASGI framework
Scheduler       APScheduler 3.x       AsyncIO scheduler
ORM             SQLAlchemy 1.4+       Async + sync support
Source DB       Amazon Redshift       Via redshift-connector
Target DB       PostgreSQL            Via asyncpg + psycopg2
Migrations      Alembic               Schema versioning
AWS SDK         boto3 1.38.46+        Redshift cluster API
HTTP Client     httpx                 Async HTTP requests
Architecture    Domain-Driven Design  Clean separation of concerns

Architecture Overview

Layered Architecture (DDD)
cricut.quper.qupersync/
├── domain/                    # Core business logic
│   ├── entities/              # Data models (pure Python)
│   ├── repositories/          # Abstract repository interfaces
│   └── services/              # Domain service interfaces
│
├── infrastructure/            # Concrete implementations
│   ├── redshift/              # Redshift query adapters
│   ├── postgres/              # PostgreSQL write adapters
│   ├── repositories/          # Repository implementations
│   └── scheduler/             # APScheduler configuration
│
├── interface/                 # HTTP layer
│   └── api/
│       └── v1/                # FastAPI route handlers
│
├── application/               # Application services
│   └── sync_service.py        # Orchestration layer
│
├── main.py                    # App entry point
├── alembic/                   # Database migrations
└── pyproject.toml             # Dependencies
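
To make the layering concrete, here is a minimal sketch of how the layers in the layout above might depend on one another: the application service talks only to abstract interfaces from the domain layer, and concrete adapters are injected. Every class name here (`QueryMetric`, `MetricSource`, `MetricSink`, `SyncService`, and the in-memory adapters) is hypothetical, and the in-memory classes stand in for the real Redshift and PostgreSQL adapters.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from itertools import islice
from typing import Iterable, List


# domain/entities: plain data with no database imports
@dataclass(frozen=True)
class QueryMetric:                     # hypothetical entity
    query_id: int
    duration_ms: int


# domain/repositories: interfaces the application layer depends on
class MetricSource(ABC):
    @abstractmethod
    def fetch(self) -> Iterable[QueryMetric]: ...


class MetricSink(ABC):
    @abstractmethod
    def upsert(self, batch: List[QueryMetric]) -> None: ...


# application/sync_service.py: orchestration only, no SQL
class SyncService:
    def __init__(self, source: MetricSource, sink: MetricSink, batch_size: int = 1000):
        self._source = source
        self._sink = sink
        self._batch_size = batch_size

    def run(self) -> int:
        """Copy every row from source to sink in batches; return rows written."""
        rows = iter(self._source.fetch())
        total = 0
        while batch := list(islice(rows, self._batch_size)):
            self._sink.upsert(batch)
            total += len(batch)
        return total


# infrastructure: in-memory stand-ins for the real adapters
class InMemorySource(MetricSource):
    def __init__(self, rows: List[QueryMetric]):
        self._rows = rows

    def fetch(self) -> Iterable[QueryMetric]:
        return iter(self._rows)


class InMemorySink(MetricSink):
    def __init__(self) -> None:
        self.rows = {}
        self.batches = 0

    def upsert(self, batch: List[QueryMetric]) -> None:
        self.batches += 1
        for metric in batch:
            self.rows[metric.query_id] = metric   # keyed write, so re-runs overwrite
```

Because `SyncService` only sees the abstract interfaces, the same orchestration code can be exercised in tests with the in-memory adapters and wired to database-backed ones in `main.py`.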

Module Documentation

Environment Configuration

Required .env Variables
# Source: Amazon Redshift
REDSHIFT_HOST=my-cluster.abc.us-east-1.redshift.amazonaws.com
REDSHIFT_PORT=5439
REDSHIFT_DB=dev
REDSHIFT_USER=sync_user
REDSHIFT_PASSWORD=...

# Target: PostgreSQL
POSTGRES_HOST=localhost
POSTGRES_PORT=5432
POSTGRES_DB=quper
POSTGRES_USER=quper_sync
POSTGRES_PASSWORD=...

# AWS (for Redshift cluster management APIs)
AWS_ACCESS_KEY_ID=AKIA...
AWS_SECRET_ACCESS_KEY=...
AWS_DEFAULT_REGION=us-east-1

# Sync Configuration
SYNC_INTERVAL_SECONDS=300    # How often to run scheduled syncs (5 min)
SYNC_BATCH_SIZE=1000         # Rows per batch for large tables
TIMEZONE=US/Pacific          # Scheduler timezone
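
One way to consume these variables is to validate them once at startup and pass a typed settings object to the rest of the service. The sketch below is an illustration, not QuperSync's actual loader: `SyncSettings` and `load_settings` are hypothetical names, and treating the ports and sync tunables as optional with the sample values above as defaults is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Connection variables with no sensible default; names match the .env listing.
REQUIRED = (
    "REDSHIFT_HOST", "REDSHIFT_DB", "REDSHIFT_USER", "REDSHIFT_PASSWORD",
    "POSTGRES_HOST", "POSTGRES_DB", "POSTGRES_USER", "POSTGRES_PASSWORD",
)


@dataclass(frozen=True)
class SyncSettings:                    # hypothetical settings object
    redshift_host: str
    redshift_port: int
    redshift_db: str
    postgres_host: str
    postgres_port: int
    postgres_db: str
    sync_interval_seconds: int
    sync_batch_size: int
    timezone: str


def load_settings(env: Mapping[str, str] = os.environ) -> SyncSettings:
    """Build settings from an environment mapping, failing fast on gaps."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required variables: " + ", ".join(missing))
    return SyncSettings(
        redshift_host=env["REDSHIFT_HOST"],
        redshift_port=int(env.get("REDSHIFT_PORT", "5439")),      # assumed default
        redshift_db=env["REDSHIFT_DB"],
        postgres_host=env["POSTGRES_HOST"],
        postgres_port=int(env.get("POSTGRES_PORT", "5432")),      # assumed default
        postgres_db=env["POSTGRES_DB"],
        sync_interval_seconds=int(env.get("SYNC_INTERVAL_SECONDS", "300")),
        sync_batch_size=int(env.get("SYNC_BATCH_SIZE", "1000")),
        timezone=env.get("TIMEZONE", "US/Pacific"),
    )
```

Credentials are checked for presence but deliberately left out of the dataclass so they do not show up in its `repr`; a real loader would still need to hand them to the database drivers.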