QuperSync
QuperSync is an automated data synchronization service that periodically syncs data from Amazon Redshift to PostgreSQL. It follows Domain-Driven Design (DDD) principles with a clean separation between domain logic, infrastructure adapters, and the HTTP interface layer.
The service provides both scheduled cron jobs (via APScheduler) and REST API endpoints for on-demand synchronization triggers.
Purpose
The primary use case for QuperSync is populating the Quper API's PostgreSQL database with Redshift system metrics. Without QuperSync, the API would need to query Redshift directly on every request, which would be slow and resource-intensive. QuperSync pre-materializes the data on a schedule so the API serves fast PostgreSQL reads.
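Data Flow
Amazon Redshift (system metrics) -> QuperSync (scheduled or on-demand sync) -> PostgreSQL -> Quper API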
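Tech Stack
Python, FastAPI (HTTP interface), APScheduler (scheduled jobs), Alembic (database migrations), Amazon Redshift (source), PostgreSQL (target).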
Architecture Overview
cricut.quper.qupersync/
├── domain/                 # Core business logic
│   ├── entities/           # Data models (pure Python)
│   ├── repositories/       # Abstract repository interfaces
│   └── services/           # Domain service interfaces
│
├── infrastructure/         # Concrete implementations
│   ├── redshift/           # Redshift query adapters
│   ├── postgres/           # PostgreSQL write adapters
│   ├── repositories/       # Repository implementations
│   └── scheduler/          # APScheduler configuration
│
├── interface/              # HTTP layer
│   └── api/
│       └── v1/             # FastAPI route handlers
│
├── application/            # Application services
│   └── sync_service.py     # Orchestration layer
│
├── main.py                 # App entry point
├── alembic/                # Database migrations
└── pyproject.toml          # Dependencies
Module Documentation
Sync Engine & Workflow
Complete walkthrough of how data flows from Redshift to PostgreSQL including scheduling, batching, and error recovery.
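The batching and error-recovery steps mentioned above can be sketched roughly as follows. This is an illustration under stated assumptions, not QuperSync's actual engine: the names `batched`, `sync_in_batches`, `write_batch`, `max_attempts`, and `backoff_seconds` are hypothetical, and the retry policy (a fixed number of attempts with linear backoff) is only one plausible choice.

```python
import time
from typing import Callable, Iterable, Iterator, List, Sequence, TypeVar

Row = TypeVar("Row")


def batched(rows: Iterable[Row], batch_size: int) -> Iterator[List[Row]]:
    """Yield consecutive lists of at most `batch_size` rows."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[Row] = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # flush the final, possibly partial, batch
        yield batch


def sync_in_batches(
    rows: Iterable[Row],
    write_batch: Callable[[Sequence[Row]], None],
    batch_size: int = 1000,        # mirrors SYNC_BATCH_SIZE in the sample config
    max_attempts: int = 3,         # hypothetical retry limit
    backoff_seconds: float = 0.0,  # hypothetical base delay between retries
) -> int:
    """Write rows batch by batch, retrying a failed batch before giving up.

    Returns the number of rows written. Re-raises the last error if a
    batch still fails after `max_attempts` tries.
    """
    written = 0
    for batch in batched(rows, batch_size):
        for attempt in range(1, max_attempts + 1):
            try:
                write_batch(batch)
                break
            except Exception:
                if attempt == max_attempts:
                    raise
                time.sleep(backoff_seconds * attempt)  # linear backoff
        written += len(batch)
    return written
```

Because a failed batch is retried as a whole, `write_batch` should be idempotent (for example, an upsert keyed on a natural key) so that a partially applied batch does not produce duplicates.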
Domain Layer
Domain entities, repository interfaces, and service contracts that define the core business logic.
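To make the split between entities and repository interfaces concrete, here is a minimal sketch. Everything in it is assumed for illustration: `MetricSample`, `MetricSourceRepository`, `MetricTargetRepository`, and `InMemoryTargetRepository` are hypothetical names, and the real entities and contracts may look quite different.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class MetricSample:
    """Hypothetical entity: one Redshift system metric reading (pure Python)."""
    cluster_id: str
    metric_name: str
    recorded_at: datetime
    value: float


class MetricSourceRepository(ABC):
    """Hypothetical read-side contract; a Redshift adapter would implement it."""

    @abstractmethod
    def fetch_since(self, since: datetime) -> Iterable[MetricSample]:
        ...


class MetricTargetRepository(ABC):
    """Hypothetical write-side contract; a PostgreSQL adapter would implement it."""

    @abstractmethod
    def upsert_many(self, samples: Iterable[MetricSample]) -> int:
        ...


class InMemoryTargetRepository(MetricTargetRepository):
    """Test double that upserts on (cluster_id, metric_name, recorded_at)."""

    def __init__(self) -> None:
        self.rows: Dict[Tuple[str, str, datetime], MetricSample] = {}

    def upsert_many(self, samples: Iterable[MetricSample]) -> int:
        count = 0
        for sample in samples:
            key = (sample.cluster_id, sample.metric_name, sample.recorded_at)
            self.rows[key] = sample  # insert, or overwrite an existing reading
            count += 1
        return count
```

Keeping the interfaces free of database imports lets the application layer be tested against the in-memory double, while the infrastructure layer supplies the Redshift and PostgreSQL implementations.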
Infrastructure Layer
Concrete database adapters, repository implementations, and the APScheduler configuration.
API Endpoints
REST endpoints for triggering on-demand syncs, checking sync status, and configuring schedules.
Environment Configuration
# Source: Amazon Redshift
REDSHIFT_HOST=my-cluster.abc.us-east-1.redshift.amazonaws.com
REDSHIFT_PORT=5439
REDSHIFT_DB=dev
REDSHIFT_USER=sync_user
REDSHIFT_PASSWORD=...
# Target: PostgreSQL
POSTGRES_HOST=localhost
POSTGRES_PORT=5432
POSTGRES_DB=quper
POSTGRES_USER=quper_sync
POSTGRES_PASSWORD=...
# AWS (for Redshift cluster management APIs)
AWS_ACCESS_KEY_ID=AKIA...
AWS_SECRET_ACCESS_KEY=...
AWS_DEFAULT_REGION=us-east-1
# Sync Configuration
SYNC_INTERVAL_SECONDS=300 # How often to run scheduled syncs (5 min)
SYNC_BATCH_SIZE=1000 # Rows per batch for large tables
TIMEZONE=US/Pacific # Scheduler timezone
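
One way these variables could be read and validated at startup is sketched below, using only the standard library. `SyncSettings` and `load_settings` are hypothetical names, only a subset of the variables is covered, and the fallback values simply mirror the sample file above; they are assumptions, not documented defaults of the service.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class SyncSettings:
    """Hypothetical settings object covering a subset of the variables."""
    redshift_host: str
    redshift_port: int
    postgres_host: str
    postgres_port: int
    sync_interval_seconds: int
    sync_batch_size: int
    timezone: str


def _strip_inline_comment(value: str) -> str:
    """Drop a trailing ' # comment', as written in the sample file."""
    return value.split(" #", 1)[0].strip()


def load_settings(env: Optional[Mapping[str, str]] = None) -> SyncSettings:
    """Build settings from a mapping (defaults to the process environment)."""
    source = os.environ if env is None else env

    def text(name: str, default: Optional[str] = None) -> str:
        raw = source.get(name)
        value = _strip_inline_comment(raw) if raw is not None else ""
        if value:
            return value
        if default is None:
            raise ValueError(f"missing required setting: {name}")
        return default

    def positive_int(name: str, default: int) -> int:
        value = int(text(name, str(default)))
        if value < 1:
            raise ValueError(f"{name} must be a positive integer")
        return value

    return SyncSettings(
        redshift_host=text("REDSHIFT_HOST"),
        redshift_port=positive_int("REDSHIFT_PORT", 5439),  # assumed fallback
        postgres_host=text("POSTGRES_HOST"),
        postgres_port=positive_int("POSTGRES_PORT", 5432),  # assumed fallback
        sync_interval_seconds=positive_int("SYNC_INTERVAL_SECONDS", 300),
        sync_batch_size=positive_int("SYNC_BATCH_SIZE", 1000),
        timezone=text("TIMEZONE", "US/Pacific"),
    )
```

Failing fast on a missing host or a non-positive interval surfaces configuration mistakes at startup rather than at the first scheduled sync.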