QuperSync Architecture

Architecture

QuperSync is built on Domain-Driven Design (DDD) principles with strict layer separation. The codebase is organized into three layers — domain, infrastructure, and interface — where each layer has clearly defined responsibilities and dependency rules.

Project Structure

cricut.quper.qupersync/ — Folder Structure

cricut.quper.qupersync/
├── domain/                  → Business logic (no external dependencies)
│   ├── entities/            → Core data models (pure Python dataclasses)
│   │   ├── cost_usage.py    → CostUsageRecord, CostUsageSummary
│   │   ├── redshift.py      → RedshiftTable, RedshiftQuery
│   │   └── sync_state.py    → SyncWatermark, SyncResult
│   ├── repositories/        → Abstract repository interfaces (ABCs)
│   │   ├── source.py        → RedshiftSourceRepository (abstract)
│   │   └── target.py        → PostgresTargetRepository (abstract)
│   └── services/            → Sync orchestration business logic
│       ├── sync_service.py  → Extract → Transform → Load orchestration
│       ├── watermark.py     → Incremental sync state management
│       └── validation.py    → Pre-flight schema validation
├── infrastructure/          → Database adapters (implements domain interfaces)
│   ├── redshift/            → Redshift source adapter
│   │   ├── connector.py     → redshift-connector connection pool
│   │   └── repository.py    → RedshiftSourceRepository implementation
│   └── postgres/            → PostgreSQL target adapter
│       ├── connector.py     → psycopg2 + SQLAlchemy connection
│       └── repository.py    → PostgresTargetRepository implementation
└── interface/               → HTTP API + scheduler entry points
    ├── api/                 → FastAPI route definitions
    │   ├── main.py          → App factory and router registration
    │   └── routes.py        → Sync job and scheduler API endpoints
    └── scheduler/           → APScheduler job definitions
        ├── setup.py         → AsyncIOScheduler initialization
        └── jobs.py          → Registered sync job functions

Layer Responsibilities

Domain Layer

The domain layer contains all business logic and has zero external dependencies — no database drivers, no HTTP clients, no framework imports. It defines:

Entities: Pure Python dataclasses representing the core data models. No ORM annotations — entities are framework-agnostic.
Repository Interfaces: Abstract base classes (ABCs) that define the contract for data access. The domain knows what data it needs but not how to fetch it.
Domain Services: Orchestration logic for the sync workflow. Services depend only on repository interfaces, not implementations.
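The relationship between entities, repository interfaces, and domain services can be sketched as follows. This is a minimal illustration and not the project's actual code: the class names CostUsageRecord, RedshiftSourceRepository, and PostgresTargetRepository come from the folder structure above, while the field names, the method names (fetch_since, upsert, run), the SyncService class, and the in-memory fakes are assumptions made for the example.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class CostUsageRecord:
    # Field names are illustrative assumptions, not the real schema.
    account_id: str
    usage_date: datetime
    cost_usd: float


class RedshiftSourceRepository(ABC):
    """Contract for reading records; the domain does not know how they are fetched."""

    @abstractmethod
    def fetch_since(self, watermark: datetime) -> List[CostUsageRecord]:
        ...


class PostgresTargetRepository(ABC):
    """Contract for writing records; the domain does not know how they are stored."""

    @abstractmethod
    def upsert(self, records: Iterable[CostUsageRecord]) -> int:
        ...


class SyncService:
    """Hypothetical domain service that depends only on the two interfaces."""

    def __init__(self, source: RedshiftSourceRepository,
                 target: PostgresTargetRepository) -> None:
        self._source = source
        self._target = target

    def run(self, watermark: datetime) -> int:
        records = self._source.fetch_since(watermark)
        return self._target.upsert(records)


class InMemorySource(RedshiftSourceRepository):
    """Test double: serves records from a list instead of Redshift."""

    def __init__(self, records: Iterable[CostUsageRecord]) -> None:
        self._records = list(records)

    def fetch_since(self, watermark: datetime) -> List[CostUsageRecord]:
        return [r for r in self._records if r.usage_date > watermark]


class InMemoryTarget(PostgresTargetRepository):
    """Test double: keeps rows in a dict keyed by (account_id, usage_date)."""

    def __init__(self) -> None:
        self.rows: Dict[Tuple[str, datetime], CostUsageRecord] = {}

    def upsert(self, records: Iterable[CostUsageRecord]) -> int:
        count = 0
        for record in records:
            self.rows[(record.account_id, record.usage_date)] = record
            count += 1
        return count
```

Because SyncService receives its repositories through the constructor, a unit test can pass InMemorySource and InMemoryTarget and exercise the orchestration logic without any database connection.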

Infrastructure Layer

The infrastructure layer implements the domain repository interfaces using real database drivers:

Redshift adapter: Uses redshift-connector (Amazon's pure-Python, DB-API 2.0 driver for Redshift) to execute extraction queries
PostgreSQL adapter: Uses psycopg2 and SQLAlchemy Core for upsert operations
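The upsert behaviour of the target adapter can be illustrated with a small, self-contained sketch. To keep it runnable without a PostgreSQL server, it swaps in Python's built-in sqlite3 module, whose ON CONFLICT ... DO UPDATE clause closely mirrors PostgreSQL's; the real adapter is described as using psycopg2 and SQLAlchemy Core instead. The table name cost_usage, its columns, and the SqliteTargetRepository class are hypothetical.

```python
import sqlite3
from typing import Iterable, Tuple

# (account_id, usage_date, cost_usd): hypothetical columns for illustration.
Row = Tuple[str, str, float]

# PostgreSQL accepts the same ON CONFLICT clause; psycopg2 would use %s
# placeholders instead of ?.
UPSERT_SQL = """
INSERT INTO cost_usage (account_id, usage_date, cost_usd)
VALUES (?, ?, ?)
ON CONFLICT (account_id, usage_date)
DO UPDATE SET cost_usd = excluded.cost_usd
"""


class SqliteTargetRepository:
    """Stand-in for the PostgreSQL target adapter, backed by SQLite."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn
        self._conn.execute(
            """
            CREATE TABLE IF NOT EXISTS cost_usage (
                account_id TEXT NOT NULL,
                usage_date TEXT NOT NULL,
                cost_usd   REAL NOT NULL,
                PRIMARY KEY (account_id, usage_date)
            )
            """
        )

    def upsert(self, rows: Iterable[Row]) -> int:
        batch = list(rows)
        # The connection context manager commits on success and rolls back on error.
        with self._conn:
            self._conn.executemany(UPSERT_SQL, batch)
        return len(batch)

    def count(self) -> int:
        return self._conn.execute("SELECT COUNT(*) FROM cost_usage").fetchone()[0]
```

Running upsert twice with the same batch leaves the row count unchanged, and a later batch with a changed cost overwrites the existing row rather than adding a duplicate.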

Interface Layer

The interface layer exposes the domain services to the outside world:

FastAPI API: REST endpoints for triggering syncs and managing scheduler jobs
APScheduler: AsyncIOScheduler that runs sync jobs on a configurable interval without blocking the API server
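How an interval job can share one event loop with request handlers can be sketched with the standard library alone. This is not APScheduler or FastAPI code: run_on_interval is a hypothetical stand-in for an interval trigger, handle_request stands in for an API route, and blocking_sync_job assumes the sync run calls blocking database drivers, which is why it is offloaded with asyncio.to_thread (Python 3.9+).

```python
import asyncio
import time
from typing import Callable, List, Tuple


def blocking_sync_job() -> str:
    # Stand-in for a sync run that uses blocking database drivers.
    time.sleep(0.05)
    return "synced"


async def run_on_interval(job: Callable[[], str], interval_s: float,
                          runs: int) -> List[str]:
    """Hypothetical interval trigger: run the job, wait, repeat."""
    results: List[str] = []
    for _ in range(runs):
        # Offload the blocking call so the event loop stays free for requests.
        results.append(await asyncio.to_thread(job))
        await asyncio.sleep(interval_s)
    return results


async def handle_request(n: int) -> str:
    # Stand-in for an API handler served by the same event loop.
    await asyncio.sleep(0)
    return f"response {n}"


async def main() -> Tuple[List[str], List[str]]:
    # Start the periodic job in the background, then serve requests meanwhile.
    scheduler_task = asyncio.create_task(
        run_on_interval(blocking_sync_job, interval_s=0.01, runs=3)
    )
    responses = [await handle_request(i) for i in range(5)]
    job_results = await scheduler_task
    return responses, job_results


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The request handlers complete while the background job is still sleeping in its worker thread, which is the property the scheduler setup is meant to preserve.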

Key Architectural Decisions

Decision	Rationale
Domain has zero infrastructure imports	Domain services can be unit tested against mock repositories, with no Redshift or PostgreSQL connection required.
Upsert over insert	INSERT ... ON CONFLICT DO UPDATE makes every sync run idempotent — safe to re-run after a failure without duplicating data.
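AsyncIOScheduler	Runs in the same event loop as FastAPI, so scheduled jobs written as coroutines yield to the loop while they await I/O, and no separate scheduler process is needed. Calls into blocking drivers such as psycopg2 still have to be offloaded to a worker thread, or they stall the loop.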

Dependency Direction

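Dependencies point inward: interface → infrastructure → domain. The interface layer wires infrastructure adapters into domain services, the infrastructure layer implements the repository interfaces that the domain defines, and the domain layer never imports from infrastructure or interface. This follows the Dependency Inversion Principle: high-level business rules depend on abstractions, not on low-level implementation details.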