Cloud Data

Repository: landerox/cloud-landerox-data

cloud-landerox-data is the public architecture and engineering baseline for my personal GCP data platform.

It documents patterns, contracts, quality standards, and implementation blueprints. Sensitive production pipelines are kept in private repositories where needed.

Current State

  • Public repo role: baseline docs, templates, shared standards, tests.
  • Runtime folders (functions/ and dataflow/) exist only as placeholders.
  • Shared infrastructure utilities are implemented in shared/common.
  • Architecture stance is hybrid: warehouse + lakehouse, batch + streaming, selective Kappa.

What It Covers

  1. Architecture decisions (ADRs) and decision matrix.
  2. Data platform patterns: contracts, DLQ/replay, idempotency, quality gates, observability, governance.
  3. Reference blueprints and templates for the first runtime scope and for private implementation checklists.
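
As a rough illustration of how three of the patterns listed above (contracts, DLQ/replay, idempotency) can fit together, here is a minimal sketch. It is not code from the repository: `OrderEvent`, `parse_event`, and `process_batch` are hypothetical names, the in-memory set and list stand in for a real deduplication store and dead-letter topic, and standard-library dataclasses are used only to keep the example dependency-free (pydantic, listed under Tech Focus, could fill the contract role instead).

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class OrderEvent:
    """Hypothetical data contract: the fields a valid event must carry."""
    event_id: str
    amount_cents: int


def parse_event(raw: dict[str, Any]) -> OrderEvent:
    """Validate a raw payload against the contract; raise ValueError on violation."""
    event_id = raw.get("event_id")
    amount = raw.get("amount_cents")
    if not isinstance(event_id, str) or not event_id:
        raise ValueError("event_id must be a non-empty string")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(amount, int) or isinstance(amount, bool) or amount < 0:
        raise ValueError("amount_cents must be a non-negative integer")
    return OrderEvent(event_id=event_id, amount_cents=amount)


def process_batch(
    raw_events: list[dict[str, Any]], seen_ids: set[str]
) -> tuple[list[OrderEvent], list[dict[str, Any]]]:
    """Split a batch into accepted events and dead letters.

    seen_ids is an assumed stand-in for a durable deduplication store;
    it is mutated so that replaying the same batch accepts nothing twice.
    """
    accepted: list[OrderEvent] = []
    dead_letters: list[dict[str, Any]] = []
    for raw in raw_events:
        try:
            event = parse_event(raw)
        except ValueError as exc:
            # Contract violation: keep the payload and the reason for later replay.
            dead_letters.append({"payload": raw, "error": str(exc)})
            continue
        if event.event_id in seen_ids:
            # Idempotency: an already-processed event is skipped silently.
            continue
        seen_ids.add(event.event_id)
        accepted.append(event)
    return accepted, dead_letters
```

Calling `process_batch` twice with the same batch and the same `seen_ids` set accepts each valid event only once, while invalid payloads land in the dead-letter list with the error that rejected them.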

Tech Focus

  • Modern Python toolchains (uv, pydantic, pytest)
  • Cloud Functions, Pub/Sub, Dataflow
  • BigQuery, BigLake, GCS
  • Open formats: JSON/Avro/Parquet
  • Optional interoperability: Databricks/Delta when required

Relationship to Infra

This project complements Cloud Infra, which provisions the Terraform baseline and security foundation.