Cloud Data
Repository: landerox/cloud-landerox-data
cloud-landerox-data is the public architecture and engineering baseline for my personal GCP data platform.
It documents patterns, contracts, quality standards, and implementation blueprints while keeping sensitive production runtime pipelines in private repositories when needed.
Current State
- Public repo role: baseline docs, templates, shared standards, tests.
- Runtime folders exist as placeholders (functions/ and dataflow/ paths).
- Shared infrastructure utilities are implemented in shared/common.
- Architecture stance is hybrid: warehouse + lakehouse, batch + streaming, selective Kappa.
What It Covers
- Architecture decisions (ADRs) and decision matrix.
- Data platform patterns: contracts, DLQ/replay, idempotency, quality gates, observability, governance.
- Reference blueprints and templates for first runtime scope and private implementation checklists.
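
To make three of the patterns above concrete (contracts, DLQ/replay, idempotency), here is a minimal in-memory sketch. It is not taken from the repository: OrderEvent, parse_event, and Pipeline are hypothetical names, the lists and set stand in for a real sink, dedup store, and dead-letter topic, and the standard library is used in place of pydantic so the example stays self-contained.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class OrderEvent:
    """Hypothetical contract: the fields every valid event must carry."""
    event_id: str
    amount_cents: int


def parse_event(raw: dict) -> OrderEvent:
    """Validate a raw payload against the contract; raise ValueError on violations."""
    event_id = raw.get("event_id")
    amount = raw.get("amount_cents")
    if not isinstance(event_id, str) or not event_id:
        raise ValueError("event_id must be a non-empty string")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(amount, bool) or not isinstance(amount, int) or amount < 0:
        raise ValueError("amount_cents must be a non-negative integer")
    return OrderEvent(event_id=event_id, amount_cents=amount)


@dataclass
class Pipeline:
    """In-memory stand-ins for a sink table, a dedup store, and a dead-letter queue."""
    sink: list = field(default_factory=list)
    seen_ids: set = field(default_factory=set)
    dead_letters: list = field(default_factory=list)

    def process(self, raw: dict) -> str:
        try:
            event = parse_event(raw)
        except ValueError as exc:
            # Contract violation: park the payload with its error instead of dropping it.
            self.dead_letters.append({"payload": raw, "error": str(exc)})
            return "dead_lettered"
        if event.event_id in self.seen_ids:
            # Idempotency: a redelivered event must not be written twice.
            return "duplicate"
        self.seen_ids.add(event.event_id)
        self.sink.append(event)
        return "written"

    def replay(self, fix) -> list:
        """Re-run dead-lettered payloads after applying a caller-supplied fix."""
        pending, self.dead_letters = self.dead_letters, []
        return [self.process(fix(item["payload"])) for item in pending]


pipeline = Pipeline()
first = pipeline.process({"event_id": "a1", "amount_cents": 500})
again = pipeline.process({"event_id": "a1", "amount_cents": 500})
broken = pipeline.process({"event_id": "a2", "amount_cents": "12"})
replayed = pipeline.replay(lambda p: {**p, "amount_cents": int(p["amount_cents"])})
print(first, again, broken, replayed)
```

In a real deployment the dedup set would need to be durable (for example a keyed table or a MERGE on event_id), and the dead-letter list would be a separate topic or bucket. Those choices are outside what this sketch assumes.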
Tech Focus
- Modern Python toolchains (uv, pydantic, pytest)
- Cloud Functions, Pub/Sub, Dataflow
- BigQuery, BigLake, GCS
- Open formats: JSON/Avro/Parquet
- Optional interoperability: Databricks/Delta when required
Relationship to Infra
This project complements Cloud Infra, which provisions the Terraform baseline and security foundation.