Data Governance & Modeling
Technical strategies for ensuring data trust, quality, and scalability. The focus shifts from "moving data" to "managing data as a product".
Data Modeling Architectures
| Pattern | Status | Context |
|---|---|---|
| Medallion Architecture | ADOPT | The standard Bronze/Silver/Gold layering. Simple, effective, and widely understood for Lakehouse setups. |
| Anchor Modeling | ASSESS | Lighter alternative to Data Vault for auditability and historization with less overhead. |
| Data Vault 2.0 | ASSESS | Investigating for complex enterprise hubs where auditing and historical tracking are critical, despite the complexity overhead. |
| Dimensional vs. 3NF | ASSESS | Explicit guidance on when not to use Kimball and when normalized 3NF models are more appropriate. |
| Kimball (Star Schema) | ADOPT | The gold standard for the presentation layer (data marts), chosen to ensure fast BI performance. |
| Semantic / Metrics Layer | TRIAL | Unifying business metrics as a reusable layer to reduce metric drift across tools. |
| Event Modeling | ASSESS | Useful when building event-driven pipelines to keep behavioral flows explicit and aligned. |
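To make the Bronze/Silver/Gold layering from the table concrete, here is a minimal sketch in plain Python. The sample records, the `to_silver` and `to_gold` helpers, and the "revenue per customer" aggregate are all hypothetical and chosen only for illustration. A real Lakehouse would typically implement these steps as tables in a processing engine, not as in-memory lists.

```python
from collections import defaultdict

# Bronze: raw records exactly as ingested (hypothetical sample data).
bronze = [
    {"order_id": "1", "customer": " alice ", "amount": "10.50"},
    {"order_id": "2", "customer": "bob", "amount": "n/a"},        # unparseable amount
    {"order_id": "1", "customer": " alice ", "amount": "10.50"},  # duplicate of order 1
    {"order_id": "3", "customer": "Alice", "amount": "4.50"},
]


def to_silver(rows):
    """Clean, type, and deduplicate Bronze rows."""
    seen_ids = set()
    silver = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            # Assumption: bad rows are simply dropped here; a real pipeline
            # would more likely route them to a quarantine table.
            continue
        order_id = int(row["order_id"])
        if order_id in seen_ids:
            continue  # keep only the first occurrence of each order
        seen_ids.add(order_id)
        silver.append(
            {
                "order_id": order_id,
                "customer": row["customer"].strip().lower(),
                "amount": amount,
            }
        )
    return silver


def to_gold(rows):
    """Aggregate Silver rows into a business-level view: revenue per customer."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["customer"]] += row["amount"]
    return dict(totals)


silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)
```

Each layer reads only from the one before it, so Bronze stays an untouched record of what arrived, while the cleaning rules (Silver) and the business aggregates (Gold) can be changed and recomputed independently.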
Quality & Validation
| Tool | Status | Context |
|---|---|---|
| Great Expectations | ADOPT | Robust framework for testing data at ingestion, ensuring strict quality gates before processing. |
| Soda | TRIAL | Lightweight, SQL-native data quality monitoring for detecting anomalies in the warehouse. |
| Data Contracts | ASSESS | Defining data interfaces as code to prevent breaking downstream pipelines. |
| DataHub | ASSESS | Open-source metadata platform for end-to-end lineage and discovery. |
| SQLMesh | ASSESS | Watching as a potential successor to dbt, for its more robust environment management and semantic understanding of SQL. |
| OpenLineage | ASSESS | Open standard for data lineage across orchestration, processing, and observability tools. |
| Amundsen | ASSESS | Open-source data catalog focused on discovery, ownership, and metadata search. |
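The ideas of a data contract ("data interfaces as code") and a quality gate at ingestion can be sketched in a few lines of plain Python. This is an illustration only: the `ORDERS_CONTRACT` structure and the `violations` and `quality_gate` helpers are hypothetical names, and the sketch does not use or mirror the API of Great Expectations, Soda, or any data contract specification.

```python
# Hypothetical contract for an "orders" dataset:
# column name -> (expected Python type, whether null is allowed).
ORDERS_CONTRACT = {
    "order_id": (int, False),
    "customer": (str, False),
    "amount": (float, False),
    "coupon": (str, True),
}


def violations(row, contract):
    """Return a list of human-readable contract violations for one row."""
    problems = []
    for column, (expected_type, nullable) in contract.items():
        if column not in row:
            problems.append(f"missing column: {column}")
            continue
        value = row[column]
        if value is None:
            if not nullable:
                problems.append(f"null not allowed: {column}")
        elif not isinstance(value, expected_type):
            problems.append(
                f"wrong type for {column}: expected {expected_type.__name__}, "
                f"got {type(value).__name__}"
            )
    return problems


def quality_gate(rows, contract):
    """Split rows into those that satisfy the contract and those that do not."""
    passed, rejected = [], []
    for row in rows:
        problems = violations(row, contract)
        if problems:
            rejected.append((row, problems))
        else:
            passed.append(row)
    return passed, rejected


# Hypothetical incoming batch.
rows = [
    {"order_id": 1, "customer": "alice", "amount": 10.5, "coupon": None},
    {"order_id": "2", "customer": "bob", "amount": 3.0, "coupon": "WELCOME"},
    {"order_id": 3, "customer": None, "amount": 7.25},
]

passed, rejected = quality_gate(rows, ORDERS_CONTRACT)
for row, problems in rejected:
    print(row.get("order_id"), problems)
```

Keeping the contract as a versioned artifact next to the producer's code means a breaking schema change surfaces as a failed check at the producer, before downstream pipelines consume the data.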