Tech Stack
A strategic selection of the technologies and architectural patterns I use to build production-ready data ecosystems, cloud infrastructure, and AI platforms.
Cloud Platforms (AWS & GCP)
- Data & Analytics: BigQuery, BigLake, Dataflow, Dataproc, Cloud Storage, Amazon S3, AWS Glue, Amazon Athena, Amazon Redshift.
- AI & ML: Vertex AI (Model Garden, Pipelines, Feature Store), Gemini, Amazon SageMaker, Amazon Bedrock.
- Compute & Messaging: Cloud Run, GKE, Pub/Sub, Cloud Functions, AWS Lambda, Amazon ECS/Fargate.
- Orchestration: Cloud Composer (Airflow), Cloud Workflows, Amazon MWAA.
DevOps & Infrastructure (IaC)
- Provisioning: Terraform (Modular IaC), Terragrunt.
- Containers: Docker, Kubernetes (GKE), Artifact Registry.
- CI/CD & Tooling: GitHub Actions, GitLab CI, just (task runner), uv (Python toolchain).
- Standards: Conventional Commits, Semantic Versioning (SemVer), Pre-commit Hooks.
Data Engineering & Orchestration
- Tools: dbt (Core/Cloud), Apache Airflow, Dagster, Airbyte, Apache Hadoop.
- Processing: Apache Spark (PySpark/Scala), Databricks, Pandas, Polars.
- Table Formats: Apache Iceberg (primary), BigQuery native tables, Delta Lake, Apache Hudi.
Production AI & Applied MLOps
- Frameworks: LangChain, LangGraph, Pydantic, FastAPI.
- AI Patterns: RAG Architectures, AI Agents, Model Context Protocol (MCP), LLM Evaluation & Governance.
Languages, Libraries & Formats
- Languages: Python, SQL, Rust, Scala, Go, Shell Scripting (Bash).
- Libraries & Tooling: Pydantic, pandas, Polars, pytest, Poetry, uv.
- Data Formats: Apache Parquet, Apache Avro, JSON / NDJSON.
Architecture & Reliability
- Core Patterns: Event-driven architectures, Lambda (batch + streaming) and Kappa (streaming-only) architectures, Medallion Architecture (Bronze/Silver/Gold).
- Engineering Standards: Data Contracts, Schema Evolution, Idempotency, Deduplication, Quality Gates, and Replayability.
- Governance & Modeling: Data Mesh (domain ownership), Data Vault 2.0, Kimball dimensional modeling.
Databases
- Relational: PostgreSQL, MySQL, SQL Server, IBM Db2.
- NoSQL & Graph: MongoDB, Neo4j.