Web Services

Version: 1.0.0
Extends: General / Core
Template: template-python-web-service

Standards for building APIs, microservices, and web applications. Includes Docker containerization, Cloud Run deployment, and observability.
Use Cases

| Type | Examples |
| --- | --- |
| REST APIs | Public/private APIs, microservices |
| Web Applications | FastAPI + templates, Streamlit, Django |
| Webhooks | Event receivers, integrations |
| Backend for Frontend | BFF pattern |

Stack

Web Framework

| Tool | Description |
| --- | --- |
| FastAPI | High-performance async framework with automatic OpenAPI docs and Pydantic validation |

ASGI Server

| Tool | Description | When to Use |
| --- | --- | --- |
| Granian | High-performance ASGI server (Rust), integrated worker management | Recommended default |
| Uvicorn | Standard ASGI server, very mature | Fallback if Granian incompatibilities arise |

Granian vs Uvicorn:
| Aspect | Granian | Uvicorn |
| --- | --- | --- |
| Performance | Superior (Rust) | Very good (with uvloop) |
| Worker management | Integrated | Requires Gunicorn |
| Docker image | Simpler | More complex |
| Maturity | Newer | Very mature |

HTTP Client

| Tool | Description |
| --- | --- |
| httpx | Async/sync HTTP client with HTTP/2 support, configurable timeouts |

Resilience

| Tool | Description |
| --- | --- |
| tenacity | Retries with configurable stop conditions and backoff. Circuit breaking is not built in and needs a separate component |
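
To make the retry-with-backoff idea concrete, here is a minimal standard-library sketch. It is not tenacity's API: the function name `retry_with_backoff` and its parameters are hypothetical and chosen only for illustration.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    func: Callable[[], T],
    *,
    attempts: int = 3,
    base_delay: float = 0.1,
    max_delay: float = 2.0,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], object] = time.sleep,
) -> T:
    """Call func, retrying transient errors with exponential backoff and jitter."""
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            # The delay ceiling doubles on each attempt and is capped at max_delay;
            # the actual sleep is a random value below that ceiling (full jitter).
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, ceiling))
    raise AssertionError("unreachable")
```

Only errors listed in `retry_on` are retried; anything else propagates immediately, which keeps programming errors from being masked by retries.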
Serialization

| Tool | Description | When to Use |
| --- | --- | --- |
| orjson | Ultra-fast JSON (Rust), drop-in replacement | High-traffic APIs |
| msgspec | JSON/MessagePack, 2-5x faster than Pydantic | Critical performance |

Note: Pydantic v2 is fast enough for most cases. Use orjson/msgspec only when profiling indicates serialization is a bottleneck.
Inter-Service Communication

| Tool | Description | When to Use |
| --- | --- | --- |
| httpx | HTTP/REST client | External APIs, most cases |
| grpcio | gRPC framework | High-frequency internal communication |

gRPC vs REST:
| Criterion | gRPC | REST |
| --- | --- | --- |
| Performance | High (binary, HTTP/2) | Medium (JSON, HTTP/1.1) |
| Streaming | Native bidirectional | Limited (SSE, WebSocket) |
| Contracts | Strict (.proto) | Flexible (OpenAPI optional) |
| Browser support | Limited (grpc-web) | Native |
| Debugging | Harder (binary) | Easy (plain text) |

Recommendation: REST for public APIs and frontends, gRPC for internal high-traffic microservices.
Authentication

Password Hashing

| Algorithm | Recommendation |
| --- | --- |
| Argon2id | Preferred (PHC winner) |
| bcrypt | Acceptable, widely used |
| scrypt | Acceptable |
| PBKDF2 | Legacy, avoid |
| MD5, SHA1, SHA256 | Prohibited for passwords |
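
As an illustration of salted, memory-hard password hashing, the sketch below uses scrypt because it ships in Python's standard library. Argon2id, the preferred option above, needs a third-party package. The cost parameters and the `salt$digest` storage format are assumptions for this example only.

```python
import hashlib
import hmac
import secrets

# Illustrative cost parameters (assumptions); tune them for your hardware.
SCRYPT_N, SCRYPT_R, SCRYPT_P, DIGEST_LEN = 2**14, 8, 1, 32


def _derive(password: str, salt: bytes) -> bytes:
    return hashlib.scrypt(
        password.encode("utf-8"),
        salt=salt,
        n=SCRYPT_N,
        r=SCRYPT_R,
        p=SCRYPT_P,
        dklen=DIGEST_LEN,
    )


def hash_password(password: str) -> str:
    """Return 'salt$digest' (both hex) using a fresh random salt per password."""
    salt = secrets.token_bytes(16)
    return f"{salt.hex()}${_derive(password, salt).hex()}"


def verify_password(password: str, stored: str) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    salt_hex, digest_hex = stored.split("$")
    candidate = _derive(password, bytes.fromhex(salt_hex))
    return hmac.compare_digest(candidate, bytes.fromhex(digest_hex))
```

The per-password salt means two users with the same password get different stored values, and `hmac.compare_digest` avoids leaking information through comparison timing.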
JWT Best Practices

| Practice | Description |
| --- | --- |
| Algorithm | Use RS256 or ES256, avoid HS256 in production |
| Expiration | Access tokens: 15-60 min |
| Refresh tokens | To renew access tokens |
| Validate claims | Always verify iss, aud, exp |
| No sensitive data | JWT payload is only base64url-encoded, not encrypted, so anyone holding the token can read it |
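
The sketch below illustrates two of the practices above using only the standard library: the payload is readable without any key, and `iss`, `aud`, and `exp` are checked explicitly. It does not verify the signature, which a real service must do first with a JWT library. The function names are hypothetical.

```python
import base64
import json
import time
from typing import Any, Dict, Optional


def decode_payload_unverified(token: str) -> Dict[str, Any]:
    """Decode the payload segment WITHOUT verifying the signature (inspection only)."""
    payload_b64 = token.split(".")[1]
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(padded))


def validate_claims(
    claims: Dict[str, Any],
    *,
    issuer: str,
    audience: str,
    now: Optional[float] = None,
) -> None:
    """Raise ValueError unless iss, aud, and exp are all acceptable."""
    now = time.time() if now is None else now
    if claims.get("iss") != issuer:
        raise ValueError("unexpected issuer")
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]  # aud may be a string or a list
    if audience not in audiences:
        raise ValueError("unexpected audience")
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= now:
        raise ValueError("token expired or exp missing")
```

Because `decode_payload_unverified` needs no secret, it also demonstrates why sensitive data must never be placed in the payload.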
API Design

Structure

| Component | Responsibility |
| --- | --- |
| Routers | Endpoint definition, input validation |
| Schemas | Pydantic models for request/response |
| Services | Business logic |
| Repositories | Data access |
| Dependencies | Component injection |

Versioning

- Include version in path: /api/v1/...
- Maintain backward compatibility within a version
- Deprecate before removing

REST Conventions

| Resource | Convention | Example |
| --- | --- | --- |
| Collection | Plural noun | /users, /orders |
| Item | Collection + ID | /users/{id} |
| Action | Verb (avoid) | /orders/{id}/cancel |
| Relationship | Nested resource | /users/{id}/orders |
| Filters | Query params | /users?status=active |

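HTTP Verbs

| Verb | Use | Idempotent |
| --- | --- | --- |
| GET | Read resource(s) | Yes |
| POST | Create resource | No* |
| PUT | Replace complete resource | Yes |
| PATCH | Partially update | Not guaranteed (depends on the patch semantics) |
| DELETE | Delete resource | Yes |

*POST can be made idempotent with an Idempotency-Key header
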
Pydantic Schemas

| Practice | Description |
| --- | --- |
| Separate schemas | Different schemas for Create/Update/Response |
| Validation in schema | Use Pydantic validators |
| from_attributes | Enable to map from ORM |
| Examples | Include examples for OpenAPI documentation |

Error Handling

Use the Problem Details standard (RFC 9457, which obsoletes RFC 7807) for error responses:

| Field | Type | Description |
| --- | --- | --- |
| type | URI | Error type identifier |
| title | string | Human-readable error title |
| status | integer | HTTP status code |
| detail | string | Specific description |
| instance | URI | URI of the problem instance |

Additional fields:
| Field | Use |
| --- | --- |
| errors | Array of validation errors |
| trace_id | ID for log correlation |
| code | Internal error code |
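
A minimal sketch of assembling such a body follows. The helper name `problem_details` and the example URIs are hypothetical. The standard fields come from the tables above, and extension members such as `errors`, `trace_id`, and `code` are passed as keyword arguments.

```python
from http import HTTPStatus
from typing import Any, Dict, Optional


def problem_details(
    status: int,
    detail: str,
    *,
    type_uri: str = "about:blank",
    instance: Optional[str] = None,
    **extensions: Any,
) -> Dict[str, Any]:
    """Build a Problem Details body; serve it as application/problem+json."""
    body: Dict[str, Any] = {
        "type": type_uri,
        "title": HTTPStatus(status).phrase,  # standard reason phrase for the status
        "status": status,
        "detail": detail,
    }
    if instance is not None:
        body["instance"] = instance
    body.update(extensions)  # e.g. errors, trace_id, code
    return body


example = problem_details(
    404,
    "User 42 does not exist",
    type_uri="https://api.example.com/problems/user-not-found",  # hypothetical URI
    instance="/users/42",
    trace_id="abc123",
    code="USER_NOT_FOUND",
)
```

Keeping the builder in one place makes it easier to guarantee that every error handler returns the same shape.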
Pagination

| Strategy | Use | Pros/Cons |
| --- | --- | --- |
| Offset-based | ?page=2&limit=20 | Simple, inefficient on large datasets |
| Cursor-based | ?cursor=abc123&limit=20 | Efficient, recommended for public APIs |
| Keyset | ?after_id=100&limit=20 | Similar to cursor, column-based |

Standard response:
| Field | Description |
| --- | --- |
| data | Results array |
| meta.total | Total items (if efficient) |
| meta.page / meta.cursor | Current position |
| meta.limit | Items per page |
| links.next | Next page URL |
| links.prev | Previous page URL |
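
The following sketch shows one way cursor-based pagination could produce the response shape above. It assumes rows are dictionaries sorted by an ascending integer `id`, wraps the last seen ID in an opaque base64 cursor, and omits `links.prev` for brevity. All names and the in-memory filtering are illustrative, since a real service would push the `id > after_id` condition into the database query.

```python
import base64
import json
from typing import Any, Dict, List, Optional


def encode_cursor(last_id: int) -> str:
    """Wrap the keyset position in an opaque, URL-safe token."""
    raw = json.dumps({"after_id": last_id}).encode()
    return base64.urlsafe_b64encode(raw).decode()


def decode_cursor(cursor: str) -> int:
    return json.loads(base64.urlsafe_b64decode(cursor))["after_id"]


def paginate(
    rows: List[Dict[str, Any]],
    *,
    cursor: Optional[str] = None,
    limit: int = 20,
    base_url: str = "/users",
) -> Dict[str, Any]:
    after_id = decode_cursor(cursor) if cursor else 0
    # Take one extra row to learn whether another page exists.
    window = [row for row in rows if row["id"] > after_id][: limit + 1]
    page, has_more = window[:limit], len(window) > limit
    next_url = None
    if has_more:
        next_url = f"{base_url}?cursor={encode_cursor(page[-1]['id'])}&limit={limit}"
    return {
        "data": page,
        "meta": {"cursor": cursor, "limit": limit},
        "links": {"next": next_url},
    }
```

Clients treat the cursor as opaque and simply follow `links.next` until it is `None`.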
Security

| Header | Value | Purpose |
| --- | --- | --- |
| Strict-Transport-Security | max-age=31536000; includeSubDomains | Force HTTPS |
| X-Content-Type-Options | nosniff | Prevent MIME sniffing |
| X-Frame-Options | DENY | Prevent clickjacking |
| Content-Security-Policy | Per application | Control loadable resources |
| Referrer-Policy | strict-origin-when-cross-origin | Control referrer info |

CORS

| Environment | Configuration |
| --- | --- |
| Development | Allow localhost with various ports |
| Staging | Specific staging domains |
| Production | Only production domains, explicit list |

Best practices:
| Practice | Description |
| --- | --- |
| Explicit origin list | Never use * with credentials |
| Validate Origin header | Verify against whitelist |
| Limit methods | Only necessary ones |
| Reasonable Max-Age | 86400 (24h) is common |
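
To show what these CORS practices amount to at the header level, here is a small framework-independent sketch. The origin list and the function name are hypothetical. In a FastAPI service this logic would normally be delegated to the framework's CORS middleware rather than written by hand.

```python
from typing import Dict, FrozenSet, Optional

# Hypothetical production allow-list; never "*" when credentials are allowed.
ALLOWED_ORIGINS: FrozenSet[str] = frozenset(
    {"https://app.example.com", "https://admin.example.com"}
)


def cors_headers(
    origin: Optional[str],
    allowed: FrozenSet[str] = ALLOWED_ORIGINS,
) -> Dict[str, str]:
    """Return CORS response headers for an allowed origin, or none at all."""
    if origin is None or origin not in allowed:
        return {}  # unknown origin: the browser will block the cross-origin read
    return {
        "Access-Control-Allow-Origin": origin,  # echo the exact origin, not "*"
        "Access-Control-Allow-Credentials": "true",
        "Access-Control-Allow-Methods": "GET, POST",  # only the methods needed
        "Access-Control-Max-Age": "86400",  # cache preflight results for 24h
        "Vary": "Origin",  # keep shared caches from mixing responses across origins
    }
```

Exact string matching against the allow-list avoids the classic mistake of prefix or substring checks, which would accept origins such as `https://app.example.com.evil.test`.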
Rate Limiting

| Header | Description |
| --- | --- |
| X-RateLimit-Limit | Total limit |
| X-RateLimit-Remaining | Remaining requests |
| X-RateLimit-Reset | Reset timestamp |
| Retry-After | Seconds to retry (on 429) |
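
As a rough illustration of how these headers could be computed, the sketch below implements a fixed-window counter in memory. The class name is hypothetical, the window algorithm is one choice among several, and a multi-instance deployment would need a shared store instead of a local dictionary.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class FixedWindowLimiter:
    limit: int
    window_seconds: int
    # (client_id, window_start) -> requests used; old windows are never evicted here.
    _counts: Dict[Tuple[str, int], int] = field(default_factory=dict)

    def check(self, client_id: str, now: float) -> Tuple[int, Dict[str, str]]:
        """Return (HTTP status, rate-limit headers) for one incoming request."""
        window_start = int(now // self.window_seconds) * self.window_seconds
        reset_at = window_start + self.window_seconds
        key = (client_id, window_start)
        used = self._counts.get(key, 0)
        allowed = used < self.limit
        if allowed:
            used += 1
            self._counts[key] = used
        headers = {
            "X-RateLimit-Limit": str(self.limit),
            "X-RateLimit-Remaining": str(max(0, self.limit - used)),
            "X-RateLimit-Reset": str(reset_at),  # Unix timestamp of the next window
        }
        if not allowed:
            headers["Retry-After"] = str(max(1, math.ceil(reset_at - now)))
        return (200 if allowed else 429), headers
```

Passing `now` explicitly keeps the limiter deterministic in tests; callers would normally supply `time.time()`.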
Idempotency Keys

For state-mutating operations (POST, PUT):

| Scenario | Response |
| --- | --- |
| First time | Process and save result with key |
| Repeated key (success) | Return saved result |
| Repeated key (failure) | Retry operation |
| Expired key | 409 Conflict |
Docker

Base Image

| Image | Description |
| --- | --- |
| python:3.12-slim-bookworm | Debian Bookworm (stable), slim variant, regular security updates |

Why Bookworm slim:

| Reason | Description |
| --- | --- |
| Stable base | Debian stable, long-term support |
| Minimal size | ~150MB vs ~1GB full image |
| Security | Regular updates, smaller attack surface |
| Compatibility | glibc-based, broad library support |

Multi-stage Builds (Mandatory)

| Stage | Base | Purpose |
| --- | --- | --- |
| builder | python:3.12-slim-bookworm | Install dependencies with uv |
| runtime | python:3.12-slim-bookworm | Only virtualenv and source code |

Benefits:

| Benefit | Description |
| --- | --- |
| Smaller image | No build tools, compilers in final image |
| Security | Reduced attack surface |
| Faster deploys | Less to push/pull |
| Layer caching | Dependencies cached separately |

Build Optimization

| Practice | Description |
| --- | --- |
| Copy pyproject.toml + uv.lock first | Leverage layer caching |
| Install dependencies before copying source | Cache dependencies layer |
| Use --no-cache for uv | Smaller image |
| Use .dockerignore | Faster context, smaller image |

Environment Variables

| Variable | Value | Purpose |
| --- | --- | --- |
| PYTHONDONTWRITEBYTECODE | 1 | Don't create .pyc files |
| PYTHONUNBUFFERED | 1 | Unbuffered stdout/stderr for logs |
| PYTHONFAULTHANDLER | 1 | Dump traceback on segfault |
| UV_COMPILE_BYTECODE | 1 | Compile bytecode during install (faster startup) |
| UV_LINK_MODE | copy | Copy files instead of hardlinks |

Non-Root User (Mandatory)

| Practice | Description |
| --- | --- |
| Create appuser | Dedicated non-root user |
| chown app directory | User owns application files |
| USER appuser | Switch before CMD/ENTRYPOINT |

Security benefits:
| Benefit | Description |
| --- | --- |
| Least privilege | Process can't modify system |
| Container escape mitigation | Limits damage if compromised |
| Cloud Run requirement | Runs as non-root by default |

.dockerignore (Mandatory)

Must exclude:

| Category | Examples |
| --- | --- |
| Git | .git/, .gitignore |
| Python | __pycache__/, .venv/, *.egg-info/ |
| Testing | .pytest_cache/, .coverage, .nox/ |
| IDEs | .idea/, .vscode/ |
| CI/CD | .github/ |
| Secrets | *.pem, *.key, .env |
| Docs | docs/, *.md |

Graceful Shutdown

| Signal | Action |
| --- | --- |
| SIGTERM | Start graceful shutdown |
| SIGINT | Same as SIGTERM |

Practices:
| Practice | Description |
| --- | --- |
| Finish in-flight requests | Wait for active requests |
| Close DB connections | Release pool connections |
| Flush logs/metrics | Send pending data |
| Shutdown timeout | Time limit (e.g., 30s) |

ENTRYPOINT vs CMD

| Directive | Use |
| --- | --- |
| ENTRYPOINT | The executable (e.g., ["granian"]) |
| CMD | Default arguments (overridable) |

Best practice: Use exec form ["executable", "arg1"], not shell form.

Health Checks

| Endpoint | Type | What it Verifies |
| --- | --- | --- |
| /health/live | Liveness | App responds (simple ping) |
| /health/ready | Readiness | DB connected, dependencies OK |
| /health/startup | Startup | Complete initialization |

Best practices:
| Practice | Description |
| --- | --- |
| Simple liveness | Don't verify external dependencies |
| Complete readiness | Verify DB, cache, critical services |
| Appropriate timeouts | Configure initialDelaySeconds, periodSeconds |
| Don't cache | Each check should be fresh |

Observability and Monitoring

Structured Logging

| Environment | Format |
| --- | --- |
| Development | Readable with colors |
| Production | JSON for indexing |

Log Levels

| Level | Use |
| --- | --- |
| DEBUG | Detailed debugging information |
| INFO | Normal operation events |
| WARNING | Unexpected but handled situations |
| ERROR | Errors preventing operation completion |
| CRITICAL | Errors requiring immediate attention |

What to Log

| Event | Level | Information |
| --- | --- | --- |
| Request received | INFO | method, path, request_id |
| Request completed | INFO | status_code, duration |
| Business operation | INFO | entity, action, identifiers |
| Recoverable error | WARNING | error, context |
| Unrecoverable error | ERROR | error, stack trace, context |

Correlation IDs

| Header | Description |
| --- | --- |
| X-Request-ID | Unique request ID |
| X-Correlation-ID | Common alternative |
| traceparent | W3C Trace Context standard |

Behavior:
| Scenario | Action |
| --- | --- |
| Header present | Use received ID |
| Header absent | Generate new UUID |
| Calls to other services | Propagate the ID |
| Logs | Include ID in each entry |
| Response | Return ID in header |
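
The behavior in the table above can be sketched without any framework. The helper names are hypothetical, and the example assumes `X-Request-ID` is the chosen header.

```python
import uuid
from typing import Dict, Mapping, Optional

REQUEST_ID_HEADER = "X-Request-ID"


def resolve_request_id(headers: Mapping[str, str]) -> str:
    """Reuse the caller's ID when present, otherwise generate a new UUID."""
    for name, value in headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() == REQUEST_ID_HEADER.lower() and value.strip():
            return value.strip()
    return str(uuid.uuid4())


def with_request_id(
    request_id: str, headers: Optional[Mapping[str, str]] = None
) -> Dict[str, str]:
    """Attach the ID to outbound-call headers or to the response headers."""
    merged = dict(headers or {})
    merged[REQUEST_ID_HEADER] = request_id
    return merged
```

The same `with_request_id` helper serves both propagation to downstream services and echoing the ID back in the response, so the ID stays identical across every hop and log entry.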
GCP Cloud Logging

| Requirement | Description |
| --- | --- |
| JSON format | Required for Cloud Logging to parse entries as structured logs |
| severity field | Map log level (INFO, ERROR, etc.) |
| logging.googleapis.com/trace | Trace ID for correlation |
| logging.googleapis.com/spanId | Span ID for correlation |
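
A minimal formatter along these lines could emit the fields listed above using only the standard `logging` module. The class name and the `trace_id` / `span_id` record attributes are assumptions of this sketch, and the project ID must be supplied by the application.

```python
import json
import logging


class CloudLoggingFormatter(logging.Formatter):
    """Render log records as one JSON object per line for Cloud Logging."""

    def __init__(self, project_id: str) -> None:
        super().__init__()
        self.project_id = project_id

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            # Python level names (DEBUG..CRITICAL) are valid severity values.
            "severity": record.levelname,
            "message": record.getMessage(),
        }
        # trace_id / span_id are expected to be attached via `extra=` (assumption).
        trace_id = getattr(record, "trace_id", None)
        if trace_id:
            entry["logging.googleapis.com/trace"] = (
                f"projects/{self.project_id}/traces/{trace_id}"
            )
        span_id = getattr(record, "span_id", None)
        if span_id:
            entry["logging.googleapis.com/spanId"] = span_id
        if record.exc_info:
            entry["stack_trace"] = self.formatException(record.exc_info)
        return json.dumps(entry)
```

A handler writing to stdout with this formatter is enough on Cloud Run; callers would log with something like `logger.info("order created", extra={"trace_id": trace_id})`.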
Application Configuration

12-Factor App Principles

| Principle | Description |
| --- | --- |
| Env vars only | Configuration in environment, not code |
| Fail fast | Validate configuration at startup |
| Type safety | All configuration typed with pydantic-settings |
| Sensible defaults | Default values for development |
| Separate secrets | Never in code or versioned files |

Organization

| Config Type | Example | Where to Define |
| --- | --- | --- |
| Required | DATABASE_URL | Only in environment |
| With default | DEBUG=false | Default in code |
| Secrets | API_KEY | Secret manager or env |
| Feature flags | ENABLE_FEATURE_X | Environment variables |
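
The fail-fast and typed-configuration principles can be illustrated with a standard-library stand-in. The standard itself calls for pydantic-settings, so this dataclass version is only a dependency-free sketch, and the `Settings` fields mirror the example variables in the table above.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    database_url: str               # required, environment only
    debug: bool = False             # default lives in code
    enable_feature_x: bool = False  # feature flag


def _as_bool(value: str) -> bool:
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Validate configuration once at startup and fail fast on gaps."""
    missing = [name for name in ("DATABASE_URL",) if not env.get(name)]
    if missing:
        raise RuntimeError(f"missing required configuration: {', '.join(missing)}")
    return Settings(
        database_url=env["DATABASE_URL"],
        debug=_as_bool(env.get("DEBUG", "false")),
        enable_feature_x=_as_bool(env.get("ENABLE_FEATURE_X", "false")),
    )
```

Calling `load_settings()` during application startup turns a missing `DATABASE_URL` into an immediate crash instead of a failure on the first request.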
.env Files

- .env.example versioned with example values
- .env never versioned (in .gitignore)
- Use only for local development

Deployment

Cloud Run Services

| Consideration | Description |
| --- | --- |
| Stateless | Don't maintain state in memory between requests |
| Cold starts | Optimize startup time |
| Concurrency | Configure max_instances and min_instances |
| Timeouts | Maximum 60 minutes for HTTP |
| Memory | 256MB - 32GB available |

GKE (Kubernetes)

| Component | Tool |
| --- | --- |
| Authentication | Workload Identity |
| Development | Skaffold |
| Secrets | External Secrets Operator |
| Observability | OpenTelemetry Collector |
| Metrics | prometheus-client |

Considerations:
| Aspect | Practice |
| --- | --- |
| Health checks | Separate liveness and readiness probes |
| Resources | Define requests and limits |
| HPA | Auto-scaling based on metrics |
| PodDisruptionBudget | Guarantee availability during updates |
| Network policies | Restrict traffic between pods |

Testing

Strategy by Test Type

| Test Type | Strategy | Speed | What to Test |
| --- | --- | --- | --- |
| Unit | Mocks | ~ms | Business logic, services, isolated functions |
| Integration | Testcontainers | ~s | API + real DB, queries, constraints |
| E2E | Real services | ~s | Full system, critical paths |

Recommended Approach

| Layer | Tool | Description |
| --- | --- | --- |
| Unit tests | pytest-mock | Mock dependencies, test business logic in isolation |
| API tests | httpx.AsyncClient | FastAPI TestClient, test endpoints |
| DB tests | testcontainers | Real PostgreSQL/Redis in Docker |
| External APIs | respx | Mock httpx requests |

Why Testcontainers over Mocks for DB

| Aspect | Mocks | Testcontainers |
| --- | --- | --- |
| Speed | Fast (~ms) | Slower (~s) |
| SQL validation | ❌ No | ✅ Yes |
| Constraints/Types | ❌ No | ✅ Yes |
| Query bugs | ❌ Missed | ✅ Caught |
| CI/CD | Simple | Requires Docker |
| Confidence | Lower | Higher |

Recommendation: Use mocks for unit tests (business logic), testcontainers for integration tests (DB queries, API endpoints with persistence).
Test Structure

tests/
├── unit/ # Fast, mocked, business logic
│ └── services/
├── integration/ # Real DB (testcontainers), API endpoints
│ └── api/
├── e2e/ # Full system (optional)
├── factories/ # factory-boy factories
├── fixtures/ # Shared fixtures
└── conftest.py # Common fixtures, testcontainers setup
What to Test Where

| What | Unit (Mocks) | Integration (Testcontainers) |
| --- | --- | --- |
| Service business logic | ✅ | |
| Pydantic validation | ✅ | |
| API endpoint responses | | ✅ |
| DB queries (SQLAlchemy) | | ✅ |
| DB constraints/triggers | | ✅ |
| Transactions/rollbacks | | ✅ |
| Cache operations | | ✅ |
| External API calls | ✅ (respx) | |

Fixtures Pattern

| Fixture | Scope | Purpose |
| --- | --- | --- |
| db_engine | session | Testcontainers PostgreSQL |
| db_session | function | Fresh session per test, auto-rollback |
| client | function | httpx AsyncClient with app |
| authenticated_client | function | Client with auth headers |

Testing Best Practices

| Practice | Description |
| --- | --- |
| Isolate tests | Each test independent, no shared state |
| Auto-rollback | Rollback DB after each test |
| Factories over fixtures | factory-boy for flexible test data |
| Test behavior, not implementation | Assert outcomes, not internal calls |
| Fast feedback | Run unit tests first, integration after |
| CI/CD markers | @pytest.mark.integration to separate |

Additional Files

Beyond General / Core, web services include:

| File | Purpose |
| --- | --- |
| Dockerfile | Multi-stage container build |
| .dockerignore | Exclude files from build context |
| compose.yaml | Local development with services (DB, cache) |

Databases

ORM

| Tool | Description | When to Use |
| --- | --- | --- |
| SQLAlchemy 2.0 | Mature ORM with async support, typed queries | Complex projects, advanced queries |
| SQLModel | SQLAlchemy + Pydantic, less boilerplate | Simple CRUDs with FastAPI |

Drivers

| Tool | Description |
| --- | --- |
| asyncpg | High-performance async PostgreSQL driver |
| psycopg3 | Modern PostgreSQL driver (sync/async) |

Migrations

| Tool | Description |
| --- | --- |
| Alembic | Database migrations for SQLAlchemy |

Practices:
| Practice | Description |
| --- | --- |
| Auto-generated migrations | As starting point, always review |
| Versioned in Git | Track all migrations |
| Idempotent | When possible |

Connection Pooling

| Parameter | Description |
| --- | --- |
| pool_size | Number of base connections |
| max_overflow | Additional allowed connections |
| pool_pre_ping | Verify connections before use |
| pool_recycle | Recycle connections periodically |

Data Practices

| Practice | Description |
| --- | --- |
| Soft deletes | Mark as deleted instead of physical delete |
| Auditing | created_at, updated_at, created_by, updated_by |
| Timezone-aware | Always use UTC timestamps |
| UUIDs | Consider for distributed systems |

N+1 Prevention

| Strategy | When to Use |
| --- | --- |
| joinedload | One-to-one, few records |
| selectinload | One-to-many, many records |
| contains_eager | Existing JOIN in query |
| Batch loading | Load IDs first, then batch |

Caching

| Tool | Description |
| --- | --- |
| Redis | Distributed cache, sessions, pub/sub |
| Memorystore | Managed Redis in GCP |
| cachetools | Local in-memory cache (LRU, TTL) |

Cache Levels

| Level | Tool | Latency | Use |
| --- | --- | --- | --- |
| L1 - In-process | cachetools | ~1μs | Hot data, immutable |
| L2 - Distributed | Redis | ~1ms | Sessions, shared data |
| L3 - CDN | Cloud CDN | ~10-50ms | Static assets |

Strategies

| Strategy | Description |
| --- | --- |
| Cache-aside | Application explicitly manages cache |
| Write-through | Simultaneous write to cache and backend |
| TTL | Expires after N seconds |

Key Patterns

| Pattern | Example |
| --- | --- |
| Namespace | users:123:profile |
| Version | v2:users:123 |
| Hash | query:sha256(params) |
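
These key patterns might be generated with small helpers like the ones below. The function names are hypothetical. The hash variant serializes parameters canonically first, so the same query always maps to the same key regardless of argument order.

```python
import hashlib
import json
from typing import Any, Mapping, Optional


def namespaced_key(*parts: Any, version: Optional[str] = None) -> str:
    """Join parts with ':' and optionally prefix a schema version."""
    segments = [str(part) for part in parts]
    if version:
        segments.insert(0, version)
    return ":".join(segments)


def query_key(params: Mapping[str, Any]) -> str:
    """Derive a stable key from query parameters via SHA-256."""
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return "query:" + hashlib.sha256(canonical.encode()).hexdigest()
```

For instance, `namespaced_key("users", 123, "profile")` yields `users:123:profile`, and bumping `version` invalidates every old entry at once without deleting anything.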
Best Practices

| Practice | Description |
| --- | --- |
| Appropriate TTL | Define per data type |
| Cache stampede prevention | Use locking or TTL jitter |
| Graceful degradation | App should work without cache |
| Monitoring | Hit rate, miss rate, latency |
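
A rough sketch of cache-aside with TTL jitter follows, using a plain dictionary in place of Redis or cachetools. The helper names and the 10% spread are assumptions. The point of the jitter is that entries written at the same moment do not all expire at the same moment.

```python
import random
from typing import Any, Callable, Dict, Tuple


def jittered_ttl(base_ttl: float, spread: float = 0.1) -> float:
    """Return base_ttl randomly shifted by up to +/- spread (10% by default)."""
    return base_ttl * (1 + spread * (2 * random.random() - 1))


def get_or_load(
    cache: Dict[str, Tuple[float, Any]],
    key: str,
    loader: Callable[[], Any],
    *,
    base_ttl: float,
    now: float,
) -> Any:
    """Cache-aside read: serve a fresh entry, otherwise load and store it."""
    entry = cache.get(key)
    if entry is not None:
        expires_at, value = entry
        if expires_at > now:
            return value  # cache hit
    value = loader()  # miss or expired: fall through to the source of truth
    cache[key] = (now + jittered_ttl(base_ttl), value)
    return value
```

Jitter spreads out expiry but does not stop concurrent misses on a single hot key; that case still calls for the locking approach mentioned above.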
Background Tasks

Background Task Best Practices

| Practice | Description |
| --- | --- |
| Idempotency | Tasks should be re-executable |
| Timeouts | Define time limits |
| Dead letter queue | Handle permanent failures |
| Retries with backoff | Exponential backoff |
| Observability | Metrics and logs |

GCP Services Integration

Recommended Services

GCP Authentication

| Tool | Description |
| --- | --- |
| Workload Identity | Service account for Cloud Run/GKE |
| Application Default Credentials | Auto-detect credentials |

Scaling Considerations

| Need | GCP Solution | Avoid |
| --- | --- | --- |
| More requests | Cloud Run auto-scaling | Threads for requests |
| CPU-bound tasks | Cloud Run Jobs | multiprocessing in API |
| Massive data | Dataflow | ThreadPoolExecutor |
| I/O-bound | async/await | threading |