Observability
Know what your system is doing in production. Topics include structured logging, metrics, distributed tracing, OpenTelemetry, Prometheus, Grafana, alerting, and building observable systems from day one.
Fundamentals (Topics 1–9)
- The Three Pillars
- Structured Logging
- Log Levels
- Metric Types
- Distributed Tracing
- Spans & Trace Context
- OpenTelemetry Overview
- Monitoring vs Observability
- SLIs, SLOs, & SLAs
Intermediate (Topics 10–18)
- Prometheus & PromQL
- Grafana Dashboards
- Alertmanager & Rules
- OpenTelemetry SDK
- Jaeger & Zipkin
- Log Aggregation
- Correlation IDs
- Health Checks
- Error Budget Tracking
Advanced (Topics 19–27)
- High-Cardinality Metrics
- Exemplars
- Continuous Profiling
- Synthetic Monitoring
- Real User Monitoring (RUM)
- Anomaly Detection
- OTel Collector Pipelines
- Service Mesh Observability
- Cost Optimisation
Production (Topics 28–35)
- Observability-Driven Development
- On-Call Runbooks
- Alert Fatigue Prevention
- SLO-Based Alerting
- Incident Response
- Capacity Planning
- Microservices Observability
- Building a Platform