Postmortem: A Corrupted Loki 2.10 Log Store Caused 3 Days of Lost Debug Data
Overview

On October 12, 2024, at 09:15 UTC, our observability team detected anomalies in the Loki 2.10 log store cluster serving production debug data. By the time mitigation steps were fully effective, 3 full days of debug log data (October 12–14) had been corrupted and was unrecoverable, impacting incident response and root cause analysis for 14 concurrent production incidents.

Impact

The corrupted log store affected