Dev.to · 8h ago

Your LLM traces are write-only

You spent weeks building observability for your LLM app. Traces in Jaeger. Metrics in Grafana. Alerts in Slack. You can see exactly what your model says, how long it takes, and how much it costs.

Then you change the prompt. Did the model get better? Worse? For which inputs? You have no idea, because your traces are write-only. You observe but never evaluate. Your production data sits in Jaeger and never becomes a test.

We built the bridge from traces to tests. Then we ran it on our own traces a
