feat: add log-watcher workflow for agent run diagnostics by adamhenson · Pull Request #327 · githubnext/agentics

adamhenson · 2026-05-08T15:52:29Z

Summary

Add workflows/log-watcher.md - fires on workflow_run: completed, downloads the agent-artifacts artifact written by gh-aw's firewall, scans run logs for error patterns and retry loops, analyses token-usage.jsonl for anomalies, and posts a health diagnosis on the associated pull request or creates a diagnosis issue
Add docs/log-watcher.md - installation instructions, mermaid flow diagram, health level reference, and detection list
Update README.md - add Log Watcher to the Fault Analysis Workflows section
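For reviewers who want a feel for the token-usage.jsonl analysis mentioned above, here is a minimal Python sketch of the kind of per-run metrics involved. The field names (`input_tokens`, `output_tokens`, `cache_read_tokens`, `model`) are assumptions made for illustration; the actual schema written by gh-aw's firewall is not shown in this PR and may differ.

```python
import json


def summarize_token_usage(jsonl_text):
    """Aggregate per-request token records into run-level metrics.

    Field names are hypothetical; adjust them to the real token-usage.jsonl schema.
    """
    totals = {"input": 0, "output": 0, "cache_read": 0}
    models = set()
    for raw in jsonl_text.splitlines():
        raw = raw.strip()
        if not raw:
            continue  # tolerate blank lines in the JSONL file
        record = json.loads(raw)
        totals["input"] += record.get("input_tokens", 0)
        totals["output"] += record.get("output_tokens", 0)
        totals["cache_read"] += record.get("cache_read_tokens", 0)
        if "model" in record:
            models.add(record["model"])

    total = totals["input"] + totals["output"]
    return {
        "total_tokens": total,
        # Share of all tokens that were generated output.
        "output_ratio": totals["output"] / total if total else 0.0,
        # Share of input tokens served from cache.
        "cache_efficiency": totals["cache_read"] / totals["input"] if totals["input"] else 0.0,
        # More than one entry here would indicate model mixing.
        "models": sorted(models),
    }
```

The sketch only computes the metrics. What counts as a "high" output ratio or "low" cache efficiency is left to the workflow and is not specified here.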

Related to #297. Companion to #319.

Notes

Silent on non-agent runs: if no agent-artifacts artifact exists the workflow produces no output
Three health levels: Healthy runs get a brief collapsed summary; Degraded and Failed runs get a full diagnosis with log excerpts and token metric details
Detects error/exception/fatal messages, timeouts, rate limits (429), retry loops (same tool called >5 times), and context window truncation warnings
Token anomalies flagged: high output ratio, low cache efficiency, total token spikes, unexpected model mixing
Optional high-cost failure alert when a failed run exceeds 50,000 tokens
Token data from token-usage.jsonl written by gh-aw's firewall - no extra setup needed beyond enabling the firewall (the default)
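The detection rules in these notes could be sketched roughly as follows in Python. This is an illustration under stated assumptions, not the workflow's actual logic: the regular expressions are hypothetical, the retry check counts total calls per tool rather than consecutive calls, and the mapping from findings to a health level is assumed because the notes do not define it. Only the two thresholds (more than 5 calls, more than 50,000 tokens) come from the notes.

```python
import re
from collections import Counter

# Hypothetical patterns: the notes list the categories, not the exact expressions.
PATTERNS = {
    "error": re.compile(r"\b(error|exception|fatal)\b", re.IGNORECASE),
    "timeout": re.compile(r"\btimed? ?out\b", re.IGNORECASE),
    "rate_limit": re.compile(r"\b429\b|rate limit", re.IGNORECASE),
    "truncation": re.compile(r"context window.*truncat", re.IGNORECASE),
}

RETRY_THRESHOLD = 5        # from the notes: same tool called more than 5 times
HIGH_COST_TOKENS = 50_000  # from the notes: high-cost failure alert threshold


def scan_log(lines):
    """Count how many log lines match each pattern category."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


def retry_loops(tool_calls):
    """Return the names of tools called more than RETRY_THRESHOLD times."""
    calls = Counter(tool_calls)
    return sorted(tool for tool, n in calls.items() if n > RETRY_THRESHOLD)


def classify(run_failed, counts, loops):
    """Assumed mapping to the three health levels."""
    if run_failed:
        return "Failed"
    if loops or sum(counts.values()) > 0:
        return "Degraded"
    return "Healthy"


def high_cost_alert(run_failed, total_tokens):
    """True when a failed run exceeds the high-cost token threshold."""
    return run_failed and total_tokens > HIGH_COST_TOKENS
```

For example, `classify(False, scan_log(["HTTP 429 Too Many Requests"]), [])` yields `"Degraded"` under this assumed mapping, while a run with no findings and no retry loops yields `"Healthy"`.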

dsyme · 2026-05-08T20:52:26Z

@adamhenson @lpcox @pelikhan Looks like we also added https://github.com/githubnext/agentics/blob/main/docs/cost-tracker.md - probably inspired by your issue, Adam

Could you three reconcile these please?

Great suggestion either way. Slightly concerned it may be expensive if triggering often

pelikhan · 2026-05-08T20:54:23Z

@adamhenson maybe we can link to a report in your org as a community AW?

adamhenson · 2026-05-08T21:19:48Z

Good call on reconciling them. The intended split: cost-tracker answers "what did this run cost" and log-watcher answers "what went wrong." They share the same data source but the output is different - a spend summary vs. a health diagnosis. Happy to clarify that in the docs for both, or to consolidate if you'd prefer a single workflow that does both.

The cost concern is fair. Log-watcher only produces verbose output on degraded or failed runs - healthy runs get a one-liner. That keeps the token footprint low for normal operation. Worth calling out explicitly in the docs either way.

adamhenson · 2026-05-08T21:19:53Z

@pelikhan that would be great - happy to contribute whatever format works. Are you thinking a link in the README, a dedicated docs page, or something else? Let me know what you have in mind and I'll put it together.

feat: add log-watcher workflow for agent run diagnostics

350e954

adamhenson mentioned this pull request May 8, 2026

Idea: Add "log watcher/fixer" sample workflow to analyze OpenTelemetry #297

Open

Merge branch 'main' into add-log-watcher

2dc27d0

adamhenson wants to merge 2 commits into githubnext:main from adamhenson:add-log-watcher

3 participants
