ping-health#1187

Open
snadrus wants to merge 7 commits into main from feat/ga-ping-health

Conversation

@snadrus
Contributor

@snadrus snadrus commented May 1, 2026

Fixes #851
Caches the relevant alert results for the ping.

Refreshes every 5 minutes.

@snadrus snadrus requested a review from LexLuthr May 1, 2026 03:44
@snadrus snadrus requested a review from a team as a code owner May 1, 2026 03:44
@snadrus
Contributor Author

snadrus commented May 1, 2026

@beck-8

Contributor

@LexLuthr LexLuthr left a comment

The approach is what I wanted to do, but a 30-minute window is too big in practice for this use case. We probably want a much lower window, like 5 minutes, for these 3 checks. Should we have one more task with PDP-specific checks? We don't want to keep reporting unhealthy for 30 minutes when there was only a transient sync issue.

AJ: 👍

Comment thread alertmanager/task_alert.go
@snadrus
Contributor Author

snadrus commented May 1, 2026

@LexLuthr what's a PDP-specific check?
We could check whether PDP tasks are seeing "lots of failures".
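
One way to make "lots of failures" concrete is a failure-ratio threshold over recent task outcomes. The sketch below is only an illustration of that idea: `taskStats`, `tooManyFailures`, and the `minSamples` and `maxRatio` parameters are hypothetical and do not come from this PR or from the PDP task code.

```go
package main

import "fmt"

// taskStats is a hypothetical summary of PDP task outcomes in a recent window.
type taskStats struct {
	Succeeded int
	Failed    int
}

// tooManyFailures reports whether the failure ratio exceeds maxRatio.
// Windows with fewer than minSamples finished tasks are never flagged,
// so one or two failures on a quiet node do not mark it unhealthy.
func tooManyFailures(s taskStats, minSamples int, maxRatio float64) bool {
	total := s.Succeeded + s.Failed
	if total == 0 || total < minSamples {
		return false
	}
	return float64(s.Failed)/float64(total) > maxRatio
}

func main() {
	// 2 failures out of 20 tasks: ratio 0.10, below the 0.25 threshold.
	fmt.Println(tooManyFailures(taskStats{Succeeded: 18, Failed: 2}, 10, 0.25))
	// 15 failures out of 20 tasks: ratio 0.75, above the threshold.
	fmt.Println(tooManyFailures(taskStats{Succeeded: 5, Failed: 15}, 10, 0.25))
}
```

The minimum sample count and the ratio would both need tuning against real task volumes; the values here are placeholders.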

@snadrus snadrus requested a review from LexLuthr May 1, 2026 22:07
Contributor

@LexLuthr LexLuthr left a comment

Looks good. Can we add alert timing tests?

Contributor

@LexLuthr LexLuthr left a comment

We should probably update the docs to reflect the new timing changes. The task no longer runs every 1 hour after the previous run; it now runs at the top of every hour.

Comment thread alertmanager/task_alert.go Outdated
@LexLuthr LexLuthr mentioned this pull request May 4, 2026
2 tasks
@snadrus
Contributor Author

snadrus commented May 4, 2026

> Looks good. Can we add alert timing tests?

what is that?

@BigLep BigLep added the team/fs-wg Items being worked on or tracked by the "FS Working Group". See FilOzone/github-mgmt #10 label May 4, 2026
@FilOzzy FilOzzy added this to FOC May 4, 2026
@github-project-automation github-project-automation Bot moved this to 📌 Triage in FOC May 4, 2026
@github-project-automation github-project-automation Bot moved this from 📌 Triage to ✔️ Approved by reviewer in FOC May 4, 2026
Labels

team/fs-wg Items being worked on or tracked by the "FS Working Group". See FilOzone/github-mgmt #10

Development

Successfully merging this pull request may close these issues.

pdpv0: add a simple status check task

4 participants