ANR Profile dropped due to overly strict endTimeMs window when ApplicationExitInfo.getTimestamp() > lastSample + 10s #5468

@wjw2013-07

Description

Integration

sentry-android

Build System

Gradle

AGP Version

8.2

Proguard

Disabled

Other Error Monitoring Solution

No

Version

8.42.0

Steps to Reproduce

Tested on Pixel 7a, Android 13, sentry-android 8.42.0.

Sentry init (Kotlin):
options.anrProfilingSampleRate = 1.0
options.isAttachAnrThreadDump = true
options.isReportHistoricalAnrs = true
options.isDebug = true
options.setDiagnosticLevel(SentryLevel.DEBUG)

  1. App is in foreground (HomeMainActivity). Trigger a main-thread freeze by
    calling Thread.sleep(8_000) on the UI thread (simulated via a "DevModePage"
    button → hMethod() → Thread.sleep).
  2. Wait for the Android system "App not responding" dialog (input dispatch
    timeout fires at ~5s).
  3. Tap "Close app" on the ANR dialog — but only after ~20 seconds
    (a realistic user reaction time). Process dies.
  4. Cold-start the app again.

ANR event appears in Sentry as expected.
ANR Profile (flamegraph) does NOT appear on the issue page.

Expected Result

The ANR event should be enriched with a ProfileContext, and the captured
profile chunk (foreground sampling from before process death) should be
visible as a flamegraph on the Sentry issue page.

The profile data was collected successfully by AnrProfilingIntegration —
it just got dropped by the matching logic in
ApplicationExitInfoEventProcessor.applyAnrProfile().

Actual Result

Event arrives at Sentry with:
• contexts.profile == null
• exception.mechanism.type == "AppExitInfo" (not "ANR" — i.e. applyAnrProfile
early-returned before reaching the exception replacement line)
• exception.value == "ANR" (not the culprit-derived
className.methodName)

We reproduced this with debug logging on and captured the full session timeline
end-to-end. The profile file IS written successfully by AnrProfilingIntegration
in session N, then rejected in session N+1 because anrTimestamp falls past
endTimeMs.

Timeline (UTC+8, single reproduction)

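| Time | Epoch ms | Event | Source |
| --- | --- | --- | --- |
| 15:08:05.199 | | | logcat (Sentry tag) |
| 15:08:05.911 | | | logcat |
| 15:08:10.732 | | last profile sample | logcat |
| 15:08:20.732 | 1779779300732 | anrProfile.endTimeMs (lastSample + 10s) | derived |
| 15:08:11.843 | | | logcat |
| 15:08:36.444 | 1779779316444 | process death (anrTimestamp) | logcat ("Writing last reported ANR marker with timestamp 1779779316444") |
| 15:08:58.x | | | logcat |
| 15:08:59.399 | | | logcat |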

The check that drops the profile, at ApplicationExitInfoEventProcessor.java:863:

  if (anrTimestamp < anrProfile.startTimeMs || anrTimestamp > anrProfile.endTimeMs) {                                    
      options.getLogger().log(SentryLevel.DEBUG, "ANR profile found, but doesn't match");                              
      return;                                                                                                            
  }                                                                                                                    

Substituting:

  anrTimestamp =        1779779316444  (15:08:36.444 — process death)                                                    
  anrProfile.endTimeMs= 1779779300732  (15:08:20.732 — lastSample + 10s)
  delta              =          15712 ms outside the window                                                              
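
The substitution can be replayed in isolation. This is a minimal sketch, not the SDK's code: the class and method names are hypothetical, and startTimeMs is an assumed value because the issue does not report it.

```java
// Hypothetical stand-in for the window check quoted above; not the SDK's class.
final class AnrWindowCheck {
    /** True when anrTimestamp lies inside [startTimeMs, endTimeMs], i.e. the profile is kept. */
    static boolean matches(long anrTimestamp, long startTimeMs, long endTimeMs) {
        return anrTimestamp >= startTimeMs && anrTimestamp <= endTimeMs;
    }

    public static void main(String[] args) {
        long anrTimestamp = 1779779316444L;     // process death, from the logcat marker
        long endTimeMs = 1779779300732L;        // lastSample + 10s
        long startTimeMs = endTimeMs - 20_000L; // ASSUMPTION: not reported in the issue
        System.out.println("delta ms: " + (anrTimestamp - endTimeMs));
        System.out.println("matches: " + matches(anrTimestamp, startTimeMs, endTimeMs));
    }
}
```

Any startTimeMs earlier than the process death gives the same result, because the upper bound alone rejects the profile.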

Why this is structurally broken (not just a tuning issue)

Two independent mechanisms guarantee anrTimestamp can exceed endTimeMs
in real-world ANRs:

  1. AnrProfilingIntegration.MAX_NUM_STACKS = 10_000 / 66 ≈ 151. Once 151
    samples are written, the sampler stops calling addStackTrace() but the
    main thread can stay frozen indefinitely. So even for "pure long ANRs"
    where the main thread never recovers, endTimeMs is capped at
    ~20s (10s sampling window + 10s grace) after startTimeMs.
  2. ApplicationExitInfo.getTimestamp() is documented as the time the
    process died, not the time the ANR started. Process death can lag the
    ANR onset by:
    • the time the user spends looking at the "App not responding"
      dialog before tapping Close (15-30s is typical),
    • the time the system takes to auto-kill an unresponsive process,
    • kernel scheduling delays under load.
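
The cap from point 1 can be worked through numerically. The following is illustrative arithmetic only, not SDK code: every name except MAX_NUM_STACKS is hypothetical, and it assumes one sample per 66 ms interval starting at startTimeMs.

```java
// Illustrative arithmetic only; not the SDK's implementation.
final class AnrProfileWindowBudget {
    static final int SAMPLING_WINDOW_MS = 10_000; // numerator quoted in the issue
    static final int SAMPLE_INTERVAL_MS = 66;     // denominator quoted in the issue
    // Integer division, as in the quoted constant: 10_000 / 66 == 151.
    static final int MAX_NUM_STACKS = SAMPLING_WINDOW_MS / SAMPLE_INTERVAL_MS;
    static final long GRACE_MS = 10_000L;         // the "+ 10s" added to the last sample

    /** Upper bound on (endTimeMs - startTimeMs), assuming one sample per interval. */
    static long maxEndOffsetMs() {
        return (long) MAX_NUM_STACKS * SAMPLE_INTERVAL_MS + GRACE_MS;
    }

    /** True when a process death this long after startTimeMs lands past endTimeMs. */
    static boolean isDropped(long processDeathOffsetMs) {
        return processDeathOffsetMs > maxEndOffsetMs();
    }

    public static void main(String[] args) {
        System.out.println("MAX_NUM_STACKS = " + MAX_NUM_STACKS);
        System.out.println("max end offset ms = " + maxEndOffsetMs());
        System.out.println("dropped at 15s: " + isDropped(15_000L));
        System.out.println("dropped at 25s: " + isDropped(25_000L));
    }
}
```

Under these assumptions the window closes just under 20 s after startTimeMs, so a process death 25 s after the ANR begins already falls outside it.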

So for any ANR where the user doesn't dismiss the dialog within ~10 seconds,
the profile is silently dropped — which is most ANRs in production.

Suggested fix

sentry-android-core/src/main/java/io/sentry/android/core/anr/AnrProfile.java:

The upper bound (anrTimestamp <= endTimeMs) has no semantic value once we
trust that the samples were collected during the ANR. By definition,
anrTimestamp (= process death) is always after the last sample, so the
anrTimestamp > endTimeMs branch can only reject profiles that belong to a
valid ANR.

Option A (minimal):

  endTimeMs = this.stacks.get(this.stacks.size() - 1).timestampMs + 5 * 60 * 1000L;                                      

Option B (cleaner — drop the upper bound altogether):

In ApplicationExitInfoEventProcessor.java:863, change:

  if (anrTimestamp < anrProfile.startTimeMs || anrTimestamp > anrProfile.endTimeMs) {                                    

to:

  if (anrTimestamp < anrProfile.startTimeMs) {

and remove endTimeMs from AnrProfile.
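
To compare the two options against the reproduced timestamps, here is a hedged sketch. The class and method names are hypothetical, startTimeMs is an assumed value, and the predicates return true when the profile would be kept (the inverse of the early-return condition).

```java
// Hypothetical comparison of the matching rules; not the SDK's implementation.
final class AnrMatchOptions {
    static final long CURRENT_GRACE_MS = 10_000L;          // lastSample + 10s
    static final long OPTION_A_GRACE_MS = 5 * 60 * 1000L;  // lastSample + 5 min

    static boolean current(long anrTs, long startMs, long lastSampleMs) {
        return anrTs >= startMs && anrTs <= lastSampleMs + CURRENT_GRACE_MS;
    }

    static boolean optionA(long anrTs, long startMs, long lastSampleMs) {
        return anrTs >= startMs && anrTs <= lastSampleMs + OPTION_A_GRACE_MS;
    }

    static boolean optionB(long anrTs, long startMs) {
        return anrTs >= startMs; // no upper bound at all
    }

    public static void main(String[] args) {
        long lastSampleMs = 1779779290732L;    // endTimeMs - 10s, from the issue
        long startMs = lastSampleMs - 5_000L;  // ASSUMPTION: not reported in the issue
        long anrTs = 1779779316444L;           // process death, from the issue
        System.out.println("current:  " + current(anrTs, startMs, lastSampleMs));
        System.out.println("option A: " + optionA(anrTs, startMs, lastSampleMs));
        System.out.println("option B: " + optionB(anrTs, startMs));
        // A dialog left open for six minutes still defeats Option A, but not Option B.
        long lateDeath = lastSampleMs + 6 * 60 * 1000L;
        System.out.println("option A, 6 min: " + optionA(lateDeath, startMs, lastSampleMs));
        System.out.println("option B, 6 min: " + optionB(lateDeath, startMs));
    }
}
```

With the reproduced values, the current rule rejects the profile while both options keep it. The six-minute case shows the remaining difference: Option A only moves the cutoff, whereas Option B removes it.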

Repro artifacts

