Skip to content

Enhance file review status UI, improve error handling, and strengthen test coverage#12

Open
devarshishimpi wants to merge 7 commits into
devfrom
bug/file-review-status-bar-dashboard
Open

Enhance file review status UI, improve error handling, and strengthen test coverage#12
devarshishimpi wants to merge 7 commits into
devfrom
bug/file-review-status-bar-dashboard

Conversation

@devarshishimpi
Copy link
Copy Markdown
Owner

@devarshishimpi devarshishimpi commented May 24, 2026

Description

This PR improves the dashboard experience and system reliability by enhancing the file review status display, refactoring model error handling, and adding comprehensive test coverage. The changes focus on providing better visibility into review job progress and status while ensuring more robust error recovery and webhook handling.

Closes #8

What's Changed

UI/UX Improvements:

  • Enhanced job logs page with detailed file review status cards showing individual file progress
  • Improved job detail metadata cards with better visual hierarchy and responsive layout
  • Added live review stepper UI for tracking multi-file review progress
  • Enhanced Sonner toast notifications styling and job status indicators
  • Better visual feedback for file status states (done, pending, failed, skipped)

Backend Improvements:

  • Refactored model error handling with improved distinction between retryable and terminal errors
  • Enhanced job recovery mechanisms with better lease expiration handling
  • Improved webhook routing and error propagation
  • Added retry delay escalation for transient model provider failures

Testing:

  • Added comprehensive test coverage for model service error scenarios
  • Added webhook handling tests for error cases
  • Enhanced review flow tests with improved error handling verification

Type of Change

  • New feature (non-breaking change which adds functionality)
  • Bug fix (improving status visibility and error recovery)
  • Refactoring (error handling and code organization)

How Has This Been Tested?

  • Unit Tests - Added tests for model error handling, webhook routing, and review flow
  • Manual Dashboard Verification - Tested job detail and logs page UI with various file review states
  • Manual GitHub Webhook Verification - Verified webhook error handling and retry mechanisms

Checklist:

  • I have starred Codra on GitHub
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • My changes generate no new warnings
  • New and existing unit tests pass locally with my changes
  • I have signed the CLA

codra-app-personal[bot]

This comment was marked as resolved.

codra-app-personal[bot]

This comment was marked as resolved.

codra-app-personal[bot]

This comment was marked as resolved.

codra-app-personal[bot]

This comment was marked as resolved.

codra-app-personal[bot]

This comment was marked as resolved.

Copy link
Copy Markdown

@codra-app-personal codra-app-personal Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Codra Review

Here are some automated review suggestions for this pull request.

Reviewed commit: dfc637297b

ℹ️ About Codra in GitHub

Your team has set up Codra to review pull requests in this repo. Reviews are triggered when you:

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codra-app review"

If Codra has suggestions, it will comment; otherwise it will react with 👍.

Codra can also answer questions or update the PR. Try commenting "@codra-app address that feedback".

Comment thread src/client/app.css
/* ─────────────────────────────────────────────────────
Sonner Toast — Premium overrides
───────────────────────────────────────────────────── */

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Excessive use of !important

Almost every CSS property in the new Sonner Toast overrides (lines 875-1055) uses !important. This is a critical maintainability anti-pattern that creates specificity wars and makes future modifications extremely difficult. Instead of forcing styles with !important, you should increase selector specificity or utilize the library's intended theme configuration API.

Comment thread src/client/app.css
box-shadow:
0 4px 16px oklch(0% 0 0 / 0.10),
0 1px 4px oklch(0% 0 0 / 0.06),
inset 0 1px 0 oklch(100% 0 0 / 0.05) !important;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Hardcoded color values instead of design tokens

The toast styles use hardcoded oklch() values for backgrounds, borders, and text throughout the various states (success, error, warning, info). Since the project already uses CSS variables for these semantic colors (lines 55-61), these should be referenced (e.g., var(--success-bg)) to ensure consistency and allow for global theme changes without manually updating dozens of lines.

Suggested change
inset 0 1px 0 oklch(100% 0 0 / 0.05) !important;
background: var(--success-bg) !important;

{ label: 'Tokens', value: <span className="font-mono text-sm">{(job.totalInputTokens + job.totalOutputTokens).toLocaleString()}</span> },
{ label: 'Tokens', value:
<span className="font-mono text-sm tabular-nums">
{(job.totalInputTokens + job.totalOutputTokens).toLocaleString()}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Potential NaN in token calculation

The calculation (job.totalInputTokens + job.totalOutputTokens).toLocaleString() does not account for cases where either token value might be null or undefined (e.g., while a job is still running). In such cases, the result will be NaN, and the UI will display 'NaN'.

Suggested change
{(job.totalInputTokens + job.totalOutputTokens).toLocaleString()}
((job.totalInputTokens ?? 0) + (job.totalOutputTokens ?? 0)).toLocaleString()

</div>
<div>
{steps.map((step, idx) => (
<StepRow key={step.name || idx} step={step} index={idx} total={steps.length} />
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P3 Unreliable list key in step mapping

Using step.name || idx as a key can lead to rendering issues and performance degradation if step.name is not unique across all steps or if the list order changes. It is preferable to use a unique identifier from the data model (e.g., step.id).

Suggested change
<StepRow key={step.name || idx} step={step} index={idx} total={steps.length} />
key={step.id || step.name || idx}


<div className="col-span-2 pt-1 border-t border-border/40">
<dt className="text-[10px] font-bold uppercase tracking-[0.12em] text-muted-foreground mb-1.5">Created</dt>
<dd className="text-sm text-muted-foreground tabular-nums">{new Date(job.createdAt).toLocaleString()}</dd>
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P3 Potential 'Invalid Date' display

The expression new Date(job.createdAt).toLocaleString() will result in the string 'Invalid Date' if job.createdAt is null or undefined. While unlikely for a stored job, adding a null check ensures the UI remains clean.

Suggested change
<dd className="text-sm text-muted-foreground tabular-nums">{new Date(job.createdAt).toLocaleString()}</dd>
job.createdAt ? new Date(job.createdAt).toLocaleString() : '—'

throw new Error('Model response did not contain review JSON keys.');
}
} catch (e) {
logger.error('Failed to extract JSON from model response', { raw: truncateForLog(raw), error: e });
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Loss of diagnostic data in error logs

The change removes the 'raw' model response content from the log when JSON extraction fails (replacing it with only 'rawLength'). This is a regression in observability; without the actual content of the malformed response, it becomes nearly impossible to diagnose why the extraction failed or to create regression tests for the parser.

Suggested change
logger.error('Failed to extract JSON from model response', { raw: truncateForLog(raw), error: e });
logger.error('Failed to extract JSON from model response', {
raw: truncateJsonForLog(raw),
rawLength: raw.length,
error: e instanceof Error ? e.message : String(e),
});

env: AppBindings,
executionCtx?: Pick<ExecutionContext, 'waitUntil'>,
) {
const task = runBestEffortJobMaintenance(env);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Floating promise may lead to unreliable execution

The function scheduleBestEffortJobMaintenance initiates an asynchronous task but only ensures its completion via executionCtx.waitUntil if the optional executionCtx is provided. In serverless environments (such as Cloudflare Workers), promises that are not registered with waitUntil may be terminated by the runtime immediately after the primary response is sent. This could lead to job recovery tasks being killed prematurely, making the 'best effort' maintenance unreliable.

Suggested change
const task = runBestEffortJobMaintenance(env);
const task = runBestEffortJobMaintenance(env);
if (executionCtx) {
executionCtx.waitUntil(task);
}
return task;

Comment thread src/server/core/review.ts
if (!canInheritParentFileReview(config, inherited)) {
logger.info(`Ignoring inherited review for ${file.path}; parent model ${inherited.model_used} is not in the current model strategy`);
await reviewAndPersistFile(env, job, file, pr, config, totalLineCount, model, existingReview);
processedThisChunk += 1;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P0 Premature loop termination in runReviewPhase

The introduction of a 'break' statement inside the 'for' loop in 'runReviewPhase' causes the function to exit the loop entirely once the 'REVIEW_CHUNK_FILE_LIMIT' or the time limit is reached. Since there is no outer loop to process the remaining files in subsequent chunks, the function will only ever process the first chunk of files, leaving the rest of the files unreviewed.

Suggested change
processedThisChunk += 1;
Wrap the chunking logic in an outer loop (e.g., 'while' or a 'for' loop with increments of 'REVIEW_CHUNK_FILE_LIMIT') to ensure all files in the 'files' array are processed in manageable batches.

Comment thread src/server/core/review.ts
: new AggregateError(rejected.map((result) => result.reason), `${rejected.length} review chunk tasks failed`);
}

const latestReviews = await getFileReviewsForJobs(env, [job.id]);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Unsafe mutation of error object

Using 'Object.defineProperty' to attach 'retryAfterSeconds' to an existing error object is risky. If the error object is non-extensible (e.g., if it was frozen by a library or a runtime environment), this call will throw a 'TypeError', potentially crashing the error handling logic and masking the original exception.

Suggested change
const latestReviews = await getFileReviewsForJobs(env, [job.id]);
Instead of mutating the error, wrap the original error in a custom Error subclass or a metadata object that includes the retry delay.

Comment thread src/server/core/review.ts
});
Object.defineProperty(error, 'retryAfterSeconds', {
value: retryableModelFailureDelaySeconds(failureCount),
configurable: true,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Removal of model summary generation logic

The 'runFinalizePhase' function has had its call to 'model.generateSummary' removed. This change effectively disables AI-generated summaries for reviews and sets 'summaryModel' to 'null'. While this might be intended as a refactor, it significantly alters the output of the review process and should be verified against the requirements.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant