Skip to content

fix(mothership): async resume and tool result ordering#3735

Merged
icecrasher321 merged 3 commits intostagingfrom
fix/tool-claim-mgmt
Mar 24, 2026
Merged

fix(mothership): async resume and tool result ordering#3735
icecrasher321 merged 3 commits intostagingfrom
fix/tool-claim-mgmt

Conversation

@icecrasher321
Copy link
Collaborator

Summary

Async continuation resume now only trusts durable terminal rows for Sim-executed tools, fails explicitly when a continuation cannot be resumed, and closes the parallel tool race where tool_result could arrive before local execution startup and cause incorrect dedupe or double-processing.

Type of Change

  • Bug fix

Testing

Tested manually

Checklist

  • Code follows project style guidelines
  • Self-reviewed my changes
  • Tests added/updated and passing
  • No new warnings introduced
  • I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)

@cursor
Copy link

cursor bot commented Mar 24, 2026

PR Summary

Medium Risk
Changes async continuation resume/claim logic and tool execution dedupe in the copilot orchestrator; mistakes could cause stuck runs or missed/double tool deliveries under concurrency.

Overview
Tightens async continuation resume behavior by only resuming when each pending tool call is durably terminal (or explicitly Go-handled), adding bounded retries and failing the orchestration with a clear error instead of silently skipping partial/empty resume attempts.

Fixes tool_call/tool_result ordering races by creating placeholder tool state when a tool_result arrives first, marking results as seen at the handler/execution layer (not in shouldSkipToolResultEvent), and short-circuiting local execution when a terminal result/state is already present. Tests were updated to cover explicit resume failure cases and the out-of-order tool result scenarios.

Written by Cursor Bugbot for commit 39bc691. Configure here.

@vercel
Copy link

vercel bot commented Mar 24, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Actions Updated (UTC)
docs Skipped Skipped Mar 24, 2026 10:47am

Request Review

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Mar 24, 2026

Greptile Summary

This PR fixes two related async-continuation bugs in the Copilot orchestrator: (1) the resume logic now requires a durable terminal row for all Sim-executed tools before proceeding, and explicitly throws instead of silently dropping continuations it cannot resume; (2) a parallel tool race condition where a tool_result or tool_error SSE event could arrive before local execution starts (or even before the tool_call event itself) is closed by introducing ensureTerminalToolCallState and moving the "already processed" guard from a global wasToolResultSeen check to an inline terminal-state check on the ToolCallState object.

Key changes:

  • ReadyContinuationTool interface and durable-row-first claim logic replace the old speculative claimCompletedAsyncToolCall-first approach in index.ts
  • shouldSkipToolResultEvent is now a pure read; markToolResultSeen is called by the handlers themselves, cleanly separating deduplication logic from filtering
  • ensureTerminalToolCallState in handlers.ts creates a sentinel tool-call state when a result arrives before the call, preventing spurious local execution after the fact
  • executeToolAndReport checks terminal state at both entry and post-executeToolServerSide to cover the mid-execution race
  • Tests added for both out-of-order scenarios and the new explicit-failure behavior

Confidence Score: 4/5

  • Safe to merge; the core race-condition and async-resume fixes are logically sound and well-tested. Remaining notes are minor style/cleanup items.
  • The two bugs being addressed (out-of-order tool_result race and speculative durable-row assumptions) are fixed correctly. The shared resumeRetries counter could reduce effective retries in rare mixed-failure scenarios, and terminalCompletionFromToolCall has a semantic quirk using toolCall.error as a success message, but neither is a production-critical defect. Test coverage for both new behaviors is solid.
  • apps/sim/lib/copilot/orchestrator/sse/handlers/tool-execution.ts — terminalCompletionFromToolCall success-message field and duplicated isTerminalToolCallStatus helper worth cleaning up before the next iteration.

Important Files Changed

Filename Overview
apps/sim/lib/copilot/orchestrator/index.ts Core orchestrator refactored to use a typed ReadyContinuationTool structure, replacing speculative claim-first logic with durable-row-first verification. Error paths now throw explicitly instead of silently breaking the loop. Shared resumeRetries counter is a minor concern for edge cases.
apps/sim/lib/copilot/orchestrator/sse/handlers/tool-execution.ts Race condition fix is sound: pre-execution and post-execution terminal state checks replace wasToolResultSeen. Minor semantic issue: terminalCompletionFromToolCall uses toolCall.error as the success message. isTerminalToolCallStatus duplicates logic in index.ts.
apps/sim/lib/copilot/orchestrator/sse/handlers/handlers.ts ensureTerminalToolCallState correctly handles out-of-order tool_result/tool_error events (creating a synthetic pending state that immediately becomes terminal), preventing local execution when a result is already known. markToolResultSeen moved from shouldSkipToolResultEvent to the handlers themselves — a clean separation of concerns.
apps/sim/lib/copilot/orchestrator/sse/utils.ts shouldSkipToolResultEvent converted from a stateful (mark-and-check) function to a pure read. Marking responsibility correctly delegated to the handlers. The stream core calls this before dispatching, so ordering is preserved.
apps/sim/lib/copilot/orchestrator/index.test.ts Tests updated to cover new explicit-failure behavior. Removed the now-invalid partial-resume test; added two new tests for both branches of the Sim-tool durable-row requirement.
apps/sim/lib/copilot/orchestrator/sse/handlers/handlers.test.ts Two new race-condition tests added: tool_result before upsert completes, and tool_result before tool_call event. Both correctly assert that executeToolServerSide is not called and state is captured from the early result.
apps/sim/lib/copilot/orchestrator/sse/utils.test.ts Test updated to explicitly call markToolResultSeen before asserting shouldSkipToolResultEvent returns true, correctly reflecting the new pure-read contract.

Sequence Diagram

sequenceDiagram
    participant Go as Go Backend SSE
    participant SL as Stream Loop
    participant H as sseHandlers
    participant Exec as executeToolAndReport
    participant DB as Async Runs DB

    Go->>SL: tool_call event
    SL->>SL: shouldSkipToolCallEvent = false
    SL->>H: dispatch tool_call
    H->>Exec: fireToolExecution
    Exec->>DB: upsertAsyncToolCall
    Exec->>Exec: executeToolServerSide
    Exec->>Exec: check terminal state post-exec
    Exec->>DB: completeAsyncToolCall
    Exec->>Go: markToolComplete
    Go->>SL: duplicate tool_result SSE
    SL->>SL: shouldSkipToolResultEvent = true, skip

    Note over Go,SL: Race condition fix
    Go->>SL: tool_result arrives before tool_call
    SL->>H: dispatch tool_result
    H->>H: ensureTerminalToolCallState
    H->>H: set status=success, markToolResultSeen
    Go->>SL: tool_call arrives late
    SL->>H: dispatch tool_call
    H->>H: existing.status not pending, return early

    Note over SL,DB: Async resume
    SL->>DB: getAsyncToolCall per pending tool
    alt terminal durable row found
        SL->>DB: claimCompletedAsyncToolCall
        SL->>Go: POST /api/tools/resume
        Go->>SL: new SSE stream
        SL->>DB: markAsyncToolDelivered
    else retries exhausted
        SL->>SL: throw explicit error
    end
Loading

Reviews (1): Last reviewed commit: "ensure tool call terminal state" | Re-trigger Greptile

Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Fix All in Cursor

Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

@icecrasher321 icecrasher321 merged commit 0c80438 into staging Mar 24, 2026
6 checks passed
@waleedlatif1 waleedlatif1 deleted the fix/tool-claim-mgmt branch March 24, 2026 17:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant