
Test Execution

BrowseGenius converts GPT-authored steps into deterministic browser actions. This section dives into how the orchestrator works and how to intervene during a run.

Execution lifecycle

  1. Preparation
    • Active tab is detected via chrome.tabs.query.
    • The extension attaches to the tab via the chrome.debugger API and temporarily disables conflicting extensions.
  2. LLM loop
    • The orchestrator (service worker) requests a simplified DOM snapshot (see the loop sketch after this list).
    • Previous successful actions are passed back to determineNextAction.
    • GPT-4 responds in the <Thought>…</Thought><Action>…</Action> grammar inherited from Taxy.
  3. Action dispatch
    • click(elementId) and setValue(elementId, value) are executed via the DevTools protocol (sketched below).
    • Finish/fail directives resolve the test case immediately.
  4. Wait & retry
    • A 1.5s wait helps the page settle.
    • Each case is capped at 40 actions to avoid runaway loops.
  5. Teardown
    • Debugger detaches and disabled extensions are restored.
    • Reports are updated with duration, status, and any failure reason.
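
Putting the lifecycle together, a minimal TypeScript sketch of the orchestrator loop could look like the following. determineNextAction is named above; getSimplifiedDom and executeAction are hypothetical stand-ins for the extension's internals, so treat their signatures as assumptions rather than the actual source.

```ts
// Hypothetical stand-ins for the extension's internals (names assumed,
// except determineNextAction, which this page mentions):
declare function getSimplifiedDom(tabId: number): Promise<string>;
declare function determineNextAction(
  task: string,
  dom: string,
  history: string[],
): Promise<{ name: "click" | "setValue" | "finish" | "fail"; raw: string }>;
declare function executeAction(
  tabId: number,
  action: { name: string; raw: string },
): Promise<void>;

const MAX_ACTIONS = 40; // per-case cap (step 4)
const SETTLE_MS = 1500; // post-action wait (step 4)

async function runCase(task: string): Promise<"success" | "failed"> {
  // 1. Preparation: find the active tab and attach the DevTools protocol.
  //    (Step 1 also disables conflicting extensions; that part is elided.)
  const [tab] = await chrome.tabs.query({ active: true, currentWindow: true });
  const tabId = tab?.id;
  if (!tabId) throw new Error("No active tab");
  await chrome.debugger.attach({ tabId }, "1.3");

  const history: string[] = [];
  try {
    for (let i = 0; i < MAX_ACTIONS; i++) {
      // 2. LLM loop: simplified DOM + prior actions → next action.
      const dom = await getSimplifiedDom(tabId);
      const action = await determineNextAction(task, dom, history);

      // Finish/fail directives resolve the case immediately.
      if (action.name === "finish") return "success";
      if (action.name === "fail") return "failed";

      // 3. Dispatch click/setValue via the DevTools protocol.
      await executeAction(tabId, action);
      history.push(action.raw);

      // 4. Wait for the page to settle before the next DOM capture.
      await new Promise((resolve) => setTimeout(resolve, SETTLE_MS));
    }
    return "failed"; // action cap exceeded
  } finally {
    // 5. Teardown: always detach, even when the loop throws.
    await chrome.debugger.detach({ tabId });
  }
}
```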
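
For the dispatch step, one way click and setValue could be driven over the protocol is through chrome.debugger.sendCommand with Runtime.evaluate. Addressing elements through a data-id attribute is an assumption here; the real simplified DOM may label nodes differently.

```ts
// Assumption: the simplified DOM annotates each interactive node with a
// data-id attribute that matches elementId.
async function clickElement(tabId: number, elementId: number): Promise<void> {
  await chrome.debugger.sendCommand({ tabId }, "Runtime.evaluate", {
    expression: `document.querySelector('[data-id="${elementId}"]')?.click()`,
  });
}

async function setValue(
  tabId: number,
  elementId: number,
  value: string,
): Promise<void> {
  // Focus the element, set its value, and fire an input event so frameworks
  // (React, Vue, …) notice the change.
  await chrome.debugger.sendCommand({ tabId }, "Runtime.evaluate", {
    expression: `
      const el = document.querySelector('[data-id="${elementId}"]');
      if (el) {
        el.focus();
        el.value = ${JSON.stringify(value)};
        el.dispatchEvent(new Event("input", { bubbles: true }));
      }
    `,
  });
}
```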

Monitoring runs

  • Popup: Shows suite status, per-case badges, and failure summaries.
  • DevTools panel: Streams the action log (Thought → Action), error toasts, and DOM observations in near real time.
  • Artifacts: Logs, DOM snippets, and screenshots are stored in the active report object for later export (a possible shape is sketched below).
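
This page does not define the report object's schema, but a plausible TypeScript shape, with illustrative field names only, might look like this:

```ts
// Illustrative only: field names are assumptions, not the extension's schema.
interface CaseReport {
  status: "passed" | "failed" | "blocked";
  durationMs: number;
  failureReason?: string;
  actionLog: { thought: string; action: string }[]; // Thought → Action stream
  domSnippets: string[]; // simplified DOM captures
  screenshots: string[]; // data URLs or storage keys
}

interface SuiteReport {
  startedAt: number; // epoch ms
  cases: Record<string, CaseReport>;
}
```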

Handling failures

  • Model parsing errors: If GPT returns a malformed response, the case is marked failed with the parser's error message (see the parsing sketch below).
  • Max actions exceeded: Cases switch to blocked after hitting the 40-action cap; add notes or fresh captures to guide the LLM.
  • Suite abort: Fatal errors (e.g., DOM capture failures) stop the entire suite, mark the current case failed, and record a log entry.
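
A minimal parser for the Taxy-style <Thought>…</Thought><Action>…</Action> grammar could look like the sketch below; the regex and error wording are assumptions, not the extension's actual parser.

```ts
// Extracts the thought/action pair; throws when the grammar is violated,
// which (per the list above) would mark the case failed with this message.
function parseResponse(text: string): { thought: string; action: string } {
  const match =
    /<Thought>([\s\S]*?)<\/Thought>\s*<Action>([\s\S]*?)<\/Action>/.exec(text);
  if (!match) {
    throw new Error(
      `Malformed response, expected <Thought>…<Action> grammar: ${text.slice(0, 80)}`,
    );
  }
  return { thought: match[1].trim(), action: match[2].trim() };
}
```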

Manual intervention

You can stop a run by:

  • Closing the popup DevTools session (detaches the debugger).
  • Setting the suite status to idle via the UI (planned enhancement).
  • Reloading the tab (forces a detach; the extension can observe this via chrome.debugger.onDetach, sketched below).
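
chrome.debugger.onDetach is the real Chrome event fired when a target detaches (tab reload, user cancellation, another debugger taking over); the handler body below is a sketch of how a run could be halted, and abortActiveRun is a hypothetical helper.

```ts
// Hypothetical helper standing in for suite state management:
declare function abortActiveRun(reason: string): void;

chrome.debugger.onDetach.addListener((source, reason) => {
  // reason is e.g. "target_closed" or "canceled_by_user".
  console.warn(`Debugger detached from tab ${source.tabId}: ${reason}`);
  abortActiveRun(`debugger detached (${reason})`);
});
```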

After intervening, regenerate the plan or rerun the suite to reset status indicators.

Best practices

  • Start with stable, deterministic paths—the LLM performs best when the DOM structure matches the captured context.
  • Re-capture screens after UI changes to avoid stale element references.
  • Use descriptive expectations (e.g., "Modal closes and success toast appears") so the model aims for clear validation signals (see the example below).
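
For illustration, a test case with a descriptive expectation might look like this; the exact case shape is hypothetical and not defined by this page.

```ts
// Hypothetical test-case shape; only the expectation wording matters here.
const example = {
  title: "Delete a saved address",
  steps: ["Open account settings", "Remove the first saved address"],
  expectation: "Modal closes and success toast appears", // clear validation signal
};
```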
