
Test Execution

BrowseGenius converts GPT-authored steps into deterministic browser actions. This section dives into how the orchestrator works and how to intervene during a run.

Execution lifecycle

  1. Preparation
    • Active tab is detected via chrome.tabs.query.
    • The extension attaches to the tab via the chrome.debugger API and temporarily disables conflicting extensions.
  2. LLM loop
    • The orchestrator (service worker) requests a simplified DOM snapshot (see the loop sketch after this list).
    • Previous successful actions are passed back to determineNextAction.
    • GPT-4 responds in the <Thought>…</Thought><Action>…</Action> grammar inherited from Taxy.
  3. Action dispatch
    • click(elementId) and setValue(elementId, value) are executed via the DevTools protocol (sketched below).
    • Finish/fail directives resolve the test case immediately.
  4. Wait & retry
    • A 1.5s wait helps the page settle.
    • Each case is capped at 40 actions to avoid runaway loops.
  5. Teardown
    • Debugger detaches and disabled extensions are restored.
    • Reports are updated with duration, status, and any failure reason.
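
Putting the lifecycle together, a minimal TypeScript sketch of the orchestrator loop could look like the following. determineNextAction is named above; getSimplifiedDom and executeAction are hypothetical stand-ins for the extension's internals, so treat their signatures as assumptions rather than the actual source.

```ts
// Hypothetical stand-ins for the extension's internals (names assumed,
// except determineNextAction, which this page mentions):
declare function getSimplifiedDom(tabId: number): Promise<string>;
declare function determineNextAction(
  task: string,
  dom: string,
  history: string[],
): Promise<{ name: "click" | "setValue" | "finish" | "fail"; raw: string }>;
declare function executeAction(
  tabId: number,
  action: { name: string; raw: string },
): Promise<void>;

const MAX_ACTIONS = 40; // per-case cap (step 4)
const SETTLE_MS = 1500; // post-action wait (step 4)

async function runCase(task: string): Promise<"success" | "failed"> {
  // 1. Preparation: find the active tab and attach the DevTools protocol.
  //    (Step 1 also disables conflicting extensions; that part is elided.)
  const [tab] = await chrome.tabs.query({ active: true, currentWindow: true });
  const tabId = tab?.id;
  if (!tabId) throw new Error("No active tab");
  await chrome.debugger.attach({ tabId }, "1.3");

  const history: string[] = [];
  try {
    for (let i = 0; i < MAX_ACTIONS; i++) {
      // 2. LLM loop: simplified DOM + prior actions → next action.
      const dom = await getSimplifiedDom(tabId);
      const action = await determineNextAction(task, dom, history);

      // Finish/fail directives resolve the case immediately.
      if (action.name === "finish") return "success";
      if (action.name === "fail") return "failed";

      // 3. Dispatch click/setValue via the DevTools protocol.
      await executeAction(tabId, action);
      history.push(action.raw);

      // 4. Wait for the page to settle before the next DOM capture.
      await new Promise((resolve) => setTimeout(resolve, SETTLE_MS));
    }
    return "failed"; // action cap exceeded
  } finally {
    // 5. Teardown: always detach, even when the loop throws.
    await chrome.debugger.detach({ tabId });
  }
}
```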
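
For the dispatch step, one way click and setValue could be driven over the protocol is through chrome.debugger.sendCommand with Runtime.evaluate. Addressing elements through a data-id attribute is an assumption here; the real simplified DOM may label nodes differently.

```ts
// Assumption: the simplified DOM annotates each interactive node with a
// data-id attribute that matches elementId.
async function clickElement(tabId: number, elementId: number): Promise<void> {
  await chrome.debugger.sendCommand({ tabId }, "Runtime.evaluate", {
    expression: `document.querySelector('[data-id="${elementId}"]')?.click()`,
  });
}

async function setValue(
  tabId: number,
  elementId: number,
  value: string,
): Promise<void> {
  // Focus the element, set its value, and fire an input event so frameworks
  // (React, Vue, …) notice the change.
  await chrome.debugger.sendCommand({ tabId }, "Runtime.evaluate", {
    expression: `
      const el = document.querySelector('[data-id="${elementId}"]');
      if (el) {
        el.focus();
        el.value = ${JSON.stringify(value)};
        el.dispatchEvent(new Event("input", { bubbles: true }));
      }
    `,
  });
}
```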

Monitoring runs

  • Popup: Shows suite status, per-case badges, and failure summaries.
  • DevTools panel: Streams the action log (Thought → Action), error toasts, and DOM observations in near real time.
  • Artifacts: Logs, DOM snippets, and screenshots are stored in the active report object for later export (a possible shape is sketched below).
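
This page does not define the report object's schema, but a plausible TypeScript shape, with illustrative field names only, might look like this:

```ts
// Illustrative only: field names are assumptions, not the extension's schema.
interface CaseReport {
  status: "passed" | "failed" | "blocked";
  durationMs: number;
  failureReason?: string;
  actionLog: { thought: string; action: string }[]; // Thought → Action stream
  domSnippets: string[]; // simplified DOM captures
  screenshots: string[]; // data URLs or storage keys
}

interface SuiteReport {
  startedAt: number; // epoch ms
  cases: Record<string, CaseReport>;
}
```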

Handling failures

  • Model parsing errors: If GPT returns a malformed response, the case is marked failed with the parser's error message (see the parsing sketch below).
  • Max actions exceeded: Cases switch to blocked after hitting the 40-action cap; add notes or fresh captures to guide the LLM.
  • Suite abort: Fatal errors (e.g., DOM capture failures) stop the entire suite, mark the current case failed, and record a log entry.
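
A minimal parser for the Taxy-style <Thought>…</Thought><Action>…</Action> grammar could look like the sketch below; the regex and error wording are assumptions, not the extension's actual parser.

```ts
// Extracts the thought/action pair; throws when the grammar is violated,
// which (per the list above) would mark the case failed with this message.
function parseResponse(text: string): { thought: string; action: string } {
  const match =
    /<Thought>([\s\S]*?)<\/Thought>\s*<Action>([\s\S]*?)<\/Action>/.exec(text);
  if (!match) {
    throw new Error(
      `Malformed response, expected <Thought>…<Action> grammar: ${text.slice(0, 80)}`,
    );
  }
  return { thought: match[1].trim(), action: match[2].trim() };
}
```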

Manual intervention

You can stop a run by:

  • Closing the popup DevTools session (detaches the debugger).
  • Setting the suite status to idle via the UI (planned enhancement).
  • Reloading the tab (forces a detach; the extension can observe this via chrome.debugger.onDetach, sketched below).
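
chrome.debugger.onDetach is the real Chrome event fired when a target detaches (tab reload, user cancellation, another debugger taking over); the handler body below is a sketch of how a run could be halted, and abortActiveRun is a hypothetical helper.

```ts
// Hypothetical helper standing in for suite state management:
declare function abortActiveRun(reason: string): void;

chrome.debugger.onDetach.addListener((source, reason) => {
  // reason is e.g. "target_closed" or "canceled_by_user".
  console.warn(`Debugger detached from tab ${source.tabId}: ${reason}`);
  abortActiveRun(`debugger detached (${reason})`);
});
```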

After intervening, regenerate the plan or rerun the suite to reset status indicators.

Best practices

  • Start with stable, deterministic paths—the LLM performs best when the DOM structure matches the captured context.
  • Re-capture screens after UI changes to avoid stale element references.
  • Use descriptive expectations (e.g., "Modal closes and success toast appears") so the model aims for clear validation signals (see the example below).
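
For illustration, a test case with a descriptive expectation might look like this; the exact case shape is hypothetical and not defined by this page.

```ts
// Hypothetical test-case shape; only the expectation wording matters here.
const example = {
  title: "Delete a saved address",
  steps: ["Open account settings", "Remove the first saved address"],
  expectation: "Modal closes and success toast appears", // clear validation signal
};
```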
