Test Execution
BrowseGenius converts GPT-authored steps into deterministic browser actions. This section dives into how the orchestrator works and how to intervene during a run.
Execution lifecycle
- Preparation
  - The active tab is detected via `chrome.tabs.query`.
  - The extension attaches the `chrome.debugger` API and temporarily disables conflicting extensions.
- LLM loop (see the run-loop sketch after this list)
  - The orchestrator (service worker) requests a simplified DOM.
  - Previous successful actions are passed back to `determineNextAction`.
  - GPT-4 responds in the `<Thought>…</Thought><Action>…</Action>` grammar inherited from Taxy.
- Action dispatch (see the dispatch sketch after this list)
  - `click(elementId)` and `setValue(elementId, value)` are executed via the DevTools protocol.
  - Finish/fail directives resolve the test case immediately.
- Wait & retry
  - A 1.5s wait after each action helps the page settle.
  - Each case is capped at 40 actions to avoid runaway loops.
- Teardown
  - The debugger detaches and disabled extensions are restored.
  - Reports are updated with duration, status, and any failure reason.
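Putting the phases together, the loop below is a minimal sketch of how the orchestrator could drive a single test case. The `chrome.tabs.query`, `chrome.debugger.attach`, and `chrome.debugger.detach` calls are real extension APIs; the types and helpers (`TestCase`, `captureSimplifiedDom`, `determineNextAction`, `dispatchAction`) are hypothetical stand-ins, not BrowseGenius's actual internals.

```typescript
// Minimal run-loop sketch. chrome.* calls are real MV3 extension APIs; every other
// name (types and helper functions) is a hypothetical stand-in for illustration.
type ParsedAction =
  | { name: "click"; elementId: number }
  | { name: "setValue"; elementId: number; value: string }
  | { name: "finish" }
  | { name: "fail"; reason: string };

interface TestCase { title: string; expectation: string; }
interface CaseResult { status: "passed" | "failed" | "blocked"; reason?: string; }

declare function captureSimplifiedDom(tabId: number): Promise<string>;
declare function determineNextAction(
  testCase: TestCase, dom: string, history: ParsedAction[]
): Promise<ParsedAction>;
declare function dispatchAction(tabId: number, action: ParsedAction): Promise<void>;

const MAX_ACTIONS = 40;        // cap per case to avoid runaway loops
const SETTLE_DELAY_MS = 1500;  // 1.5s wait so the page can settle
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function runTestCase(testCase: TestCase): Promise<CaseResult> {
  // Preparation: detect the active tab and attach the debugger.
  const [tab] = await chrome.tabs.query({ active: true, currentWindow: true });
  if (!tab?.id) throw new Error("No active tab found");
  await chrome.debugger.attach({ tabId: tab.id }, "1.3");

  const history: ParsedAction[] = [];
  try {
    for (let step = 0; step < MAX_ACTIONS; step++) {
      // LLM loop: capture a simplified DOM, then ask the model for the next action.
      const dom = await captureSimplifiedDom(tab.id);
      const action = await determineNextAction(testCase, dom, history);

      // Finish/fail directives resolve the case immediately.
      if (action.name === "finish") return { status: "passed" };
      if (action.name === "fail") return { status: "failed", reason: action.reason };

      // Action dispatch via the DevTools protocol, then wait for the page to settle.
      await dispatchAction(tab.id, action);
      history.push(action);
      await sleep(SETTLE_DELAY_MS);
    }
    // Hitting the cap blocks the case rather than failing it outright.
    return { status: "blocked", reason: `Exceeded ${MAX_ACTIONS} actions` };
  } finally {
    // Teardown: always detach, even when an error is thrown.
    await chrome.debugger.detach({ tabId: tab.id });
  }
}
```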
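The dispatch step itself can be sketched against the raw DevTools protocol. The regex parser below mirrors the `<Thought>…</Thought><Action>…</Action>` grammar, and `DOM.getBoxModel` plus `Input.dispatchMouseEvent` are real protocol methods; how BrowseGenius maps `elementId` to a DOM node is an assumption here (a `backendNodeId` lookup is used for illustration).

```typescript
// Illustrative parser and click dispatcher. The protocol commands are real;
// the element-id bookkeeping (backendNodeId lookup) is an assumption.

function parseModelResponse(text: string): { thought: string; action: string } {
  const thought = /<Thought>([\s\S]*?)<\/Thought>/.exec(text)?.[1]?.trim();
  const action = /<Action>([\s\S]*?)<\/Action>/.exec(text)?.[1]?.trim();
  if (!thought || !action) {
    // Malformed output surfaces as a parser error, which fails the case.
    throw new Error(`Could not parse model response: ${text.slice(0, 120)}`);
  }
  return { thought, action };
}

// Click an element by synthesizing a mouse press/release at the center of its box model.
async function clickElement(tabId: number, backendNodeId: number): Promise<void> {
  const target = { tabId };
  const { model } = (await chrome.debugger.sendCommand(target, "DOM.getBoxModel", {
    backendNodeId,
  })) as { model: { content: number[] } };
  // content is a quad: [x_tl, y_tl, x_tr, y_tr, x_br, y_br, x_bl, y_bl]
  const [xTL, yTL, , , xBR, yBR] = model.content;
  const x = (xTL + xBR) / 2;
  const y = (yTL + yBR) / 2;
  for (const type of ["mousePressed", "mouseReleased"] as const) {
    await chrome.debugger.sendCommand(target, "Input.dispatchMouseEvent", {
      type, x, y, button: "left", clickCount: 1,
    });
  }
}
```

A `setValue` handler can follow the same pattern, focusing the node (e.g., via `DOM.focus`) and then sending text with `Input.insertText`; the exact approach BrowseGenius uses is not documented here.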
Monitoring runs
- Popup: Shows suite status, per-case badges, and failure summaries.
- DevTools panel: Streams the action log (`Thought → Action`), error toasts, and DOM observations in near real time.
- Artifacts: Logs, DOM snippets, and screenshots are stored in the active report object for later export (a possible shape is sketched after this list).
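The report schema is not spelled out above, so the interfaces below are only a guess at what the active report object might hold; all field names are assumptions.

```typescript
// Hypothetical shape of the in-memory report object; field names are assumptions.
interface ActionLogEntry {
  thought: string;            // model reasoning from the <Thought> block
  action: string;             // raw action string from the <Action> block
  domSnippet?: string;        // simplified DOM captured before the action
  screenshotDataUrl?: string; // optional screenshot taken after the action
  timestamp: number;
}

interface CaseReport {
  title: string;
  status: "passed" | "failed" | "blocked";
  durationMs: number;
  failureReason?: string;
  log: ActionLogEntry[];
}

interface SuiteReport {
  startedAt: number;
  cases: CaseReport[];
}
```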
Handling failures
- Model parsing errors: If GPT returns malformed actions, the case is marked failed with the parser error message.
- Max actions exceeded: Cases switch to `blocked` after hitting the cap; use notes or new captures to guide the LLM.
- Suite abort: Fatal errors (e.g., DOM capture failures) stop the entire suite, mark the current case failed, and record a log entry (see the sketch after this list).
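One way these outcomes could map onto case and suite status, reusing the hypothetical `runTestCase`, `TestCase`, and report types from the earlier sketches; the error classes here are assumptions, not BrowseGenius exports.

```typescript
// Hypothetical failure handling; status names mirror the ones used in this doc.
class ParserError extends Error {}      // malformed <Thought>/<Action> output
class FatalSuiteError extends Error {}  // e.g., DOM capture failure

async function runCaseWithHandling(testCase: TestCase, suite: SuiteReport): Promise<void> {
  const report: CaseReport = { title: testCase.title, status: "blocked", durationMs: 0, log: [] };
  const started = Date.now();
  try {
    const result = await runTestCase(testCase);   // run loop from the earlier sketch
    report.status = result.status;                // passed, failed, or blocked (cap hit)
    report.failureReason = result.reason;
  } catch (err) {
    // Model parsing errors fail just this case; anything else aborts the whole suite.
    report.status = "failed";
    report.failureReason = err instanceof Error ? err.message : String(err);
    if (!(err instanceof ParserError)) {
      throw new FatalSuiteError(`Suite aborted: ${report.failureReason}`);
    }
  } finally {
    report.durationMs = Date.now() - started;
    suite.cases.push(report);                     // record the case either way
  }
}
```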
Manual intervention
You can stop a run by:
- Closing the popup DevTools session (detaches the debugger).
- Setting the suite status to idle via the UI (planned enhancement; a possible handler is sketched below).
- Reloading the tab (forces a detach).
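For the planned idle toggle, the stop path could be as small as detaching the debugger and resetting the status. The sketch below is hypothetical (`stopRun` and the suite shape are assumptions); `chrome.debugger.detach` is the real API call.

```typescript
// Hypothetical "stop run" handler for the planned idle toggle; not an existing
// BrowseGenius API. chrome.debugger.detach is the real extension call.
async function stopRun(tabId: number, suite: { status: string }): Promise<void> {
  try {
    await chrome.debugger.detach({ tabId });  // same effect as reloading the tab
  } catch {
    // Ignore "not attached" errors: the debugger may already be gone.
  }
  suite.status = "idle";                      // reset the UI status indicator
}
```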
After intervening, regenerate the plan or rerun the suite to reset status indicators.
Best practices
- Start with stable, deterministic paths—the LLM performs best when the DOM structure matches the captured context.
- Re-capture screens after UI changes to avoid stale element references.
- Use descriptive expectations (e.g., "Modal closes and success toast appears") so the model aims for clear validation signals.