Browser Extension

A Chrome extension that gives AI agents access to the user’s real browser, including its existing cookies, sessions, and authentication. Agents avoid both bot detection and the isolated Playwright sandbox.

Playwright runs in an isolated browser with no cookies or login state. Many sites detect and block automated browsers. The browser extension bridges to the user’s real Chromium instance, so agents can:

  • Navigate and interact with authenticated pages
  • Use existing OAuth sessions (GitHub, Google, etc.)
  • Bypass bot detection and CAPTCHAs
  • Take screenshots of real page state

Agent Worker
BrowserExtTool (src/tools/browser-ext/)
BrowserBridgeService (WebSocket)
    ↓ ws://localhost:3005/ws/browser-bridge
Chrome Extension
├─ Service Worker: WebSocket client, command dispatch
├─ Content Script: element highlighting
└─ Popup: settings, connect/disconnect

Commands are sent as JSON over WebSocket with a unique ID. The extension executes the command and returns the result with the same ID.
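
The ID-based request/response pairing described above can be sketched on the agent side roughly as follows. This is a minimal illustration, not the project's actual BrowserBridgeService: the class name `CommandCorrelator`, the message fields (`command`, `params`, `ok`, `result`, `error`), and the timeout are all assumptions, since the docs only state that each command carries a unique ID and the result echoes it.

```typescript
// Hypothetical message shapes; only the shared `id` is confirmed by the docs.
interface BridgeCommand {
  id: string;
  command: string;
  params: Record<string, unknown>;
}

interface BridgeResult {
  id: string;
  ok: boolean;
  result?: unknown;
  error?: string;
}

interface PendingEntry {
  resolve: (value: unknown) => void;
  reject: (reason: Error) => void;
  timer: ReturnType<typeof setTimeout>;
}

class CommandCorrelator {
  private pending = new Map<string, PendingEntry>();
  private nextId = 0;
  private send: (raw: string) => void;
  private timeoutMs: number;

  // `send` is whatever writes a text frame to the socket, e.g. (raw) => ws.send(raw).
  constructor(send: (raw: string) => void, timeoutMs = 10_000) {
    this.send = send;
    this.timeoutMs = timeoutMs;
  }

  // Sends a command and resolves once a reply with the same ID arrives.
  request(command: string, params: Record<string, unknown> = {}): Promise<unknown> {
    const id = `cmd-${++this.nextId}`;
    return new Promise((resolve, reject) => {
      const timer = setTimeout(() => {
        this.pending.delete(id);
        reject(new Error(`Command ${id} (${command}) timed out`));
      }, this.timeoutMs);
      this.pending.set(id, { resolve, reject, timer });
      const message: BridgeCommand = { id, command, params };
      this.send(JSON.stringify(message));
    });
  }

  // Call from the socket's message handler with each incoming text frame.
  handleMessage(raw: string): void {
    const reply = JSON.parse(raw) as BridgeResult;
    const entry = this.pending.get(reply.id);
    if (!entry) return; // unknown ID, or the request already timed out
    clearTimeout(entry.timer);
    this.pending.delete(reply.id);
    if (reply.ok) entry.resolve(reply.result);
    else entry.reject(new Error(reply.error ?? "command failed"));
  }
}
```

Keeping the pending map keyed by ID lets several commands be in flight at once, and replies can arrive in any order.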

# Manjaro / Arch
sudo pacman -S chromium
# Ubuntu / Debian
sudo apt install chromium-browser

To load the extension:

  1. Open chromium://extensions
  2. Enable Developer mode (toggle, top right)
  3. Click Load unpacked
  4. Select the browser-extension/ directory

To connect it to the backend:

  1. Click the extension icon in the toolbar
  2. Enter the backend URL: ws://localhost:3005
  3. Enter your API key (master key from .env)
  4. Click Connect

The badge shows ON when connected, OFF when disconnected.
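
As a rough sketch of what the connection settings translate to, the helpers below derive the bridge endpoint from the backend URL and pick the badge text. The path `/ws/browser-bridge` comes from the architecture diagram; that the extension appends it to the URL entered in the popup is an assumption, and `bridgeEndpoint` and `badgeText` are hypothetical names.

```typescript
// Path shown in the architecture diagram.
const BRIDGE_PATH = "/ws/browser-bridge";

// Builds the full WebSocket endpoint from the URL entered in the popup.
// Assumption: the popup takes only the origin and the extension adds the path.
function bridgeEndpoint(backendUrl: string): string {
  const url = new URL(backendUrl);
  if (url.protocol !== "ws:" && url.protocol !== "wss:") {
    throw new Error(`Expected a ws:// or wss:// URL, got ${backendUrl}`);
  }
  url.pathname = BRIDGE_PATH;
  return url.toString();
}

// Maps connection state to the badge label described above.
// In a Manifest V3 service worker this would typically be passed to
// chrome.action.setBadgeText({ text: badgeText(connected) }).
function badgeText(connected: boolean): "ON" | "OFF" {
  return connected ? "ON" : "OFF";
}

console.log(bridgeEndpoint("ws://localhost:3005")); // ws://localhost:3005/ws/browser-bridge
```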

Alternatively, run the setup wizard (bun run setup) which can install the extension automatically.

| Command | Description | Permission |
| --- | --- | --- |
| navigate | Navigate active tab to a URL | ASK |
| new_tab | Open a new browser tab | ASK |
| close_tab | Close a tab by ID | ASK |
| select_tab | Switch focus to a tab by ID | ASK |
| get_tabs | List all open tabs | ALLOW |

| Command | Description | Permission |
| --- | --- | --- |
| screenshot | Capture visible tab as base64 PNG | ALLOW |
| extract_content | Extract text, links, forms from page | ALLOW |

| Command | Description | Permission |
| --- | --- | --- |
| click | Click element by CSS selector (supports double-click) | ASK |
| fill | Fill input field with value | ASK |
| select | Select option in a <select> element | ASK |
| hover | Hover over an element | ASK |
| press_key | Press a keyboard key (Enter, Tab, Escape, etc.) | ASK |
| scroll | Scroll the page or an element by pixel offset | ASK |
| drag | Drag an element from one position to another | ASK |

| Command | Description | Permission |
| --- | --- | --- |
| wait_for | Wait for an element or condition to appear | ALLOW |
| highlight | Visually highlight an element on the page | ALLOW |

| Command | Description | Permission |
| --- | --- | --- |
| evaluate | Execute JavaScript in page context | ASK (dangerous) |
| get_cookies / set_cookies | Read/write cookies for a domain | ASK (dangerous) |
| get_storage / set_storage | Read/write localStorage or sessionStorage | ASK (dangerous) |

| Command | Description | Permission |
| --- | --- | --- |
| get_console | Retrieve captured console log entries | ALLOW |
| get_network | Retrieve captured network request/response log | ASK |
| handle_dialog | Accept or dismiss browser dialogs (alert, confirm, prompt) | ASK |
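
For illustration, the Permission column of the command tables can be expressed as a lookup. This is a sketch only: the names `CommandPolicy`, `COMMAND_POLICY`, and `policyFor` are hypothetical, the values mirror the tables as listed, and the strict fallback for unknown commands is an assumption rather than documented behavior.

```typescript
type Permission = "ALLOW" | "ASK";

interface CommandPolicy {
  permission: Permission;
  dangerous: boolean;
}

const allow: CommandPolicy = { permission: "ALLOW", dangerous: false };
const ask: CommandPolicy = { permission: "ASK", dangerous: false };
const askDangerous: CommandPolicy = { permission: "ASK", dangerous: true };

// One entry per command, copied from the Permission column of the tables.
const COMMAND_POLICY = new Map<string, CommandPolicy>(Object.entries({
  navigate: ask, new_tab: ask, close_tab: ask, select_tab: ask, get_tabs: allow,
  screenshot: allow, extract_content: allow,
  click: ask, fill: ask, select: ask, hover: ask,
  press_key: ask, scroll: ask, drag: ask,
  wait_for: allow, highlight: allow,
  evaluate: askDangerous,
  get_cookies: askDangerous, set_cookies: askDangerous,
  get_storage: askDangerous, set_storage: askDangerous,
  get_console: allow, get_network: ask, handle_dialog: ask,
}));

// Assumption: commands missing from the tables get the strictest policy.
function policyFor(command: string): CommandPolicy {
  return COMMAND_POLICY.get(command) ?? askDangerous;
}

// True when the user must approve the command before it runs.
function needsApproval(command: string): boolean {
  return policyFor(command).permission === "ASK";
}
```

A `Map` is used instead of a plain object so that names such as `constructor` do not resolve to inherited properties.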

The browser-ext tool is available to these roles:

| Role | Why |
| --- | --- |
| research | Browse authenticated sources |
| qa | Test real browser behavior |
| security | Assess authenticated endpoints |
| ai | Interact with AI platforms |
| general | Fallback access |

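
A role check against the list above might look like the sketch below. The role names come from the table; the constant `BROWSER_EXT_ROLES` and the function `canUseBrowserExt` are hypothetical, and how the project actually gates tools by role is not shown in the docs.

```typescript
// Roles that the table lists as having access to the browser-ext tool.
const BROWSER_EXT_ROLES = ["research", "qa", "security", "ai", "general"] as const;
type BrowserExtRole = (typeof BROWSER_EXT_ROLES)[number];

// Type guard: narrows an arbitrary role string to one of the listed roles.
function canUseBrowserExt(role: string): role is BrowserExtRole {
  return (BROWSER_EXT_ROLES as readonly string[]).includes(role);
}
```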
  • WebSocket authenticated via master key
  • evaluate, get_cookies, set_cookies, get_storage, set_storage, and get_network are marked as dangerous and require explicit permission approval
  • Navigation and interaction commands require ASK-level approval
  • Screenshots, content extraction, and console capture default to ALLOW
  • v2.0.0 requires the tabs and cookies browser permissions