action · Free · MIT

Browser control (Playwright)

Action skill. Opens pages, clicks, types, takes screenshots, and scrapes text via Playwright's headless Chromium. The container build arg WITH_BROWSER=1 ships the browser binary; enable it only when the agent needs web access.

ar skills get browser-control

browser-control

Playwright-driven headless Chromium for the agent. The skill is skipped at runtime if the container was not built with WITH_BROWSER=1.

Handlers

| kind | Reversible | Notes |
| --- | --- | --- |
| browser.open | yes | Opens a new page (returns page id). |
| browser.click | yes | Clicks a CSS selector on an open page. |
| browser.type | yes | Types text into a focused element. |
| browser.screenshot | no | PNG bytes (base64). |
| browser.scrape | no | Returns innerText of a selector. |
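
The handler table can be mirrored as plain data when planning a sequence of actions. The sketch below is illustrative only: the `Action` class and the argument names (`url`, `page`, `selector`, `text`) are hypothetical and are not taken from the skill's actual request schema; only the handler kinds and their reversibility flags come from the table.

```python
from dataclasses import dataclass, field
from typing import Any

# Reversibility flags copied from the Handlers table.
HANDLERS: dict[str, bool] = {
    "browser.open": True,
    "browser.click": True,
    "browser.type": True,
    "browser.screenshot": False,
    "browser.scrape": False,
}


@dataclass(frozen=True)
class Action:
    """Hypothetical action record: a handler kind plus free-form arguments."""

    kind: str
    args: dict[str, Any] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Reject kinds that the table does not list.
        if self.kind not in HANDLERS:
            raise ValueError(f"unknown handler kind: {self.kind}")

    @property
    def reversible(self) -> bool:
        return HANDLERS[self.kind]


# Argument names below are assumptions made for illustration.
plan = [
    Action("browser.open", {"url": "https://example.com"}),
    Action("browser.type", {"page": "p1", "text": "hello"}),
    Action("browser.click", {"page": "p1", "selector": "#search"}),
    Action("browser.scrape", {"page": "p1", "selector": "main"}),
]

# Collect the steps the table marks as not reversible.
irreversible = [a.kind for a in plan if not a.reversible]
```

Keeping the flags in one dictionary means a planner can check reversibility before it dispatches a step, instead of hard-coding the rule per handler.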

Gotchas

  • Playwright adds ~300 MB to the runtime image. Build with WITH_BROWSER=1 only if the agent needs a browser.
  • The skill keeps page handles in-process. After a process restart the old handles are gone, so re-open the page.
  • browser.click on an input of type submit triggers the owner-approval gate unless the host is on the allow-list (ctx.deps.browser.allowedHosts).
  • browser.screenshot truncates output above 4 MB to avoid token blow-up in the planner.
  • Use wait_for: "networkidle" (default) instead of fixed sleeps.
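
The allow-list rule for submit clicks can be pictured with a small check like the one below. This is a sketch under stated assumptions, not the skill's implementation: the function name `needs_owner_approval` is hypothetical, and exact, case-insensitive hostname matching is an assumption, since the source does not say how `allowedHosts` entries are compared.

```python
from urllib.parse import urlsplit


def needs_owner_approval(page_url: str, is_submit: bool, allowed_hosts: set[str]) -> bool:
    """Return True when a click should be held for owner approval.

    Assumption: hosts are compared by exact hostname, ignoring case.
    """
    if not is_submit:
        # Only clicks on submit inputs are gated.
        return False
    # urlsplit lowercases the hostname; it is None for URLs without a host.
    host = urlsplit(page_url).hostname or ""
    return host not in {h.lower() for h in allowed_hosts}


allowed = {"intranet.example.com"}

# Submit click on an allow-listed host: no approval needed.
on_list = needs_owner_approval("https://intranet.example.com/login", True, allowed)

# Submit click on any other host: approval required.
off_list = needs_owner_approval("https://shop.example.org/checkout", True, allowed)

# Ordinary click on an unlisted host: not gated.
plain_click = needs_owner_approval("https://shop.example.org/", False, allowed)
```

If the real allow-list supports wildcards or subdomain matching, the membership test would need to change accordingly.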