skyvern_v1, skyvern_v2, openai-cua, anthropic-cua, ui-tars) and browser backends (local, cdp, lexmount, skyvern-cloud).
Installation
The
skyvern and browser-use extras conflict. Use a separate venv (default .venvs/skyvern for skyvern, and .venvs/browser_use for browser-use in config.yaml).
bubench run will auto-create that venv and install the skyvern extra on first use..venv (or use uv run bubench ...) before running bubench commands.
Configuration
Configure Skyvern in the rootconfig.yaml under agents.skyvern:
OPENAI_COMPATIBLE_API_KEY, SKYVERN_API_KEY) in the repo root .env.
Engine Options
| ENGINE | Description |
|---|---|
skyvern_v1 | Skyvern v1 engine |
skyvern_v2 | Skyvern v2 engine |
openai-cua | OpenAI CUA engine |
anthropic-cua | Anthropic CUA engine |
ui-tars | UI-TARS engine |
Browser Backends
browser_id | Description |
|---|---|
local | Local browser |
cdp | Connect to an external CDP browser (set CDP_ADDRESS) |
lexmount | Lexmount cloud browser (set LEXMOUNT_BROWSER_MODE) |
skyvern-cloud | Skyvern cloud browser |
Common Parameters
| Parameter | Description | Example |
|---|---|---|
enable_openai_compatible | Enable OpenAI-compatible mode | true |
model_id | LLM model name | gemini-3-flash-preview |
api_key | LLM API key (use $ENV_VAR form) | $OPENAI_COMPATIBLE_API_KEY |
base_url | LLM API base URL | $OPENAI_BASE_URL |
max_tokens | Max output tokens | 16000 |
temperature | Temperature | 0.0 |
supports_vision | Model supports vision | true |
request_timeout | LiteLLM per-request timeout (seconds) | 600 |
execution_engine | Skyvern execution engine | skyvern_v2 (default) |
headless | Headless mode for local browser | true / false |
timeout | Task timeout (seconds) | 600 |
max_steps | Max steps per task | 25 |
max_screenshot_scrolls | Max scroll screenshots | 5 |
include_action_history_in_verification | Include action history in verification | true |
max_consecutive_repeats | Max consecutive repeats | 3 |
max_action_occurrences | Max occurrences of one action | 5 |
Renamed keys (legacy still honored)The
openai_compatible_* prefix on per-model config keys was dropped so Skyvern matches every other agent (model_id, api_key, base_url, max_tokens, temperature, supports_vision, request_timeout). The old names still work — you’ll see a one-shot DeprecationWarning per key, then the value is aliased onto the new key. The underlying env vars passed to the Skyvern subprocess (OPENAI_COMPATIBLE_*) are unchanged.Why no
browser_control?Skyvern is a fully managed service: the decision of when to use DOM inspection versus visual grounding is made internally by the Skyvern execution engine (configurable via execution_engine). The benchmark only submits tasks via API and does not control the internal interaction strategy.Usage Examples
Basic Run
Run All Tasks
Evaluation
Supported Benchmarks
- ✅ LexBench-Browser
- ✅ Online-Mind2Web
- ✅ BrowseComp