Overview
| Attribute | Value |
|---|---|
| Task Type | Browser operations |
| Evaluation | Grader-based scoring |
| Difficulty | Medium-High |
Features
Competition-grade Tasks
Tasks are drawn from high-difficulty browser operation competitions
Comprehensive Skills
Tests a wide range of browser operation capabilities
Quick Start
Run Tasks
Evaluate Results
Data Loading
BrowseComp loads task data from local JSONL files or from HuggingFace downloads.
Evaluation Metrics
| Metric | Description |
|---|---|
| Task Completion | Percentage of tasks completed |
| Accuracy | Correctness of the returned results |
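As a rough illustration of how the two metrics could be aggregated, here is a minimal sketch. The `TaskResult` class, its `completed` and `correct` fields, and the `summarize` function are hypothetical names introduced for this example, not part of BrowseComp. It also assumes that accuracy is measured over all tasks and reported as a percentage; the actual grader may define it differently.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TaskResult:
    """Hypothetical per-task record; field names are assumptions."""
    completed: bool  # the agent finished the task
    correct: bool    # the grader accepted the returned result


def summarize(results: List[TaskResult]) -> Dict[str, float]:
    """Return both metrics as percentages over all tasks."""
    total = len(results)
    if total == 0:
        # Avoid division by zero on an empty run.
        return {"task_completion": 0.0, "accuracy": 0.0}
    completed = sum(r.completed for r in results)
    correct = sum(r.correct for r in results)
    return {
        "task_completion": 100.0 * completed / total,
        "accuracy": 100.0 * correct / total,
    }
```

Calling `summarize` on a list of per-task records returns a dictionary with one percentage per metric, which keeps the two numbers directly comparable.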
Data Format
Task data is stored in benchmarks/BrowseComp/data/.
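For readers who want to inspect the local data directly, the sketch below reads every JSONL file in a directory using only the Python standard library. The `iter_tasks` helper is a hypothetical name, and the `*.jsonl` file pattern is an assumption. The fields inside each task are not specified here, so the sketch treats each line as an opaque JSON object.

```python
import json
from pathlib import Path
from typing import Iterator


def iter_tasks(data_dir: str) -> Iterator[dict]:
    """Yield one task dict per non-empty line of each *.jsonl file.

    Hypothetical helper: the file pattern is an assumption, not
    something BrowseComp documents.
    """
    for path in sorted(Path(data_dir).glob("*.jsonl")):
        with path.open(encoding="utf-8") as handle:
            for line in handle:
                line = line.strip()
                if line:  # skip blank lines between records
                    yield json.loads(line)
```

A call such as `list(iter_tasks("benchmarks/BrowseComp/data"))` would then return all tasks found under that directory, in file-name order.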