browseruse_bench.utils.task_utils

Utility functions for task handling.

Import

from browseruse_bench.utils import (
    load_tasks,
    load_tasks_with_benchmark_support,
    filter_tasks,
    filter_completed_tasks,
    is_task_completed_by_result_json,
    resolve_tasks_json_path,
    print_task_summary,
)

load_tasks

Load task data.

def load_tasks(
    tasks_json_path: str,
    prompt_fmt: Optional[str] = None
) -> List[Dict[str, Any]]

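tasks_json_path

str

Required

Path to the tasks JSON file

prompt_fmt

str

Default: None

Optional prompt template, for example "{task}\n...{url}...". When provided, a prompt field is added to each task dict.

return

list of dict

List of tasks. Each item contains task_id, task_text, and url; if prompt_fmt is provided, it also contains prompt.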
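
As a rough illustration of how a prompt_fmt template could be applied, the sketch below fills a template from each task's fields. The helper add_prompt is hypothetical and not part of browseruse_bench; it assumes the {task} placeholder is filled from task_text and {url} from url, which the reference above does not state explicitly.

```python
from typing import Any, Dict, List, Optional


def add_prompt(
    tasks: List[Dict[str, Any]],
    prompt_fmt: Optional[str] = None,
) -> List[Dict[str, Any]]:
    """Hypothetical helper: copy each task, adding `prompt` when a template is given."""
    if prompt_fmt is None:
        return [dict(t) for t in tasks]
    out = []
    for t in tasks:
        item = dict(t)
        # Assumption: {task} maps to task_text and {url} maps to url.
        item["prompt"] = prompt_fmt.format(task=t["task_text"], url=t["url"])
        out.append(item)
    return out


sample = [
    {"task_id": "t1", "task_text": "Find the pricing page", "url": "https://example.com"}
]
with_prompt = add_prompt(sample, "{task}\nStart at: {url}")
print(with_prompt[0]["prompt"])
```

The sketch copies each task dict rather than mutating the input, so the original list can be reused with a different template.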

load_tasks_with_benchmark_support

Load tasks with support for different benchmarks, including BrowseComp.

def load_tasks_with_benchmark_support(
    tasks_json_path: Path,
    prompt_fmt: Optional[str] = None
) -> List[Dict[str, Any]]

tasks_json_path

Path

Required

Path to the tasks JSON file

prompt_fmt

str

Default: None

Optional prompt template. BrowseComp has its own template, so this parameter is ignored for it.

filter_tasks

Filter tasks according to the selected mode.

def filter_tasks(
    tasks: List[Dict[str, Any]],
    mode: str,
    count: int,
    task_ids: Optional[List[str]],
    task_id: Optional[str] = None
) -> List[Dict[str, Any]]

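tasks

list of dict

Required

List of tasks

mode

str

Required

Filter mode:

single - run only the first task
first_n - run the first N tasks
sample_n - run a random sample of N tasks
specific - run the tasks with the given IDs
by_id - run a single task by ID
all - run all tasks

count

int

Required

Number of tasks in first_n or sample_n mode

task_ids

list of str

Default: None

List of task IDs for specific mode

task_id

str

Default: None

Single task ID for by_id mode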
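
To make the six modes concrete, here is a minimal standalone sketch of the selection logic they describe. The function select_tasks is a hypothetical re-implementation for illustration, not the library's code; capping sample_n at the number of available tasks and raising ValueError on an unknown mode are assumptions.

```python
import random
from typing import Any, Dict, List, Optional


def select_tasks(
    tasks: List[Dict[str, Any]],
    mode: str,
    count: int = 1,
    task_ids: Optional[List[str]] = None,
    task_id: Optional[str] = None,
) -> List[Dict[str, Any]]:
    """Hypothetical sketch of the documented filter modes."""
    if mode == "single":
        return tasks[:1]
    if mode == "first_n":
        return tasks[:count]
    if mode == "sample_n":
        # Assumption: never ask for more samples than there are tasks.
        return random.sample(tasks, min(count, len(tasks)))
    if mode == "specific":
        wanted = set(task_ids or [])
        return [t for t in tasks if t["task_id"] in wanted]
    if mode == "by_id":
        return [t for t in tasks if t["task_id"] == task_id]
    if mode == "all":
        return list(tasks)
    # Assumption: unknown modes are treated as an error.
    raise ValueError(f"unknown mode: {mode}")


demo = [{"task_id": f"t{i}"} for i in range(5)]
first_two = select_tasks(demo, "first_n", count=2)
picked = select_tasks(demo, "specific", task_ids=["t1", "t3"])
print([t["task_id"] for t in first_two])
print([t["task_id"] for t in picked])
```

Note that count only matters for first_n and sample_n, task_ids only for specific, and task_id only for by_id; the other arguments are ignored in each mode.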

filter_completed_tasks

Filter out tasks that have already been completed.

def filter_completed_tasks(
    tasks: List[Dict[str, Any]],
    output_dir: Path,
    check_func: Callable[[str, Path], bool]
) -> Tuple[List[Dict[str, Any]], int]

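tasks

list of dict

Required

List of tasks

output_dir

Path

Required

Output directory

check_func

function

Required

Function that decides whether a task has already been completed

return

tuple

(list of remaining tasks, number of skipped tasks)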
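
The sketch below shows one way the check_func contract could work in practice. Both skip_completed and fake_check are hypothetical names; the sketch assumes check_func is called with (task_id, output_dir), which matches the Callable[[str, Path], bool] type in the signature.

```python
from pathlib import Path
from typing import Any, Callable, Dict, List, Tuple


def skip_completed(
    tasks: List[Dict[str, Any]],
    output_dir: Path,
    check_func: Callable[[str, Path], bool],
) -> Tuple[List[Dict[str, Any]], int]:
    """Hypothetical sketch: keep tasks that check_func reports as not done."""
    remaining = [t for t in tasks if not check_func(t["task_id"], output_dir)]
    return remaining, len(tasks) - len(remaining)


done_ids = {"t1"}  # stand-in for whatever record of completion exists


def fake_check(task_id: str, output_dir: Path) -> bool:
    # Illustrative check only; a real check would inspect output_dir.
    return task_id in done_ids


todo, skipped = skip_completed(
    [{"task_id": "t0"}, {"task_id": "t1"}, {"task_id": "t2"}],
    Path("outputs"),
    fake_check,
)
print([t["task_id"] for t in todo], skipped)
```

Passing the check as a callable keeps the completion rule swappable, so a benchmark with a different output layout can supply its own check.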

is_task_completed_by_result_json

Determine whether a task has been completed by checking result.json.

def is_task_completed_by_result_json(
    task_id: str,
    output_dir: Path
) -> bool

task_id

str

Required

Task ID

output_dir

Path

Required

Path to the output directory

return

bool

Returns True when result.json exists and is non-empty
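
A minimal sketch of such a check follows. The name result_json_done is hypothetical, and two details are assumptions: that results live at output_dir/task_id/result.json, and that "non-empty" means a file size greater than zero rather than non-empty JSON content.

```python
import tempfile
from pathlib import Path


def result_json_done(task_id: str, output_dir: Path) -> bool:
    """Hypothetical sketch: True when result.json exists and is non-empty."""
    # Assumption: results are stored at <output_dir>/<task_id>/result.json.
    result_file = output_dir / task_id / "result.json"
    return result_file.is_file() and result_file.stat().st_size > 0


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "t0").mkdir()
    (root / "t0" / "result.json").write_text('{"status": "ok"}')
    (root / "t1").mkdir()
    (root / "t1" / "result.json").write_text("")  # empty file: not done
    # t2 has no directory at all, so it is also reported as not done.
    checks = {tid: result_json_done(tid, root) for tid in ("t0", "t1", "t2")}

print(checks)
```

A function with this signature can be passed directly as check_func to filter_completed_tasks.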

resolve_tasks_json_path

Resolve the path to the tasks JSON file.

def resolve_tasks_json_path(
    tasks_json_arg: Optional[str],
    default_tasks_json: Path,
    env_var: str = 'TASKS_JSON'
) -> str

tasks_json_arg

str

Default: None

Path passed on the command line

default_tasks_json

Path

Required

Default path

env_var

str

Default: 'TASKS_JSON'

Name of the environment variable
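
The reference does not spell out the order in which the three sources are consulted. The sketch below shows one plausible precedence (command-line argument, then environment variable, then default) purely as an assumption; resolve_path_sketch and the DEMO_TASKS_JSON variable are hypothetical names used for illustration.

```python
import os
from pathlib import Path
from typing import Optional


def resolve_path_sketch(
    tasks_json_arg: Optional[str],
    default_tasks_json: Path,
    env_var: str = "TASKS_JSON",
) -> str:
    """Hypothetical sketch: argument first, then env var, then default."""
    if tasks_json_arg:
        return tasks_json_arg
    from_env = os.environ.get(env_var)
    if from_env:
        return from_env
    return str(default_tasks_json)


default = Path("data") / "tasks.json"

# Use a demo variable name so a real TASKS_JSON setting is left untouched.
os.environ.pop("DEMO_TASKS_JSON", None)
from_default = resolve_path_sketch(None, default, env_var="DEMO_TASKS_JSON")

os.environ["DEMO_TASKS_JSON"] = "/tmp/env_tasks.json"
from_env_var = resolve_path_sketch(None, default, env_var="DEMO_TASKS_JSON")
from_arg = resolve_path_sketch("cli_tasks.json", default, env_var="DEMO_TASKS_JSON")

print(from_default, from_env_var, from_arg)
```

If the actual precedence differs, only the order of the three checks in the sketch would change.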

print_task_summary

Print a summary of task execution.

def print_task_summary(
    total_tasks: int,
    tasks_to_run: int,
    success_count: int,
    failed_count: int,
    output_dir: Path
) -> None

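total_tasks

int

Required

Total number of tasks

tasks_to_run

int

Required

Number of tasks in this run

success_count

int

Required

Number of tasks that succeeded

failed_count

int

Required

Number of tasks that failed

output_dir

Path

Required

Path to the output directory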
