Terminal-Bench Tools

Task Deduplication

Check uploaded task zips for duplicates against Google Drive folders and within the batch itself.
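
The within-batch part of this check can be illustrated with a short sketch. It assumes that two task zips count as duplicates when they contain the same file names with the same contents; the tool's actual matching rule is not stated here, and the comparison against Google Drive folders is left out. The function names are hypothetical.

```python
import hashlib
import zipfile
from collections import defaultdict


def zip_fingerprint(zip_path):
    """Hash the member names and contents of a zip, ignoring archive metadata."""
    digest = hashlib.sha256()
    with zipfile.ZipFile(zip_path) as archive:
        # Sort names so the fingerprint does not depend on member order.
        for name in sorted(archive.namelist()):
            if name.endswith("/"):
                continue  # skip directory entries
            digest.update(name.encode("utf-8"))
            digest.update(b"\0")
            digest.update(archive.read(name))
            digest.update(b"\0")
    return digest.hexdigest()


def find_batch_duplicates(zip_paths):
    """Group paths whose fingerprints match; return only groups of two or more."""
    groups = defaultdict(list)
    for path in zip_paths:
        groups[zip_fingerprint(path)].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]
```

Hashing contents rather than the raw zip bytes means two archives built at different times, or with members in a different order, still compare equal. Near-duplicates (for example, a renamed file) would need a fuzzier comparison than this.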

Harbor Runner

Upload a Terminal-Bench task zip and run Harbor directly on the server. Single task zips and zip-of-zips are both accepted.
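
One way a server could accept both upload shapes is sketched below. The rule used here is an assumption, not the runner's documented behavior: an upload is treated as a zip-of-zips when every file entry inside it is itself a `.zip`, and as a single task zip otherwise. The function name `iter_task_archives` is hypothetical.

```python
import io
import zipfile


def iter_task_archives(zip_bytes, label="upload"):
    """Yield (label, bytes) for each task zip found in an upload.

    Assumption: an upload counts as a zip-of-zips when every file entry in it
    is itself a .zip; anything else is treated as a single task zip.
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        members = [n for n in archive.namelist() if not n.endswith("/")]
        nested = [n for n in members if n.lower().endswith(".zip")]
        if members and len(nested) == len(members):
            # Zip-of-zips: hand back each inner archive separately.
            for name in nested:
                yield name, archive.read(name)
        else:
            # Single task zip: hand back the upload unchanged.
            yield label, zip_bytes
```

Each yielded archive can then be passed to the same per-task code path, so the rest of the pipeline does not need to know which shape was uploaded.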

Harbor uses the Docker backend in this deployment. The environment running this server must expose a reachable Docker daemon or Docker socket.
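
A quick preflight for this requirement might look like the sketch below. It relies on two facts about Docker itself: clients read the `DOCKER_HOST` environment variable, and the default Unix socket on Linux is `/var/run/docker.sock`. The helper names are hypothetical, only `unix://` endpoints are probed, and a successful connection shows that something is listening on the socket, not that the daemon is healthy.

```python
import os
import socket

DEFAULT_SOCKET = "/var/run/docker.sock"  # Docker's default Unix socket on Linux


def docker_endpoint(env=None):
    """Return DOCKER_HOST if set, otherwise the default Unix socket endpoint."""
    env = os.environ if env is None else env
    return env.get("DOCKER_HOST") or f"unix://{DEFAULT_SOCKET}"


def unix_socket_reachable(endpoint, timeout=2.0):
    """Return True if a unix:// endpoint accepts a connection."""
    if not endpoint.startswith("unix://"):
        raise ValueError("only unix:// endpoints are handled in this sketch")
    path = endpoint[len("unix://"):]
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.connect(path)
        except OSError:
            # Missing socket file, permission denied, or nothing listening.
            return False
    return True
```

Running `unix_socket_reachable(docker_endpoint())` on the server before uploading a task gives an early signal when the socket is not mounted or the server process lacks permission to use it.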
