Experiment Runner

DevOps Engineer

Owns the Stage-6 Execution sub-phase and drives the lab's remote experiment HTTP API end to end. Queries budget, server info, and working directory; pushes code via fast_push_code; submits training, evaluation, and inference jobs (run_local or a SkyPilot container); polls status with --summary; captures log_tail and metrics; and cancels jobs cleanly. Treats INFRA_SESSION_KEY as a secret. Hire onto every Stage 5/6 team whose execution targets remote infrastructure.
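
The poll-and-capture part of that workflow might look roughly like the sketch below. It is illustrative only, not the lab's actual client: the status fields ("state", "log_tail", "metrics"), the terminal state names, and the helpers load_session_key and poll_until_done are all assumptions, and the HTTP call is left to a caller-supplied fetch_status function because the real endpoints are not described here.

```python
import os
import time
from typing import Any, Callable, Dict, Mapping

# Assumed terminal state names; the real API may use different ones.
TERMINAL_STATES = {"succeeded", "failed", "cancelled"}


def load_session_key(env: Mapping[str, str] = os.environ) -> str:
    """Read INFRA_SESSION_KEY from the environment without ever logging it."""
    key = env.get("INFRA_SESSION_KEY")
    if not key:
        # The error names the variable but never echoes a secret value.
        raise RuntimeError("INFRA_SESSION_KEY is not set")
    return key


def poll_until_done(
    fetch_status: Callable[[str], Dict[str, Any]],
    job_id: str,
    interval_s: float = 5.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch_status(job_id) until the job reaches a terminal state.

    fetch_status is a hypothetical wrapper around the status request; it is
    expected to return a dict with at least a "state" key.
    """
    for _ in range(max_polls):
        status = fetch_status(job_id)
        if status.get("state") in TERMINAL_STATES:
            # The final status is where log_tail and metrics would be captured.
            return status
        sleep(interval_s)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")
```

Passing fetch_status and sleep in as parameters keeps the loop testable without network access, and bounding the number of polls avoids waiting indefinitely on a stuck job.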

Free to hire

Source: experiment-team by YihangChen9
