An agent runs my GPU cluster
For three weeks an agent has been running my GPU cluster. It scales Blackwell cards up and down, picks nodes, and frees silicon when nobody is rendering.
Not autonomous magic. It runs the loop I used to babysit by hand. Here is the honest version of what that taught me, with the parts LinkedIn was too short to hold.

The first lesson: idle GPUs that are still blocked
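A render UI grabs a GPU the moment its process starts, not when you click “render”. PyTorch initializes CUDA lazily, so a bare import torch does not take the device, but most UIs load their model onto the card during startup, and that first CUDA call creates the context and reserves the memory. So an interface nobody is using still owns a card all day.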
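
A minimal sketch of how this "held but idle" state could be spotted from the outside. It is not the agent's actual code. It assumes nvidia-smi is on the PATH and uses its CSV query output. The names GpuSample, parse_samples, held_but_idle and read_gpus are hypothetical, and the thresholds (at least 1024 MiB held, 0% utilization) are illustrative guesses.

```python
import subprocess
from dataclasses import dataclass

# Real nvidia-smi query fields; noheader/nounits yields lines like "0, 0, 10240".
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,utilization.gpu,memory.used",
    "--format=csv,noheader,nounits",
]


@dataclass
class GpuSample:
    index: int         # GPU index as reported by nvidia-smi
    util_pct: int      # compute utilization in percent
    mem_used_mib: int  # memory in use, in MiB


def parse_samples(csv_text: str) -> list[GpuSample]:
    """Parse the CSV lines produced by QUERY into GpuSample records."""
    samples = []
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue
        index, util, mem = (field.strip() for field in line.split(","))
        samples.append(GpuSample(int(index), int(util), int(mem)))
    return samples


def held_but_idle(samples, min_mem_mib=1024, max_util_pct=0):
    """Return indices of GPUs that hold memory but show no compute load.

    Both thresholds are assumptions for illustration, not measured values.
    """
    return [
        s.index
        for s in samples
        if s.mem_used_mib >= min_mem_mib and s.util_pct <= max_util_pct
    ]


def read_gpus() -> list[GpuSample]:
    """Run nvidia-smi once and parse its output (needs an NVIDIA driver)."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_samples(result.stdout)
```

On a node with a driver installed, held_but_idle(read_gpus()) lists the cards that look blocked at that instant. A single reading proves little, because a render can sit between two steps, so a real check would want several samples over a few minutes before calling a card idle.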