Discussion about this post

Neural Foundry

The async RL explanation with the token lag distribution is eye-opening. I never thought about how conventional RL concentrates the lag in later samples, but async RL spreads it evenly across rollouts. The multi-tenancy piece is clever too: you can batch requests from different custom models that all use the same base. Makes sense why CoreWeave went after OpenPipe and W&B.
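(A rough toy sketch of that multi-tenancy point, not the actual serving stack discussed in the post: requests targeting different per-customer adapters are grouped by their shared base model, so one batched forward pass over the base serves every tenant, with each request applying only its own low-rank delta. All names here, like `Request` and `group_by_base`, are hypothetical.)

```python
# Hypothetical illustration: batch requests from different fine-tuned
# adapters that share the same base model, so one base forward pass
# can serve many tenants at once.
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Request:
    prompt: str
    base_model: str   # e.g. a shared open-weights base
    adapter_id: str   # per-customer LoRA adapter (hypothetical ID)


def group_by_base(requests: list[Request]) -> dict[str, list[Request]]:
    """Group requests by their shared base model; each group is one batch."""
    batches: dict[str, list[Request]] = defaultdict(list)
    for r in requests:
        batches[r.base_model].append(r)
    return dict(batches)


def serve_batch(base_model: str, batch: list[Request]) -> list[str]:
    """Stand-in for a single batched forward pass over the shared base,
    where each row applies its own adapter on top (stubbed out here)."""
    return [f"[{base_model}+{r.adapter_id}] completion for: {r.prompt}" for r in batch]


if __name__ == "__main__":
    incoming = [
        Request("Classify this ticket", "shared-base-8b", "customer-a-lora"),
        Request("Summarize this call", "shared-base-8b", "customer-b-lora"),
        Request("Extract the invoice total", "shared-base-8b", "customer-c-lora"),
    ]
    for base, batch in group_by_base(incoming).items():
        for out in serve_batch(base, batch):
            print(out)
```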
