Discussion about this post

Sven Meyer

Wow, I can't remember when I last read such a detailed write-up. Are you doing this for work, or was that your Master's degree paper?

Should I want to build an inference provider business, it would all be in here! It's too competitive and expensive to launch, but this is still great insight.

Still, I got some real value out of it, as I now understand why I won't get far with the 8GB GPU in my laptop: the KV store for a 131,072-token context window would by itself need 40GB (nice diagram!).

Maxx Yung

Hidden gold mine of information!
