I’ve seen the math for this but it still blows my mind. It actually doesn’t even matter how good the small model is; the sampling still reproduces the large model’s distribution. The better the small model is, the further out you can go before resampling. (correct me if wrong)
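For anyone who wants to check that claim on paper, here is a minimal sketch of a single draft-then-verify step using the standard speculative sampling accept/reject rule: accept a drafted token with probability min(1, p/q), otherwise resample from the leftover max(0, p - q). The function name and the toy three-token distributions are made up for illustration and are not from the video. It computes the exact output distribution rather than sampling, so the result does not depend on a random seed.

```python
def speculative_step_distribution(p, q):
    """Exact output distribution of one draft-then-verify step.

    p: target (large model) probabilities over the vocabulary
    q: draft (small model) probabilities over the same vocabulary
    Returns (output_distribution, acceptance_rate).
    """
    # Probability that token x is drafted AND accepted:
    # q(x) * min(1, p(x) / q(x)) == min(p(x), q(x))
    accept_mass = [min(pi, qi) for pi, qi in zip(p, q)]
    accept_rate = sum(accept_mass)

    # On rejection, resample from the normalized leftover max(0, p - q).
    residual = [max(0.0, pi - qi) for pi, qi in zip(p, q)]
    residual_total = sum(residual)
    if residual_total == 0.0:
        # Draft equals target: nothing is ever rejected.
        return accept_mass, accept_rate

    reject_rate = 1.0 - accept_rate
    out = [a + reject_rate * r / residual_total
           for a, r in zip(accept_mass, residual)]
    return out, accept_rate


# Hypothetical toy distributions over a 3-token vocabulary.
target = [0.6, 0.3, 0.1]
good_draft = [0.5, 0.35, 0.15]
bad_draft = [0.05, 0.05, 0.9]

for name, draft in [("good", good_draft), ("bad", bad_draft)]:
    out, rate = speculative_step_distribution(target, draft)
    print(name, [round(x, 6) for x in out], "acceptance:", round(rate, 3))
```

In this toy setup both drafts give back the target distribution; only the acceptance rate changes (about 0.9 for the good draft, about 0.2 for the bad one), which is why a better small model lets you go further before resampling.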
2026-02-15 09:56:04
1
reckless_dane :
Swapping costly output tokens for cheaper input tokens, nice. Unless your large model rejects every output of your smaller model 😂 but I’m just negatively speculating
2026-02-05 13:01:24
5
Girth Brooks :
This may help me out quite a bit. I’m building a network of mathematical kernels, each controlled by a master kernel, which in turn is controlled by a Qwen model, so that the local AI can handle as many tasks as possible without needing Anthropic or OpenAI as much. At the same time, the system could use this technique, although it sort of already does.
2026-03-13 01:58:53
0
Jeremy Fisher :
I wish you had explained what everything meant better because idk what’s going on
2026-04-30 12:51:01
0
Silly :
That’s pretty cool dann
2026-04-05 23:37:36
1
Peter :
Spec decoding is goated and people should use it more often
2026-02-17 23:49:52
0
AV :
link please
2026-02-05 07:58:02
0
Mxrk :
awesome!
2026-02-07 17:22:07
0
SonOfMan :
2026-02-25 23:45:21
0
Wes :
I hope when we get more concrete agents, one of them communicates just like you. 🥰
Besides your amazing research, your style of communication is exactly the same as my inner monolog if I could bring it to my speech
2026-02-05 02:36:52
4
Farhan Beats :
🤩👏👏
2026-02-05 22:18:56
0
Avg redditor :
This made me feel stupid
2026-02-05 05:51:48
2