
This is stupid because UTF-8 is already a tokenizer: it covers all of Unicode with a vocab of only 256 (yes, without a K). It is the only way of scaling the bitter lesson with tokenizers. Also, with architectures that span 1M+ token context windows, the reduced effective context window is no longer an argument/issue.
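
To make the 256-entry vocab concrete, here is a minimal sketch of treating raw UTF-8 bytes as token IDs. The names `encode` and `decode` are just illustrative helpers, not anyone's actual tokenizer API, and the sample string is made up.

```python
def encode(text: str) -> list[int]:
    # Each UTF-8 byte becomes one token ID in the range 0..255.
    return list(text.encode("utf-8"))


def decode(ids: list[int]) -> str:
    # Reassemble the bytes and decode them back to text.
    return bytes(ids).decode("utf-8")


sample = "héllo 🌍"  # hypothetical example string
ids = encode(sample)

# Every ID fits in a vocab of 256, and the round trip is lossless.
assert all(0 <= i < 256 for i in ids)
assert decode(ids) == sample

# 7 code points become 11 byte tokens ("é" is 2 bytes, the emoji is 4).
print(len(sample), len(ids))
```

The cost shows up in the last line: the sequence gets longer than the character count, which is the reduced-context objection that the 1M+ window point is answering.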

