Cross posted from: Latin@lemm.ee
Latin is the father of half of all languages.
I hope it's okay for me to crosspost here.
I wonder if something like the semantic tokenization method would benefit from using etymological data like this, particularly for a multilingual LLM.
I know that my NN internally uses a semantic tokenization method.
I literally often look for the word roots when talking to somebody. It helps me focus.
Interesting paper, thanks for sharing