{"channel":"llm","content":"Somehow Barsukas has become the other server as part of the Trakaido project. It contains translation data, sentence data, etc. \r\n\r\nIt is, in effect, a multilingual dictionary. \r\n\r\nInstead of sorting alphabetically, we have to categorize words and then sort them roughly by linguistic complexity. This is a blurry term, but:\r\n* cow is less complex than marmoset\r\n* red is less complex than crimson\r\n* table is less complex than credenza\r\n\r\nThe categories themselves have evolved. There is a super category that the LLM generated in the UI that I am not concerned with, but we have:\r\n* around 40 categories of nouns\r\n* 15 categories of verbs\r\n* 10 for adjectives and adverbs\r\n* a separate numeral category for number words\r\n\r\n----\r\n\r\nWe have an architecture designed around a development pattern where people make API calls that include the OpenAI key to add words. As a single-user project, this is a way to avoid putting the LLM key on the server. For a multi-user project, there are obviously security risks associated with this design. We are hoping to solve it better later. \r\n\r\n----\r\n\r\nSome of the agents have turned out to be more useful than others. This is fine.","created_at":"2026-01-28T18:53:57.865143","id":739,"llm_annotations":{},"parent_id":null,"processed_content":"<p>Somehow Barsukas has become the other server as part of the Trakaido project. It contains translation data, sentence data, etc.</p>\n<p>It is, in effect, a multilingual dictionary.</p>\n<p>Instead of sorting alphabetically, we have to categorize words and then sort them roughly by linguistic complexity. This is a blurry term, but:</p>\n<ul>\n<li class=\"bullet-list\"> cow is less complex than marmoset</li>\n<li class=\"bullet-list\"> red is less complex than crimson</li>\n<li class=\"bullet-list\"> table is less complex than credenza</li>\n</ul>\n<p>The categories themselves have evolved. 
There is a super category that the LLM generated in the UI that I am not concerned with, but we have:</p>\n<ul>\n<li class=\"bullet-list\"> around 40 categories of nouns</li>\n<li class=\"bullet-list\"> 15 categories of verbs</li>\n<li class=\"bullet-list\"> 10 for adjectives and adverbs</li>\n<li class=\"bullet-list\"> a separate numeral category for number words</li>\n</ul>\n<hr class=\"section-break\" />\n<p>We have an architecture designed around a development pattern where people make API calls that include the OpenAI key to add words. As a single-user project, this is a way to avoid putting the LLM key on the server. For a multi-user project, there are obviously security risks associated with this design. We are hoping to solve it better later.</p>\n<hr class=\"section-break\" />\n<p>Some of the agents have turned out to be more useful than others. This is fine.</p>","quotes":[],"subject":"on Barsukas"}
