Alex Power's blog

AI and open source LLM

2026-07-02 07:22:49

https://news.ycombinator.com/item?id=48743472

I will need to work on my manifesto when I get home. But:

llms make open source more economically viable
for security, it should only be llm commits
you still need a human to know what change to make
between token cost and tool expertise, we won’t have everyone write their own tool

deliberate pigheadedness is not intelligence LLM

2026-06-09 22:22:18

three hypothetical AI queries:

how often do "disaster tribes" produce the winner on Survivor
how often do people born in February get voted out before the merge on Survivor
how often do people who use an idol win Survivor

questions only of interest to a certain type of compulsive fan ... or a person who noticed something and is curious

how do we solve this?

We do NOT just train an LLM on all the survivor results and hope it can think it out. No. We use a database. The LLM can build the database. The LLM can make the database query. The LLM can design the system to do steps 1 and 2. What it can't do is give the result fully-formed, like Athena jumping out of the head of Zeus.
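A minimal sketch of the "use a database" approach, using Python's built-in sqlite3. The `castaways` table, its columns, and the sample rows are all hypothetical, invented for illustration; they are not real Survivor results. In the flow above, the LLM would write the schema, load the data, and write the queries; the database produces the answer.

```python
import sqlite3

def build_db(rows):
    """Create an in-memory database with one (hypothetical) table of castaways."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE castaways ("
        " season INTEGER, name TEXT, birth_month INTEGER,"
        " made_merge INTEGER, used_idol INTEGER, won INTEGER)"
    )
    conn.executemany("INSERT INTO castaways VALUES (?, ?, ?, ?, ?, ?)", rows)
    return conn

def idol_users_who_won(conn):
    """Return (winners, total) among castaways who used an idol."""
    total, winners = conn.execute(
        "SELECT COUNT(*), COALESCE(SUM(won), 0) "
        "FROM castaways WHERE used_idol = 1"
    ).fetchone()
    return winners, total

def february_pre_merge_boots(conn):
    """Return (pre-merge boots, total) among castaways born in February."""
    total, boots = conn.execute(
        "SELECT COUNT(*), COALESCE(SUM(made_merge = 0), 0) "
        "FROM castaways WHERE birth_month = 2"
    ).fetchone()
    return boots, total

# Placeholder rows, invented for illustration; NOT real Survivor results.
SAMPLE = [
    (1, "player_a", 2, 0, 0, 0),
    (1, "player_b", 7, 1, 1, 1),
    (1, "player_c", 2, 1, 1, 0),
    (2, "player_d", 11, 1, 0, 1),
]

conn = build_db(SAMPLE)
print(idol_users_who_won(conn))
print(february_pre_merge_boots(conn))
```

The "disaster tribes" question would need one more column (or table) that defines what counts as a disaster tribe. That definition is a human decision, not something the query can guess.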

going forward Atacama

2026-06-09 19:19:41

where is Atacama going?

the various ideas:

better AI tools. (summary; synthesis; annotation)
better compilation tools. (what is the quantum of a post? how do we take those to form useful output?)
better generation tools. (when do you want to post? what about? 🔥 what is the value of more knowledge?)
passive data inputs. (calendar input, geolocation input, etc.)
privacy-managed inputs. (read 100 text messages; 3 are relevant/authorized for public sharing)
new output visualizations/schemas. (email digests; river-of-news feed; etc.)

🔥 unfortunately none of that seems project-worthy, right now ...

brillat-savarin Atacama

2026-06-09 14:34:13

💬 tell me what you eat, and i tell you what you are.

🔥 the quote was supposed to be "perfection is achieved when there is nothing left to take away".

💡 for example: the "subject" field. it is, largely, an exercise in digressions. also: the built-in section breaks. just have multiple messages. in a channel.

the nested footnotes ⚙️ colortext will have to go. it is nice-to-have for the writer, but too confusing for the reader (man or AI).

what is left?

a thought. some words. and, commentary.

some of the commentary is load-bearing. other parts are for reference for the less-informed reader.

🔥 the commentary that makes it clear left means "remaining" and not "the opposite of right" ... is not load-bearing, normally.

A new app Atacama

2026-06-08 20:48:58

We have the new Atacama app. It seems OK. Not great, but OK.

The footnotes don't quite work right, the speech-to-text is mediocre, and the category selection is bare-bones, but it is better than nothing. ⚔️ It might not be better than what we had.

Sentence Decomposition (part 2) Barsukas

2026-05-18 19:45:32

These tasks are one end of the spectrum of "language model" tasks.

A "language model" takes language and does operations on it. Not thought. The cross-language translation and "sentence decomposition" rely on a simpler understanding of knowledge. ✨ it is much easier to say "I can think of a castle" than it is to build one

We do need some element of the second type of model.

Which is going to be an LLM/software centaur version.

the LLM here does need to be able to do thought tasks. not just language-model.

🔥 there is a reason we don't put a dictionary in charge of the country.

Sentence Decomposition Barsukas

2026-05-18 19:05:28

The sentence decomposition logic is almost done.

The design:

A sentence is entered into the database, often (but not always) in English.
It is translated by LLM operation 1 into other languages: English, Chinese, French, Spanish, Lithuanian, Ukrainian, Kannada, Bengali. ⚙️ all Indo-European, except Chinese and Kannada (which is Dravidian). This is unintentional but welcome. It mostly remains within the areas of linguistics where I am comfortable.
We check the database for word matches. This is used to create a list of candidate lemmas. We avoid the need to use LLMs to do the word-meaning resolution problem this way. The "correct" meaning will be present in most/all languages. 💡 not all sentence translations will have the same lemmas. "I like bread" and "Man patinka duona" ⚙️ Lithuanian - literally translated as "Bread is pleasing to me" are the same sentence without a 1-1 lemma match.
Once we get the list of candidate lemmas, we ask the LLM to do the sentence decomposition.
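The candidate-lemma step above could be sketched roughly as follows. This is an assumption-heavy illustration, not the actual Barsukas code: the `WORD_INDEX` dict stands in for the database, the lemma ids are made up, and the whitespace tokenizer would not work for Chinese. The idea it shows is that a lemma matched in more languages is a stronger candidate.

```python
from collections import Counter
import string

def tokenize(sentence):
    """Naive whitespace tokenizer; real segmentation (e.g. Chinese) is out of scope."""
    words = (w.strip(string.punctuation).lower() for w in sentence.split())
    return [w for w in words if w]

def candidate_lemmas(translations, word_index):
    """Count, per lemma id, how many languages contain a word matching it."""
    hits = Counter()
    for lang, sentence in translations.items():
        seen = set()  # count each lemma at most once per language
        for word in tokenize(sentence):
            seen.update(word_index.get((lang, word), []))
        hits.update(seen)
    return hits.most_common()

# Hypothetical index standing in for the database:
# (language, surface form) -> lemma ids.
WORD_INDEX = {
    ("en", "like"): ["like.verb", "like.prep"],
    ("en", "bread"): ["bread.noun"],
    ("lt", "patinka"): ["like.verb"],
    ("lt", "duona"): ["bread.noun"],
}

translations = {"en": "I like bread.", "lt": "Man patinka duona."}
ranked = candidate_lemmas(translations, WORD_INDEX)
print(ranked)
```

Here the verb sense of "like" is matched in both languages and the preposition sense in only one, so the wrong sense drops to the bottom of the list before the LLM is ever asked.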

For a sentence, the decomposition identifies:

What specific lemma / sense-of-meaning of a word is in use. Is it to lose a race or to lose one's keys? A river-bank or a financial bank?
What tense is the word in? Present/past? Is it a conjugated form? This should align with the lemma's information for the grammatical form, but there is no requirement for it to do so.
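One possible shape for the per-word output, as a sketch. The `WordAnalysis` fields, the lemma ids, and the `known_forms` lookup are all hypothetical names chosen for illustration. The check only flags disagreement with the lemma's stored forms; it does not reject anything, since alignment is expected but not required.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class WordAnalysis:
    surface: str           # the word as it appears in the sentence
    lemma_id: str          # which lemma / sense-of-meaning is in use
    tense: Optional[str]   # e.g. "present", "past"; None if not applicable
    is_conjugated: bool

def form_mismatches(words, known_forms):
    """Return words whose surface form differs from the form stored for the lemma.

    A mismatch is only a warning; nothing here rejects the decomposition.
    """
    flagged = []
    for w in words:
        expected = known_forms.get((w.lemma_id, w.tense))
        if expected is not None and expected != w.surface.lower():
            flagged.append(w)
    return flagged

# Hypothetical stored forms: (lemma id, tense) -> expected surface form.
KNOWN_FORMS = {
    ("lose.misplace", "past"): "lost",
    ("lose.be_defeated", "past"): "lost",
}

decomposition = [
    WordAnalysis("lost", "lose.misplace", "past", True),
    WordAnalysis("losed", "lose.be_defeated", "past", True),
]
flagged = form_mismatches(decomposition, KNOWN_FORMS)
print([w.surface for w in flagged])
```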

We do not yet have a full "sentence diagram" out of the words. The traditional Reed-Kellogg sentence diagram is uninteresting -- it only applies to English, is outdated, and remains confusing. However, the data missing to be able to generate this will be relevant later on.

dead-weight cost LLM

2026-05-18 16:12:57

I have been trying to run some very simple API calls using Claude to operate the new Barsukas HTTP API.

The flow is a 20-word prompt, reading 3-4 files from disk, and making a 4-line Python script to call the API.

This costs 22 cents. ⚙️ which feels like too much

Why so much? There is something like 8 KB of system prompt and 14 KB of system tools. All of which should be cached/built-in to the model, but is instead just charged to the user. And several rounds of "cached reads" as the tool pauses for inputs.

Fortunately, the $20/month subscription gives something like $300/month of credits. When the cost is inflated by a factor of 5, it helps that the currency is also inflated.

On the other hand, the sheer volume of struggles Claude Opus is having with this task (simple directions like "only read the api/ dir" get ignored about 1/4 of the time) does make me want to look for a different harness, which can pivot between model providers. ⚙️ it's not as bad as it was six weeks ago, but it is still very disappointing for the "recommended/expensive model" that it can't do this well.

solving a prompting bug LLM

2026-05-18 15:49:40

a recent bug in Greenland: I was asking ChatGPT 5.4 mini to do a per-word breakdown of a sentence. But, sometimes it left words out. Other times, it would include the trailing period as a word, despite being instructed not to.

The cause of the bug: the prompt also asked ChatGPT to get the "word count" of the sentence. It is bad at this.

So, if it said "The dog is on the log" has 4 words, it would stop after 4 words and ignore "the log" when processing the sentence.

Of course, Python can easily do this. 💡 and, in the future, AI systems will know to just use Python to get this information rather than trying to guess it themselves. But here, it wasn't even used; I don't remember/know why a "word count" got added to the response.
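For reference, a minimal sketch of doing the count in Python instead. It assumes words are whitespace-separated and that punctuation attached to a word (like the trailing period) should not count as a word or as part of one; `split_words` is a name made up for this example.

```python
import string

def split_words(sentence):
    """Split on whitespace and strip surrounding punctuation from each token."""
    words = (tok.strip(string.punctuation) for tok in sentence.split())
    return [w for w in words if w]  # drop tokens that were pure punctuation

words = split_words("The dog is on the log.")
print(len(words), words)
```

With the count computed outside the model, the prompt only needs the per-word breakdown, and the response can be checked against `len(words)` afterwards.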

some missing Trakaido words Trakaido

2026-04-22 17:11:25

I compared the Cambridge YLE list to the Trakaido wordlists. What I found was that most of the missing words were either deliberate ⚙️ modal verbs like "can", as well as pronouns, are handled outside the wordlists, correct 💡 ice skating, jellyfish, or toothache do not need to be on the list, or irrelevant 💡 having "grandfather" instead of "grandpa" is fine ... though it would be good to improve the synonyms.
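The comparison itself is roughly a set difference with a couple of exclusions. A sketch, assuming both lists are available as plain sets of words; the sample sets below are tiny made-up stand-ins, not the real Cambridge YLE or Trakaido lists, and `missing_words` is a hypothetical helper.

```python
def missing_words(reference, wordlist, handled_elsewhere=(), synonyms=None):
    """Words in `reference` that are not covered by `wordlist`.

    handled_elsewhere: words deliberately kept out of the wordlists.
    synonyms: maps a reference word to an equivalent that may be in `wordlist`.
    """
    synonyms = synonyms or {}
    missing = []
    for word in sorted(reference):
        if word in wordlist or word in handled_elsewhere:
            continue
        if synonyms.get(word) in wordlist:
            continue  # covered by a synonym, e.g. grandpa -> grandfather
        missing.append(word)
    return missing

# Tiny illustrative samples only.
reference = {"can", "grandpa", "jellyfish", "bread"}
wordlist = {"grandfather", "bread"}

result = missing_words(
    reference,
    wordlist,
    handled_elsewhere={"can"},
    synonyms={"grandpa": "grandfather"},
)
print(result)
```

What remains after the exclusions is the list that still needs a human decision about whether each word belongs.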
