{"channel":"llm","content":"Initial indications are that phi4:14b is *substantially* slower than gemma2:9b on a 24GB MacBook Pro, with no noticeable difference in output quality. (<red> part of the slowness is related to the \"two phase response\" code I added to work around Ollama's inability to do structured JSON responses correctly.  Even adjusting for that, it is twice as slow as other models)\r\n\r\nIt also still has some of the \"excessively long responses\" problems that made phi3.5 nearly unusable on benchmark tasks, and completely unreliable for production tasks.  At least these seem to be sensible responses rather than << end-of-message token >> bugs.\r\n\r\nSome of the issues may be related to memory pressure, but without Safari running there should be plenty of RAM.\r\n\r\nI see no reason to use this model over << gemma2:9b >> or << qwen2.5:7b >> locally.","created_at":"2025-01-14T22:59:32.659338","id":103,"llm_annotations":{},"parent_id":null,"processed_content":"<p>Initial indications are that phi4:14b is <em>substantially</em> slower than gemma2:9b on a 24GB MacBook Pro, with no noticeable difference in output quality. <span class=\"colorblock color-red\"><span class=\"sigil\">\ud83d\udca1</span><span class=\"colortext-content\">( part of the slowness is related to the \"two phase response\" code I added to work around Ollama's inability to do structured JSON responses correctly.  Even adjusting for that, it is twice as slow as other models)</span></span>\r</p>\n<p>It also still has some of the \"excessively long responses\" problems that made phi3.5 nearly unusable on benchmark tasks, and completely unreliable for production tasks.  At least these seem to be sensible responses rather than <span class=\"literal-text\">end-of-message token</span> bugs.\r</p>\n<p>Some of the issues may be related to memory pressure, but without Safari running there should be plenty of RAM.\r</p>\n<p>I see no reason to use this model over <span class=\"literal-text\">gemma2:9b</span> or <span class=\"literal-text\">qwen2.5:7b</span> locally.</p>","quotes":[],"subject":"phi4 preliminary thoughts"}
