Dev.to3d ago1 min read

A Quick Note on Gemma 4 Image Settings in...

In my last post, I mentioned using --image-min-tokens to increase the quality of image responses from Qwen3.5. I went to load Gemma 4 the same way, and hit an error: [58175] srv process_chun: processing image... [58175] encoding image slice... [58175] image slice encoded in 7490 ms [58175] decoding image batch 1/2, n_tokens_batch = 2048 [58175] /Users/socg/llama.cpp-b8639/src/llama-context.cpp:1597: GGML_ASSERT((cparams.causal_attn || cparams.n_ubatch >= n_tokens_all) && "non-causal attention re

Read original on dev.to