Jin Daily AI Trivia: iPhone 17 Pro (A19 Pro) GPU with Neural Accelerator can run LLMs with ease
Apple's new A19 Pro GPU comes with a Neural Accelerator (Apple's take on Tensor Cores), and it seriously boosts AI inference speed.
First tests are already out: running Llama.cpp on GPU MLX with Qwen3-8B hits 14–15 tok/s. That scales to about 50–60 tok/s for a 2B model, which is totally usable for on-device VLMs or text generation. Plus, every A19 Pro now comes with 12 GB of RAM by default, with desktop-class memory bandwidth (80+ GB/s).
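For anyone who wants to sanity-check the scaling, here is a rough back-of-envelope sketch. It assumes token generation is memory-bandwidth-bound, so tok/s scales roughly inversely with parameter count at a fixed quantization. The function names and the 4-bit weight assumption are mine for illustration; the post does not say which quantization was tested.

```python
# Back-of-envelope decode-speed estimates.
# Assumption (not from the tests above): generation is memory-bandwidth-bound,
# so tok/s is roughly proportional to 1 / model size at a fixed quantization.


def scale_tok_per_s(measured_tok_s: float,
                    measured_params_b: float,
                    target_params_b: float) -> float:
    """Estimate tok/s for a target model from a measurement on another model."""
    return measured_tok_s * measured_params_b / target_params_b


def bandwidth_ceiling_tok_s(bandwidth_gb_s: float,
                            params_b: float,
                            bits_per_weight: float) -> float:
    """Upper bound on tok/s if every weight is read from memory once per token."""
    # billions of parameters * bytes per parameter = model size in GB
    model_gb = params_b * bits_per_weight / 8
    return bandwidth_gb_s / model_gb


if __name__ == "__main__":
    # Measured range on Qwen3-8B, projected onto a 2B model.
    low = scale_tok_per_s(14, 8, 2)
    high = scale_tok_per_s(15, 8, 2)
    print(f"2B estimate: {low:.0f}-{high:.0f} tok/s")

    # Hypothetical 4-bit weights with 80 GB/s of memory bandwidth.
    ceiling = bandwidth_ceiling_tok_s(80, 8, 4)
    print(f"8B bandwidth ceiling: {ceiling:.0f} tok/s")
```

Pure inverse scaling gives 56 to 60 tok/s for a 2B model, in line with the 50–60 tok/s figure once you allow for fixed per-token overhead. Under the 4-bit assumption, 80 GB/s caps an 8B model at about 20 tok/s, so the measured 14–15 tok/s sits plausibly below that ceiling.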
Benchmark (Qwen3-8B model)
M1: 11.4 tok/s
M4 Max: 79 tok/s
RTX 4090: 135 tok/s
🔥 This means we're literally holding M1-class AI performance in the palm of our hands. How crazy is that!!!
