Jin Daily AI Trivia:

Love Deepseek but wish it could read images too? The Rednote team got you covered! 🙂

The AI Lab behind China’s popular social platform Xiaohongshu (XHS), also known as Rednote, just dropped a bunch of open-source AI models—including LLMs, OCR, and most importantly: dots.vlm1.

They’ve essentially trained a new ViT (Vision Transformer), a 1.2B-parameter NaViT encoder, and paired it with a fine-tuned DeepSeek V3 (671B parameters). This makes dots.vlm1 one of the largest open-source VLMs out there, alongside giants like Qwen2.5VL-72B and InternVL3-78B.

In short: this is a state-of-the-art, high-parameter open-source VLM trained on China’s social media images/videos, and it’s actually really good. Performance-wise, it holds up well against closed-source models like Gemini 2.5 and Seed 1.5-VL (and ya, Seed 1.5-VL isn’t open-source either 😅).

The only catch? Model size. Since it’s built on DeepSeek V3, you’re gonna need serious GPU power to run it. Definitely not for lightweight setups.
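
For a rough feel of what "serious GPU power" means, here's a back-of-the-envelope sketch in Python. It only counts the memory needed to hold the weights (no KV cache, no activations), and it assumes 671B parameters for DeepSeek V3, 1.2B for the vision encoder, and a hypothetical 80 GB GPU. Treat the numbers as a ballpark, not a deployment guide.

```python
import math

LLM_PARAMS_B = 671.0  # DeepSeek V3 total parameters, in billions (assumed public figure)
VIT_PARAMS_B = 1.2    # NaViT vision encoder, in billions (from the post)
GPU_MEMORY_GB = 80    # hypothetical accelerator size, for illustration only


def weight_memory_gb(params_billion: float, bytes_per_param: float) -> float:
    """Weights-only memory in GB (1 GB = 1e9 bytes)."""
    return params_billion * bytes_per_param


def min_gpus(memory_gb: float, gpu_memory_gb: float = GPU_MEMORY_GB) -> int:
    """Lower bound on GPU count just to fit the weights."""
    return math.ceil(memory_gb / gpu_memory_gb)


if __name__ == "__main__":
    total_b = LLM_PARAMS_B + VIT_PARAMS_B
    for name, bytes_per_param in [("FP8", 1), ("BF16", 2)]:
        gb = weight_memory_gb(total_b, bytes_per_param)
        print(f"{name}: ~{gb:.0f} GB of weights, at least {min_gpus(gb)} x {GPU_MEMORY_GB} GB GPUs")
```

Real-world needs will be higher once you add the KV cache and activations, so think of the GPU count as a floor.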

My 2 cents: Compared to other AI teams in the China social media space, who often post suspiciously good benchmark scores but keep everything secret and publish little documentation, Rednote HiLab feels way more transparent and trustworthy. Haha.

Hope you learned something new today—see ya! 👋
