you are right
NJX-njx PRO
Inquiry about dataset issues
This article is very inspiring to me.
- Since skills have become a major boost to model capabilities, could we try distilling skills, just as we distilled models before? I think this could be achieved over multiple iterations.
- The current functionality of upskill is already quite complete, but I wonder whether it could generate a compatibility matrix across multiple skills, so that the combined effect is greater than the sum of the parts. In addition, Model A could generate skills while Model B looks for counterexamples, so that the two evolve together.
We Got Claude to Build CUDA Kernels and teach open models!
Actually, I think a very important point is that most independent developers do not have enough case studies to support their work, and at the same time the cost of online deployment is rather high.
To be honest, although it still looks like AI at first glance in terms of design and other aspects, the quality of the model's website has indeed improved a lot compared to previous models.
So you want to create a dataset of papers that includes various AI-related papers, is that right?
I'm a bit confused. What can such synthetic census data be used for?
Yep! You've got that exactly right. I've finetuned the model so that when it responds, it sounds like a Type 1 personality from the Enneagram.
great
Unlocking Agentic RL Training for GPT-OSS: A Practical Retrospective
This sentence is very insightful: In today's world where models are becoming increasingly homogeneous, these decisions mark a significant shift in the competitive landscape from model performance to system design. What China has made open-source has never been a single model weight, but the entire AI ecosystem.
I would love to hear your views on China's system design.
Architectural Choices in China's Open-Source AI Ecosystem: Building Beyond DeepSeek
As a Chinese AI researcher who has fully experienced the wave of AI technologies and products in 2025, looking back on the series of changes brought about starting from DeepSeek, I have mixed feelings.
The author's analysis is excellent. The open-sourcing of the R1 model weights and related technologies gave both startups and large companies access to the most usable technology available at that time. Such an "unconventional" move pushed China's AI field, in both research and product development, into a stage of rapid growth.
Thanks to open source. May AI technology benefit all of humanity.
One Year Since the “DeepSeek Moment”
New paradigms for scientific research paper work
Tricks from OpenAI gpt-oss that YOU 🫵 can also use in transformers
Strengthen your OCR workflow with open-source models
Codex is driving the open-sourcing and training workflows of AI models
Some time ago, I came across a research analysis from two investors at a16z. During 2025, ChatGPT tried to promote new AI features in fields such as shopping, but the results were poor.
I think the fundamental reason lies in the user's mindset, or rather, the user's interaction logic in vertical fields. The most prominent and distinctive feature of ChatGPT is its all-encompassing dialogue box, which is also a common problem with many homogeneous AI products nowadays (it seems that without a dialogue box, the AI's capabilities are sealed off). Although a dialogue box can be adapted to many scenarios, it feels very dull in more vertical ones.
Ask yourself, would you prefer the image-text waterfall flow interaction in shopping scenarios like Xiaohongshu, or the monotonous search box of ChatGPT? The answer is actually obvious from the start.
For all vertical scenarios, the interaction logic was already very well-developed before the emergence of AI. The user experience brought by such interaction logic is definitely not something that a single dialogue box can replace.
And if we want to create a good AI product in a vertical field, we should think more about how to silently embed the powerful capabilities of AI into the original interaction, and continuously iterate to provide users with a better experience. @lilianweng @clem @AdinaY