Abstract: Programming based approaches to reasoning tasks have substantially expanded the types of questions models can answer about visual scenes. Yet on benchmark visual reasoning data, when models ...
I tried four vibe-coding tools, including Cursor and Replit, with no coding background. Here's what worked (and what didn't).
Some results have been hidden because they may be inaccessible to you
Show inaccessible results