Pixels, Patterns, but No Poetry: To See the World like Humans
ICML 2026 Position Paper Track, accepted, 2026
A position paper proposing the Turing Eye Test for evaluating whether multimodal AI systems see the world like humans.
arXiv preprint, 2026
SpatialWorld benchmarks interactive spatial understanding of multimodal agents in complex real-world tasks.
Technical report, 2026
SimpleTES is a framework for scaling evaluation-driven discovery loops across scientific problems.
Technical report, 2025
A technical report on Klear-AgentForge, a guided perturbation learning framework for data-centric AI agents.
TACL submission, minor revision, 2025
HAVEN is a benchmark and analysis suite for hallucination in large multimodal models on video understanding tasks.