NewtPhys: A New Benchmark for Newtonian Physics Understanding in Foundation Models

Researchers have introduced NewtPhys, a 4D physically annotated dataset for evaluating how well foundation models understand low-level Newtonian physics. The dataset is built from multiview images of real-world scenes combined with physics-grounded simulations, and it provides detailed annotations including 3D forces and per-pixel quantities. An initial evaluation of 56 Vision-Language Models (VLMs) and 10 Vision Foundation Models (VFMs) on NewtPhys revealed limitations in their physics reasoning capabilities.


Why it matters

Existing benchmarks for physics reasoning in foundation models often rely on synthetic scenes and high-level event analysis, which may not accurately assess true low-level Newtonian understanding. NewtPhys addresses this gap by offering a dataset with high visual fidelity and fine-grained physical annotations, enabling more rigorous evaluation and fostering the development of physics-aware AI models. This could lead to more robust and reliable AI systems in applications requiring a deep understanding of physical interactions.

A new research paper introduces NewtPhys, a 4D physically annotated dataset for assessing how well foundation models comprehend Newtonian physics. Unlike previous benchmarks, which often use synthetic or semi-synthetic scenes and focus on high-level events, NewtPhys is constructed from multiview images of real-world scenes, augmented with physics-grounded simulations. This approach provides dense, fine-grained annotations across timesteps, covering 3D forces and amodal per-pixel quantities related to physics, tracking, semantics, and geometry.
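To make the annotation types described above more concrete, here is a minimal Python sketch of what one annotated frame could look like. It is not the dataset's actual schema: every class, field, and function name below (PixelAnnotation, FrameAnnotation, is_dense, occluded_pixel_count) is a hypothetical chosen for illustration, and the real file format may differ entirely.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class PixelAnnotation:
    """Hypothetical per-pixel record; field names are illustrative only."""
    track_id: int        # tracking: identity of the point followed over time
    semantic_label: str  # semantics: object class assigned to this pixel
    depth: float         # geometry: distance from the camera
    force: Vec3          # physics: 3D force vector associated with this pixel
    visible: bool        # False for occluded content kept by amodal annotation


@dataclass
class FrameAnnotation:
    """Hypothetical container for one camera view at one timestep."""
    timestep: int
    view_id: int
    height: int
    width: int
    pixels: List[PixelAnnotation] = field(default_factory=list)  # row-major

    def is_dense(self) -> bool:
        # "Dense" here means every pixel of the frame carries an annotation.
        return len(self.pixels) == self.height * self.width


def occluded_pixel_count(frame: FrameAnnotation) -> int:
    """Count pixels that are annotated although they are not visible."""
    return sum(1 for p in frame.pixels if not p.visible)
```

In this sketch, the "4D" aspect would correspond to a sequence of such frames indexed by timestep and view, and the "amodal" aspect to pixels that keep their annotations even when visible is False.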

The creators of NewtPhys used the dataset to systematically evaluate 56 Vision-Language Models (VLMs), comprising 54 open-weight and 2 closed-source models, along with 10 Vision Foundation Models (VFMs). The findings indicate that these models are limited in their low-level physics reasoning. The researchers suggest that NewtPhys can serve as a resource for future research in physics-grounded vision and for developing physics-aware evaluation methods. Code and datasets are publicly available.

Featured on AI Radar (Article ID: cmpxjlf880)