Meta has launched Segment Anything Model 2 (SAM 2), an upgraded AI model built to handle intricate computer vision tasks and advanced video editing.
"SAM 2 can segment any object and consistently follow it across all frames of a video in real-time, unlocking new possibilities for video editing and new experiences in mixed reality," Meta said in a blog post.
The new model builds on a transformer architecture and adds a streaming memory component, letting it process video frames one at a time while retaining information about the objects it is tracking.
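To illustrate the general idea (this is a toy sketch of streaming memory, not Meta's implementation; the class and parameter names here are hypothetical), a segmenter of this kind encodes each incoming frame once, conditions the mask prediction on a small bank of features from earlier frames, and keeps that bank at a fixed size so memory use stays bounded no matter how long the video runs:

```python
from collections import deque

# Toy illustration of a streaming-memory segmenter (hypothetical names,
# not Meta's code): each frame attends to a bounded bank of past features.
class StreamingMemorySegmenter:
    def __init__(self, encoder, mask_decoder, memory_size=8):
        self.encoder = encoder            # per-frame image encoder (e.g. a ViT)
        self.mask_decoder = mask_decoder  # decoder that cross-attends to memory
        self.memory = deque(maxlen=memory_size)  # fixed-size memory bank

    def segment_frame(self, frame, prompt=None):
        features = self.encoder(frame)
        # Condition the prediction on what the object looked like earlier.
        mask = self.mask_decoder(features, list(self.memory), prompt)
        # Remember this frame; the oldest entry is evicted automatically.
        self.memory.append((features, mask))
        return mask
```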
SAM 2 is trained on SA-V, Meta's extensive video segmentation dataset, which underpins its strong performance across a wide range of applications.
The company said that the initial version of SAM found diverse applications, such as helping marine scientists segment sonar images to study coral reefs, supporting disaster relief through satellite imagery analysis, and assisting medical work by segmenting cellular images for skin cancer detection.
It said that SAM 2 can track a selected object across all video frames, even if the object temporarily disappears from view, because the model retains context about the object from previous frames.
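In practice, this means a user can select an object with a single click on one frame and let the model carry it through the rest of the video. The sketch below follows the usage example published in Meta's segment-anything-2 repository; the config and checkpoint names, the assumed GPU, and the exact call signatures may vary between releases:

```python
import numpy as np
import torch
from sam2.build_sam import build_sam2_video_predictor

# Config and checkpoint names are taken from the initial release and may change.
predictor = build_sam2_video_predictor("sam2_hiera_l.yaml", "sam2_hiera_large.pt")

with torch.inference_mode():
    # init_state pre-loads the video (here, a directory of JPEG frames).
    state = predictor.init_state(video_path="video_frames/")

    # One positive click on the object in the first frame selects it.
    predictor.add_new_points(
        state, frame_idx=0, obj_id=1,
        points=np.array([[420, 260]], dtype=np.float32),  # example coordinates
        labels=np.array([1], dtype=np.int32),             # 1 = positive click
    )

    # Propagation carries the model's memory of the object forward, so the
    # mask reappears even after the object was briefly occluded.
    for frame_idx, obj_ids, mask_logits in predictor.propagate_in_video(state):
        pass  # mask_logits[i] holds the mask for obj_ids[i] on this frame
```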
SAM 2 is an open-source model: its code and weights are available on Meta's GitHub page under the permissive Apache 2.0 license, which allows both research and commercial use.
Meta said it is releasing the research publicly so that "others can explore new capabilities and use cases."
In an open letter last week, Mark Zuckerberg said that open-source AI "has more potential than any other modern technology to increase human productivity, creativity, and quality of life, all while accelerating economic growth and advancing groundbreaking medical and scientific research."