Segment Anything, recently released by Facebook Research, does something most people who have dabbled in computer vision have found daunting: reliably working out which pixels in an image belong to an object. Making this easier is the goal of the Segment Anything Model (SAM), which was just released under the Apache 2.0 license.
The results look great, and there is an interactive demo available where you can play with the different ways SAM works. One can pick out objects by pointing and clicking on an image, or images can be segmented automatically. Honestly, it's impressive to see SAM make masking the various objects in an image seem so effortless. What makes this possible is machine learning, and part of that is the fact that the model behind the system has been trained on a huge dataset of high-quality images and masks, which makes it very effective at what it does.
Once an image is segmented, those masks can be used to interface with other systems such as object detection (which identifies and labels what an object is) and other computer vision applications. Such systems work more robustly if they already know where to look, after all. This blog post from Meta AI goes into some additional detail about what's possible with SAM, and fuller details are in the research paper.
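To make the idea of handing a mask to a downstream system a little more concrete, here is a minimal sketch in plain Python. It is not SAM's API: the mask format (rows of booleans) and the helper names `mask_to_bbox` and `crop_to_mask` are assumptions made up for illustration. It simply shows how a per-pixel mask could be reduced to a bounding box, so that a later stage such as a classifier only has to look at the relevant region.

```python
from typing import List, Optional, Sequence, Tuple

# Hypothetical mask format for this sketch: a list of rows,
# where True marks a pixel that belongs to the object.
Mask = List[List[bool]]
BBox = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), inclusive


def mask_to_bbox(mask: Mask) -> Optional[BBox]:
    """Return the tightest box around the True pixels, or None for an empty mask."""
    xs = [x for row in mask for x, on in enumerate(row) if on]
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not xs:
        return None
    return min(xs), min(ys), max(xs), max(ys)


def crop_to_mask(image: Sequence[Sequence], mask: Mask) -> List[list]:
    """Crop an image (rows of pixel values) to the mask's bounding box."""
    bbox = mask_to_bbox(mask)
    if bbox is None:
        return []
    x0, y0, x1, y1 = bbox
    return [list(row[x0:x1 + 1]) for row in image[y0:y1 + 1]]


# Tiny example: a 4x5 "image" of pixel labels and a mask covering a small blob.
image = [
    [0, 1, 2, 3, 4],
    [10, 11, 12, 13, 14],
    [20, 21, 22, 23, 24],
    [30, 31, 32, 33, 34],
]
mask = [
    [False, False, False, False, False],
    [False, True, True, False, False],
    [False, False, True, True, False],
    [False, False, False, False, False],
]

bbox = mask_to_bbox(mask)
crop = crop_to_mask(image, mask)
```

In a real pipeline the mask would more likely be a NumPy array and the crop would be passed to whatever detector or classifier comes next; the list-based version above just keeps the sketch dependency-free.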
Systems like this rely on high-quality datasets. Of course, nothing beats a large collection of real-world data, but we've also seen that it's possible to machine-generate data that never actually existed and still get useful results.