
Meta’s New “SAM” AI Is Going To Change The World

Apr 06, 2024
Mark Zuckerberg went all in, rebranding Facebook as Meta and putting all his chips into the metaverse. Then ChatGPT suddenly turned the entire world toward AI, and the CEO of Meta now has to catch up. In this AI Focus episode, we will explore his plans for how Meta adapts to AI. We will also look at LLaMA, which may one day give users the ability to create their own ChatGPT, learn about Meta's text-to-video generator, and, saving the best for last, look at the new Meta SAM model, which provides the framework for how AI will visually view the world. We have a lot to get into, so without further ado, let's get into it.
Zuckerberg told the media this week that Meta sees this moment as an opportunity to bring AI agents to billions of people in ways that will be useful and meaningful. The goal is to explore chat experiences in WhatsApp and Messenger, visual creation tools for posts and ads on Facebook and Instagram, and video and multimodal experiences. Zuckerberg envisions these tools being used by everyday people and businesses alike. He also says he expects a lot of interest in customer service and business messaging once they master how to deliver it, and, of course, in the metaverse, where objects, worlds, and avatars will be much easier to create, keeping that metaverse dream alive. The CEO of Meta also stated that they are no longer dragging their feet in the AI race and that generative AI products will be coming out in the next few months, touching every one of their products; of course, I will report on them once they do. For example, AI could accelerate WhatsApp's customer support business. Meta also plans to eliminate 21,000 jobs as part of a plan to make 2023 the year of efficiency.
The year of efficiency has started well for Meta, with three percent revenue growth in the first quarter, which Zuckerberg attributes to its AI-powered Reels feature. He said time spent on Instagram had increased by 24 percent due to this feature alone, with people sharing Reels 2 billion times a day, and that Meta's investments in AI made this possible. In an earnings call, Zuckerberg said their investment in recommendations and ranking systems has driven many of the results they are seeing today across Reels and the rest of their discovery engine. But what big moves has Meta already made in the AI space?
What if you wanted to build your own ChatGPT? Meta is working on that. The first development of the three that we are going to cover in this video is LLaMA, the company's 65-billion-parameter large language model. LLaMA stands for Large Language Model Meta AI, and there is no single LLaMA; it is a collection of language models ranging in size from 7 billion to 65 billion parameters that were trained on large amounts of public data. As you can see here, LLaMA outperforms GPT-3 on many benchmarks and is quite competitive with Google's PaLM language model, which has about 540 billion parameters, far more than LLaMA's 65 billion. As a result, LLaMA is a much more optimized model that requires much less computational power, which is impressive. But this LLM is not like ChatGPT or Bard in the sense that you can talk to it; it is a research tool where people can use prompt engineering to help advance work in the field of AI. Right now it is strictly for academic research on a case-by-case basis, because there are some issues that must be resolved before widespread public use.
You know, those problems, such as hallucinations and biases, didn't stop OpenAI. LLaMA is used for many purposes, including research into natural language understanding, examining the capabilities and limitations of existing language models, improving language models, and essentially building language models themselves. But what's really interesting is the fact that Meta open sourced its language model to the public at all, even if access is currently restricted. This means that one day everyone could have access to this innovative AI model technology, which will open a path for people to develop their own AIs. That's right, you could make the next generative AI spectacle.
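Because LLaMA is a standard decoder-only language model, working with it (once Meta has granted you the weights) looks much like working with any other LLM. Here is a minimal sketch using the Hugging Face Transformers library; the model path is a placeholder for wherever an approved, locally converted copy of the weights lives, and the prompt is just an illustration of the kind of prompt engineering mentioned above.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder path: the LLaMA weights are gated and must be requested from
# Meta for research use, then converted and stored locally.
MODEL_PATH = "/path/to/llama-7b-hf"

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
model = AutoModelForCausalLM.from_pretrained(MODEL_PATH)

# LLaMA is a base (non-chat) model, so prompts are framed as text to be
# continued rather than as questions to an assistant.
prompt = "The key limitations of current large language models are"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=True,
    temperature=0.7,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```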
By the way, if you are enjoying this content and want to stay updated on the latest AI news and updates, feel free to subscribe to the channel. Now, back to the video. Next, Meta has an AI model that converts text into video, and its name might need a little work, as it is called Make-A-Video. The system uses images with descriptions to learn what the world looks like, and uses videos to learn how the world moves. With this data, it can generate strangely charming videos from just text. Check out the clips created from these prompts: a teddy bear painting a portrait; a robot dancing in Times Square; a cat watching TV with a remote control in its hand; a close-up of a fluffy baby sloth in an orange knitted hat trying to figure out a laptop. You can even add motion to a single image, check this out, which is really cool, and you can even add variations to an input video, as you can see here. Meta is also preparing its AI to be able to see and recognize objects through a process called image segmentation, and it is doing this with its Segment Anything Model, or SAM for short. But before we get into how it works, we need to know why it is necessary: for an AI to understand the visual world around it, it needs to be able to label everything it sees and separate each object from the others for recognition purposes.
This is segmentation: the process of dividing an image into smaller regions, where each region corresponds to a specific object or to the background of the image. Although it is an image processing model, SAM is similar to models like GPT and Bard in the sense that it is a foundation model. A foundation model is any model trained on extensive data that can be adapted to a wide range of downstream tasks. Foundation models are trained on billions of data points, but the problem was that such data did not exist for image segmentation. When it comes to computer vision, although we have billions of images on the web, almost none of them are labeled with bounding boxes or what are called segmentation masks, making useful segmentation training at that scale impossible.
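To make the idea of a segmentation mask concrete, here is a minimal sketch (not from the video) of how a mask is commonly represented in code: a boolean array with the same height and width as the image, where True marks the pixels belonging to one object, which can then be used to isolate that object from the rest of the image.

```python
import numpy as np

# A tiny 4x4 "image" with three color channels (values are arbitrary).
image = np.random.randint(0, 256, size=(4, 4, 3), dtype=np.uint8)

# A segmentation mask for one object: True where the object's pixels are.
mask = np.array([
    [False, False, False, False],
    [False, True,  True,  False],
    [False, True,  True,  False],
    [False, False, False, False],
])

# Isolate the object: keep its pixels, zero out everything else.
isolated = np.where(mask[..., None], image, 0)

print("object covers", mask.sum(), "of", mask.size, "pixels")
print(isolated.shape)  # (4, 4, 3)
```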
So Meta built the data itself, using what it calls a data engine, and this is what would train the SAM model. The data engine collected training data in three stages. Stage 1 is called assisted-manual: professional annotators are hired to label images with segmentation masks, and the annotators label anything they can find using a mask; in this case, a mask is any specific portion of an image isolated from the rest of the image. At this stage, the data engine collected more than 4 million masks from 120,000 images. The next stage was the semi-automatic stage, where the goal was to increase the diversity of the dataset. Here, the same team was presented with already partially annotated images and asked to label anything else they could find.
An additional 5.9 million masks were collected here. The last stage was called fully automatic and did not involve any humans. Here, SAM was prompted with a 32-by-32 grid of points, and for each point it predicted a set of masks that could correspond to valid objects. They applied this process to 11 million images that Meta had previously collected and obtained 1.1 billion high-resolution masks. Here are some images where SAM predicted 500-plus masks. That is the final dataset, called SA-1B, which is now publicly available under a permissive license. It has six times more images and 400 times more masks than any other segmentation dataset, and it was this dataset that trained the full SAM model, which supports prompt engineering and zero-shot learning. Now, SAM works with three main components. First, the image encoder takes an image and computes its embedding. Then the user can start requesting segmentation masks from the image, which brings in the second component.
The prompt encoder takes a prompt, which can be a set of points, a bounding box, another mask, or even some simple text, and generates a prompt embedding. This embedding is combined with the image embedding and fed into the mask decoder, which predicts the segmentation masks. Check out this example from the Meta demo page. The image is first loaded into the model, and once loaded, you can prompt it in several ways. Here, the user selects points in the image and SAM finds the segment that corresponds to the selection. You can also prompt an image with a bounding box: here the user selects an area around the lightsaber, and the model correctly predicts that the user wants just the lightsaber. Lastly, you can click "everything" and let SAM find all the objects automatically.
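To tie those three components to real code, here is a minimal sketch using Meta's open-source `segment-anything` Python package. The checkpoint path, image filename, and point/box coordinates are placeholders made up for illustration; the flow mirrors the demo described above: encode the image once, then prompt with a point, with a box, and finally with the automatic "everything" mode.

```python
import cv2
import numpy as np
from segment_anything import (
    SamAutomaticMaskGenerator,
    SamPredictor,
    sam_model_registry,
)

# Load a SAM checkpoint (hypothetical local path) and build the model.
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_checkpoint.pth")

# Component 1: the image encoder runs once per image inside set_image().
predictor = SamPredictor(sam)
image = cv2.cvtColor(cv2.imread("scene.jpg"), cv2.COLOR_BGR2RGB)
predictor.set_image(image)

# Components 2 and 3: prompt with a single foreground point (label 1);
# the prompt encoder embeds it and the mask decoder predicts candidate masks.
masks, scores, _ = predictor.predict(
    point_coords=np.array([[320, 240]]),  # (x, y) pixel coordinate, made up
    point_labels=np.array([1]),
    multimask_output=True,
)
print("point prompt:", masks.shape, "best score:", scores.max())

# Prompt with a bounding box instead (x0, y0, x1, y1), also made up.
masks, scores, _ = predictor.predict(
    box=np.array([100, 150, 400, 380]),
    multimask_output=False,
)

# "Everything" mode: sample a grid of points over the whole image,
# similar to the 32x32 grid used to build the SA-1B dataset.
generator = SamAutomaticMaskGenerator(sam, points_per_side=32)
all_masks = generator.generate(image)
print("automatic mode found", len(all_masks), "masks")
```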
What do you think about the future of Meta in the AI space? I think they're late, but what they are doing is quite innovative. Disagree? Let me know in the comments. In the meantime, click on that video on the screen to see something you haven't seen, and thanks for visiting AI Focus.
