Friday, March 1, 2024

AInsights: MultiOn and Large Action Model (LAM), introduction to Google Gemini and its Copilot version, AI-driven framework, Disney's VR HoloTile

A new world of artificial intelligence: built with Google Gemini

Artificial Intelligence Insights: Executive insights on the latest in generative artificial intelligence…

MultiOn AI represents a shift from generative AI that passively responds to queries to AI that actively participates in completing tasks.

The idea is this: tools like ChatGPT are built on large language models (LLMs). A new class of solutions is emerging that focuses on simplifying information, processes, and digital experiences by integrating large action models (LAMs).

MultiOn AI is a new class of tool that makes generative AI actionable. It leverages generative AI to autonomously execute digital processes and experiences, running in the background of any digital platform and handling tasks that do not require the user's attention. Its purpose is to reduce manual steps and help users focus their time and attention on more valuable activities and interactions.

MultiOn is a software example; LAMs are also executed through handheld AI devices such as the Rabbit R1.

Artificial Intelligence Insights

In addition to automating repetitive tasks, LAMs such as MultiOn AI can interact with various platforms and services to perform tasks across them. This opens the door to a variety of cross-platform applications that will only mature and accelerate exponentially.

For example:

Ordering and reservations: Users can instruct MultiOn AI to find restaurants and make reservations.

Organizing meetings: MultiOn AI can automatically send out meeting invitations.

Uninterrupted entertainment: MultiOn AI plays movies and music from any platform, skipping ads for an uninterrupted experience.

Online interaction: MultiOn AI can post and interact with others online.

Web automation and navigation: MultiOn AI can interact with the web to perform tasks such as finding information online, filling out forms, booking flights and accommodation, and filling in online calendars.

Google Bard changes its name to Gemini to officially (finally) challenge ChatGPT and other generative AI platforms

Google's Bard was a "code red" response to OpenAI's popular ChatGPT platform. Bard is now Gemini, and it has officially launched. While it's not quite on the level of ChatGPT or Claude, it will compete. It has to.

Google has also launched Gemini Advanced to compete with OpenAI's premium ChatGPT tier. It also goes up against Microsoft and its Copilot offering.

Google says the new app is designed to accomplish a range of tasks, including serving as a personal tutor, helping computer programmers complete programming tasks, and even preparing job candidates for interviews.

"It helps you role-play in various scenarios," said Sissie Hsiao, the Google vice president in charge of Google Assistant, when briefing reporters.

Gemini is a "multimodal" system, meaning it can respond to images and sounds. According to The New York Times, after analyzing math problems that include graphs, shapes, and other images, it can answer the questions as well as a high school student.

After going through the awkward Bard phase, let's not forget that a "bard" is by definition a professional storyteller. Google recognizes this, and it must now compete daily, not just on novelty.

Google AI now comes in two flavors: 1) Gemini, powered by the Pro 1.0 model, and 2) Gemini Advanced, powered by Ultra 1.0. The latter costs $19.99 per month through a Google One subscription.

Like ChatGPT-4, Gemini is multimodal, meaning you can enter more than just text.

I hear this question all the time, usually in private. So I'm just going to put it here and you can skip it if you don't need it. Multimodality refers to the genAI model's ability to understand, process, and generate content across multiple types of media, or “modalities,” including text, code, audio, images, and video. This feature allows Gemini or other models to perform tasks involving more than just text-based input and output, making it more general than traditional large language models (LLMs) that focus primarily on text. For example, Gemini can analyze images and generate text-based descriptions, or it can take text prompts and generate relevant audio or visual content.
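To make the idea concrete, here is a minimal sketch of what a single multimodal request can look like, modeled on the publicly documented shape of Google's Gemini REST API (a "contents" list whose "parts" mix text and inline image data). The field names are an assumption based on that documentation, and no API call is made here:

```python
import base64

def build_multimodal_request(text_prompt: str, image_bytes: bytes,
                             mime_type: str = "image/png") -> dict:
    """Combine one text part and one image part into a single request body."""
    return {
        "contents": [{
            "parts": [
                {"text": text_prompt},          # text modality
                {"inline_data": {               # image modality, base64-encoded
                    "mime_type": mime_type,
                    "data": base64.b64encode(image_bytes).decode("ascii"),
                }},
            ]
        }]
    }

# One prompt, two modalities: the model sees the text and the image together.
req = build_multimodal_request("Describe the chart in this image.", b"\x89PNG...")
print(len(req["contents"][0]["parts"]))  # 2
```

The point is simply that a multimodal model accepts heterogeneous parts in one prompt, rather than text alone.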

Gemini can access links on the web and can also generate images using Google's Imagen 2 model (a feature first launched in February 2024). Like ChatGPT-4, Gemini keeps track of your conversation history so you can revisit previous conversations.

Gemini Advanced is great for more “advanced” features like coding, logical reasoning, and collaboration on creative projects. It also allows for longer, more detailed conversations and a better understanding of the context of previous prompts. Gemini Advanced is more powerful and suitable for enterprises, developers and researchers.

Gemini Pro supports more than 40 languages and provides text generation, translation, question answering, and code generation. Gemini Pro is designed for general users and enterprises.

Artificial Intelligence Insights

Similar to what we saw with Microsoft Copilot, Google Gemini Advanced will be integrated into Google Workspace and cloud services through the Google One AI Premium plan. For those who learn how to get the most out of each application through multimodal prompts, this will translate into instant productivity and higher levels of output.

Artificial Intelligence-Powered AR Glasses and Other Devices Are Coming

2024 will be the year of consumer devices powered by artificial intelligence. Humane's debut AI Pin will begin shipping in March. The Rabbit R1 is also scheduled to begin shipping in March, but has been sold out for several months.

Singapore-based Brilliant Labs has just joined the fray with the launch of its new Frame AI-powered AR glasses (designed with ❤️ in Stockholm and San Francisco), powered by the multimodal AI assistant Noa.

Not to be confused with the Apple Vision Pro or Meta's line of AR products, the Frame is designed to be worn regularly, much the way you might wear Humane's AI Pin. Oh, and before you ask, the battery reportedly lasts all day.

Priced at $349, Frame uses open source AR lenses to bring AI to your eyes. It uses voice commands and is also capable of visual processing.

Noa can also generate images and provide instant translation. Frame's functions integrate with the AI answer engine Perplexity, Stability AI's text-to-image model Stable Diffusion, OpenAI's GPT-4 text generation model, and the speech recognition system Whisper.

"The future of human/AI interaction will be enabled by innovative wearables and new devices, and I'm excited to bring Perplexity's instant response engine to Brilliant Labs' Frame," Aravind Srinivas, CEO and founder of Perplexity, said in a statement.

Imagine looking at a glass of wine and asking Noa the number of calories in the glass (I don't want to know!) or the story behind the wine. Or let's say you see a jacket someone else is wearing and want to know more about it. When you're shopping in-store, you can also prompt Frame to help you find the best deals or summarize reviews.

The results are displayed on the lens.

Artificial Intelligence Insights

The company received funding from John Hanke, CEO of Niantic, the AR platform behind Pokémon GO. This tells me it will be around for at least a few iterations, which is good news. At $349, I might try the Frames, even if I end up looking like Steve Jobs' twin.

I still haven't purchased Rabbit's R1, simply because shipping is still too far away to have any meaningful reaction to it that would help you understand its potential in your life. At $699 (starting price) plus a monthly service fee, I couldn't justify the investment in Humane's AI Pin, even though I'd love to give it a try. To me, Humane is pursuing a post-screen or post-smartphone world, which I find interesting!

Disney launches HoloTile concept to help you safely travel through virtual reality

No one wants it to end like this…

Disney Research envisions a potentially incredibly innovative solution.

Designed by Imagineer Lanny Smoot, HoloTile is the world's first multi-person, omnidirectional, modular treadmill floor for augmented and virtual reality applications.

Unlike conventional treadmill-based approaches, Disney's HoloTile is designed for AR, VR, and spatial computing applications today and in the future. I can only imagine the apps we'll see on Apple's Vision Pro and other devices in the near future.

Artificial Intelligence Insights

Just as AR and VR started to spark discussions about virtual worlds, new omnidirectional treadmills emerged that were promising, but in a more traditional way.

It reminds me of the original smartphones. They were phone-based. When the iPhone was developed, the concept of the phone was completely reimagined as a device that "combines three products — a revolutionary mobile phone, a widescreen iPod with touch controls, and a breakthrough Internet communications device with desktop-class email, web browsing, searching and maps — into one small and lightweight handheld device." In fact, the number of phone calls made has remained relatively flat since the launch of the iPhone, while data usage has continued to soar year after year.

Please subscribe to my newsletter, a Quantum of Solis.


