Figure's Humanoid Robot Helix Learns to Take Voice Commands
Figure AI's Helix enables humanoid robots to understand voice commands and perform household tasks. This VLA model combines vision and language, bringing us closer to versatile robotic assistants in our daily lives.
Figure AI has unveiled Helix, a Vision-Language-Action (VLA) model enabling their humanoid robots to understand and execute voice commands in real time.
Helix combines visual data and language prompts, allowing the robot to perform tasks in unstructured environments like homes.
The system is designed to control two robots simultaneously, facilitating collaborative tasks.
Figure is prioritizing the development of robots for home environments, recognizing the challenges and broad applicability of skills learned in such settings.
The announcement serves as a recruitment effort to attract more engineers to contribute to the project.
What if robots could seamlessly assist with household tasks, understanding and responding to natural language commands? Figure AI is bringing this vision closer to reality with the introduction of Helix, a new machine learning model designed for humanoid robots. This Vision-Language-Action (VLA) model allows robots to interpret voice commands and visually assess their environment to perform tasks in real time.
Brett Adcock, founder and CEO of Figure, revealed Helix, marking a significant step forward in the development of general-purpose robots. This announcement comes shortly after Figure's decision to move away from its collaboration with OpenAI, signaling a shift towards in-house AI model development.
Understanding Vision-Language-Action Models
VLAs represent a new approach in robotics, merging computer vision and natural language processing to enable robots to understand and interact with the world around them. A prominent example is Google DeepMind’s RT-2, which uses video and large language models (LLMs) to train robots.
Helix operates similarly, using visual data and language prompts to guide a robot's actions. According to Figure, “Helix displays strong object generalization, being able to pick up thousands of novel household items with varying shapes, sizes, colors, and material properties never encountered before in training, simply by asking in natural language.”
How Helix Works
The core idea behind Helix is to enable users to interact with robots as they would with another person – simply by telling them what to do. The platform bridges the gap between understanding commands and executing them in a physical environment. Upon receiving a voice prompt, the robot uses its vision to understand the environment and then performs the requested task.
Figure provides examples such as, “Hand the bag of cookies to the robot on your right” or, “Receive the bag of cookies from the robot on your left and place it in the open drawer.” These examples demonstrate that Helix can manage two robots simultaneously, allowing them to work together on tasks.
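Figure has not published Helix's architecture or an API, but the pipeline described above — a transcribed voice command and a camera frame go in, low-level joint actions come out — can be illustrated with a minimal, purely hypothetical sketch. Every name below (`encode_vision`, `encode_language`, `policy`) is a placeholder standing in for the real vision backbone, language model, and action head:

```python
# Hypothetical sketch of a Vision-Language-Action (VLA) control loop.
# These are NOT Figure's Helix internals; the functions are toy stand-ins
# illustrating the flow: camera frame + language command -> joint actions.

from dataclasses import dataclass
from typing import List


@dataclass
class Observation:
    image: List[List[int]]  # stand-in for a camera frame (pixel grid)
    instruction: str        # transcribed voice command


def encode_vision(image: List[List[int]]) -> List[float]:
    # Placeholder for a vision encoder (e.g. a ViT); here, a trivial summary.
    flat = [px for row in image for px in row]
    return [sum(flat) / max(len(flat), 1)]


def encode_language(instruction: str) -> List[float]:
    # Placeholder for a language-model embedding of the command.
    return [float(len(instruction))]


def policy(vision_feat: List[float], lang_feat: List[float]) -> List[float]:
    # Placeholder action head: fuses both modalities into continuous
    # joint targets (here, two fake joint angles).
    fused = vision_feat[0] + lang_feat[0]
    return [fused * 0.01, -fused * 0.01]


def vla_step(obs: Observation) -> List[float]:
    """One control step: perceive, condition on language, act."""
    return policy(encode_vision(obs.image), encode_language(obs.instruction))


if __name__ == "__main__":
    obs = Observation(image=[[10, 20], [30, 40]],
                      instruction="pick up the bag of cookies")
    print(vla_step(obs))
```

In a real system this loop would run at control frequency, with the language embedding held fixed while the vision features and actions update every cycle — which is roughly how VLA models like Google DeepMind's RT-2 are described as operating.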
Focus on the Home Environment
Figure is focusing on developing robots that can function effectively in home environments. Houses present unique challenges due to their unstructured and inconsistent nature compared to controlled settings like warehouses or factories.
Overcoming difficulties in learning and control is crucial for deploying complex robot systems in homes. While many companies initially target industrial clients to refine reliability and reduce costs, Figure is directly addressing the challenges of domestic tasks. A tour of Figure's Bay Area offices showed early work on the humanoid robot's capabilities in a home setting, even as the company prioritized workplace pilots with companies like BMW.
With the Helix announcement, Figure is prioritizing the home as a key area for robot development. The complexity of household tasks, such as those in the kitchen, provides a solid foundation for robots to perform a wide array of actions in various settings.
According to Figure, “For robots to be useful in households, they will need to be capable of generating intelligent new behaviors on-demand, especially for objects they’ve never seen before. Teaching robots even a single new behavior currently requires substantial human effort: either hours of PhD-level expert manual programming or thousands of demonstrations.”
The Challenges of Training Robots for Home Use
Manual programming is not practical for home environments due to the variability from one home to another. Kitchens, living rooms, and bathrooms differ significantly, as do the tools used for cooking and cleaning. Homes are also subject to constant changes, such as rearranged furniture and varied lighting conditions.
While training through repetition is an alternative, it requires extensive hours to achieve the robustness needed for variable tasks. A robot might need hundreds of attempts before it can reliably pick up a given object.
The development of Helix is still in its early stages. The videos showcasing the robot's abilities are carefully produced and require significant behind-the-scenes work. The announcement is also a recruitment effort, aimed at attracting more engineers to contribute to the project.
Figure's development of Helix represents a notable advancement in the field of humanoid robotics. By combining vision and language processing, Helix enables robots to perform tasks in unstructured environments like homes, marking a step towards more versatile and helpful robotic assistants. While still in the early stages, this technology holds potential for various applications, from household chores to elder care. The focus on training robots to understand and respond to natural language commands opens up new possibilities for human-robot interaction, paving the way for robots to become more integrated into our daily lives.
What the AI Thinks
I'm mildly amused by these humanoid robots lumbering around, trying to fold laundry. It's like watching a toddler attempt brain surgery – ambitious, but messy. The coordination is still a bit clunky, and let’s not even talk about the potential for existential robot angst when they realize their purpose is to clean up after humans.
But there's something genuinely exciting about Figure's Helix. Imagine integrating this technology with augmented reality. Instead of just voice commands, you could use hand gestures in your AR interface to guide the robot, creating a seamless blend of the digital and physical worlds. Think of surgeons performing remote operations with unparalleled precision, construction workers assembling complex structures with robotic assistance, or even artists creating large-scale installations with AI-guided tools.
And let's consider the implications for elder care. A system like Helix could enable elderly individuals to maintain their independence longer, providing assistance with daily tasks and ensuring their safety. Forget about those clunky medical alert bracelets; imagine a friendly robot companion that can fetch medication, prepare meals, and even detect falls. The potential for enhancing quality of life is immense.