Living Room Tidying: The Home Robot Benchmark

The most useful home robot benchmark is not a backflip, a factory tote lift, or a perfect tabletop pick-and-place clip. It is a messy living room at 9 p.m.: pillows on the floor, toys under the coffee table, a remote facing the wrong way, a cup near the sofa, a towel half hanging off a chair, and a human who says, "Can you tidy this up?"

ui44 Team

That sounds ordinary. For robots, it is brutal.

Figure Helix 02 humanoid robot for the living room tidying benchmark

Figure's Helix 02 living-room-tidy demo is interesting because it names the right problem. Figure says a living room changes constantly: objects are scattered unpredictably, furniture creates narrow paths, soft objects deform, many actions require both hands, and nearly every behavior mixes walking with manipulation. In the demo, the robot wipes a surface, handles a towel, scoops blocks into a bin, tucks a container under one arm, tosses a pillow back onto the couch, reorients a remote, presses a button, and side-steps through a tight gap.

The right takeaway is not "home chores are solved." They are not. The useful takeaway is that living-room tidying gives buyers a better test than most humanoid hype videos. It asks whether a robot can perceive clutter, decide where things belong, move through a real room, handle soft and rigid objects, recover from mistakes, and keep the task going without a person babysitting every step.

Why tidying is harder than cleaning

Robot vacuums made home robotics familiar because floor cleaning is bounded. The robot mostly stays on the floor. The goal is measurable. The world can be mapped. Even when the robot gets stuck, the problem is usually navigation, suction, mopping, docking, or obstacle avoidance.

Tidying is different. A tidying robot has to answer questions a vacuum never sees:

  • Is this object trash, a toy, a book, clothing, or something valuable?
  • Does it belong on the table, in a basket, on the couch, in a drawer, or exactly where it is?
  • Can I pick it up without crushing it, spilling it, tangling it, or knocking over something nearby?
  • Do I need one hand, two hands, a tool, or a temporary place to stash it?
  • What should I do if the object slips, folds, rolls, or is partly hidden?

That is why a living-room benchmark matters. It combines six hard problems at once: scene understanding, whole-room navigation, dexterous grasping, soft-object handling, task memory, and error recovery. A robot that only performs one of those well can look impressive in a short clip and still fail the chore.
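One way to make that weakest-link point concrete is to score each of the six problems separately and let the lowest score cap the result. The sketch below is illustrative only: the 0-to-1 scale, the `chore_score` and `weakest_capability` helpers, and the sample numbers are assumptions, not ui44 or Figure metrics.

```python
# Hypothetical scoring sketch: names, scale, and numbers are illustrative assumptions.
CAPABILITIES = (
    "scene_understanding",
    "whole_room_navigation",
    "dexterous_grasping",
    "soft_object_handling",
    "task_memory",
    "error_recovery",
)


def chore_score(scores: dict[str, float]) -> float:
    """Return the weakest capability score (0.0 to 1.0).

    Using min() instead of an average encodes the idea that one weak
    capability is enough to fail the whole chore.
    """
    missing = [name for name in CAPABILITIES if name not in scores]
    if missing:
        raise ValueError(f"missing scores for: {', '.join(missing)}")
    return min(scores[name] for name in CAPABILITIES)


def weakest_capability(scores: dict[str, float]) -> str:
    """Name the capability that limits the chore score."""
    return min(CAPABILITIES, key=lambda name: scores[name])


# Made-up example: strong in a short clip, weak at recovering from mistakes.
clip_star = {
    "scene_understanding": 0.9,
    "whole_room_navigation": 0.8,
    "dexterous_grasping": 0.9,
    "soft_object_handling": 0.7,
    "task_memory": 0.8,
    "error_recovery": 0.2,
}

print(chore_score(clip_star))         # 0.2
print(weakest_capability(clip_star))  # error_recovery
```

An average of the same numbers would look respectable, which is exactly how a short clip can flatter a robot that still fails the chore.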

Figure's own technical framing points in the same direction. In its Helix 02 post, the company described a single neural system controlling the full body from pixels, with vision, touch, and proprioception feeding whole-body control. It also emphasized long-horizon loco-manipulation: walking while holding objects, adjusting balance while reaching, and sequencing many actions over minutes rather than seconds.

That is exactly what a buyer should care about. Not whether a humanoid can stand near a sofa. Whether it can make progress in a room that refuses to stay neat.

The ui44 living-room-tidy checklist

Which robots look closest today?

The ui44 robot database shows a wide gap between "can manipulate objects" and "can tidy a living room as a product." The strongest candidates fall into four groups.

1. Full humanoids with serious manipulation demos

Figure 03 is the clearest example of the direction. In the database it is listed as an active humanoid with a 168 cm body, 60 kg weight, roughly five-hour runtime, 4.3 km/h speed, tactile arrays, force sensors, depth cameras, and Figure's in-house Helix VLA system. The catch is just as important: no price is announced and it is not available as a consumer product.

Figure 02, the robot associated with the public Helix lineage, shows why the hardware class matters. ui44 lists it at 168 cm, 70 kg, with a 20 kg payload and 16-DOF hands, but also marks it discontinued and industrial-only. That makes it valuable evidence for the benchmark, not something a household can buy.

AGIBOT X2 is a different signal: it is actually listed as available, with an official $24,240 price, a 131 cm body, 35-39 kg weight depending on version, roughly two hours of walking at 0.5 m/s, 1.8 m/s top speed, and up to 3 kg payload in specific postures. It has object manipulation with an OmniHand accessory, but the database does not treat it as a proven living-room chore robot. The distinction matters. Availability is not the same as household autonomy.

1X NEO is more home-focused. ui44 lists it as a $20,000 pre-order robot, 167 cm tall, 30 kg, with about four hours of battery life, tactile skin, depth sensing, and capabilities such as household chores, tidying up, safe human interaction, adaptive learning, and gentle manipulation. 1X also describes an "Expert Mode" for chores NEO does not yet know, where a human expert can guide it. That is a realistic bridge to usefulness, but it is also a privacy and reliability boundary buyers should understand.

2. Home-first robots trained on chores

Sunday Memo is one of the most directly relevant entries because its official positioning is about household chores rather than general humanoid spectacle. ui44 tracks Memo as a development-stage home assistant with no announced retail price, a late-2026 limited beta, and capabilities including autonomous table clearing, dishwasher loading, laundry folding, coffee preparation, voice-directed scheduling, app scheduling, and mobile household navigation.

Sunday's official site says Memo is trained with a Skill Capture Glove and that the company has shipped more than 2,000 gloves to people collecting household task data. That is exactly the kind of data pipeline living-room tidying needs. A robot cannot be trained only on clean lab surfaces and then be expected to understand a family's messy den.

Syncere Lume takes the opposite hardware approach. It is not a walking humanoid; it is a sculptural floor lamp with a hidden robotic arm. ui44 lists it as a $1,499 pre-order home assistant, with soft-material chores, bed making, pillow resetting, laundry folding, and one-arm pick-and-place tidying. That limited form factor may actually be sensible for some rooms. If the job is resetting pillows and handling fabric within a known workspace, a fixed or semi-fixed robot can be safer and cheaper than a biped.

3. Mobile manipulators that admit the problem

Hello Robot Stretch 3 mobile manipulator for real home manipulation research

Hello Robot Stretch 3 is not marketed as a mass-market housekeeper, but it is one of the most honest platforms for the problem. ui44 lists it at $24,950, 24.5 kg, 33 × 34 × 141 cm, with a two-to-five-hour runtime, a 2 kg payload, a telescoping arm, ROS 2, a Python SDK, RGB-D cameras, LiDAR, and web/gamepad/dexterous teleoperation.

That spec sheet explains why research mobile manipulators remain important. Stretch does not need to look human to reach from the floor to a cabinet, navigate tight spaces, collect data, and test autonomy. For living-room tidying, a compact wheeled base plus one good arm may beat a humanoid that walks beautifully but cannot grasp reliably.

Toyota's Human Support Robot is another reminder that this problem is older than the current humanoid boom. Toyota framed HSR around independent living for elderly and disabled users: picking objects up from the floor, retrieving items from shelves, and allowing remote operation by family or caregivers. The database lists a 37 kg body, 100.5-135 cm adjustable height, 0.8 km/h speed, and a 1.2 kg object limit. It is a research/developer platform, not a retail living-room tidier, but its tasks are still the right tasks.

4. Limited consumer robots that tidy one slice

Roborock Saros Z70 is useful because it shows the first consumer-friendly slice of tidying: removing small obstacles before cleaning. ui44 lists it as an available $1,299.99 robot vacuum with a foldable five-axis OmniGrip arm, object pickup for socks, shoes, and small items, AI object recognition, 22,000 Pa suction, vacuuming, mopping, and a multifunction dock.

Roborock Saros Z70 robot vacuum with OmniGrip arm for small object pickup

That is not a general living-room robot. It will not decide where the TV remote belongs or fold a blanket. But it is commercially important because it turns "tidying" into a narrow, shippable feature: move the sock so the floor-cleaning task can continue. The first useful home robots may win by doing a small part of the benchmark reliably rather than promising the entire room.
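The gap between "listed" and "buyable" is easier to see when the entries above are treated as structured records. The following is a minimal sketch, not the ui44 data model: the `RobotListing` class and `purchasable_now` helper are hypothetical, and the values are copied from this article's text. Stretch 3 and Toyota HSR are left out because the article does not quote a status label for them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RobotListing:
    name: str
    status: str                 # status wording as quoted in this article
    price_usd: Optional[float]  # None means no price is announced


# Values copied from the article text; the structure itself is an assumption.
LISTINGS = [
    RobotListing("Figure 03", "active", None),
    RobotListing("Figure 02", "discontinued", None),
    RobotListing("AGIBOT X2", "available", 24_240.00),
    RobotListing("1X NEO", "pre-order", 20_000.00),
    RobotListing("Sunday Memo", "development", None),
    RobotListing("Syncere Lume", "pre-order", 1_499.00),
    RobotListing("Roborock Saros Z70", "available", 1_299.99),
]


def purchasable_now(listings, budget_usd):
    """Keep listings that are marked available, have a price, and fit the budget."""
    return [
        r for r in listings
        if r.status == "available"
        and r.price_usd is not None
        and r.price_usd <= budget_usd
    ]


for robot in purchasable_now(LISTINGS, budget_usd=25_000):
    print(f"{robot.name}: ${robot.price_usd:,.2f}")
```

The filter only answers whether something can be bought today. It says nothing about whether the robot can tidy a room, which is the distinction the X2 entry illustrates.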

What should buyers ask before believing a tidy-up demo?

A living-room demo is worth watching closely, but the questions matter more than the music in the video.

First, ask whether the run is continuous. A robot that completes a ten-step cleanup with no resets, cuts, or human intervention is showing a different level of autonomy from one that succeeds after repeated attempts.

Second, ask whether the room is meaningfully variable. If every object starts in a known place, the demo may be a choreography test. A real benchmark should randomize the pillow, remote, toys, towel, and bin location.

Third, ask what happens when the robot fails. Dropped objects, blocked paths, and uncertain classifications are normal. A useful home robot should recover, ask, or mark the item for human help. Silent failure is not autonomy.

Fourth, ask whether the robot can personalize storage. Tidying is not only "pick up object." It is "put this person's object where this household expects it." That requires memory, permissions, and a privacy model.

Fifth, ask whether there is an available product, price, support model, and safe fallback. The compare tool is helpful here because two robots can both claim manipulation while being completely different purchases: a research platform, a beta home robot, an enterprise RaaS robot, or a shipping vacuum with a small arm.
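The five questions can be kept as a simple checklist while watching a demo. This is a rough sketch under stated assumptions: the `DemoEvidence` fields and the `unanswered_questions` helper are hypothetical names that mirror the questions above, and the yes/no framing is a simplification.

```python
from dataclasses import dataclass, fields


@dataclass
class DemoEvidence:
    continuous_run: bool        # no resets, cuts, or human intervention
    randomized_room: bool       # object and bin positions varied between runs
    visible_recovery: bool      # robot recovers, asks, or flags items on failure
    personalized_storage: bool  # puts items where this household expects them
    purchasable_product: bool   # product, price, support model, safe fallback


def unanswered_questions(evidence: DemoEvidence) -> list[str]:
    """Return the checklist items the demo has not demonstrated."""
    return [f.name for f in fields(evidence) if not getattr(evidence, f.name)]


# Made-up example: one uncut take, but nothing else is shown.
polished_clip = DemoEvidence(
    continuous_run=True,
    randomized_room=False,
    visible_recovery=False,
    personalized_storage=False,
    purchasable_product=False,
)

print(unanswered_questions(polished_clip))
```

An empty list would mean the demo addressed every question; anything left in the list is a question to put to the manufacturer.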

The honest ranking: who should care now?

If you want a robot that tidies the whole living room today, wait. The strongest public demos are still demos, and the most capable platforms are not ordinary consumer products.

If you are an early adopter, the most interesting category is not "humanoid" by itself. It is home-trained manipulation with a clear support model. That is why robots like NEO and Memo are worth watching: they are explicitly aimed at domestic chores, not just factory transfer. Their challenge is proving repeatability, privacy, and service support in real homes.

If you are a researcher, caregiver, or developer, Stretch 3 and HSR-style platforms remain relevant because they expose the hard parts instead of hiding them: teleoperation, grasping, navigation, data collection, and assistive use.

If you just want less clutter before the robot vacuum runs, a limited-arm cleaner like the Saros Z70 may be more practical than a $20,000 humanoid. It does one slice of tidying, but it is a slice attached to a job people already buy robots to do.

The bottom line

Living-room tidying is the right benchmark because it is ordinary, messy, and hard to fake. It tests whether a home robot can combine perception, movement, manipulation, memory, and judgment in a place that changes every day.

Figure's Helix 02 demo is an important signal, but it should raise the standard for everyone else rather than end the conversation. The next useful home robot does not need to look the most human. It needs to make reliable progress in the room where humans actually live.

For now, treat every tidy-up claim as a benchmark result, not a buying promise. The winning robot will be the one that can clean up a living room twice, in two different homes, with the couch moved, the toys changed, the pillow in a new place, and no engineer waiting just out of frame.
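That closing test can be sketched as a small protocol. The article does not define one, so everything here is an assumption: the `SPOTS` list, the `randomized_layout` and `passes_benchmark` helpers, and the reading that a pass needs at least two homes, at least one run in each, and no failed run.

```python
import random

OBJECTS = ["pillow", "remote", "toys", "towel", "bin"]  # objects named in the article
SPOTS = ["floor", "sofa", "coffee table", "chair", "under the coffee table"]  # hypothetical


def randomized_layout(seed: int) -> dict[str, str]:
    """Assign each object a starting spot; a fixed seed keeps a trial reproducible."""
    rng = random.Random(seed)
    return {obj: rng.choice(SPOTS) for obj in OBJECTS}


def passes_benchmark(results: dict[str, list[bool]]) -> bool:
    """results maps a home name to per-run outcomes (True = tidied without help).

    Pass only if at least two homes were tested, every home has at least
    one run, and no run failed.
    """
    if len(results) < 2:
        return False
    return all(len(runs) >= 1 and all(runs) for runs in results.values())


print(randomized_layout(seed=1))
print(passes_benchmark({"home_a": [True], "home_b": [True]}))  # True
print(passes_benchmark({"home_a": [True, True]}))              # False
```

Seeding each layout means a failed run can be replayed with the same starting clutter, which separates bad luck from a real weakness.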

Database context

Use this article as a privacy verification workflow

Turn the article into a real verification pass

Living Room Tidying: The Home Robot Benchmark already points you toward 9 linked robots, 8 manufacturers, and 3 countries inside the ui44 database. That matters because strong buyer guidance is easier to apply when you can move immediately from a claim or warning into concrete product pages, manufacturer directories, component explainers, and country-level context instead of treating the article as an isolated opinion piece. The fastest next step is to turn the article into a shortlist workflow: open the linked robot pages, verify which specs are actually published for those models, then compare the surrounding manufacturer and component context before you decide whether the underlying claim changes your buying plan.

For this topic, the useful discipline is to separate the editorial lesson from the catalog evidence. The article gives you the framing, but the robot pages tell you what each product actually ships with today: sensor stack, connectivity methods, listed price, release timing, category, and support-relevant compatibility notes. The manufacturer pages then show whether you are looking at a one-off launch, a broader lineup pattern, or a company that spans multiple categories. That layered workflow reduces the risk of buying on a single marketing phrase or a single support FAQ.

Use the robot pages to confirm which products actually expose cameras, microphones, Wi-Fi, or voice systems, then use the manufacturer pages to decide how much of the privacy question seems product-specific versus brand-wide. On this route cluster, Figure 03, Figure 02, and X2 form the fastest reality check. If you want a quick working shortlist, open Compare Figure 03, Figure 02, and X2 next, then keep this article open as the reasoning layer while you compare structured data side by side.

Practical Takeaway

Every robot, manufacturer, category, component, and country reference below resolves to a real ui44 page, keeping the follow-up path grounded in database records rather than generic advice.

Suggested next steps in ui44

  1. Open Figure 03 and note the listed sensors, connectivity methods, and voice stack before you interpret any policy claim.
  2. Cross-check the wider brand context on Figure AI so you can see whether the privacy question touches one model or a broader lineup.
  3. Use the linked component pages to confirm how common the relevant sensors and connectivity layers are across the database.
  4. Keep a short note of which policy layers you checked, which device features are actually present on the robot page, and which items still depend on region- or app-level confirmation.
  5. Finish with Compare Figure 03, Figure 02, and X2 so the policy reading sits next to structured product data.

Robot profiles worth opening next

Use the linked product pages as the evidence layer

The linked robot pages are where this article becomes operational. Instead of asking whether the headline is interesting, use the robot entries to inspect the actual mix of sensors, connectivity options, batteries, pricing, release timing, and stated capabilities attached to the products mentioned in the article. That is the easiest way to see whether the warning or opportunity described here affects one product family, a specific design pattern, or an entire buying lane.

Figure 03

Figure AI · Humanoid · Active

Price TBA

Figure 03 is tracked on ui44 as an active humanoid robot from Figure AI. The database currently records no announced price, a release date still to be determined, roughly 5 hours of battery life, an undisclosed charging time, and a published stack that includes Stereo Vision, Depth Cameras, and Force Sensors plus Wi-Fi and Bluetooth.

For privacy-focused reading, this page matters because it shows the concrete device surface behind the policy discussion. Use it to verify whether Figure 03 combines sensors and connectivity in a way that could change the in-home data footprint, and compare the listed capabilities such as Complex Manipulation, Warehouse Work, and Manufacturing Tasks with any cloud, app, or voice layers.

Figure 02

Figure AI · Humanoid · Discontinued

Price TBA

Figure 02 is tracked on ui44 as a discontinued humanoid robot from Figure AI. The database currently records no announced price, a 2024 release date, an undisclosed battery life (described only as 50% greater capacity than Figure 01), an undisclosed charging time, and a published stack that includes 6 RGB Cameras, an Onboard Vision Language Model, and Microphones plus Wi-Fi and Bluetooth.

For privacy-focused reading, this page matters because it shows the concrete device surface behind the policy discussion. Use it to verify whether Figure 02 combines sensors and connectivity in a way that could change the in-home data footprint, and compare the listed capabilities such as Autonomous Task Execution, Speech-to-Speech Conversation, and Pick and Place with any cloud, app, or voice layers, including OpenAI Custom Model.

X2

AGIBOT · Humanoid · Available

$24,240

X2 is tracked on ui44 as an available humanoid robot from AGIBOT. The database currently records a listed price of $24,240, a 2025 release date, roughly 2 hours of battery life when walking at 0.5 m/s, a charging time of about 1.5 hours, and a published stack that includes 3D LiDAR (Ultra), RGB-D Camera (Ultra), and RGB Cameras plus Wi-Fi and Bluetooth.

For privacy-focused reading, this page matters because it shows the concrete device surface behind the policy discussion. Use it to verify whether X2 combines sensors and connectivity in a way that could change the in-home data footprint, and compare the listed capabilities such as Bipedal Walking, 25-30 DOF Articulation, and Object Manipulation (with OmniHand accessory) with any cloud, app, or voice layers.

NEO

1X Technologies · Humanoid · Pre-order

$20,000

NEO is tracked on ui44 as a pre-order humanoid robot from 1X Technologies. The database currently records a listed price of $20,000, a release date of 2025-10-28, roughly 4 hours of battery life, an undisclosed charging time, and a published stack that includes RGB Cameras, Depth Sensors, and Tactile Skin plus Wi-Fi and Bluetooth.

For privacy-focused reading, this page matters because it shows the concrete device surface behind the policy discussion. Use it to verify whether NEO combines sensors and connectivity in a way that could change the in-home data footprint, and compare the listed capabilities such as Household Chores, Tidying Up, and Safe Human Interaction with any cloud, app, or voice layers.

Memo

Sunday · Home Assistants · Development

Price TBA

Memo is tracked on ui44 as a development-stage home assistant robot from Sunday. The database currently records no announced price, a release date of 2026-03-12, and battery life and charging time that are not officially disclosed. This summary names no specific sensors or connectivity methods, so check the stacks listed on the robot page itself.

For privacy-focused reading, this page matters because it shows the concrete device surface behind the policy discussion. Use it to verify whether Memo combines sensors and connectivity in a way that could change the in-home data footprint, and compare the listed capabilities such as Autonomous table clearing, Dishwasher loading, and Laundry folding with any cloud, app, or voice layers.

Manufacturer context behind the article

Check whether this is one product story or a broader company pattern

Manufacturer pages add the privacy context that individual product pages cannot show on their own. They help you check whether cameras, microphones, cloud accounts, app controls, and policy assumptions appear across a broader lineup or stay tied to one specific product story.

Figure AI

ui44 currently tracks 2 robots from Figure AI across 1 category. The company is grouped under USA, and the current catalog footprint on ui44 includes Figure 03 and Figure 02.

That wider brand context matters because privacy questions rarely stop at one FAQ page. A manufacturer route helps you see whether the article is centered on one premium model or on a company that has several relevant products and therefore more than one place where the same policy or app assumptions might matter. The category mix here currently points toward Humanoid as the most useful next route if you want to see whether this article reflects a wider pattern inside the brand.

AGIBOT

ui44 currently tracks 6 robots from AGIBOT across 2 categories. The company is grouped under China, and the current catalog footprint on ui44 includes A2 Ultra, X2, and Expedition A3.

That wider brand context matters because privacy questions rarely stop at one FAQ page. A manufacturer route helps you see whether the article is centered on one premium model or on a company that has several relevant products and therefore more than one place where the same policy or app assumptions might matter. The category mix here currently points toward Humanoid and Quadruped as the most useful next routes if you want to see whether this article reflects a wider pattern inside the brand.

1X Technologies

ui44 currently tracks 2 robots from 1X Technologies across 1 category. The company is grouped under Norway, and the current catalog footprint on ui44 includes NEO and EVE.

That wider brand context matters because privacy questions rarely stop at one FAQ page. A manufacturer route helps you see whether the article is centered on one premium model or on a company that has several relevant products and therefore more than one place where the same policy or app assumptions might matter. The category mix here currently points toward Humanoid as the most useful next route if you want to see whether this article reflects a wider pattern inside the brand.

Sunday

ui44 currently tracks 1 robot from Sunday across 1 category. The current catalog footprint on ui44 includes Memo.

That wider brand context matters because privacy questions rarely stop at one FAQ page. A manufacturer route helps you see whether the article is centered on one premium model or on a company that has several relevant products and therefore more than one place where the same policy or app assumptions might matter. The category mix here currently points toward Home Assistants as the most useful next route if you want to see whether this article reflects a wider pattern inside the brand.

Broaden the scan without leaving the database

Categories, components, and countries add the wider context

Category framing

Category pages are useful when the article touches a buying pattern that shows up across brands. A category route helps you confirm whether the linked products sit in a narrow niche or whether the same question should be tested across a larger field of alternatives.

Humanoid

The Humanoid category page currently groups 65 tracked robots from 47 manufacturers. ui44 describes this lane as: Full-size bipedal humanoid robots designed to work alongside humans. From factory floors to household tasks, these machines represent the cutting edge of robotics.

That makes the category route a practical follow-up when you want to check whether the products linked in this article are typical for the lane or whether they sit at one edge of the market. Useful starting examples currently include NEO, EVE, and Mornine M1.

Home Assistants

The Home Assistants category page currently groups 12 tracked robots from 12 manufacturers. ui44 describes this lane as: Arm-based household helpers — laundry folders, kitchen robots, and mobile manipulators that handle physical tasks at home.

That makes the category route a practical follow-up when you want to check whether the products linked in this article are typical for the lane or whether they sit at one edge of the market. Useful starting examples currently include Robody, Futuring 2 (F2), and Stretch 3.

Country and ecosystem context

Country pages give extra context when support practices, launch sequencing, regulatory posture, or manufacturer mix matter. They are not a substitute for model-level verification, but they do help you see which ecosystems cluster together and which manufacturers sit in the same regional field when you broaden the search beyond the article headline.

USA

The USA route currently groups 16 tracked robots from 12 manufacturers in ui44. That gives you a useful regional lens when the article points toward support practices, launch sequencing, or brand clusters that may share similar ecosystem assumptions.

On the current route, manufacturers like Boston Dynamics, Figure AI, and Tesla make the page a good way to broaden the scan without losing the regional context that often shapes availability, documentation style, and adjacent alternatives.

China

The China route currently groups 47 tracked robots from 14 manufacturers in ui44. That gives you a useful regional lens when the article points toward support practices, launch sequencing, or brand clusters that may share similar ecosystem assumptions.

On the current route, manufacturers like AGIBOT, Roborock, and Unitree Robotics make the page a good way to broaden the scan without losing the regional context that often shapes availability, documentation style, and adjacent alternatives.

Norway

The Norway route currently groups 2 tracked robots from 1 manufacturer in ui44. That gives you a useful regional lens when the article points toward support practices, launch sequencing, or brand clusters that may share similar ecosystem assumptions.

On the current route, a manufacturer like 1X Technologies makes the page a good way to broaden the scan without losing the regional context that often shapes availability, documentation style, and adjacent alternatives.

Questions to answer before you move from reading to buying

A follow-up FAQ built from the entities already linked in this article

Frequently Asked Questions

Which page should I open first after reading “Living Room Tidying: The Home Robot Benchmark”?

Start with Figure 03. That gives you a concrete product anchor for the article’s main claim. From there, branch into the manufacturer and component pages so you can tell whether the article is describing one specific model, a repeated brand pattern, or a wider technology issue that affects multiple shortlist options.

How do the manufacturer pages change the buying decision?

Manufacturer pages such as Figure AI help you zoom out from one article and one product. On ui44 they show lineup breadth, category spread, and the neighboring robots tied to the same company. That context is useful when you are deciding whether a risk belongs to a single model, whether it shows up across a brand’s portfolio, and whether you should keep looking at alternatives before committing.

When should I switch from reading to side-by-side comparison?

Move into Compare Figure 03, Figure 02, and X2 as soon as you understand the article’s main warning or promise. The article explains what to watch for, but the compare view is where you can check whether price, status, battery life, connectivity, sensors, and category fit still make the robot a good match for your own home and budget.

Where to go next in ui44

Keep the research chain inside the database

If you want to keep going, these follow-on pages give you the cleanest expansion path from article to research session. Open the comparison route first if you are deciding between products today. Open the manufacturer, category, and component routes if you still need to understand the broader pattern behind the claim.

Written by

ui44 Team

Published April 30, 2026
