this post was submitted on 23 Nov 2023

Machine Learning

I've spent the past few months diving into recent publications from venues like CoRL, ICRA, IROS, RA-L, and CVPR, plus some preprints of note. It seems like we're getting pretty close to robotic assistants that can perform a limited range of tasks and be deployed in people's homes. Where do we think we are with this tech? What do you think is the biggest bottleneck we need to overcome? Do you think the "internet-scale" behavioural cloning techniques from Google will lead the way? Or something more RL-VL-oriented?

I'm at the start of an AI PhD with a focus on robotics so I'd love to hear people's thoughts on this!

top 15 comments
[–] theLanguageSprite@alien.top 1 points 10 months ago (1 children)

Well, we already have Roombas and robots that serve tables at restaurants, but I'm guessing you mean more like a robot butler from Fallout, capable of receiving verbal instructions and performing arbitrary tasks. As far as I can tell, there are two main reasons we don't have those yet.

The first is that robots are dangerous. If you make a robot strong enough to perform the tasks that humans do, like lifting heavy objects, climbing stairs, and opening doors, it's gonna have to be pretty heavy. There are tons of scenarios where the robot could fall on someone, drop something on someone, or just hit them with something moving really fast. The best way to overcome this would probably be one of those Baymax-style squishy robots, but that still seems like a challenging engineering problem.

The second is more on the ML side of things. Even with "a limited range of tasks", you'd need RL, as you correctly point out, because every environment is different. If you're curious about why RL hasn't seen many real-world applications yet, I'd recommend you try to train an RL agent yourself. Even at the state of the art, it's incredibly mercurial, and it often falls into a failure mode unless the hyperparameters are just right and the observations and rewards are designed perfectly. The people at Nvidia, Meta, and OpenAI are desperately trying to monetize RL with all sorts of tricks and techniques, but it's just not there yet.
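To see how touchy this is firsthand, here's a minimal tabular Q-learning sketch on a toy corridor world. Everything here (the environment, the `train` function, the hyperparameter choices) is my own illustrative invention, not from any of the papers or labs mentioned above:

```python
import random

def train(n_states=10, episodes=2000, alpha=0.3, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Q-learning on a 1-D corridor: +1 reward for reaching the last cell."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]    # Q[state][action]; 0 = left, 1 = right
    for _ in range(episodes):
        s = rng.randrange(n_states - 1)           # random start states help exploration a lot
        for _ in range(4 * n_states):             # per-episode step budget
            if rng.random() < epsilon:            # epsilon-greedy action selection
                a = rng.randrange(2)
            else:
                a = max((0, 1), key=lambda i: q[s][i])
            s2 = max(0, s - 1) if a == 0 else min(n_states - 1, s + 1)
            r = 1.0 if s2 == n_states - 1 else 0.0
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
            s = s2
            if r > 0:                             # episode ends at the goal
                break
    return q

q = train()
greedy = [max((0, 1), key=lambda i: q[s][i]) for s in range(len(q) - 1)]
print(greedy)  # with these settings the agent learns "always go right": nine 1s
```

Even here, small changes break it: fix the start state at cell 0, or set `epsilon=0`, and with zero-initialized Q-values and left-biased tie-breaking the agent can drift into the left wall and never see the reward at all. Scaling that fragility up to high-dimensional observations and hand-shaped rewards is the hard part.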

So ultimately we'll probably have robot butlers around the time we solve both of these problems, which depending on who you ask, will either be in the next couple decades or never.

[–] VAL9THOU@alien.top 1 points 10 months ago (1 children)

There's also the whole "if your robot is smart enough to run your house for you, with all the decision-making ability and contextual analysis that would entail, it's probably smart enough that owning it should be considered slavery" part.

[–] perta1234@alien.top 1 points 10 months ago

Maybe, though we would have no problem if a dog were able to do that? Anyway, it could be a set of specialized skills without general skills. There could be the appearance of intelligence without actual general intelligence.

[–] Slow-Camel-1245@alien.top 1 points 10 months ago

Lack of generalization

[–] Alittlebitanalytical@alien.top 1 points 10 months ago

The Japanese are way ahead of everyone else. AI assistants for seniors, for example.

[–] Wyvern_king@alien.top 1 points 10 months ago (1 children)

I was just at CoRL a couple of weeks ago, so I'll briefly summarize some of the discussions I was part of or listened to.

As others have mentioned, homes are complex, unstructured environments where new situations are constantly being introduced to a system. Places like factories and warehouses, where robots are starting to become common, are generally more controlled and can be configured to let robots operate.

Building on this is the issue of reliability. Cutting-edge research will achieve SOTA results on tasks with something like a 90% success rate. For many tasks that humans do around the home, a 90% success rate is pretty terrible. I think it was Russ Tedrake at the conference who said something along the lines of "if someone broke 10% of the dishes while loading the dishwasher, we'd immediately take them off dish duty".
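The arithmetic behind that quote is worth spelling out: per-attempt success probabilities multiply, so a success rate that sounds high in a paper decays fast over repeated use. The numbers below are hypothetical, just to illustrate the compounding:

```python
# Probability of completing n independent attempts with zero failures,
# when each attempt succeeds with probability p.
def streak(p: float, n: int) -> float:
    return p ** n

# A robot that loads the dishwasher correctly 90% of the time, once a day,
# gets through a 20-day stretch without breaking anything only ~12% of the time.
print(f"{streak(0.90, 20):.3f}")   # ≈ 0.122
```

That's why reliability, rather than one-off capability demos, tends to be the sticking point for deployment in homes.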

The field is making tons of progress on both of these fronts but it's just not at the point where systems are ready to be widely deployed in homes.

[–] harrisonfield@alien.top 1 points 10 months ago

That is a really good point, thanks!

[–] bgighjigftuik@alien.top 1 points 10 months ago (1 children)

Because we don't have models capable of performing actual reasoning

[–] skydivingdutch@alien.top 1 points 10 months ago

The foundation models are getting there....

[–] Chocolate_Pickle@alien.top 1 points 10 months ago

The real answer is that robots strong enough to do useful work are strong enough to hurt people and damage everything around them, including other robots and themselves.

It's a safety issue. It's basically the same reason why self-driving cars are still fraught with problems despite being a simpler problem to solve.

Also, batteries aren't good enough (yet).

[–] ItsJustMeJerk@alien.top 1 points 10 months ago (1 children)

I'm not exactly an expert, but what I'm seeing is that:

  1. Software moves faster than mechanics. Robots (especially humanoid ones) are limited to stiff movements and are expensive to manufacture.
  2. While big advances in perception and control have been made in the past couple of years with VLMs, most systems still aren't reliable enough to be deployed. It's now possible to ask a robot to fetch you a beer successfully 70% of the time, but 70% isn't enough.
  3. It takes time for firms to bring a product to market. Robots being deployed now use techniques that are no longer SOTA.
[–] deepneuralnetwork@alien.top 1 points 10 months ago

Because it’s really f’ing hard.

[–] zazzersmel@alien.top 1 points 10 months ago

Because selling a service that uses tons of compute to provide simple novelties, with the potential for fraud, isn't a great business plan... yet.

[–] coinclink@alien.top 1 points 10 months ago

The best robots out there can barely even walk without toppling over lol. I do think we're getting close with the "knowledge and understanding" side of AI with LLMs, but the "navigating the physical world" part seems further out.