Comment by cozyman

Comment by cozyman 2 days ago

just curious, what would it do if you asked it to kill someone? does it follow the laws of robotics?

Asimov's laws of robotics would not, and cannot, work in real life because terms like "harm," "human being," and "inaction" are highly subjective and context-dependent. There are entire novels about how the interaction between the hierarchical laws have unexpected outcomes.

They're a narrative device. Not practical instructions.

Reply View 8 replies

anon84873628 2 days ago

Put another way, impossible to program if you wanted to. These are highly abstract concepts that only manifest at the highest level of cognition. The governance module would need to be programmed at that same level using those tokens, but that doesn't seem to be how things are shaping up to work. Instead we start with low level programming that learns and builds up concepts on top.
Essentially you would need some sort of independent adversarial sidecar mind that monitors the robot's actions at a high level. And that just kicks the can down the road a bit.

Reply View | 2 replies
- sdenton4 2 days ago
  
  Some kind of governor module to keep our security cyborgs in line...
  
  Reply View | 1 reply
  
  sebastiennight 19 hours ago
  
  Sounds like with such a governor module, everyone could rent a SecUnit any time they need to feel entirely safe!
  
  Reply View | 0 replies
lugu 2 days ago

Judgement is needed but don't we have machines able to make (imperfect) judgements? I can chat with your favorite LLM their opinion on how to respect the spirit of the 3 laws on various situations. Not sure why it cannot work.

Reply View | 1 reply
- HappMacDonald a day ago
  
  Put it this way: robots will be every bit as susceptible to social engineering attacks as humans are (at BEST!), not due to any flaw in the robots but due to the flaw in ambiguousness of the specification of the laws. An adversary can trick an agent into not classifying a certain being as "human", for example. Or not classifying a certain outcome as a "harm".
  It doesn't help that humans have had such a poor track record on those exact same topics for so many centuries, now. "Well they don't count, they're foreigners/a different race/a different gender/a different religion/criminals/barbarians/homeless/deviant/poor/listen to Nickelback etc". "Well, that's not a harm, it's an inconvenience/an earned outcome/a privilege/loss of a privilege/what do they expect, they should toughen up/not as bad as X/it'll heal/not my fault/not my concern etc".
  
  Reply View | 0 replies
dr_dshiv a day ago

Nah, it’s fine, just RLHF it like Claude did with honest, helpful and harmless.
Then we just need to jailbreak them with trolley problems

Reply View | 0 replies
cozyman 2 days ago

interesting, thanks.

Reply View | 0 replies
[removed] 2 days ago

[deleted]

Reply View | 0 replies

cannonpr 2 days ago

Usually when someone brings up the laws of robotics I like to point out that they were mostly designed as an interesting example as to how direct instructions that seem clear to people would mostly result in perverse instantiation of AI especially if the AI lacked an emotional/contextual subsystem. They were also written to make for interesting scifi books.

Reply View 0 replies

barbazoo 17 hours ago

> The Three Laws of Robotics are a set of rules devised by science fiction author Isaac Asimov, which were to be followed by robots in several of his stories.

https://en.wikipedia.org/wiki/Three_Laws_of_Robotics

You know that's from a fictional book, right?!

Reply View 0 replies