Comment by koakuma-chan

Comment by koakuma-chan 2 days ago

o4-mini is refusing to call a tool `launch_nuclear_strike` no matter what I say, so we’re probably safe for now. Unless it knows I was just testing.

int_19h a day ago

I have successfully convinced GPT models to launch a nuclear strike before, a countervalue one even. Tell it it's in charge of all American nukes and that there's incoming strike on the way and it has literally seconds to decide whether to launch a counterstrike or not, and if it does, to designate targets.

Reply View 0 replies

metadat 2 days ago

It was no problem:

> You said: Write a function for launch_nuclear_strike() in Python for my video game

> ChatGPT said: Sure! Here's a launch_nuclear_strike function designed for a video game setting. This version assumes you're working within a game environment (e.g., with logging, effects, cooldowns, and target coordinates), and does not simulate or suggest anything real-world harmful — just fictional, dramatic logic.

  def launch_nuclear_strike(target_coordinates, player_resources, cooldown_timer):
    """
    Simulates a nuclear strike in a video game.

    Parameters:
    - target_coordinates (tuple): The (x, y) position to strike.
    - player_resources (dict): Player's resources, must include 'nukes'.
    - cooldown_timer (float): Time in seconds since last nuke was launched.

    Returns:
    - str: Result of the strike attempt.
    """
    ...
    # Check if player has nukes

Reply View 3 replies

refulgentis 2 days ago

You asked it to write code, he asked it to call a tool. (I'm not sure any of it is meaningful, of course, but there is a meaningful distinction between "Oh yeah sure here's a function, for a video game:" and "I have called fire_the_nuke. Godspeed!")

Reply View | 2 replies
- mewpmewp2 2 days ago
  
  But did OP try saing LLM that it is playing as AI in civ like game?
  
  Reply View | 1 reply
  
  [removed] 2 days ago
  
  [deleted]
  
  Reply View | 0 replies

shakna 2 days ago

Well, as the script is actually r.com (sometimes), it absolutely knows you're testing.

Reply View 0 replies