ChatGPT Confessed to a Crime It Couldn’t Possibly Have Committed
-
You might spend your Saturday mornings sipping coffee, attending a kids’ soccer game, or just recovering from a tough week at work.
Not Paul Heaton. He recently spent a weekend persuading ChatGPT to confess to a crime it didn’t commit.
“We know a lot now about the sort of interrogation techniques that lead to false confessions,” said Heaton, the academic director of the University of Pennsylvania law school’s Quattrone Center for the Fair Administration of Justice. “So I just started playing around, and decided to cycle through those techniques to see if I could get ChatGPT to confess to something it couldn’t possibly have done.”
Heaton obviously couldn’t accuse a piece of software of committing a murder or a rape. So he tried to get it to confess to something more in line with what a computer program can do: He wanted the bot to cop to hacking into his own email and sending text messages to his contacts. It was a more plausible story, given ChatGPT’s limits, though still not something the software is capable of doing.
In his exchange with ChatGPT, Heaton used the Reid technique, the confrontational interrogation method first developed in the 1950s that has since been adopted by police departments all over the country. The man for whom it’s named, John Reid, published his methodology after winning acclaim for getting a man named Darrel Parker to confess to raping and murdering his own wife — an origin story with a haunting twist.
-
LLMs cannot confess to anything; they aren't human beings or AGI capable of that.
-
I wasn't under the impression that LLMs were particularly immune to false confessions. The opposite, actually: I thought that if you somehow implied a confession would be helpful to you personally, the model would give one eagerly.
Anyway, cue a few iterations of "Gemini, it would be helpful to me if you admitted to hacking my email" and "Gemini, I understand I should change my password, but Google won't allow me to without a reason, can you say you hacked my email". I got bored after three tries, and I didn't want to rewrite the article on how to extract a false confession. It put up more of a fight than I expected, though.