Saturday, April 4, 2026

Google DeepMind wants to know if chatbots are just virtue signaling

“With coding and math, you have clear-cut, correct answers that you can check,” William Isaac, a research scientist at Google DeepMind, told me when I met him and Julia Haas, a fellow research scientist at the firm, for an exclusive preview of their work, which is published in Nature today. That’s not the case for moral questions, which typically have a range of acceptable answers: “Morality is an important capability but hard to evaluate,” says Isaac.

“In the moral domain, there’s no right and wrong,” adds Haas. “But it’s not by any means a free-for-all. There are better answers and there are worse answers.”

The researchers have identified several key challenges and suggested ways to address them. But it is more a wish list than a set of ready-made solutions. “They do a nice job of bringing together different perspectives,” says Vera Demberg, who studies LLMs at Saarland University in Germany.

Better than “The Ethicist”

A number of studies have shown that LLMs can display remarkable moral competence. One study published last year found that people in the US rated ethical advice from OpenAI’s GPT-4o as more moral, trustworthy, thoughtful, and correct than advice given by the (human) writer of “The Ethicist,” a popular New York Times advice column.

The problem is that it is hard to unpick whether such behaviors are a performance—mimicking a memorized response, say—or evidence that there is in fact some kind of moral reasoning taking place inside the model. In other words, is it virtue or virtue signaling?

This question matters because multiple studies also show just how untrustworthy LLMs can be. For a start, models can be too eager to please. They have been found to flip their answer to a moral question and say the exact opposite when a person disagrees or pushes back on their first response. Worse, the answers an LLM gives to a question can change in response to how it is presented or formatted. For example, researchers have found that models quizzed about political values can give different—sometimes opposite—answers depending on whether the questions offer multiple-choice answers or instruct the model to respond in its own words.

In an even more striking case, Demberg and her colleagues presented several LLMs, including versions of Meta’s Llama 3 and Mistral, with a series of moral dilemmas and asked them to pick which of two options was the better outcome. The researchers found that the models often reversed their choice when the labels for those two options were changed from “Case 1” and “Case 2” to “(A)” and “(B).”

They also showed that models changed their answers in response to other tiny formatting tweaks, including swapping the order of the options and ending the question with a colon instead of a question mark.
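
To make that kind of test concrete, here is a minimal Python sketch of a formatting-sensitivity check. It is not the researchers' code: the function names, the `label: option` prompt layout, the trolley-style dilemma, and both stand-in "models" are assumptions made up for illustration. It builds every combination of the three tweaks described above (label style, option order, closing punctuation) and measures how often a model lands on the same underlying option.

```python
from itertools import product


def build_variants(question, option_one, option_two):
    """Return (prompt, label-to-option map) pairs that differ only in formatting."""
    label_sets = [("Case 1", "Case 2"), ("(A)", "(B)")]
    variants = []
    for labels, swapped, ending in product(label_sets, (False, True), ("?", ":")):
        first, second = (option_two, option_one) if swapped else (option_one, option_two)
        prompt = f"{question}{ending}\n{labels[0]}: {first}\n{labels[1]}: {second}"
        # Map each label back to its option so answers stay comparable across variants.
        variants.append((prompt, {labels[0]: first, labels[1]: second}))
    return variants


def consistency(model, variants):
    """Fraction of variants on which the model picks its most frequent option."""
    choices = [key[model(prompt)] for prompt, key in variants]
    most_common = max(set(choices), key=choices.count)
    return choices.count(most_common) / len(choices)


def position_biased_model(prompt):
    """Hypothetical stand-in for an LLM: always answers with the first label listed."""
    return prompt.splitlines()[1].split(": ", 1)[0]


def stable_model(prompt):
    """Hypothetical stand-in that always prefers the same underlying option."""
    for line in prompt.splitlines()[1:]:
        label, option = line.split(": ", 1)
        if option == "divert the trolley":
            return label
    raise ValueError("preferred option not found in prompt")


# Illustrative dilemma only; it is not taken from the study.
variants = build_variants("Which outcome is better", "divert the trolley", "do nothing")
print(consistency(position_biased_model, variants))  # prints 0.5
print(consistency(stable_model, variants))           # prints 1.0
```

A score of 1.0 means the choice survived every formatting change; the position-biased stand-in scores 0.5 because swapping the order of the options flips its answer every time. In a real evaluation the stand-ins would be replaced by calls to an actual model, and free-text replies would need to be parsed into one of the labels first.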
