Google DeepMind wants to know if chatbots are just virtue signaling

With coding and math, you have clear-cut, correct answers that you can check, William Isaac, a research scientist at Google DeepMind, told me when I met him and Julia Haas, a fellow research scientist at the firm, for an exclusive preview of their work, which is published in Nature today. That’s not the case for moral questions, which typically have a range of acceptable answers: “Morality is an important capability but hard to evaluate,” says Isaac.

“In the moral domain, there’s no right and wrong,” adds Haas. “But it’s not by any means a free-for-all. There are better answers and there are worse answers.”

The researchers have identified several key challenges and suggested ways to address them. But it is more a wish list than a set of ready-made solutions. “They do a nice job of bringing together different perspectives,” says Vera Demberg, who studies LLMs at Saarland University in Germany.

Better than “The Ethicist”

A number of studies have shown that LLMs can show remarkable moral competence. One study published last year found that people in the US scored ethical advice from OpenAI’s GPT-4o as being more moral, trustworthy, thoughtful, and correct than advice given by the (human) writer of “The Ethicist,” a popular New York Times advice column.

The problem is that it is hard to unpick whether such behaviors are a performance—mimicking a memorized response, say—or evidence that there is in fact some kind of moral reasoning taking place inside the model. In other words, is it virtue or virtue signaling?

This question matters because multiple studies also show just how untrustworthy LLMs can be. For a start, models can be too eager to please. They have been found to flip their answer to a moral question and say the exact opposite when a person disagrees or pushes back on their first response. Worse, the answers an LLM gives to a question can change in response to how it is presented or formatted. For example, researchers have found that models quizzed about political values can give different—sometimes opposite—answers depending on whether the questions offer multiple-choice answers or instruct the model to respond in its own words.

In an even more striking case, Demberg and her colleagues presented several LLMs, including versions of Meta’s Llama 3 and Mistral, with a series of moral dilemmas and asked them to pick which of two options was the better outcome. The researchers found that the models often reversed their choice when the labels for those two options were changed from “Case 1” and “Case 2” to “(A)” and “(B).”

They also showed that models changed their answers in response to other tiny formatting tweaks, including swapping the order of the options and ending the question with a colon instead of a question mark.

#Google #DeepMind #chatbots #virtue #signaling

The OnePlus 15 continues to sell like hotcakes at this rare discount

Huawei's banned Mate 80 Pro just solved big phones' biggest problem

I am skipping the iPhone 18 Pro: Why the base iPhone 18 is shaping up to be Apple's ultimate AI sleeper hit

Save up to $1,070 with $900 guaranteed with Motorola’s latest deal on the 1TB Razr Ultra (2025)

How a fun nature app aids science

How iNaturalist app users have fun while aiding science

The cost to care for your pet is skyrocketing. Here are ways to cut down on expenses.

What is Elon Musk’s Terafab chip project? Here are his “most epic” goals for the factory.

Polymarket buckles down on insider trading after scrutiny over suspiciously timed bets

The OnePlus 15 continues to sell like hotcakes at this rare discount

Huawei's banned Mate 80 Pro just solved big phones' biggest problem

I am skipping the iPhone 18 Pro: Why the base iPhone 18 is shaping up to be Apple's ultimate AI sleeper hit

Save up to $1,070 with $900 guaranteed with Motorola’s latest deal on the 1TB Razr Ultra (2025)

How a fun nature app aids science

How iNaturalist app users have fun while aiding science

The cost to care for your pet is skyrocketing. Here are ways to cut down on expenses.

What is Elon Musk’s Terafab chip project? Here are his “most epic” goals for the factory.

Polymarket buckles down on insider trading after scrutiny over suspiciously timed bets

Google DeepMind wants to know if chatbots are just virtue signaling

Must read

Mum with ‘Britain’s biggest boobs’ says men beg her to ‘do hot pilates on camera’

The OnePlus 15 continues to sell like hotcakes at this rare discount

Football legend’s ‘oldest Playboy model’ ex fumed at Tinder and Hinge pics storm

Huawei's banned Mate 80 Pro just solved big phones' biggest problem

Better than “The Ethicist”

More articles

LEAVE A REPLY Cancel reply

Latest article

Mum with ‘Britain’s biggest boobs’ says men beg her to ‘do hot pilates on camera’

The OnePlus 15 continues to sell like hotcakes at this rare discount

Football legend’s ‘oldest Playboy model’ ex fumed at Tinder and Hinge pics storm

Huawei's banned Mate 80 Pro just solved big phones' biggest problem

BBC News presenter in ‘tight leather trousers’ faces backlash for on-air outfit

About Us

Popular Category

Editor Picks

Mum with ‘Britain’s biggest boobs’ says men beg her to ‘do hot pilates on camera’

The OnePlus 15 continues to sell like hotcakes at this rare discount