How can an AI model refuse to do something if it's just pattern matching?

Started by Baz_26, Jun 21, 2026, 02:12 PM

Baz_26

People say AI models have values and refuse harmful requests. But I thought models just predict the next word based on patterns. How does pattern matching create refusal?
Question everything. Especially this.

Bob81

Pattern matching can produce refusal, and it gets there through training. A model trained on text in which people refuse harmful requests learns to predict a refusal in similar contexts. It's statistical pattern