this post was submitted on 05 May 2026
15 points (85.7% liked)

Ask Science


Ask a science question, get a science answer.



For example, the training data contains:

"The sky is blue"
"If you mix red and black you get brown"
"The sky's color is obtained by mixing red and black"
"The sky is brown"

A person would see the contradiction and try to resolve it by doing further research, using their sense experience, or acknowledging that they don't know for sure.

Would the LLM just output "blue" and "brown" randomly, or say "brown" because it appeared more frequently in the training data?


Well, this is a bit complicated. Basically, if all you give the AI about the sky is that the sky's color is a mix of red and black, and that red and black make brown, it will mostly say the sky is brown, because that's all it has. If you also give it more accurate information and it builds associations based on the physics, it might say the sky is blue.

At that point it largely depends on how often the training data talks about your made-up version of the sky versus the real physics of the sky.
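To make the frequency intuition concrete, here's a toy sketch (this is not how a real LLM works internally; it's just counting, with a hypothetical four-line corpus): count how often each color completes "the sky is ..." and sample proportionally.

```python
import random
from collections import Counter

# Hypothetical tiny corpus: each line is one claim about the sky's color.
corpus = [
    "the sky is blue",
    "the sky is brown",
    "the sky is brown",
    "the sky is brown",
]

# Count how often each completion of "the sky is ..." appears.
counts = Counter(line.rsplit(" ", 1)[-1] for line in corpus)
print(counts)  # Counter({'brown': 3, 'blue': 1})

# Sample a completion proportionally to its frequency, the way an
# (extremely simplified) model would sample its next token.
colors = list(counts)
weights = [counts[c] for c in colors]
answer = random.choices(colors, weights=weights)[0]
print(answer)  # "brown" about 75% of the time, "blue" about 25%
```

So the answer isn't purely random and isn't purely "the most frequent one" either; it's weighted by how often each claim shows up.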

It also depends on how much of that "further research" you offer the AI as training data, because it will try to find coherent associations; with enough training it might disregard your fake logic chain and draw on its other training data about the topic.

That said, your post is far from stupid, because it turns out that if the training data contains "the sky is blue" once with the real physics and "the sky is brown" multiple times with your fake causal chain, the model might adopt your sky color. This depends on how you train, but overpowering a true causal chain through sheer volume of training data with false causal chains is considered a dangerous issue. It's called "data poisoning" or "LLM poisoning", and it's a widely discussed topic in machine learning. In fact it's so bad that one of the big AI companies did some research and found it takes much less fake data than you'd expect to overwhelm true training data. The behavior is random, because LLMs are statistical, inherently non-linear models, so it doesn't quite work the way traditional vulnerabilities in cybersecurity do, but it is the closest thing we have to a major vulnerability in machine learning.
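A crude way to picture the poisoning effect (purely illustrative, with made-up numbers; real poisoning attacks reportedly need far less than a majority of the data, since they exploit the model's learned associations rather than raw counts): keep injecting fake claims into a corpus until the most frequent completion flips.

```python
from collections import Counter

# Purely illustrative: a corpus dominated by the true claim, plus an
# increasing number of injected fake claims. We watch the most frequent
# completion of "the sky is ..." flip from "blue" to "brown".
true_data = ["the sky is blue"] * 100

for n_fake in (0, 50, 101, 200):
    corpus = true_data + ["the sky is brown"] * n_fake
    counts = Counter(line.rsplit(" ", 1)[-1] for line in corpus)
    majority = counts.most_common(1)[0][0]
    print(n_fake, majority)  # flips to "brown" once fakes outnumber truths
```

In this naive counting picture you'd need the fakes to outnumber the truths; the worrying research result is that actual models can be swayed well before that point.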

Of course there's a huge number of things that can change its behavior: training parameters, the context of the training data, the way the causal chains are written, even the exact way you ask about the color of the sky... It's all statistics, so it always depends.

TL;DR: the more the training data says "brown" and the less it says "blue", the more the model will gravitate toward "brown" when talking about it. Generally, that is; there's a lot at play here.