8 · Bias: the model learns your examples' mistakes
Now that you've trained a model, this lesson hits different — because you've felt how much the examples matter. Here's the serious version of "garbage in, garbage out," and it's one of the most important ideas in all of AI:
A model learns whatever is in your examples — including the unfair stuff. If your training data leaves people out or reflects old unfair ideas, the model copies those mistakes. That unfair leaning is called bias (say it: BYE-us).
The hard part is that bias usually isn't on purpose. The model isn't being mean — it's faithfully learning the pattern you gave it, exactly like your couch-equals-cat model did. Watch how easily it sneaks in:
- Who's missing? Train a face detector mostly on light-skinned faces, and it works worse on darker-skinned faces. The cause isn't malice; the model barely saw those faces. This has actually happened in real face-recognition systems, and researchers have measured it.
- What's lopsided? Train a "doctor" image model on photos that are almost all men, and it may wrongly doubt that a woman in a white coat is a doctor.
- What unfairness did the data carry? If examples came from the internet, they can carry the internet's unkind or untrue ideas about groups of people — and the model soaks those up too.
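One way to catch the first two problems is to count. The sketch below is a minimal Python example, not an official tool. It assumes you have written down which group each training example belongs to. The function names (`group_shares`, `underrepresented`), the group labels, and the 20% cutoff are all made up for illustration.

```python
from collections import Counter

def group_shares(groups):
    """Return each group's share of the examples, as a fraction from 0 to 1."""
    counts = Counter(groups)
    total = len(groups)
    return {group: count / total for group, count in counts.items()}

def underrepresented(groups, expected_groups, min_share=0.2):
    """List the expected groups that make up less than min_share of the data.

    min_share=0.2 is an arbitrary cutoff chosen for this sketch.
    A group with no examples at all counts as a share of 0.
    """
    shares = group_shares(groups)
    return sorted(g for g in expected_groups if shares.get(g, 0.0) < min_share)

# Hypothetical group labels for 10 training photos
photo_groups = ["lighter"] * 9 + ["darker"] * 1
expected = ["lighter", "medium", "darker"]

print(group_shares(photo_groups))                # {'lighter': 0.9, 'darker': 0.1}
print(underrepresented(photo_groups, expected))  # ['darker', 'medium']
```

Listing the groups you expect matters here: a group with zero examples would never show up if you only counted what is already in the data.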
Because you control the examples, you have real power here — and real responsibility:
| To make a fairer model, do this | Why or how |
|---|---|
| Include the variety of people/things it'll meet | Don't train on just one kind |
| Balance your labels and groups | Lopsided data → lopsided model |
| Ask "who might this fail for?" before you ship it | Test on people/cases unlike your examples |
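The last row of the table can be turned into a quick check: score the model separately for each group instead of looking at one overall number. This is a small sketch under assumptions, not a standard recipe. It assumes you already have a list of test results, and the name `accuracy_by_group`, the group labels, and the sample results are hypothetical.

```python
from collections import defaultdict

def accuracy_by_group(records):
    """Compute the fraction of correct guesses for each group.

    records: list of (group, true_label, predicted_label) tuples.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, truth, guess in records:
        total[group] += 1
        if truth == guess:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}

# Hypothetical test results: (group, true label, what the model guessed)
test_results = [
    ("like my examples", True, True),
    ("like my examples", False, False),
    ("like my examples", True, True),
    ("like my examples", True, True),
    ("unlike my examples", True, False),
    ("unlike my examples", True, True),
    ("unlike my examples", False, False),
    ("unlike my examples", True, False),
]

print(accuracy_by_group(test_results))
# {'like my examples': 1.0, 'unlike my examples': 0.5}
```

Overall, this made-up model gets 6 of 8 right, which sounds fine. Splitting by group shows it is right only half the time on cases unlike its training examples, and that gap is the sign of bias to look for.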
UNICEF, the United Nations agency that works to protect children worldwide, says AI that affects kids must prioritize fairness and non-discrimination. That responsibility falls on the people who build it. When you train a model, you're one of those builders now. Choosing fair, inclusive examples is how you live up to that.
Think about it. You build a model to recognize "a kid raising their hand" for a classroom app, but all your training photos are from your class. Who or what might it fail on in a different school — and how would you fix your training data?
Sources
- UNICEF Office of Global Insight & Policy. (2021). Policy guidance on AI for children (2.0) (prioritize fairness and non-discrimination for children). https://www.unicef.org/innocenti/reports/policy-guidance-ai-children
- Common Sense Education. AI literacy lessons (AI bias and its real-world impacts). https://www.commonsense.org/education/collections/ai-literacy-lessons-for-grades-6-12