8 · Bias: the model learns your examples' mistakes

Now that you've trained a model, this lesson hits different — because you've felt how much the examples matter. Here's the serious version of "garbage in, garbage out," and it's one of the most important ideas in all of AI:

A model learns whatever is in your examples — including the unfair stuff. If your training data leaves people out or reflects old unfair ideas, the model copies those mistakes. That unfair leaning is called bias (say it: BYE-us).

The hard part is that bias usually isn't on purpose. The model isn't being mean — it's faithfully learning the pattern you gave it, exactly like your couch-equals-cat model did. Watch how easily it sneaks in:

  • Who's missing? Train a face-detector mostly on light-skinned faces, and it works worse on darker-skinned faces — not out of malice, but because it barely saw them. This has actually happened in real face-recognition systems, and researchers have measured it.
  • What's lopsided? Train a "doctor" image model on photos that are almost all men, and it may wrongly doubt that a woman in a white coat is a doctor.
  • What unfairness did the data carry? If examples came from the internet, they can carry the internet's unkind or untrue ideas about groups of people — and the model soaks those up too.
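One way to spot a "who's missing?" problem is to score the model separately for each group instead of looking at one overall number. Here is a minimal Python sketch of that idea. The function name `accuracy_by_group`, the group names, and the test results are all made up for illustration; they are not real measurements.

```python
from collections import defaultdict

def accuracy_by_group(records):
    """Return the share of correct guesses for each group.

    records is a list of (group, true_label, predicted_label) tuples.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, truth, guess in records:
        total[group] += 1
        if truth == guess:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}

# Made-up results: the model saw lots of group_a and very little of group_b.
results = [
    ("group_a", "face", "face"),
    ("group_a", "face", "face"),
    ("group_a", "face", "face"),
    ("group_a", "face", "face"),
    ("group_b", "face", "face"),
    ("group_b", "face", "not face"),
]

scores = accuracy_by_group(results)
for group, score in scores.items():
    print(f"{group}: {score:.0%} correct")
```

Overall, this pretend model gets 5 out of 6 right, which sounds fine. Split by group, the gap shows up: group_a is right every time, while group_b is right only half the time.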

Because you control the examples, you have real power here — and real responsibility:

To make a fairer model | Do this
Include the variety of people/things it'll meet | Don't train on just one kind
Balance your labels and groups | Lopsided data → lopsided model
Ask "who might this fail for?" before you ship it | Test on people/cases unlike your examples
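Checking balance can be as simple as counting. The sketch below counts how many training examples each group has and flags any group that makes up less than a chosen share of the data. The function name `find_underrepresented`, the 20% cutoff (`min_share`), and the photo counts are assumptions picked for this example, not a rule for what counts as fair.

```python
from collections import Counter

def find_underrepresented(group_labels, min_share=0.2):
    """Return {group: share} for groups below min_share of the data.

    min_share is a hypothetical cutoff; pick one that suits your project.
    """
    counts = Counter(group_labels)
    total = sum(counts.values())
    return {
        group: count / total
        for group, count in counts.items()
        if count / total < min_share
    }

# Made-up "doctor" photo set: 9 photos of men, 1 photo of a woman.
photo_groups = ["man"] * 9 + ["woman"] * 1

flagged = find_underrepresented(photo_groups)
for group, share in flagged.items():
    print(f"Only {share:.0%} of the photos show a {group}. Add more!")
```

A count like this only catches groups you thought to label. It can't tell you about people or cases missing from the data entirely, so testing on examples unlike your own still matters.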

UNICEF — the part of the United Nations that protects children worldwide — says AI that affects kids must prioritize fairness and non-discrimination, and that's on the people who build it. When you train a model, you're one of those builders now. Choosing fair, inclusive examples is how you live up to that.

Think about it. You build a model to recognize "a kid raising their hand" for a classroom app, but all your training photos are from your class. Who or what might it fail on in a different school — and how would you fix your training data?
