
8 · Measure results honestly (yes, even the fails)

You trained your model and tested it on new examples. Now comes the most important scientific habit of all: measure what really happened — honestly — including the times it got things wrong.

Count, don't guess. Test your model on a set of new examples and tally the results:

What to count         | Example
How many you tested   | 20 new photos
How many it got right | 14
How many it got wrong | 6
Your score            | 14 / 20 = 70% right

That score is your data — the real answer to your question. Then compare it to your hypothesis: you guessed "at least 8 out of 10" (80%); you got 70%. So your hypothesis was not supported. And that's a completely valid, real result. In real science, a guess being wrong isn't failure — it's a finding. The Society for Science judges projects on honest, careful work, not on whether the result is impressive.
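If you like to code, the same counting can be double-checked with a few lines of Python. This is only a sketch using the example numbers from this page (20 tested, 14 right, a guess of "at least 8 out of 10"). The function name `score_report` and its parameter names are made up for illustration, so swap in your own numbers.

```python
def score_report(tested, right, guess_right, guess_out_of):
    """Return (wrong, score, supported) for one test run."""
    wrong = tested - right               # every miss is counted, never dropped
    score = right / tested               # fraction correct, e.g. 0.7
    guess = guess_right / guess_out_of   # hypothesis as a fraction, e.g. 0.8
    supported = score >= guess           # "at least" means greater than or equal
    return wrong, score, supported


# Example numbers from this lesson (replace with your own results).
wrong, score, supported = score_report(tested=20, right=14,
                                       guess_right=8, guess_out_of=10)
print(f"Wrong: {wrong}")
print(f"Score: {score:.0%}")
print("Hypothesis supported" if supported else "Hypothesis not supported")
```

Turning both the score and the guess into fractions first makes them easy to compare, even when the guess is "out of 10" and the test used 20 photos.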

When the model fails, get curious — that's science:

  • Look at the misses. Which examples did it get wrong? Are the wrong ones all dark photos? All blurry? All one class? That pattern is a real discovery about your model.
  • Ask why. Maybe one class had fewer examples. Maybe the lighting was different. Maybe two classes look genuinely alike. Each "why" is something you learned.
  • Never fudge the numbers. Do not quietly drop the photos it got wrong to make your score look better. That's the opposite of science, and it breaks the honesty rule this whole course is built on.
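One way to hunt for a pattern in the misses is to keep a short log of every test photo and count the wrong ones by what you noticed about them. The sketch below assumes a made-up log format: the photo names, the notes, and the helper names `count_misses_by_note` and `honest_score` are all invented for illustration, so treat it as a starting point and replace the entries with your own notebook records.

```python
from collections import Counter

# Made-up test log: (photo name, did the model get it right?, what you noticed).
# These six entries are illustration only, not real results.
test_log = [
    ("photo_01", True,  "bright"),
    ("photo_02", False, "dark"),
    ("photo_03", True,  "bright"),
    ("photo_04", False, "dark"),
    ("photo_05", False, "blurry"),
    ("photo_06", True,  "dark"),
]


def count_misses_by_note(log):
    """Count only the wrong answers, grouped by the note you wrote down."""
    return Counter(note for _, correct, note in log if not correct)


def honest_score(log):
    """Return (right, tested) using every photo in the log, misses included."""
    right = sum(1 for _, correct, _ in log if correct)
    return right, len(log)


misses = count_misses_by_note(test_log)
for note, count in misses.most_common():
    print(f"{note}: {count} missed")

right, tested = honest_score(test_log)
print(f"Score: {right} / {tested}")
```

Notice that `honest_score` never filters anything out: the score is always computed from the whole log, which is the "never fudge the numbers" rule written as code.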

The honest-scientist promise: I will report what actually happened — the hits and the misses — because the truth is the result, and the truth is what makes my project worth trusting.

Record your real numbers and your "why did it miss?" notes in your notebook. You'll use them on your poster.

Think about it. Your model scored lower than your hypothesis predicted. Why is reporting that honestly better science than changing the numbers to match your guess?

