1 · How AI makes images, music & text
You type a sentence and a few seconds later there's a painting, a song, or a finished paragraph. It feels like magic. It isn't — and understanding what's actually happening underneath is what separates someone who uses creative AI well from someone who just gets surprised by it.
The tools that make new pictures, sounds, or words are called generative AI ("generative" just means it generates — it makes new stuff, instead of only sorting or labeling things). Underneath, they all run on the same basic idea: learn patterns from a huge pile of examples, then produce something new that fits those patterns.
- Text tools (chatbots) read enormous amounts of human writing and learn which words tend to follow which. When you prompt one, it predicts likely next words, over and over, until it has a reply. It's a very sophisticated next-word predictor.
- Image tools learn from huge collections of pictures paired with descriptions. They learn what "a golden retriever" or "a watercolor sunset" tends to look like, then build a new image that matches your words — usually by starting from visual noise and refining it step by step.
- Music/audio tools learn patterns in sound — melody, rhythm, instruments, style — from large collections of recordings, then generate new audio that fits the style you ask for.
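To make "learn patterns, then predict the next word" concrete, here is a toy sketch in Python. It is nowhere near a real chatbot, which uses a large neural network trained on vastly more text. The three training sentences and the names `TRAINING_TEXT`, `learn_patterns`, and `generate` are all made up for this illustration. The two steps are the same in spirit, though: count which words follow which in human-written examples, then build new text one likely word at a time.

```python
import random
from collections import Counter, defaultdict

# Tiny made-up "training data" (an assumption, for illustration only).
TRAINING_TEXT = (
    "the dog chased the ball . "
    "the dog chased the cat . "
    "the cat chased the mouse . "
)


def learn_patterns(text):
    """For every word, count which words came right after it."""
    words = text.split()
    follows = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        follows[current_word][next_word] += 1
    return follows


def generate(follows, start_word, max_words=8, seed=0):
    """Build a reply by repeatedly picking a likely next word."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same text
    output = [start_word]
    for _ in range(max_words - 1):
        options = follows.get(output[-1])
        if not options:
            break  # this word never had a follower in the training text
        candidates = list(options.keys())
        weights = list(options.values())  # more frequent followers are more likely
        output.append(rng.choices(candidates, weights=weights, k=1)[0])
    return " ".join(output)


follows = learn_patterns(TRAINING_TEXT)
print(follows["dog"].most_common(1))  # [('chased', 2)]
print(generate(follows, "the"))
```

Every pair of neighboring words in the output already appeared somewhere in the training sentences. The toy model can produce a sentence nobody wrote, but only by reusing patterns from sentences somebody did write.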
Notice the thread running through all three: every one of them learned from work that humans made. The paintings, songs, photos, and writing in that training data came from real artists, musicians, photographers, and writers. Hold onto that fact — it's the root of almost every ethics question later in this course.
| One thing to remember per type | Text | Images | Music |
|---|---|---|---|
| What it learned from | mountains of human writing | images + their descriptions | recordings + their styles |
| What it's really doing | predicting likely next words | building an image to match words | generating audio to match a style |
| What it is not doing | "knowing" facts | "drawing from imagination" | "feeling" the music |
Plain-words summary: generative AI is a pattern machine. It's not imagining, feeling, or understanding — it's recombining patterns it learned from human-made work into something new that fits your request.
Think about it. All three tools "learned from examples." In one sentence, where did all those examples actually come from — and why might that matter to the people who made them?
Sources
- Common Sense Education. How Is AI Trained? (lesson, grades 6–12). https://www.commonsense.org/education/digital-citizenship/lesson/how-is-ai-trained
- MIT RAISE. Day of AI — free K–12 AI literacy curriculum. https://raise.mit.edu/