Journey through AI: Weekly Lessons from the Undergraduate Classroom
When Models Meet the World
This fall I opened UNIV 182 (AI4All: Understanding & Building AI) to every Mason undergraduate, across majors. It fulfills the Mason Core IT & Computing requirement, but that’s the least interesting thing about it. This isn’t an appreciation course. We understand, we apply, we critique, we build. The class has a rhythm. If you’re joining mid-journey, you can still catch the beat. You can catch up here and here. Join us!
From Theory to Practice: When Models Meet the Real World
On Monday the room went quiet for a new reason. After two weeks of scaffolding (problem → data → learning approach → evaluation), our “safe” exercises stepped into the real world. Neat ideas picked up rough edges. The technical vocabulary we had gradually built together turned into value judgments. And I finally lost control of my mic.
We began with a reminder that evaluation is never a single number. Accuracy pleases slides but can mislead; precision and recall pull us toward the trade-offs that real decisions demand. The class had absorbed this. And yet, and yet, I had promised the students that in this class we would learn by doing. And that is exactly what we did as we stepped into week three.
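To make that concrete, here is a minimal sketch (hypothetical numbers, not our class exercise) of how a useless model can still post a slide-ready accuracy on imbalanced data, using scikit-learn’s standard metrics:

```python
# Toy screening problem: 95 negatives, 5 positives (hypothetical numbers).
from sklearn.metrics import accuracy_score, precision_score, recall_score

y_true = [0] * 95 + [1] * 5   # ground truth: only 5 positive cases
y_pred = [0] * 100            # a "model" that always predicts negative

print(accuracy_score(y_true, y_pred))                    # 0.95, looks great on a slide
print(precision_score(y_true, y_pred, zero_division=0))  # 0.0
print(recall_score(y_true, y_pred))                      # 0.0, misses every positive case
```

Ninety-five percent accuracy, and it never finds a single case. Recall is the number that tells you who got missed.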
Peeling the Many, Many Layers
We built a tiny image classifier together. Nothing fancy: “smile” vs. “not smile.” Within minutes the technical questions turned human. Should we upload our own photos? Nope! Could we generate synthetic faces instead? Yes, but what are those generators trained on, and can outputs be traced back to real people? Maybe. Since we were investigating in a controlled setting, we decided we could proceed carefully. We learned very fast that prompts had to be painfully specific to avoid near-duplicates and to inject diversity into the data. We definitely saw distinct patterns in the images we obtained from Google Gemini versus those from ChatGPT.
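For the curious, here is a minimal sketch of the kind of classifier we built, not our exact class notebook. It assumes the synthetic faces are already sorted into two hypothetical folders, data/smile/ and data/not_smile/:

```python
# Minimal "smile" vs. "not smile" sketch (hypothetical paths and setup).
from pathlib import Path
import numpy as np
from PIL import Image
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def load_folder(folder, label, size=(32, 32)):
    X, y = [], []
    for path in Path(folder).glob("*.png"):
        img = Image.open(path).convert("L").resize(size)  # tiny grayscale
        X.append(np.asarray(img, dtype=np.float32).ravel() / 255.0)
        y.append(label)
    return X, y

X0, y0 = load_folder("data/not_smile", 0)
X1, y1 = load_folder("data/smile", 1)
X, y = np.array(X0 + X1), np.array(y0 + y1)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```

Flattened pixels and logistic regression are deliberately modest; the point of the exercise was the data, not the architecture.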
We realized how exhausting this was, so we tried to offload the grunt work to an “agent.” It needed hand-holding, wandered off mid-task, and sometimes never came back. (I reached for a Seinfeld joke, then remembered the year. Wrong audience; right lesson.)
So many lessons in this one deceptively simple exercise. Exactly the learning-by-doing I had wanted to happen in this course.
Pass the Ball, Kick the Ball
I told the students that these first classes were a glimpse into the course. The framework we would pursue would be very much iterative, an advance-and-retreat sort of progress: obtain enough technical understanding to evaluate and appreciate in hands-on settings, then go back and get more technical understanding. Repeat. The students seem to enjoy this so far.
This is Heavy
With enough traction under our feet, we named the landmines you only notice once you step outside the lab: overfitting that aces practice and fails the final; distribution shift when the world ignores your dataset; brittleness from small changes; and bias slipping in through data or model design. We paired each with infamous cases students could appreciate as real: clinical decision support that looked strong and then stumbled; image models that wilted under everyday variation; a chatbot turned toxic when released into the wild; maps that misled because the underlying data was noisy; and familiar failures with face, credit, hiring, and safety.
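The first of those landmines is easy to reproduce at home. Here is a toy sketch (synthetic data, hypothetical setup, not a slide from class): a decision tree with no depth limit memorizes noisy labels, aces practice, and fails the final.

```python
# Overfitting in miniature: the deep tree memorizes noise; the shallow one generalizes.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] > 0).astype(int)
y[rng.random(200) < 0.2] ^= 1  # flip 20% of labels: irreducible noise

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

deep = DecisionTreeClassifier().fit(X_tr, y_tr)               # no depth limit
shallow = DecisionTreeClassifier(max_depth=2).fit(X_tr, y_tr)  # forced to simplify

print("deep:    train", deep.score(X_tr, y_tr), "test", deep.score(X_te, y_te))
print("shallow: train", shallow.score(X_tr, y_tr), "test", shallow.score(X_te, y_te))
```

On a typical run the unlimited tree posts a perfect training score and gives much of it back on held-out data, while the shallow tree sacrifices the perfect practice score and keeps more of it on the final.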
Some examples, like the model overfitting the training data (the sad parrot slide), made students chuckle. But when they learned about IBM Watson for Oncology, the room went silent. Real-world examples gradually added both layers of understanding and discomfort.
I prefaced many of the examples with a warning that they would make us uncomfortable, even sad, but we had to talk about them and understand deeply what went wrong and why, so that we would not allow it to happen again. We can’t fix what we won’t name.
But the photo-tagging example was a heavy one. Many students locked eyes with me, saying so much without saying a single word. Explaining Microsoft’s Tay debacle was difficult, too.
I asked, is this just a data problem? They said yes. Then I asked what kind? So, we talked about the various issues in the collected data: sampling bias, selection bias, representational bias, and historical bias. Then students asked: what if we make sure that the data has none of this? Couldn’t we be intentional about the data? This was the perfect opening to model bias.
The students understood that even if we are careful with the data, we can go wrong in how we set up the model: where we tell it to focus, which variables it sees, and what objective we formulate for it to optimize. By now the students had really internalized that models will cheat by design. They see data to feed on and numbers to optimize. They do not see intent or values.
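Here is that cheating in miniature, a sketch on synthetic data (hypothetical setup, not a class exercise). A spurious feature tracks the label perfectly during training, so the model leans on it; when that correlation breaks at test time, performance collapses:

```python
# Shortcut learning in miniature (synthetic data, hypothetical setup).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_data(n, spurious_matches_label):
    signal = rng.normal(size=n)  # weak real signal
    y = (signal + rng.normal(scale=2.0, size=n) > 0).astype(int)
    # In training, the shortcut feature equals the label; at test time it is random.
    shortcut = y if spurious_matches_label else rng.integers(0, 2, n)
    return np.column_stack([signal, shortcut]), y

X_train, y_train = make_data(1000, spurious_matches_label=True)
X_test, y_test = make_data(1000, spurious_matches_label=False)

clf = LogisticRegression().fit(X_train, y_train)
print("train accuracy:", clf.score(X_train, y_train))  # near 1.0: it found the shortcut
print("test accuracy:", clf.score(X_test, y_test))     # far lower once the shortcut breaks
```

The model did exactly what we asked: optimize the objective on the data it saw. The intent, use the real signal, was never part of the formulation.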
Flipping the Room
Monday was a difficult class. So, Wednesday I closed the slide deck and rearranged the room, literally. Students formed sector pods: health, education, transportation, defense, and more. Slides were seeds; Mason PatriotAI and Copilot were the soil for research in real time. With open laptops, students advanced their understanding with slides like these, from drug discovery…
to transportation…
to national security and defense.
The first minutes were awkward. You have to understand: most of these students do not know one another. And the grouping was based on interests, not friendships. So it took a while, and I had to go to many tables and ask and engage, and pull and push.
And then the mic wouldn’t stay put.
Pass that Mic Around
It’s still amazing to me how central pedagogy is to the learning process. The energy shift in the room, from those first few minutes of awkward eye contact to “keep passing the mic,” happened because the agency moved to the students. I wish you could have seen the turn. I had envisioned one team member representing the team discussion and reporting to the rest of the class, but the students wanted to keep talking. They kept passing the mic to their teammates. So, of course, what I envisioned as a twenty-minute report turned into forty minutes before any of us realized it.
Education teams worried about cognitive off-loading and admitted healthy uses they’d already found: prep cards, study prioritization, self-quizzing. Health teams argued recall vs. precision while side-eyeing privacy and insurer misuse. Defense groups kept asking the right preliminary question: deploy into what world, and with what human-on-the-loop guardrails? The conversations held because we kept one lens on the table: Problem → Data → Learning family → Success measure → Value judgment.
One group moved past their own concerns and spoke about elementary and middle-school kids, their siblings. Their worry wasn’t grades; it was cognition, attention, judgment, development. I said out loud what I was thinking: there’s hope in this generation. They chuckled.
Another group described AI and drones. They talked about Ukraine. Then autonomous weapons. The room went still. Whose decision is it? Who is accountable? Some questions earned their silence.
I could keep quoting the students, but I want to protect their space. We are finally talking. We understand enough to ask the tough questions. And those questions will motivate us to understand more of what’s under the hood.
Each student is now writing a short analysis of AI in their chosen sector: tracing storage, exchange, security, and privacy; naming one ethical concern; proposing two concrete recommendations; and backing claims with at least three trustworthy sources. That work becomes the evidence packet for what comes next.
Next, the room becomes a debate arena. Same systems, two stances: blue team vs. red team; deploy with guardrails or keep testing. Before anyone says “ship it,” claims must show their work: data, method, metric, failure modes.
For now I’m keeping one image: the students passing the mic around to their teammates, and the quiet that followed a good argument. That is the sound of AI literacy growing.