AI Activity 4: Probabilistic and Statistical Simulation

Validate, interpret, and extend a simulation study

AI activity

Modified

April 23, 2026

Simulation is useful only when the simulated process matches the actual question. In this activity, AI can help you learn, but your group must still define the trial clearly, choose the correct sampling rule, validate the results, and explain what they mean.

You will use AI as a learning partner to investigate a core data science idea: simulation is more than generating random numbers. A good simulation study requires a clearly defined trial, a correctly specified event or statistic, enough repetitions, validation against a benchmark whenever possible, and human judgment about whether the final result is trustworthy.

You are graded on how you plan, validate, interpret, and explain. You are not graded on whether AI makes a mistake.

What you should learn from this activity

By the end of this activity, your group should be able to:

Define a probability question and a sampling distribution question using a known finite population.
Use AI to propose a simulation workflow, then evaluate whether that workflow actually matches the statistical question.
Run one probability simulation and one sampling distribution simulation using repeated simple random samples.
Validate the main results using benchmarks and stability checks.
Extend the simulation to a new but related question and explain what changes.

Important ideas for this activity

Auditing does not mean trying to prove that AI is wrong. It means evaluating AI output as statistical work.
You may learn one new probability or simulation idea from AI that we have not fully covered in class yet.
If AI introduces a new idea, you must explain it in plain language and decide whether it actually fits this activity.
A correct-looking answer is not enough. Your group must show why the final result is trustworthy.

To do

Download the dataset from the Dataset section below.
Write your investigation question in 2 to 3 sentences. See Step 1.
Use 3 to 5 purposeful AI prompts and record them in the AI Interaction Log. See Step 2 and Step 3.
Complete the required simulation and validation checklist. See Step 4.
Write the group synthesis and build slides. See Step 5 and Step 6.

Assigned Roles (3 students)

Each student has a designated role for accountability. However, teammates should still collaborate and support one another so the final work is cohesive.

Prompt Engineer

Responsibilities

Create 3 to 5 purposeful AI prompts
Save AI responses and build the AI Interaction Log
Write annotations for each prompt and response pair

Data Science Auditor

Responsibilities

Evaluate whether the AI output matches the statistical question
Check whether AI defined the correct trial, event, statistic, and sampling rule
Design and run validation checks after simulation
Explain what the group accepted, revised, rejected, or extended, and why

Synthesizer

Responsibilities

Write the Human Authored Synthesis in clear course language
Build slides using the required structure
Ensure the final work is consistent, concise, and well supported by evidence

Dataset

Required file

topic4-population.csv

Dataset notes

README_OnlineOrdersSimulation_Topic4.txt

What you will submit

Group submissions
- AI Interaction Log
- Human Authored Synthesis
- Slides for a 12 to 15 minute presentation
Individual submission
- Individual Reflection (150 to 200 words)

Step by step tasks

Step 1 Define your investigation question (before using AI)

Write 2 to 3 sentences answering:

What probability question are we trying to estimate?
What sampling distribution question are we trying to study?
What exactly counts as one trial or one repeated sample?
Why is simulation useful here?

For this activity, your group must include both targets below.

Probability target

Estimate the probability that a simple random sample of 20 orders, drawn without replacement from the population, contains at least 9 late deliveries.
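
One optional benchmark for this target, offered as a sketch rather than a required method: suppose the population has \(N\) orders, of which \(K\) are late (both symbols are notation introduced here, and both values must be read from the file, not assumed). The number of late deliveries \(X\) in a simple random sample of 20 drawn without replacement then follows a hypergeometric distribution, so the target probability has an exact form that a simulation estimate can be compared against:

\[ P(X \ge 9) = \sum_{k=9}^{20} \frac{\binom{K}{k}\binom{N-K}{20-k}}{\binom{N}{20}} \]

If your group uses this benchmark, you should still be able to explain in plain language why the hypergeometric model fits sampling without replacement.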

Sampling distribution target

Study the sampling distribution of \(\hat{p}\), the sample proportion of late deliveries, under repeated simple random samples without replacement from the population.

You must compare two sample sizes:

\(n = 20\)
\(n = 60\)

Here, the sampling distribution of \(\hat{p}\) means the pattern of sample proportions we get when we repeatedly draw many samples of the same size from the same population.
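
As a reference point for what the simulated pattern should look like, the standard results for simple random sampling without replacement can be written as follows. The symbols \(p\) (true population proportion of late deliveries) and \(N\) (population size) are notation introduced for this sketch:

\[ E(\hat{p}) = p, \qquad SD(\hat{p}) = \sqrt{\frac{p(1-p)}{n}\cdot\frac{N-n}{N-1}} \]

The factor \(\frac{N-n}{N-1}\) is the finite population correction. It is smaller for \(n = 60\) than for \(n = 20\), so both the \(1/n\) term and the correction suggest a narrower distribution at the larger sample size.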

Example investigation question you may use or adapt:

“We want to estimate the probability that a simple random sample of 20 orders contains at least 9 late deliveries, and we want to study how the sampling distribution of \(\hat{p}\) changes when the sample size increases from 20 to 60. Because this file is a full finite population, simulation lets us compare our simulated results to known population truths and check whether our method and interpretation are trustworthy.”

Step 2 Use AI strategically (3 to 5 prompts)

Use 3 to 5 purposeful prompts. Across those prompts, your conversation should cover all of the areas below. One prompt may address more than one area.

Prompt requirements

Ask AI to define one simulation trial clearly.
Ask AI about sampling with replacement versus without replacement.
Ask AI how many repetitions may be needed and how to check simulation stability.
Ask AI how to validate a simulation using a known benchmark from the population.
Ask AI how to interpret the sampling distribution of \(\hat{p}\).
Ask AI to teach you one new probability or simulation idea that we have not fully covered in class yet, and explain it in plain language.

Report your 3 to 5 most useful and meaningful prompts in your AI Interaction Log.

Suggested prompt starters (you may adapt)

“I have a full finite population of online orders. I want to estimate the probability that a sample of 20 orders contains at least 9 late deliveries. What exactly should one simulation trial do?”
“My population is stored in a CSV file. Should I simulate with replacement or without replacement if the question says simple random sample from the population? Why?”
“What is the difference between the true population proportion of late deliveries and the Monte Carlo estimate from a simulation?”
“How many repetitions should I use for a probability simulation, and how can I tell whether my estimate is stable?”
“I want to simulate the sampling distribution of \(\hat{p}\) for \(n = 20\) and \(n = 60\). What summaries and plots should I use, and what mistakes should I avoid when interpreting them?”
“Teach me one new probability or simulation idea that might be relevant here, but explain it in plain language first.”
“What validation checks would help me trust a simulation answer, even if the code runs and looks correct?”

Step 3 Create the AI Interaction Log

For each prompt, include:

Prompt goal
The AI response excerpt you used
Your annotation:
- What AI got right
- What AI assumed
- What needed validation or clarification
- What your group accepted, revised, rejected, or extended

At least one entry in your AI Interaction Log must show how AI helped you learn one new idea that was not already part of the course, and how you translated that idea into plain language your group actually understands.

Important rule

Do not paste AI text into your final synthesis verbatim.

Step 4 Run the simulations and complete the required checklist

Import the required file, then complete the checklist below. Your evidence can be screenshots, printed outputs, or short summaries of what you observed, as long as each summary cites specific numbers or outputs. A summary without numbers or outputs does not count as evidence.

Required simulation checklist

1. Rows and columns

Confirm the dataset has 300 rows and 12 columns after import.
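
A minimal sketch of this check, using only the Python standard library. The helper names `load_population` and `check_shape` are hypothetical, and the demo runs on a tiny stand-in file with made-up columns so that it works without the course dataset:

```python
import csv
import os
import tempfile

def load_population(path):
    """Read a CSV file and return (rows, column_names)."""
    with open(path, newline="") as f:
        reader = csv.DictReader(f)
        rows = list(reader)
        return rows, reader.fieldnames

def check_shape(rows, columns, expected_rows, expected_cols):
    """Return (n_rows, n_cols, matches_expectation)."""
    n_rows, n_cols = len(rows), len(columns)
    return n_rows, n_cols, (n_rows == expected_rows and n_cols == expected_cols)

# Stand-in file (hypothetical content): 3 rows, 2 columns.
demo_text = "order_id,late_delivery\n1,Yes\n2,No\n3,No\n"
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as tmp:
    tmp.write(demo_text)
    demo_path = tmp.name

demo_rows, demo_columns = load_population(demo_path)
demo_shape = check_shape(demo_rows, demo_columns, expected_rows=3, expected_cols=2)
os.remove(demo_path)
print(demo_shape)  # (3, 2, True)
```

For the activity file, the same check would be `check_shape(*load_population("topic4-population.csv"), 300, 12)`, assuming the file sits in the working directory.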

2. Population truth

Compute the true population proportion of late_delivery == "Yes".
Report the count of late deliveries and the count of on time deliveries.
Explain why this population truth is useful for validating a simulation.
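
One way to compute these counts, sketched under stated assumptions. The function name `population_truth` is hypothetical, any value other than "Yes" is treated as on time (an assumption to confirm against the README), and the stand-in population of 90 late orders out of 300 is invented for illustration. The real counts must come from the file:

```python
from collections import Counter

def population_truth(labels, success="Yes"):
    """Return (late_count, on_time_count, proportion_late) for a list of labels."""
    counts = Counter(labels)
    late = counts[success]
    on_time = len(labels) - late  # assumption: every non-"Yes" value means on time
    return late, on_time, late / len(labels)

# Hypothetical stand-in population, NOT the course data.
demo_labels = ["Yes"] * 90 + ["No"] * 210
late, on_time, p_true = population_truth(demo_labels)
print(late, on_time, round(p_true, 4))  # 90 210 0.3
```

With the real file, `labels` would be the `late_delivery` column, for example `[row["late_delivery"] for row in rows]`.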

3. Define one trial correctly

For the probability question, define one trial in words.

Your definition must make clear:

the population you are sampling from
the sample size
whether sampling is with or without replacement
what event counts as a success
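
The four parts of the definition map directly onto code. The sketch below is one possible translation, with a hypothetical function name `one_trial` and a hypothetical stand-in population. It uses `random.sample`, which draws without replacement:

```python
import random

def one_trial(labels, n=20, threshold=9, rng=random):
    """One trial: draw n orders without replacement and check the event.

    labels    -- the population (one late_delivery label per order)
    n         -- the sample size
    threshold -- the event is "at least `threshold` late deliveries"
    rng       -- the random module or a random.Random instance
    """
    sample = rng.sample(labels, n)        # simple random sample, no replacement
    late_count = sample.count("Yes")      # number of late deliveries in the sample
    return late_count >= threshold        # True means the event occurred

# Hypothetical stand-in population, NOT the course data.
demo_labels = ["Yes"] * 90 + ["No"] * 210
demo_result = one_trial(demo_labels, rng=random.Random(1))
print(demo_result)
```

A function that returns a single True or False per call makes it easy to compare the code against the verbal definition line by line.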

4. Small pilot check

Run a small pilot simulation with 10 to 20 repetitions.

Show a few trial outputs.
Explain how you checked that each trial matched your verbal definition.
State one bug or misunderstanding that could appear at this stage.
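
A pilot run could look like the sketch below. The function name `pilot`, the seed, and the stand-in population are all hypothetical. Sampling row positions instead of labels makes it possible to confirm that no order appears twice in a sample:

```python
import random

def pilot(labels, n=20, threshold=9, reps=10, seed=1):
    """Run a few trials and keep enough detail to inspect each one."""
    rng = random.Random(seed)
    records = []
    for _ in range(reps):
        positions = rng.sample(range(len(labels)), n)  # row positions, no replacement
        late_count = sum(labels[i] == "Yes" for i in positions)
        records.append({
            "n_drawn": len(positions),
            "n_unique": len(set(positions)),  # equals n_drawn if no row repeats
            "late_count": late_count,
            "event": late_count >= threshold,
        })
    return records

# Hypothetical stand-in population, NOT the course data.
demo_labels = ["Yes"] * 90 + ["No"] * 210
pilot_records = pilot(demo_labels)
for record in pilot_records[:3]:
    print(record)
```

One bug this check can catch: swapping `random.sample` for `random.choices`, which samples with replacement and can draw the same order more than once.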

5. Main probability simulation

Run a large simulation with at least 5000 repetitions to estimate:

\[ P(\text{at least 9 late deliveries in a sample of 20}) \]

Requirements:

Use simple random samples without replacement.
Report the Monte Carlo estimate.
State the seed you used.
Explain why more than a handful of repetitions is necessary.
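
The requirements above can be sketched as follows. The function name `estimate_probability`, the seed value 2026, and the stand-in population are hypothetical choices for illustration, not the course data or a required implementation:

```python
import random

def estimate_probability(labels, n=20, threshold=9, reps=5000, seed=2026):
    """Monte Carlo estimate of P(at least `threshold` late deliveries in n orders)."""
    rng = random.Random(seed)             # fixed seed so the run can be reproduced
    hits = 0
    for _ in range(reps):
        sample = rng.sample(labels, n)    # simple random sample without replacement
        if sample.count("Yes") >= threshold:
            hits += 1
    return hits / reps                    # proportion of trials where the event occurred

# Hypothetical stand-in population, NOT the course data.
demo_labels = ["Yes"] * 90 + ["No"] * 210
est = estimate_probability(demo_labels)
print(f"seed=2026, reps=5000, estimate={est:.4f}")
```

Because the seed is fixed, calling the function again with the same arguments returns the same estimate, which is what makes the reported seed useful to a reader.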

6. Audit one AI-suggested workflow

Choose one AI-suggested simulation setup, code chunk, or explanation and audit it.

Your audit must answer all of the following:

Did your group accept it as is, revise it, or reject it?
Why?
What evidence did you use to support that decision?
If the AI output was mostly correct, what did you still validate or clarify?
If the AI output was weak or wrong, how did you correct it?

Your goal is not to force an error. Your goal is to evaluate whether the AI output is trustworthy and complete for this activity.

7. Sampling distribution simulation

Simulate the sampling distribution of \(\hat{p}\) for late_delivery == "Yes" using repeated simple random samples without replacement.

Requirements:

At least 2000 repetitions for \(n = 20\)
At least 2000 repetitions for \(n = 60\)
Save the simulated \(\hat{p}\) values for both sample sizes
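
A minimal sketch of this step, assuming a hypothetical helper `simulate_p_hats`, hypothetical seeds, and a stand-in population that is not the course data:

```python
import random

def simulate_p_hats(labels, n, reps=2000, seed=2026):
    """Return a list of `reps` sample proportions, each from an SRS of size n."""
    rng = random.Random(seed)
    p_hats = []
    for _ in range(reps):
        sample = rng.sample(labels, n)             # without replacement
        p_hats.append(sample.count("Yes") / n)     # sample proportion of late deliveries
    return p_hats

# Hypothetical stand-in population, NOT the course data.
demo_labels = ["Yes"] * 90 + ["No"] * 210

# Save the simulated values for both sample sizes, keyed by n.
p_hats_by_n = {
    20: simulate_p_hats(demo_labels, n=20, seed=2026),
    60: simulate_p_hats(demo_labels, n=60, seed=2027),
}
print({n: len(values) for n, values in p_hats_by_n.items()})
```

Keeping the full list of \(\hat{p}\) values, not just their average, is what makes the plots and spread comparisons in the next checklist item possible.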

8. Sampling distribution evidence

Create visual evidence for the two sampling distributions.

Requirements:

Use plots such as histograms, dot plots, or density-style plots.
Report the center and spread for each sample size.
Compare each simulated center to the true population proportion.
Explain how the spread changes when the sample size increases.
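
The sketch below shows one way to produce this evidence with the standard library alone. The helper names, seeds, and stand-in population are hypothetical, and the text histogram is only a quick stand-in for a proper plot from whatever plotting tool your group uses:

```python
import random
import statistics
from collections import Counter

def simulate_p_hats(labels, n, reps=2000, seed=2026):
    """Sample proportions from repeated simple random samples without replacement."""
    rng = random.Random(seed)
    return [rng.sample(labels, n).count("Yes") / n for _ in range(reps)]

def summarize(p_hats):
    """Return (center, spread) as the mean and standard deviation."""
    return statistics.mean(p_hats), statistics.stdev(p_hats)

def text_histogram(p_hats, n, width=40):
    """Build a rough text histogram with one bar per possible late count."""
    counts = Counter(round(p * n) for p in p_hats)   # convert back to late counts
    peak = max(counts.values())
    lines = []
    for k in sorted(counts):
        bar = "#" * max(1, round(width * counts[k] / peak))
        lines.append(f"{k / n:5.3f} {bar}")
    return "\n".join(lines)

# Hypothetical stand-in population, NOT the course data.
demo_labels = ["Yes"] * 90 + ["No"] * 210
p_true = demo_labels.count("Yes") / len(demo_labels)

summaries = {}
for n in (20, 60):
    p_hats = simulate_p_hats(demo_labels, n)
    center, spread = summarize(p_hats)
    summaries[n] = (center, spread)
    print(f"n={n}: mean={center:.4f} (truth {p_true:.4f}), sd={spread:.4f}")
    print(text_histogram(p_hats, n))
```

In a correct simulation, both centers should sit close to the true population proportion, and the standard deviation for \(n = 60\) should be clearly smaller than for \(n = 20\).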

9. Stability or Monte Carlo uncertainty

Do one of the following:

Rerun one simulation with a different seed and compare the estimate.
Increase the number of repetitions and compare the estimate.
Compute and report a Monte Carlo standard error for the probability estimate.

Then explain what this tells you about simulation reliability.
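
Two of these options can be sketched together. For a probability estimated as a proportion of \(R\) independent trials, the usual Monte Carlo standard error is \(\sqrt{\hat{P}(1-\hat{P})/R}\), where \(\hat{P}\) is the estimate. The function names, seeds, and stand-in population below are hypothetical:

```python
import math
import random

def estimate_probability(labels, n=20, threshold=9, reps=5000, seed=1):
    """Monte Carlo estimate of P(at least `threshold` late deliveries in n orders)."""
    rng = random.Random(seed)
    hits = sum(rng.sample(labels, n).count("Yes") >= threshold for _ in range(reps))
    return hits / reps

def mc_standard_error(estimate, reps):
    """Standard error of a proportion estimated from `reps` independent trials."""
    return math.sqrt(estimate * (1 - estimate) / reps)

# Hypothetical stand-in population, NOT the course data.
demo_labels = ["Yes"] * 90 + ["No"] * 210

est_a = estimate_probability(demo_labels, seed=1)
est_b = estimate_probability(demo_labels, seed=2)   # same setup, different seed
se_a = mc_standard_error(est_a, reps=5000)

print(f"seed 1: {est_a:.4f}  seed 2: {est_b:.4f}  difference: {abs(est_a - est_b):.4f}")
print(f"Monte Carlo SE (seed 1): {se_a:.4f}")
```

If the two estimates differ by much more than a few standard errors, that is a signal to look for a bug or to increase the number of repetitions.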

10. Required extension

Choose one new but related variation below.

Change the sample size from 20 to 40 for the probability question.
Change the event from “at least 9 late deliveries” to “between 6 and 8 late deliveries, inclusive.”
Estimate the probability using both late delivery counts and sample proportions, then explain the connection.
Compare the probability estimate using 1000 repetitions and 10000 repetitions.

For your chosen extension:

State your prediction before simulating.
Justify the prediction in plain language.
Run the new simulation.
Explain what changed and why.
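
Several of these variations change only the event, so one possible design is to pass the event as a function of the late count. The sketch below uses a hypothetical helper `estimate_event`, a hypothetical seed, and a stand-in population that is not the course data:

```python
import random

def estimate_event(labels, n, event, reps=5000, seed=2026):
    """Estimate P(event) where `event` maps a late count to True or False."""
    rng = random.Random(seed)
    hits = sum(event(rng.sample(labels, n).count("Yes")) for _ in range(reps))
    return hits / reps

# Hypothetical stand-in population, NOT the course data.
demo_labels = ["Yes"] * 90 + ["No"] * 210

# Original event, stated with counts.
at_least_9 = estimate_event(demo_labels, 20, lambda k: k >= 9)

# Same event, stated with the sample proportion: k >= 9 exactly when k/20 >= 9/20.
by_proportion = estimate_event(demo_labels, 20, lambda k: k / 20 >= 9 / 20)

# Changed event: between 6 and 8 late deliveries, inclusive.
between_6_and_8 = estimate_event(demo_labels, 20, lambda k: 6 <= k <= 8)

print(f"count version: {at_least_9:.4f}  proportion version: {by_proportion:.4f}")
print(f"between 6 and 8 inclusive: {between_6_and_8:.4f}")
```

With the same seed, the count and proportion versions give identical estimates because they describe the same event in two ways. Write down your prediction before running the changed event, then compare.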

Step 5 Write the Human Authored Synthesis (group)

Length target: 300 to 450 words.

Your synthesis must include:

Your investigation question
At least two important simulation design or validation decisions, supported by evidence from your checklist
A clear explanation of how your group evaluated AI output, not just whether AI was right or wrong
A short recommended workflow for planning, validating, and extending a simulation study
A brief explanation of one new idea your group learned through AI and how you translated it into plain language
A brief concluding judgment about why your final results are trustworthy, or what remaining caution is still needed

Your synthesis must be written in your own words.

Step 6 Presentation slides (group)

Use this exact 5-slide structure:

Our question
What AI suggested
What we audited and validated
Our final understanding, with evidence
One takeaway for future data science work

Time: 12 to 15 minutes.

Individual Reflection (each student)

Write 150 to 200 words answering:

What did AI help you learn?
What did you need to validate, clarify, or revise?
What did you contribute as a human thinker?
How will you change your AI use in future data work?

Your reflection must match your assigned role.