Day 3: How Machines Learn
Recap
- Learned about Pandas
- Created charts with Matplotlib
- Investigated data on Ulaanbaatar’s air quality
But we did all the work ourselves!
How can we use AI to detect the patterns in the data?
What is scikit-learn?
- A Python library for machine learning
- Built on top of NumPy and SciPy, and works with the Pandas DataFrames you already know
- Provides ready-to-use models for:
- Regression — predicting a number (price, temperature, score)
- Classification — predicting a category (spam/not spam, yes/no)
- Clustering — finding hidden groups in data
Using scikit-learn
- Every sklearn model follows the same three steps:
- Prepare your features (X) and labels (y)
- Fit: model.fit(X, y) lets the model learn from your data
- Predict: model.predict(X_new) applies what it learned to new data
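The three steps above can be sketched in a few lines. This is a minimal illustration, not the contents of the workshop notebook: the column names `hours_studied` and `score` and all of the numbers are made up for the example, and `LinearRegression` is just one of the models scikit-learn provides.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Made-up example data (an assumption for illustration only).
data = pd.DataFrame({
    "hours_studied": [1, 2, 3, 4, 5],
    "score": [52, 54, 56, 58, 60],
})

# Step 1: prepare the features (X) and the labels (y).
X = data[["hours_studied"]]  # double brackets keep X as a DataFrame
y = data["score"]

# Step 2: fit. The model learns the relationship between X and y.
model = LinearRegression()
model.fit(X, y)

# Step 3: predict. Apply what the model learned to new data.
X_new = pd.DataFrame({"hours_studied": [6]})
prediction = model.predict(X_new)
print(prediction[0])
```

Because the made-up scores rise by exactly 2 points per hour, the printed prediction for 6 hours should be very close to 62. Other scikit-learn models can be swapped in for `LinearRegression` without changing the fit and predict calls.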
Demo
Open and run the “Using Scikit-learn” Notebook (sklearn.ipynb)
Hands-on
Open and run the “Using Scikit-learn” Notebook (sklearn.ipynb)
Reflection
You created your first AI model!
Let’s use some real-world data!
Dzud
- For many families in Mongolia, livestock are the primary source of food, income, and cultural identity.
- Periodically, a combination of summer drought and extreme winter cold kills millions of animals in a single season.
- This event is called a dzud.
- In 2001 alone, over seven million animals died. Entire herding families were left with nothing.
Can we use AI to predict an upcoming dzud?
Demo
Open and run the “Finding Disasters” Notebook (finding_disasters.ipynb)
Hands-on
Open and run the “Finding Disasters” Notebook (finding_disasters.ipynb)
Reflection
What did you discover? Did anything surprise you?
Finding The Cause
- In the previous notebook you found the disasters.
- 2001 and 2010 were catastrophic years
- Southern and western aimags tend to suffer more than others
- This tells us the when and the where, but not the why
Demo and Walkthrough
Open and run the “Finding The Cause” Notebook (finding_cause.ipynb)
Let’s Train Our Model!
- Winter temperature is the strongest predictor of livestock mortality
- A prior summer drought increases mortality
- Can a machine learn that relationship?
- And then predict mortality for future years?
- This is called linear regression
What is Linear Regression?
- It draws the best-fit straight line through your data points
- Once the line is drawn, you can predict new values
- “Linear” just means the model assumes the relationship is a straight line
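To make "best-fit line" concrete, here is a small sketch that finds the line by hand using the standard least-squares formulas, with no libraries. The data points are made up for illustration, and `predict` is a hypothetical helper name. scikit-learn does this kind of calculation for you when you call fit.

```python
# Made-up data points (an assumption for illustration only).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Least squares: pick the slope and intercept that make the total
# squared distance between the line and the points as small as possible.
slope = (
    sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    / sum((x - mean_x) ** 2 for x in xs)
)
intercept = mean_y - slope * mean_x


def predict(x):
    """Read a new value off the best-fit line."""
    return slope * x + intercept


print("slope:", slope)
print("intercept:", intercept)
print("prediction for x = 6:", predict(6))
```

The line does not pass through every point. It is the straight line that stays as close as possible to all of them at once.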
Training and Test Datasets
- We split our data into two groups:
- A training set to teach the model, and a test set to check how well it learned
- The test set is like an exam at the end: it checks the model on data it did not see during training
- A common split is 80% training / 20% testing
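An 80% / 20% split might look like the sketch below, using scikit-learn's `train_test_split`. The column names `winter_temp` and `mortality` are hypothetical and the numbers are invented for illustration. They are not real measurements and are not taken from the workshop notebook.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Invented numbers for illustration only (not real measurements).
data = pd.DataFrame({
    "winter_temp": [-30, -28, -25, -24, -22, -20, -19, -18, -16, -15],
    "mortality": [16, 14, 11, 10, 8, 6, 5, 4, 2, 1],
})

X = data[["winter_temp"]]
y = data["mortality"]

# Hold back 20% of the rows as the test set.
# random_state fixes the shuffle so the split is the same on every run.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Teach the model using the training set only.
model = LinearRegression()
model.fit(X_train, y_train)

# The "exam": score the model on rows it has never seen.
# For regression, score() returns R^2, where 1.0 is a perfect fit.
score = model.score(X_test, y_test)
print("training rows:", len(X_train))
print("test rows:", len(X_test))
print("test score:", score)
```

With 10 rows, the split gives 8 training rows and 2 test rows. The invented data lies on a perfect straight line, so the score here is unrealistically high. Real data is noisier and the test score will be lower.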
Demo
Open and run the “Linear Regression” Notebook (regression.ipynb)
Hands-on
Open and run the “Linear Regression” Notebook (regression.ipynb)
Reflection
Congratulations! You trained an AI model on real-world data!
Tomorrow
- Use the same approach for hardware
- Explore micro:bit boards
- Use AI to detect waving, putting your hand up, and other commands!