Pose Landmarks with Computer Vision
In this notebook, you’ll use a neural network to detect the position of 33 body landmarks in real time. By tracking joints like shoulders, elbows, and wrists, we can understand how a person is moving without knowing who they are.
What Is Pose Estimation?
Pose estimation detects the position of specific points on a person’s body, called landmarks or keypoints. The result is a skeleton-like overlay showing where joints are located in the image.
This platform uses MediaPipe Pose, which tracks 33 landmarks including: nose, eyes, ears, shoulders, elbows, wrists, hips, knees, and ankles.
Each landmark has:
- x, y: position on screen (pixels)
- z: estimated depth (how far the point is from the camera)
- visibility: how confident the model is that this point is visible (0.0 – 1.0)
Step 1: Live Pose Detection
The cell below starts your camera and draws a skeleton overlay on your body in real time. Step back so your full upper body is visible; the model works best when it can see your shoulders, arms, and head.
Click Allow when your browser asks for webcam access, then click Stop (■) when you’re ready to move on.

import cv
import graphics
import time

# Set up the drawing surface, the webcam feed, and the pose detector.
canvas = graphics.canvas()
camera = cv.start_camera(canvas)
detector = cv.start_pose_detector(camera)

try:
    while True:
        # Fetch the latest detections and draw the skeleton overlay.
        poses = detector.get_detections()
        canvas.draw_poses(poses)
        time.sleep(0.033)  # pause about 1/30 of a second between updates
finally:
    # Always release the detector and camera, even when the cell is stopped.
    detector.stop()
    camera.stop()
    print("Camera stopped.")
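To make the landmark fields described earlier more concrete, here is a minimal sketch that runs without a camera. It assumes each landmark behaves like a dictionary with x, y, z, and visibility keys, which matches how the later cells index landmarks (for example lm['x']). The sample values, the landmark names used as keys, the visible_landmarks helper, and the 0.5 threshold are all hypothetical, chosen only for illustration.

```python
# Hypothetical sample data. A real detector would supply these values.
sample_landmarks = {
    "nose": {"x": 320, "y": 110, "z": -0.42, "visibility": 0.99},
    "left_wrist": {"x": 455, "y": 380, "z": -0.18, "visibility": 0.35},
    "right_wrist": {"x": 190, "y": 372, "z": -0.21, "visibility": 0.91},
}


def visible_landmarks(landmarks, threshold=0.5):
    """Keep only landmarks whose visibility is at least the threshold.

    The 0.5 default is an assumed cutoff, not a value from the platform.
    """
    return {
        name: lm
        for name, lm in landmarks.items()
        if lm["visibility"] >= threshold
    }


reliable = visible_landmarks(sample_landmarks)
for name, lm in reliable.items():
    print(f"{name}: x={lm['x']}, y={lm['y']} (visible: {lm['visibility']:.0%})")
```

Filtering on visibility like this is one way to avoid acting on points the model is only guessing at, such as a wrist hidden behind your back.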
Step 2: Explore Landmark Data
The cell below runs the same live skeleton overlay, but also prints the coordinates of key joints once per second. We access each landmark using cv.POSE constants — for example, cv.POSE.LEFT_WRIST.
Move your arms, lean in, or step back and watch the numbers change.
Tip: In image coordinates, y=0 is the top of the screen and y increases downward.
Click Stop (■) when you’re done.
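A note on the "once per second" timing: counting loop passes only approximates one second, because detecting and drawing take some time on top of each pause. The sketch below shows one possible alternative that compares timestamps instead. It is an illustration, not the platform's method. The should_print helper and the simulated clock are hypothetical, and a live loop would read the time from time.monotonic() instead.

```python
def should_print(now, last_print, interval=1.0):
    """Return True if at least `interval` seconds have passed since the last print."""
    return last_print is None or now - last_print >= interval


last_print = None
print_times = []

# Simulate 90 loop passes of about 0.033 s each so the example runs instantly.
for frame in range(90):
    now = frame * 0.033  # simulated clock; a live loop would use time.monotonic()
    if should_print(now, last_print):
        print_times.append(round(now, 3))
        last_print = now

print(f"Printed {len(print_times)} times in about 3 simulated seconds.")
```

With a timestamp check, the gap between prints is always at least the chosen interval, no matter how long each pass through the loop takes.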

import cv
import graphics
import time

canvas = graphics.canvas()
camera = cv.start_camera(canvas)
detector = cv.start_pose_detector(camera)

# Display name paired with the landmark constant for each joint to report.
key_points = [
    ("Nose", cv.POSE.NOSE),
    ("Left Shoulder", cv.POSE.LEFT_SHOULDER),
    ("Right Shoulder", cv.POSE.RIGHT_SHOULDER),
    ("Left Wrist", cv.POSE.LEFT_WRIST),
    ("Right Wrist", cv.POSE.RIGHT_WRIST),
]

frame = 0
try:
    while True:
        poses = detector.get_detections()
        canvas.draw_poses(poses)
        # Print on every 30th pass, roughly once per second.
        if frame % 30 == 0:
            if poses:
                pose = poses[0]  # use the first detected person
                print("--- Key landmarks ---")
                for name, idx in key_points:
                    lm = pose[idx]
                    print(f" {name:16}: x={lm['x']}, y={lm['y']} (visible: {lm['visibility']:.0%})")
            else:
                print("No pose detected. Step back so more of your body is visible.")
        frame += 1
        time.sleep(0.033)
finally:
    # Always release the detector and camera, even when the cell is stopped.
    detector.stop()
    camera.stop()
    print("Camera stopped.")

Step 3: Experiment - Arm Raise Detector
Now let’s write logic on top of the landmark data. If your wrist is higher than your shoulder, your arm is raised. Since y=0 is the top of the screen, a higher position means a smaller y value.
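The comparison described above can be tried without a camera. The sketch below is one possible variation, not the notebook's exact logic: it adds a pixel margin so a wrist hovering right at shoulder height does not flicker between states, and it skips landmarks with low visibility. The is_raised helper, the margin of 20 pixels, the 0.5 visibility cutoff, and the sample landmark values are all assumptions made for illustration.

```python
def is_raised(wrist, shoulder, margin=20, min_visibility=0.5):
    """Return True if the wrist is clearly above the shoulder.

    margin: pixels by which the wrist must clear the shoulder (assumed value).
    min_visibility: landmarks below this confidence are ignored (assumed value).
    """
    if wrist["visibility"] < min_visibility or shoulder["visibility"] < min_visibility:
        return False
    # Smaller y means higher on screen, so the wrist's y must be
    # less than the shoulder's y minus the margin.
    return wrist["y"] < shoulder["y"] - margin


# Hypothetical landmark values in pixels.
shoulder = {"x": 400, "y": 250, "visibility": 0.98}
wrist_up = {"x": 420, "y": 120, "visibility": 0.95}
wrist_down = {"x": 410, "y": 430, "visibility": 0.97}
wrist_hidden = {"x": 415, "y": 100, "visibility": 0.20}

print(is_raised(wrist_up, shoulder))      # True
print(is_raised(wrist_down, shoulder))    # False
print(is_raised(wrist_hidden, shoulder))  # False
```

Treating a low-visibility wrist as "not raised" is a design choice; you could instead report it as unknown.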
Run the cell, then try raising one or both arms. Watch the status update in the output. Click Stop (■) when you’re done.

import cv
import graphics
import time

canvas = graphics.canvas()
camera = cv.start_camera(canvas)
detector = cv.start_pose_detector(camera)

frame = 0
try:
    while True:
        poses = detector.get_detections()
        canvas.draw_poses(poses)
        # Check the arms on every 30th pass, roughly once per second.
        if frame % 30 == 0:
            if poses:
                pose = poses[0]  # use the first detected person
                l_wrist = pose[cv.POSE.LEFT_WRIST]
                l_shoulder = pose[cv.POSE.LEFT_SHOULDER]
                r_wrist = pose[cv.POSE.RIGHT_WRIST]
                r_shoulder = pose[cv.POSE.RIGHT_SHOULDER]
                # Smaller y means higher on screen.
                left_raised = l_wrist['y'] < l_shoulder['y']
                right_raised = r_wrist['y'] < r_shoulder['y']
                print(f"Left arm: {'RAISED' if left_raised else 'at side'} | "
                      f"Right arm: {'RAISED' if right_raised else 'at side'}")
            else:
                print("No pose detected.")
        frame += 1
        time.sleep(0.033)
finally:
    # Always release the detector and camera, even when the cell is stopped.
    detector.stop()
    camera.stop()
    print("Camera stopped.")

Reflect
Record your observations in the journal cells below.
Imagine you could only see the skeleton data — no video, no face, no identity. What could you still figure out about the person? What couldn’t you tell?
Check for Understanding
{ "question_type": "true_false", "question": "MediaPipe Pose tracks 33 landmarks on the human body.", "answer": "True", "submitted_answer": "" }
{ "question_type": "multiple_choice", "question": "What does the visibility value tell you about a landmark?", "options": [ { "key": "a", "text": "Whether the landmark is currently moving" }, { "key": "b", "text": "How confident the model is that the landmark is visible" }, { "key": "c", "text": "The color of the landmark on screen" }, { "key": "d", "text": "How far the person is from the camera in meters" } ], "answer": "b", "submitted_answer": "" }
{ "question_type": "true_false", "question": "In image coordinates, a smaller y value means the point is higher on the screen.", "answer": "True", "submitted_answer": "" }
{ "question_type": "multiple_choice", "question": "Which constant would you use to get the position of a person’s left wrist?", "options": [ { "key": "a", "text": "cv.POSE.LEFT_HAND" }, { "key": "b", "text": "cv.POSE.LEFT_WRIST" }, { "key": "c", "text": "cv.POSE.WRIST_LEFT" }, { "key": "d", "text": "cv.POSE.LEFT_PALM" } ], "answer": "b", "submitted_answer": "" }