import cv
import graphics
import time
canvas = graphics.canvas()
camera = cv.start_camera(canvas)
detector = cv.start_gesture_detector(camera)
frame = 0
try:
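    while True:
        # Grab the latest hand detections and draw them over the camera feed.
        detections = detector.get_detections()
        canvas.draw_hands(detections)
        # Print a summary every 30th frame (about once per second).
        if frame % 30 == 0:
            print(f"--- {len(detections)} hand(s) detected ---")
            for hand in detections:
                print(f"  {hand['handedness']} hand: {hand['gesture']} ({hand['confidence']:.1%})")
        frame += 1
        time.sleep(0.033)
finally:
    # Always release the detector and camera, even when the cell is stopped.
    detector.stop()
    camera.stop()
    print("Camera stopped.")

Gesture Recognition with Computer Vision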
In this notebook, you’ll use a neural network to recognize hand gestures in real time. The model tracks 21 landmarks per hand and classifies the shape into one of 8 named gestures — all without any special hardware.
What Is Gesture Recognition?
Gesture recognition combines two ideas:
- Hand landmark detection — locating 21 key points on each hand (fingertips, joints, wrist)
- Gesture classification — analyzing the shape of those landmarks to name the gesture
This platform uses MediaPipe Gesture Recognizer, which can identify these 8 gestures:
| Gesture | Description |
|---|---|
| None | No recognized gesture |
| Closed_Fist | All fingers curled in |
| Open_Palm | All fingers extended, palm facing forward |
| Pointing_Up | Index finger extended upward |
| Thumb_Down | Thumbs down |
| Thumb_Up | Thumbs up |
| Victory | Index and middle fingers extended (V sign) |
| ILoveYou | ASL "I love you" sign |
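To make the two ideas concrete, here is a toy sketch of how a gesture name could be derived from landmark positions. It is not how MediaPipe works internally (MediaPipe uses a trained neural network). This sketch uses a simple hand-written rule instead: a finger counts as extended when its tip is farther from the wrist than its middle (PIP) joint. The index numbers follow MediaPipe's 21-point hand model. The helper names (`toy_classify`, `fake_hand`), the `{"x": ..., "y": ...}` landmark layout, and the sample coordinates are assumptions made for illustration. The rule ignores the thumb and hand orientation, so it covers only four of the eight gestures.

```python
import math

# Landmark indices in MediaPipe's 21-point hand model:
# 0 is the wrist; each finger then has four points ending at its tip.
WRIST = 0
# (PIP joint, fingertip) for the four non-thumb fingers.
FINGER_JOINTS = {
    "index": (6, 8),
    "middle": (10, 12),
    "ring": (14, 16),
    "pinky": (18, 20),
}


def distance(a, b):
    """Straight-line distance between two landmarks."""
    return math.hypot(a["x"] - b["x"], a["y"] - b["y"])


def extended_fingers(landmarks):
    """Return the names of fingers whose tip is farther from the wrist than the PIP joint."""
    wrist = landmarks[WRIST]
    result = set()
    for name, (pip, tip) in FINGER_JOINTS.items():
        if distance(wrist, landmarks[tip]) > distance(wrist, landmarks[pip]):
            result.add(name)
    return result


def toy_classify(landmarks):
    """Map the set of extended fingers to a gesture name (toy rule, thumb ignored)."""
    fingers = extended_fingers(landmarks)
    if fingers == {"index", "middle", "ring", "pinky"}:
        return "Open_Palm"
    if not fingers:
        return "Closed_Fist"
    if fingers == {"index"}:
        return "Pointing_Up"
    if fingers == {"index", "middle"}:
        return "Victory"
    return "None"


def fake_hand(extended):
    """Build 21 made-up landmarks with the named fingers extended (hypothetical test data)."""
    landmarks = [{"x": 0.5, "y": 0.9} for _ in range(21)]  # everything starts at the wrist
    for i, (name, (pip, tip)) in enumerate(FINGER_JOINTS.items()):
        x = 0.4 + 0.1 * i
        landmarks[pip] = {"x": x, "y": 0.5}
        # Image y grows downward, so a smaller y means the tip is higher up.
        landmarks[tip] = {"x": x, "y": 0.3 if name in extended else 0.65}
    return landmarks


print(toy_classify(fake_hand({"index", "middle"})))
```

A rule like this breaks down quickly with rotated hands or partly hidden fingers, which is one reason a learned classifier is used in practice.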
Step 1: Live Gesture Detection
The cell below starts your camera and runs gesture detection in real time. Hold your hand up and try different gestures — the model will draw landmarks on your hand and print the detected gesture once per second.
Make sure there is enough light so the camera can clearly see your hand. Click Allow for webcam access, then click Stop (■) when you’re ready to move on.
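The Step 1 cell gets its "once per second" printing by counting frames and printing on every 30th pass through the loop. Because each pass also spends time on detection and drawing, 30 passes can take somewhat longer than one second. As an optional alternative, the sketch below decides when to print from the clock instead. `PrintGate` is a hypothetical helper written for this example and is not part of the `cv` or `graphics` modules. The demo uses a fake clock so it finishes instantly.

```python
import time


class PrintGate:
    """Lets an action through at most once per `interval` seconds."""

    def __init__(self, interval=1.0, clock=time.monotonic):
        self.interval = interval
        self.clock = clock
        self._next_time = clock()  # the first check passes immediately

    def ready(self):
        """Return True if at least `interval` seconds have passed since the last True."""
        now = self.clock()
        if now >= self._next_time:
            self._next_time = now + self.interval
            return True
        return False


# Demo with a fake clock so the example does not need to wait.
fake_now = [0.0]
gate = PrintGate(interval=1.0, clock=lambda: fake_now[0])
for step in range(7):
    fake_now[0] = step * 0.4  # pretend each loop pass takes 0.4 s
    if gate.ready():
        print(f"t={fake_now[0]:.1f}s: print detections here")
```

In a live loop, a gate created with the default clock could stand in for the `frame % 30 == 0` check by calling `gate.ready()` once per pass.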
Step 2: Explore Hand Landmarks
Beyond the gesture name, the detector gives you the exact position of every joint. We access individual landmarks using cv.HAND constants — for example, cv.HAND.INDEX_FINGER_TIP.
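Two landmark positions can be combined into a single number that is easier to reason about than raw coordinates. The sketch below measures the straight-line distance between a fingertip and the wrist. It assumes each landmark is a dictionary with `'x'` and `'y'` entries, as in this notebook's cells. The sample values are made up, and the units depend on whether the platform reports normalized or pixel coordinates. `landmark_distance` is a hypothetical helper, not part of the `cv` module.

```python
import math


def landmark_distance(a, b):
    """Straight-line distance between two landmarks given as {'x': ..., 'y': ...}."""
    return math.hypot(a["x"] - b["x"], a["y"] - b["y"])


# Made-up sample values standing in for entries of hand['landmarks'].
wrist = {"x": 0.50, "y": 0.90}
tip_extended = {"x": 0.50, "y": 0.30}  # index finger stretched out
tip_curled = {"x": 0.50, "y": 0.70}    # index finger curled toward the palm

print("extended:", landmark_distance(tip_extended, wrist))
print("curled:  ", landmark_distance(tip_curled, wrist))
```

With live data, the same function could be called on the fingertip and wrist landmarks of a detected hand. The distance should grow as the finger extends and shrink as it curls.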
The cell below prints the index fingertip and wrist position once per second alongside the live hand overlay. Try different hand shapes and watch the coordinates change.
Click Stop (■) when you’re done.
import cv
import graphics
import time
canvas = graphics.canvas()
camera = cv.start_camera(canvas)
detector = cv.start_gesture_detector(camera)
frame = 0
try:
    while True:
        detections = detector.get_detections()
        canvas.draw_hands(detections)
        # Print landmark positions every 30th frame (about once per second).
        if frame % 30 == 0:
            for hand in detections:
                # Look up individual landmarks by their cv.HAND constant.
                tip = hand['landmarks'][cv.HAND.INDEX_FINGER_TIP]
                wrist = hand['landmarks'][cv.HAND.WRIST]
                print(f"{hand['handedness']} hand — {hand['gesture']}")
                print(f"  Index fingertip: ({tip['x']}, {tip['y']})")
                print(f"  Wrist          : ({wrist['x']}, {wrist['y']})")
        frame += 1
        time.sleep(0.033)
finally:
    # Always release the detector and camera, even when the cell is stopped.
    detector.stop()
    camera.stop()
    print("Camera stopped.")

Step 3: Experiment — Gesture Challenge
Use the dropdown to choose a target gesture, then run the cell and try to make that gesture. The output will tell you what gesture it’s currently seeing and whether it matches your target.
Can you hit MATCH! on every gesture in the list? Click Stop (■) when you’re done.
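If you want to keep score across the whole list, one possible approach is to remember which gestures have already been seen with a reasonable confidence. The sketch below assumes each detection is a dictionary with `'gesture'` and `'confidence'` entries (confidence as a fraction between 0 and 1), as in this notebook's cells. The helper names and the 0.6 threshold are hypothetical choices for illustration, and the sample detections are made up.

```python
# Gesture names from the dropdown in this step.
GESTURES = ["None", "Closed_Fist", "Open_Palm", "Pointing_Up",
            "Thumb_Down", "Thumb_Up", "Victory", "ILoveYou"]


def update_progress(completed, detections, min_confidence=0.6):
    """Add every sufficiently confident gesture in `detections` to the `completed` set."""
    for hand in detections:
        if hand["confidence"] >= min_confidence:
            completed.add(hand["gesture"])
    return completed


def remaining(completed):
    """List the gestures that have not been matched yet, in dropdown order."""
    return [g for g in GESTURES if g not in completed]


# Made-up detections standing in for detector.get_detections().
sample = [
    {"handedness": "Right", "gesture": "Thumb_Up", "confidence": 0.91},
    {"handedness": "Left", "gesture": "Victory", "confidence": 0.35},
]

completed = set()
update_progress(completed, sample)
print("Matched so far:", sorted(completed))
print("Still to go:   ", remaining(completed))
```

Raising `min_confidence` makes the challenge stricter. Lowering it accepts shakier detections.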
import cv
import graphics
import time
TARGET_GESTURE = "Thumb_Up" #@param ["None", "Closed_Fist", "Open_Palm", "Pointing_Up", "Thumb_Down", "Thumb_Up", "Victory", "ILoveYou"]
canvas = graphics.canvas()
camera = cv.start_camera(canvas)
detector = cv.start_gesture_detector(camera)
frame = 0
try:
while True:
detections = detector.get_detections()
canvas.draw_hands(detections)
if frame % 30 == 0:
if detections:
for hand in detections:
matched = hand['gesture'] == TARGET_GESTURE
result = "MATCH!" if matched else "not yet..."
print(f"{hand['handedness']} hand: {hand['gesture']} ({hand['confidence']:.1%}) — {result}")
else:
print("No hands detected — hold your hand up to the camera.")
frame += 1
time.sleep(0.033)
finally:
detector.stop()
camera.stop()
print("Camera stopped.")Which gestures were easiest to get a high confidence score on? Which were hardest? Why do you think some gestures are more reliably detected than others?
Gesture recognition could let people control computers, TVs, or lights without touching anything. Describe one application where this would be especially helpful — and one situation where it might cause unintended problems.
Check for Understanding
{ "question_type": "multiple_choice", "question": "How many different gestures can the MediaPipe Gesture Recognizer detect?", "options": [ { "key": "a", "text": "4" }, { "key": "b", "text": "6" }, { "key": "c", "text": "8" }, { "key": "d", "text": "12" } ], "answer": "c", "submitted_answer": "" }
{ "question_type": "true_false", "question": "The gesture detector can tell whether a detected hand is the left or right hand.", "answer": "True", "submitted_answer": "" }
{ "question_type": "multiple_choice", "question": "How many landmarks does MediaPipe track per hand?", "options": [ { "key": "a", "text": "5 (one per finger)" }, { "key": "b", "text": "10" }, { "key": "c", "text": "21" }, { "key": "d", "text": "33" } ], "answer": "c", "submitted_answer": "" }
{ "question_type": "true_false", "question": "The gesture detector can only track one hand at a time.", "answer": "False", "submitted_answer": "" }