microsoft/AI-For-Beginners

Mirrored from https://github.com/microsoft/AI-For-Beginners

1bcb2489ab4af0973915b702e09175795612e4b0

examples/03-image-classifier.ipynb

1{
2 "cells": [
3 {
4 "cell_type": "markdown",
5 "metadata": {},
6 "source": [
7 "# Simple Image Classifier\n",
8 "\n",
9 "This notebook shows you how to classify images using a pre-trained neural network.\n",
10 "\n",
11 "**What you'll learn:**\n",
12 "- How to load and use a pre-trained model\n",
13 "- Image preprocessing\n",
14 "- Making predictions on images\n",
15 "- Understanding confidence scores\n",
16 "\n",
17 "**Use case:** Identify objects in images (such as \"cat\", \"dog\", or \"car\")\n",
18 "\n",
19 "---"
20 ]
21 },
22 {
23 "cell_type": "markdown",
24 "metadata": {},
25 "source": [
26 "## Step 1: Import Required Libraries\n",
27 "\n",
28 "Let's import the tools we need. Don't worry if you don't understand all of these yet!"
29 ]
30 },
31 {
32 "cell_type": "code",
33 "execution_count": null,
34 "metadata": {},
35 "outputs": [],
36 "source": [
37 "# Core libraries\n",
38 "import numpy as np\n",
39 "from PIL import Image\n",
40 "import requests\n",
41 "from io import BytesIO\n",
42 "\n",
43 "# TensorFlow for deep learning\n",
44 "try:\n",
45 "    import tensorflow as tf\n",
46 "    from tensorflow.keras.applications import MobileNetV2\n",
47 "    from tensorflow.keras.applications.mobilenet_v2 import preprocess_input, decode_predictions\n",
48 "    print(\"✅ TensorFlow loaded successfully!\")\n",
49 "    print(f\" Version: {tf.__version__}\")\n",
50 "except ImportError:\n",
51 "    print(\"❌ Please install TensorFlow: pip install tensorflow\")"
52 ]
53 },
54 {
55 "cell_type": "markdown",
56 "metadata": {},
57 "source": [
58 "## Step 2: Load Pre-trained Model\n",
59 "\n",
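60 "We'll use **MobileNetV2**, a neural network already trained on more than a million images from the ImageNet dataset.\n",
61 "\n",
62 "Reusing a model someone else trained is also the starting point for **Transfer Learning**, where a pre-trained model is adapted to a new task. Here we use the model as-is."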
63 ]
64 },
65 {
66 "cell_type": "code",
67 "execution_count": null,
68 "metadata": {},
69 "outputs": [],
70 "source": [
71 "print(\"📦 Loading pre-trained MobileNetV2 model...\")\n",
72 "print(\" This may take a minute on first run (downloading weights)...\")\n",
73 "\n",
74 "# Load the model\n",
75 "# include_top=True keeps the final classification layer\n",
76 "# weights='imagenet' loads weights pre-trained on the ImageNet dataset\n",
77 "model = MobileNetV2(weights='imagenet', include_top=True)\n",
78 "\n",
79 "print(\"✅ Model loaded!\")\n",
80 "print(\" The model can recognize 1000 different object categories\")"
81 ]
82 },
83 {
84 "cell_type": "markdown",
85 "metadata": {},
86 "source": [
87 "## Step 3: Helper Functions\n",
88 "\n",
89 "Let's create functions to load and prepare images for our model."
90 ]
91 },
92 {
93 "cell_type": "code",
94 "execution_count": null,
95 "metadata": {},
96 "outputs": [],
97 "source": [
98 "def load_image_from_url(url):\n",
99 "    \"\"\"\n",
100 "    Load an image from a URL.\n",
101 "    \n",
102 "    Args:\n",
103 "        url: Web address of the image\n",
104 "    \n",
105 "    Returns:\n",
106 "        PIL Image object\n",
107 "    \"\"\"\n",
108 "    response = requests.get(url, timeout=10)\n",
109 "    response.raise_for_status()  # Stop early on HTTP errors such as 404\n",
110 "    return Image.open(BytesIO(response.content))\n",
111 "\n",
112 "\n",
113 "def prepare_image(img):\n",
114 "    \"\"\"\n",
115 "    Prepare an image for the model.\n",
116 "    \n",
117 "    Steps:\n",
118 "    1. Convert to RGB and resize to 224x224 (model's expected size)\n",
119 "    2. Convert to array\n",
120 "    3. Add batch dimension\n",
121 "    4. Preprocess for MobileNetV2\n",
122 "    \n",
123 "    Args:\n",
124 "        img: PIL Image\n",
125 "    \n",
126 "    Returns:\n",
127 "        Preprocessed image array\n",
128 "    \"\"\"\n",
129 "    # Convert to RGB (3 channels) and resize to 224x224 pixels\n",
130 "    img = img.convert(\"RGB\").resize((224, 224))\n",
131 "    \n",
132 "    # Convert to a float numpy array\n",
133 "    img_array = np.array(img, dtype=np.float32)\n",
134 "    \n",
135 "    # Add batch dimension (model expects a batch of images)\n",
136 "    img_array = np.expand_dims(img_array, axis=0)\n",
137 "    \n",
138 "    # Preprocess for MobileNetV2 (scales pixel values to the range [-1, 1])\n",
139 "    img_array = preprocess_input(img_array)\n",
140 "    \n",
141 "    return img_array\n",
142 "\n",
143 "\n",
144 "def classify_image(img):\n",
145 "    \"\"\"\n",
146 "    Classify an image and return top predictions.\n",
147 "    \n",
148 "    Args:\n",
149 "        img: PIL Image\n",
150 "    \n",
151 "    Returns:\n",
152 "        List of (class_name, confidence) tuples\n",
153 "    \"\"\"\n",
154 "    # Prepare the image\n",
155 "    img_array = prepare_image(img)\n",
156 "    \n",
157 "    # Make prediction\n",
158 "    predictions = model.predict(img_array, verbose=0)\n",
159 "    \n",
160 "    # Decode predictions to human-readable labels\n",
161 "    # top=5 means we get the top 5 most likely classes\n",
162 "    decoded = decode_predictions(predictions, top=5)[0]\n",
163 "    \n",
164 "    # Convert to simpler format\n",
165 "    results = [(label, float(confidence)) for (_, label, confidence) in decoded]\n",
166 "    \n",
167 "    return results\n",
168 "\n",
169 "\n",
170 "print(\"✅ Helper functions ready!\")"
171 ]
172 },
173 {
174 "cell_type": "markdown",
175 "metadata": {},
176 "source": [
177 "## Step 4: Test on Sample Images\n",
178 "\n",
179 "Let's try classifying some images from the internet!"
180 ]
181 },
182 {
183 "cell_type": "code",
184 "execution_count": null,
185 "metadata": {},
186 "outputs": [],
187 "source": [
188 "# Sample images to classify\n",
189 "# These are from Unsplash (free stock photos)\n",
190 "test_images = [\n",
191 "    {\n",
192 "        \"url\": \"https://images.unsplash.com/photo-1514888286974-6c03e2ca1dba?w=400\",\n",
193 "        \"description\": \"A cat\"\n",
194 "    },\n",
195 "    {\n",
196 "        \"url\": \"https://images.unsplash.com/photo-1552053831-71594a27632d?w=400\",\n",
197 "        \"description\": \"A dog\"\n",
198 "    },\n",
199 "    {\n",
200 "        \"url\": \"https://images.unsplash.com/photo-1511919884226-fd3cad34687c?w=400\",\n",
201 "        \"description\": \"A car\"\n",
202 "    },\n",
203 "]\n",
204 "\n",
205 "print(f\"🧪 Testing on {len(test_images)} images...\")\n",
206 "print(\"=\" * 70)"
207 ]
208 },
209 {
210 "cell_type": "markdown",
211 "metadata": {},
212 "source": [
213 "### Classify Each Image"
214 ]
215 },
216 {
217 "cell_type": "code",
218 "execution_count": null,
219 "metadata": {},
220 "outputs": [],
221 "source": [
222 "for i, img_data in enumerate(test_images, 1):\n",
223 "    print(f\"\\n📸 Image {i}: {img_data['description']}\")\n",
224 "    print(\"-\" * 70)\n",
225 "    \n",
226 "    try:\n",
227 "        # Load image\n",
228 "        img = load_image_from_url(img_data['url'])\n",
229 "        \n",
230 "        # Display image\n",
231 "        display(img.resize((200, 200)))  # Show smaller version\n",
232 "        \n",
233 "        # Classify\n",
234 "        results = classify_image(img)\n",
235 "        \n",
236 "        # Show predictions\n",
237 "        print(\"\\n🎯 Top 5 Predictions:\")\n",
238 "        for rank, (label, confidence) in enumerate(results, 1):\n",
239 "            # Create a visual bar\n",
240 "            bar_length = int(confidence * 50)\n",
241 "            bar = \"█\" * bar_length\n",
242 "            \n",
243 "            print(f\" {rank}. {label:20s} {confidence*100:5.2f}% {bar}\")\n",
244 "        \n",
245 "    except Exception as e:\n",
246 "        print(f\"❌ Error: {e}\")\n",
247 "\n",
248 "print(\"\\n\" + \"=\" * 70)"
249 ]
250 },
251 {
252 "cell_type": "markdown",
253 "metadata": {},
254 "source": [
255 "## Step 5: Try Your Own Images!\n",
256 "\n",
257 "Replace the URL below with any image URL you want to classify."
258 ]
259 },
260 {
261 "cell_type": "code",
262 "execution_count": null,
263 "metadata": {},
264 "outputs": [],
265 "source": [
266 "# Try your own image!\n",
267 "# Replace this URL with any image URL\n",
268 "custom_image_url = \"https://images.unsplash.com/photo-1472491235688-bdc81a63246e?w=400\" # A flower\n",
269 "\n",
270 "print(\"🖼️ Classifying your custom image...\")\n",
271 "print(\"=\" * 70)\n",
272 "\n",
273 "try:\n",
274 "    # Load and show image\n",
275 "    img = load_image_from_url(custom_image_url)\n",
276 "    display(img.resize((300, 300)))\n",
277 "    \n",
278 "    # Classify\n",
279 "    results = classify_image(img)\n",
280 "    \n",
281 "    # Show results\n",
282 "    print(\"\\n🎯 Top 5 Predictions:\")\n",
283 "    print(\"-\" * 70)\n",
284 "    for rank, (label, confidence) in enumerate(results, 1):\n",
285 "        bar_length = int(confidence * 50)\n",
286 "        bar = \"█\" * bar_length\n",
287 "        print(f\" {rank}. {label:20s} {confidence*100:5.2f}% {bar}\")\n",
288 "    \n",
289 "    # Highlight top prediction\n",
290 "    top_label, top_confidence = results[0]\n",
291 "    print(\"\\n\" + \"=\" * 70)\n",
292 "    print(f\"\\n🏆 Best guess: {top_label} ({top_confidence*100:.2f}% confident)\")\n",
293 "    \n",
294 "except Exception as e:\n",
295 "    print(f\"❌ Error: {e}\")\n",
296 "    print(\" Make sure the URL points to a valid image!\")"
297 ]
298 },
299 {
300 "cell_type": "markdown",
301 "metadata": {},
302 "source": [
303 "## 💡 What Just Happened?\n",
304 "\n",
305 "1. **We loaded a pre-trained model** - MobileNetV2 was trained on more than a million ImageNet images\n",
306 "2. **We preprocessed images** - Resized and formatted them for the model\n",
307 "3. **The model made predictions** - It output probabilities for 1000 object classes\n",
308 "4. **We decoded the results** - Converted numbers to human-readable labels\n",
309 "\n",
310 "### Understanding Confidence Scores\n",
311 "\n",
312 "- **90-100%**: Very confident (usually correct, though the model can still be confidently wrong on objects outside its 1000 categories)\n",
313 "- **70-90%**: Confident (often correct)\n",
314 "- **50-70%**: Somewhat confident (might be correct)\n",
315 "- **Below 50%**: Not very confident (uncertain)\n",
316 "\n",
317 "### Why might predictions be wrong?\n",
318 "\n",
319 "- **Unusual angle or lighting** - Model was trained on typical photos\n",
320 "- **Multiple objects** - Model expects one main object\n",
321 "- **Rare objects** - Model only knows 1000 categories\n",
322 "- **Low quality image** - Blurry or pixelated images are harder\n",
323 "\n",
324 "---"
325 ]
326 },
327 {
328 "cell_type": "markdown",
329 "metadata": {},
330 "source": [
331 "## 🚀 Next Steps\n",
332 "\n",
333 "1. **Try different images:**\n",
334 "   - Find images on [Unsplash](https://unsplash.com)\n",
335 "   - Right-click → \"Copy image address\" to get URL\n",
336 "\n",
337 "2. **Experiment:**\n",
338 "   - What happens with abstract art?\n",
339 "   - Can it recognize objects from different angles?\n",
340 "   - How does it handle multiple objects?\n",
341 "\n",
342 "3. **Learn more:**\n",
343 "   - Explore [Computer Vision lessons](../lessons/4-ComputerVision/README.md)\n",
344 "   - Learn to train your own image classifier\n",
345 "   - Understand how CNNs (Convolutional Neural Networks) work\n",
346 "\n",
347 "---\n",
348 "\n",
349 "## 🎉 Congratulations!\n",
350 "\n",
351 "You just built an image classifier using a pre-trained neural network!\n",
352 "\n",
353 "Similar image-recognition techniques are used in:\n",
354 "- Google Photos (organizing your photos)\n",
355 "- Self-driving cars (recognizing objects)\n",
356 "- Medical diagnosis (analyzing X-rays)\n",
357 "- Quality control (detecting defects)\n",
358 "\n",
359 "Keep exploring and learning! 🚀"
360 ]
361 }
362 ],
363 "metadata": {
364 "kernelspec": {
365 "display_name": "Python 3",
366 "language": "python",
367 "name": "python3"
368 },
369 "language_info": {
370 "codemirror_mode": {
371 "name": "ipython",
372 "version": 3
373 },
374 "file_extension": ".py",
375 "mimetype": "text/x-python",
376 "name": "python",
377 "nbconvert_exporter": "python",
378 "pygments_lexer": "ipython3",
379 "version": "3.8.0"
380 }
381 },
382 "nbformat": 4,
383 "nbformat_minor": 4
384}
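
The ranking and bar display used in Steps 4 and 5 of the notebook can be tried without TensorFlow. The following is a minimal sketch, assuming a plain list of class probabilities. `top_k` and `render_bar` are hypothetical helper names (they are not part of Keras), and the labels and scores are made-up illustration values, not real model output.

```python
def top_k(probabilities, labels, k=5):
    """Return the k (label, probability) pairs with the highest probability."""
    paired = zip(labels, probabilities)
    ranked = sorted(paired, key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


def render_bar(confidence, width=50):
    """Draw a text bar whose length is proportional to confidence (0.0 to 1.0)."""
    return "█" * int(confidence * width)


# Made-up scores for illustration only (not real model output)
labels = ["tabby", "tiger_cat", "Egyptian_cat", "lynx"]
probabilities = [0.50, 0.25, 0.20, 0.05]

for rank, (label, confidence) in enumerate(top_k(probabilities, labels, k=3), 1):
    print(f"{rank}. {label:20s} {confidence*100:5.2f}% {render_bar(confidence)}")
```

Because the model's scores are spread over all 1000 categories, several similar classes can share the total, so a top score well below 100% can still be the clear winner.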
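
To make the array handling behind `prepare_image` more concrete, here is a rough NumPy-only sketch. It assumes the scaling that Keras documents for MobileNetV2's `preprocess_input` (pixel values mapped from [0, 255] to [-1, 1]). `to_model_input` is a hypothetical name, resizing is left out, and dropping the alpha channel is a simplification of a proper RGB conversion.

```python
import numpy as np


def to_model_input(pixels):
    """Turn an (H, W), (H, W, 3) or (H, W, 4) image array into a (1, H, W, 3) batch in [-1, 1]."""
    arr = np.asarray(pixels, dtype=np.float32)
    if arr.ndim == 2:
        # Grayscale: repeat the single channel three times
        arr = np.stack([arr, arr, arr], axis=-1)
    elif arr.shape[-1] == 4:
        # RGBA: drop the alpha channel (a simplification)
        arr = arr[..., :3]
    # Add the batch dimension the model expects
    arr = np.expand_dims(arr, axis=0)
    # Scale from [0, 255] to [-1, 1]
    return arr / 127.5 - 1.0


# A synthetic all-white RGBA image stands in for a downloaded photo
rgba = np.full((224, 224, 4), 255, dtype=np.uint8)
batch = to_model_input(rgba)
print(batch.shape, batch.min(), batch.max())
```

Images with an alpha channel or a single gray channel are a common reason a classifier built for 3-channel input rejects an otherwise valid image, which is why the channel count is normalized before the batch dimension is added.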