microsoft/AI-For-Beginners

Public

mirrored fromhttps://github.com/microsoft/AI-For-BeginnersAvailable

CodeCommitsIssuesPull requestsActionsInsightsSecurity
c80fdfd683036237c572564a357095093c573bec

Branches

Tags

  • No tags available.
0Branches0Tags
Go to file
Add file
Code

Clone

HTTPS

Download ZIP

examples/03-image-classifier.ipynb

384lines · modepreview

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Simple Image Classifier\n",
    "\n",
    "This notebook shows you how to classify images using a pre-trained neural network.\n",
    "\n",
    "**What you'll learn:**\n",
    "- How to load and use a pre-trained model\n",
    "- Image preprocessing\n",
    "- Making predictions on images\n",
    "- Understanding confidence scores\n",
    "\n",
    "**Use case:** Identify objects in images (like \"cat\", \"dog\", \"car\", etc.)\n",
    "\n",
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 1: Import Required Libraries\n",
    "\n",
    "Let's import the tools we need. Don't worry if you don't understand all of these yet!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Core libraries\n",
    "import numpy as np\n",
    "from PIL import Image\n",
    "import requests\n",
    "from io import BytesIO\n",
    "\n",
    "# TensorFlow for deep learning\n",
    "try:\n",
    "    import tensorflow as tf\n",
    "    from tensorflow.keras.applications import MobileNetV2\n",
    "    from tensorflow.keras.applications.mobilenet_v2 import preprocess_input, decode_predictions\n",
    "    print(\"✅ TensorFlow loaded successfully!\")\n",
    "    print(f\"   Version: {tf.__version__}\")\n",
    "except ImportError:\n",
    "    print(\"❌ Please install TensorFlow: pip install tensorflow\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 2: Load Pre-trained Model\n",
    "\n",
    "We'll use **MobileNetV2**, a neural network already trained on millions of images.\n",
    "\n",
    "This is called **Transfer Learning** - using a model someone else trained!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(\"📦 Loading pre-trained MobileNetV2 model...\")\n",
    "print(\"   This may take a minute on first run (downloading weights)...\")\n",
    "\n",
    "# Load the model\n",
    "# include_top=True means we use the classification layer\n",
    "# weights='imagenet' means it was trained on ImageNet dataset\n",
    "model = MobileNetV2(weights='imagenet', include_top=True)\n",
    "\n",
    "print(\"✅ Model loaded!\")\n",
    "print(f\"   The model can recognize 1000 different object categories\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 3: Helper Functions\n",
    "\n",
    "Let's create functions to load and prepare images for our model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def load_image_from_url(url):\n",
    "    \"\"\"\n",
    "    Load an image from a URL.\n",
    "    \n",
    "    Args:\n",
    "        url: Web address of the image\n",
    "        \n",
    "    Returns:\n",
    "        PIL Image object\n",
    "    \"\"\"\n",
    "    response = requests.get(url)\n",
    "    img = Image.open(BytesIO(response.content))\n",
    "    return img\n",
    "\n",
    "\n",
    "def prepare_image(img):\n",
    "    \"\"\"\n",
    "    Prepare an image for the model.\n",
    "    \n",
    "    Steps:\n",
    "    1. Resize to 224x224 (model's expected size)\n",
    "    2. Convert to array\n",
    "    3. Add batch dimension\n",
    "    4. Preprocess for MobileNetV2\n",
    "    \n",
    "    Args:\n",
    "        img: PIL Image\n",
    "        \n",
    "    Returns:\n",
    "        Preprocessed image array\n",
    "    \"\"\"\n",
    "    # Resize to 224x224 pixels\n",
    "    img = img.resize((224, 224))\n",
    "    \n",
    "    # Convert to numpy array\n",
    "    img_array = np.array(img)\n",
    "    \n",
    "    # Add batch dimension (model expects multiple images)\n",
    "    img_array = np.expand_dims(img_array, axis=0)\n",
    "    \n",
    "    # Preprocess for MobileNetV2\n",
    "    img_array = preprocess_input(img_array)\n",
    "    \n",
    "    return img_array\n",
    "\n",
    "\n",
    "def classify_image(img):\n",
    "    \"\"\"\n",
    "    Classify an image and return top predictions.\n",
    "    \n",
    "    Args:\n",
    "        img: PIL Image\n",
    "        \n",
    "    Returns:\n",
    "        List of (class_name, confidence) tuples\n",
    "    \"\"\"\n",
    "    # Prepare the image\n",
    "    img_array = prepare_image(img)\n",
    "    \n",
    "    # Make prediction\n",
    "    predictions = model.predict(img_array, verbose=0)\n",
    "    \n",
    "    # Decode predictions to human-readable labels\n",
    "    # top=5 means we get the top 5 most likely classes\n",
    "    decoded = decode_predictions(predictions, top=5)[0]\n",
    "    \n",
    "    # Convert to simpler format\n",
    "    results = [(label, float(confidence)) for (_, label, confidence) in decoded]\n",
    "    \n",
    "    return results\n",
    "\n",
    "\n",
    "print(\"✅ Helper functions ready!\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 4: Test on Sample Images\n",
    "\n",
    "Let's try classifying some images from the internet!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Sample images to classify\n",
    "# These are from Unsplash (free stock photos)\n",
    "test_images = [\n",
    "    {\n",
    "        \"url\": \"https://images.unsplash.com/photo-1514888286974-6c03e2ca1dba?w=400\",\n",
    "        \"description\": \"A cat\"\n",
    "    },\n",
    "    {\n",
    "        \"url\": \"https://images.unsplash.com/photo-1552053831-71594a27632d?w=400\",\n",
    "        \"description\": \"A dog\"\n",
    "    },\n",
    "    {\n",
    "        \"url\": \"https://images.unsplash.com/photo-1511919884226-fd3cad34687c?w=400\",\n",
    "        \"description\": \"A car\"\n",
    "    },\n",
    "]\n",
    "\n",
    "print(f\"🧪 Testing on {len(test_images)} images...\")\n",
    "print(\"=\" * 70)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Classify Each Image"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "for i, img_data in enumerate(test_images, 1):\n",
    "    print(f\"\\n📸 Image {i}: {img_data['description']}\")\n",
    "    print(\"-\" * 70)\n",
    "    \n",
    "    try:\n",
    "        # Load image\n",
    "        img = load_image_from_url(img_data['url'])\n",
    "        \n",
    "        # Display image\n",
    "        display(img.resize((200, 200)))  # Show smaller version\n",
    "        \n",
    "        # Classify\n",
    "        results = classify_image(img)\n",
    "        \n",
    "        # Show predictions\n",
    "        print(\"\\n🎯 Top 5 Predictions:\")\n",
    "        for rank, (label, confidence) in enumerate(results, 1):\n",
    "            # Create a visual bar\n",
    "            bar_length = int(confidence * 50)\n",
    "            bar = \"█\" * bar_length\n",
    "            \n",
    "            print(f\"  {rank}. {label:20s} {confidence*100:5.2f}% {bar}\")\n",
    "        \n",
    "    except Exception as e:\n",
    "        print(f\"❌ Error: {e}\")\n",
    "\n",
    "print(\"\\n\" + \"=\" * 70)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 5: Try Your Own Images!\n",
    "\n",
    "Replace the URL below with any image URL you want to classify."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Try your own image!\n",
    "# Replace this URL with any image URL\n",
    "custom_image_url = \"https://images.unsplash.com/photo-1472491235688-bdc81a63246e?w=400\"  # A flower\n",
    "\n",
    "print(\"🖼️  Classifying your custom image...\")\n",
    "print(\"=\" * 70)\n",
    "\n",
    "try:\n",
    "    # Load and show image\n",
    "    img = load_image_from_url(custom_image_url)\n",
    "    display(img.resize((300, 300)))\n",
    "    \n",
    "    # Classify\n",
    "    results = classify_image(img)\n",
    "    \n",
    "    # Show results\n",
    "    print(\"\\n🎯 Top 5 Predictions:\")\n",
    "    print(\"-\" * 70)\n",
    "    for rank, (label, confidence) in enumerate(results, 1):\n",
    "        bar_length = int(confidence * 50)\n",
    "        bar = \"█\" * bar_length\n",
    "        print(f\"  {rank}. {label:20s} {confidence*100:5.2f}% {bar}\")\n",
    "    \n",
    "    # Highlight top prediction\n",
    "    top_label, top_confidence = results[0]\n",
    "    print(\"\\n\" + \"=\" * 70)\n",
    "    print(f\"\\n🏆 Best guess: {top_label} ({top_confidence*100:.2f}% confident)\")\n",
    "    \n",
    "except Exception as e:\n",
    "    print(f\"❌ Error: {e}\")\n",
    "    print(\"   Make sure the URL points to a valid image!\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 💡 What Just Happened?\n",
    "\n",
    "1. **We loaded a pre-trained model** - MobileNetV2 was trained on millions of images\n",
    "2. **We preprocessed images** - Resized and formatted them for the model\n",
    "3. **The model made predictions** - It output probabilities for 1000 object classes\n",
    "4. **We decoded the results** - Converted numbers to human-readable labels\n",
    "\n",
    "### Understanding Confidence Scores\n",
    "\n",
    "- **90-100%**: Very confident (almost certainly correct)\n",
    "- **70-90%**: Confident (probably correct)\n",
    "- **50-70%**: Somewhat confident (might be correct)\n",
    "- **Below 50%**: Not very confident (uncertain)\n",
    "\n",
    "### Why might predictions be wrong?\n",
    "\n",
    "- **Unusual angle or lighting** - Model was trained on typical photos\n",
    "- **Multiple objects** - Model expects one main object\n",
    "- **Rare objects** - Model only knows 1000 categories\n",
    "- **Low quality image** - Blurry or pixelated images are harder\n",
    "\n",
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 🚀 Next Steps\n",
    "\n",
    "1. **Try different images:**\n",
    "   - Find images on [Unsplash](https://unsplash.com)\n",
    "   - Right-click → \"Copy image address\" to get URL\n",
    "\n",
    "2. **Experiment:**\n",
    "   - What happens with abstract art?\n",
    "   - Can it recognize objects from different angles?\n",
    "   - How does it handle multiple objects?\n",
    "\n",
    "3. **Learn more:**\n",
    "   - Explore [Computer Vision lessons](../lessons/4-ComputerVision/README.md)\n",
    "   - Learn to train your own image classifier\n",
    "   - Understand how CNNs (Convolutional Neural Networks) work\n",
    "\n",
    "---\n",
    "\n",
    "## 🎉 Congratulations!\n",
    "\n",
    "You just built an image classifier using a state-of-the-art neural network!\n",
    "\n",
    "This same technique powers:\n",
    "- Google Photos (organizing your photos)\n",
    "- Self-driving cars (recognizing objects)\n",
    "- Medical diagnosis (analyzing X-rays)\n",
    "- Quality control (detecting defects)\n",
    "\n",
    "Keep exploring and learning! 🚀"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}