{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Experimenting with GPT-2\n",
    "\n",
    "This notebook is part of the [AI for Beginners Curriculum](http://aka.ms/ai-beginners).\n",
    "\n",
    "In this notebook, we will explore how to experiment with the OpenAI GPT-2 model using the Hugging Face `transformers` library.\n",
    "\n",
    "Without further ado, let's instantiate a text generation pipeline and start generating! You can select a smaller GPT-2 model to reduce download time and speed up inference, at some cost in output quality."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Some weights of GPT2Model were not initialized from the model checkpoint at gpt2-large and are newly initialized: ['h.0.attn.masked_bias', 'h.1.attn.masked_bias', 'h.2.attn.masked_bias', 'h.3.attn.masked_bias', 'h.4.attn.masked_bias', 'h.5.attn.masked_bias', 'h.6.attn.masked_bias', 'h.7.attn.masked_bias', 'h.8.attn.masked_bias', 'h.9.attn.masked_bias', 'h.10.attn.masked_bias', 'h.11.attn.masked_bias', 'h.12.attn.masked_bias', 'h.13.attn.masked_bias', 'h.14.attn.masked_bias', 'h.15.attn.masked_bias', 'h.16.attn.masked_bias', 'h.17.attn.masked_bias', 'h.18.attn.masked_bias', 'h.19.attn.masked_bias', 'h.20.attn.masked_bias', 'h.21.attn.masked_bias', 'h.22.attn.masked_bias', 'h.23.attn.masked_bias', 'h.24.attn.masked_bias', 'h.25.attn.masked_bias', 'h.26.attn.masked_bias', 'h.27.attn.masked_bias', 'h.28.attn.masked_bias', 'h.29.attn.masked_bias', 'h.30.attn.masked_bias', 'h.31.attn.masked_bias', 'h.32.attn.masked_bias', 'h.33.attn.masked_bias', 'h.34.attn.masked_bias', 'h.35.attn.masked_bias']\n",
      "You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.\n",
      "Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "[{'generated_text': 'Hello! I am a neural network, and I want to say that I am an expert in the area of learning and understanding neural networks. I also know a bit about math. You might have seen \"How to make deep neural networks\" or \"What is a deep learning neural network?\". I would like to discuss the first one.\\n\\nHow to make deep neural networks is a very complicated topic, and I don\\'t know how to go into it in detail, but I want to do some'},\n",
       " {'generated_text': 'Hello! I am a neural network, and I want to say that you can find a network algorithm, called Naive Bayes, which produces good results, but which is computationally too expensive to be useful, so I will not cover all of its details here.\\n\\nFirst we are going to define how an NN looks like.\\n\\nHere is a naive Bayes classification problem with a few variables:\\n\\nFeature Input Value (x) 1 2 3 4 5 1 3 4'},\n",
       " {'generated_text': 'Hello! I am a neural network, and I want to say that… \"I\\'m done with this crap!\"\\n\\nNow, if you were to be able to tell me all about this, then you did something wrong with the previous part. If you weren\\'t able to tell I did this before, sorry! If your brain was too wired, then you don\\'t know anything about neural networks. I mean, it turns out neural networks have a big problem. It\\'s called lossy neural'},\n",
       " {'generated_text': 'Hello! I am a neural network, and I want to say that I learned this from Wikipedia but also from lots of other books. They also gave me some tips on how to do the analysis, which is really useful for beginners.\\n\\nFor the code, I just used Python and scikit-learn. But I will warn you with two caveats –\\n\\nfirst, this code will have to be imported before you can run them. You have to import scikit-learn after'},\n",
       " {'generated_text': 'Hello! I am a neural network, and I want to say that you\\'re awesome.\"'}]"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from transformers import pipeline\n",
    "\n",
    "model_name = 'gpt2-large' # try 'gpt2' for small model, 'gpt2-medium' for medium one\n",
    "\n",
    "generator = pipeline('text-generation', model=model_name)\n",
    "\n",
    "generator(\"Hello! I am a neural network, and I want to say that\", max_length=100, num_return_sequences=5)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Prompt Engineering\n",
    "\n",
    "For some problems, you can use GPT-2 generation right away, without any training, by designing a suitable prompt. Have a look at the examples below:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "[{'generated_text': 'Synonyms of a word cat:\\n\\n(a) cat;\\n\\n(b) black'},\n",
       " {'generated_text': 'Synonyms of a word cat: cat-bitch; cat-queen; cat-c'},\n",
       " {'generated_text': 'Synonyms of a word cat: feline, feline form, feline spirit, fas'},\n",
       " {'generated_text': 'Synonyms of a word cat:\\n\\n(a) cat-like, like a cat;'},\n",
       " {'generated_text': 'Synonyms of a word cat:\\n\\nThe more common English words you need to understand the meaning'}]"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "generator(\"Synonyms of a word cat:\", max_length=20, num_return_sequences=5)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "[{'generated_text': 'I love when you say this -> Positive\\nI have myself -> Negative\\nThis is awful for you to say this -> Negative\\nHow do you think this hurts us all? -> Positive\\nYou seem'},\n",
       " {'generated_text': \"I love when you say this -> Positive\\nI have myself -> Negative\\nThis is awful for you to say this -> Negative\\nThis is such a horrible way to treat yourself -> Negative (You're\"},\n",
       " {'generated_text': 'I love when you say this -> Positive\\nI have myself -> Negative\\nThis is awful for you to say this -> I am disappointed\\nI have to admit -> You are a good friend but you'},\n",
       " {'generated_text': 'I love when you say this -> Positive\\nI have myself -> Negative\\nThis is awful for you to say this -> Positive\\nThis is awful for me to do this with you -> Negative\\nWhy'},\n",
       " {'generated_text': \"I love when you say this -> Positive\\nI have myself -> Negative\\nThis is awful for you to say this -> I'm still sad for you!\\nMe? \\xa0I find this a\"}]"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "generator(\"I love when you say this -> Positive\\nI have myself -> Negative\\nThis is awful for you to say this ->\", max_length=40, num_return_sequences=5)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "[{'generated_text': 'Translate English to French: cat => chat, dog => chien, student => étudiant;\\n\\n\\nOther:\\n\\nTo learn'},\n",
       " {'generated_text': 'Translate English to French: cat => chat, dog => chien, student => étude, french = le français = \"'},\n",
       " {'generated_text': \"Translate English to French: cat => chat, dog => chien, student => été, mama => m'aime. We\"}]"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "generator(\"Translate English to French: cat => chat, dog => chien, student => \", top_k=50, max_length=30, num_return_sequences=3)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "[{'generated_text': 'People who liked the movie The Matrix also liked \\xa0\"Dawn of the Dead\" (or in the case of John Goodman \"Vanish\"),\\xa0but their votes didn\\'t matter! The first'},\n",
       " {'generated_text': 'People who liked the movie The Matrix also liked _____. It\\'s a true statement, as \"tasteful\" movies are often more popular for their plot, characters and themes than for their entertainment'},\n",
       " {'generated_text': 'People who liked the movie The Matrix also liked \\xa0the book... \\xa0And so on, and so forth. Now at the other end of the spectrum...\\n...the real \"truths'},\n",
       " {'generated_text': \"People who liked the movie The Matrix also liked 『The Grand Budapest Hotel』. 『The Grand Budapest Hotel』 is the movie people who like to watch their dream come true. I'll explain\"},\n",
       " {'generated_text': 'People who liked the movie The Matrix also liked \\xa0a lot of the lines that were used, and I got to listen to the dialogue in the movie. \\xa0The characters were really enjoyable characters'}]"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "generator(\"People who liked the movie The Matrix also liked \", max_length=40, num_return_sequences=5)"
   ]
  },
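  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Text Sampling Strategies\n",
    "\n",
    "So far we have been calling the pipeline with its default decoding settings. The simplest decoding strategy is **greedy** search, which always selects the next word with the highest probability. Note, however, that greedy search is deterministic, so the five different continuations returned below indicate that the pipeline is sampling by default for this model:"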
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "[{'generated_text': \"It was early evening when I can back from work. I usually work late, but this time it was an exception. When I entered a room, I saw my girlfriend with a very cute looking guy. I noticed that his eyes that looked into mine were a bit scary and kind of like a predator's eyes. And that's when I knew that I had to know something about him.\\n\\nAs I sat, studying him, my mind was racing in a crazy way. We didn't exactly\"},\n",
       " {'generated_text': 'It was early evening when I can back from work. I usually work late, but this time it was an exception. When I entered a room, I saw my reflection in a small screen attached to a thin, white piece of glass. Suddenly, I thought about the possibility of making money making a computer, and decided to try it for myself! I started working on making an e-ink display. I was able to make it work by following some simple rules!\\n\\nThe first step is'},\n",
       " {'generated_text': 'It was early evening when I can back from work. I usually work late, but this time it was an exception. When I entered a room, I saw a pile of clothing and furniture. They were completely ruined. The place had been abandoned. It had been a home, not a work place.\\n\\nWhen the police arrived, they found the crime scene and I walked with two officers to the police station. Before that I had not gone to the police station and I have never been to'},\n",
       " {'generated_text': 'It was early evening when I can back from work. I usually work late, but this time it was an exception. When I entered a room, I saw a woman who could not be a mother of any kind. I saw the same woman with her eyes closed and her mouth wide open. She was sitting naked on a couch and was sitting topless. I saw at least three different women with similar posture. The only thing I kept telling myself was \"if this is how a woman eats,'},\n",
       " {'generated_text': \"It was early evening when I can back from work. I usually work late, but this time it was an exception. When I entered a room, I saw a pale person with long white hair sitting quietly. He was a bit bigger and his eyes looked different. He was watching me.\\n\\nI didn't have to say anything. The man kept studying me, and after a while he turned his eyes away from me to the floor. In a moment he was gone, and he was still\"}]"
      ]
     },
     "execution_count": 28,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "prompt = \"It was early evening when I can back from work. I usually work late, but this time it was an exception. When I entered a room, I saw\"\n",
    "generator(prompt, max_length=100, num_return_sequences=5)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Beam search** lets the generator explore several candidate continuations (*beams*) in parallel and keep the ones with the highest overall score. You enable it by passing the `num_beams` parameter. You can also set `no_repeat_ngram_size` to prevent the model from repeating any n-gram of the given size:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "[{'generated_text': 'It was early evening when I can back from work. I usually work late, but this time it was an exception. When I entered a room, I saw a group of people sitting around a table. One of them was a middle-aged man. He was wearing a black suit and a white shirt. His eyes were closed and he was staring at the floor with his hands on his knees.\\n\\n\"What are you doing here?\" I asked him. \"What do you want?\"\\n'},\n",
       " {'generated_text': 'It was early evening when I can back from work. I usually work late, but this time it was an exception. When I entered a room, I saw a man sitting at a table in front of a computer. He was wearing a black suit, a white shirt, and a red tie.\\n\\n\"Hello,\" he said to me. \"How are you?\" he asked me in a voice that sounded as if he was speaking to a child. There was a smile on his face.'},\n",
       " {'generated_text': 'It was early evening when I can back from work. I usually work late, but this time it was an exception. When I entered a room, I saw a man sitting on the edge of the bed. He was staring at me.\\n\\n\"What are you doing here?\" he asked in a low voice. \"I don\\'t know why you\\'re here, and I\\'m not interested in what you have to say,\" he said, as if he didn\\'t understand what I was saying.'},\n",
       " {'generated_text': 'It was early evening when I can back from work. I usually work late, but this time it was an exception. When I entered a room, I saw a man sitting in a chair in front of a desk. He was in his mid-thirties, and he was wearing a dark suit and a white shirt.\\n\\n\"What are you doing here?\" I asked him. The man didn\\'t answer. Instead, he walked over to the desk and sat down on it. \"'},\n",
       " {'generated_text': 'It was early evening when I can back from work. I usually work late, but this time it was an exception. When I entered a room, I saw that the door was open, and a man was sitting on a chair. He was wearing a white shirt and black pants.\\n\\n\"What are you doing here?\" I asked. The man looked at me for a moment, then said, \"I have something to tell you.\" He handed me a piece of paper. It was a'}]"
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "prompt = \"It was early evening when I can back from work. I usually work late, but this time it was an exception. When I entered a room, I saw\"\n",
    "generator(prompt, max_length=100, num_return_sequences=5, num_beams=10, no_repeat_ngram_size=2)"
   ]
  },
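  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To illustrate why beam search can beat greedy decoding, here is a minimal sketch that does not use GPT-2. It runs beam search over a tiny hand-written table of next-word probabilities; the table, the words, and the `beam_search` helper are all assumptions made up for this illustration. With a single beam the search commits to `the` (probability 0.6) and ends with a sequence of probability 0.6 × 0.5 = 0.30, while with two beams it also keeps `a` alive and finds a sequence of probability 0.4 × 0.9 = 0.36."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import math\n",
    "\n",
    "# Hypothetical toy 'language model': next-word probabilities given the previous word\n",
    "toy_model = {\n",
    "    '<s>': {'the': 0.6, 'a': 0.4},\n",
    "    'the': {'cat': 0.5, 'dog': 0.5},\n",
    "    'a': {'cat': 0.9, 'dog': 0.1},\n",
    "    'cat': {'</s>': 1.0},\n",
    "    'dog': {'</s>': 1.0},\n",
    "}\n",
    "\n",
    "def beam_search(model, num_beams, max_steps=3):\n",
    "    # Each beam is a pair (list of words, log-probability of the sequence)\n",
    "    beams = [(['<s>'], 0.0)]\n",
    "    for _ in range(max_steps):\n",
    "        candidates = []\n",
    "        for words, score in beams:\n",
    "            if words[-1] == '</s>':\n",
    "                # Finished sequences are carried over unchanged\n",
    "                candidates.append((words, score))\n",
    "                continue\n",
    "            for word, p in model[words[-1]].items():\n",
    "                candidates.append((words + [word], score + math.log(p)))\n",
    "        # Keep only the num_beams highest-scoring candidates\n",
    "        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:num_beams]\n",
    "    return beams\n",
    "\n",
    "# Compare a single beam (equivalent to greedy search) with two beams\n",
    "for n in (1, 2):\n",
    "    words, score = beam_search(toy_model, num_beams=n)[0]\n",
    "    print(n, ' '.join(words), round(math.exp(score), 2))"
   ]
  },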
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Sampling** selects the next word non-deterministically, according to the probability distribution returned by the model. You can turn on sampling with the `do_sample=True` parameter. You can also specify `temperature` to make the output more or less deterministic: lower values make it more predictable, higher values more random."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "[{'generated_text': 'It was early evening when I can back from work. I usually work late, but this time it was an exception. When I entered a room, I saw one of my colleagues lying on his stomach. He was unconscious. I could smell a strong odor of alcohol on his breath. I was on my way out. I ran to his room and found him unconscious on the bed. I saw a bottle of wine on the bed. He was in a state of intoxication. He was unconscious. I'}]"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "prompt = \"It was early evening when I can back from work. I usually work late, but this time it was an exception. When I entered a room, I saw\"\n",
    "generator(prompt, max_length=100, do_sample=True, temperature=0.8)"
   ]
  },
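  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To build some intuition for `temperature`, the cell below is a minimal sketch that does not use GPT-2 at all. It applies a temperature-scaled softmax to a made-up vector of scores (*logits*) for four candidate words and then draws one word from the resulting distribution. The vocabulary, the scores, and the helper name `softmax_with_temperature` are assumptions chosen for illustration only. Lower temperatures concentrate the probability on the top-scoring word, while higher temperatures flatten the distribution."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "# Hypothetical candidate words and made-up scores (logits), for illustration only\n",
    "vocab = ['man', 'woman', 'table', 'ghost']\n",
    "logits = np.array([2.0, 1.5, 0.5, 0.1])\n",
    "\n",
    "def softmax_with_temperature(logits, temperature):\n",
    "    # Divide the logits by the temperature, then apply a numerically stable softmax\n",
    "    scaled = logits / temperature\n",
    "    exp = np.exp(scaled - scaled.max())\n",
    "    return exp / exp.sum()\n",
    "\n",
    "rng = np.random.default_rng(0)  # fixed seed so the draws are repeatable\n",
    "for t in (0.5, 1.0, 2.0):\n",
    "    probs = softmax_with_temperature(logits, t)\n",
    "    sampled = rng.choice(vocab, p=probs)\n",
    "    print(t, dict(zip(vocab, probs.round(3).tolist())), 'sampled:', sampled)"
   ]
  },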
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can also pass two additional parameters to control sampling:\n",
    "* `top_k` limits sampling to the k most probable next words. This reduces the chance of getting weird (low-probability) words in our text.\n",
    "* `top_p` is similar, but we choose the smallest set of most probable words whose total probability is at least p.\n",
    "\n",
    "Feel free to experiment with adding those parameters."
   ]
  },
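  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The two filters described above can be sketched on a toy distribution. This is only an illustration under stated assumptions, not the library's implementation: the five words, their probabilities, and the helper names `top_k_filter` and `top_p_filter` are made up. Each helper zeroes out the words that fall outside the selected set and renormalizes the rest. With the pipeline itself, a call such as `generator(prompt, max_length=100, do_sample=True, top_k=50, top_p=0.95)` passes the same two parameters; the values 50 and 0.95 are arbitrary starting points."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "# Made-up next-word distribution, for illustration only\n",
    "vocab = ['man', 'woman', 'table', 'ghost', 'xylophone']\n",
    "probs = np.array([0.5, 0.25, 0.15, 0.07, 0.03])\n",
    "\n",
    "def top_k_filter(probs, k):\n",
    "    # Keep the k most probable words, zero out the rest, and renormalize\n",
    "    keep = np.argsort(probs)[::-1][:k]\n",
    "    filtered = np.zeros_like(probs)\n",
    "    filtered[keep] = probs[keep]\n",
    "    return filtered / filtered.sum()\n",
    "\n",
    "def top_p_filter(probs, p):\n",
    "    # Keep the smallest set of most probable words whose total probability reaches p\n",
    "    order = np.argsort(probs)[::-1]\n",
    "    cumulative = np.cumsum(probs[order])\n",
    "    cutoff = np.searchsorted(cumulative, p) + 1\n",
    "    keep = order[:cutoff]\n",
    "    filtered = np.zeros_like(probs)\n",
    "    filtered[keep] = probs[keep]\n",
    "    return filtered / filtered.sum()\n",
    "\n",
    "print('top_k=2  :', dict(zip(vocab, top_k_filter(probs, 2).round(3).tolist())))\n",
    "print('top_p=0.8:', dict(zip(vocab, top_p_filter(probs, 0.8).round(3).tolist())))"
   ]
  },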
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Fine-Tuning GPT-2\n",
    "\n",
    "You can also fine-tune GPT-2 on your own dataset. This lets you adjust the style of the generated text while keeping most of what the language model has already learned. An example of fine-tuning GPT-2 to generate song lyrics can be found [in this blog post](https://towardsdatascience.com/how-to-fine-tune-gpt-2-for-text-generation-ae2ea53bc272)."
   ]
  }
 ],
 "metadata": {
  "interpreter": {
   "hash": "16af2a8bbb083ea23e5e41c7f5787656b2ce26968575d8763f2c4b17f9cd711f"
  },
  "kernelspec": {
   "display_name": "Python 3.8.12 ('py38')",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.12"
  },
  "orig_nbformat": 4
 },
 "nbformat": 4,
 "nbformat_minor": 2
}