
Commit

revert html encoding
hoosierEE committed Nov 1, 2023
1 parent 4516e2c commit 99aad3b
Showing 1 changed file with 40 additions and 45 deletions.
85 changes: 40 additions & 45 deletions docs/tutorials/word2vec.ipynb
@@ -37,20 +37,20 @@
"id": "AOpGoE2T-YXS"
},
"source": [
"<table class=\"tfo-notebook-buttons\" align=\"left\">\n",
" <td>\n",
" <a target=\"_blank\" href=\"https://www.tensorflow.org/text/tutorials/word2vec\"><img src=\"https://www.tensorflow.org/images/tf_logo_32px.png\" />View on TensorFlow.org</a>\n",
" </td>\n",
" <td>\n",
" <a target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/text/blob/master/docs/tutorials/word2vec.ipynb\"><img src=\"https://www.tensorflow.org/images/colab_logo_32px.png\" />Run in Google Colab</a>\n",
" </td>\n",
" <td>\n",
" <a target=\"_blank\" href=\"https://github.com/tensorflow/text/blob/master/docs/tutorials/word2vec.ipynb\"><img src=\"https://www.tensorflow.org/images/GitHub-Mark-32px.png\" />View on GitHub</a>\n",
" </td>\n",
" <td>\n",
" <a href=\"https://storage.googleapis.com/tensorflow_docs/text/docs/tutorials/word2vec.ipynb\"><img src=\"https://www.tensorflow.org/images/download_logo_32px.png\" />Download notebook</a>\n",
" </td>\n",
"</table>"
"\u003ctable class=\"tfo-notebook-buttons\" align=\"left\"\u003e\n",
" \u003ctd\u003e\n",
" \u003ca target=\"_blank\" href=\"https://www.tensorflow.org/text/tutorials/word2vec\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/tf_logo_32px.png\" /\u003eView on TensorFlow.org\u003c/a\u003e\n",
" \u003c/td\u003e\n",
" \u003ctd\u003e\n",
" \u003ca target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/text/blob/master/docs/tutorials/word2vec.ipynb\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/colab_logo_32px.png\" /\u003eRun in Google Colab\u003c/a\u003e\n",
" \u003c/td\u003e\n",
" \u003ctd\u003e\n",
" \u003ca target=\"_blank\" href=\"https://github.com/tensorflow/text/blob/master/docs/tutorials/word2vec.ipynb\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/GitHub-Mark-32px.png\" /\u003eView on GitHub\u003c/a\u003e\n",
" \u003c/td\u003e\n",
" \u003ctd\u003e\n",
" \u003ca href=\"https://storage.googleapis.com/tensorflow_docs/text/docs/tutorials/word2vec.ipynb\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/download_logo_32px.png\" /\u003eDownload notebook\u003c/a\u003e\n",
" \u003c/td\u003e\n",
"\u003c/table\u003e"
]
},
{
@@ -86,7 +86,7 @@
"id": "xP00WlaMWBZC"
},
"source": [
"## Skip-gram and negative sampling"
"## Skip-gram and negative sampling "
]
},
{
@@ -95,7 +95,7 @@
"id": "Zr2wjv0bW236"
},
"source": [
"While a bag-of-words model predicts a word given the neighboring context, a skip-gram model predicts the context (or neighbors) of a word, given the word itself. The model is trained on skip-grams, which are n-grams that allow tokens to be skipped (see the diagram below for an example). The context of a word can be represented through a set of skip-gram pairs of `(target_word, context_word)` where `context_word` appears in the neighboring context of `target_word`."
"While a bag-of-words model predicts a word given the neighboring context, a skip-gram model predicts the context (or neighbors) of a word, given the word itself. The model is trained on skip-grams, which are n-grams that allow tokens to be skipped (see the diagram below for an example). The context of a word can be represented through a set of skip-gram pairs of `(target_word, context_word)` where `context_word` appears in the neighboring context of `target_word`. "
]
},
{
@@ -106,7 +106,7 @@
"source": [
"Consider the following sentence of eight words:\n",
"\n",
"> The wide road shimmered in the hot sun.\n",
"\u003e The wide road shimmered in the hot sun.\n",
"\n",
"The context words for each of the 8 words of this sentence are defined by a window size. The window size determines the span of words on either side of a `target_word` that can be considered a `context word`. Below is a table of skip-grams for target words based on different window sizes."
]
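As a minimal illustration of the pairing described above, here is a plain-Python sketch that enumerates `(target_word, context_word)` pairs for the example sentence with a window size of 2 (the variable names are illustrative only, not taken from the notebook):

```python
sentence = "The wide road shimmered in the hot sun"
tokens = sentence.lower().split()
window_size = 2

skip_grams = []
for i, target in enumerate(tokens):
    # Context words are the tokens within `window_size` positions of the target.
    for j in range(max(0, i - window_size), min(len(tokens), i + window_size + 1)):
        if j != i:
            skip_grams.append((target, tokens[j]))

print(skip_grams[:6])
# e.g. [('the', 'wide'), ('the', 'road'), ('wide', 'the'), ('wide', 'road'), ...]
```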
@@ -135,7 +135,7 @@
"id": "gK1gN1jwkMpU"
},
"source": [
"The training objective of the skip-gram model is to maximize the probability of predicting context words given the target word. For a sequence of words *w<sub>1</sub>, w<sub>2</sub>, ... w<sub>T</sub>*, the objective can be written as the average log probability"
"The training objective of the skip-gram model is to maximize the probability of predicting context words given the target word. For a sequence of words *w\u003csub\u003e1\u003c/sub\u003e, w\u003csub\u003e2\u003c/sub\u003e, ... w\u003csub\u003eT\u003c/sub\u003e*, the objective can be written as the average log probability"
]
},
{
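Written out with a window size of *c*, the average log probability referred to above takes the standard form from Mikolov et al.:

$$
\frac{1}{T} \sum_{t=1}^{T} \; \sum_{\substack{-c \le j \le c \\ j \ne 0}} \log p(w_{t+j} \mid w_t)
$$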
@@ -171,7 +171,7 @@
"id": "axZvd-hhotVB"
},
"source": [
"where *v* and *v<sup>'<sup>* are target and context vector representations of words and *W* is vocabulary size."
"where *v* and *v\u003csup\u003e'\u003csup\u003e* are target and context vector representations of words and *W* is vocabulary size."
]
},
{
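For reference, the full-softmax formulation that these symbols belong to is, in the usual word2vec notation:

$$
p(w_{t+j} \mid w_t) = \frac{\exp\left({v'_{w_{t+j}}}^{\top} v_{w_t}\right)}{\sum_{w=1}^{W} \exp\left({v'_{w}}^{\top} v_{w_t}\right)}
$$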
@@ -180,7 +180,7 @@
"id": "SoLzxbqSpT6_"
},
"source": [
"Computing the denominator of this formulation involves performing a full softmax over the entire vocabulary words, which are often large (10<sup>5</sup>-10<sup>7</sup>) terms."
"Computing the denominator of this formulation involves performing a full softmax over the entire vocabulary words, which are often large (10\u003csup\u003e5\u003c/sup\u003e-10\u003csup\u003e7\u003c/sup\u003e) terms."
]
},
{
@@ -189,7 +189,7 @@
"id": "Y5VWYtmFzHkU"
},
"source": [
"The [noise contrastive estimation](https://www.tensorflow.org/api_docs/python/tf/nn/nce_loss) (NCE) loss function is an efficient approximation for a full softmax. With an objective to learn word embeddings instead of modeling the word distribution, the NCE loss can be [simplified](https://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf) to use negative sampling."
"The [noise contrastive estimation](https://www.tensorflow.org/api_docs/python/tf/nn/nce_loss) (NCE) loss function is an efficient approximation for a full softmax. With an objective to learn word embeddings instead of modeling the word distribution, the NCE loss can be [simplified](https://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf) to use negative sampling. "
]
},
{
@@ -198,7 +198,7 @@
"id": "WTZBPf1RsOsg"
},
"source": [
"The simplified negative sampling objective for a target word is to distinguish the context word from `num_ns` negative samples drawn from noise distribution *P<sub>n</sub>(w)* of words. More precisely, an efficient approximation of full softmax over the vocabulary is, for a skip-gram pair, to pose the loss for a target word as a classification problem between the context word and `num_ns` negative samples."
"The simplified negative sampling objective for a target word is to distinguish the context word from `num_ns` negative samples drawn from noise distribution *P\u003csub\u003en\u003c/sub\u003e(w)* of words. More precisely, an efficient approximation of full softmax over the vocabulary is, for a skip-gram pair, to pose the loss for a target word as a classification problem between the context word and `num_ns` negative samples."
]
},
{
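Concretely, with `num_ns = 4`, one training example pairs a target word with its true context word plus four sampled negatives, and the label marks only the true context. A small illustrative sketch (the indices below are made up):

```python
num_ns = 4
# One training example: a target word index, the true context word followed by
# num_ns sampled negatives, and a label that is 1 only for the true context word.
target = 3
context = [2, 15, 41, 7, 9]   # [true context] + num_ns negative samples
label = [1, 0, 0, 0, 0]
```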
@@ -296,7 +296,7 @@
"source": [
"Consider the following sentence:\n",
"\n",
"> The wide road shimmered in the hot sun.\n",
"\u003e The wide road shimmered in the hot sun.\n",
"\n",
"Tokenize the sentence:"
]
@@ -332,7 +332,7 @@
"outputs": [],
"source": [
"vocab, index = {}, 1 # start indexing from 1\n",
"vocab['<pad>'] = 0 # add a padding token\n",
"vocab['\u003cpad\u003e'] = 0 # add a padding token\n",
"for token in tokens:\n",
" if token not in vocab:\n",
" vocab[token] = index\n",
@@ -437,8 +437,7 @@
},
"outputs": [],
"source": [
"# print([inverse_vocab[x] for x in example_sequence])\n",
"for target, context in positive_skip_grams[:10]:\n",
"for target, context in positive_skip_grams[:5]:\n",
" print(f\"({target}, {context}): ({inverse_vocab[target]}, {inverse_vocab[context]})\")"
]
},
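The `positive_skip_grams` printed above come from `tf.keras.preprocessing.sequence.skipgrams`; a minimal sketch of that call, assuming the `example_sequence` and `vocab_size` built earlier in the notebook (the window size of 2 is an assumption):

```python
import tensorflow as tf

window_size = 2
positive_skip_grams, _ = tf.keras.preprocessing.sequence.skipgrams(
    example_sequence,
    vocabulary_size=vocab_size,
    window_size=window_size,
    negative_samples=0)  # negatives are drawn separately, as described below
```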
@@ -448,7 +447,7 @@
"id": "_ua9PkMTISF0"
},
"source": [
"### Negative sampling for one skip-gram"
"### Negative sampling for one skip-gram "
]
},
{
@@ -457,7 +456,7 @@
"id": "Esqn8WBfZnEK"
},
"source": [
"The `skipgrams` function returns all positive skip-gram pairs by sliding over a given window span. To produce additional skip-gram pairs that would serve as negative samples for training, you can to sample random words from the vocabulary. Use the `tf.random.log_uniform_candidate_sampler` function to sample `num_ns` number of negative samples for a given target word in a window. You can pass words from the positive class but this does not exclude them from the results. For large vocabularies, this is not a problem because the chance of drawing one of the positive classes is small. However for small data you may see overlap between negative and positive samples. Later we will add code to exclude positive samples for slightly improved accuracy at the cost of longer runtime."
"The `skipgrams` function returns all positive skip-gram pairs by sliding over a given window span. To produce additional skip-gram pairs that would serve as negative samples for training, you can sample random words from the vocabulary. Use the `tf.random.log_uniform_candidate_sampler` function to sample `num_ns` number of negative samples for a given target word in a window. You can pass words from the positive class but this does not exclude them from the results. For large vocabularies, this is not a problem because the chance of drawing one of the positive classes is small. However for small data you may see overlap between negative and positive samples. Later we will add code to exclude positive samples for slightly improved accuracy at the cost of longer runtime."
]
},
{
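A sketch of drawing `num_ns` negatives for one positive pair with `tf.random.log_uniform_candidate_sampler`, assuming the `positive_skip_grams` and `vocab_size` from earlier cells (`SEED` and the shapes are assumptions):

```python
import tensorflow as tf

num_ns = 4
SEED = 42
target_word, context_word = positive_skip_grams[0]

# The sampler expects the positive class as an int64 tensor of shape (1, num_true).
context_class = tf.reshape(tf.constant(context_word, dtype="int64"), (1, 1))
negative_sampling_candidates, _, _ = tf.random.log_uniform_candidate_sampler(
    true_classes=context_class,  # the positive context word for this pair
    num_true=1,                  # one positive per example
    num_sampled=num_ns,          # number of negatives to draw
    unique=True,                 # no duplicate negatives within one draw
    range_max=vocab_size,        # sample word indices in [0, vocab_size)
    seed=SEED,
    name="negative_sampling")
```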
@@ -631,7 +630,7 @@
"id": "iLKwNAczHsKg"
},
"source": [
"### Skip-gram sampling table"
"### Skip-gram sampling table "
]
},
{
@@ -640,7 +639,7 @@
"id": "TUUK3uDtFNFE"
},
"source": [
"A large dataset means larger vocabulary with higher number of more frequent words such as stopwords. Training examples obtained from sampling commonly occurring words (such as `the`, `is`, `on`) don't add much useful information for the model to learn from. [Mikolov et al.](https://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf) suggest subsampling of frequent words as a helpful practice to improve embedding quality."
"A large dataset means larger vocabulary with higher number of more frequent words such as stopwords. Training examples obtained from sampling commonly occurring words (such as `the`, `is`, `on`) don't add much useful information for the model to learn from. [Mikolov et al.](https://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf) suggest subsampling of frequent words as a helpful practice to improve embedding quality. "
]
},
{
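A short sketch of what `tf.keras.preprocessing.sequence.make_sampling_table` produces (the table size of 10 is only for display):

```python
import tensorflow as tf

sampling_table = tf.keras.preprocessing.sequence.make_sampling_table(size=10)
print(sampling_table)
# Entry i is the sampling probability of the i-th most common word in the dataset;
# the most frequent words get the smallest probabilities.
```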
@@ -820,7 +819,7 @@
"id": "sOsbLq8a37dr"
},
"source": [
"Read the text from the file and print the first few lines:"
"Read the text from the file and print the first few lines: "
]
},
{
@@ -1018,7 +1017,7 @@
"outputs": [],
"source": [
"for seq in sequences[:5]:\n",
" print(f\"{seq} => {[inverse_vocab[i] for i in seq]}\")"
" print(f\"{seq} =\u003e {[inverse_vocab[i] for i in seq]}\")"
]
},
{
@@ -1061,7 +1060,7 @@
"print('\\n')\n",
"print(f\"targets.shape: {targets.shape}\")\n",
"print(f\"contexts.shape: {contexts.shape}\")\n",
"print(f\"labels.shape: {labels.shape}\")"
"print(f\"labels.shape: {labels.shape}\")\n"
]
},
{
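With `targets`, `contexts`, and `labels` in hand, a hedged sketch of packing them into a `tf.data.Dataset` for training (the batch and buffer sizes are assumptions):

```python
import tensorflow as tf

BATCH_SIZE = 1024
BUFFER_SIZE = 10000

dataset = tf.data.Dataset.from_tensor_slices(((targets, contexts), labels))
dataset = dataset.shuffle(BUFFER_SIZE).batch(BATCH_SIZE, drop_remainder=True)
dataset = dataset.cache().prefetch(buffer_size=tf.data.AUTOTUNE)
```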
@@ -1200,7 +1199,7 @@
" # word_emb: (batch, embed)\n",
" context_emb = self.context_embedding(context)\n",
" # context_emb: (batch, context, embed)\n",
" dots = tf.einsum('be,bce->bc', word_emb, context_emb)\n",
" dots = tf.einsum('be,bce-\u003ebc', word_emb, context_emb)\n",
" # dots: (batch, context)\n",
" return dots"
]
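The `call` fragment above belongs to a small subclassed Keras model with separate target and context embedding layers; a sketch of what the full class might look like (layer names and constructor arguments are assumptions):

```python
import tensorflow as tf

class Word2Vec(tf.keras.Model):
  def __init__(self, vocab_size, embedding_dim):
    super().__init__()
    # One embedding table for target words...
    self.target_embedding = tf.keras.layers.Embedding(
        vocab_size, embedding_dim, name="w2v_embedding")
    # ...and a second one for context words (positives and sampled negatives).
    self.context_embedding = tf.keras.layers.Embedding(
        vocab_size, embedding_dim)

  def call(self, pair):
    target, context = pair
    # target: (batch,), context: (batch, 1 + num_ns)
    if len(target.shape) == 2:
      target = tf.squeeze(target, axis=1)
    word_emb = self.target_embedding(target)       # (batch, embed)
    context_emb = self.context_embedding(context)  # (batch, context, embed)
    dots = tf.einsum('be,bce->bc', word_emb, context_emb)  # (batch, context)
    return dots
```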
@@ -1227,7 +1226,7 @@
" return tf.nn.sigmoid_cross_entropy_with_logits(logits=x_logit, labels=y_true)\n",
"```\n",
"\n",
"It's time to build your model! Instantiate your word2vec class with an embedding dimension of 128 (you could experiment with different values). Compile the model with the `tf.keras.optimizers.Adam` optimizer."
"It's time to build your model! Instantiate your word2vec class with an embedding dimension of 128 (you could experiment with different values). Compile the model with the `tf.keras.optimizers.Adam` optimizer. "
]
},
{
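A sketch of that instantiation and compile step, assuming the `Word2Vec` class and `vocab_size` defined earlier (the built-in categorical cross-entropy on logits is used here as an alternative to the custom sigmoid loss shown above):

```python
import tensorflow as tf

embedding_dim = 128
word2vec = Word2Vec(vocab_size, embedding_dim)
word2vec.compile(
    optimizer='adam',
    loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True),
    metrics=['accuracy'])
```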
@@ -1282,11 +1281,7 @@
},
"outputs": [],
"source": [
"word2vec.fit(dataset, epochs=20, callbacks=[tensorboard_callback])\n",
"# original\n",
"# 63/63 [==============================] - 1s 15ms/step - loss: 0.4750 - accuracy: 0.8917\n",
"# with negative samples\n",
"# 39/39 [==============================] - 1s 23ms/step - loss: 0.4328 - accuracy: 0.9214"
"word2vec.fit(dataset, epochs=20, callbacks=[tensorboard_callback])"
]
},
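The `tensorboard_callback` passed to `fit` above can be created with the standard Keras callback; a minimal sketch (the log directory name is an assumption):

```python
import tensorflow as tf

tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir="logs")
```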
{
Expand Down Expand Up @@ -1316,7 +1311,7 @@
"id": "awF3iRQCZOLj"
},
"source": [
"<!-- <img class=\"tfo-display-only-on-site\" src=\"images/word2vec_tensorboard.png\"/> -->"
"\u003c!-- \u003cimg class=\"tfo-display-only-on-site\" src=\"images/word2vec_tensorboard.png\"/\u003e --\u003e"
]
},
{
Expand Down Expand Up @@ -1433,9 +1428,9 @@
],
"metadata": {
"colab": {
"toc_visible": true,
"provenance": [],
"private_outputs": true
"collapsed_sections": [],
"name": "word2vec.ipynb",
"toc_visible": true
},
"kernelspec": {
"display_name": "Python 3",
@@ -1444,4 +1439,4 @@
},
"nbformat": 4,
"nbformat_minor": 0
}
}
