In this example, we learn how to apply pre-trained word embeddings. This can be useful when your dataset is too small to learn the embeddings from the data itself. However, for regression and classification tasks, pre-trained word embeddings rarely perform as well as embeddings learned from the data itself, as this example will show.

Learning objectives:

  • How to prepare pre-trained word embeddings
  • How to apply pre-trained word embeddings

Requirements

library(keras)     # deep learning modeling
library(tidyverse) # various data wrangling & visualization tasks
library(progress)  # provides progress bar for status updates during long loops
library(glue)      # easy print statements

Preprocess IMDB data

To minimize notebook clutter, I have extracted the code that preprocesses the training data into a separate script. This script uses the same downloaded IMDB movie reviews as the previous module and preprocesses them in the same way.

source("prepare_imdb.R")
Importing reviews and labels...✔ 
Processing text with tokenizer...✔ 
Creating feature and label tensors...✔ 
Cleaning up global environment...✔ 

As a result, we have a few key pieces of information that we will use in downstream modeling (e.g., the word index, features, and max sequence length).

ls()
[1] "features"         "labels"           "max_len"          "num_words_used"  
[5] "sequences"        "texts"            "tokenizer"        "top_n_words"     
[9] "total_word_index"

Prepare GloVe pre-trained word embeddings

We are going to use the pre-trained GloVe word embeddings, which can be downloaded from https://nlp.stanford.edu/projects/glove/. For this example, we downloaded the glove.6B.zip file, which contains 400K words and their associated word embeddings. Here, we’ll use the 100-dimensional word embeddings, which have already been saved for you in the data directory.

See the requirements set-up notebook for download instructions.

# get path to downloaded data
if (stringr::str_detect(here::here(), "conf-2020-user")) {
  path <- "/home/conf-2020-user/data/glove/glove.6B.100d.txt"
} else {
  path <- here::here("materials", "data", "glove", "glove.6B.100d.txt")
}

glove_wts <- data.table::fread(path, quote = "", data.table = FALSE) %>% 
  as_tibble()

dim(glove_wts)
[1] 400000    101

Our imported data frame holds each word (or grammatical symbol) in the first column, followed by the 100 weights of its embedding vector.

head(glove_wts)
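
To make the layout concrete, here is a minimal base R sketch that parses two made-up lines, assuming each line of the text file stores a token followed by its space-separated weights (which is how the import ends up with one word column plus the weight columns). The toy_* names and values are hypothetical and only three weights wide; the notebook itself relies on data.table::fread() for the real file.

```r
# two made-up lines: a token, then its weights, separated by single spaces
toy_lines <- c("the 0.10 -0.20 0.30", "and 0.40 0.50 -0.60")

# split each line into its fields
parts <- strsplit(toy_lines, " ", fixed = TRUE)

# the first field is the token; the remaining fields form the vector
toy_tokens <- vapply(parts, function(p) p[1], character(1))
toy_vectors <- t(vapply(parts, function(p) as.numeric(p[-1]), numeric(3)))
rownames(toy_vectors) <- toy_tokens

toy_vectors
```

Each row of toy_vectors plays the role of one row of glove_wts, with the token moved into the row name.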

However, pre-trained models are typically trained on entirely different data (or vocabulary). Consequently, they do not always cover every word present in our dataset, as the following illustrates:

applicable_index <- total_word_index[total_word_index <= top_n_words]
applicable_words <- names(applicable_index)

available_wts <- glove_wts %>%
  filter(V1 %in% applicable_words) %>% 
  pull(V1)

diff <- length(applicable_words) - length(available_wts)

glue("There are {diff} words in our IMDB data that are not represented in GloVe")
There are 199 words in our IMDB data that are not represented in GloVe
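
The count alone does not tell us which words are missing. As a small sketch on made-up data, setdiff() lists the uncovered words and mean() over %in% gives the coverage rate. The toy_* vectors are hypothetical stand-ins for applicable_words and the V1 column of glove_wts.

```r
# made-up stand-ins for our vocabulary and the GloVe tokens
toy_vocab <- c("the", "movie", "zzyzx", "great", "qwfp")
toy_glove_tokens <- c("the", "great", "movie", "film")

# words in our vocabulary that have no pre-trained embedding
missing_words <- setdiff(toy_vocab, toy_glove_tokens)

# share of our vocabulary that is covered
coverage <- mean(toy_vocab %in% toy_glove_tokens)

missing_words
coverage
```

On the real objects, the equivalent call would presumably be setdiff(applicable_words, glove_wts$V1).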

We need to create our own embedding matrix with all applicable words represented. The rows must follow the order of our word index so the embeddings are properly aligned, so we start by creating an empty matrix to fill.

# required dimensions of our embedding matrix
num_words_used <- length(applicable_words)
embedding_dim <- ncol(glove_wts) - 1

# create empty matrix
embedding_matrix <- matrix(0, nrow = num_words_used, ncol = embedding_dim)
row.names(embedding_matrix) <- applicable_words

# First 10 rows & columns of our empty matrix
embedding_matrix[1:10, 1:10]
    [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
the    0    0    0    0    0    0    0    0    0     0
and    0    0    0    0    0    0    0    0    0     0
a      0    0    0    0    0    0    0    0    0     0
of     0    0    0    0    0    0    0    0    0     0
to     0    0    0    0    0    0    0    0    0     0
is     0    0    0    0    0    0    0    0    0     0
br     0    0    0    0    0    0    0    0    0     0
in     0    0    0    0    0    0    0    0    0     0
it     0    0    0    0    0    0    0    0    0     0
i      0    0    0    0    0    0    0    0    0     0

To fill our embedding matrix, we loop through our applicable words, look up each word’s GloVe weights, and add them to the matching row of our empty embedding matrix so that they align with the word index order. If a word does not exist in the pre-trained word embeddings, we set its embedding values to 0.

Note: this takes a little less than 2 minutes to process.

# this just allows us to track progress of our loop
pb <- progress_bar$new(total = num_words_used)

for (word in applicable_words) {
  # track progress
  pb$tick()
  
  # get embeddings for a given word
  embeddings <- glove_wts %>%
    filter(V1 == word) %>%
    select(-V1) %>% 
    as.numeric()
  
  # if embeddings don't exist create a vector of all zeros
  if (all(is.na(embeddings))) {
    embeddings <- vector("numeric", embedding_dim)
  }
  
  # add embeddings to appropriate location in matrix
  embedding_matrix[word, ] <- embeddings
}
embedding_matrix[1:10, 1:8]
         [,1]      [,2]      [,3]      [,4]      [,5]      [,6]      [,7]      [,8]
the -0.038194 -0.244870  0.728120 -0.399610  0.083172  0.043953 -0.391410  0.334400
and -0.071953  0.231270  0.023731 -0.506380  0.339230  0.195900 -0.329430  0.183640
a   -0.270860  0.044006 -0.020260 -0.173950  0.644400  0.712130  0.355100  0.471380
of  -0.152900 -0.242790  0.898370  0.169960  0.535160  0.487840 -0.588260 -0.179820
to  -0.189700  0.050024  0.190840 -0.049184 -0.089737  0.210060 -0.549520  0.098377
is  -0.542640  0.414760  1.032200 -0.402440  0.466910  0.218160 -0.074864  0.473320
br   0.197880  0.252650 -0.283080 -0.110950 -0.733530 -0.475200  0.133500 -0.143690
in   0.085703 -0.222010  0.165690  0.133730  0.382390  0.354010  0.012870  0.224610
it  -0.306640  0.168210  0.985110 -0.336060 -0.241600  0.161860 -0.053496  0.430100
i   -0.046539  0.619660  0.566470 -0.465840 -1.189000  0.445990  0.066035  0.319100
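
The loop above filters all 400K GloVe rows once per word, which is why it takes a while. As a rough alternative sketch (not the approach used in this notebook), base R’s match() can look up every word in a single pass. The toy_* objects below are made-up stand-ins for glove_wts and applicable_words.

```r
# made-up stand-in for glove_wts: token column V1 plus two weight columns
toy_glove <- data.frame(
  V1 = c("the", "movie", "great"),
  V2 = c(0.1, 0.2, 0.3),
  V3 = c(-0.1, -0.2, -0.3),
  stringsAsFactors = FALSE
)

# made-up stand-in for applicable_words; "zzyzx" has no embedding
toy_words <- c("the", "zzyzx", "great")

# row of toy_glove for each word, NA when the word is not available
row_idx <- match(toy_words, toy_glove$V1)

# pull all rows at once, in the order of toy_words
toy_matrix <- as.matrix(toy_glove[row_idx, -1])

# words without an embedding get a vector of zeros
toy_matrix[is.na(row_idx), ] <- 0
rownames(toy_matrix) <- toy_words

toy_matrix
```

The result has one row per word in the same order as toy_words, just like the matrix the loop builds.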

Now we’re ready to train our model.

Create and train our model

Define a model

We will be using the same model architecture as we did in the IMDB word embeddings notebook. The top_n_words and max_len objects come from the source("prepare_imdb.R") code chunk.

model <- keras_model_sequential() %>% 
  layer_embedding(input_dim = top_n_words, 
                  output_dim = embedding_dim, 
                  input_length = max_len) %>% 
  layer_flatten() %>% 
  layer_dense(units = 1, activation = "sigmoid")

summary(model)
Model: "sequential_1"
________________________________________________________________________________________
Layer (type)                           Output Shape                       Param #       
========================================================================================
embedding_1 (Embedding)                (None, 150, 100)                   1000000       
________________________________________________________________________________________
flatten_1 (Flatten)                    (None, 15000)                      0             
________________________________________________________________________________________
dense_1 (Dense)                        (None, 1)                          15001         
========================================================================================
Total params: 1,015,001
Trainable params: 1,015,001
Non-trainable params: 0
________________________________________________________________________________________
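
The parameter counts in this summary can be reproduced by hand. The sketch below uses sizes read off the summary itself: a sequence length of 150 and an embedding dimension of 100 from the output shapes, and a vocabulary size of 10,000 inferred from the 1,000,000 embedding parameters. The variable names are hypothetical.

```r
# sizes read off (or inferred from) the model summary
vocab_size <- 10000  # rows in the embedding layer
embed_dim <- 100     # length of each word vector
seq_len <- 150       # words per padded review

# one vector of length embed_dim per word in the vocabulary
embedding_params <- vocab_size * embed_dim

# flattening a (seq_len x embed_dim) review gives this many inputs
flattened_size <- seq_len * embed_dim

# one weight per flattened input plus a single bias for the one output unit
dense_params <- flattened_size * 1 + 1

total_params <- embedding_params + dense_params
total_params
```

The embedding layer accounts for almost all of the parameters, which is why its weights matter so much to this model.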

Load the GloVe embeddings in the model

To set the weights of our embedding layer to our pre-trained embedding matrix, we:

  1. access our first layer,
  2. set the weights by supplying our embedding matrix,
  3. freeze the weights so they are not adjusted when we train our model.

get_layer(model, index = 1) %>% 
  set_weights(list(embedding_matrix)) %>% 
  freeze_weights()

Train and evaluate

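Let’s compile our model and train it.

model %>% compile(
  optimizer = optimizer_rmsprop(lr = 0.0001),
  loss = "binary_crossentropy",
  metrics = c("acc")
)

history <- model %>% fit(
  features, labels,
  epochs = 20,
  batch_size = 32,
  validation_split = 0.2,
  callbacks = list(
    callback_early_stopping(patience = 5),
    callback_reduce_lr_on_plateau(patience = 2)
  )
)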

Our best performance is less than stellar!

best_epoch <- which(history$metrics$val_loss == min(history$metrics$val_loss))
loss <- history$metrics$val_loss[best_epoch] %>% round(3)
acc <- history$metrics$val_acc[best_epoch] %>% round(3)

glue("The best epoch had a loss of {loss} and accuracy of {acc}")
The best epoch had a loss of 0.687 and accuracy of 0.584
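
One detail worth knowing about the lookup above: which(x == min(x)) returns every position that ties for the minimum, so loss and acc could each hold more than one value. A small sketch with made-up validation losses shows the difference from which.min(), which always returns only the first minimum. The toy_val_loss values are hypothetical.

```r
# made-up validation losses with a tie at epochs 2 and 3
toy_val_loss <- c(0.70, 0.65, 0.65, 0.68)

# every epoch that ties for the lowest loss
tied_epochs <- which(toy_val_loss == min(toy_val_loss))

# only the first epoch with the lowest loss
first_best <- which.min(toy_val_loss)

tied_epochs  # 2 3
first_best   # 2
```

Ties between floating-point losses are rare in practice, so either form usually gives the same single epoch.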

When to use pre-trained models?

Recall that word embeddings are often trained as part of language models. This means the embeddings are optimized to predict a word from its surrounding context, an objective that rarely transfers well to ordinary regression and classification modeling.

Things to consider:

  • When you have very little text data, you often cannot produce reliable embeddings. Pre-trained embeddings can sometimes provide performance improvements in these situations.

  • Even better, when you have very little data, find alternative data sources that align with your own (e.g., use Amazon product reviews as an alternative data source to your company’s product reviews). You can then train word embeddings on this alternative data source and use those embeddings for your own problem.

  • Some language models (e.g., ULMFiT) can be transferred for classification purposes, much like VGG16, ResNet50, etc. in the CNN context. Consequently, you can use these models, freeze and unfreeze layer weights, and re-train only part of the model to make it specific to your task.
