ELMo TensorFlow implementation

Sebastian Correa
Nov 22, 2019


An ELMo layer implementation using TensorFlow Hub, and how to prepare data for any NLP task.

ELMo (Embeddings from Language Models) is a deep contextualized word representation that models both (1) complex characteristics of word use (e.g., syntax and semantics), and (2) how these uses vary across linguistic contexts (i.e., to model polysemy). These word vectors are learned functions of the internal states of a deep bidirectional language model (biLM), which is pre-trained on a large text corpus. They can be easily added to existing models and significantly improve the state of the art across a broad range of challenging NLP problems, including question answering, textual entailment, and sentiment analysis.¹

In this post, we will focus on how to load a pre-trained ELMo model for use in our own applications. We can choose whether or not to train the embedding weights, depending on the approach we need. First, we need some dependencies.

pip install tensorflow-hub

To build a model we need to prepare the text data. First, we need to define the max sentence length. Normally I use both the train and test data to define the max sentence length, but you can apply any method that works for you. Then we build arrays of tokens by splitting each sentence on spaces.
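A minimal sketch of that preparation step (the toy sentences, the pad_tokens helper, and the "__PAD__" token are illustrative assumptions, not code from the original post):

import numpy as np

# Toy data; in practice these come from your own dataset.
train_sentences = ["the movie was great", "i did not like it at all"]
test_sentences = ["what a wonderful film"]

# Define the max sentence length from both the train and test data.
max_len = max(len(s.split()) for s in train_sentences + test_sentences)

def pad_tokens(sentence, max_len, pad_token="__PAD__"):
    # Split the sentence on spaces and pad (or truncate) it to max_len tokens.
    tokens = sentence.split()[:max_len]
    return tokens + [pad_token] * (max_len - len(tokens))

X_train = np.array([pad_tokens(s, max_len) for s in train_sentences])
X_test = np.array([pad_tokens(s, max_len) for s in test_sentences])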

Now we can download and build our pre-trained ELMo layer. For more information, you can refer to TensorFlow Hub.
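For example, assuming TensorFlow 1.x and the public elmo/2 module on TensorFlow Hub, the download step looks roughly like this; trainable controls whether the embedding weights are updated during training, as discussed above:

import tensorflow as tf
import tensorflow_hub as hub

# Download the pre-trained ELMo module from TensorFlow Hub.
# trainable=True exposes the module's trainable weights to the optimizer;
# use trainable=False if you want to keep the embeddings frozen.
elmo = hub.Module("https://tfhub.dev/google/elmo/2", trainable=True)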

It’s time to build our model architecture. To use the ELMo layer, we need to import it and pass the required parameters.
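One common way to do this (a sketch, not the post’s exact code) is to wrap the module in a Keras Lambda layer using the module’s "tokens" signature, which takes the padded token arrays and the sequence lengths. The classifier head below is an illustrative choice; elmo and max_len come from the previous sketches:

from tensorflow.keras.layers import Input, Lambda, Bidirectional, LSTM, Dense
from tensorflow.keras.models import Model

batch_size = 2  # the Lambda below assumes a fixed batch size

def elmo_embedding(x):
    # The "tokens" signature expects padded token arrays plus the length
    # of each sequence; here every sentence was padded to max_len.
    return elmo(
        inputs={
            "tokens": tf.squeeze(tf.cast(x, tf.string)),
            "sequence_len": tf.constant(batch_size * [max_len]),
        },
        signature="tokens",
        as_dict=True,
    )["elmo"]

input_text = Input(shape=(max_len,), dtype="string")
embedding = Lambda(elmo_embedding, output_shape=(max_len, 1024))(input_text)
x = Bidirectional(LSTM(128))(embedding)
output = Dense(1, activation="sigmoid")(x)

model = Model(input_text, output)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])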

It’s really important to add the two sess.run lines before fitting the model, because the ELMo weights must be initialized for the model to work. I hope this helps.
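For completeness, here is a sketch of that final step under the same TF 1.x assumptions (y_train is an illustrative label array for the toy sentences above, not from the original post):

from tensorflow.keras import backend as K

y_train = np.array([1, 0])  # illustrative labels for the toy sentences

sess = K.get_session()
# Initialize the ELMo variables and the lookup tables inside the hub module;
# without these two lines model.fit will fail.
sess.run(tf.global_variables_initializer())
sess.run(tf.tables_initializer())

model.fit(X_train, y_train, batch_size=batch_size, epochs=3)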

[1]: Matthew E. Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, Luke Zettlemoyer. Deep contextualized word representations. NAACL 2018. https://arxiv.org/abs/1802.05365

Written by Sebastian Correa

Experienced engineer in machine learning, pattern recognition, NLP, and computer vision. Passionate about AI product conceptualization and management.