Do BERT and ELMo Representations Encode Syntax Information?

BERT and ELMo generate a contextual representation for each word in a sentence. However, do these representations encode syntax information? The paper A Structural Probe for Finding Syntax in Word Representations gives us the answer.

From this paper, we learn that:

syntax trees are embedded in a linear transformation of BERT's and ELMo's word representation space.

How do we evaluate the distance between words?

The distance between two words in a parse tree is the number of edges on the path that connects them:

Figure: words in the parse tree
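The tree distance can be computed with a breadth-first search from each word. The snippet below is a minimal sketch, not code from the paper: the function name `tree_distances`, the head-index encoding, and the example sentence with its parse are all assumptions made for illustration.

```python
from collections import deque


def tree_distances(heads):
    """Return the matrix of path lengths (in edges) between all word pairs.

    heads[i] is the index of the head of word i; the root has head -1.
    (This encoding is an assumption chosen for the example.)
    """
    n = len(heads)
    # Build an undirected adjacency list from the head pointers.
    neighbors = [[] for _ in range(n)]
    for child, head in enumerate(heads):
        if head >= 0:
            neighbors[child].append(head)
            neighbors[head].append(child)

    dist = [[0] * n for _ in range(n)]
    for start in range(n):
        # Breadth-first search gives the edge count to every other word.
        seen = {start}
        queue = deque([(start, 0)])
        while queue:
            node, d = queue.popleft()
            dist[start][node] = d
            for nxt in neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, d + 1))
    return dist


# Illustrative sentence and parse (assumed, not taken from the paper's data):
# The -> chef, chef -> cooked, cooked = root, the -> meal, meal -> cooked
words = ["The", "chef", "cooked", "the", "meal"]
heads = [1, 2, -1, 4, 2]
distances = tree_distances(heads)
print(distances[0][3])  # "The" to "the": path The-chef-cooked-meal-the, 4 edges
```

For a 5-word sentence this produces a symmetric 5 × 5 matrix with zeros on the diagonal.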

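Then, we can use the parse tree distances as supervision to train a weight matrix \(B\), a linear transformation under which the distances between word vectors should match the distances in the tree.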
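For reference, the probe distance and the training objective can be written as follows. The notation here is a restatement under assumed symbols: \(h_i\) is the representation of word \(i\), \(d_{T}\) is the distance in the parse tree, \(|s^\ell|\) is the length of sentence \(\ell\), and the sum over \(\ell\) runs over the training sentences.

\[
d_B(h_i, h_j)^2 = \big(B(h_i - h_j)\big)^{\top} \big(B(h_i - h_j)\big)
\]

\[
\min_B \sum_{\ell} \frac{1}{|s^\ell|^2} \sum_{i,j} \Big| d_{T^\ell}(w_i^\ell, w_j^\ell) - d_B(h_i^\ell, h_j^\ell)^2 \Big|
\]

In words: the squared L2 distance between two transformed word vectors is trained to match the tree distance between the two words, averaged over all word pairs in each sentence.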

Figure: training the distance using the parse tree
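To make the training target concrete, here is a minimal sketch of the loss for a single sentence: the squared distance between word vectors after applying \(B\) is compared with the tree distance for every word pair. The function names, the toy vectors, and the identity matrix used for \(B\) are made up for illustration; a real probe would learn \(B\) by gradient descent over many sentences.

```python
def probe_distance_sq(B, h_i, h_j):
    """Squared L2 distance between two word vectors after applying B."""
    diff = [a - b for a, b in zip(h_i, h_j)]
    # Project the difference vector with B (a matrix stored as a list of rows).
    projected = [sum(w * x for w, x in zip(row, diff)) for row in B]
    return sum(p * p for p in projected)


def probe_loss(B, vectors, tree_dist):
    """Mean absolute gap between tree distances and predicted squared distances."""
    n = len(vectors)
    total = 0.0
    for i in range(n):
        for j in range(n):
            predicted = probe_distance_sq(B, vectors[i], vectors[j])
            total += abs(tree_dist[i][j] - predicted)
    # Normalize by the number of word pairs in the sentence.
    return total / (n * n)


# Toy 2-dimensional "word vectors" for a 3-word sentence (made-up numbers).
vectors = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]]
# Tree distances for a chain w0 - w1 - w2.
tree_dist = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
# Identity matrix as a stand-in for a trained probe, purely for illustration.
B = [[1.0, 0.0], [0.0, 1.0]]
loss = probe_loss(B, vectors, tree_dist)
print(loss)  # 0.0: these toy vectors already match the tree distances
```

A lower loss means the transformed space reproduces the tree distances more faithfully.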

However, there is a problem. If a sentence contains 5 words, the probe gives us a 5 × 5 matrix of predicted distances, one for every pair of words. How do we find the correct links among the words?

In this paper, a minimum spanning tree is built over the predicted distances to recover the dependency parse structure, for both ELMo and BERT.

Then we will see a tree like:

Figure: a minimum spanning tree and the parse tree
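The step from a distance matrix to a tree can be sketched with Prim's algorithm. This is an illustrative implementation, not the paper's code: the function name is an assumption, and the `predicted` matrix holds made-up numbers rather than real probe output.

```python
def minimum_spanning_tree(dist):
    """Return MST edges (i, j) with i < j for a symmetric distance matrix.

    Uses a simple version of Prim's algorithm, which is fast enough
    for sentence-length inputs.
    """
    n = len(dist)
    if n == 0:
        return []
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        # Pick the cheapest edge that connects a new word to the tree.
        best = None
        for i in in_tree:
            for j in range(n):
                if j not in in_tree and (best is None or dist[i][j] < best[0]):
                    best = (dist[i][j], i, j)
        _, i, j = best
        in_tree.add(j)
        edges.append((min(i, j), max(i, j)))
    return sorted(edges)


# Made-up predicted distances for a 5-word sentence (not real probe output).
predicted = [
    [0.0, 1.1, 2.0, 3.9, 3.1],
    [1.1, 0.0, 0.9, 3.0, 2.1],
    [2.0, 0.9, 0.0, 2.2, 1.0],
    [3.9, 3.0, 2.2, 0.0, 1.2],
    [3.1, 2.1, 1.0, 1.2, 0.0],
]
edges = minimum_spanning_tree(predicted)
print(edges)  # [(0, 1), (1, 2), (2, 4), (3, 4)]
```

Each returned pair is an undirected link between two word positions, so the result can be compared edge by edge with the gold dependency tree.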
