An Introduction to Masking BERT Inputs for Text Classification

In NLP, we usually do not mask any input embeddings when using BERT for a text classification task. However, the paper Spelling Error Correction with Soft-Masked BERT proposes a method for masking the inputs.

Soft-Masked BERT

According to the paper, BERT on its own does not have sufficient capability to detect whether there is an error at each position. The paper therefore proposes a method that uses the [MASK] embedding to represent words that are likely to be errors.

Architecture of Soft-Masked BERT

How to get masked inputs for BERT

Step 1: Use a GRU to encode the input

GRU for masked bert
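
As a rough illustration of Step 1, here is a minimal pure-Python sketch of a detection network: a bidirectional GRU reads the input embeddings, and a sigmoid layer turns each hidden state into an error probability. The names GRUCell and detect_error_probs, the random weights, the toy sizes and the omission of bias terms inside the cell are all assumptions made for this example, not the paper's code. In practice you would use a GRU layer from a deep learning library.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(m, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class GRUCell:
    """Hypothetical minimal GRU cell (one common formulation, biases omitted)."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        # One (W, U) pair per gate: update z, reset r, candidate n.
        self.W = {g: rand_matrix(hidden_size, input_size, rng) for g in "zrn"}
        self.U = {g: rand_matrix(hidden_size, hidden_size, rng) for g in "zrn"}

    def step(self, x, h):
        z = [sigmoid(a + b) for a, b in zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
        r = [sigmoid(a + b) for a, b in zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
        rh = [ri * hi for ri, hi in zip(r, h)]
        n = [math.tanh(a + b) for a, b in zip(matvec(self.W["n"], x), matvec(self.U["n"], rh))]
        # Blend the previous state and the candidate state with the update gate.
        return [(1.0 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]

    def run(self, xs):
        h = [0.0] * self.hidden_size
        states = []
        for x in xs:
            h = self.step(x, h)
            states.append(h)
        return states


def detect_error_probs(embeddings, fwd, bwd, w, b):
    # Read the sequence left-to-right and right-to-left.
    h_fwd = fwd.run(embeddings)
    h_bwd = bwd.run(embeddings[::-1])[::-1]
    probs = []
    for hf, hb in zip(h_fwd, h_bwd):
        h = hf + hb  # concatenate both directions
        probs.append(sigmoid(sum(wi * hi for wi, hi in zip(w, h)) + b))
    return probs


# Toy usage: 5 tokens with 4-dimensional embeddings, hidden size 3.
rng = random.Random(0)
embeddings = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(5)]
fwd, bwd = GRUCell(4, 3, rng), GRUCell(4, 3, rng)
w = [rng.uniform(-0.5, 0.5) for _ in range(6)]
probs = detect_error_probs(embeddings, fwd, bwd, w, 0.0)
```

Each entry of probs is a value between 0 and 1 that can be read as how likely the token at that position is to be an error.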

Step 2: Get the masked input embeddings

get masked input embeddings for bert
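
To make Step 2 concrete, here is a small sketch of the soft-masking idea as the paper describes it: each input embedding e_i is blended with the [MASK] embedding using the error probability p_i, giving p_i * e_mask + (1 - p_i) * e_i. The function name soft_mask and the toy numbers are assumptions for illustration only.

```python
def soft_mask(embeddings, probs, mask_embedding):
    """Blend each token embedding with the [MASK] embedding.

    embeddings:     list of token embeddings (each a list of floats)
    probs:          one error probability per token, in [0, 1]
    mask_embedding: embedding of the [MASK] token
    """
    masked = []
    for e, p in zip(embeddings, probs):
        masked.append([p * m + (1.0 - p) * x for m, x in zip(mask_embedding, e)])
    return masked


# Toy usage with made-up 2-dimensional embeddings.
e_mask = [3.0, 4.0]
tokens = [[1.0, 2.0], [1.0, 2.0], [1.0, 2.0]]
soft = soft_mask(tokens, [0.0, 0.5, 1.0], e_mask)
```

With a probability of 0 the token embedding is passed through unchanged, with 1 it is replaced by the [MASK] embedding, and values in between give a mixture. The result is what gets fed to BERT instead of the original input embeddings.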


This soft-masked method is not always a good one. For example, in aspect-level sentiment analysis, we could also use [MASK] to mark aspect words. However, this is insufficient.

Consider these two sentences:

This price is low.

This quality is low.

Here, price and quality are the aspect terms, so we mask them:

This [MASK] is low.

This [MASK] is low.

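After masking, the two sentences are identical, so the model can no longer tell which aspect is being described. This will give you bad performance, which means we should use more kinds of masked symbols.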
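
The problem can be checked with a few lines of Python. This is only a sketch: mask_aspect is a hypothetical helper, and the symbols [MASK_PRICE] and [MASK_QUALITY] are made-up examples of what "more kinds of masked symbols" could look like, not tokens from any real vocabulary.

```python
def mask_aspect(sentence, aspect, symbol="[MASK]"):
    # Replace the aspect term with a mask symbol (simple whitespace tokenization).
    return " ".join(symbol if tok == aspect else tok for tok in sentence.split())


s1 = mask_aspect("This price is low.", "price")
s2 = mask_aspect("This quality is low.", "quality")
same_with_one_symbol = (s1 == s2)  # both become "This [MASK] is low."

# Hypothetical alternative: one symbol per aspect keeps the inputs distinct.
t1 = mask_aspect("This price is low.", "price", "[MASK_PRICE]")
t2 = mask_aspect("This quality is low.", "quality", "[MASK_QUALITY]")
same_with_two_symbols = (t1 == t2)
```

With a single [MASK] symbol the two inputs collapse into the same string, while separate symbols keep them apart.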
