We usually make a model inference based on a distribution. For example, we will get the prediction based on a softmax distribution.
However, if we have two or more distributions (joint distribution) in a model. How to make an inference?
Making an inference based on joint distribution
PaperĀ PLOME: Pre-training with Misspelled Knowledge for Chinese Spelling Correction proposed a method for us.
For example:
In order to make an inference for \(p_j(y_i = j|X)\), we will compute theĀ joint distribution \(p_c(y_i = j|X)\) and \(p_p(g_i=j^p|X)\)
Then, we can use \( argmax p_j(y_i|X)\) to get the final prediction.
This method can be used in some multi-task classification.