Generalized Interpolation in Decision Tree LM

Denis Filimonov¹ and Mary Harper²
¹University of Maryland; HLTCOE, Johns Hopkins University
²University of Maryland


Abstract

In the face of sparsity, statistical models are often interpolated with lower-order (backoff) models, particularly in language modeling. In this paper, we argue that a relation between the higher-order model and the backoff model must be satisfied for the interpolation to be effective. We show that in n-gram models this relation holds trivially, but in models that allow arbitrary clustering of context (such as decision tree models) it is generally not satisfied. Based on this insight, we propose a generalization of linear interpolation that significantly improves the performance of a decision tree language model.
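
For readers unfamiliar with the baseline, ordinary linear interpolation of a higher-order model with its backoff model can be sketched as below. This is the textbook bigram/unigram form with a single fixed weight, not the generalized interpolation proposed in the paper. The function names, the weight `lam`, and the fallback for unseen contexts are illustrative assumptions.

```python
from collections import Counter


def train_counts(tokens):
    """Collect unigram and bigram counts from a list of tokens."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def interpolated_prob(word, prev, unigrams, bigrams, lam=0.7):
    """P(word | prev) = lam * P_bigram(word | prev) + (1 - lam) * P_unigram(word).

    Both component models are maximum-likelihood estimates. `lam` is a
    hypothetical fixed weight; in practice it would be tuned on held-out data.
    """
    total = sum(unigrams.values())
    p_uni = unigrams[word] / total

    # Number of bigram events whose first word is `prev`.
    context_count = sum(c for (first, _), c in bigrams.items() if first == prev)
    if context_count == 0:
        # Assumption: with an unseen context, rely on the backoff model alone
        # so that the result is still a proper distribution.
        return p_uni

    p_bi = bigrams[(prev, word)] / context_count
    return lam * p_bi + (1 - lam) * p_uni
```

In the n-gram case, the backoff context (here, the empty history) is always a coarsening of the higher-order context (the previous word), which is the setting in which the abstract says the required relation holds trivially.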




Full paper: http://www.aclweb.org/anthology/P/P11/P11-2109.pdf