My Notes on Maximum Entropy

Published: December 25, 2023


Notations

$P$ : model distribution
$\tilde{P}$ : empirical (sample) distribution, i.e. the average of Dirac masses at the sample points
$\{x_i\}$ : sample, $i=1,\dots,N$
$f_j$ : features
$E_P f$ : expectation of $f$ under the distribution $P$

Maximum Entropy

Def. Maximum Entropy (ME)
$\max_{P\in \mathcal{P}} H(X)\\ \text{s.t. } E_P(f)=E_{\tilde{P}}(f) ~~~~~~~~~~~~(\star)$
where $X\sim P$ and $f(x)=(f_j(x))_j$ are the features.

Fact. $P_w(x)\propto e^{\sum_j w_j f_j(x)}$ is the solution to $\inf_P L(P, w)$, where $L$ is the Lagrangian of ($\star$), written for the equivalent problem of minimizing $-H(X)$, and $w$ is the vector of Lagrange multipliers.
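
The stationarity condition behind this fact can be sketched as follows. This is only an outline under assumptions not stated in the note: $X$ is discrete, the problem is written as minimizing $-H(X)$, and $\lambda$ is an extra (hypothetical) multiplier for the normalization constraint $\sum_x P(x)=1$.

```latex
\begin{aligned}
L(P,w) &= \sum_x P(x)\ln P(x)
        + \sum_j w_j\Big(E_{\tilde{P}}(f_j) - \sum_x P(x) f_j(x)\Big)
        + \lambda\Big(\sum_x P(x) - 1\Big), \\
\frac{\partial L}{\partial P(x)} &= \ln P(x) + 1 - \sum_j w_j f_j(x) + \lambda = 0, \\
P_w(x) &= \frac{1}{Z(w)}\, e^{\sum_j w_j f_j(x)}, \qquad
Z(w) = \sum_x e^{\sum_j w_j f_j(x)} .
\end{aligned}
```

The constant $e^{-1-\lambda}$ is absorbed into $1/Z(w)$, which is why the fact only states $P_w$ up to normalization.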

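Lagrange dual function: the (average) log-likelihood of the sample,
$\Psi(w):=L(P_w,w)\\ =-\ln Z(w)+E_{\tilde{P}}(f)\cdot w\\ =\frac{1}{N}\sum_i \ln P_w(x_i)$
where $Z(w)=\sum_x e^{\sum_j w_j f_j(x)}$ is the normalizing constant and $N$ is the sample size.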
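
A small numerical check of this identity may help. The sketch below assumes a finite domain and uses made-up features, sample, and multiplier values (none of them come from the note). It compares $-\ln Z(w)+E_{\tilde{P}}(f)\cdot w$ with the average log-likelihood of the sample.

```python
import math

# Hypothetical toy setup: X takes values in {0, 1, 2, 3}.
domain = [0, 1, 2, 3]

def features(x):
    """Feature vector f(x) = (x, x^2); chosen only for illustration."""
    return [float(x), float(x * x)]

def score(w, x):
    """Inner product sum_j w_j f_j(x)."""
    return sum(wj * fj for wj, fj in zip(w, features(x)))

def log_partition(w):
    """ln Z(w) = ln sum_x exp(w . f(x)) over the finite domain."""
    scores = [score(w, x) for x in domain]
    m = max(scores)  # subtract the max for numerical stability
    return m + math.log(sum(math.exp(s - m) for s in scores))

def log_prob(w, x):
    """ln P_w(x) = w . f(x) - ln Z(w)."""
    return score(w, x) - log_partition(w)

sample = [0, 1, 1, 2, 3, 1, 0, 2]  # made-up sample {x_i}
w = [0.3, -0.2]                    # arbitrary multiplier values
n = len(sample)

# Empirical feature expectations E_{P~}(f_j)
emp = [sum(features(x)[j] for x in sample) / n for j in range(2)]

dual_value = -log_partition(w) + sum(wj * ej for wj, ej in zip(w, emp))
avg_loglik = sum(log_prob(w, x) for x in sample) / n
print(dual_value, avg_loglik)  # the two numbers should agree up to rounding
```

The agreement holds for any $w$, not only at the optimum, because $\ln P_w(x)$ is linear in the features.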

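Dual problem: maximum likelihood estimation (MLE)
$\max_w \Psi(w) ~~~~~~~~~~~~(\star\star)$
where $\Psi(w)$ is, up to the constant factor $1/N$ (with $N$ the sample size), the log-likelihood $\sum_{i} \ln P_w(x_i)$.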

Fact. The dual of ME ($\star$) is MLE ($\star\star$).
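
To illustrate the duality, the following sketch maximizes the likelihood by plain gradient ascent and then checks that the ME constraint $E_P(f)=E_{\tilde{P}}(f)$ holds at the optimum. The die-like domain, the single feature $f(x)=x$, the sample, the step size, and the iteration count are all illustrative assumptions. The gradient used is $\nabla\Psi(w)=E_{\tilde{P}}(f)-E_{P_w}(f)$, which follows from differentiating $-\ln Z(w)+E_{\tilde{P}}(f)\cdot w$.

```python
import math

domain = [1, 2, 3, 4, 5, 6]        # faces of a die (toy example)
sample = [6, 5, 4, 3, 6, 3, 5, 4]  # made-up sample with mean 4.5

def f(x):
    """Single illustrative feature f(x) = x."""
    return float(x)

def model_probs(w):
    """P_w(x) proportional to exp(w * f(x)) on the finite domain."""
    scores = [w * f(x) for x in domain]
    m = max(scores)  # subtract the max for numerical stability
    unnorm = [math.exp(s - m) for s in scores]
    z = sum(unnorm)
    return [u / z for u in unnorm]

def model_expectation(w):
    """E_{P_w}(f)."""
    return sum(p * f(x) for p, x in zip(model_probs(w), domain))

emp_mean = sum(f(x) for x in sample) / len(sample)  # E_{P~}(f)

w = 0.0
for _ in range(2000):
    # Gradient of the average log-likelihood: E_{P~}(f) - E_{P_w}(f)
    w += 0.1 * (emp_mean - model_expectation(w))

model_mean = model_expectation(w)
print(w, emp_mean, model_mean)
```

At the maximizer the gradient vanishes, which is exactly the moment-matching constraint of ($\star$).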

ME for Machine learning/conditional likelihood

Assume that $P(Y \mid X)$ is a discriminative model.

Def. Maximum Entropy for $P(Y \mid X)$
$\max_{P\in \mathcal{P}} H(Y|X)\\ \text{s.t. } E_P(f)=E_{\tilde{P}}(f)$
where $f(x, y)$ are features, and the joint model is $P(x,y)=\tilde{P}(x)P(y|x)$.

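Fact. $P_w(y|x)\propto e^{\sum_j w_j f_j(x,y)}$ (normalized over $y$ for each $x$) is the solution to $\inf_P L(P, w)$, with the same sign convention as in the unconditional case.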

Lagrange dual function: $\Psi(w):=L(P_w,w)=\frac{1}{N}\sum_i \ln P_w(y_i|x_i)$, the (average) conditional log-likelihood, where $N$ is the sample size.

Dual problem (conditional MLE):
$\max_w \Psi(w)$
where $\Psi(w)$ is, up to the constant factor $1/N$, the conditional log-likelihood $\sum_{i} \ln P_w(y_i|x_i)$.
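
The conditional case can be sketched the same way. The example below assumes binary labels, a made-up data set that is not linearly separable, and two hypothetical features $f(x,y)=(x\cdot 1[y=1],\ 1[y=1])$, which makes the model an ordinary logistic regression. It runs gradient ascent on the average conditional log-likelihood and then compares $E_{\tilde{P}}(f)$ with $E_P(f)$ under $P(x,y)=\tilde{P}(x)P_w(y|x)$.

```python
import math

labels = [0, 1]
# Made-up (x_i, y_i) pairs; the classes overlap, so the MLE is finite.
data = [(0.0, 0), (1.0, 0), (2.0, 1), (1.5, 0),
        (0.5, 1), (3.0, 1), (2.5, 0), (1.0, 1)]

def features(x, y):
    """Illustrative features f(x, y) = (x * 1[y=1], 1[y=1])."""
    return [x, 1.0] if y == 1 else [0.0, 0.0]

def cond_probs(w, x):
    """P_w(y|x) proportional to exp(w . f(x, y)), normalized over y."""
    scores = [sum(wj * fj for wj, fj in zip(w, features(x, y))) for y in labels]
    m = max(scores)  # subtract the max for numerical stability
    unnorm = [math.exp(s - m) for s in scores]
    z = sum(unnorm)
    return [u / z for u in unnorm]

def expectations(w):
    """Return (E_{P~}(f), E_P(f)) with P(x, y) = P~(x) P_w(y|x)."""
    n = len(data)
    emp, mod = [0.0, 0.0], [0.0, 0.0]
    for x, y in data:
        probs = cond_probs(w, x)
        for j in range(2):
            emp[j] += features(x, y)[j] / n
            mod[j] += sum(p * features(x, yy)[j]
                          for p, yy in zip(probs, labels)) / n
    return emp, mod

w = [0.0, 0.0]
for _ in range(5000):
    emp, mod = expectations(w)
    # Gradient of the average conditional log-likelihood
    w = [wj + 0.5 * (e - m) for wj, e, m in zip(w, emp, mod)]

emp, mod = expectations(w)
print(w, emp, mod)
```

As in the unconditional case, the fitted weights make the model's feature expectations match the empirical ones.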

Exercise
Work out ME for the generative model $P(X, Y)$.

Source: https://blog.csdn.net/nbu2004/article/details/135209717