[Deep Learning] Sequence Generation Models (5): Worked Example of Evaluation Methods: Computing the BLEU-N Score [From Theory to Code]

Published: December 20, 2023

Contents

I. BLEU-N Score (Bilingual Evaluation Understudy)

Given the generated sequence "The cat sat on the mat" and the two reference sequences "The cat is on the mat" and "The bird sat on the bush", compute the BLEU-N and ROUGE-N scores for N=1 and N=2.

Generated sequence $\mathbf{x}=\text{the cat sat on the mat}$
Reference sequences
- $\mathbf{s}^{(1)}=\text{the cat is on the mat}$
- $\mathbf{s}^{(2)}=\text{the bird sat on the bush}$

I. BLEU-N Score (Bilingual Evaluation Understudy)


1. Definition

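Let $\mathbf{x}$ be the candidate sequence generated by the model, $\mathbf{s}^{(1)}, \cdots, \mathbf{s}^{(K)}$ a set of reference sequences, and $\mathcal{W}$ the set of all N-grams extracted from the generated candidate sequence. The BLEU precision is defined as: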

$P_N(\mathbf{x}) = \frac{\sum_{w \in \mathcal{W}} \min(c_w(\mathbf{x}), \max_{k=1}^{K} c_w(\mathbf{s}^{(k)}))}{\sum_{w \in \mathcal{W}} c_w(\mathbf{x})}$

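where $c_w(\mathbf{x})$ is the number of times the N-gram $w$ occurs in the generated sequence $\mathbf{x}$, and $c_w(\mathbf{s}^{(k)})$ is the number of times it occurs in the reference sequence $\mathbf{s}^{(k)}$. Taking the minimum clips each candidate count at the largest count found in any single reference, so repeating a matching N-gram does not raise the precision.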
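To see why the numerator clips each count, the following minimal sketch compares clipped and unclipped unigram precision on a deliberately degenerate candidate. The candidate `the the the the` and the helper name `unigram_precision` are illustrative assumptions and are not part of the worked example; the reference is $\mathbf{s}^{(1)}$ from the problem statement.

```python
from collections import Counter


def unigram_precision(candidate, references, clip=True):
    """Unigram precision of a candidate against references (hypothetical helper).

    With clip=True each candidate count is capped by the largest count of that
    word in any single reference, as in the definition of P_N.
    """
    cand_counts = Counter(candidate.split())
    ref_counts = [Counter(ref.split()) for ref in references]
    matched = 0
    for word, count in cand_counts.items():
        max_ref = max(rc[word] for rc in ref_counts)
        if clip:
            matched += min(count, max_ref)
        elif max_ref > 0:
            matched += count  # unclipped: every occurrence counts as a match
    return matched / sum(cand_counts.values())


references = ["the cat is on the mat"]
clipped = unigram_precision("the the the the", references, clip=True)
unclipped = unigram_precision("the the the the", references, clip=False)
print(clipped, unclipped)  # 0.5 1.0
```

Without clipping, a candidate made only of a frequent reference word would score a perfect precision; with clipping, only the two occurrences that the reference can account for are credited.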

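Precision alone favors short outputs, so a length penalty factor $b(\mathbf{x})$ is introduced to handle generated sequences that are shorter than the references: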

$b(\mathbf{x}) = \begin{cases} 1 & \text{if } l_x > l_s \\ \exp\left(1 - \frac{l_s}{l_x}\right) & \text{if } l_x \leq l_s \end{cases}$

where $l_x$ is the length of the generated sequence and $l_s$ is the length of the shortest reference sequence.
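The piecewise definition translates directly into a small function. This is a minimal sketch; the name `brevity_penalty` and the sample lengths are assumptions chosen for illustration.

```python
import math


def brevity_penalty(l_x, l_s):
    """Length penalty b(x): 1 if l_x > l_s, else exp(1 - l_s / l_x)."""
    if l_x <= 0:
        raise ValueError("candidate length must be positive")
    if l_x > l_s:
        return 1.0
    return math.exp(1.0 - l_s / l_x)


print(brevity_penalty(6, 6))  # equal lengths: exp(0) = 1.0
print(brevity_penalty(8, 6))  # longer candidate: no penalty, 1.0
print(brevity_penalty(3, 6))  # half-length candidate: exp(-1), about 0.368
```

The penalty only ever lowers the score: it equals 1 whenever the candidate is at least as long as the shortest reference and decays exponentially as the candidate gets shorter.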

BLEU computes the precision for N-grams of different lengths and combines them with a weighted geometric mean to obtain the final BLEU score:

$\text{BLEU-N}(\mathbf{x}) = b(\mathbf{x}) \times \exp\left( \sum_{N=1}^{N'} \alpha_N \log P_N(\mathbf{x})\right)$

where $N'$ is the length of the longest N-gram and $\alpha_N$ is the weight of each N-gram order, usually set to $1/N'$.
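The weighted geometric mean can be sketched as follows, taking the per-order precisions as given. The function name `combine_precisions`, the sample precision values, and the choice to return 0 when any precision is zero (where $\log 0$ is undefined) are assumptions of this sketch, not something the definition above specifies.

```python
import math


def combine_precisions(precisions, penalty=1.0, weights=None):
    """b(x) * exp(sum_N alpha_N * log P_N); weights default to 1/N' each."""
    n_max = len(precisions)
    if weights is None:
        weights = [1.0 / n_max] * n_max
    if any(p == 0 for p in precisions):
        # log(0) is undefined; returning 0 is a convention assumed for this sketch.
        return 0.0
    log_sum = sum(w * math.log(p) for w, p in zip(weights, precisions))
    return penalty * math.exp(log_sum)


print(combine_precisions([0.5, 0.5]))       # geometric mean of equal values, about 0.5
print(combine_precisions([0.9, 0.4]))       # sqrt(0.9 * 0.4), about 0.6
print(combine_precisions([0.9, 0.4], 0.5))  # same mean scaled by a penalty of 0.5
```

With uniform weights the exponent is the average of the log-precisions, so the result is the ordinary geometric mean of $P_1, \dots, P_{N'}$ multiplied by the length penalty.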

2. Calculation

N=1

Generated sequence $\mathbf{x}=\text{the cat sat on the mat}$
Reference sequences
- $\mathbf{s}^{(1)}=\text{the cat is on the mat}$
- $\mathbf{s}^{(2)}=\text{the bird sat on the bush}$
$\mathcal{W}=\{\text{the}, \text{cat}, \text{sat}, \text{on}, \text{mat}\}$
- $w=\text{the}$
  - $c_w(\mathbf{x})=2, c_w(\mathbf{s}^{(1)})=2, c_w(\mathbf{s}^{(2)})=2$
  - $\max_{k=1}^{K} c_w(\mathbf{s}^{(k)})=2$
  - $\min(c_w(\mathbf{x}), \max_{k=1}^{K} c_w(\mathbf{s}^{(k)}))=2$
- $w=\text{cat}$
  - $c_w(\mathbf{x})=1, c_w(\mathbf{s}^{(1)})=1, c_w(\mathbf{s}^{(2)})=0$
  - $\max_{k=1}^{K} c_w(\mathbf{s}^{(k)})=1$
  - $\min(c_w(\mathbf{x}), \max_{k=1}^{K} c_w(\mathbf{s}^{(k)}))=1$
- $w=\text{sat}$
  - $c_w(\mathbf{x})=1, c_w(\mathbf{s}^{(1)})=0, c_w(\mathbf{s}^{(2)})=1$
  - $\max_{k=1}^{K} c_w(\mathbf{s}^{(k)})=1$
  - $\min(c_w(\mathbf{x}), \max_{k=1}^{K} c_w(\mathbf{s}^{(k)}))=1$
- $w=\text{on}$
  - $c_w(\mathbf{x})=1, c_w(\mathbf{s}^{(1)})=1, c_w(\mathbf{s}^{(2)})=1$
  - $\max_{k=1}^{K} c_w(\mathbf{s}^{(k)})=1$
  - $\min(c_w(\mathbf{x}), \max_{k=1}^{K} c_w(\mathbf{s}^{(k)}))=1$
- $w=\text{mat}$
  - $c_w(\mathbf{x})=1, c_w(\mathbf{s}^{(1)})=1, c_w(\mathbf{s}^{(2)})=0$
  - $\max_{k=1}^{K} c_w(\mathbf{s}^{(k)})=1$
  - $\min(c_w(\mathbf{x}), \max_{k=1}^{K} c_w(\mathbf{s}^{(k)}))=1$
$\sum_{w \in \mathcal{W}} \min(c_w(\mathbf{x}), \max_{k=1}^{K} c_w(\mathbf{s}^{(k)}))=2+1+1+1+1=6$
$\sum_{w \in \mathcal{W}} c_w(\mathbf{x})=2+1+1+1+1=6$
$P_1(\mathbf{x}) = \frac{\sum_{w \in \mathcal{W}} \min(c_w(\mathbf{x}), \max_{k=1}^{K} c_w(\mathbf{s}^{(k)}))}{\sum_{w \in \mathcal{W}} c_w(\mathbf{x})}= \frac{6}{6}=1$
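The per-word counts for N=1 can be checked mechanically. The sketch below is only a verification aid under the assumption of whitespace tokenization; the variable names are illustrative.

```python
from collections import Counter

x = "the cat sat on the mat".split()
refs = ["the cat is on the mat".split(), "the bird sat on the bush".split()]

x_counts = Counter(x)
ref_counts = [Counter(r) for r in refs]

# One row per distinct unigram: (w, c_w(x), max_k c_w(s^(k)), clipped count)
rows = []
for w, c_x in x_counts.items():
    max_ref = max(rc[w] for rc in ref_counts)
    rows.append((w, c_x, max_ref, min(c_x, max_ref)))
    print(w, c_x, max_ref, min(c_x, max_ref))

numerator = sum(row[3] for row in rows)
denominator = sum(row[1] for row in rows)
print(numerator, denominator, numerator / denominator)  # 6 6 1.0
```

The word "the" contributes 2 to both sums and each of the other four words contributes 1, which gives $6/6$.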

N=2

Generated sequence $\mathbf{x}=\text{the cat sat on the mat}$
Reference sequences
- $\mathbf{s}^{(1)}=\text{the cat is on the mat}$
- $\mathbf{s}^{(2)}=\text{the bird sat on the bush}$
$\mathcal{W}=\{\text{the cat}, \text{cat sat}, \text{sat on}, \text{on the}, \text{the mat}\}$

| $w$ | $c_w(\mathbf{x})$ | $c_w(\mathbf{s}^{(1)})$ | $c_w(\mathbf{s}^{(2)})$ | $\max_{k=1}^{K} c_w(\mathbf{s}^{(k)})$ | $\min(c_w(\mathbf{x}), \max_{k=1}^{K} c_w(\mathbf{s}^{(k)}))$ |
|---|---|---|---|---|---|
| the cat | 1 | 1 | 0 | 1 | 1 |
| cat sat | 1 | 0 | 0 | 0 | 0 |
| sat on | 1 | 0 | 1 | 1 | 1 |
| on the | 1 | 1 | 1 | 1 | 1 |
| the mat | 1 | 1 | 0 | 1 | 1 |

$\sum_{w \in \mathcal{W}} \min(c_w(\mathbf{x}), \max_{k=1}^{K} c_w(\mathbf{s}^{(k)}))=1+0+1+1+1=4$
$\sum_{w \in \mathcal{W}} c_w(\mathbf{x})=1+1+1+1+1=5$
$P_2(\mathbf{x}) = \frac{\sum_{w \in \mathcal{W}} \min(c_w(\mathbf{x}), \max_{k=1}^{K} c_w(\mathbf{s}^{(k)}))}{\sum_{w \in \mathcal{W}} c_w(\mathbf{x})}= \frac{4}{5}$

BLEU-N score

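Recall the length penalty factor for generated sequences that are shorter than the references: $b(\mathbf{x}) = \begin{cases} 1 & \text{if } l_x > l_s \\ \exp\left(1 - \frac{l_s}{l_x}\right) & \text{if } l_x \leq l_s \end{cases}$ where $l_x$ is the length of the generated sequence and $l_s$ is the length of the shortest reference sequence.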

Here $l_x=l_{s^{(1)}}=l_{s^{(2)}}=6$, so $l_x \leq l_s$ and $b(\mathbf{x}) = \exp\left(1 - \frac{l_s}{l_x}\right)=e^0=1$.

BLEU computes the precision for N-grams of different lengths and combines them with a weighted geometric mean to obtain the final BLEU score:
$\text{BLEU-N}(\mathbf{x}) = b(\mathbf{x}) \times \exp\left(\sum_{N=1}^{N'} \alpha_N \log P_N(\mathbf{x})\right)$ where $N'$ is the length of the longest N-gram and $\alpha_N$ is the weight of each N-gram order, usually set to $1/N'$. With $N'=2$ and $\alpha_N=\frac{1}{2}$:
$\begin{aligned} \text{BLEU-N}(\mathbf{x}) &= 1 \times\exp\left( \sum_{N=1}^{2} \frac{1}{2} \log P_N(\mathbf{x})\right)\\ &=\exp\left(\frac{1}{2}\log P_1(\mathbf{x})+\frac{1}{2}\log P_2(\mathbf{x})\right)\\ &=\exp\left(\frac{1}{2}\log 1+\frac{1}{2}\log \frac{4}{5}\right)\\ &=\exp\left(0+\log \sqrt{\frac{4}{5}}\right)\\ &=\sqrt{\frac{4}{5}} \approx 0.894 \end{aligned}$

3. Program

from collections import Counter

main_string = 'the cat sat on the mat'
string1 = 'the cat is on the mat'
string2 = 'the bird sat on the bush'


def ngram_counts(sentence, n):
    """Count the n-grams of a whitespace-tokenized sentence."""
    tokens = sentence.split()
    # Counting whole tokens avoids substring matches such as 'at' inside 'cat'.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_precision(n):
    """Clipped n-gram precision P_N of main_string against string1 and string2."""
    main_counts = ngram_counts(main_string, n)
    ref_counts = [ngram_counts(string1, n), ngram_counts(string2, n)]
    total_occurrences = sum(main_counts.values())
    matching_occurrences = sum(
        min(count, max(ref[gram] for ref in ref_counts))
        for gram, count in main_counts.items()
    )
    return matching_occurrences / total_occurrences


# Unigrams (N=1)
similarity_word = clipped_precision(1)
print(f"N=1: {similarity_word}")

# Bigrams (N=2)
similarity_bigram = clipped_precision(2)
print(f"N=2: {similarity_bigram}")

Output:

N=1: 1.0
N=2: 0.8
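The two precisions can be combined with the length penalty to reproduce the final score $\sqrt{4/5}$. The following is a minimal end-to-end sketch under a few assumptions: whitespace tokenization, uniform weights $1/N'$, the shortest reference length as $l_s$, and a score of 0 whenever a precision is zero. The function names are hypothetical.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Counter of the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def precision(candidate, references, n):
    """Clipped n-gram precision P_N."""
    cand = ngram_counts(candidate, n)
    refs = [ngram_counts(r, n) for r in references]
    matched = sum(min(c, max(rc[g] for rc in refs)) for g, c in cand.items())
    total = sum(cand.values())
    return matched / total if total else 0.0


def bleu_n(candidate, references, max_n):
    """BLEU-N with uniform weights 1/N' and the shortest reference length l_s."""
    l_x = len(candidate)
    if l_x == 0:
        return 0.0
    l_s = min(len(r) for r in references)
    penalty = 1.0 if l_x > l_s else math.exp(1.0 - l_s / l_x)
    precisions = [precision(candidate, references, n) for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0  # assumed convention: log(0) is undefined
    return penalty * math.exp(sum(math.log(p) / max_n for p in precisions))


x = "the cat sat on the mat".split()
refs = ["the cat is on the mat".split(), "the bird sat on the bush".split()]
score = bleu_n(x, refs, 2)
print(round(score, 4))  # 0.8944
```

Library implementations may differ in details such as which reference length enters the penalty and how zero precisions are smoothed, so their scores need not match this sketch on other inputs.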

Source: https://blog.csdn.net/m0_63834988/article/details/135107231