Evaluate translation or summarization with BLEU similarity score
The BiLingual Evaluation Understudy (BLEU) scoring algorithm evaluates the similarity between a candidate document and a collection of reference documents. Use the BLEU score to evaluate the quality of document translation and summarization models.
score = bleuEvaluationScore(candidate,references) returns the BLEU similarity score between the specified candidate document and the reference documents. The function computes n-gram overlaps between candidate and references for n-gram lengths one through four, with equal weighting. For more information, see BLEU Score.
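For example, a default-weighting call can be sketched as follows. This is an illustrative usage sketch — the document strings are made up, and it assumes the Text Analytics Toolbox functions tokenizedDocument and bleuEvaluationScore are available:

```matlab
% Tokenize the candidate document and the reference documents.
candidate = tokenizedDocument("the fast brown fox jumped over the lazy dog");
references = tokenizedDocument([
    "the quick brown fox jumped over the lazy dog"
    "a quick brown fox leaped over a lazy dog"]);

% Evaluate using the default n-gram lengths one through four,
% with each length weighted equally.
score = bleuEvaluationScore(candidate,references)
```

Scores closer to 1 indicate greater n-gram overlap between the candidate and the reference documents.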
score = bleuEvaluationScore(candidate,references,'NgramWeights',ngramWeights) uses the specified n-gram weighting, where ngramWeights(i) corresponds to the weight for n-grams of length i. The length of the weight vector determines the range of n-gram lengths to use for the BLEU score evaluation.
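For example, to evaluate using only unigram and bigram overlaps, you might pass a two-element weight vector. This sketch uses illustrative documents; the equal weights shown are a choice, not a requirement:

```matlab
% Tokenize the candidate document and the reference documents.
candidate = tokenizedDocument("the fast brown fox jumped over the lazy dog");
references = tokenizedDocument([
    "the quick brown fox jumped over the lazy dog"
    "a quick brown fox leaped over a lazy dog"]);

% The two-element vector restricts the evaluation to n-gram
% lengths one and two, each weighted equally.
score = bleuEvaluationScore(candidate,references,'NgramWeights',[0.5 0.5])
```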
[1] Papineni, Kishore, Salim Roukos, Todd Ward, and Wei-Jing Zhu. "BLEU: A Method for Automatic Evaluation of Machine Translation." In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, 311–318. Association for Computational Linguistics, 2002.
bm25Similarity | cosineSimilarity | extractSummary | lexrankScores | mmrScores | rougeEvaluationScore | textrankScores | tokenizedDocument