Evaluating natural language processing systems

King, Margaret. 1996. “Evaluating Natural Language Processing Systems.” Commun. ACM 39 (1): 73–79. https://doi.org/10.1145/234173.234208.

Notes

ISO 9126 evaluation metrics

adequacy
- can the system meet a set of stated or implied needs
progress
- has progress been made towards some desired goal state of the system (typically carried out by the system developer)
diagnostic
- discover not only whether the system fails but why it does so

In-text annotations

"The ISO 9126 evaluation model sees evaluation in terms of judging a system’s adequacy: whether it can meet a set of stated or implied needs. We might also distinguish progress evaluation, typically carried out by a system developer in order to determine whether progress has been made towards some desired goal state of the system, and diagnostic evaluation, designed not only to discover whether the system fails but why it does so." (Page 74)

"comprehensibility and quality of translation" (Page 75)