Linguistic Perspective on Writing Quality

Lately, I have been researching the linguistic perspective on writing quality: which frameworks and theories others have used, what their methodologies are, how they define “quality”, and so on. I found quite a lot of articles, but after reading through them, I realized that they are all pure Computational Linguistics, both in their theoretical frameworks and in their methodologies. Most of the recent ones try to solve the problem of having a computer determine whether the output of a natural language generation (NLG) system is “good” or not. (For example: Are automated summaries coherent?)

Almost all of the articles I found equate quality with coherence/cohesion. They sometimes give a passing nod to Halliday and Hasan (1976), but not much more than that. Instead, they focus on theories from the Computational Linguistics literature, such as Centering Theory (Grosz et al., 1983), the “theory of attention, intention, and aggregation of utterances” (Grosz and Sidner, 1986), or Rhetorical Structure Theory (Mann and Thompson, 1988). Others build on work in cognitive psychology, such as Givón's (1993) “Coherence in text, coherence in mind”.

The studies I have been reading all rely heavily on formulas and Hidden Markov Models: they try to find a model of language that fits the data and correlates with some human judgement of quality. I am not sure how far I will go down that path, but of all these approaches, Rhetorical Structure Theory looks the most interesting and might be applicable to my research as an analysis tool. It is definitely the most popular framework in the articles I have seen.

Unfortunately, my research purpose and rationale are not as focused as I would like them to be at this point. I was hoping to narrow them down sooner rather than later. But maybe I should just gather my data, pick a topic (or at least a linguistic level), dive in, and see what happens.


A Sample of Key (Highly Cited) Computational Linguistics Journal Articles about Cohesion/Coherence and/or Writing/Text Quality

