eRST
The website for Enhanced Rhetorical Structure Theory
Rhetorical Structure Theory
RST is a pragmatic theory of textual organization which aims to explain the structure of documents by imposing a labeled tree structure on natural language texts. In RST, each document consists of a sequence of Elementary Discourse Units (EDUs), which roughly correspond to the propositions (sentences/clauses) in the text. For example, the following fragment from a vlog contains six EDUs, each corresponding to a clause:
[We did n't do a whole lot of hiking here] [because there was not a whole lot of cloud coverage that day ,] [and it was so hot .] [So what we did was we drove around the entire perimeter of the lake] [and you get all these incredible views of the mountains and the lake ,] [and seeing it from all the different angles was so worth it .]
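As a minimal sketch of how such a bracketed fragment can be split into numbered EDUs, the Python snippet below extracts every square-bracketed unit in order. The function name `split_edus` is hypothetical and the bracket notation is simply the one used on this page; real RST corpora store segmentation in their own file formats.

```python
import re

def split_edus(text):
    """Return the contents of every [bracketed] unit, in document order."""
    return [unit.strip() for unit in re.findall(r"\[([^\[\]]*)\]", text)]

# The vlog fragment from above, copied verbatim.
fragment = (
    "[We did n't do a whole lot of hiking here] "
    "[because there was not a whole lot of cloud coverage that day ,] "
    "[and it was so hot .] "
    "[So what we did was we drove around the entire perimeter of the lake] "
    "[and you get all these incredible views of the mountains and the lake ,] "
    "[and seeing it from all the different angles was so worth it .]"
)

edus = split_edus(fragment)
for number, edu in enumerate(edus, start=1):
    print(number, edu)
```

Numbering the units from 1 makes it easy to refer to individual EDUs when describing relations between them.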
EDUs in an RST analysis are connected to each other via labeled relations which indicate the rhetorical purpose of the EDU with respect to another unit, using a small inventory of pragmatically motivated relations, such as Cause (unit A is the cause of unit B) or List (A and B form a list of coordinated propositions). Groups of EDUs which are formed in this way will be connected to other units, until a tree spanning the entire document is formed.
If the two units being connected are of equal prominence, then the relation will be multinuclear, such as the List in the example below. Otherwise, if one unit is more prominent and the other could more easily be omitted, then the less prominent unit will be called a satellite, and its labeled relation will point to the more prominent unit, called the nucleus.
In this example, the speaker first concedes that they did not do much hiking (Concession), additionally specifying a Cause using a List of two EDUs. The causes are less important or prominent than the result (not hiking much), so they form a satellite. The concession itself is also not the main point: it serves as a satellite to the nucleus specifying what the speaker actually did, which is drive around a lake and see some views (the nucleus, itself a List). The second item in the nucleus list also has an Evaluation satellite, giving a positive evaluation that seeing the views was worth it.
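The analysis just described can be sketched as a small tree in Python. This is an illustrative encoding only: the `Node` class, the helper names and the EDU numbering (1 to 6, in text order) are assumptions rather than an official eRST format. The function `central_edus` follows nucleus links only, a common way of finding the most prominent units of a span.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    edu: Optional[int] = None        # EDU number for leaves, None for spans
    role: str = "nucleus"            # "nucleus" or "satellite"
    relation: Optional[str] = None   # label of the relation this node takes part in
    children: List["Node"] = field(default_factory=list)

def leaf(number, role="nucleus", relation=None):
    return Node(edu=number, role=role, relation=relation)

def span(children, role="nucleus", relation=None):
    return Node(role=role, relation=relation, children=children)

def central_edus(node):
    """Collect the EDUs reached by following nucleus children only."""
    if node.edu is not None:
        return [node.edu]
    result = []
    for child in node.children:
        if child.role == "nucleus":
            result.extend(central_edus(child))
    return result

# EDUs 2 and 3 form a List that is a Cause satellite of EDU 1.
causes = span([leaf(2, relation="list"), leaf(3, relation="list")],
              role="satellite", relation="cause")
# EDUs 1-3 together are a Concession satellite of EDUs 4-6.
no_hiking = span([leaf(1), causes], role="satellite", relation="concession")
# EDU 6 is an Evaluation satellite of EDU 5; EDUs 4 and 5-6 form a List.
views = span([leaf(5), leaf(6, role="satellite", relation="evaluation")],
             relation="list")
drove = span([leaf(4, relation="list"), views])
root = span([no_hiking, drove])

print(central_edus(root))       # most prominent EDUs of the whole fragment
print(central_edus(no_hiking))  # most prominent EDU of the concession span
```

Skipping satellites at every level leaves the units that carry the main point, which here are the two items of the nucleus List.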
What relations?
A range of relation inventories have been used for RST; here we'll be using the inventory from the largest English RST corpus, GUM. Relations are often classified as either presentational, meaning the Writer or speaker (W) intends to influence or persuade the Reader or hearer (R), or subject matter, meaning they represent more objective semantic relations. We also distinguish satellite-nucleus relations, in which one proposition is more prominent (e.g. Concession), from multinuclear relations, in which participating propositions are equally prominent (e.g. symmetric Contrast).
RST relation labels in the GUM corpus - W is the writer or speaker and R is the reader or hearer
Subject matter | (informational) | |
attribution (pos/neg) | S is (or is not) source of information in N | [we need to go now!] <- [Kim said] |
cause | S causes N | [I'm full] <- [because I already ate] |
condition | S is a condition for N to happen | [If you rent a car] -> [you can drive there] |
elab.-additional | S provides more info about N | [The hall was full] <- [100 guests were there] |
elab.-attribute | S provides more info about phrase in N | [The hall] <- [built in 2010] [was…] |
evaluation | S gives opinion about N (R needn't agree) | [Madonna has a new song] <- [I like it] |
manner | S gives manner: how N happened | [It was shipped] <- [according to EU norms] |
means | S indicates means by which N happened | [I opened the door] <- [by kicking it with my boot] |
purpose-goal | N occurs in order for S to happen | [I bought it] <- [so that I have a present too] |
purp.-attribute | S provides purpose of phrase in N | [a plan] <- [to win] |
restatement-partial | S reiterates part of N (else use multinuc) | [It's big and heavy.] <- [Really huge.] |
result | S is result of N (inverse of cause) | [a bomb destroyed the house] <- [12 were injured] |
solutionhood | N is answer to a problem in S | [it's broken] -> [so use a spare] |
Presentational | (influence R) | |
antithesis | R finds N more credible than S | [They're unemployed,] <- [they're not lazy!] |
background | R needs to know S to understand N | [He was offended] <- [in his culture that's an insult] |
circumstance | S gives circumstances (time, place) of N | [He got rich] <- [after the recession happened] |
concession | W admits S, but still claims N | [It's perfect] <- [even though it's scratched] |
evidence | S gives evidence that N is true | [Madonna's song is great] <- [it's in the top 10] |
justify | Justifies why W can say this | [Madonna's song is great] <- [the music is amazing] |
motivation | Motivates R to do something | [Madonna's song is great] -> [you should buy it] |
org.-preparation | S prepares R for N | [I'll tell you why:] -> [It hasn't changed since 1990] |
org.-heading | S is graphically arranged to prepare for N | [Introduction] -> [No code is unbreakable. ] |
org.-phatic | S holds the floor for N, no semantic value | [Um, I mean,] -> [so did they buy one?] |
question | S requests the information in N | [Why did you do it?] -> [I needed the money!] |
Multinuclear relations | (symmetric, all subject matter) | |
contrast | W presents similar units with contrast | [It makes things cheaper] [but it’s harder to do] |
disjunction | W presents a set of alternatives | [You can go by air] [or you can go by sea] |
list | W presents coordinate, like units | [Last year all summer I read books] [and surfed] |
sequence | W presents chronological sequence | [Jack joined in 1990.] [Then I joined in 1991.] |
other | W presents unlike units with no other rel. | [The cliffs are worth seeing.] [The beach is a sight too.] |
repetition | W presents equivalent/redundant units | [It's unbeatable.] [You just can't beat it.] |
Non-relations | | |
same-unit | (Technical device for interrupted EDUs) | [Kim,] who …, [was also there] |
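For readers who want to work with this inventory programmatically, the sketch below stores the labels exactly as they appear in the table (including abbreviated forms such as `elab.-additional`) and looks up the class and nuclearity of a relation. The names `SATELLITE_NUCLEUS`, `MULTINUCLEAR` and `describe` are hypothetical, and `same-unit` is left out because the table lists it as a non-relation.

```python
# Satellite-nucleus relations, grouped by the two classes in the table.
SATELLITE_NUCLEUS = {
    "subject matter": [
        "attribution", "cause", "condition", "elab.-additional",
        "elab.-attribute", "evaluation", "manner", "means", "purpose-goal",
        "purp.-attribute", "restatement-partial", "result", "solutionhood",
    ],
    "presentational": [
        "antithesis", "background", "circumstance", "concession", "evidence",
        "justify", "motivation", "org.-preparation", "org.-heading",
        "org.-phatic", "question",
    ],
}

# Multinuclear relations; the table describes them all as subject matter.
MULTINUCLEAR = ["contrast", "disjunction", "list", "sequence", "other", "repetition"]

def describe(label):
    """Return (class, nuclearity) for a relation label from the table."""
    if label in MULTINUCLEAR:
        return ("subject matter", "multinuclear")
    for relation_class, labels in SATELLITE_NUCLEUS.items():
        if label in labels:
            return (relation_class, "satellite-nucleus")
    raise KeyError(f"not a relation label in the table: {label}")

print(describe("concession"))
print(describe("list"))
```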
Enhancements (or: why eRST?)
Although RST gives great insights into the intentional pragmatic structure of a text, it has at least two shortcomings:
- The tree constraint means that some relations cannot be expressed (between non-adjacent units, or multiple relations between two units)
- There is no indication of how we know that a relation applies, or what the components of the relation are.
For example, it seems clear that the word because is an important signal (called a discourse marker or DM) for the Cause relation, and for the Evaluation we may want to know that the positive assessment relates to how there were different views, or that the activity was worth a lot to the speaker. Additionally, we can notice some other discourse markers with no corresponding relations, such as so, which indicates that the driving was also a Result of not hiking, or the and in the last EDU, which indicates that the evaluation is part of the List nucleus to which the first sentence is a Concession, and which is also part of the result of the first sentence.
eRST represents these observations using two enhancements:
- Secondary edges (marked in blue arrows in the graph below), which are superimposed on the primary RST tree;
- and signals, which are categorized spans of words specifying what kinds of devices can be used to identify the relations in the graph.
In this analysis, two additional relations are added in blue: Result, indicated by the So highlighted in blue, and List, indicated by the and highlighted in blue. Regular, tree-conforming RST relations can also be marked by discourse markers, such as the red-highlighted because marking Cause and the two red instances of and marking List relations.
We can also see some non-discourse marker signals: the words different and worth form lexical signals (highlighted in yellow) which are indicative of the Evaluation relation. In total eRST currently recognizes over 40 different types of signals, outlined in the following table (see the Guidelines for more details).
Signal type | Subtypes | Example |
dm | discourse markers | [because they wanted to]<causal-cause> |
orphan | secondary dm | [but then they wanted to]<joint-sequence> |
graphical | colon, dash, semicolon | [Let me tell you a story :]<organization-preparation> |
| layout | [Introduction]<organization-heading> |
| items in sequence | 1. wash [2. cut]<joint-list> |
| parentheses, quotation marks | it rained [(and snowed a bit)]<elaboration-additional> |
| question mark | [Did you?]<topic-question> No. |
lexical | alternate expression | He agreed. [That is he said yes]<restatement-repetition> |
| indicative word/phrase | They planned a party! [That’s nice/Can’t wait!]<evaluation-comment> |
morphological | mood | Go with them [I think you should]<explanation-motivation> |
| tense | I started an hour ago, [now I’m resting]<joint-sequence> |
numerical | same count | [Two reasons.]<organization-preparation> First. . . |
reference | comparative | [I don’t want it]<adversative-antithesis> I want another one. |
| demonstrative / personal | They met Kim. [This person / she was. . . ]<elaboration-additional> |
| propositional | They met Kim. [This encounter was. . . ]<elaboration-additional> |
semantic | antonymy | Beer is cheap, [wine is expensive]<adversative-contrast> |
| attribution source | [Kim said]<attribution-positive> they would |
| lexical chain | it was funny [so they laughed]<causal-result> |
| meronymy | The house was big, [the door two meters tall]<elaboration-additional> |
| negation | Kim danced, [Yun didn’t dance]<adversative-contrast> |
| repetition/synonymy | They met Dr. Kim. [Dr. Kim/The surgeon was. . . ]<elaboration-additional> |
syntactic | infinitival/relative clause | a plan [to win]<purpose-attribute> |
| interrupted matrix clause | [I meant –]<organization-phatic> I mean, |
| modified head | a plan [to win]<purpose-attribute> |
| nominal modifier | articles [explaining chess]<elaboration-attribute> |
| parallel syntactic construction | it’s all tasty [it’s all pretty]<joint-list> |
| past/present participial clause | Kim appeared [dressed in black]<elaboration-attribute> |
| reported speech | [Kim said]<attribution-positive> that they would |
| subject auxiliary inversion | I would have [had I known]<contingency-condition> |
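To make the two enhancements concrete, here is a minimal Python sketch of the vlog analysis as a list of edges with attached signals. The `Edge` and `Signal` classes are hypothetical, not an official eRST data format. The unit labels (such as "1-3" for EDUs 1 to 3), the direction and attachment points of the two secondary edges, and the use of the orphan type for their discourse markers are assumptions based on the prose description. The multinuclear List edges of the primary tree are omitted for brevity.

```python
from collections import Counter
from dataclasses import dataclass, field, replace
from typing import List, Tuple

@dataclass
class Signal:
    type: str                 # signal type from the table, e.g. "dm"
    subtype: str              # signal subtype from the table
    tokens: Tuple[str, ...]   # the words making up the signal

@dataclass
class Edge:
    source: str               # dependent unit (EDU number or span)
    target: str               # unit the relation points to
    relation: str
    secondary: bool = False   # True for edges outside the primary tree
    signals: List[Signal] = field(default_factory=list)

def tree_violations(edges):
    """Units with more than one outgoing primary edge, which a tree forbids."""
    counts = Counter(e.source for e in edges if not e.secondary)
    return sorted(unit for unit, n in counts.items() if n > 1)

edges = [
    Edge("2-3", "1", "cause",
         signals=[Signal("dm", "discourse markers", ("because",))]),
    Edge("1-3", "4-6", "concession"),
    Edge("6", "5", "evaluation",
         signals=[Signal("lexical", "indicative word/phrase", ("different",)),
                  Signal("lexical", "indicative word/phrase", ("worth",))]),
    # Secondary edges: source and target are assumptions (see lead-in).
    Edge("4-6", "1-3", "result", secondary=True,
         signals=[Signal("orphan", "secondary dm", ("So",))]),
    Edge("6", "4-5", "list", secondary=True,
         signals=[Signal("orphan", "secondary dm", ("and",))]),
]

# The primary edges alone respect the tree constraint.
print(tree_violations(edges))

# Treating every edge as primary would give some unit two outgoing edges.
all_primary = [replace(e, secondary=False) for e in edges]
print(tree_violations(all_primary))
```

Keeping the secondary flag on the edge rather than in a separate structure means that the primary tree can always be recovered by filtering, while the extra relations remain available when needed.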