Notes: Modality and Negation: An Introduction to the Special Issue (2012)
by Roser Morante, University of Antwerp & Caroline Sporleder, Saarland University
http://www.anthology.aclweb.org/J/J12/J12-2001.pdf

“Certainty”? & “Modality”?

*Proposition: the container of the truth or falsity value of a declarative sentence. [...]
*Propositional aspects of meaning: elements in sentences that are presented as factual.
*Extra-propositional aspects of meaning:
  *“a further step towards text understanding” (223)
  *linguistic constructions that indicate the speaker's degree of commitment to the truth of a proposition.
  *“there is more to meaning than just propositional content is a long-held view” (223)
  *the attitude of the speaker towards her statements in terms of degree of certainty, reliability, subjectivity, sources of information, and perspective. (225)
*Epistemic modality:
  *“expresses the speaker’s degree of commitment to the truth of a proposition”. (http://www.aclweb.org/anthology/W/W10/W10-3006.pdf)
  *Epistemic modals are used to indicate the possibility or necessity of some piece of knowledge. (Wikipedia, Linguistic_modality)

Traditionally, most research in NLP has focused on propositional aspects of meaning. To truly understand language, however, extra-propositional aspects are equally important. Modality and negation typically contribute significantly to these extra-propositional meaning aspects. Researchers have started to work on modeling factuality, belief and certainty, detecting speculative sentences and hedging, identifying contradictions, and determining the scope of expressions of modality and negation. In this article, we will provide an overview of how modality and negation have been modeled in computational linguistics.

1. Introduction

grammatical phenomena

One of the first categorizations of modality is proposed by Otto Jespersen (1924, The Philosophy of Grammar) in the chapter about Mood, where the grammarian distinguishes between “categories containing an element of will” and categories “containing no element of will.”

*from The Philosophy of Grammar:
  *Would it be possible to place all "moods" in a logically consistent system? This was attempted by grammarians more than a hundred years ago on the basis of first Wolff's and then Kant's philosophy. The former in his Ontology had the three categories, possibility, necessity and contingency, and the latter under the head of "modality" the three of possibility, existence, and necessity; Gottfried Hermann then gave the further subdivisions: objective possibility (conjunctive), subjective possibility (optative), objective necessity (Greek verbal adjectives in -teos) and subjective necessity (imperative). (Jespersen, 1924)

Sentences that add different extra-propositional meanings to the event LAY OFF(GM, workers):
*a. GM will lay off workers.
*b. A spokesman for GM said GM will lay off workers.
*c. GM may lay off workers.
*d. The politician claimed that GM will lay off workers.
*e. Some wish GM would lay off workers.
*f. Will GM lay off workers?
*g. Many wonder whether GM will lay off workers.

Generally speaking, modality is a grammatical category that allows the expression of aspects related to the attitude of the speaker towards her statements in terms of degree of certainty, reliability, subjectivity, sources of information, and perspective. We understand modality in a broad sense, which involves related concepts like “subjectivity”, “hedging”, “evidentiality”, “uncertainty”, “committed belief,” and “factuality”.
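
One way to picture what sentences (a) to (g) share, and where they differ, is to hold the proposition fixed and vary only a wrapper around it. The sketch below is illustrative only: the class names `Proposition` and `Statement`, the `marker` field, and the choice of which surface cue to record for each sentence are assumptions made for these notes, not an annotation scheme taken from the paper.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Proposition:
    """The propositional core: a predicate and its arguments."""
    predicate: str
    args: Tuple[str, ...]


@dataclass(frozen=True)
class Statement:
    """A sentence, the proposition it is about, and a surface cue (hypothetical field)."""
    text: str
    proposition: Proposition
    marker: Optional[str]  # None when the sentence is a plain assertion


# The single event that all seven sentences talk about.
LAY_OFF = Proposition("LAY_OFF", ("GM", "workers"))

# Markers are hand-picked for illustration; they are not the paper's labels.
STATEMENTS = [
    Statement("GM will lay off workers.", LAY_OFF, None),
    Statement("A spokesman for GM said GM will lay off workers.", LAY_OFF, "said"),
    Statement("GM may lay off workers.", LAY_OFF, "may"),
    Statement("The politician claimed that GM will lay off workers.", LAY_OFF, "claimed"),
    Statement("Some wish GM would lay off workers.", LAY_OFF, "wish"),
    Statement("Will GM lay off workers?", LAY_OFF, "?"),
    Statement("Many wonder whether GM will lay off workers.", LAY_OFF, "wonder"),
]


def plain_assertions(statements):
    """Return the statements that carry no extra-propositional marker."""
    return [s for s in statements if s.marker is None]


if __name__ == "__main__":
    for s in STATEMENTS:
        print(f"{s.marker!s:>8}  {s.text}")
```

A purely propositional analysis would treat all seven entries as the same event; only the `marker` column separates the plain assertion in (a) from the rest.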
So far, computational linguistics has addressed two main tasks:
*detecting modality
  *lexically based, but the lexical markers are varied/heterogeneous
    *for example: 'might', 'this brings us to the largest of all mysteries', 'little was known'
  *interacts with mood and tense markers
  *and so do discourse factors(?) (224)
*resolving the scope of modality

Modality recognition is used for:
*textual entailment (meaningful relations)
*machine translation
*trustworthiness detection
*classification of citations
*clinical and biomedical text processing
*identification of text structure

Most of the work in this area has been carried out at the sentence or predicate level.

2. Modality

From a theoretical perspective, modality can be defined
*as a philosophical concept,
*as a subject of the study of logic,
*as a grammatical category.

Modality: a way to shake answers loose from their factualness and turn them into open questions again. New observations come up as the subject matter changes.

Theoretical linguistic background of 'modality':
*Jespersen (1924, page 329) attempts to place all moods in a logically consistent system, distinguishing between “categories containing an element of will” and “categories containing no element of will”
  *later named propositional modality and event modality by Palmer (1986).
*Lyons (1977, page 793) describes epistemic modality as concerned with matters of knowledge and belief, “the speaker’s opinion or attitude towards the proposition that the sentence expresses or the situation that the proposition describes.”
*Palmer (1986, page 8) distinguishes propositional modality, which is “concerned with the speaker’s attitude to the truth-value or factual status of the proposition”.

Within propositional modality, Palmer defines two types:
*epistemic, used by speakers “to express their judgement about the factual status of the proposition,”
*evidential, used “to indicate the evidence that they have for its factual status” (Palmer 1986, 8–9).

Within event modality, Palmer likewise defines two types:
*deontic, which relates to obligation or permission and to conditional factors “that are external to the relevant individual,”
*dynamic, where the factors are internal to the individual (Palmer 1986, pages 9–13).

Additionally, Palmer indicates other categories that may be marked as irrealis and may be found in the mood system:
*future
*negative
*interrogative
*imperative-jussive
*presupposed
*conditional
*purposive
*resultative
*wishes
*fears

The term hedging is originally due to Lakoff (1972, page 195), who describes hedges as “words whose job is to make things more or less fuzzy.” (...) Lakoff starts from the observation that “natural language concepts have vague boundaries and fuzzy edges and that, consequently, natural language sentences will very often be neither true, nor false, nor nonsensical, but rather true to a certain extent and false to a certain extent, true in certain aspects and false in certain aspects” (Lakoff 1972, page 183). In order to deal with this aspect of language, he extends classical propositional and predicate logic to fuzzy logic and focuses on the study of hedges. (227)

Certainty is a type of subjective information that can be conceived of as a variety of epistemic modality (Rubin, Liddy, and Kando 2005). Here we take their definition (page 65):

  . . . certainty is viewed as a type of subjective information available in texts and a form of epistemic modality expressed through explicitly-coded linguistic means. Such devices [...] explicitly signal presence of certainty information that covers a full continuum of writer’s confidence, ranging from uncertain possibility and withholding full commitment to statements.

Modality and evidentiality are grammatical categories, whereas certainty, hedging, and subjectivity are pragmatic positions, and event factuality is a level of information. (228)

Modality-related phenomena are not rare.
*11% of sentences in MEDLINE contain speculative language.
  (Light, Qiu, and Srinivasan 2004)
*around 18% of sentences occurring in biomedical abstracts are speculative. (Vincze et al. 2008)
*20% of the events in a biomedical corpus belong to speculative sentences, and 7% of the events are expressed with some degree of speculation. (Nawaz, Thompson, and Ananiadou 2010)
*a significant proportion of the gene names mentioned in a corpus of biomedical articles appear in speculative sentences (638 occurrences out of a total of 1,968). This means that approximately 1 in every 3 genes should be excluded from the interaction detection process. (Szarvas 2008)
*59% of the sentences in a corpus of 80 articles from The New York Times were identified as epistemically modalized. (Rubin 2006)

4. Categorizing and Annotating Modality and Negation

categorization schemes
annotated corpora

'modality attributes': OntoSem project (Nirenburg and Raskin 2004):
*modality type
  *polarity - whether a proposition is positive or negated
  *volition - the extent to which someone wants or does not want the event/state to occur
  *obligation - the extent to which someone considers the event/state to be necessary
  *belief - the extent to which someone believes the content of the proposition
  *potential - the extent to which someone believes that the event/state is possible
  *permission - the extent to which someone believes that the event/state is permitted
  *evaluative - the extent to which someone believes the event/state is a good thing
*value
  *0-1
*scope
  *predicate that is affected by the modality of the sentence
*attributed-to
  *to whom the modality is assigned (default = speaker)

(9) Entrance to the tower should be totally camouflaged

In Example (9), 'should' is identified as a modality cue and characterized with the type obligative, value 0.8, scope 'camouflage', and is attributed to the speaker.

FactBank: (Saurí
and Pustejovsky 2009) a corpus of events annotated with factuality information

degrees of factuality:
*fact
*counterfact
*probable
*not probable
*possible
*not certain
*certain but unknown output
*unknown or uncommitted

based on Horn’s (1989) analysis of epistemic modality in terms of scalar predication: (234)
*positive: {certain, {probable / likely}, possible}
*negative counterpart: {uncertain, {unlikely / improbable}, impossible}

modality lexicon (Baker et al., 2010)
http://www.umiacs.umd.edu/~bonnie/ModalityLexicon.txt
used to automatically annotate a corpus with modality information

the structure of a lexicon entry:
*cue sequence of modal words
*POS for each word
*modality type
*a head word
*one or more subcategorization codes

three components are identified in a sentence:
*trigger - the word or sequence of words that expresses modality
*target - the event, state, or relation that the modality scopes over
*holder - the experiencer or cognizer of the modality

eight modalities:
*requirement - (does H require P?)
*permissive - (does H allow P?)
*success - (does H succeed in P?)
*effort - (does H try to do P?)
*intention - (does H intend P?)
*ability - (can H do P?)
*want - (does H want P?)
*belief - (with what strength does H believe P?)

The annotation work by Wilbur, Rzhetsky, and Shatkay (2006) is motivated by the need to identify and characterize parts of scientific documents where reliable information can be found. They define five dimensions to characterize scientific sentences:
*FOCUS (scientific versus general)
*POLARITY (positive versus negative statement)
*LEVEL OF CERTAINTY in the range 0–3
*STRENGTH of evidence
*DIRECTION / TREND (increase or decrease in certain measurement).

5. Detection of Speculative Sentences

Three types of text analysis seem to be able to detect speculation:

From the research presented in this section it seems that classifying sentences as to whether they are speculative or not can be performed by using knowledge-poor machine learning approaches as well as by linguistically motivated methods. It has also been shown that it is feasible to build a hedge classifier in an unsupervised manner. (241)

10. Final Remarks

Open question: which aspects of extra-propositional meaning need to be modeled for which applications? Outside sentiment analysis, relatively little research has been carried out in this area so far.

Most research so far has been carried out on English and on selected domains and genres (biomedical, reviews, newswire). It would also be good to broaden the set of domains and genres (including fiction, scientific texts, weblogs, etc.) since extra-propositional meaning is particularly susceptible to domain and genre effects.
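
Returning to the speculation detection task of Section 5: the simplest possible starting point is plain lexical cue matching. The sketch below is a deliberately naive baseline written for these notes, far simpler than the machine learning and linguistically motivated methods the survey reports. The cue list `HEDGE_CUES` and the function names `find_cues` and `is_speculative` are assumptions; the cues are a handful of hand-picked examples, not entries from any published lexicon.

```python
import re

# Hypothetical, hand-picked cue list for illustration only.
HEDGE_CUES = ("may", "might", "possibly", "suggests", "little was known")


def find_cues(sentence, cues=HEDGE_CUES):
    """Return the cues that occur in the sentence as whole words or phrases."""
    lowered = sentence.lower()
    return [
        cue
        for cue in cues
        if re.search(r"\b" + re.escape(cue) + r"\b", lowered)
    ]


def is_speculative(sentence, cues=HEDGE_CUES):
    """Flag a sentence as speculative if it contains at least one cue."""
    return bool(find_cues(sentence, cues))


if __name__ == "__main__":
    for text in (
        "GM will lay off workers.",
        "GM may lay off workers.",
        "Little was known about the gene.",
    ):
        print(is_speculative(text), find_cues(text), text)
```

Because matching is case-insensitive and ignores context, a sentence such as "In May, GM laid off workers." would be flagged as well. Ambiguous and heterogeneous cues of this kind are one reason plain word lists fall short, and the sketch says nothing about the scope of a cue.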