Citations
Publications and citation information
License and attribution information
GUM is made available under a Creative Commons license in keeping with the underlying texts. The documents from Wikimedia (Wikinews, including interviews, and Wikivoyage) are available under a CC-BY attribution license, as are academic articles, Wikipedia biographies, OpenStax textbooks and YouTube vlogs (retrieved using YouTube's Creative Commons filtered search). Some of the political speeches included in the corpus did not specify exact licenses, but are made available by official government and UN websites which indicate that these speeches are in the public domain, and not subject to copyright. Conversations from the Santa Barbara Corpus have been made available for annotation in GUM under the CC-BY license, courtesy of Jack DuBois (UCSB).
However, please note that wikiHow texts and fiction texts are made available under a CC-BY-NC-SA license (non-commercial, share alike), meaning that commercial and/or non-open source use of those texts is prohibited. Data from Reddit forum discussions is not distributed with the corpus, but can be obtained using a script under the licensing conditions imposed by Reddit. When using the data, please make sure to cite the sources of the texts as required by their source sites, and give credit to the GUM annotators, who are listed below, for the annotated data.
Academic citations
As a scholarly citation for the corpus in articles, please use the paper most closely matching your use case:
- General citation for the corpus:
Zeldes, Amir (2017) The GUM Corpus: Creating Multilayer Resources in the Classroom. Language Resources and Evaluation 51(3), 581–612.
- Papers using the Reddit subcorpus:
Behzad, Shabnam and Zeldes, Amir (2020) A Cross-Genre Ensemble Approach to Robust Reddit Part of Speech Tagging. In: Proceedings of the 12th Web as Corpus Workshop (WAC-XII), 50–56.
- Papers focusing on entities:
Jessica Lin and Amir Zeldes (2021), WikiGUM: Exhaustive Entity Linking for Wikification in 12 Genres. In: Proceedings of The Joint 15th Linguistic Annotation Workshop (LAW) and 3rd Designing Meaning Representations (DMR) Workshop (LAW-DMR 2021). Punta Cana, Dominican Republic, 170–175.
- For papers focusing on the salience annotations, please cite this paper instead:
Jessica Lin and Amir Zeldes (2024), GUMsley: Evaluating Entity Salience in Summarization for 12 English Genres. In: Proceedings of EACL 2024. St. Julian's, Malta, 2575–2588.
- Papers focusing on summarization:
Yang Janet Liu and Amir Zeldes (2023), GUMSum: Multi-Genre Data and Evaluation for English Abstractive Summarization. In: Findings of ACL 2023. Toronto, 9315–9327.
- Papers using the OntoGUM annotations for coreference:
Yilun Zhu, Sameer Pradhan, and Amir Zeldes (2021), OntoGUM: Evaluating Contextualized SOTA Coreference Resolution on 12 More Genres. In: Proceedings of ACL-IJCNLP 2021. Bangkok, Thailand, 461–467.
- For papers focusing on the discourse relations, discourse markers or other discourse signal annotations, please cite the eRST paper:
Zeldes, Amir, Tatsuya Aoyama, Yang Liu, Siyao Peng, Debopam Das & Luke Gessler (2025) eRST: A Signaled Graph Theory of Discourse Relations and Organization. Computational Linguistics 51(1), 23–72.
- For papers using GDTB/PDTB-style shallow discourse relations, please cite:
Yang Janet Liu, Tatsuya Aoyama, Wesley Scivetti, Yilun Zhu, Shabnam Behzad, Lauren Elizabeth Levine, Jessica Lin, Devika Tiwari, and Amir Zeldes (2024), GDTB: Genre Diverse Data for English Shallow Discourse Parsing across Modalities, Text Types, and Domains. In: Proceedings of EMNLP 2024. Miami, FL, 12287–12303.
-
CAMPOS, Ricardo, Ana Pacheco, Ana Fernandes, Inês Cantante, Rute Rebouças, L. Cunha, José Isidro, J. Evans, Miguel Marques, Rodrigo Batista, Evelin Amorim, A. Jorge, Nuno Guimarães, S. Nunes, António Leal & Purificação Silvano (2026) CitiLink-Minutes: A Multilayer Annotated Dataset of Municipal Meeting Minutes. [annotation]
-
HU, Yiran, Huanghai Liu, Chong Wang, Kunran Li, Tien-Hsuan Wu, Haitao Li, Xinran Xu, Siqing Huo, Weihang Su, Ning Zheng, Siyuan Zheng, Qingyao Ai, Yun Liu, Renjun Bian, Yiqun Liu, Charles Clarke, Weixing Shen & Ben Kao (2026) Evaluation of Large Language Models in Legal Applications: Challenges, Methods, and Future Directions. arXiv.org. [LLM]
-
JIANG, Shuai, Marc Salvadó-Benasco, Eric Cyr, Alena Kopanicáková, Rolf Krause & Jacob Schroder (2026) Layer-Parallel Training for Transformers. arXiv.org. [other]
-
KAJIKAWA, Kohei, Shinnosuke Isono & E. Wilcox (2026) Information-Theoretic Storage Cost in Sentence Comprehension. [syntax]
-
LI, Junyi, Yang Liu, Kanishka Misra, Valentina Pyatkin & William Sheffield (2026) Which course? Discourse! Teaching Discourse and Generation in the Era of LLMs. [discourse]
-
LIU, Dongqi, Hang Ding, Qiming Feng, Jian Li, Xurong Xie, Zhucun Xue, Chengjie Wang, Jiang-She Zhang & Yabiao Wang (2026) Disco-RAG: Discourse-Aware Retrieval-Augmented Generation. arXiv.org. [discourse, summarization]
-
PRAJAPATI, Priyanka, Vishal Goyal & Kawaljit Kaur (2026) Development of an Annotation Tool and Corpus for Anaphora Resolution in the Punjabi Language. SN Computer Science. [coreference, annotation]
-
XU, Weijie & Richard Futrell (2026) Strategic resource allocation in memory encoding: An efficiency principle shaping language processing. Journal of Memory and Language. [other]
-
YANG, Xiulin, Heidi Getz & E. Wilcox (2026) From Linear Input to Hierarchical Structure: Function Words as Statistical Cues for Language Learning. [syntax]
-
YAO, Jianchao & Bin Song (2026) BAP-PR: Boundary-aware prompting progressive reasoning for few-shot named entity recognition. Neurocomputing. [entities]
-
ALTHANI, Fatima, Chris Madge & Massimo Poesio (2025) Style Matters: Exploring the Role of Art Style in AI-Generated Images on Engagement in a Gamified Text Labelling Interface. ACM SIGCHI Italian Chapter International Conference on Computer-Human Interaction. [LLM]
-
ALTHANI, Fatima, Chris Madge & Massimo Poesio (2025) How Task Complexity Moderates the Impact of AI-Generated Images on User Experience in Gamified Text Labelling. 2025 IEEE Conference on Games (CoG). [other]
-
BAIUK, Ilia, A. Baiuk & Maria Petrova (2025) CoBaLDParser: Joint Morphosyntactic and Semantic Annotation. Computational Linguistics and Intellectual Technologies. [semantics, syntax, annotation]
-
BANERJEE, Souvik, Yi Fan & Michael Strube (2025) HITS at DISRPT 2025: Discourse Segmentation, Connective Detection, and Relation Classification. Proceedings of the 4th Shared Task on Discourse Relation Parsing and Treebanking (DISRPT 2025). [discourse]
-
BAYANGALI, Abdygalym, M. Sambetbayeva, A. Yerimbetova, A. Nekessova, N. Tasbolatuly, Nunzhigit Smailov & A. Nazymkhan (2025) NLP Models for Military Terminology Analysis and Detection of Information Operations on Social Media. De Computis. [discourse, annotation]
-
BOLTAYEVICH, Elov, Abdisalomova Qizi & Jumayeva Baxshulloyevna (2025) Creating an Annotated Dataset for Coreference Resolution in Uzbek Texts Based on the CoNLL-2012 Format. 2025 10th International Conference on Computer Science and Engineering (UBMK). [coreference]
-
BRAUD, Chloé, Amir Zeldes, Chuyuan Li, Yang Liu & Philippe Muller (2025) The DISRPT 2025 Shared Task on Elementary Discourse Unit Segmentation, Connective Detection, and Relation Classification. Proceedings of the 4th Shared Task on Discourse Relation Parsing and Treebanking (DISRPT 2025). [discourse]
-
BU, Lanni, Lauren Levine & Amir Zeldes (2025) DiscoTrack: A Multilingual LLM Benchmark for Discourse Tracking. arXiv.org. [discourse, coreference, summarization, LLM, salience]
-
BUAPHET, Weerayut, Peerat Limkonchotiwat, Attapol Rutherford, Can Udomcharoenchaikit & Sarana Nutanong (2025) LLM-Augmented Prototype Representation for Few-Shot Named-Entity Recognition. IEEE Access. [entities, LLM]
-
CASOLA, Silvia, Yang Liu, Siyao Peng, Oliver Kraus, Albert Gatt & Barbara Plank (2025) References Matter: Investigating the Impact of Reference Set Variation on Summarization Evaluation. [summarization, LLM]
-
CHEN, Beiduo, Yang Liu, Anna Korhonen & Barbara Plank (2025) Threading the Needle: Reweaving Chain-of-Thought Reasoning to Explain Human Label Variation. Conference on Empirical Methods in Natural Language Processing. [discourse, LLM]
-
CHISTOVA, Elena (2025) Methods for Rhetorical Structure Parsing in Russian. Scientific and Technical Information Processing. [discourse]
-
CHISTOVA, Elena (2025) Bridging Discourse Treebanks with a Unified Rhetorical Structure Parser. Proceedings of the 6th Workshop on Computational Approaches to Discourse, Context and Document-Level Inferences (CODI 2025). [coreference, discourse, syntax]
-
DEVI, S., Pattabhi Rao & Vijay Ram (2025) SeCoRel: Multilingual Discourse Analysis in DISRPT 2025. Proceedings of the 4th Shared Task on Discourse Relation Parsing and Treebanking (DISRPT 2025). [discourse]
-
DOBROVOLJC, Kaja (2025) Counting trees: A treebank-driven exploration of syntactic variation in speech and writing across languages. arXiv.org. [syntax]
-
EICHIN, Florian, Yang Liu, Barbara Plank & Michael Hedderich (2025) Probing LLMs for Multilingual Discourse Generalization Through a Unified Label Set. Annual Meeting of the Association for Computational Linguistics. [discourse, salience]
-
GACKOU, Hamady (2025) Refining Syntactic Distinctions Using Decision Trees: A Paper on Postnominal 'That' in Complement vs. Relative Clauses. arXiv.org. [syntax]
-
GENEST, Pierre-Yves, P. Portier, Elöd Egyed-Zsigmond & M. Lovisetto (2025) OWNER — Toward Unsupervised Open-World Named Entity Recognition. IEEE Access. [entities, LLM]
-
HAQ, Muhammad, Davide Rigoni & Alessandro Sperduti (2025) LLMs as Data Annotators: How Close Are We to Human Performance. arXiv.org. [entities, LLM]
-
HUANG, Yi, Yuhan Gao & Chengjuan Ren (2025) A survey of data augmentation in named entity recognition. Neurocomputing. [entities]
-
JU, Zhuoxuan, Jingni Wu, Abhishek Purushothama & Amir Zeldes (2025) DeDisCo at the DISRPT 2025 Shared Task: A System for Discourse Relation Classification. Proceedings of the 4th Shared Task on Discourse Relation Parsing and Treebanking (DISRPT 2025). [discourse]
-
LEVINE, Lauren & Amir Zeldes (2025) GUMBridge: a Corpus for Varieties of Bridging Anaphora. arXiv.org. [coreference, discourse, annotation]
-
LEVINE, Lauren & Amir Zeldes (2025) Subjectivity in the Annotation of Bridging Anaphora. Proceedings of the 19th Linguistic Annotation Workshop (LAW-XIX-2025). [coreference, entities, discourse]
-
LEVINE, Lauren, Junghyun Min & Amir Zeldes (2025) Building UD Cairo for Old English in the Classroom. arXiv.org. [LLM, syntax]
-
LI, Michael & Nishant Subramani (2025) Echoes of BERT: Do Modern Language Models Rediscover the Classical NLP Pipeline? [semantics, discourse, LLM, syntax]
-
LI, Xingyu, Chen Gong & Guohong Fu (2025) Multimodal Coreference Resolution for Chinese Social Media Dialogues: Dataset and Benchmark Approach. Annual Meeting of the Association for Computational Linguistics. [coreference, semantics, annotation]
-
LI, Chuyuan & Giuseppe Carenini (2025) BeDiscovER: The Benchmark of Discourse Understanding in the Era of Reasoning Language Models. arXiv.org. [semantics, discourse, LLM]
-
LIN, Jessica & Amir Zeldes (2025) GUM-SAGE: A Novel Dataset and Approach for Graded Entity Salience Prediction. Annual Meeting of the Association for Computational Linguistics. [entities, summarization, salience]
-
LIU, Zhengyuan, Ke Shi & Nancy Chen (2025) Transformer-based document-level discourse processing: Exploiting prior language knowledge and hierarchical parsing. Computer Speech and Language. [discourse]
-
LIU, Yahui, Zhenghua Li, Chen Gong, Shilin Zhou & Min Zhang (2025) Annotation error detection in painstakingly annotated data: Part-of-speech tagging as a case study. Expert Systems with Applications. [annotation]
-
LIU, Xiaoya, Senlin Luo, Zhouting Wu, Limin Pan & Xinshuai Li (2025) Joint contrastive learning with semantic enhanced label referents for few-shot NER. Neurocomputing. [entities, semantics]
-
LIU, Meinan, Yunfang Dong, Xixian Liao & Bonnie Webber (2025) Multi-token Mask-filling and Implicit Discourse Relations. Conference on Empirical Methods in Natural Language Processing. [discourse]
-
LOPES, Lucelene, M. Nunes, M. Duran & T. Pardo (2025) A sintaxe no tribunal: apresentando e explorando um corpus jurídico em português anotado sintaticamente segundo o modelo Universal Dependencies. Brazilian Symposium in Information and Human Language Technology. [syntax]
-
LÜCKING, Andy & Jonathan Ginzburg (2025) Exceptions from rules and noteworthy exceptions: the balance scale for making exceptions. Linguistics and Philosophy. [other]
-
MAO, Jiannan, Chenchen Ding, Hour Kaing, Hideki Tanaka, Masao Utiyama & Tadahiro Matsumoto (2025) Data Augmentation for Low-Resource Languages in Multilingual Dependency Parsing. Journal of Natural Language Processing. [syntax]
-
MERCHANT, Rayyan & Kevin Tang (2025) ParsTranslit: Truly Versatile Tajik-Farsi Transliteration. arXiv.org. [other]
-
MIAO, Yisong & Min-Yen Kan (2025) Discursive Circuits: How Do Language Models Understand Discourse Relations? Conference on Empirical Methods in Natural Language Processing. [coreference, semantics, discourse, LLM]
-
MÜLLER-EBERSTEIN, Max, Rob Goot & Anna Rogers (2025) DECAF: A Dynamically Extensible Corpus Analysis Framework. Annual Meeting of the Association for Computational Linguistics. [other]
-
NAKAISHI, Kai, Ryosuke Yoshida, Kohei Kajikawa, Koji Hukushima & Yohei Oseki (2025) Rethinking the Relationship between the Power Law and Hierarchical Structures. arXiv.org. [semantics, discourse, syntax]
-
NOVÁK, Michal, Miloslav Konopík, A. Nedoluzhko, M. Popel, O. Pražák, Jakub Sido, Milan Straka, Zdeněk Žabokrtský & Daniel Zeman (2025) Findings of the Fourth Shared Task on Multilingual Coreference Resolution: Can LLMs Dethrone Traditional Approaches? Proceedings of the Eighth Workshop on Computational Models of Reference, Anaphora and Coreference. [coreference, LLM]
-
OGRODNICZUK, Maciej, Anna Latusek, Karol Saputa, Alina Wróblewska, Daniel Ziembicki, Bartosz Żuk, Martyna Lewandowska, Adam Okrasiński, Paulina Rosalska, Anna Śliwicka, Aleksandra Tomaszewska & S. Zurowski (2025) Where Frameworks (Dis)agree: A Study of Discourse Segmentation. Proceedings of the 6th Workshop on Computational Approaches to Discourse, Context and Document-Level Inferences (CODI 2025). [discourse]
-
PASTOR, Martial, Nelleke Oostdijk, Patricia Martín-Rodilla & Javier Parapar (2025) Enhancing Discourse Parsing for Local Structures from Social Media with LLM-Generated Data. International Conference on Computational Linguistics. [discourse, LLM]
-
PEREIRA, Mateus & J. Souza (2025) Anotação Enhanced Rhetorical Structure Theory em textos de User-Generated Content. Brazilian Symposium in Information and Human Language Technology. [discourse]
-
PUJOL, Robin, Firmin Rousseau, Philippe Muller & Chloé Braud (2025) DisCuT and DiscReT: MELODI at DISRPT 2025 Multilingual discourse segmentation, connective tagging and relation classification. Proceedings of the 4th Shared Task on Discourse Relation Parsing and Treebanking (DISRPT 2025). [discourse]
-
SANTOS, Marquize, Roana Rodrigues & J. Souza (2025) A oposição nas relações de coerência Contrast e Concession. Brazilian Symposium in Information and Human Language Technology. [discourse]
-
SHAO, Ruihan, Xizhong Qin & Luyi Yang (2025) PINER: a few-shot named entity recognition method based on prompt learning and information fusion. Conference on Computer Graphics, Artificial Intelligence, and Data Processing. [entities]
-
SHARMA, Neeru & Saravjeet Singh (2025) A Hybrid Extractive and Encoder-Decoder-Based Approach for Mitigating Hallucination in Automatic Text Summarization. Journal of Transformative Technologies and Sustainable Development. [summarization]
-
TANG, Hanlin, Yanjie Jiang, Yuxia Zhang, Nan Niu & Hui Liu (2025) POS Tagging on Code Identifiers: How Far Are We? ACM Transactions on Software Engineering and Methodology. [syntax, annotation]
-
TIAN, Wei, Shuangshuang Xu, Yongwei Wang, Hao Li & Hao Zhu (2025) UnionPromptNER serves as a union prompting method to bridge few-shot named entity recognition. Scientific Reports. [coreference, semantics, entities]
-
TONG, Meihan & Shuai Wang (2025) NovelCR: A Large-Scale Bilingual Dataset Tailored for Long-Span Coreference Resolution. Annual Meeting of the Association for Computational Linguistics. [coreference]
-
TURK, Nawar, Daniele Comitogianni & Leila Kosseim (2025) CLaC at DISRPT 2025: Hierarchical Adapters for Cross-Framework Multi-lingual Discourse Relation Classification. Proceedings of the 4th Shared Task on Discourse Relation Parsing and Treebanking (DISRPT 2025). [discourse]
-
WEI, Fengshun, Shengquan Liu, Bo Kong, Liruizhi Jia, Shouhao Yu & Lu Wang (2025) Cross-Domain Few-Shot Agricultural Pest and Disease Named Entity Recognition Based on a Pseudo-Label Filtering Self-Training Decomposition Framework. Data Intelligence. [entities, semantics]
-
WEISSWEILER, Leonie, Abdullatif Köksal & Hinrich Schütze (2025) Hybrid Human-LLM Corpus Construction and LLM Evaluation for the Caused-Motion Construction. Northern European Journal of Language Technology. [LLM, syntax]
-
WU, Jingni & Amir Zeldes (2025) Unpacking Ambiguity: The Interaction of Polysemous Discourse Markers and Non-DM Signals. Proceedings of the 6th Workshop on Computational Approaches to Discourse, Context and Document-Level Inferences (CODI 2025). [semantics, discourse]
-
XIAO, Yuhui, Jianjian Zou & Qun Yang (2025) Advancing Few-Shot Named Entity Recognition with Large Language Model. Applied Sciences. [entities, LLM]
-
YU, Mei, Yuang Tao, Mankun Zhao, Tianyi Xu, Zechen Meng, Wenbin Zhang & Jian Yu (2025) Class Semantic Prompts Enhanced Prototypical Fusion Method for Few-shot Named Entity Recognition. IEEE International Conference on Acoustics, Speech, and Signal Processing. [entities, semantics]
-
YU, Shuai, Wei Gao, Yongbin Qin, Caiwei Yang, Ruizhang Huang, Yanping Chen & Chuan Lin (2025) IterSum: Iterative summarization based on document topological structure. Information Processing & Management. [summarization]
-
YUAN, Jingting, Qiuhan Lin & J. Lee (2025) Discourse complexity measures as indices of L2 writing proficiency: Insights from propositional connection and relation. Journal of Second Language Writing. [discourse]
-
ZELDES, Amir & Jessica Lin (2025) What makes an entity salient in discourse? arXiv.org. [entities, discourse, summarization, salience]
-
ZHANG, Longyin, Xin Tan, Fang Kong & Guodong Zhou (2025) Self-Trained and Self-Purified Data Augmentation for RST Discourse Parsing. IEEE Transactions on Audio, Speech, and Language Processing. [discourse]
-
ZHAO, Suxian, Nan Yu, Chen Gong & Guohong Fu (2025) Speaker Intention Enhanced Dialogue Discourse Parsing. IEEE International Joint Conference on Neural Network. [semantics, discourse, LLM]
-
ZHAO, Qihui, Tianhan Gao & Nan Guo (2025) A multi-granularity in-context learning method for few-shot Named Entity Recognition via Knowledgeable Parameters Fine-tuning. Information Processing & Management. [entities]
-
ŽITKO, Branko, A. Gašpar, Lucija Bročić, Daniel Vasić & Ani Grubišić (2025) Human–machine interaction in building an English reference dataset for natural language processing tasks. Language Resources and Evaluation. [other]
-
ADIGA, Rishabh, Lakshminarayanan Subramanian & Varun Chandrasekaran (2024) Designing Informative Metrics for Few-Shot Example Selection. Annual Meeting of the Association for Computational Linguistics. [entities, semantics, LLM]
-
AGARWAL, Raunak (2024) Zero-shot Factual Consistency Evaluation Across Domains. arXiv.org. [summarization]
-
ARNETT, Catherine & Benjamin Bergen (2024) Why do language models perform worse for morphologically complex languages? arXiv.org. [other]
-
BARRETT, M., Max Müller-Eberstein, Elisa Bassignana, Amalie Pauli, Mike Zhang & Rob Goot (2024) Can Humans Identify Domains? International Conference on Language Resources and Evaluation. [other]
-
BLASCHKE, Verena, Barbara Kovačić, Siyao Peng, Hinrich Schütze & Barbara Plank (2024) MaiBaam: A Multi-Dialectal Bavarian Universal Dependency Treebank. International Conference on Language Resources and Evaluation. [syntax, annotation]
-
BLASCHKE, Verena, Barbara Kovačić, Siyao Peng & Barbara Plank (2024) MaiBaam Annotation Guidelines. arXiv.org. [syntax, annotation]
-
BÜYÜKTEKIN, Faruk & Umut Özge (2024) A coreference corpus of Turkish situated dialogs. SIGTURK. [coreference, annotation]
-
CHEN, Wei, Lili Zhao, Zhi Zheng, Tong Xu, Yang Wang & Enhong Chen (2024) Double-Checker: Large Language Model as a Checker for Few-shot Named Entity Recognition. Conference on Empirical Methods in Natural Language Processing. [entities, LLM]
-
CHISTOVA, Elena (2024) Bilingual Rhetorical Structure Parsing with Large Parallel Annotations. Annual Meeting of the Association for Computational Linguistics. [discourse, annotation]
-
CHISTOVA, Elena (2024) End-to-End Argument Mining over Varying Rhetorical Structures. Annual Meeting of the Association for Computational Linguistics. [semantics, discourse, syntax]
-
ERJAVEC, Tomaž, Matyáš Kopp, Nikola Ljubešić, Taja Kuzman, Paul Rayson, P. Osenova, Maciej Ogrodniczuk, Çağrı Çöltekin, Danijel Koržinek, Katja Meden, Jure Skubic, Peter Rupnik, Tommaso Agnoloni, José Aires, Starkaður Barkarson, Roberto Bartolini, Núria Bel, María Pérez, Roberts Darģis, Sascha Diwersy, Maria Gavriilidou, Ruben Heusden, Mikel Iruskieta, N. Kahusk, Anna Kryvenko, Noémi Ligeti-Nagy, Carmen Magariños, Martin Mölder, Costanza Navarretta, K. Simov, Lars Tungland, J. Tuominen, J. Vidler, A. Vladu, Tanja Wissik, Väinö Yrjänäinen & Darja Fišer (2024) ParlaMint II: advancing comparable parliamentary corpora across Europe. Language Resources and Evaluation. [other]
-
FRÖBE, Maik, Harrisen Scells, Theresa Elstner, Christopher Akiki, Lukas Gienapp, Jan Merker, Sean MacAvaney, Benno Stein, Matthias Hagen & Martin Potthast (2024) Resources for Combining Teaching and Research in Information Retrieval Coursework. Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. [other]
-
FUTRELL, Richard & Michael Hahn (2024) Linguistic Structure from a Bottleneck on Sequential Information Processing. Nature Human Behaviour. [semantics, syntax]
-
GAMBA, Federica, Abishek Stephen & Zdeněk Žabokrtský (2024) Universal Feature-based Morphological Trees. Workshop on Multiword Expressions. [other]
-
GHIFFARI, Fadli, Ika Alfina & Kurniawati Azizah (2024) Cross-lingual Transfer Learning for Javanese Dependency Parsing. International Joint Conference on Natural Language Processing. [syntax]
-
GONG, Chen, Dexin Kong, Suxian Zhao, Xingyu Li & Guohong Fu (2024) MODDP: A Multi-modal Open-domain Chinese Dataset for Dialogue Discourse Parsing. Annual Meeting of the Association for Computational Linguistics. [discourse]
-
GUO, Quanjiang, Yihong Dong, Ling Tian, Zhao Kang, Yu Zhang & Sijie Wang (2024) BANER: Boundary-Aware LLMs for Few-Shot Named Entity Recognition. International Conference on Computational Linguistics. [entities, LLM]
-
HAQ, Muhammad, Paolo Frazzetto, A. Sperduti & Giovanni Martino (2024) Improving Soft Skill Extraction via Data Augmentation and Embedding Manipulation. ACM Symposium on Applied Computing. [entities]
-
JASINSKAJA, Katja, Yuting Li, F. Same & David Uerlings (2024) Reference and discourse structure annotation of elicited chat continuations in German. LAW. [discourse, annotation]
-
KYLE, Kristopher & Masaki Eguchi (2024) Evaluating NLP models with written and spoken L2 samples. Research Methods in Applied Linguistics. [other]
-
LEE, Ji-Ung, Marc Pfetsch & Iryna Gurevych (2024) Constrained C-Test Generation via Mixed-Integer Programming. arXiv.org. [LLM]
-
LEVINE, Lauren & Amir Zeldes (2024) Unifying the Scope of Bridging Anaphora Types in English: Bridging Annotations in ARRAU and GUM. CRAC. [coreference]
-
LI, Zihao, Shaoxiong Ji, Timothee Mickus, Vincent Segonne & Jörg Tiedemann (2024) A Comparison of Language Modeling and Translation as Multilingual Pretraining Objectives. Conference on Empirical Methods in Natural Language Processing. [other]
-
LI, Xue & Paul Groth (2024) How different is different? Systematically identifying distribution shifts and their impacts in NER datasets. Language Resources and Evaluation. [entities]
-
LI, Yan, Caren Han, Yue Dai & Feiqi Cao (2024) ChuLo: Chunk-Level Key Information Representation for Long Document Understanding. Annual Meeting of the Association for Computational Linguistics. [semantics]
-
LI, Yan, S. Han, Yue Dai & Feiqi Cao (2024) ChuLo: Chunk-Level Key Information Representation for Long Document Processing. arXiv.org. [other]
-
LIN, Jessica & Amir Zeldes (2024) GUMsley: Evaluating Entity Salience in Summarization for 12 English Genres. Conference of the European Chapter of the Association for Computational Linguistics. [coreference, entities, summarization, LLM, salience]
-
LIU, Yang, Tatsuya Aoyama, Wesley Scivetti, Yilun Zhu, Shabnam Behzad, Lauren Levine, Jessica Lin, Devika Tiwari & Amir Zeldes (2024) GDTB: Genre Diverse Data for English Shallow Discourse Parsing across Modalities, Text Types, and Domains. Conference on Empirical Methods in Natural Language Processing. [discourse, syntax]
-
LIU, Haitao, Weiming Peng & Jihua Song (2024) RepEKShot: an evidential k-nearest neighbor classifier with repulsion loss for few-shot named entity recognition. Journal of Supercomputing. [entities]
-
LIU, Wei, Stephen Wan & Michael Strube (2024) What Causes the Failure of Explicit to Implicit Discourse Relation Recognition? North American Chapter of the Association for Computational Linguistics. [semantics, discourse, syntax]
-
LIU, Zhiwei, Bocheng Huang, Chunming Xia, Yujie Xiong, Zhensen Zang & Yongqiang Zhang (2024) Few-Shot Named Entity Recognition with the Integration of Spatial Features. Wuhan University Journal of Natural Sciences. [entities]
-
LONG, Wanqiu, Siddharth Narayanaswamy & Bonnie Webber (2024) Multi-Label Classification for Implicit Discourse Relation Recognition. Annual Meeting of the Association for Computational Linguistics. [discourse, syntax]
-
LUECKING, Andy, Giuseppe Abrami, Leon Hammerla, Marc Rahn, Daniel Baumartz, Steffen Eger & Alexander Mehler (2024) Dependencies over Times and Tools (DoTT). International Conference on Language Resources and Evaluation. [other]
-
MA, Tingting, Qianhui Wu, Huiqiang Jiang, Jieru Lin, Börje Karlsson, Tiejun Zhao & Chin-Yew Lin (2024) Decomposed Meta-Learning for Few-Shot Sequence Labeling. IEEE/ACM Transactions on Audio Speech and Language Processing. [entities]
-
MAEKAWA, Aru, Tsutomu Hirao, Hidetaka Kamigaito & Manabu Okumura (2024) Can we obtain significant success in RST discourse parsing by using Large Language Models? Conference of the European Chapter of the Association for Computational Linguistics. [discourse]
-
MANIKANTAN, Kawshik, Shubham Toshniwal, Makarand Tapaswi & Vineet Gandhi (2024) Major Entity Identification: A Generalizable Alternative to Coreference Resolution. CRAC. [coreference, entities, LLM]
-
MARTINC, Matej, Matic Perovsek, Nada Lavrač & S. Pollak (2024) Textflows: an open science NLP evaluation approach. Language Resources and Evaluation. [other]
-
METHENITI, Eleni, Philippe Muller, Chloé Braud & Margarita Casas (2024) Zero-shot Learning for Multilingual Discourse Relation Classification. International Conference on Language Resources and Evaluation. [discourse]
-
NAYAK, Kota (2024) Does ChatGPT Measure Up to Discourse Unit Segmentation? A Comparative Analysis Utilizing Zero-Shot Custom Prompts. Canadian AI. [discourse, LLM]
-
NIELSEN, Dan, Kenneth Enevoldsen & Peter Schneider-Kamp (2024) Encoder vs Decoder: Comparative Analysis of Encoder and Decoder Language Models on Multilingual NLU Tasks. NoDaLiDa/Baltic-HLT. [other]
-
NOVÁK, Michal, Šárka Dohnalová, Miloslav Konopík, A. Nedoluzhko, M. Popel, O. Pražák, Jakub Sido, Milan Straka, Zdeněk Žabokrtský & Daniel Zeman (2024) Findings of the Third Shared Task on Multilingual Coreference Resolution. CRAC. [coreference]
-
NUNES, S., A. Jorge, Evelin Amorim, Hugo Sousa, Antonio Leal, Purificação Silvano, Inês Cantante & Ricardo Campos (2024) Text2Story Lusa: A Dataset for Narrative Analysis in European Portuguese News Articles. International Conference on Language Resources and Evaluation. [discourse]
-
PASTOR, Martial & Nelleke Oostdijk (2024) Signals as Features: Predicting Error/Success in Rhetorical Structure Parsing. CODI. [discourse, syntax]
-
POESIO, Massimo, Maciej Ogrodniczuk, Vincent Ng, Sameer Pradhan, Juntao Yu, N. Moosavi, Silviu Paun, Amir Zeldes, A. Nedoluzhko, Michal Novák, M. Popel, Zdeněk Žabokrtský & Daniel Zeman (2024) Universal Anaphora: The First Three Years. International Conference on Language Resources and Evaluation. [coreference]
-
POESIO, Massimo, Maris Camilleri, Paloma Garcia, Juntao Yu & Mark-Christoph Müller (2024) The ARRAU 3.0 Corpus. CODI. [coreference]
-
POLÁKOVÁ, Lucie, Jiří Mírovský, Šárka Zikánová & E. Hajičová (2024) Developing a Rhetorical Structure Theory Treebank for Czech. International Conference on Language Resources and Evaluation. [discourse, syntax]
-
PORADA, Ian & Jackie Cheung (2024) Solving the Challenge Set without Solving the Task: On Winograd Schemas as a Test of Pronominal Coreference Resolution. Conference on Computational Natural Language Learning. [coreference]
-
PORADA, Ian, Xiyuan Zou & Jackie Cheung (2024) A Controlled Reevaluation of Coreference Resolution Models. International Conference on Language Resources and Evaluation. [coreference, semantics]
-
POTTER, Andrew (2024) An Algorithmic Approach to Analyzing Rhetorical Structures. CODI. [discourse]
-
QIN, Libo, Qiguang Chen, Jingxuan Zhou, Qin-zhen Li, Chunlin Lu & Wanxiang Che (2024) Decoupling Breaks Data Barriers: A Decoupled Pre-training Framework for Multi-intent Spoken Language Understanding. International Joint Conference on Artificial Intelligence. [entities]
-
RIZVI, Naba, Harper Strickland, D. Gitelman, Tristan Cooper, Alexis Flores, Michael Golden, Aekta Kallepalli, Akshat Alurkar, Haaset Owens, Saleha Ahmedi, Isha Khirwadkar, Imani Munyaka & Nedjma Ousidhoum (2024) AUTALIC: A Dataset for Anti-AUTistic Ableist Language In Context. Annual Meeting of the Association for Computational Linguistics. [other]
-
SILEO, Damien (2024) tasksource: A Large Collection of NLP tasks with a Structured Dataset Preprocessing Framework. International Conference on Language Resources and Evaluation. [other]
-
SWANSON, Daniel, Bryce Bussert & Francis Tyers (2024) Towards Named-Entity and Coreference Annotation of the Hebrew Bible. LT4HALA. [coreference]
-
TAGUCHI, Chihiro, Jefferson Saransig, Dayana Velásquez & David Chiang (2024) Killkan: The Automatic Speech Recognition Dataset for Kichwa with Morphosyntactic Information. International Conference on Language Resources and Evaluation. [syntax]
-
TIAN, Chang, Wenpeng Yin, Dan Li & Marie-Francine Moens (2024) Fighting Against the Repetitive Training and Sample Dependency Problem in Few-Shot Named Entity Recognition. IEEE Access. [entities, LLM]
-
TOMASZEWSKA, Aleksandra, Purificação Silvano, Antonio Leal & Evelin Amorim (2024) ISO 24617-8 Applied: Insights from Multilingual Discourse Relations Annotation in English, Polish, and Portuguese. International Symposium on Algorithms. [discourse]
-
TOPORKOV, Olia & R. Agerri (2024) Evaluating Shortest Edit Script Methods for Contextual Lemmatization. International Conference on Language Resources and Evaluation. [other]
-
WALDIS, Andreas, Yotam Perlitz, Leshem Choshen, Yufang Hou & Iryna Gurevych (2024) Holmes ⌕ A Benchmark to Assess the Linguistic Competence of Language Models. Transactions of the Association for Computational Linguistics. [semantics, discourse, syntax]
-
WALDIS, Andreas, Yotam Perlitz, Leshem Choshen, Yufang Hou & Iryna Gurevych (2024) Holmes: A Benchmark to Assess the Linguistic Competence of Language Models. [semantics, discourse, syntax]
-
WEISSWEILER, Leonie, Nina Bobel, Kirian Guiller, Santiago Herrera, Wesley Scivetti, Arthur Lorenzi, Nurit Melnik, Archna Bhatia, Hinrich Schütze, Lori Levin, Amir Zeldes, Joakim Nivre, William Croft & N. Schneider (2024) UCxn: Typologically-Informed Annotation of Constructions Atop Universal Dependencies. International Conference on Language Resources and Evaluation. [syntax]
-
XIAO, Yuhui, Qun Yang, Jianjian Zou & Sichi Zhou (2024) LACNER: Enhancing Few-Shot Named Entity Recognition with Label Words and Contrastive Learning. IEEE International Joint Conference on Neural Network. [entities, semantics]
-
XU, Weijie & Richard Futrell (2024) Syntactic dependency length shaped by strategic memory allocation. SIGTYP.LLMsyntax
-
YANG, Junhui & Yuxiang Xu (2024) Few-shot named entity recognition based on triplet loss. Proceedings of the 4th Asia-Pacific Artificial Intelligence and Big Data Forum.entitiessummarization
-
YIN, Yu, Hyunjae Kim, Xiao Xiao, Chih-Hsuan Wei, Jaewoo Kang, Zhiyong Lu, Hua Xu, Meng Fang & Qingyu Chen (2024) Augmenting Biomedical Named Entity Recognition with General-domain Resources. arXiv.org.entities
-
YIN, Yu, Hyunjae Kim, Xiao Xiao, Chih-Hsuan Wei, Jaewoo Kang, Zhiyong Lu, Hua Xu, Meng Fang & Qingyu Chen (2024) Augmenting biomedical named entity recognition with general-domain resources. Journal of Biomedical Informatics.entities
-
ZACZYNSKA, Karolina & Manfred Stede (2024) Rhetorical Strategies in the UN Security Council: Rhetorical Structure Theory and Conflicts. SIGDIAL Conferences.discourse
-
ZELDES, Amir, Tatsuya Aoyama, Yang Liu, Siyao Peng, Debopam Das & Luke Gessler (2024) eRST: A Signaled Graph Theory of Discourse Relations and Organization. Computational Linguistics.discoursesyntax
-
ZHA, Enze, Delong Zeng, Man Lin & Ying Shen (2024) CEPTNER: Contrastive learning Enhanced Prototypical network for Two-stage few-shot Named Entity Recognition. Knowledge-Based Systems.entities
-
ZHANG, Yafeng, Zilan Yu, Yu'ang Huang & Jing Tang (2024) CLLMFS: A Contrastive Learning enhanced Large Language Model Framework for Few-Shot Named Entity Recognition. European Conference on Artificial Intelligence.entitiessemanticsLLM
-
ZHANG, Shan, Bin Cao & Jing Fan (2024) KCL: Few-shot Named Entity Recognition with Knowledge Graph and Contrastive Learning. International Conference on Language Resources and Evaluation.entities
-
ZHAO, Yu, Zhaoyun Ding, Fei Wang, Longyin Zou & Aixin Nian (2024) Integrating Entities in Text Summarization: A Review. ICCBD.entitiessummarization
-
ZHOU, Shijia, Siyao Peng & Barbara Plank (2024) CLIMATELI: Evaluating Entity Linking on Climate Change Data. CLIMATENLP.entitiesannotation
-
ZHU, Yilun, Siyao Peng, Sameer Pradhan & Amir Zeldes (2024) SPLICE: A Singleton-Enhanced PipeLIne for Coreference REsolution. International Conference on Language Resources and Evaluation.coreferenceentitiesdiscoursesyntax
-
ZOU, Xiyuan, Yiran Li, Ian Porada & Jackie Cheung (2024) Separately Parameterizing Singleton Detection Improves End-to-end Neural Coreference Resolution. North American Chapter of the Association for Computational Linguistics.coreference
-
ALTHANI, Fatima, Chris Madge & Massimo Poesio (2023) The Onboarding Phase in a Game for Text Labelling: Comparing the Effect of Animated vs. Textual Onboarding on Player Experience and Accuracy. 2023 IEEE Conference on Games (CoG).other
-
AOYAMA, T., Shabnam Behzad, Luke Gessler, Lauren Levine, Jessica Lin, Yang Liu, Siyao Peng, Yilun Zhu & Amir Zeldes (2023) GENTLE: A Genre-Diverse Multilayer Challenge Set for English NLP and Linguistic Evaluation. Law.discoursecoreferenceentitiessyntaxannotation
-
BENTON, A., Tianze Shi, Ozan Irsoy & Igor Malioutov (2023) Weakly Supervised Headline Dependency Parsing. Conference on Empirical Methods in Natural Language Processing.syntax
-
BRAUD, Chloé, Yang Liu, Eleni Metheniti, Philippe Muller, Laura Rivière, Attapol Rutherford & Amir Zeldes (2023) The DISRPT 2023 Shared Task on Elementary Discourse Unit Segmentation, Connective Detection, and Relation Classification. DISRPT.discourse
-
CHAI, Haixia & Michael Strube (2023) Investigating Multilingual Coreference Resolution by Universal Annotations. Conference on Empirical Methods in Natural Language Processing.coreferencesyntax
-
CHENG, Zifeng, Qingyu Zhou, Zhiwei Jiang, Xuemin Zhao, Yunbo Cao & Qing Gu (2023) Unifying Token- and Span-level Supervisions for Few-shot Sequence Labeling. ACM Trans. Inf. Syst.other
-
CORTEZ, S. & Cassandra Jacobs (2023) Incorporating Annotator Uncertainty into Representations of Discourse Relations. SIGDIAL Conferences.discourse
-
CORTEZ, S. & Cassandra Jacobs (2023) The distribution of discourse relations within and across turns in spontaneous conversation. CODI.discourse
-
DAS, Mridusmita & Apurbalal Senapati (2023) Development of the Co-reference Resolution Tagged Data set in Assamese @ A Semi-Automated Approach. 2023 IEEE Guwahati Subsection Conference (GCON).discoursesummarization
-
DONG, Guanting, Zechen Wang, Jinxu Zhao, Gang Zhao, Daichi Guo, Dayuan Fu, Tingfeng Hui, Chen Zeng, Keqing He, Xuefeng Li, Liwen Wang, Xinyue Cui & Weiran Xu (2023) A Multi-Task Semantic Decomposition Framework with Task-specific Pre-training for Few-Shot NER. International Conference on Information and Knowledge Management.entitiessemantics
-
DONG, Guanting, Zechen Wang, Liwen Wang, Daichi Guo, Dayuan Fu, Yuxiang Wu, Chen Zeng, Xuefeng Li, Tingfeng Hui, Keqing He, Xinyue Cui, QiXiang Gao & Weiran Xu (2023) A Prototypical Semantic Decoupling Method via Joint Contrastive Learning for Few-Shot Named Entity Recognition. IEEE International Conference on Acoustics, Speech, and Signal Processing.entitiessemantics
-
FAN, Yunlong, Bin Li, Yikemaiti Sataer, Miao Gao, Chuanqi Shi, Siyi Cao & Zhiqiang Gao (2023) Hierarchical Clause Annotation: Building a Clause-Level Corpus for Semantic Parsing with Complex Sentences. Applied Sciences.discoursesemanticssummarizationsyntaxannotation
-
FANG, Jinyuan, Xiaobin Wang, Zaiqiao Meng, Pengjun Xie, Fei Huang & Yong Jiang (2023) MANNER: A Variational Memory-Augmented Model for Cross Domain Few-Shot Named Entity Recognition. Annual Meeting of the Association for Computational Linguistics.entities
-
FINDLAY, Jamie, Saeed Salimifar, Ahmet Yıldırım & Dag Haug (2023) Rule-based semantic interpretation for Universal Dependencies. Universal Dependencies Workshop.semanticssyntax
-
GAY, M. & Cristiano Chesi (2023) Extracting an Expectation-based Lexicon for UD Treebanks. CLICIT.syntax
-
GUILLAUME, Bruno (2023) Graph-based multi-layer querying in Parseme Corpora. Workshop on Multiword Expressions.syntax
-
GUPTA, Akshat, Xiaomo Liu & Sameena Shah (2023) Unsupervised Domain Adaptation using Lexical Transformations and Label Injection for Twitter Data. Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis.other
-
HAUG, Dag, Jamie Findlay & Ahmet Yildirim (2023) The long and the short of it: DRASTIC, a semantically annotated dataset containing sentences of more natural length. DMR.semantics
-
JOHNSON, Jacob & Ana Marasović (2023) How Much Consistency Is Your Accuracy Worth? BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP.other
-
KALLINI, Julie & Christiane Fellbaum (2023) What to Make of make? Sense Distinctions for Light Verbs. Global WordNet Conference.semanticssyntax
-
KE, Zixuan, Bing Liu, Wenhan Xiong, Asli Celikyilmaz & Haoran Li (2023) Sub-network Discovery and Soft-masking for Continual Learning of Mixed Tasks. Conference on Empirical Methods in Natural Language Processing.other
-
KLIE, Jan-Christoph, Ji-Ung Lee, Kevin Stowe, Gözde Şahin, N. Moosavi, Luke Bates, Dominic Petrak, Richard Castilho & Iryna Gurevych (2023) Lessons Learned from a Citizen Science Project for Natural Language Processing. Conference of the European Chapter of the Association for Computational Linguistics.summarization
-
LATA, K., Pardeep Singh & Kamlesh Dutta (2023) Semi-automatic Annotation for Mentions in Hindi Text. SN Computer Science.entitiesannotation
-
LEVINE, Lauren (2023) Difficulties in Handling Mathematical Expressions in Universal Dependencies. Law.syntax
-
LEVY, Tal, Omer Goldman & Reut Tsarfaty (2023) Is Probing All You Need? Indicator Tasks as an Alternative to Probing Embedding Spaces. Conference on Empirical Methods in Natural Language Processing.other
-
LI, Jiaqi, Ming Liu, Yuxin Wang, Daxing Zhang & Bing Qin (2023) A speaker-aware multiparty dialogue discourse parser with heterogeneous graph neural network. Cognitive Systems Research.discourse
-
LI, Yongqi & T. Qian (2023) Type-Aware Decomposed Framework for Few-Shot Named Entity Recognition. Conference on Empirical Methods in Natural Language Processing.entitiessemantics
-
LIU, Yang, Tatsuya Aoyama & Amir Zeldes (2023) What’s Hard in English RST Parsing? Predictive Models for Error Analysis. SIGDIAL Conferences.discourse
-
LIU, Yang & Amir Zeldes (2023) GUMSum: Multi-Genre Data and Evaluation for English Abstractive Summarization. Annual Meeting of the Association for Computational Linguistics.summarization
-
LIU, Ruicheng, Rui Mao, A. Luu & E. Cambria (2023) A brief survey on recent advances in coreference resolution. Artificial Intelligence Review.coreference
-
LIU, Yang & Amir Zeldes (2023) Why Can’t Discourse Parsing Generalize? A Thorough Investigation of the Impact of Data Diversity. Conference of the European Chapter of the Association for Computational Linguistics.discourse
-
LIU, Wei, Yi Fan & M. Strube (2023) HITS at DISRPT 2023: Discourse Segmentation, Connective Detection, and Relation Classification. DISRPT.discourse
-
LU, Junyu, Ping Yang, Ruyi Gan, Junjie Wang, Yuxiang Zhang, Jiaxing Zhang & Pingjian Zhang (2023) UniEX: An Effective and Efficient Framework for Unified Information Extraction via a Span-extractive Perspective. Annual Meeting of the Association for Computational Linguistics.entities
-
MALLART, Cyriel, A. Simpkin, Rémi Venant, Nicolas Ballier, Bernardo Stearns, Jen Li & Thomas Gaillat (2023) Exploring a New Grammatico-functional Type of Measure as Part of a Language Learning Expert System. Workshop on Innovative Use of NLP for Building Educational Applications.other
-
MALLART, Cyriel, Nicolas Ballier, Jen-Yu Li, Andrew Simpkin, Bernardo Stearns, Rémi Venant & Thomas Gaillat (2023) A new learner language data set for the study of English for Specific Purposes at university. International Conference on Language, Data, and Knowledge.other
-
MAO, Rui, Kai He, Xulang Zhang, Guanyi Chen, Jinjie Ni, Zonglin Yang & E. Cambria (2023) A survey on semantic processing techniques. Information Fusion.semantics
-
MESBAHI, Yassir, A. Mahmud, Abbas Ghaddar, Mehdi Rezagholizadeh, P. Langlais & Prasanna Parthasarathi (2023) On the utility of enhancing BERT syntactic bias with Token Reordering Pretraining. Conference on Computational Natural Language Learning.syntax
-
METHENITI, Eleni, Chloé Braud, Philippe Muller & Laura Rivière (2023) DisCut and DiscReT: MELODI at DISRPT 2023. DISRPT.discoursesyntax
-
MOSCATO, V., Marco Postiglione & Giancarlo Sperlí (2023) Few-shot Named Entity Recognition: Definition, Taxonomy and Research Directions. ACM Transactions on Intelligent Systems and Technology.entities
-
NIKOLAEV, D. & Sebastian Padó (2023) The Universe of Utterances According to BERT. International Conference on Computational Semantics.other
-
PAN, Yan, Jiaoyun Yang, Hong Ming, Lili Jiang & Ning An (2023) Few-Shot Named Entity Recognition via Label-Attention Mechanism. International Conference on Computing and Artificial Intelligence.entitiessemantics
-
PORADA, Ian, Alexandra Olteanu, Kaheer Suleman, Adam Trischler & J. Cheung (2023) Challenges to Evaluating the Generalization of Coreference Resolution Models: A Measurement Modeling Perspective. Annual Meeting of the Association for Computational Linguistics.coreference
-
POTTER, Andrew (2023) An Algorithm for Pythonizing Rhetorical Structures. International Conference on Language, Data, and Knowledge.discourse
-
PRZEPIÓRKOWSKI, A. & Michal Wozniak (2023) Conjunct Lengths in English, Dependency Length Minimization, and Dependency Structure of Coordination. Annual Meeting of the Association for Computational Linguistics.syntax
-
REPP, Magdalena, P. Schumacher & F. Same (2023) Multi-layered Annotation of Conversation-like Narratives in German. Law.discourse
-
SIEWERT, Jan, Martijn Wieling & Yves Scherrer (2023) Changing usage of Low Saxon auxiliary and modal verbs. Workshop on Computational Approaches to Historical Language Change.syntax
-
SILEO, Damien (2023) tasksource: A Dataset Harmonization Framework for Streamlined NLP Multi-Task Learning and Evaluation.other
-
SILEO, Damien (2023) tasksource: Structured Dataset Preprocessing Annotations for Frictionless Extreme Multi-Task Learning and Evaluation. arXiv.org.other
-
STARACE, Giulio, Konstantinos Papakostas, Rochelle Choenni, Apostolos Panagiotopoulos, Matteo Rosati, Alina Leidinger & Ekaterina Shutova (2023) Probing LLMs for Joint Encoding of Linguistic Categories. Conference on Empirical Methods in Natural Language Processing.semanticsLLMsyntax
-
STRANISCI, M., R. Damiano, Enrico Mensa, V. Patti, Daniele Radicioni & T. Caselli (2023) WikiBio: a Semantic Resource for the Intersectional Analysis of Biographical Events. Annual Meeting of the Association for Computational Linguistics.semantics
-
TALAMO, Luigi & Annemarie Verkerk (2023) A new methodology for an old problem: A corpus-based typology of adnominal word order in European languages.syntax
-
TOPORKOV, Olia & Rodrigo Agerri (2023) On the Role of Morphological Information for Contextual Lemmatization. International Conference on Computational Logic.syntax
-
TSKHOVREBOVA, Ekaterina & Pascal Gygax (2023) Exploring the Sensitivity to Non-connective Signals of Coherence Relations: The Case of French Speaking Teenagers.discourse
-
WEBER, Leon & Barbara Plank (2023) ActiveAED: A Human in the Loop Improves Annotation Error Detection. Annual Meeting of the Association for Computational Linguistics.annotation
-
WILLIAMSON, Gregor, Angela Cao, Yingying Chen, Yuxin Ji, Liyan Xu & Jinho Choi (2023) Exploring a Multi-Layered Cross-Genre Corpus of Document-Level Semantic Relations. Inf.coreferencesemanticsannotation
-
XU, Ruoxi, Hongyu Lin, Xinyan Guan, Xianpei Han, Yingfei Sun & Le Sun (2023) DLUE: Benchmarking Document Language Understanding. China National Conference on Chinese Computational Linguistics.other
-
XUE, Xiaojun, Chunxia Zhang, Tianxiang Xu & Zhendong Niu (2023) Robust Few-Shot Named Entity Recognition with Boundary Discrimination and Correlation Purification. AAAI Conference on Artificial Intelligence.entities
-
YANG, Fengyi, Xiaoping Zhou, Yating Yang, Bo Ma, Rui Dong & Abibulla Atawulla (2023) A Domain-Transfer Meta Task Design Paradigm for Few-Shot Slot Tagging. AAAI Conference on Artificial Intelligence.entities
-
YE, Bingyang, Jingxuan Tu & James Pustejovsky (2023) Scalar Anaphora: Annotating Degrees of Coreference in Text. CRAC.discoursesemanticscoreferenceentitiesLLM
-
YU, Juntao, Michal Novák, Abdulrahman Aloraini, N. Moosavi, Silviu Paun, Sameer Pradhan & Massimo Poesio (2023) The Universal Anaphora Scorer 2.0. International Conference on Computational Semantics.coreference
-
ZABOKRTSKÝ, Zdenek, Miloslav Konopík, A. Nedoluzhko, Michal Novák, Maciej Ogrodniczuk, M. Popel, O. Pražák, Jakub Sido & Daniel Zeman (2023) Findings of the Second Shared Task on Multilingual Coreference Resolution. CRAC.coreference
-
ZELDES, Amir & Nathan Schneider (2023) Are UD Treebanks Getting More Consistent? A Report Card for English UD. Universal Dependencies Workshop.syntax
-
ZHANG, Mozhi, Hang Yan, Yaqian Zhou & Xipeng Qiu (2023) PromptNER: A Prompting Method for Few-shot Named Entity Recognition via k Nearest Neighbor Search. arXiv.org.entities
-
ZHANG, Jingyi, Yingxue Zhang, Yufeng Chen & Jinan Xu (2023) Structure and Label Constrained Data Augmentation for Cross-domain Few-shot NER. Conference on Empirical Methods in Natural Language Processing.entities
-
ZHANG, Shan, Bin Cao, Tianming Zhang, Yuqi Liu & Jing Fan (2023) Task-adaptive Label Dependency Transfer for Few-shot Named Entity Recognition. Annual Meeting of the Association for Computational Linguistics.entities
-
ZHAO, Haoran & Jake Williams (2023) Bit Cipher - A Simple yet Powerful Word Representation System that Integrates Efficiently with Language Models. arXiv.org.entities
-
ZHAO, Fang & Timothée Bernard (2023) Auto-apprentissage et renforcement pour une analyse jointe sur données disjointes : étiquetage morpho-syntaxique et analyse syntaxique. JEPTALNRECITAL.syntax
-
ZHU, Yilun, Siyao Peng, Sameer Pradhan & Amir Zeldes (2023) Incorporating Singletons and Mention-based Features in Coreference Resolution via Multi-task Learning for Better Generalization. International Joint Conference on Natural Language Processing.coreference
-
AGGARWAL, Piush & Torsten Zesch (2022) Bye, Bye, Maintenance Work? Using Model Cloning to Approximate the Behavior of Legacy Tools. Conference on Natural Language Processing.other
-
ALAGHA, I. (2022) Leveraging Knowledge-Based Features with Multilevel Attention Mechanisms for Short Arabic Text Classification. IEEE Access.entitiessemantics
-
ALI, Rana, Benjamin Zhao, H. Asghar, Tham Nguyen, Ian Wood & Dali Kaafar (2022) Unintended Memorization and Timing Attacks in Named Entity Recognition Models. Proceedings on Privacy Enhancing Technologies.entities
-
ATWELL, Katherine, Sabit Hassan & Malihe Alikhani (2022) APPDIA: A Discourse-aware Transformer-based Style Transfer Model for Offensive Social Media Conversations. International Conference on Computational Linguistics.discourse
-
ATWELL, Katherine, Anthony Sicilia, Seong Hwang & Malihe Alikhani (2022) The Change that Matters in Discourse Parsing: Estimating the Impact of Domain Shift on Parser Error. Findings.discourse
-
BAKSHI, Sahil, D. Sharma & N. Rangaswamy (2022) Towards Discourse Parsing and Connective Identification in Hindi.discourse
-
BLEVINS, Terra, Hila Gonen & Luke Zettlemoyer (2022) Prompting Language Models for Linguistic Structure. Annual Meeting of the Association for Computational Linguistics.entities
-
CASSIDY, Lauren, Teresa Lynn, James Barry & Jennifer Foster (2022) TwittIrish: A Universal Dependencies Treebank of Tweets in Modern Irish. Annual Meeting of the Association for Computational Linguistics.syntax
-
CHEN, Yongjian & M. Farrús (2022) Neural Detection of Cross-lingual Syntactic Knowledge. IberSPEECH Conference.syntax
-
CHEN, Yanru, Yanan Zheng & Zhilin Yang (2022) Prompt-Based Metric Learning for Few-Shot NER. Annual Meeting of the Association for Computational Linguistics.entitiessemantics
-
CHEN, Yanran & Steffen Eger (2022) Transformers Go for the LOLs: Generating (Humourous) Titles from Scientific Abstracts End-to-End. EVAL4NLP.other
-
CHOI, Jonathan (2022) Computational Corpus Linguistics. Social Science Research Network.other
-
COPOT, Maria, Sara Court, N. Diewald, Stephanie Antetomaso & Micha Elsner (2022) A Word-and-Paradigm Workflow for Fieldwork Annotation. COMPUTEL.other
-
CUNHA, Yanis da & Anne Abeillé (2022) Objectifying Women? A Syntactic Bias in French and English Corpora. LATERAISSE.syntax
-
DEVATINE, Nicolas, Philippe Muller & Chloé Braud (2022) Predicting Political Orientation in News with Latent Discourse Structure to Improve Bias Understanding. CODI.discourse
-
EGGLESTON, Chloe & Brendan O'Connor (2022) Cross-Dialect Social Media Dependency Parsing for Social Scientific Entity Attribute Analysis. WNUT.syntax
-
FANG, Biaoyan, Tim Baldwin & Karin Verspoor (2022) What does it take to bake a cake? The RecipeRef corpus and anaphora resolution in procedural text. Findings.coreference
-
GALITSKY, B. (2022) Relying on discourse analysis to answer complex questions by neural machine reading comprehension. Artificial Intelligence for Healthcare Applications and Management.discourse
-
GANDHI, Nupoor, Anjalie Field & Emma Strubell (2022) Annotating Mentions Alone Enables Efficient Domain Adaptation for Coreference Resolution. Annual Meeting of the Association for Computational Linguistics.coreferenceannotation
-
GANDHI, Nupoor, Anjalie Field & Emma Strubell (2022) Mention Annotations Alone Enable Efficient Domain Adaptation for Coreference Resolution. arXiv.org.coreference
-
GESSLER, Luke, Lauren Levine & Amir Zeldes (2022) Midas Loop: A Prioritized Human-in-the-Loop Annotation for Large Scale Multilayer Data. Law.annotation
-
GUPTA, Ankita, Marzena Karpinska, Wenlong Zhao, Kalpesh Krishna, Jack Merullo, Luke Yeh, Mohit Iyyer & Brendan O'Connor (2022) ezCoref: Towards Unifying Annotation Guidelines for Coreference Resolution. Findings.coreferenceannotation
-
HAJICOVÁ, E., Marie Mikulová, B. Štěpánková & Jirí Mírovský (2022) Advantages of a Complex Multilayer Annotation Scheme: The Case of the Prague Dependency Treebank. Law.syntaxannotation
-
HUBER, Patrick & G. Carenini (2022) Towards Domain-Independent Supervised Discourse Parsing Through Gradient Boosting. arXiv.org.semanticsdiscourse
-
HUBER, Patrick & G. Carenini (2022) Towards Understanding Large-Scale Discourse Structures in Pre-Trained and Fine-Tuned Language Models. North American Chapter of the Association for Computational Linguistics.discourse
-
JIANG, Hang, Y. Hua, Doug Beeferman & Dwaipayan Roy (2022) Annotating the Tweebank Corpus on Named Entity Recognition and Building NLP Models for Social Media Analysis. International Conference on Language Resources and Evaluation.entitiessyntax
-
JOUET, Grégor, Clément Duhart, Jacopo Staiano, Francis Rousseaux & Cyril Runz (2022) A Novel Gradient Accumulation Method for Calibration of Named Entity Recognition Models. IEEE International Joint Conference on Neural Network.entities
-
KLIE, Jan-Christoph, B. Webber & Iryna Gurevych (2022) Annotation Error Detection: Analyzing the Past and Present for a More Coherent Future. International Conference on Computational Logic.annotation
-
KYLE, K., Masaki Eguchi, Aaron Miller & Theodore Sither (2022) A Dependency Treebank of Spoken Second Language English. Workshop on Innovative Use of NLP for Building Educational Applications.syntax
-
LAPSHINOVA-KOLTUNSKI, Ekaterina, Christian Hardmeier & Pauline Krielke (2022) ParCorFull2.0: a Parallel Corpus Annotated with Full Coreference. International Conference on Language Resources and Evaluation.coreference
-
LATA, K., Pardeep Singh & Kamlesh Dutta (2022) Mention detection in coreference resolution: survey. Applied intelligence (Boston).coreference
-
LITOVSKY, Celia, A. Finley, Bonnie Zuckerman, Matthew Sayers, Julie Schoenhard, Yoed Kenett & Jamie Reilly (2022) Semantic flow and its relation to controlled semantic retrieval deficits in the narrative production of people with aphasia. Neuropsychologia.semanticsdiscourse
-
LIU, Yang, Jena Hwang, Nathan Schneider & Vivek Srikumar (2022) Putting Context in SNACS: A 5-Way Classification of Adpositional Pragmatic Markers. Law.semantics
-
MA, Tingting, Huiqiang Jiang, Qianhui Wu, T. Zhao & Chin-Yew Lin (2022) Decomposed Meta-Learning for Few-Shot Named Entity Recognition. Findings.entities
-
MAULANA, Andhika, Ika Alfina & Kurniawati Azizah (2022) Building Indonesian Dependency Parser Using Cross-lingual Transfer Learning. International Conference on Asian Language Processing.syntax
-
MERLO, Paola & Giuseppe Samo (2022) Exploring T3 languages with quantitative computational syntax. Theoretical Linguistics.syntax
-
MOHAMMADI, Hassan, A. Talebpour, Ahmad Aznaveh & S. Yazdani (2022) Review of coreference resolution in English and Persian. arXiv.org.coreference
-
MOKH, Noor, D. Dakota & Sandra Kübler (2022) Improving POS Tagging for Arabic Dialects on Out-of-Domain Texts. Workshop on Arabic Natural Language Processing.other
-
MÆHLUM, Petter, D. Haug, T. Jorgensen, Andre Kåsen, Anders Nøklestad, Egil Rønningstad, Per Solberg, Erik Velldal & Lilja Øvrelid (2022) NARC – Norwegian Anaphora Resolution Corpus. CRAC.coreference
-
NEDOLUZHKO, A., Michal Novák, M. Popel, Zdenek Zabokrtský, Amir Zeldes & Daniel Zeman (2022) CorefUD 1.0: Coreference Meets Universal Dependencies. International Conference on Language Resources and Evaluation.coreferencesyntax
-
PADMAKUMAR, Vishakh, Leonard Lausen, Miguel Ballesteros, Sheng Zha, He He & G. Karypis (2022) Exploring the Role of Task Transferability in Large-Scale Multi-Task Learning. North American Chapter of the Association for Computational Linguistics.other
-
PAUN, Silviu, Juntao Yu, N. Moosavi & Massimo Poesio (2022) Scoring Coreference Chains with Split-Antecedent Anaphors. Dialogue and Discourse.coreferenceentitiesdiscourse
-
PENG, Siyao, Yang Liu & Amir Zeldes (2022) GCDT: A Chinese RST Treebank for Multigenre and Multilingual Discourse Parsing. AACL.discoursesyntax
-
PENG, Siyao, Yang Liu & Amir Zeldes (2022) Chinese Discourse Annotation Reference Manual. arXiv.org.discoursesyntax
-
PLAKIDIS, Melina & Georg Rehm (2022) A Dataset of Offensive German Language Tweets Annotated for Speech Acts. International Conference on Language Resources and Evaluation.other
-
RAVISHANKAR, Vinit, Mostafa Abdou, Artur Kulmizev & Anders Søgaard (2022) Word Order Does Matter and Shuffled Language Models Know It. Annual Meeting of the Association for Computational Linguistics.syntax
-
SCHOLMAN, Merel, T. Dong, Frances Yung & Vera Demberg (2022) DiscoGeM: A Crowdsourced Corpus of Genre-Mixed Implicit Discourse Relations. International Conference on Language Resources and Evaluation.discourse
-
SICILIA, Anthony, Katherine Atwell, Malihe Alikhani & Seong Hwang (2022) PAC-Bayesian Domain Adaptation Bounds for Multiclass Learners. Conference on Uncertainty in Artificial Intelligence.other
-
SUN, Kun & Rong Wang (2022) Constructing the Corpus of Chinese Textual ‘Run-on’ Sentences (CCTRS): Discourse Corpus Benchmark with Multi-layer Annotations. International Conference on Natural Language and Speech Processing.discourse
-
TIGHIDET, Zineddine & Nicolas Ballier (2022) Fine-tuning a Subtle Parsing Distinction Using a Probabilistic Decision Tree: the Case of Postnominal “that” in Noun Complement Clauses vs. Relative Clauses. Australasian Language Technology Association Workshop.syntax
-
WANG, J., Chengcheng Han, Chengyu Wang, Chuanqi Tan, Minghui Qiu, Songfang Huang, Jun Huang & Ming Gao (2022) SpanProto: A Two-stage Span-based Prototypical Network for Few-shot Named Entity Recognition. Conference on Empirical Methods in Natural Language Processing.entitiessemantics
-
WILLIAMS, J. & H. Heidenreich (2022) To Know by the Company Words Keep and What Else Lies in the Vicinity. arXiv.org.other
-
XING, Linzi, Patrick Huber & G. Carenini (2022) Improving Topic Segmentation by Injecting Discourse Dependencies. CODI.discourse
-
XU, Yang, Fadi Farha, Yueliang Wan, Jiabo Xu, Hong Liu & Huansheng Ning (2022) Improving completeness and consistency of co-reference annotation standard. Wireless networks.other
-
YU, Juntao, Sopan Khosla, R. Manuvinakurike, Lori Levin, Vincent Ng, Massimo Poesio & M. Strube (2022) Message from the Program Co-Chairs. International Conference on Mobile Data Management.coreferencediscourseannotation
-
YU, Nan, Meishan Zhang, G. Fu & M. Zhang (2022) RST Discourse Parsing with Second-Stage EDU-Level Pre-training. Annual Meeting of the Association for Computational Linguistics.discourse
-
YU, Juntao, Sopan Khosla, R. Manuvinakurike, Lori Levin, Vincent Ng, Massimo Poesio, M. Strube & C. Rosé (2022) The CODI-CRAC 2022 Shared Task on Anaphora, Bridging, and Discourse Deixis in Dialogue. CODI.coreferencediscourse
-
YU, Juntao, Sopan Khosla, N. Moosavi, Silviu Paun, Sameer Pradhan & Massimo Poesio (2022) The Universal Anaphora Scorer. International Conference on Language Resources and Evaluation.coreference
-
ZELDES, Amir, Nick Howell, Noam Ordan & Y. Moshe (2022) A Second Wave of UD Hebrew Treebanking and Cross-Domain Parsing. Conference on Empirical Methods in Natural Language Processing.syntaxannotation
-
ZUPON, Andrew, A. Carnie, Michael Hammond & M. Surdeanu (2022) Automatic Correction of Syntactic Dependency Annotation Differences. International Conference on Language Resources and Evaluation.syntax
-
ŽABOKRTSKÝ, Zdeněk, Miloslav Konopík, A. Nedoluzhko, Michal Novák, Maciej Ogrodniczuk, M. Popel, Ondřej Pražák, Jakub Sido, Daniel Zeman & Yilun Zhu (2022) Findings of the Shared Task on Multilingual Coreference Resolution. CRAC.coreference
-
ATWELL, Katherine, J. Li & Malihe Alikhani (2021) Where Are We in Discourse Relation Recognition? SIGDIAL Conferences.discourse
-
BAKSHI, Sahil & D. Sharma (2021) A Transformer Based Approach towards Identification of Discourse Unit Segments and Connectives. DISRPT.discoursesummarizationsyntax
-
BENTON, A., Hanyang Li & Igor Malioutov (2021) Cross-Register Projection for Headline Part of Speech Tagging. Conference on Empirical Methods in Natural Language Processing.syntax
-
BOND, Francis, Andrew Devadason, Melissa Teo & Luís Costa (2021) Teaching Through Tagging — Interactive Lexical Semantics. Global WordNet Conference.semantics
-
BRAGGAAR, Anouck & Rob Goot (2021) Challenges in Annotating and Parsing Spoken, Code-switched, Frisian-Dutch Data. ADAPTNLP.other
-
CHEN, Zeming, Qiyue Gao & Lawrence Moss (2021) NeuralLog: Natural Language Inference with Joint Neural and Logical Reasoning. STARSEM.other
-
CHEN, Zeming & Qiyue Gao (2021) Monotonicity Marking from Universal Dependency Trees. International Conference on Computational Semantics.semanticssyntax
-
CRANENBURGH, Andreas, Esther Ploeger, Frank Berg & Remi Thüss (2021) A Hybrid Rule-Based and Neural Coreference Resolution System with an Evaluation on Dutch Literature. CRAC.coreference
-
DAS, Sarkar, Arzoo Katiyar, R. Passonneau & Rui Zhang (2021) CONTaiNER: Few-Shot Named Entity Recognition via Contrastive Learning. Annual Meeting of the Association for Computational Linguistics.entitiessemantics
-
DUC, Do, Pham Thinh, Vu Duy & Luong Vinh (2021) Adapting Word Order Transformation for Vietnamese Dependency Parsing. Proceedings of the 14th National Conference on Fundamental and Applied Information Technology Research.syntax
-
EZZABADY, Morteza, Philippe Muller & Chloé Braud (2021) Multi-lingual Discourse Segmentation and Connective Identification: MELODI at Disrpt2021. DISRPT.discourse
-
FALENSKA, Agnieszka & Özlem Çetinoğlu (2021) Assessing Gender Bias in Wikipedia: Inequalities in Article Titles. GEBNLP.other
-
FANG, Biaoyan, Christian Druckenbrodt, S. Akhondi, Jiayuan He, Tim Baldwin & K. Verspoor (2021) ChEMU-Ref: A Corpus for Modeling Anaphora Resolution in the Chemical Domain. Conference of the European Chapter of the Association for Computational Linguistics.coreferenceannotation
-
HALEY, Coleman, E. Ponti & Sharon Goldwater (2021) Overview of AMALGUM – Large Silver Quality Annotations across English Genres. SCIL.other
-
HASSERT, Naïma, P. Ménard & E. Galy (2021) UD on Software Requirements: Application and Challenges. Universal Dependencies Workshop.syntax
-
HIIPPALA, Tuomo (2021) Applied Language Technology: NLP for the Humanities. TEACHINGNLP.other
-
HUBER, Patrick, Linzi Xing & G. Carenini (2021) Predicting Above-Sentence Discourse Structure using Distant Supervision from Topic Segmentation. AAAI Conference on Artificial Intelligence.semanticsdiscoursesummarization
-
HUBER, Patrick, Wen Xiao & G. Carenini (2021) W-RST: Towards a Weighted RST-style Discourse Framework. Annual Meeting of the Association for Computational Linguistics.discourse
-
INDURKHYA, Sagar, Beracah Yankama & R. Berwick (2021) Evaluating Universal Dependency Parser Recovery of Predicate Argument Structure via CompChain Analysis. STARSEM.semanticssyntax
-
JIANG, Feng, Yaxin Fan, Xiaomin Chu, Peifeng Li, Qiaoming Zhu & F. Kong (2021) Hierarchical Macro Discourse Parsing Based on Topic Segmentation. AAAI Conference on Artificial Intelligence.discourse
-
KAHANE, Sylvain, B. Caron, Emmett Strickland & Kim Gerdes (2021) Annotation guidelines of UD and SUD treebanks for spoken corpora: A proposal. International Workshop on Treebanks and Linguistic Theories.syntaxannotation
-
KHOSLA, Sopan, Juntao Yu, R. Manuvinakurike, Vincent Ng, Massimo Poesio, M. Strube & C. Rosé (2021) The CODI-CRAC 2021 Shared Task on Anaphora, Bridging, and Discourse Deixis in Dialogue. CODI.coreferencediscourse
-
LAMÉRIS, H. & Sara Stymne (2021) Whit’s the Richt Pairt o Speech: PoS tagging for Scots. Workshop on NLP for Similar Languages, Varieties and Dialects.other
-
LANGE, Lukas, Jannik Strötgen, Heike Adel & D. Klakow (2021) To Share or not to Share: Predicting Sets of Sources for Model Transfer Learning. Conference on Empirical Methods in Natural Language Processing.other
-
LI, Xue, Sara Magliacane & Paul Groth (2021) The Challenges of Cross-Document Coreference Resolution for Email. International Conference on Knowledge Capture.coreferenceentities
-
LIN, Jessica & Amir Zeldes (2021) WikiGUM: Exhaustive Entity Linking for Wikification in 12 Genres. Joint 15th Linguistic Annotation Workshop (LAW) and 3rd Designing Meaning Representations (DMR) Workshop.entities
-
LINSCHEID, P., Peter Bourgonje & Georg Rehm (2021) Parsing Discourse Structures for Semantic Storytelling: Evaluating an efficient RST Parser. Conference on Digital Curation Technologies.semanticsdiscourse
-
LOÁICIGA, S., Simon Dobnik & David Schlangen (2021) Reference and coreference in situated dialogue. ALVR.coreference
-
LOÁICIGA, S., Simon Dobnik & David Schlangen (2021) Annotating anaphoric phenomena in situated dialogue. MMSR.coreference
-
MANNING, Emma, Nathan Schneider & Amir Zeldes (2021) A Balanced and Broadly Targeted Computational Linguistics Curriculum. TEACHINGNLP.other
-
MIASCHI, Alessio, D. Brunato, F. Dell'Orletta & Giulia Venturi (2021) What Makes My Model Perplexed? A Linguistic Investigation on Neural Language Models Perplexity. Workshop on Knowledge Extraction and Integration for Deep Learning Architectures; Deep Learning Inside Out.LLMsyntax
-
MÜLLER-EBERSTEIN, Max, Rob Goot & Barbara Plank (2021) How Universal is Genre in Universal Dependencies? International Workshop on Treebanks and Linguistic Theories.syntax
-
NEDOLUZHKO, A., M. Novák, M. Popel, Z. Žabokrtský & Daniel Zeman (2021) Is one head enough? Mention heads in coreference annotations compared with UD-style heads. International Conference on Dependency Linguistics.coreferencesyntax
-
POPEL, M., Z. Žabokrtský, A. Nedoluzhko, M. Novák & Daniel Zeman (2021) Do UD Trees Match Mention Spans in Coreference Annotations? Conference on Empirical Methods in Natural Language Processing.coreferencesyntaxannotation
-
PUCCETTI, Giovanni, Alessio Miaschi & F. Dell'Orletta (2021) How Do BERT Embeddings Organize Linguistic Knowledge? Workshop on Knowledge Extraction and Integration for Deep Learning Architectures; Deep Learning Inside Out.other
-
QUEIRUGA, A., N. Erichson, Liam Hodgkinson & Michael Mahoney (2021) Stateful ODE-Nets using Basis Function Expansions. Neural Information Processing Systems.other
-
QUEIRUGA, A., N. Erichson, Liam Hodgkinson & Michael Mahoney (2021) Compressing Deep ODE-Nets using Basis Function Expansions. arXiv.org.other
-
SAMO, Giuseppe & Paola Merlo (2021) Intervention effects in clefts: a study in quantitative computational syntax. Glossa.syntax
-
SARTI, Gabriele, D. Brunato & F. Dell'Orletta (2021) That Looks Hard: Characterizing Linguistic Complexity in Humans and Language Models. Workshop on Cognitive Modeling and Computational Linguistics.syntax
-
SCHNEIDER, Nathan & Amir Zeldes (2021) Mischievous nominal constructions in Universal Dependencies. Universal Dependencies Workshop.syntax
-
SHAHMOHAMMADI, Sara, H. Veisi & A. Darzi (2021) Persian Rhetorical Structure Theory. arXiv.org.discoursesyntax
-
SHI, Tianze & Lillian Lee (2021) TGIF: Tree-Graph Integrated-Format Parser for Enhanced UD with Two-Stage Generic- to Individual-Language Finetuning. International Workshop/Conference on Parsing Technologies.syntax
-
SILVANO, Purificação, Antonio Leal, Fátima Silva, Inês Cantante, F. Oliveira & Alípio Jorge (2021) Developing a multilayer semantic annotation scheme based on ISO standards for the visualization of a newswire corpus. International Symposium on Algorithms.semanticsannotation
-
SUN, Kun, Rong Wang & Wenxin Xiong (2021) Investigating genre distinctions through discourse distance and discourse network. Corpus Linguistics and Linguistic Theory.discoursesyntax
-
TOSHNIWAL, Shubham, Patrick Xia, Sam Wiseman, Karen Livescu & Kevin Gimpel (2021) On Generalization in Coreference Resolution. CRAC.coreferenceannotation
-
TUORA, Ryszard, A. Przepiórkowski & Aleksander Leczkowski (2021) Comparing learnability of two dependency schemes: 'semantic' (UD) and 'syntactic' (SUD). Conference on Empirical Methods in Natural Language Processing.semanticssyntaxannotation
-
WANG, Hongru, Zezhong Wang, G. Fung & Kam-Fai Wong (2021) MCML: A Novel Memory-based Contrastive Meta-Learning Method for Few Shot Slot Tagging. International Joint Conference on Natural Language Processing.entities
-
XIAO, Wen, Patrick Huber & G. Carenini (2021) Predicting Discourse Trees from Transformer-based Neural Summarizers. North American Chapter of the Association for Computational Linguistics.discoursesummarization
-
YAN, Jianwei & Haitao Liu (2021) Semantic Roles or Syntactic Functions: The Effects of Annotation Scheme on the Results of Dependency Measures. Studia Linguistica.semanticssyntaxannotation
-
ZELDES, Amir (2021) Opinion Piece: Can we Fix the Scope for Coreference?: Problems and Solutions for Benchmarks beyond OntoNotes. Dialogue and Discourse.coreferencesemanticsannotation
-
ZELDES, Amir, Yang Liu, Mikel Iruskieta, Philippe Muller, Chloé Braud & Sonia Badene (2021) The DISRPT 2021 Shared Task on Elementary Discourse Unit Segmentation, Connective Detection, and Relation Classification. DISRPT.discourseannotation
-
ZEMAN, Daniel, A. Nedoluzhko, M. Novák, M. Popel & Z. Žabokrtský (2021) CorefUD 0.1: Coreference meets Universal Dependencies – a pilot experiment on harmonizing coreference datasets for 11 languages.coreferencesyntax
-
ZHU, Yilun, Sameer Pradhan & Amir Zeldes (2021) Anatomy of OntoGUM—Adapting GUM to the OntoNotes Scheme to Evaluate Robustness of SOTA Coreference Algorithms. CRAC.coreferencediscoursesyntaxannotation
-
ZHU, Yilun, Sameer Pradhan & Amir Zeldes (2021) OntoGUM: Evaluating Contextualized SOTA Coreference Resolution on 12 More Genres. Annual Meeting of the Association for Computational Linguistics.coreferencediscoursesyntax
-
ZHUKOVA, Anastasia, Felix Hamborg & Bela Gipp (2021) Towards Evaluation of Cross-document Coreference Resolution Models Using Datasets with Diverse Annotation Schemes. International Conference on Language Resources and Evaluation.coreferencesemanticsentitiesannotation
-
ZHUKOVA, Anastasia, Felix Hamborg & Bela Gipp (2021) Qualitative and Quantitative Analysis of Diversity in Cross-document Coreference Resolution Datasets. arXiv.org.coreference
-
BEHZAD, Shabnam & Amir Zeldes (2020) A Cross-Genre Ensemble Approach to Robust Reddit Part of Speech Tagging. Web as Corpus Workshop (WAC-XII).other
-
BEREND, Gábor (2020) Sparsity Makes Sense: Word Sense Disambiguation Using Sparse Contextualized Word Representations. Conference on Empirical Methods in Natural Language Processing.semantics
-
BRUNET, P., Olivier Ferret & Ludovic Tanguy (2020) Which Dependency Parser to Use for Distributional Semantics in a Specialized Domain? COMPUTERM.semanticssyntax
-
CULBERTSON, J., M. Schouwstra & S. Kirby (2020) From the world to word order: Deriving biases in noun phrase order from statistical properties of the world. Language.syntax
-
EDMISTON, Daniel (2020) A Systematic Analysis of Morphological Content in BERT Models for Multiple Languages. arXiv.org.semantics
-
GARDNER, Matt, Yoav Artzi, Victoria Basmova, Jonathan Berant, Ben Bogin, Sihao Chen, Pradeep Dasigi, Dheeru Dua, Yanai Elazar, Ananth Gottumukkala, Nitish Gupta, Hannaneh Hajishirzi, Gabriel Ilharco, Daniel Khashabi, Kevin Lin, Jiangming Liu, Nelson Liu, Phoebe Mulcaire, Qiang Ning, Sameer Singh, Noah Smith, Sanjay Subramanian, Reut Tsarfaty, Eric Wallace, Ally Zhang & Ben Zhou (2020) Evaluating NLP Models via Contrast Sets. arXiv.org.syntax
-
GARDNER, Matt, Yoav Artzi, Jonathan Berant, Ben Bogin, Sihao Chen, Dheeru Dua, Yanai Elazar, Ananth Gottumukkala, Nitish Gupta, Hannaneh Hajishirzi, Gabriel Ilharco, Daniel Khashabi, Kevin Lin, Jiangming Liu, Nelson Liu, Phoebe Mulcaire, Qiang Ning, Sameer Singh, Noah Smith, Sanjay Subramanian, Eric Wallace, Ally Zhang & Ben Zhou (2020) Evaluating Models’ Local Decision Boundaries via Contrast Sets. Findings.syntax
-
GESSLER, Luke, Siyao Peng, Yang Liu, Yilun Zhu, Shabnam Behzad & Amir Zeldes (2020) AMALGUM – A Free, Balanced, Multilayer English Web Corpus. International Conference on Language Resources and Evaluation.coreferenceentitiesdiscourse
-
GOOT, Rob, A. Üstün, Alan Ramponi, Ibrahim Sharaf & Barbara Plank (2020) Massive Choice, Ample Tasks (MaChAmp): A Toolkit for Multi-task Learning in NLP. Conference of the European Chapter of the Association for Computational Linguistics.syntax
-
GUIBON, Gaël, M. Courtin, Kim Gerdes & Bruno Guillaume (2020) When Collaborative Treebank Curation Meets Graph Grammars. International Conference on Language Resources and Evaluation.syntax
-
HAO, Z., Di Lv, Zijian Li, Ruichu Cai, Wen Wen & Boyan Xu (2020) Semi-Supervised Disentangled Framework for Transferable Named Entity Recognition. Neural Networks.entities
-
HOU, Yutai, Wanxiang Che, Y. Lai, Zhihan Zhou, Yijia Liu, Han Liu & Ting Liu (2020) Few-shot Slot Tagging with Collapsed Dependency Transfer and Label-enhanced Task-adaptive Projection Network. Annual Meeting of the Association for Computational Linguistics.semantics
-
JIANG, Feng, Xiaomin Chu, Peifeng Li, F. Kong & Qiaoming Zhu (2020) Chinese Paragraph-level Discourse Parsing with Global Backward and Local Reverse Reading. International Conference on Computational Linguistics.discourse
-
KOBAYASHI, Hideo & Vincent Ng (2020) Bridging Resolution: A Survey of the State of the Art. International Conference on Computational Linguistics.coreference
-
LAI, Yi-An, Garima Lalwani & Yi Zhang (2020) Context Analysis for Pre-trained Masked Language Models. Findings.syntax
-
LATA, K., Pardeep Singh & Kamlesh Dutta (2020) A comprehensive review on feature set used for anaphora resolution. Artificial Intelligence Review.coreference
-
MARTÍN-RODILLA, Patricia & Miguel Sánchez (2020) Software Support for Discourse-Based Textual Information Analysis: A Systematic Literature Review and Software Guidelines in Practice. Inf.discourse
-
MIASCHI, Alessio, D. Brunato, F. Dell'Orletta & Giulia Venturi (2020) Linguistic Profiling of a Neural Language Model. International Conference on Computational Linguistics.other
-
MIASCHI, Alessio & F. Dell'Orletta (2020) Contextual and Non-Contextual Word Embeddings: an in-depth Linguistic Investigation. Workshop on Representation Learning for NLP.other
-
MILLERT, A. & Anar Yeginbergenova (2020) What are you saying? Dialogue act annotation.discourse
-
RUPPENHOFER, Josef & Ines Rehbein (2020) I’ve got a construction looks funny – representing and recovering non-standard constructions in UD. Universal Dependencies Workshop.syntax
-
SANGUINETTI, M., Lauren Cassidy, C. Bosco, Özlem Çetinoğlu, Alessandra Cignarella, Teresa Lynn, Ines Rehbein, Josef Ruppenhofer, Djamé Seddah & Amir Zeldes (2020) Treebanking user-generated content: a UD based overview of guidelines, corpora and unified recommendations. Language Resources and Evaluation.syntaxannotation
-
SANGUINETTI, M., C. Bosco, Lauren Cassidy, Özlem Çetinoğlu, Alessandra Cignarella, Teresa Lynn, Ines Rehbein, Josef Ruppenhofer, Djamé Seddah & Amir Zeldes (2020) Treebanking User-Generated Content: A Proposal for a Unified Representation in Universal Dependencies. International Conference on Language Resources and Evaluation.syntax
-
TOLDOVA, S., T. Davydova, M. Kobozeva & D. Pisarevskaya (2020) Discourse Features of Blogs in Subcorpus of Russian Ru-RSTreebank. Computational Linguistics and Intellectual Technologies.discourse
-
YAN, Lingyong, Xianpei Han, Ben He & Le Sun (2020) Global Bootstrapping Neural Network for Entity Set Expansion. Findings.entitiessemantics
-
ZELDES, Amir & Yang Liu (2020) A Neural Approach to Discourse Relation Signal Detection. Dialogue and Discourse.semanticsdiscourse
-
ZELDES, Amir (2020) Corpus Architecture. A Practical Handbook of Corpus Linguistics.other
-
ZHU, Su, Ruisheng Cao, Lu Chen & Kai Yu (2020) Vector Projection Network for Few-shot Slot Tagging in Natural Language Understanding. arXiv.org.entities
-
(2019) DISCEVAL: DISCOURSE BASED EVALUATION.discourse
-
BAMMAN, David, Olivia Lewke & A. Mansoor (2019) An Annotated Dataset of Coreference in English Literature. International Conference on Language Resources and Evaluation.coreference
-
DAS, Debopam (2019) Nuclearity in RST and signals of coherence relations. Proceedings of the Workshop on Discourse Relation Parsing and Treebanking 2019.discoursesyntax
-
FERRACANE, Elisa, Titan Page, Junyi Li & K. Erk (2019) From News to Medical: Cross-domain Discourse Segmentation. North American Chapter of the Association for Computational Linguistics.discourse
-
GESSLER, Luke, Yang Liu & Amir Zeldes (2019) A Discourse Signal Annotation System for RST Trees. Proceedings of the Workshop on Discourse Relation Parsing and Treebanking 2019.discourse
-
GUILLAUME, Bruno (2019) Graph Matching for Corpora Exploration.other
-
HOU, Yutai, Zhihan Zhou, Yijia Liu, Ning Wang, Wanxiang Che, Han Liu & Ting Liu (2019) Few-Shot Sequence Labeling with Label Dependency Transfer. arXiv.org.entities
-
HOU, Yutai, Zhihan Zhou, Yijia Liu, Ning Wang, Wanxiang Che, Han Liu & Ting Liu (2019) Few-Shot Sequence Labeling with Label Dependency Transfer and Pair-wise Embedding.other
-
HUBER, Patrick & G. Carenini (2019) Predicting Discourse Structure using Distant Supervision from Sentiment. Conference on Empirical Methods in Natural Language Processing.discourse
-
KRAUSE, Thomas (2019) ANNIS: A graph-based query system for deeply annotated text corpora.other
-
LAN, Ouyu, Xiao Huang, Bill Lin, He Jiang, Liyuan Liu & Xiang Ren (2019) Learning to Contextually Aggregate Multi-Source Supervision for Sequence Labeling. Annual Meeting of the Association for Computational Linguistics.annotation
-
LIU, Yang (2019) Beyond The Wall Street Journal: Anchoring and Comparing Discourse Signals across Genres. Proceedings of the Workshop on Discourse Relation Parsing and Treebanking 2019.discoursesyntax
-
MADGE, Chris, R. Bartle, Jon Chamberlain, Udo Kruschwitz & Massimo Poesio (2019) Incremental Game Mechanics Applied to Text Annotation. ACM SIGCHI Annual Symposium on Computer-Human Interaction in Play.annotation
-
MADGE, Chris, R. Bartle, Jon Chamberlain, Udo Kruschwitz & Massimo Poesio (2019) Making text annotation fun with a clicker game. International Conference on Foundations of Digital Games.coreferencesyntaxannotation
-
MULLER, Philippe, Chloé Braud & Mathieu Morey (2019) ToNy: Contextual embeddings for accurate multilingual discourse segmentation of full documents. Proceedings of the Workshop on Discourse Relation Parsing and Treebanking 2019.discourse
-
PRANGE, Jakob, Nathan Schneider & Omri Abend (2019) Semantically Constrained Multilayer Annotation: The Case of Coreference. Proceedings of the First International Workshop on Designing Meaning Representations.coreferencesemanticsannotation
-
RATHORE, Archit, N. Chalapathi, Sourabh Palande & Bei Wang (2019) TopoAct: Visually Exploring the Shape of Activations in Deep Learning. Computer graphics forum (Print).summarization
-
SILEO, Damien, Philippe Muller, T. Cruys & Camille Pradel (2019) A Pragmatics-Centered Evaluation Framework for Natural Language Understanding. International Conference on Language Resources and Evaluation.semantics
-
SMITH, Hannah, Zeyu Zhang, John Culnan & Peter Jansen (2019) ScienceExamCER: A High-Density Fine-Grained Science-Domain Corpus for Common Entity Recognition. International Conference on Language Resources and Evaluation.entitiessemantics
-
SPRUGNOLI, R. & Sara Tonelli (2019) Novel Event Detection and Classification for Historical Texts. Computational Linguistics.annotation
-
STYLIANOU, Nikolaos & I. Vlahavas (2019) A Neural Entity Coreference Resolution Review. Expert systems with applications.coreference
-
WESTERA, M. (2019) Some linguistic correlates of gradients and attention weights in BERT.other
-
YAN, Jianwei & Haitao Liu (2019) Which annotation scheme is more expedient to measure syntactic difficulty and cognitive demand? Proceedings of the First Workshop on Quantitative Syntax (Quasy, SyntaxFest 2019).semanticssyntaxannotation
-
ZELDES, Amir, Debopam Das, E. Maziero, Juliano Antonio & Mikel Iruskieta (2019) The DISRPT 2019 Shared Task on Elementary Discourse Unit Segmentation and Connective Detection. Proceedings of the Workshop on Discourse Relation Parsing and Treebanking 2019.discourseannotation
-
ZELDES, Amir, Debopam Das, E. Maziero, Juliano Antonio & Mikel Iruskieta (2019) Introduction to Discourse Relation Parsing and Treebanking (DISRPT): 7th Workshop on Rhetorical Structure Theory and Related Formalisms. Proceedings of the Workshop on Discourse Relation Parsing and Treebanking 2019.discourse
-
CASTILHO, Richard, Giulia Dore, T. Margoni, Penny Labropoulou & Iryna Gurevych (2018) A Legal Perspective on Training Models for Natural Language Processing. International Conference on Language Resources and Evaluation.other
-
HELFRICH, Philipp, Elias Rieb, Giuseppe Abrami, Andy Lücking & Alexander Mehler (2018) TreeAnnotator: Versatile Visual Annotation of Hierarchical Text Relations. International Conference on Language Resources and Evaluation.annotation
-
LIU, Yuqin, Lanling Han, Bo Jiang & Xiaoyan Su (2018) The application and teaching evaluation of Japanese films and TV series corpus in JFL classroom. Electronic library.other
-
PENG, Siyao & Amir Zeldes (2018) All Roads Lead to UD: Converting Stanford and Penn Parses to English Universal Dependencies with Multilayer Annotations. LAW-MWE-CxG@COLING.coreferenceentitiessyntax
-
RODRIGUEZ, Juan, Adam Caldwell & Alexander Liu (2018) Transfer Learning for Entity Recognition of Novel Classes. International Conference on Computational Linguistics.entities
-
ROITBERG, Anna & D. Khachko (2018) Russian Bridging Anaphora Corpus. CLIB.coreference
-
RÖSIGER, Ina, Arndt Riester & Jonas Kuhn (2018) Bridging resolution: Task definition, corpus resources and rule-based experiments. International Conference on Computational Linguistics.coreference
-
RÖSIGER, Ina (2018) BASHI: A Corpus of Wall Street Journal Articles Annotated with Bridging Links. International Conference on Language Resources and Evaluation.coreference
-
SUKTHANKER, R., Soujanya Poria, E. Cambria & Ramkumar Thirunavukarasu (2018) Anaphora and Coreference Resolution: A Review. Information Fusion.coreferencesummarization
-
BARTELD, Fabian & Johanna Flick (2017) LEA - Linguistic Exercises with Annotation Tools. Teach4DH@GSCL.annotation
-
REITTER, D. (2017) Who Aligns With Whom in Web Forum Dialogue: Studies in Big-Data Computational Psycholinguistics.other
-
ZELDES, Amir (2017) A Distributional View of Discourse Encapsulation: Multifactorial Prediction of Coreference Density in RST.coreferencediscourse
-
DUAN, Manjuan, Ethan Hill & Michael White (2016) Generating Disambiguating Paraphrases for Structurally Ambiguous Sentences. LAW@ACL.semantics
-
GERDES, Kim & Sylvain Kahane (2016) Dependency Annotation Choices: Assessing Theoretical and Practical Issues of Universal Dependencies. LAW@ACL.semanticssyntaxannotation
-
HORSMANN, Tobias & Torsten Zesch (2016) Assigning Fine-grained PoS Tags based on High-precision Coarse-grained Tagging. International Conference on Computational Linguistics.other
-
KRAUSE, Thomas, U. Leser & Anke Lüdeling (2016) graphANNIS: A Fast Query Engine for Deeply Annotated Linguistic Corpora. Journal for Language Technology and Computational Linguistics.discourse
-
MEYER, N., Michael Wojatzki & Torsten Zesch (2016) Validating bundled gap filling – Empirical evidence for ambiguity reduction and language proficiency testing capabilities.semantics
-
VOLODINA, Elena, Gintare Grigonyte, I. Pilán, K. Björkenstam & L. Borin (2016) Proceedings of the joint workshop on NLP for Computer Assisted Language Learning and NLP for Language Acquisition at SLTC, Umeå 16th November 2016.other
-
WOJATZKI, Michael, Oren Melamud & Torsten Zesch (2016) Bundled Gap Filling: A New Paradigm for Unambiguous Cloze Exercises. BEA@NAACL-HLT.semantics
-
ZELDES, Amir & D. Simonson (2016) Different Flavors of GUM: Evaluating Genre and Sentence Type Effects on Multilayer Corpus Annotation Quality. LAW@ACL.coreferencesyntaxannotation
-
ZELDES, Amir (2016) rstWeb - A Browser-based Annotation Interface for Rhetorical Structure Theory and Discourse Relations. North American Chapter of the Association for Computational Linguistics.discourseannotation
-
ZELDES, Amir & Shuo Zhang (2016) When Annotation Schemes Change Rules Help: A Configurable Approach to Coreference Resolution beyond OntoNotes. CORBON@HLT-NAACL.coreferencesyntaxannotation
-
HORSMANN, Tobias, Nicolai Erbs & Torsten Zesch (2015) Fast or Accurate? - A Comparative Evaluation of PoS Tagging Models. German Society for Computational Linguistics.other
-
(None) Breaking Ties: Some Methods for Refactoring RST Convergences.discourse
-
FU, Yingxue, M. Nederhof & Anaïs Ollagnier (None) A Topicality-Driven QUD Model for Discourse Processing.discourse
-
HÜLL, Nives & Kaja Dobrovoljc (None) Word Order Variation in Spoken and Written Corpora: A Cross-Linguistic Study of SVO and Alternative Orders.syntax
-
LEAL, António, Purificação Silvano, Qinren Zuo, Evelin Amorim & Alípio Jorge (None) An annotation scheme for financial news in Portuguese.annotation
-
MARTELLI, Federico, Roberto Navigli, Simon Krek, Jelena Kallas, Polona Gantar, S. Koeva, Sanni Nimb, Bolette Pedersen, Sussi Olsen, Margit Langemets, Kristina Koppel, Tiiu Üksik, Kaja Dobrovoljc, Rafael-J. Ureña-Ruiz, José-Luis Sancho-Sánchez, Veronika Lipp, Tamás Váradi, A. Győrffy, S. László, Valeria Quochi, Monica Monachini, Francesca Frontini, Carole Tiberius, R. Tempelaars, Rute Costa, Ana Salgado, Jaka Čibej & Tina Munda (None) Designing the ELEXIS Parallel Sense-Annotated Dataset in 10 European Languages.semantics
-
MOHAMMADI, Hassan, Alireza Talebpour, Ahmad Aznaveh & S. Yazdani (None) Mehr: A Persian Coreference Resolution Corpus.coreference
-
PRZEPIÓRKOWSKI, Adam, Magdalena Borysiak, Adam Okrasiński, Bartosz Pobożniak, Wojciech Stempniak, Kamil Tomaszek & Adam Głowacki (None) Symmetric Dependency Structure of Coordination: Crosslinguistic Arguments from Dependency Length Minimization.syntax
@Article{Zeldes2017,
author = {Amir Zeldes},
title = {The {GUM} Corpus: Creating Multilayer Resources in the Classroom},
journal = {Language Resources and Evaluation},
year = {2017},
volume = {51},
number = {3},
pages = {581--612},
doi = {10.1007/s10579-016-9343-x}
}
@InProceedings{BehzadZeldes2020,
author = {Shabnam Behzad and Amir Zeldes},
title = {A Cross-Genre Ensemble Approach to Robust {R}eddit
Part of Speech Tagging},
booktitle = {Proceedings of the 12th Web as Corpus Workshop (WAC-XII)},
pages = {50--56},
year = {2020},
url = {https://aclanthology.org/2020.wac-1.7/}
}
@inproceedings{lin-zeldes-2021-wikigum,
title = {{W}iki{GUM}: Exhaustive Entity Linking for Wikification in 12 Genres},
author = {Jessica Lin and Amir Zeldes},
booktitle = {Proceedings of The Joint 15th Linguistic Annotation Workshop (LAW) and
3rd Designing Meaning Representations (DMR) Workshop (LAW-DMR 2021)},
year = {2021},
address = {Punta Cana, Dominican Republic},
url = {https://aclanthology.org/2021.law-1.18},
pages = {170--175},
}
@inproceedings{lin-zeldes-2024-gumsley,
title = "{GUM}sley: Evaluating Entity Salience in Summarization
for 12 {E}nglish Genres",
author = "Lin, Jessica and
Zeldes, Amir",
editor = "Graham, Yvette and
Purver, Matthew",
booktitle = "Proceedings of the 18th Conference of the European Chapter of the
Association for Computational Linguistics (Volume 1: Long Papers)",
year = "2024",
address = "St. Julian{'}s, Malta",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2024.eacl-long.158/",
pages = "2575--2588"
}
@inproceedings{liu-zeldes-2023-gumsum,
title = "{GUMS}um: Multi-Genre Data and Evaluation for {E}nglish Abstractive
Summarization",
author = "Liu, Yang Janet and
Zeldes, Amir",
editor = "Rogers, Anna and
Boyd-Graber, Jordan and
Okazaki, Naoaki",
booktitle = "Findings of the Association for Computational Linguistics: ACL 2023",
month = jul,
year = "2023",
address = "Toronto, Canada",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2023.findings-acl.593/",
doi = "10.18653/v1/2023.findings-acl.593",
pages = "9315--9327",
}
@InProceedings{ZhuEtAl2021,
author = {Yilun Zhu and Sameer Pradhan and Amir Zeldes},
booktitle = {Proceedings of ACL-IJCNLP 2021},
title = {{OntoGUM}: Evaluating Contextualized {SOTA} Coreference Resolution
on 12 More Genres},
year = {2021},
address = {Bangkok, Thailand},
pages = {461--467},
url = {https://aclanthology.org/2021.acl-short.59.pdf}
}
@article{zeldes-etal-2025-erst,
title = "e{RST}: A Signaled Graph Theory of Discourse Relations and Organization",
author = "Zeldes, Amir and
Aoyama, Tatsuya and
Liu, Yang Janet and
Peng, Siyao and
Das, Debopam and
Gessler, Luke",
journal = "Computational Linguistics",
volume = "51",
number = "1",
year = "2025",
address = "Cambridge, MA",
publisher = "MIT Press",
url = "https://aclanthology.org/2025.cl-1.3/",
doi = "10.1162/coli_a_00538",
pages = "23--72"
}
@inproceedings{liu-etal-2024-gdtb,
title = "{GDTB}: Genre Diverse Data for {E}nglish Shallow Discourse Parsing across
Modalities, Text Types, and Domains",
author = "Liu, Yang Janet and
Aoyama, Tatsuya and
Scivetti, Wesley and
Zhu, Yilun and
Behzad, Shabnam and
Levine, Lauren Elizabeth and
Lin, Jessica and
Tiwari, Devika and
Zeldes, Amir",
editor = "Al-Onaizan, Yaser and
Bansal, Mohit and
Chen, Yun-Nung",
booktitle = "Proceedings of the 2024 Conference on Empirical Methods in Natural
Language Processing",
month = nov,
year = "2024",
address = "Miami, Florida, USA",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2024.emnlp-main.684/",
doi = "10.18653/v1/2024.emnlp-main.684",
pages = "12287--12303"
}
The ISLRN for the corpus is 421-566-418-865-2.
Papers using GUM
This is a non-exhaustive list of papers citing the GUM corpus reference papers. Feel free to let us know if you know of more, especially papers that make extensive use of the corpus. The list is generated automatically, so not every paper on it does.
For other research citing GUM, see also the Semantic Scholar and Google Scholar entries for the reference paper.


