Our EACL 2023 paper presents a thorough investigation of generalizability issues in RST parsing, focusing on the impact of data diversity, and argues for multi-genre benchmarks based on our experimental results
Our EMNLP paper on cross-domain treebanking and parsing for Hebrew has new SOTA parsing results and presents a brand new UD Hebrew dataset
Check out Logan, Janet and Amir's arXiv paper on the largest Chinese RST dataset, to be presented at AACL 2022
Congratulations to Corpling lab's system DisCoDisCo on winning the DISRPT 2021 Shared Task on discourse relations!
Check out our ACL paper about generalization in SOTA coreference resolution, including the new OntoGUM dataset for evaluation.
Please join us online for Digital Coptic 3, the virtual workshop for DH projects on Coptic!
Would you like more data to work with? Check out our LREC paper, where we present a freely available, genre-balanced English web corpus totaling 4M tokens and featuring a large number of high-quality automatic annotation layers, including dependency trees, non-named entity annotations, coreference resolution, and discourse trees in Rhetorical Structure Theory.
Shabnam and Amir's paper on Reddit part-of-speech tagging was accepted to WAC-XII.
Thoughts on how to treebank social media? Read our LREC paper
The Coptic Dictionary Online won the 2019 DH Awards for Best DH Tool!
RFTokenizer now supports Arabic
rstWeb 3.0.0 is out with discourse signal annotation support (paper)
GumDrop scores 3 second places and 1 first place at the DISRPT 2019 shared task
GUM version 5.0.0 has been released!
The DISRPT 2019 shared task data on discourse unit segmentation is online
Janet and Amir will present a paper about anchoring discourse signals in RST-DT at SCiL 2019
Logan, Janet and Amir are presenting 3 papers at AACL2018 in Atlanta
Amir's new book is out! (Google preview)
Amir and Mitchell published a paper on the Coptic Universal Dependency Treebank, which will be presented at the Universal Dependencies Workshop 2018
RFTokenizer achieves best accuracy on Hebrew morphological segmentation in a new paper at SIGMORPHON 2018
The National Endowment for the Humanities just announced we won a big DHAG grant for research on Coptic!
We're presenting a new paper about the Coptic Dictionary Online at COLING's LaTeCH-2018 workshop.
Logan and Amir will present a paper on converting Stanford Dependencies to Universal Dependencies using a multilayered corpus at the LAW-MWE-CxG-2018 workshop at COLING2018
Amir presented a paper about notional anaphora at CRAC2018
Logan is presenting a paper about UD GUM at MASC SLL at UMBC
The GUM corpus is now part of Universal Dependencies!
V2.5.0 of the Coptic Scriptorium corpora has been released
Amir gave a talk about discourse signals at JHU