Older updates

Check out our EACL 2023 paper, a thorough investigation of RST generalizability issues with a focus on the impact of data diversity. Based on our experimental results, it makes the case for multi-genre benchmarks for RST parsing.

Check out our ACL paper about generalization in SOTA coreference resolution, including the new OntoGUM dataset for evaluation.

Please join us online for Digital Coptic 3, the virtual workshop for DH projects on Coptic!

Would you like more data to work with? Check out our LREC paper, where we present a freely available, genre-balanced English web corpus totaling 4M tokens and featuring a large number of high-quality automatic annotation layers, including dependency trees, non-named entity annotations, coreference resolution, and discourse trees in Rhetorical Structure Theory.

2020-06-07

WAC-XII

Shabnam and Amir's paper on Reddit part-of-speech tagging was accepted to WAC-XII.

2020-05-18

Thoughts on how to treebank social media? Read our LREC paper.

2019-08-01

rstWeb 3.0.0 is out with discourse signal annotation support (paper)

2019-03-21

GUM version 5.0.0 has been released!

2018-12-09

SCiL 2019

Janet and Amir will present a paper about anchoring discourse signals in RST-DT at SCiL 2019.

2018-09-21

Logan, Janet and Amir are presenting three papers at AACL 2018 in Atlanta.

2018-05-10

MASC SLL

Logan is presenting a paper about UD GUM at MASC SLL at UMBC.

2018-02-23

Amir giving a talk at JHU

Amir gave a talk about discourse signals at JHU.