Research

  • What's new in GUM?

    GUM is an open-source multilayer corpus of richly annotated web texts from eight text types. The corpus is collected and expanded by students as part of the curriculum in ...

  • Entities in the Coptic Treebank

    With the release of Version 2.6 of Universal Dependencies, our focus has shifted to handling Named and Non-Named Entity Recognition (NER/NNER) in Coptic data. As a result of intensive work by the Coptic Scriptorium team in the past few months,...

  • New features in our NLP pipeline

    Coptic Scriptorium’s Natural Language Processing (NLP) tools now support two new features:



  • A Neural Network Reads the Newspaper

    ... in search of discourse signals! We now know a lot about what cues people use to identify discourse relations, but can we teach computers to notice the same signals?



  • What you say where - a discourse heatmap

    Does discourse structure constrain where we talk about what? Research on recurring mentions within discourse graphs shows back-reference is sensitive to the reasons why sentences and groups of sentences are uttered. In the image above, ...
