
Human-Centric Event Representations at the Document Level and Beyond

Author

Alexander Martin

Mentors

Aaron White and Hangfeng He

Abstract

Understanding event descriptions is a central aspect of language processing, but current approaches focus overwhelmingly on single sentences or documents. Aggregating information about an event across documents can offer a much richer understanding. To this end, we present FAMuS, a new corpus of Wikipedia passages that report on some event, paired with underlying, genre-diverse (non-Wikipedia) source articles for the same event. Events and (cross-sentence) arguments in both report and source are annotated against FrameNet, providing broad coverage of different event types. We present results on two key event understanding tasks enabled by FAMuS: source validation (determining whether a document is a valid source for a target report event) and cross-document argument extraction (full-document argument extraction for a target event from both its report and the correct source article). We release both FAMuS and our models to support further research.
