
Affinity Workshop: Tiny Papers Poster Session 4

Beyond Words: A Topological Exploration of Coherence in Text Documents

Samyak Jain · Rishi Singhal · Sriram Krishna · Yaman Singla · Rajiv Ratn Shah

Halle B #305
Wed 8 May 7:30 a.m. PDT — 9:30 a.m. PDT

Abstract: Coherence is a pivotal metric for evaluating the quality of a text. It quantifies how well the sentences within the text are connected and how well the text is structured and organized. It plays a vital role in downstream Natural Language Processing tasks such as text summarization, question answering, and machine translation. In this work, we explore the use of topological data analysis (TDA) techniques on attention graphs of text documents to model coherence. TDA techniques are known to capture structural information and patterns in data, making them suitable for modeling the $\textit{structure}$ and $\textit{flow}$ of a document, i.e., its coherence. We validate our approach with experiments on the GCDC dataset, achieving state-of-the-art results with a simple MLP.
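
As a rough illustration of what topological features of an attention graph can look like, the sketch below thresholds an attention matrix into an undirected graph and counts its connected components (Betti-0) and independent cycles (Betti-1) at several thresholds. This is only a minimal sketch under stated assumptions, not the authors' pipeline: the abstract does not say which TDA features are used, and the function names, thresholds, and toy matrix here are hypothetical.

```python
def betti_numbers(attention, threshold):
    """Return (b0, b1) for the undirected graph that keeps edge {i, j}
    when max(attention[i][j], attention[j][i]) >= threshold.

    b0 is the number of connected components; b1 is the cycle rank,
    computed for a graph as E - V + b0.
    """
    n = len(attention)
    parent = list(range(n))  # union-find forest over the n tokens

    def find(x):
        # Find the root of x, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    edges = 0
    components = n
    for i in range(n):
        for j in range(i + 1, n):
            if max(attention[i][j], attention[j][i]) >= threshold:
                edges += 1
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_i] = root_j
                    components -= 1

    b0 = components
    b1 = edges - n + b0
    return b0, b1


def topological_features(attention, thresholds=(0.1, 0.25, 0.5)):
    """Concatenate (b0, b1) over several thresholds into one feature list.

    The thresholds are illustrative assumptions, not values from the paper.
    """
    features = []
    for t in thresholds:
        features.extend(betti_numbers(attention, t))
    return features


# Hypothetical 4-token attention matrix (each row sums to 1).
toy_attention = [
    [0.70, 0.20, 0.05, 0.05],
    [0.30, 0.40, 0.30, 0.00],
    [0.30, 0.30, 0.40, 0.00],
    [0.05, 0.05, 0.10, 0.80],
]
features = topological_features(toy_attention)
```

A feature list of this kind, computed per attention head and concatenated, is the sort of fixed-length input a simple MLP classifier could consume; how the features are actually built and aggregated in the paper is not stated in the abstract.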
