EMNLP

EMNLP long paper using the shared ACL style file. Double-column, 8-page body + unlimited references layout, required Limitations and Ethics Statement sections, and full-length empirical NLP paper structure (dataset, method, experiments, analysis).

Category: Conference

License: Free to use (MIT)

File: emnlp/main.tex

\documentclass[11pt]{article}

% Shared ACL style file (also used by EMNLP, NAACL).
\usepackage[]{acl}

\usepackage{times}
\usepackage{latexsym}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage{microtype}
\usepackage{inconsolata}
\usepackage{graphicx}
\usepackage{amsmath,amssymb}
\usepackage{booktabs}
\usepackage{multirow}
\usepackage{natbib}
\usepackage{url}

\title{Structured Retrieval for Long-Document\\
       Open-Domain Question Answering}

\author{First Last \\
  University of Example \\
  \texttt{[email protected]} \\\And
  Jane Doe \\
  University of Example \\
  \texttt{[email protected]} \\\AND
  John Smith \\
  Example Research Labs \\
  \texttt{[email protected]} \\}

\begin{document}
\maketitle

\begin{abstract}
Long documents present retrieval challenges distinct from standard
passage retrieval: chunk-level retrievers lose cross-section context
while document-level retrievers lose precision. We propose a structured
retriever that encodes section hierarchy explicitly through learned
tree embeddings, allowing a single model to match queries at multiple
granularities. On four long-document QA benchmarks (NarrativeQA,
Qasper, ContractQA, and LegalBench), our method improves answer exact
match by 6.7 points on average over the strongest baseline at equal
compute.
We release the trained retriever and a new 200k-document training corpus.
\end{abstract}

\section{Introduction}
Modern LLMs can consume long contexts but still benefit greatly from
retrieval at inference time. How to retrieve structured,
well-contextualized evidence from long documents, however, remains an
open problem. Naive chunking destroys cross-section discourse structure,
while single-vector document retrievers sacrifice precision.

We propose a \emph{structured retriever} that operates on documents
represented as trees of sections. Queries can match at leaf, interior,
or root granularity, and a graph neural network (GNN) propagates
information across the hierarchy.

\paragraph{Contributions.} (1) A hierarchical document representation
and retrieval procedure; (2) an efficient training scheme via
contrastive learning over section trees; (3) state-of-the-art results on
four long-document QA benchmarks; (4) open release of the trained
retriever and the 200k-document training corpus.

\section{Related Work}
\paragraph{Dense retrieval.}
DPR~\citep{karpukhin2020dpr} and ColBERT established dense retrieval as
a workhorse for open-domain QA.

\paragraph{Long-document retrieval.}
Prior work includes hierarchical transformers~\citep{liu2020hierarchical}
and long-context retrievers trained over passage windows. Our method
differs by making structure an explicit, learned component.

\section{Method}
\subsection{Document Representation}
We represent each document as a tree of sections. Section embeddings
are computed with a shared encoder, and a GNN propagates information
between parents and children.
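
As an illustrative sketch only, one way to write such a propagation
step is given below. The notation is our assumption for exposition:
$\mathbf{h}_v^{(0)}$ denotes the encoder embedding of section $v$,
$\mathcal{N}(v)$ its parent and children in the tree, $\mathbf{W}_1$ and
$\mathbf{W}_2$ learned matrices, and $\sigma$ a nonlinearity. A
mean-aggregation message-passing layer would then read
\begin{equation*}
\mathbf{h}_v^{(\ell+1)} = \sigma\Bigl( \mathbf{W}_1 \mathbf{h}_v^{(\ell)}
  + \mathbf{W}_2 \, \frac{1}{|\mathcal{N}(v)|}
    \sum_{u \in \mathcal{N}(v)} \mathbf{h}_u^{(\ell)} \Bigr),
\end{equation*}
so that after $L$ layers each node embedding can depend on sections up
to $L$ edges away in the tree. Other aggregators, such as attention over
$\mathcal{N}(v)$, would fit the same description.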

\subsection{Training}
We train with contrastive learning over tuples of a query, a positive
section, and a set of negative sections. Negatives are mined at multiple
tree depths to force the retriever to distinguish precise matches from
approximate ones.
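
A common objective of this kind is the InfoNCE loss, which we state here
as an assumed form for illustration rather than as the exact training
loss. Writing $s(q, v)$ for a similarity score between query $q$ and
section node $v$ (for example a dot product of their embeddings),
$v^{+}$ for the positive section, $\mathcal{V}^{-}$ for the mined
negatives, and $\tau > 0$ for a temperature, all of which are notation
introduced for this sketch, the per-query loss is
\begin{equation*}
\mathcal{L}(q) = -\log
  \frac{\exp\bigl(s(q, v^{+})/\tau\bigr)}
       {\exp\bigl(s(q, v^{+})/\tau\bigr)
        + \sum_{v^{-} \in \mathcal{V}^{-}} \exp\bigl(s(q, v^{-})/\tau\bigr)} .
\end{equation*}
Under this form, the gradient with respect to each negative's score is
proportional to its softmax weight, so negatives that score close to
$v^{+}$ dominate the update.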

\subsection{Inference}
At query time, we encode the question once and score all nodes of each
document tree in parallel. The top-$k$ nodes across the corpus are
returned as retrieved evidence.
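
For concreteness, assuming dot-product scoring (an illustrative choice;
the symbols below are not fixed by the method description), let
$\mathbf{q}$ be the query embedding and $\mathbf{h}_v$ the embedding of
node $v$ in the tree $\mathcal{V}_d$ of document $d$ in the corpus
$\mathcal{D}$. Retrieval then amounts to
\begin{equation*}
s(\mathbf{q}, v) = \mathbf{q}^{\top} \mathbf{h}_v , \qquad
\mathcal{R}_k(\mathbf{q}) =
  \operatorname*{top\text{-}k}_{v \,\in\, \bigcup_{d \in \mathcal{D}} \mathcal{V}_d}
  s(\mathbf{q}, v) .
\end{equation*}
Under this assumption, node embeddings do not depend on the query, so
they can be precomputed and scoring reduces to one matrix-vector product
per query.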

\section{Experiments}
\subsection{Benchmarks}
\paragraph{NarrativeQA} requires reasoning over full book texts.
\paragraph{Qasper} has scholarly papers as context.
\paragraph{ContractQA} tests retrieval over legal contracts.
\paragraph{LegalBench} is a broad legal reasoning benchmark.

\begin{table}[t]
\centering
\small
\begin{tabular}{lcccc}
\toprule
Method & NQA & Qasper & CQA & LB \\
\midrule
BM25           & 38.1 & 29.2 & 41.8 & 52.3 \\
DPR            & 42.6 & 33.4 & 44.7 & 55.1 \\
LongRAG        & 47.9 & 39.8 & 48.4 & 58.6 \\
\textbf{Ours}  & \textbf{54.3} & \textbf{46.1} & \textbf{55.9} & \textbf{65.2} \\
\bottomrule
\end{tabular}
\caption{Exact match on four long-document QA benchmarks. NQA:
NarrativeQA; CQA: ContractQA; LB: LegalBench.}
\label{tab:main}
\end{table}

\subsection{Analysis}
Ablations show that hierarchical structure helps most when documents
exceed 20 pages. On shorter documents, our retriever matches strong
passage-level baselines but does not dominate them.

\section{Conclusion}
Explicitly modeling document structure is a simple and effective lever
for retrieval on long documents. Our released retriever and corpus
should accelerate follow-up work in this area.

\section*{Limitations}
Our method requires document segmentation, which we obtain via heading
detection. Performance degrades on unstructured texts lacking headings.
We evaluate only on English-language benchmarks; generalization to
other languages remains future work.

\section*{Ethics Statement}
The released corpus is drawn from public domain documents and licensed
legal text. We removed personally identifying information during
preprocessing. While our retriever can be used for legal search, we
stress that it is not a substitute for qualified legal advice.

\section*{Acknowledgments}
We thank the anonymous EMNLP reviewers and colleagues at Example
Research Labs.

\bibliography{refs}
\bibliographystyle{acl_natbib}

\end{document}