ECCV

ECCV paper using the official ECCV style on the Springer LNCS track. Single-column body, anonymous submission mode, LNCS-style references via splncs04, and ORCID link support. 14-page body + unlimited references is the standard ECCV limit.

Category

Conference

License

Free to use (MIT)

File

eccv/main.tex

% This is main.tex, a sample paper demonstrating the LLNCS + ECCV style.
\documentclass[runningheads]{llncs}

\usepackage[T1]{fontenc}
\usepackage{graphicx}
\usepackage{amsmath,amssymb}
\usepackage{booktabs}
\usepackage{caption}
\usepackage{subcaption}
\usepackage{multirow}
\usepackage{array}
\usepackage[accsupp]{axessibility}  % Improves PDF readability for screen readers
\usepackage{color}

% ECCV style toggle file. Swap to the current year's variant.
\usepackage{eccv}

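% For anonymous submission keep the next line commented out; for camera-ready,
% uncomment it. The \ifdefined tests below only check that the macro exists, so
% it is defined here as an empty flag (assumes eccv.sty does not already define it).
% \def\eccvfinalcopy{}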

\def\eccvPaperID{7777}  % *** Enter the ECCV paper ID here ***

\usepackage[pagebackref,breaklinks,colorlinks]{hyperref}
\ifdefined\eccvfinalcopy
\else
\pagestyle{headings}
\mainmatter
\def\ECCVSubNumber{\eccvPaperID}  % reuse the paper ID set above
\fi

\begin{document}

\title{Self-Supervised Dense Correspondence via\\
  Forward-Backward Cyclic Consistency}

\titlerunning{Self-Supervised Dense Correspondence}

\ifdefined\eccvfinalcopy
\author{First Last\inst{1}\orcidlink{0000-0000-0000-0000} \and
        Jane Doe\inst{2}\orcidlink{0000-0000-0000-0000} \and
        John Smith\inst{1}}
\authorrunning{F. Last et al.}
\institute{University of Example, City, Country\\
           \email{[email protected]}
           \and Example Research Labs, City, Country\\
           \email{[email protected]}}
\else
\author{Anonymous ECCV submission\\\vspace{1em}Paper ID \eccvPaperID}
\authorrunning{ECCV-26 submission ID \eccvPaperID}
\institute{\email{Anonymous}}
\fi

\maketitle

\begin{abstract}
We propose a self-supervised method for learning dense pixel
correspondences across video frames by enforcing forward-backward
cyclic consistency. Given a pair of frames, we learn a bidirectional
flow field and penalize deviations from the identity under composition
of forward and backward maps. The method requires no labeled training
data and achieves state-of-the-art results among self-supervised
approaches on TAP-Vid-DAVIS, TAP-Vid-Kinetics, and YouTube-VOS,
improving average Jaccard by up to 9.5 points. We further show that
cyclic consistency is a strict generalization of prior photometric
losses, explaining the consistent gains.

\keywords{Dense correspondence \and Self-supervised learning \and
Video tracking \and Cycle consistency}
\end{abstract}

\section{Introduction}
Dense correspondence underpins optical flow, tracking, video editing,
and 3D reconstruction. Supervised methods rely on synthetic or
expensively annotated data; self-supervised methods can draw on abundant
unlabeled video but have lagged in accuracy, particularly over long
temporal horizons.

We revisit cyclic consistency, a well-known idea in computer vision,
and propose an especially strong form that enforces agreement across
non-adjacent frames via composition of intermediate flows. This yields
a dense self-supervised signal that scales with video length.

\paragraph{Contributions.} (1) A cyclic self-supervised objective
applicable to any dense-correspondence architecture. (2) Theoretical
analysis showing that standard photometric self-supervision is a strict
weakening of our objective. (3) State-of-the-art self-supervised
results on three long-horizon tracking benchmarks.

\section{Related Work}
Self-supervised optical flow~\cite{meister2018unflow,liu2020flow}, point
tracking~\cite{doersch2022tap}, and cycle consistency in vision are the
directly relevant prior work. Our contribution is the combination of
high-order cycles with modern dense-tracker architectures.

\section{Method}
\subsection{Cyclic Consistency Loss}
Given a frame pair $(I_t, I_{t+1})$, we compute a forward map $F$ from
$I_t$ to $I_{t+1}$ and a backward map $B$ from $I_{t+1}$ to $I_t$.
Per-pixel consistency is enforced as
\begin{equation}
  \mathcal{L}_{\text{cyc}} = \mathbb{E}_{x} \big\| B(F(x)) - x \big\|^2.
  \label{eq:cyc}
\end{equation}
For frame triples we enforce a three-hop cycle, in which three forward
steps are undone by three backward steps:
$\mathcal{L}_{\text{cyc}}^{(3)} = \mathbb{E}_x \| B \circ B \circ B \circ F \circ F \circ F(x) - x \|^2$.
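
% Illustrative sketch: the displacement fields f and b are assumptions
% introduced here, not notation from the original text.
As a sketch, assume the maps are parameterized by displacement fields,
$F(x) = x + f(x)$ and $B(y) = y + b(y)$, where $f$ and $b$ are symbols
introduced only for this illustration. Substituting into the residual of
Eq.~\eqref{eq:cyc} gives
\begin{equation}
  B(F(x)) - x = \big(x + f(x)\big) + b\big(x + f(x)\big) - x
              = f(x) + b\big(x + f(x)\big).
\end{equation}
Under this parameterization the loss vanishes exactly when the backward
displacement, sampled at the forward-warped location, cancels the forward
displacement.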

\subsection{Photometric Loss}
As a secondary signal, we use census-transform photometric matching
across warped frames.

\subsection{Architecture}
We build on RAFT~\cite{teed2020raft}, adding cyclic supervision on top
of its iterative refinement.

\section{Experiments}
\subsection{Setup}
We pretrain on an unlabeled 20k-video subset of Kinetics and fine-tune
on each benchmark's training split.

\begin{table}[t]
\centering
\small
\begin{tabular}{lcc}
\toprule
Method & DAVIS (AJ) & Kinetics (AJ) \\
\midrule
DINO-track~\cite{caron2021dino}   & 32.4 & 27.8 \\
RAFT (self-supervised)            & 38.1 & 31.2 \\
\textbf{Ours}                     & \textbf{47.6} & \textbf{40.3} \\
\bottomrule
\end{tabular}
\caption{Average Jaccard on TAP-Vid benchmarks.}
\label{tab:tapvid}
\end{table}
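
% Reference sketch of the metric: the TP/FP/FN symbols are illustrative
% and the thresholds follow our reading of the TAP-Vid benchmark definition.
For reference, a sketch of the metric as we understand the TAP-Vid
definition: Average Jaccard (AJ) averages a per-threshold Jaccard score
over pixel thresholds $\delta \in \{1, 2, 4, 8, 16\}$,
\begin{equation}
  \mathrm{AJ} = \frac{1}{5} \sum_{\delta}
    \frac{\mathrm{TP}_\delta}{\mathrm{TP}_\delta + \mathrm{FP}_\delta + \mathrm{FN}_\delta},
\end{equation}
where $\mathrm{TP}_\delta$ counts visible ground-truth points predicted
visible and within $\delta$ pixels, $\mathrm{FN}_\delta$ counts visible
ground-truth points predicted occluded or farther than $\delta$, and
$\mathrm{FP}_\delta$ counts points predicted visible whose ground truth is
occluded or farther than $\delta$. AJ therefore penalizes both position
and visibility errors.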

\subsection{Ablations}
Removing the higher-order cycle hurts tracking by 11.3 AJ points.
Removing the photometric secondary signal costs 2.7 points.

\section{Discussion}
The method is self-supervised but is not invariant to occlusion: when
a point is occluded, the composed maps need not return to the starting
pixel, so the cycle residual is nonzero even for a correct track. We
handle this with a learned visibility mask, introduced in Section~3.3 of
the supplementary material.

\section{Conclusion}
High-order cyclic consistency is a simple and powerful self-supervised
signal for learning dense correspondence over long horizons.

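% Shown only in the camera-ready version to preserve anonymity during review.
\ifdefined\eccvfinalcopy
\subsubsection{Acknowledgements.} We thank our anonymous reviewers
and colleagues at Example Research Labs.
\fi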

\bibliographystyle{splncs04}
\bibliography{refs}

\end{document}