
Effective monitoring of underwater ecosystems is crucial for tracking environmental changes, guiding conservation efforts, and ensuring long-term ecosystem health. However, automating underwater ecosystem management with robotic platforms remains challenging due to the complexities of underwater imagery, which pose significant difficulties for traditional visual localization methods. We propose an integrated pipeline that combines Visual Place Recognition (VPR), feature matching, and image segmentation on video-derived images. This method enables robust identification of revisited areas, estimation of rigid transformations, and downstream analysis of ecosystem changes. Furthermore, we introduce the SQUIDLE+ VPR Benchmark: the first large-scale underwater VPR benchmark designed to leverage an extensive collection of unstructured data from multiple robotic platforms, spanning time intervals from days to years. The dataset encompasses diverse trajectories, arbitrary overlap, and a range of seafloor types captured under varying environmental conditions, including differences in depth, lighting, and turbidity.
To enable reliable multi-year change detection, we present Underloc, an integrated pipeline that combines Visual Place Recognition (VPR), feature matching, and image segmentation on video-derived images. Using MegaLoc, a state-of-the-art VPR method, our hierarchical approach retrieves the top-K matched images per query and reranks these candidates with more computationally expensive local feature matching. We use LightGlue to establish correspondences between SuperPoint keypoints and compute a homography for warping and aligning query-database matches. Using the inlier count, we rerank the matched images and filter out those with reprojection errors greater than 10 pixels. To simulate a potential change detection method, we automatically extract segmentation masks for each image using Segment Anything 2 (SAM2). We then use the homography to warp the masks into a common image space, enabling pixel-level comparison with intersection over union (IoU) as a similarity proxy.
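As an illustration, the following is a minimal sketch of the reranking stage, assuming the publicly released LightGlue package API and OpenCV; the image paths, device choice, and keypoint budget are placeholders, not values from the paper.

import cv2
import numpy as np
import torch
from lightglue import LightGlue, SuperPoint
from lightglue.utils import load_image, rbd

device = "cuda" if torch.cuda.is_available() else "cpu"

# SuperPoint features matched with LightGlue (keypoint budget is a placeholder).
extractor = SuperPoint(max_num_keypoints=2048).eval().to(device)
matcher = LightGlue(features="superpoint").eval().to(device)

# One MegaLoc top-K candidate pair: a query image and a retrieved database image.
query = load_image("query.jpg").to(device)
candidate = load_image("candidate.jpg").to(device)

feats0 = extractor.extract(query)
feats1 = extractor.extract(candidate)
matches01 = matcher({"image0": feats0, "image1": feats1})
feats0, feats1, matches01 = [rbd(x) for x in (feats0, feats1, matches01)]

matches = matches01["matches"]  # (K, 2) index pairs into the two keypoint sets
pts0 = feats0["keypoints"][matches[:, 0]].cpu().numpy()
pts1 = feats1["keypoints"][matches[:, 1]].cpu().numpy()

# Homography with a 10-pixel RANSAC reprojection threshold, as in the pipeline;
# candidates are reranked by inlier count.
H, inlier_mask = cv2.findHomography(pts0, pts1, cv2.RANSAC, ransacReprojThreshold=10.0)
num_inliers = int(inlier_mask.sum()) if inlier_mask is not None else 0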
Our hierarchical method, combining MegaLoc with SuperPoint, achieves performance comparable to brute-force SuperPoint matching (16% vs. 18% average precision).
We evaluate the alignment between the two warped masks using intersection over union (IoU), where the intersection is the number of pixels shared by the query and database masks, and the union is the total number of pixels covered by either mask in the warped image space.
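In code, this metric reduces to warping one binary mask with the estimated homography and counting pixels. A minimal sketch follows; the mask arrays would come from SAM2 in our pipeline, and the function name is ours for illustration.

import cv2
import numpy as np

def warped_mask_iou(query_mask: np.ndarray, db_mask: np.ndarray, H: np.ndarray) -> float:
    """IoU between the query mask warped into the database frame and the database mask."""
    h, w = db_mask.shape
    # Nearest-neighbour interpolation keeps the warped mask binary.
    warped = cv2.warpPerspective(query_mask.astype(np.uint8), H, (w, h),
                                 flags=cv2.INTER_NEAREST).astype(bool)
    db = db_mask.astype(bool)
    intersection = np.logical_and(warped, db).sum()  # pixels shared by both masks
    union = np.logical_or(warped, db).sum()          # pixels covered by either mask
    return float(intersection) / union if union > 0 else 0.0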
The SQUIDLE+ VPR Benchmark is the first large-scale underwater VPR benchmark designed to leverage an extensive collection of unstructured data from multiple robotic platforms, spanning time intervals from days to years.
Using our publicly available code, any sequence from SQUIDLE+ can be exported and processed by the pipeline to create a dataset that encompasses diverse trajectories, arbitrary overlap, and a range of seafloor types captured under varying environmental conditions.
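For orientation only, a hypothetical sketch of that workflow is shown below; the module, function names, and arguments are illustrative placeholders, not the actual entry points of the released code.

# Hypothetical sketch: the real entry points live in the released code;
# these names and arguments are illustrative placeholders.
from underloc import export_squidle_sequence, run_pipeline  # placeholder API

frames = export_squidle_sequence(deployment_id="...", out_dir="data/sequence")
results = run_pipeline(query_dir="data/sequence", database_dir="data/reference")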
@inproceedings{GorryIROS2025,
  title={Image-Based Relocalization and Alignment for Long-Term Monitoring of Dynamic Underwater Environments},
  author={Beverley Gorry and Tobias Fischer and Michael Milford and Alejandro Fontan},
  booktitle={IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  year={2025},
}
This research was partially supported by funding from ARC Laureate Fellowship FL210100156 to MM and ARC DECRA Fellowship DE240100149 to TF.
The authors acknowledge continued support from the Queensland University of Technology (QUT) through the Centre for Robotics.
We would particularly like to acknowledge the authors of VPR-methods-evaluation, MegaLoc, LightGlue, SAM2, and VSLAM-LAB.