Disparity and scene flow estimation methods play a large role in how autonomous vehicles and vessels perceive their environment. To ensure accuracy, they often require expensive and impractical stereo camera systems. Monocular approaches have been well researched in city environments, but their viability has not been studied in maritime environments, which pose challenges such as object mirroring and glitter. We use Self-Mono-SF, one of the best real-time monocular disparity and scene flow estimation methods, as the basis for SceneFlowSegmentation, a novel method that predicts scene flow, disparity, and semantic segmentation in real time. We use the MODD2 and MaSTr1325 datasets for training and the MODS dataset for evaluation. We compare the disparity and scene flow estimates with the predictions of Self-Mono-SF trained on a city domain and observe improvements in accuracy. Segmentation results are compared to the current state-of-the-art method, WaSR. The F1 score is 2 percentage points lower than that of WaSR; however, the newly developed method also accurately predicts disparity and scene flow while operating at the same speed (10 frames per second).