Landmark detection with surprise saliency using convolutional neural networks

Feng Tang, Fordham University


Landmarks can serve as reference points that enable people or robots to localize themselves or to navigate in an environment. Automatically defining and extracting appropriate landmarks from the environment has proven to be a challenging task when pre-defined landmarks are not present. In this thesis we propose a novel computational model for automatically detecting landmarks from a single image without any pre-defined landmark database. The hypothesis of this research is that an object that looks abnormal because of its atypical scene context may be considered a good landmark, because it is unique and easy to spot by different viewers (or by the same viewer at different times). The model consists of scene classification, object detection, scene-object relation modeling, and landmark detection using surprise saliency. We leverage state-of-the-art algorithms based on convolutional neural networks to recognize scenes and detect objects. We then calculate a surprise score for each object detected in the image to determine whether it is a good landmark. The surprise score is based on the probabilities of scenes, objects, and the relationships between them. To evaluate the performance of our model, we collected a landmark image dataset consisting of landmark images and non-landmark images. The experimental results, including accuracy and F1 scores, show that our model achieves good performance in landmark detection and landmark image classification.
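
The abstract does not give the surprise-score formula, so the following is only an illustrative sketch of how scene probabilities, a detection confidence, and a scene-object relation could be combined. It assumes the score is the detection confidence times the negative log of the object's expected probability under the predicted scenes. The function name `surprise_score`, the `eps` floor, and all probability values are hypothetical and are not taken from the thesis.

```python
import math


def surprise_score(scene_probs, detection_conf, object_given_scene, eps=1e-6):
    """Illustrative surprise score for one detected object (assumed form).

    scene_probs:        {scene: P(scene | image)} from a scene classifier
    detection_conf:     P(object | image) from an object detector
    object_given_scene: {scene: P(object | scene)} for this object class
    eps:                hypothetical floor that avoids log(0)
    """
    # Expected probability of seeing this object, marginalized over scenes.
    expected = sum(
        p_scene * object_given_scene.get(scene, 0.0)
        for scene, p_scene in scene_probs.items()
    )
    # A low expected probability means high surprise; the detection
    # confidence scales the score down for uncertain detections.
    return detection_conf * -math.log2(max(expected, eps))


# Made-up numbers: an image that the scene classifier labels mostly "street".
scene_probs = {"street": 0.9, "beach": 0.1}
p_object_given_scene = {
    "car": {"street": 0.8, "beach": 0.05},
    "boat": {"street": 0.01, "beach": 0.4},
}

car_score = surprise_score(scene_probs, 0.95, p_object_given_scene["car"])
boat_score = surprise_score(scene_probs, 0.95, p_object_given_scene["boat"])
```

Under these assumed values, a boat detected in a street scene scores higher than a car, which matches the hypothesis that an object in an atypical scene context is a better landmark candidate.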

Subject Area

Robotics|Artificial intelligence|Computer science

Recommended Citation

Tang, Feng, "Landmark detection with surprise saliency using convolutional neural networks" (2016). ETD Collection for Fordham University. AAI10189245.