I am reading this paper in an attempt to recreate the salient region detection and segmentation model employed. I have the following questions pertaining to section 3 of the paper and I would highly appreciate it if someone could provide clarity on them.
The word "scales" is used at multiple points in the section, for example, line 4 of the section states "saliency maps are created at different scales". I do not exactly understand what the authors mean by the word scales. Moreover, is there a mathematical way to think about it?
I understand that a saliency value
is computed for each pixel at (
) using the equation
However, there is no mention of in the equation. Hence, I am confused as to what pixel the saliency value is being computed for. Is it
?
- I did not understand what the authors meant by the term "bin" in section 3.2 line 5 where it is stated, "The hill-climbing algorithm can be seen as a search window being run across the space of the d-dimensional histogram to find the largest bin within that window."
Note 1: This question was originally posted on Stack Overflow. I was advised to post it on another platform as a consequence of it being unfitting to the site. Hence, I am uploading the question here. Link to the original post here.
Note 2: In case you are unable to access the link to the research paper, the following citation may help: Achanta, R., Estrada, F., Wils, P., & Süsstrunk, S. (2008, May). Salient region detection and segmentation.