GENERATING REPRESENTATIVE SETS AND SUMMARIES FOR LARGE COLLECTION OF IMAGES: EVALUATION (1)

Initial Experimental Setup

First we define the data set. It consists of 1000 holiday images (ordinary 2D high-quality images) taken from the INRIA Holidays data set [4]. Our data set contains only images, without any tags. It covers a very large variety of scene types (natural, man-made, water and fire effects, etc.), and the images are in high resolution. Using window-cropping techniques we enlarge the image set from 1000 to N*1000 images; for example, with 3 windows per image we obtain 3000 images for the overall experiments. For the evaluation we selected six data sets on the basis of overall time, coverage, and number of windows, and we used only the SIFT output for the overall comparison. We fixed the values of the coverage C and the number of windows N for the experiments; the idea is to set the coverage high in order to lose less information. At present we consider only N = 3 and N = 5 windows with C = 66%, 75%, and 85%. For each coverage C we selected two resulting image sets (one with random and one with sequential windows) for the evaluation, taking the different values of N into account.
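The exact cropping routine is not specified above, so the following is only a minimal sketch of one plausible windowing scheme. It assumes the window area is the fraction C of the original image area and uses Pillow for image handling; the function name crop_windows and the placement rules are hypothetical.

    import random
    from PIL import Image

    def crop_windows(path, n_windows=3, coverage=0.75, sequential=False):
        """Cut n_windows crops from one image; each crop keeps the
        fraction `coverage` of the original area (assumed reading of C)."""
        img = Image.open(path)
        w, h = img.size
        # Scale both sides by sqrt(coverage) so the area ratio equals coverage.
        cw, ch = int(w * coverage ** 0.5), int(h * coverage ** 0.5)
        windows = []
        for i in range(n_windows):
            if sequential:
                # Slide the window left to right in equal steps.
                x = (w - cw) * i // max(n_windows - 1, 1)
                y = (h - ch) // 2
            else:
                # Place the window at a random position inside the image.
                x = random.randint(0, w - cw)
                y = random.randint(0, h - ch)
            windows.append(img.crop((x, y, x + cw, y + ch)))
        return windows

Under this reading, applying the function with n_windows = 3 to each of the 1000 source images yields the 3000-image set mentioned above.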

For the clustering phase, we take the image set (or window set) produced by the first phase as input. After many initial experiments we set k = 90 for the first clustering. As stated earlier, we apply k-means twice in order to obtain a reduced set of highly dissimilar images. So, after obtaining 90 clusters, we apply k-means again with k = 20. The idea is to keep the same k throughout, so that the windows of an original image mostly go to the same cluster because of the high coverage C. The outcome of this phase is the representative set, which contains 20 images of the original data set.
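As an illustration of the two-stage clustering, the sketch below assumes every image or window is already encoded as a fixed-length feature vector, e.g. a bag-of-visual-words histogram over its SIFT descriptors (this representation, and the choice of the centroid-nearest image as each cluster's representative, are our assumptions). It uses scikit-learn's KMeans.

    import numpy as np
    from sklearn.cluster import KMeans

    def nearest_to_centroids(features, km):
        """For each cluster, pick the member closest to its centroid."""
        picks = []
        for c in range(km.n_clusters):
            members = np.where(km.labels_ == c)[0]
            dists = np.linalg.norm(features[members] - km.cluster_centers_[c], axis=1)
            picks.append(members[np.argmin(dists)])
        return np.array(picks)

    def two_stage_kmeans(features, k1=90, k2=20, seed=0):
        """First clustering (k1=90) reduces the collection to 90 images;
        the second (k2=20) reduces those to the 20 final representatives."""
        km1 = KMeans(n_clusters=k1, random_state=seed, n_init=10).fit(features)
        reps = nearest_to_centroids(features, km1)          # 90 image indices
        km2 = KMeans(n_clusters=k2, random_state=seed, n_init=10).fit(features[reps])
        final = nearest_to_centroids(features[reps], km2)   # 20 of those 90
        return reps[final]                                  # indices into `features`

Returning indices rather than pixel data keeps the link back to the original images, which the ranking phase below needs.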

For the ranking mechanism, we take the 20 images from the second clustering phase as input and apply the ranking mechanism to them, which yields a rank for every image. For the human-based evaluation we then select three different image sets, namely the top 10, 15, and 20 images, and these become the summaries of the original large set. In this way we want to see whether changing the number of images in the summary set gives different results.
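The ranking mechanism itself is not detailed here, so the sketch below only assumes it yields one score per representative image (the scores argument is hypothetical, with higher meaning better) and shows how the ranked list is sliced into the three summary sizes:

    def build_summaries(rep_images, scores, sizes=(10, 15, 20)):
        """Sort the representatives by ranking score (descending) and
        slice the ranked list into one summary per requested size."""
        ranked = [img for _, img in sorted(zip(scores, rep_images),
                                           key=lambda p: p[0], reverse=True)]
        return {n: ranked[:n] for n in sizes}

The dictionary returned for one generated set holds exactly the three summaries (top 10, 15, and 20 images) handed to the human evaluators.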

We devised a questionnaire for the human evaluation of our outcomes from the different methods and distributed it to 24 interviewees. We want to see the rating that the human evaluators give each set, and we are also interested in whether changing the number of images inside a summary affects the results. For the experiment we chose summaries of 10, 15, and 20 images for each set, so every human evaluator has to check a total of 18 sets (6 generated sets × 3 summary sizes per set). Before starting the questionnaire, we show the evaluators the original image collection and explain the images and the perspective of our project.