We performed image recognition on these cropped images. Local image features have many properties that make them suitable for matching and distinguishing images of an object or scene; in particular, some features are invariant to image scaling and rotation. We use the scale-invariant feature transform (SIFT) algorithm to detect key points and compute descriptors for each image. SIFT is a computer vision algorithm for detecting and describing local features in images; it was published by David Lowe in 1999. SIFT features are extracted from the set of candidate images and stored in a database. We then apply the k-means clustering algorithm to group the feature descriptors into clusters. k-means clustering is a method of cluster analysis that aims to partition n observations into k clusters such that each observation belongs to the cluster with the nearest mean. Finally, we fetch the centroid image of each cluster, that is, the image that serves as the representative of its cluster.
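The clustering step can be illustrated with a minimal sketch in plain Python. It is not our actual implementation: it assumes each image has already been reduced to a single numeric feature vector (how SIFT descriptors are aggregated per image is left open here), and the function names `kmeans` and `representatives` are hypothetical. The sketch partitions the vectors into k clusters and returns, for each cluster, the index of the vector closest to the cluster mean.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=50, seed=0):
    """Plain k-means (Lloyd's algorithm); returns (centroids, labels)."""
    rng = random.Random(seed)
    # Assumption: initial centroids are k distinct input points chosen at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: each point joins the cluster with the nearest mean.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments unchanged, so the centroids are stable
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


def representatives(points, centroids, labels):
    """For each non-empty cluster, index of the point nearest its centroid."""
    reps = []
    for c, centroid in enumerate(centroids):
        members = [i for i, lab in enumerate(labels) if lab == c]
        if members:
            reps.append(
                min(members, key=lambda i: squared_distance(points[i], centroid))
            )
    return reps


if __name__ == "__main__":
    # Toy 2-D vectors standing in for per-image feature vectors.
    vectors = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    centroids, labels = kmeans(vectors, k=2)
    print(representatives(vectors, centroids, labels))
```

Choosing the member nearest the mean, rather than the mean itself, guarantees that every representative is an actual image from the candidate set.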

Because we want a highly representative result, namely a small set of images that are highly dissimilar to one another, we developed a ranking mechanism to select the more representative images from the image set created above. First, we match each pair of images by comparing their features individually, based on the Euclidean distance between their feature vectors. Then we compute the ratio of the number of matching points to the total number of key points detected in the two images. A higher ratio indicates that the two images are more likely to be similar. Finally, we sort the ratios in ascending order, so that we are free to take the top n images from the sorted array. The overall process is depicted in Figure 1.
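The ranking step can be sketched as follows. This is an illustration under stated assumptions rather than our exact procedure: a feature counts as matched when its nearest neighbour in the other image lies within a distance threshold (`max_distance`, a hypothetical parameter), and each image is scored by its mean ratio against all other images before sorting. All function names are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def count_matches(desc_a, desc_b, max_distance):
    """Count features in desc_a whose nearest feature in desc_b is close enough."""
    if not desc_b:
        return 0
    return sum(
        1
        for d in desc_a
        if min(euclidean(d, e) for e in desc_b) <= max_distance
    )


def similarity_ratio(desc_a, desc_b, max_distance):
    """Matching points divided by the total key points of both images."""
    total = len(desc_a) + len(desc_b)
    if total == 0:
        return 0.0
    return count_matches(desc_a, desc_b, max_distance) / total


def rank_by_dissimilarity(descriptor_sets, max_distance, n):
    """Indices of the n images with the lowest mean ratio to the others."""
    scores = []
    for i, desc_i in enumerate(descriptor_sets):
        ratios = [
            similarity_ratio(desc_i, desc_j, max_distance)
            for j, desc_j in enumerate(descriptor_sets)
            if j != i
        ]
        mean_ratio = sum(ratios) / len(ratios) if ratios else 0.0
        scores.append((mean_ratio, i))
    scores.sort()  # ascending: least similar images come first
    return [i for _, i in scores[:n]]


if __name__ == "__main__":
    # Toy descriptors: images 0 and 1 are near-duplicates, image 2 is distinct.
    images = [
        [(0, 0), (1, 1)],
        [(0, 0.1), (1, 1.1)],
        [(50, 50), (60, 60)],
    ]
    print(rank_by_dissimilarity(images, max_distance=0.5, n=2))
```

In the toy data the distinct image ranks first, because it shares no matching points with the other two and therefore has the lowest ratio.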

Figure 1 Overall process of image processing

We evaluated the performance of these two methods through face-to-face interviews with human participants and concluded that the sequential windows cropping method is better than the random one.

We define the problem statement more specifically in Section 3. Section 4 briefly introduces the approaches we used to generate representative sets. Section 5 gives the details of the windows cropping mechanism. Section 6 gives the clustering details, and Section 7 describes the ranking mechanism. We evaluate our experiment in Section 8. Before all that, we review some of the important related work.