Comparing image similarity with OpenCV and Python

Date: 2022-11-11

Question

I'm trying to compare an image against a list of other images and return the subset of that list (as Google image search does) whose similarity to it is about 70% or higher.

I took the code from this post and adapted it to my context:

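import cv2
import numpy as np

# NOTE: this code uses the OpenCV 2.4 API (FeatureDetector_create, KNearest).
# MEDIA_ROOT is assumed to be defined elsewhere (e.g. in the project settings).

# Load the images
img = cv2.imread(MEDIA_ROOT + "/uploads/imagerecognize/armchair.jpg")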

# Convert them to grayscale
imgg =cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)

# SURF extraction
surf = cv2.FeatureDetector_create("SURF")
surfDescriptorExtractor = cv2.DescriptorExtractor_create("SURF")
kp = surf.detect(imgg)
kp, descritors = surfDescriptorExtractor.compute(imgg,kp)

# Setting up samples and responses for kNN
samples = np.array(descritors)
responses = np.arange(len(kp),dtype = np.float32)

# kNN training
knn = cv2.KNearest()
knn.train(samples,responses)

modelImages = [MEDIA_ROOT + "/uploads/imagerecognize/1.jpg", MEDIA_ROOT + "/uploads/imagerecognize/2.jpg", MEDIA_ROOT + "/uploads/imagerecognize/3.jpg"]

for modelImage in modelImages:

    # Now loading a template image and searching for similar keypoints
    template = cv2.imread(modelImage)
    templateg= cv2.cvtColor(template,cv2.COLOR_BGR2GRAY)
    keys = surf.detect(templateg)

    keys,desc = surfDescriptorExtractor.compute(templateg, keys)

    for h,des in enumerate(desc):
        des = np.array(des,np.float32).reshape((1,128))

        retval, results, neigh_resp, dists = knn.find_nearest(des,1)
        res,dist =  int(results[0][0]),dists[0][0]


        if dist<0.1: # draw matched keypoints in red color
            color = (0,0,255)

        else:  # draw unmatched in blue color
            #print dist
            color = (255,0,0)

        #Draw matched key points on original image
        x,y = kp[res].pt
        center = (int(x),int(y))
        cv2.circle(img,center,2,color,-1)

        #Draw matched key points on template image
        x,y = keys[h].pt
        center = (int(x),int(y))
        cv2.circle(template,center,2,color,-1)



    cv2.imshow('img',img)
    cv2.imshow('tm',template)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

My question is: how can I compare the image with the list of images and keep only the similar ones? Is there a method for this?

Recommended answer

I suggest you take a look at the earth mover's distance (EMD) between the images. This metric gives a sense of how hard it is to transform one normalized grayscale image into another, and it can be generalized to color images. A very good analysis of the method can be found in the following paper:

robotics.stanford.edu/~rubner/papers/rubnerIjcv00.pdf

It can be computed either on the whole image or on the histogram (which is much faster than the whole-image approach). I'm not sure which function supports a full-image comparison, but for histogram comparison you can use the cv.CalcEMD2 function.
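As a rough illustration of the histogram variant, here is a minimal sketch that does not use OpenCV at all: it swaps in scipy.stats.wasserstein_distance, which computes the one-dimensional EMD, and applies it to grayscale intensity histograms. The helper names (gray_histogram, histogram_emd) and the synthetic test images are assumptions made for this example only.

```python
import numpy as np
from scipy.stats import wasserstein_distance

def gray_histogram(image, bins=256):
    """Normalized intensity histogram of an 8-bit grayscale image."""
    hist, _ = np.histogram(image, bins=bins, range=(0, 256))
    return hist / hist.sum()

def histogram_emd(image_a, image_b, bins=256):
    """1-D earth mover's distance between two intensity histograms."""
    levels = np.arange(bins)  # bin indices serve as positions on the gray axis
    return wasserstein_distance(levels, levels,
                                u_weights=gray_histogram(image_a, bins),
                                v_weights=gray_histogram(image_b, bins))

# Synthetic stand-ins for real grayscale images (assumption: uint8 arrays).
rng = np.random.default_rng(0)
dark = rng.integers(0, 100, size=(32, 32)).astype(np.uint8)
bright = rng.integers(150, 256, size=(32, 32)).astype(np.uint8)

d_same = histogram_emd(dark, dark)    # identical histograms
d_diff = histogram_emd(dark, bright)  # histograms far apart on the gray axis
```

With real files, the arrays would come from cv2.imread followed by a grayscale conversion, as in the question's code.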

The only catch is that this method does not give a percentage of similarity, but a distance that you can filter on.
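Filtering on a distance could look roughly like the sketch below. The function names, the threshold, and the distance values are all made up for illustration; the mapping to a score in (0, 1] is just one heuristic choice and is not part of the EMD itself.

```python
def filter_by_distance(distances, max_distance):
    """Return (name, distance) pairs with distance <= max_distance, closest first."""
    kept = [(name, d) for name, d in distances.items() if d <= max_distance]
    return sorted(kept, key=lambda pair: pair[1])

def similarity_score(distance, scale):
    """Heuristic mapping of a distance to (0, 1]; 'scale' is an arbitrary tuning knob."""
    return 1.0 / (1.0 + distance / scale)

# Made-up distances from a query image to three candidates (illustrative only).
distances = {"1.jpg": 4.2, "2.jpg": 87.5, "3.jpg": 12.0}
similar = filter_by_distance(distances, max_distance=20.0)
```

A suitable threshold depends on the image set and has to be chosen empirically.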

I know this is not a complete working algorithm, but it is a basis for one, so I hope it helps.

Here is a sketch of how the EMD works in principle. The main idea is to take two normalized matrices (two grayscale images, each divided by the sum of its pixels) and define a flux matrix that describes how gray is moved from each pixel of the first image to each pixel of the second in order to obtain the second image (it can also be defined for non-normalized images, but that is more difficult).

In mathematical terms the flow matrix is actually a four-dimensional tensor that gives the flow from point (i,j) of the old image to point (k,l) of the new one, but if you flatten the images you can turn it into an ordinary matrix, just a little harder to read.
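The flattening step can be illustrated with a small NumPy sketch. The image size and the placeholder tensor values are arbitrary assumptions; the only point is the index mapping between the two layouts.

```python
import numpy as np

h, w = 2, 3
# flow4d[i, j, k, l]: mass moved from pixel (i, j) of the old image
# to pixel (k, l) of the new one (placeholder values, not a real flow)
flow4d = np.arange(h * w * h * w, dtype=float).reshape(h, w, h, w)

# Flatten both images row by row: pixel (i, j) becomes index i * w + j
flow2d = flow4d.reshape(h * w, h * w)

i, j, k, l = 1, 2, 0, 1
same = flow2d[i * w + j, k * w + l] == flow4d[i, j, k, l]
```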

This flow matrix has three constraints: every entry must be non-negative, and, with rows indexed by the starting pixels and columns by the destination pixels (as in the code below), each row must sum to the value of the corresponding starting pixel and each column to the value of the corresponding destination pixel.

Given this, you have to minimize the cost of the transformation, which is the sum over all pixel pairs of the flow from (i,j) to (k,l) multiplied by the distance between (i,j) and (k,l).
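Written out, the problem reads as follows. The symbols are introduced here for illustration only: f is the flow, x and y are the normalized starting and destination images, and d is the Euclidean distance between pixel positions.

```latex
\begin{aligned}
\min_{f}\quad & \sum_{i,j}\sum_{k,l} f_{ij,kl}\, d_{ij,kl},
\qquad d_{ij,kl} = \sqrt{(i-k)^2 + (j-l)^2} \\
\text{subject to}\quad & f_{ij,kl} \ge 0, \\
& \sum_{k,l} f_{ij,kl} = x_{ij} \quad \text{for every starting pixel } (i,j), \\
& \sum_{i,j} f_{ij,kl} = y_{kl} \quad \text{for every destination pixel } (k,l).
\end{aligned}
```

Both the objective and the constraints are linear in f, so this is a linear program.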

It looks a little complicated in words, so here is the test code. The logic is correct; I'm not sure why the scipy solver complains about it (maybe you should look at openOpt or something similar):

import numpy as np
from scipy.optimize import minimize

# original data: two 2x2 images, normalized so that each sums to 1
x = np.random.rand(2, 2)
x /= x.sum()
y = np.random.rand(2, 2)
y /= y.sum()

# initial guess of the flux matrix:
# the outer product of image x (as rows) and image y (as columns).
# This is a feasible flux, but not an optimal one.
F = np.outer(x.flatten(), y.flatten()).flatten()

# distance matrix, based on the Euclidean distance between pixel positions
# (indexing='ij' keeps the pixel order consistent with x.flatten())
row_x, col_x = np.meshgrid(range(x.shape[0]), range(x.shape[1]), indexing='ij')
row_y, col_y = np.meshgrid(range(y.shape[0]), range(y.shape[1]), indexing='ij')
rows = (row_x.reshape(-1, 1) - row_y.reshape(1, -1))**2
cols = (col_x.reshape(-1, 1) - col_y.reshape(1, -1))**2
D = np.sqrt(rows + cols)

D = D.flatten()
x = x.flatten()
y = y.flatten()
# COST = sum(F*D)

# cost function and its gradient
fun = lambda F: np.sum(F * D)
jac = lambda F: D
# array of constraints
# the constraint that F sums to one is implied by the constraints below
cons = []
# each row and column should sum to the value of the start and destination array;
# i=i binds the current index to each lambda (otherwise every lambda uses the last i)
cons += [{'type': 'eq', 'fun': lambda F, i=i: np.sum(F.reshape((x.size, y.size))[i, :]) - x[i]} for i in range(x.size)]
cons += [{'type': 'eq', 'fun': lambda F, i=i: np.sum(F.reshape((x.size, y.size))[:, i]) - y[i]} for i in range(y.size)]
# the values of F should be non-negative: one (min, max) pair per entry
bnds = [(0, None)] * F.size

res = minimize(fun=fun, x0=F, method='SLSQP', jac=jac, bounds=bnds, constraints=cons)

The variable res contains the result of the minimization... but as I said, I'm not sure why it complains about a singular matrix.
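One possible contributor to the singular-matrix complaint is that the equality constraints are linearly dependent: the row sums and the column sums both add up to 1, so one of them is redundant. Because the cost and the constraints are linear, an alternative worth trying is a dedicated linear-programming solver. The sketch below swaps in scipy.optimize.linprog with the HiGHS backend (this assumes a SciPy version that provides method="highs"); the function name emd_linprog is hypothetical, and this is not the original author's method.

```python
import numpy as np
from scipy.optimize import linprog

def emd_linprog(x, y):
    """EMD between two normalized 2-D arrays, solved as a linear program."""
    n, m = x.size, y.size
    # Euclidean distance between every pixel of x and every pixel of y
    px = np.indices(x.shape).reshape(2, -1).T  # (n, 2) pixel coordinates, row-major
    py = np.indices(y.shape).reshape(2, -1).T  # (m, 2)
    D = np.sqrt(((px[:, None, :] - py[None, :, :]) ** 2).sum(axis=2))

    # Equality constraints on the flattened flow F (length n*m, row-major):
    # row i of F sums to x[i], column j of F sums to y[j]
    A_rows = np.kron(np.eye(n), np.ones((1, m)))
    A_cols = np.kron(np.ones((1, n)), np.eye(m))
    A_eq = np.vstack([A_rows, A_cols])
    b_eq = np.concatenate([x.ravel(), y.ravel()])

    res = linprog(D.ravel(), A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method="highs")
    return res.fun, res.x.reshape(n, m)

rng = np.random.default_rng(0)
x = rng.random((2, 2))
x /= x.sum()
y = rng.random((2, 2))
y /= y.sum()

cost_xy, flow = emd_linprog(x, y)  # distance between two different images
cost_xx, _ = emd_linprog(x, x)     # distance of an image to itself
```

The number of flow variables grows with the product of the two pixel counts, so this formulation is only practical for small images or for histograms.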

The only problem with this algorithm is that it is not very fast, so it cannot be run on demand; you have to run it patiently when the dataset is created and store the results somewhere.
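A minimal sketch of that precompute-and-store idea follows. It uses the cheaper one-dimensional histogram EMD (scipy.stats.wasserstein_distance) as a stand-in for the full-image computation, and random histograms in place of a real dataset; the function name, the file name, and the use of a temporary directory are assumptions for the example.

```python
import os
import tempfile
import numpy as np
from scipy.stats import wasserstein_distance

def pairwise_histogram_emd(hists):
    """Symmetric matrix of 1-D EMDs between the rows of 'hists' (normalized histograms)."""
    levels = np.arange(hists.shape[1])  # bin indices as positions
    n = len(hists)
    dist = np.zeros((n, n))
    for a in range(n):
        for b in range(a + 1, n):
            d = wasserstein_distance(levels, levels, hists[a], hists[b])
            dist[a, b] = dist[b, a] = d
    return dist

# Random stand-ins for the histograms of three dataset images.
rng = np.random.default_rng(0)
hists = rng.random((3, 16))
hists /= hists.sum(axis=1, keepdims=True)

# Done once, when the dataset is created: compute and store all distances.
dist = pairwise_histogram_emd(hists)
path = os.path.join(tempfile.mkdtemp(), "emd_distances.npy")
np.save(path, dist)

# Done at query time: load the stored distances instead of recomputing them.
loaded = np.load(path)
```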
