Automatically comparing two series: a dissimilarity test

2022-01-07 00:00:00 statistics algorithm c c++

I have two series, Series1 and Series2. My aim is to quantify, automatically and on a bin-by-bin basis, how much Series2 differs from Series1 (each bin represents a particular feature). This image can be seen in its original size by clicking here.

Series1 is the expected result. Series2 is the test/incoming series.

I am providing a histogram plot in which Series2 is shown in dark brown. Note the significant variation on the x-axis between 221 and 353, i.e. Series2 is lower than Series1 there. I am coding in C++.

I think cross-correlation would help, but it produces a value based on similarity rather than dissimilarity. I see people talk about the Kolmogorov-Smirnov test. Is this the test I should be performing?

UPDATE 1: I am trying to perform template matching. I have divided both my template image and my incoming test image into 8x8 blocks. I compare each block in the template image with the block at the same spatial pixel position in the test image, calculating the sum of intensities within each block. This gives Series1 for the template image and Series2 for the test image.
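The block-sum step described in the update could look roughly like the sketch below. It assumes a row-major 8-bit grayscale image and reads "8x8 blocks" as tiles of 8x8 pixels (pass `block = 8`); the function name `block_intensity_sums` and its parameters are hypothetical. Image dimensions are assumed to be exact multiples of the block size.

```c
#include <assert.h>
#include <stddef.h>

/* Sum pixel intensities over non-overlapping block x block tiles of a
 * row-major 8-bit grayscale image. The caller provides sums[] with room for
 * (width / block) * (height / block) entries; tiles are stored in
 * row-major tile order. */
void block_intensity_sums(const unsigned char *pixels,
                          size_t width, size_t height,
                          size_t block, double *sums)
{
    assert(block > 0 && width % block == 0 && height % block == 0);

    size_t tiles_x = width / block;
    size_t tiles_y = height / block;

    for (size_t t = 0; t < tiles_x * tiles_y; t++)
        sums[t] = 0.0;

    for (size_t y = 0; y < height; y++) {
        for (size_t x = 0; x < width; x++) {
            /* Index of the tile that contains pixel (x, y). */
            size_t tile = (y / block) * tiles_x + (x / block);
            sums[tile] += pixels[y * width + x];
        }
    }
}
```

Running this once on the template image and once on the test image would give the two series, with one entry per block.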

Recommended answer

Here is a C implementation of an algorithm to compute the divergence of actual data from predicted data. The algorithm comes from the book Practical BASIC Programs (Osborne/McGraw-Hill, copyright 1980).
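For reference, the quantities reported by the code below can be written as formulas. The notation is my own, not the book's: a_i is `actual[i]`, x_i is `expected[i]`, and N is `size`.

```latex
\begin{aligned}
e_i &= a_i - x_i, \qquad i = 1, \dots, N \\
\text{TotalError} &= \sum_{i=1}^{N} e_i \\
\text{AbsError} &= \sum_{i=1}^{N} \lvert e_i \rvert \\
\text{SqError} &= \sum_{i=1}^{N} e_i^{2} \\
\text{MeanError} &= \frac{1}{N} \sum_{i=1}^{N} e_i \\
\text{MeanAbsError} &= \frac{1}{N} \sum_{i=1}^{N} \lvert e_i \rvert \\
\text{MeanSqError} &= \frac{1}{N} \sum_{i=1}^{N} e_i^{2} \\
\text{RMSError} &= \sqrt{\frac{1}{N} \sum_{i=1}^{N} e_i^{2}}
\end{aligned}
```

Because the e_i are signed, TotalError and MeanError can be close to zero even when individual bins differ a lot; the absolute and squared variants do not cancel in that way.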

Here is the .h file:

/*
 * divergence.h
 *
 *  Created on: Jan 13, 2011
 *      Author: Erik Oosterwal
 */

#ifndef DIVERGENCE_H_
#define DIVERGENCE_H_

typedef struct
{
    int DataSize;
    float TotalError;
    float AbsError;       //< Total Absolute Error
    float SqError;        //< Total Squared Error
    float MeanError;
    float MeanAbsError;
    float MeanSqError;
    float RMSError;     //< Root Mean Square Error
}DIVERGENCE_ERROR_TYPE;

void Divergence__Error(int size, float expected[], float actual[], DIVERGENCE_ERROR_TYPE *error);


// For floating-point values prefer fabs() from <math.h>; abs() from <stdlib.h> takes an int.
#ifndef ABS
    #define ABS(x) (((x) > 0) ? (x) : (0 - (x)))     //< Not safe!!! - Do not increment parameter inside ABS()!
#endif


#endif /* DIVERGENCE_H_ */

...the .c file:

/*
 * divergence.c
 *
 *  Created on: Jan 13, 2011
 *      Author: Erik Oosterwal
 */

#include <math.h>
#include "divergence.h"

/**
 *      @brief  Compute divergence from expected values.
 *
 *      @details    Compute the raw errors, absolute errors, root mean square errors,
 *                  etc. for a series of values.
 *
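 *      @param  size     - number of values to compare.
 *      @param  expected - the expected (predicted) values.
 *      @param  actual   - the observed values; each error is actual - expected.
 *      @param  error    - output structure that receives the computed statistics.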
 */
void Divergence__Error(int size, float expected[], float actual[], DIVERGENCE_ERROR_TYPE *error)
{
    double total_err = 0.0;
    double abs_err = 0.0;
    double abs_sqr_err = 0.0;
    double temp = 0.0;
    int index = 0;

    for(index=0; index<size; index++)
    {
        temp = (double)(actual[index])-(double)(expected[index]);
        total_err+=temp;
        abs_err+=ABS(temp);
        abs_sqr_err+=pow(ABS(temp),2);
    }

    temp = (double)size;
    error->DataSize = (int)size;
    error->TotalError = (float)total_err;
    error->AbsError = (float)abs_err;
    error->SqError = (float)abs_sqr_err;
    error->MeanError = (float)(total_err/temp);
    error->MeanAbsError = (float)(abs_err/temp);
    error->MeanSqError = (float)(abs_sqr_err/temp);
    error->RMSError = (float)(sqrt(abs_sqr_err/temp));
}

...and a sample main() for testing the function:

/*
 * main.c
 *
 *  Created on: Jan 13, 2011
 *      Author: Erik Oosterwal
 */

#include <stdio.h>
#include "divergence.h"

float vote[]={40.3, 22.5, 16.3, 10.5, 7.2, 3.2};
float poll[]={42.7, 21.4, 18.2, 6.0, 7.4, 4.3};
float actual[] ={74, 70, 58, 60, 65, 73, 70};
float predict[]={49, 62, 75, 82, 37, 58, 92};

int main(int argc, char *argv[])
{
    DIVERGENCE_ERROR_TYPE stats;

    Divergence__Error(6, poll, vote, &stats);
    printf("%i\n%f\n%f\n%f\n%f\n%f\n%f\n%f\n\n\n",
           stats.DataSize, stats.TotalError, stats.AbsError, stats.SqError,
           stats.MeanError, stats.MeanAbsError, stats.MeanSqError, stats.RMSError);

    Divergence__Error(7, predict, actual, &stats);
    printf("%i\n%f\n%f\n%f\n%f\n%f\n%f\n%f\n\n\n",
           stats.DataSize, stats.TotalError, stats.AbsError, stats.SqError,
           stats.MeanError, stats.MeanAbsError, stats.MeanSqError, stats.RMSError);

    return(0);
}

I can't guarantee that this is the fastest method, and the function could use some tweaking to make it friendlier to different data types, but it works and the results were verified against the samples provided in the book.
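Since the question asks for a bin-by-bin comparison and the function above reports only aggregate statistics, one possible tweak is to keep the individual differences as well. The following is a minimal sketch, not part of the book's algorithm; the name `per_bin_difference`, the `tolerance` parameter, and the use of `double` are all assumptions made for illustration.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* For each bin i, store actual[i] - expected[i] in diff[i].
 * Returns the number of bins whose absolute difference exceeds tolerance. */
size_t per_bin_difference(const double *expected, const double *actual,
                          size_t n, double tolerance, double *diff)
{
    assert(tolerance >= 0.0);

    size_t flagged = 0;
    for (size_t i = 0; i < n; i++) {
        diff[i] = actual[i] - expected[i];
        if (fabs(diff[i]) > tolerance)
            flagged++;
    }
    return flagged;
}
```

A negative entry in `diff` marks a bin where the test series is below the expected series, which is the situation described for bins 221 to 353 in the question. A suitable `tolerance` depends on the data and would have to be chosen by the caller.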
