Running Felzenszwalb's Deformable Part Model (DPM) source code voc-release3.1 on Windows to train your own model

My environment

DPM source version: voc-release3.1

VOC devkit version: VOC2007_devkit_08-Jun

Training dataset: VOC2007

Matlab version: Matlab R2012b

C++ compiler: VS2010

OS: Win7 32-bit

 

Why not use voc-release4.01? Version 4 adds object detection grammars and also uses asymmetric part layouts, among other things. Accuracy improved, but the source became more complex and harder to analyze. Version 3 is much leaner by comparison and easier to read.

First download voc-release3.1 and the VOCdevkit:

Deformable Part Model, third release (voc-release3.1): http://cs.brown.edu/~pff/latent-release3/

PASCAL VOC 2007 dataset and devkit: http://pascallin.ecs.soton.ac.uk/challenges/VOC/voc2007/index.html


For background on the Deformable Part Model, see the papers

A Discriminatively Trained, Multiscale, Deformable Part Model [CVPR 2008] (Chinese translation)

Object Detection with Discriminatively Trained Part Based Models [PAMI 2010] (Chinese translation)

and the post "Some notes on the Deformable Part Model"

Pedro Felzenszwalb's home page: http://cs.brown.edu/~pff/



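1. Modify some global variables in globals.m (mainly directory settings)

cachedir = 'D:\DPMtrain\VOCCache\';

% directory for the trained models and intermediate data

 

tmpdir = 'D:\DPMtrain\VOCtemp\';

% directory for temporary files used during training; these files can get very large

 

VOCdevkit = ['H:\文档文件\工作\█计算机视觉█\数据集\PascalVOC\VOC2007\VOCdevkit'];

% PASCAL VOC devkit directory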

 

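2. Modify VOCinit.m in the VOCdevkit to set the dataset directory

We use the VOC2007 dataset, so set the VOC2006 flag to false (it is false by default).

If you put the extracted VOC2007 dataset folder under the VOCdevkit directory, you do not need to change the directory settings in VOCinit.m, because that is the layout the code assumes by default. If you want to keep the dataset somewhere else, change VOCopts.datadir to point to the directory that contains it.

I put the VOC2007 folder directly under VOCdevkit, so the directory structure is: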

-VOCdevkit

   -local

      -VOC2006

      -VOC2007

   -results

      -VOC2006

      -VOC2007

   -VOC2007

      -Annotations

      -ImageSets

      -JPEGImages

      -SegmentationClass

      -SegmentationObject

   -VOCcode

 

 

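3. pascal_data.m: get training data for a given object class from the PASCAL dataset

While analyzing the code I wanted to avoid loading the whole dataset, so I created two new image list files in VOCdevkit\VOC2007\ImageSets\Main:

trainval_smallset.txt and train_smallset.txt

Each contains only one or two hundred file names, copied from trainval.txt and train.txt respectively.

Correspondingly, change the code in pascal_data.m to load the new small lists:

Change the line that loads the positive list to:

ids = textread(sprintf(VOCopts.imgsetpath, 'trainval_smallset'), '%s');

Change the negative list in the same way.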
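
The small list files can also be generated instead of copied by hand. Below is a minimal sketch, assuming each list file holds one entry per line; copy_first_lines is a hypothetical helper and is not part of the DPM or VOC code.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Copy at most max_lines non-blank lines from in to out and return how many
// were copied. Hypothetical helper for building a file such as
// trainval_smallset.txt from trainval.txt.
std::size_t copy_first_lines(std::istream &in, std::ostream &out,
                             std::size_t max_lines) {
  std::string line;
  std::size_t copied = 0;
  while (copied < max_lines && std::getline(in, line)) {
    if (line.empty()) continue;  // skip blank lines
    out << line << '\n';
    ++copied;
  }
  assert(copied <= max_lines);
  return copied;
}
```

Called with a std::ifstream opened on trainval.txt, a std::ofstream opened on trainval_smallset.txt and max_lines set to 200, it would write the first 200 entries.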

 

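One thing to watch out for: several list files in VOCdevkit\VOC2007\ImageSets\Main are easy to confuse:

train.txt lists the file names of all images used for training

trainval.txt lists the file names of all images used for training and validation

val.txt lists the file names of all images used for validation

These three sets cover training and validation for all object classes.

There are also train_train.txt, train_trainval.txt and train_val.txt.

These three are the training, training+validation and validation image sets for the "train" (railway train) class,

so be careful to tell them apart.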
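
The naming rule can be made explicit with a few lines of code. This is only a sketch: parse_list_name and ListName are made-up names, and it assumes that class names contain no underscore, which is the case for the 20 VOC2007 classes.

```cpp
#include <cassert>
#include <string>

// Result of parsing a list file name from ImageSets/Main.
struct ListName {
  std::string cls;    // object class; empty for the all-class lists
  std::string split;  // "train", "val", "trainval" or "test"; empty if unrecognized
};

static bool is_split(const std::string &s) {
  return s == "train" || s == "val" || s == "trainval" || s == "test";
}

// Parse names such as "train.txt" (all classes, split train) or
// "train_trainval.txt" (class train, split trainval).
ListName parse_list_name(const std::string &filename) {
  ListName r;
  std::string stem = filename;
  const std::string ext = ".txt";
  if (stem.size() >= ext.size() &&
      stem.compare(stem.size() - ext.size(), ext.size(), ext) == 0)
    stem.erase(stem.size() - ext.size());

  std::string::size_type us = stem.rfind('_');
  if (us == std::string::npos) {
    if (is_split(stem)) r.split = stem;  // e.g. train.txt
    return r;
  }
  std::string tail = stem.substr(us + 1);
  if (is_split(tail)) {                  // e.g. train_val.txt
    r.cls = stem.substr(0, us);
    r.split = tail;
  }
  return r;
}
```

A custom list such as trainval_smallset.txt does not end in a known split name, so both fields come back empty and the file is not mistaken for a per-class list.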

 

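4. Modify the two system calls in rewritedat.m

Run pascal('person', 1) at the Matlab prompt, i.e. train a person DPM with one component, see which errors come up, and fix them one at a time.

The error is:

'mv' is not recognized as an internal or external command, operable program or batch file.

Error in rewritedat (line 38)

So modify rewritedat.m:

Replace unix(['mv ' datfile ' ' oldfile]); on line 13

with: system(['move ' datfile ' ' oldfile]);

Replace unix(['cp ' inffile ' ' oldfile]); on line 44

with: system(['copy ' inffile ' ' oldfile]);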

 

5. Modify train.m

After fixing rewritedat.m, call pascal('person', 1) again. This time the problem is:

executing: ./learn 0.0020 1.0000 D:\DPMtrain\VOCtemp\person.hdr D:\DPMtrain\VOCtemp\person.dat D:\DPMtrain\VOCtemp\person.mod D:\DPMtrain\VOCtemp\person.inf D:\DPMtrain\VOCtemp\person.lob
'.' is not recognized as an internal or external command, operable program or batch file.

This is another Linux-style system call, so train.m needs to be modified.

On line 123, in    cmd = sprintf('./learn %.4f %.4f %s %s %s %s %s', ...
                                                   C, J, hdrfile, datfile, modfile, inffile, lobfile);

remove the ./ from the command string, giving:

cmd = sprintf('learn %.4f %.4f %s %s %s %s %s', ...
                           C, J, hdrfile, datfile, modfile, inffile, lobfile);

On line 128, replace    status = unix(cmd);

with: status = system(cmd);

After this change you still get the message that 'learn' is not recognized as an internal or external command, because learn.cc has not been compiled yet.
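
To make the interface between train.m and the learner explicit: the command carries seven arguments after the program name, which is what check(argc == 8) in learn.cc expects. The sketch below rebuilds the same string in C++; learn_command is a hypothetical helper written for illustration, not code from the release.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// Build the command line for the external learner: program name, C, J and
// the five file paths, in the order learn.cc reads them from argv.
std::string learn_command(double C, double J, const std::string &hdr,
                          const std::string &dat, const std::string &mod,
                          const std::string &inf, const std::string &lob) {
#ifdef _WIN32
  const char *exe = "learn";    // cmd.exe finds learn.exe in the current directory
#else
  const char *exe = "./learn";  // Unix shells need the explicit ./ prefix
#endif
  std::ostringstream os;
  // %.4f in the Matlab sprintf call corresponds to fixed notation, 4 decimals.
  os << exe << ' ' << std::fixed << std::setprecision(4) << C << ' ' << J
     << ' ' << hdr << ' ' << dat << ' ' << mod << ' ' << inf << ' ' << lob;
  return os.str();
}
```

Like the Matlab original, this sketch does not quote its arguments, so a path that contains spaces would be split into several arguments and fail the argc check.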


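6. Compile learn.cc

First rename learn.cc to learn.cpp, create an empty console project in VS2010, and add the source file learn.cpp.

Try to compile. The error is: Cannot open include file: 'sys/time.h': No such file or directory (c:\test\dpm_learn\dpm_learn\learn.cpp, line 5)

This is a Linux header path. Change #include <sys/time.h> on line 5 to #include <time.h> and compile again. A whole pile of errors then appears, for the following reasons:

(1) time.h on Windows neither defines struct timeval nor provides gettimeofday, so I added a piece of code found online to learn.cpp.

(2) The code added in the previous step needs the header windows.h, and including it brings in the system-defined min and max macros, so the min and max functions defined in learn.cpp have to be commented out, otherwise compilation fails.

(3) Windows has no drand48 or srand48, so I added versions of these two functions taken from another blogger's post.

(4) INFINITY is not defined on Windows, so define it yourself.

(5) int buf[labelsize+2]; in main fails because the C++ compiler in VS does not allow an array length given by a variable. Allocate it dynamically with new instead:

int *buf = new int[labelsize+2]; and add delete [] buf; at the end of the same scope.

After these five changes there are no more errors. Some warnings remain, but they can be ignored. Build the project to produce learn.exe and copy it to the voc-release3.1 directory, where the Matlab code will run it through a system call during training.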
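
Three of the fixes above (min/max, INFINITY and the variable-length array) can also be written with standard C++ library facilities, so that the same source builds with both VS and g++. The following is a minimal sketch and not the code used in this post; the function names are made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Infinity without relying on the C99 INFINITY macro.
inline double positive_infinity() {
  return std::numeric_limits<double>::infinity();
}

// Clamp each weight to its lower bound, like the "apply lowerbounds" loop in
// gd(). Writing (std::max) in parentheses keeps a max macro from windows.h
// (defined unless NOMINMAX is set) from expanding the call.
void apply_lower_bounds(std::vector<double> &w, const std::vector<double> &lb) {
  assert(w.size() == lb.size());
  for (std::size_t k = 0; k < w.size(); k++)
    w[k] = (std::max)(w[k], lb[k]);
}

// Runtime-sized buffer without a variable-length array or a manual delete[].
std::vector<int> make_label_buffer(int labelsize) {
  assert(labelsize >= 0);
  return std::vector<int>(static_cast<std::size_t>(labelsize) + 2, 0);
}
```

With a buffer created this way, fread can be given &buf[0] as its destination, and the memory is released automatically when buf goes out of scope.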

The complete modified learn.cpp is:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>
//#include <sys/time.h>
#include <errno.h>

#include <time.h>     // used on Windows in place of sys/time.h
#include <windows.h>  // used on Windows in place of sys/time.h

/*
 * Optimize LSVM objective function via gradient descent.
 *
 * We use an adaptive cache mechanism.  After a negative example
 * scores beyond the margin multiple times it is removed from the
 * training set for a fixed number of iterations.
 */

// Data File Format
// EXAMPLE*
// 
// EXAMPLE:
//  long label          ints
//  blocks              int
//  dim                 int
//  DATA{blocks}
//
// DATA:
//  block label         float
//  block data          floats
//
// Internal Binary Format
//  len           int (byte length of EXAMPLE)
//  EXAMPLE       <see above>
//  unique flag   byte

// number of iterations
#define ITER 5000000

// small cache parameters
#define INCACHE 3
#define WAIT 10

// error checking
#define check(e) (e ? (void)0 : (printf("%s:%u error: %s\n%s\n", __FILE__, __LINE__, #e, strerror(errno)), exit(1)))

// number of non-zero blocks in example ex
#define NUM_NONZERO(ex) (((int *)ex)[labelsize+1])

// float pointer to data segment of example ex
#define EX_DATA(ex) ((float *)(ex + sizeof(int)*(labelsize+3)))

// class label (+1 or -1) for the example
#define LABEL(ex) (((int *)ex)[1])

// block label (converted to 0-based index)
#define BLOCK_IDX(data) (((int)data[0])-1)

// INFINITY is not defined on Windows (VS2010); HUGE_VAL from math.h is +infinity for IEEE doubles
#ifndef INFINITY
#define INFINITY HUGE_VAL
#endif


int labelsize;
int dim;

// Windows has no gettimeofday; this is a replacement found online
int gettimeofday(struct timeval *tp, void *tzp)  
{  
	time_t clock;  
	struct tm tm;  
	SYSTEMTIME wtm;  
	GetLocalTime(&wtm);  
	tm.tm_year     = wtm.wYear - 1900;  
	tm.tm_mon     = wtm.wMonth - 1;  
	tm.tm_mday     = wtm.wDay;  
	tm.tm_hour     = wtm.wHour;  
	tm.tm_min     = wtm.wMinute;  
	tm.tm_sec     = wtm.wSecond;  
	tm.tm_isdst    = -1;
	clock = mktime(&tm);  
	tp->tv_sec = clock;  
	tp->tv_usec = wtm.wMilliseconds * 1000;  
	return (0);  
}  

// drand48 and srand48, written after examples found online

#define MNWZ 0x100000000
#define ANWZ 0x5DEECE66D
#define CNWZ 0xB  // increment c of the POSIX drand48 generator

static unsigned long long seed = 1;  

double drand48(void)    
{    
	seed = (ANWZ * seed + CNWZ) & 0xFFFFFFFFFFFFLL;    
	unsigned int x = seed >> 16;    
	return  ((double)x / (double)MNWZ);       
}  

//static unsigned long long seed = 1;  

void srand48(unsigned int i)    
{    
	seed  = (((long long int)i) << 16) | rand();    
}  

// comparison function for sorting examples 
int comp(const void *a, const void *b) {
  // sort by extended label first, and whole example second...
  int c = memcmp(*((char **)a) + sizeof(int), 
		 *((char **)b) + sizeof(int), 
		 labelsize*sizeof(int));
  if (c)
    return c;
  
  // labels are the same  
  int alen = **((int **)a);
  int blen = **((int **)b);
  if (alen == blen)
    return memcmp(*((char **)a) + sizeof(int), 
		  *((char **)b) + sizeof(int), 
		  alen);
  return ((alen < blen) ? -1 : 1);
}

// a collapsed example is a sequence of examples
struct collapsed {
  char **seq;
  int num;
};

// set of collapsed examples
struct data {
  collapsed *x;
  int num;
  int numblocks;
  int *blocksizes;
  float *regmult;
  float *learnmult;
};

// seed the random number generator with the current time
void seed_time() {
 struct timeval tp;
 check(gettimeofday(&tp, NULL) == 0);
 srand48((long)tp.tv_usec);
}

// windows.h already defines min and max macros, so the definitions below are commented out to avoid errors
//static inline double min(double x, double y) { return (x <= y ? x : y); }
//static inline double max(double x, double y) { return (x <= y ? y : x); }

// gradient descent
void gd(double C, double J, data X, double **w, double **lb) {
  int num = X.num;
  
  // state for random permutations
  int *perm = (int *)malloc(sizeof(int)*X.num);
  check(perm != NULL);

  // state for small cache
  int *W = (int *)malloc(sizeof(int)*num);
  check(W != NULL);
  for (int j = 0; j < num; j++)
    W[j] = 0;

  int t = 0;
  while (t < ITER) {
    // pick random permutation
    for (int i = 0; i < num; i++)
      perm[i] = i;
    for (int swapi = 0; swapi < num; swapi++) {
      int swapj = (int)(drand48()*(num-swapi)) + swapi;
      int tmp = perm[swapi];
      perm[swapi] = perm[swapj];
      perm[swapj] = tmp;
    }

    // count number of examples in the small cache
    int cnum = 0;
    for (int i = 0; i < num; i++) {
      if (W[i] <= INCACHE)
	cnum++;
    }

    for (int swapi = 0; swapi < num; swapi++) {
      // select example
      int i = perm[swapi];
      collapsed x = X.x[i];

      // skip if example is not in small cache
      if (W[i] > INCACHE) {
	W[i]--;
	continue;
      }

      // learning rate
      double T = t + 1000.0;
      double rateX = cnum * C / T;
      double rateR = 1.0 / T;

      if (t % 10000 == 0) {
	printf(".");
	fflush(stdout);
      }
      t++;
      
      // compute max over latent placements
      int M = -1;
      double V = 0;
      for (int m = 0; m < x.num; m++) {
	double val = 0;
	char *ptr = x.seq[m];
	float *data = EX_DATA(ptr);
	int blocks = NUM_NONZERO(ptr);
	for (int j = 0; j < blocks; j++) {
	  int b = BLOCK_IDX(data);
	  data++;
	  for (int k = 0; k < X.blocksizes[b]; k++)
	    val += w[b][k] * data[k];
	  data += X.blocksizes[b];
	}
	if (M < 0 || val > V) {
	  M = m;
	  V = val;
	}
      }
      
      // update model
      for (int j = 0; j < X.numblocks; j++) {
	double mult = rateR * X.regmult[j] * X.learnmult[j];
	for (int k = 0; k < X.blocksizes[j]; k++) {
	  w[j][k] -= mult * w[j][k];
	}
      }
      char *ptr = x.seq[M];
      int label = LABEL(ptr);
      if (label * V < 1.0) {
	W[i] = 0;
	float *data = EX_DATA(ptr);
	int blocks = NUM_NONZERO(ptr);
	for (int j = 0; j < blocks; j++) {
	  int b = BLOCK_IDX(data);
	  double mult = (label > 0 ? J : -1) * rateX * X.learnmult[b];      
	  data++;
	  for (int k = 0; k < X.blocksizes[b]; k++)
	    w[b][k] += mult * data[k];
	  data += X.blocksizes[b];
	}
      } else if (label == -1) {
	if (W[i] == INCACHE)
	  W[i] = WAIT;
	else
	  W[i]++;
      }
    }

    // apply lowerbounds
    for (int j = 0; j < X.numblocks; j++) {
      for (int k = 0; k < X.blocksizes[j]; k++) {
	w[j][k] = max(w[j][k], lb[j][k]);
      }
    }

  }

  free(perm);
  free(W);
}

// score examples
double *score(data X, char **examples, int num, double **w) {
  double *s = (double *)malloc(sizeof(double)*num);
  check(s != NULL);
  for (int i = 0; i < num; i++) {
    s[i] = 0.0;
    float *data = EX_DATA(examples[i]);
    int blocks = NUM_NONZERO(examples[i]);
    for (int j = 0; j < blocks; j++) {
      int b = BLOCK_IDX(data);
      data++;
      for (int k = 0; k < X.blocksizes[b]; k++)
        s[i] += w[b][k] * data[k];
      data += X.blocksizes[b];
    }
  }
  return s;  
}

// merge examples with identical labels
void collapse(data *X, char **examples, int num) {
  collapsed *x = (collapsed *)malloc(sizeof(collapsed)*num);
  check(x != NULL);
  int i = 0;
  x[0].seq = examples;
  x[0].num = 1;
  for (int j = 1; j < num; j++) {
    if (!memcmp(x[i].seq[0]+sizeof(int), examples[j]+sizeof(int), 
		labelsize*sizeof(int))) {
      x[i].num++;
    } else {
      i++;
      x[i].seq = &(examples[j]);
      x[i].num = 1;
    }
  }
  X->x = x;
  X->num = i+1;  
}

int main(int argc, char **argv) {  
  seed_time();
  int count;
  data X;

  // command line arguments
  check(argc == 8);
  double C = atof(argv[1]);
  double J = atof(argv[2]);
  char *hdrfile = argv[3];
  char *datfile = argv[4];
  char *modfile = argv[5];
  char *inffile = argv[6];
  char *lobfile = argv[7];

  // read header file
  FILE *f = fopen(hdrfile, "rb");
  check(f != NULL);
  int header[3];
  count = fread(header, sizeof(int), 3, f);
  check(count == 3);
  int num = header[0];
  labelsize = header[1];
  X.numblocks = header[2];
  X.blocksizes = (int *)malloc(X.numblocks*sizeof(int));
  count = fread(X.blocksizes, sizeof(int), X.numblocks, f);
  check(count == X.numblocks);
  X.regmult = (float *)malloc(sizeof(float)*X.numblocks);
  check(X.regmult != NULL);
  count = fread(X.regmult, sizeof(float), X.numblocks, f);
  check(count == X.numblocks);
  X.learnmult = (float *)malloc(sizeof(float)*X.numblocks);
  check(X.learnmult != NULL);
  count = fread(X.learnmult, sizeof(float), X.numblocks, f);
  check(count == X.numblocks);
  check(num != 0);
  fclose(f);
  printf("%d examples with label size %d and %d blocks\n",
	 num, labelsize, X.numblocks);
  printf("block size, regularization multiplier, learning rate multiplier\n");
  dim = 0;
  for (int i = 0; i < X.numblocks; i++) {
    dim += X.blocksizes[i];
    printf("%d, %.2f, %.2f\n", X.blocksizes[i], X.regmult[i], X.learnmult[i]);
  }

  // read examples
  f = fopen(datfile, "rb");
  check(f != NULL);
  printf("Reading examples\n");
  char **examples = (char **)malloc(num*sizeof(char *));
  check(examples != NULL);
  for (int i = 0; i < num; i++) {
    // we use an extra byte in the end of each example to mark unique
    // we use an extra int at the start of each example to store the 
    // example's byte length (excluding unique flag and this int)

    // int buf[labelsize+2];  // VS does not support variable-length arrays; allocate with new instead
    int *buf = new int[labelsize+2];  // dynamic allocation

    count = fread(buf, sizeof(int), labelsize+2, f);
    check(count == labelsize+2);
    // byte length of an example's data segment
    int len = sizeof(int)*(labelsize+2) + sizeof(float)*buf[labelsize+1];
    // memory for data, an initial integer, and a final byte
    examples[i] = (char *)malloc(sizeof(int)+len+1);
    check(examples[i] != NULL);
    // set data segment's byte length
    ((int *)examples[i])[0] = len;
    // set the unique flag to zero
    examples[i][sizeof(int)+len] = 0;
    // copy label data into example
    for (int j = 0; j < labelsize+2; j++)
      ((int *)examples[i])[j+1] = buf[j];
    // read the rest of the data segment into the example
    count = fread(examples[i]+sizeof(int)*(labelsize+3), 1, 
		  len-sizeof(int)*(labelsize+2), f);
    check(count == len-sizeof(int)*(labelsize+2));

    delete [] buf;  // release buf
  }
  fclose(f);
  printf("done\n");

  // sort
  printf("Sorting examples\n");
  char **sorted = (char **)malloc(num*sizeof(char *));
  check(sorted != NULL);
  memcpy(sorted, examples, num*sizeof(char *));
  qsort(sorted, num, sizeof(char *), comp);
  printf("done\n");

  // find unique examples
  int i = 0;
  int len = *((int *)sorted[0]);
  sorted[0][sizeof(int)+len] = 1;
  for (int j = 1; j < num; j++) {
    int alen = *((int *)sorted[i]);
    int blen = *((int *)sorted[j]);
    if (alen != blen || 
	memcmp(sorted[i] + sizeof(int), sorted[j] + sizeof(int), alen)) {
      i++;
      sorted[i] = sorted[j];
      sorted[i][sizeof(int)+blen] = 1;
    }
  }
  int num_unique = i+1;
  printf("%d unique examples\n", num_unique);

  // collapse examples
  collapse(&X, sorted, num_unique);
  printf("%d collapsed examples\n", X.num);

  // initial model
  double **w = (double **)malloc(sizeof(double *)*X.numblocks);
  check(w != NULL);
  f = fopen(modfile, "rb");
  for (int i = 0; i < X.numblocks; i++) {
    w[i] = (double *)malloc(sizeof(double)*X.blocksizes[i]);
    check(w[i] != NULL);
    count = fread(w[i], sizeof(double), X.blocksizes[i], f);
    check(count == X.blocksizes[i]);
  }
  fclose(f);

  // lower bounds
  double **lb = (double **)malloc(sizeof(double *)*X.numblocks);
  check(lb != NULL);
  f = fopen(lobfile, "rb");
  for (int i = 0; i < X.numblocks; i++) {
    lb[i] = (double *)malloc(sizeof(double)*X.blocksizes[i]);
    check(lb[i] != NULL);
    count = fread(lb[i], sizeof(double), X.blocksizes[i], f);
    check(count == X.blocksizes[i]);
  }
  fclose(f);
  
  // train
  printf("Training");
  gd(C, J, X, w, lb);
  printf("done\n");

  // save model
  printf("Saving model\n");
  f = fopen(modfile, "wb");
  check(f != NULL);
  for (int i = 0; i < X.numblocks; i++) {
    count = fwrite(w[i], sizeof(double), X.blocksizes[i], f);
    check(count == X.blocksizes[i]);
  }
  fclose(f);

  // score examples
  printf("Scoring\n");
  double *s = score(X, examples, num, w);

  // Write info file
  printf("Writing info file\n");
  f = fopen(inffile, "w");
  check(f != NULL);
  for (int i = 0; i < num; i++) {
    int len = ((int *)examples[i])[0];
    // label, score, unique flag
    count = fprintf(f, "%d\t%f\t%d\n", ((int *)examples[i])[1], s[i], 
                    (int)examples[i][sizeof(int)+len]);
    check(count > 0);
  }
  fclose(f);
  
  printf("Freeing memory\n");
  for (int i = 0; i < X.numblocks; i++) {
    free(w[i]);
    free(lb[i]);
  }
  free(w);
  free(lb);
  free(s);
  for (int i = 0; i < num; i++)
    free(examples[i]);
  free(examples);
  free(sorted);
  free(X.x);
  free(X.blocksizes);
  free(X.regmult);
  free(X.learnmult);

  return 0;
}
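
The drand48 replacement in the listing derives its result from only 32 bits of the 48-bit state. For comparison, here is a sketch that follows the generator POSIX documents for drand48 and srand48 more closely (multiplier 0x5DEECE66D, increment 0xB, modulus 2^48, low 16 seed bits set to 0x330E). Rand48 is a hypothetical class name, and no claim is made that its output matches any particular C library bit for bit.

```cpp
#include <cassert>

// 48-bit linear congruential generator:
//   X(n+1) = (a * X(n) + c) mod 2^48, with a = 0x5DEECE66D and c = 0xB.
class Rand48 {
 public:
  // Like srand48: the seed fills the high 32 bits of the state and the
  // low 16 bits are set to 0x330E.
  explicit Rand48(unsigned long seedval)
      : state_(((static_cast<unsigned long long>(seedval) & 0xFFFFFFFFULL) << 16) |
               0x330EULL) {}

  // Uniform double in [0, 1) built from all 48 state bits.
  double next() {
    // Unsigned overflow wraps modulo 2^64, which leaves the low 48 bits intact.
    state_ = (0x5DEECE66DULL * state_ + 0xBULL) & 0xFFFFFFFFFFFFULL;
    return static_cast<double>(state_) / 281474976710656.0;  // divide by 2^48
  }

 private:
  unsigned long long state_;
};
```

Keeping the state inside an object rather than in a global variable also makes it easy to reproduce a run by constructing the generator with a fixed seed.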


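7. Array index out-of-bounds errors

There are several out-of-bounds errors. The ones I noticed are:

(1) In pascal_train.m

In the part that merges the models and runs LSVM training,

i.e. under the comment %merge models and train using latent detections & hard negatives:

model = train(cls, model, pos, neg(1:200), 0, 0, 2, 2, 2^28, true, 0.7);

The cause is that my negative set neg contains fewer than 200 negative examples, so change this to:

model = train(cls, model, pos, neg(1:min(length(neg),200)), 0, 0, 2, 2, 2^28, true, 0.7);


The same applies to the part that adds parts and updates the models,

i.e. the two calls to train under the comment % add parts and update models using latent detections & hard negatives.

In both, change neg(1:200) to neg(1:min(length(neg),200)).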


(2) A possible out-of-bounds case in rewritedat.m, pointed out by the blogger pozen:

Change dim = info(end); around line 28 to:

    if length(info) == 0
        dim = 0;
    else
        dim = info(end);
    end

Change dim = y(end); around line 38 to:

    if length(y) == 0
        dim = 0;
    else
        dim = y(end);
    end


References:

http://blog.csdn.net/pozen/article/details/7103412

http://blog.csdn.net/dreamd1987/article/details/7399151


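I trained a single-component person model on a small amount of data, using the first 50 image file names in trainval.txt as positives and the first 300 in train.txt as negatives. After processing by pascal_data this gave a neg array with 176 negative examples and a pos array with 45 positive examples. I did not change the iteration count in learn.cc, so every call to train still runs 5 million iterations. That seems to be where most of the time goes; if you only want a quick test, you can lower the ITER value in learn.cc. Training took roughly one hour. Afterwards I evaluated the model with the evaluation functions in the PASCAL devkit: precision and recall were both 0, and the average precision (AP) was 0 as well, which is what I expected.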
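
One way to change ITER for test runs without editing the value each time is to let the compiler command line override it. This is a sketch of that idea rather than something present in learn.cc; iteration_budget is a hypothetical helper added only to make the value visible.

```cpp
#include <cassert>

// Allow the build to override the iteration count, for example
//   cl /DITER=50000 learn.cpp      (Visual C++)
//   g++ -DITER=50000 learn.cpp     (GCC)
// When no override is given, keep the 5000000 used in the original source.
#ifndef ITER
#define ITER 5000000
#endif

// Report the iteration count the program was built with.
inline long iteration_budget() { return ITER; }
```

A smaller ITER only shortens the gradient descent loop in gd(); the model it produces will generally be worse, so this is meant for checking that the pipeline runs, not for real training.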

After training, the final result is in the cachedir directory, together with a lot of intermediate data:

[Image: contents of the cachedir directory after training]

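Here person_final.mat is the final trained model. The source code saves intermediate data at every stage of training, so even if one stage fails, the next run automatically loads the data saved last time instead of recomputing it, which is very convenient.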


A visualization of the trained model:

[Image: visualization of the trained person model]


