
Hello, welcome back.

So, in the last lecture we talked about orientation histograms and how they will be important for object recognition. In this lecture I'm going to give you some more details of how we compute these in practice, and then I'll give you some references to papers to read before the next class. The crucial implementation detail is the following. Images are 2D arrays of numbers, so how does one implement the process of computing derivatives, gradients, etcetera? The solution is that we use discrete convolution. In the continuous setting we define convolution as an integral, but there is an obvious counterpart of the integral, which is a discrete summation. I am not going to go through the details of this, but you can look it up in the Wikipedia entry on convolution, which gives you both the continuous version as well as the discrete version. You can also find a nice exposition in Wolfram MathWorld.

So let's do a little example. Suppose we want to compute derivatives; these are going to be important for computing the gradient. If I want to find the x derivative, then what we do is convolve with the filter shown here, this minus one, zero, one filter. So if I have an image I and I convolve it with this filter, I get a resulting image, and that resulting image will be Ix, in other words the partial derivative of I with respect to x. For computing the y derivative we convolve with a different filter, again minus one, zero, one, but this is a vertical filter, and the result will be a new image which we will call Iy. So essentially we get these two images Ix and Iy corresponding to the partial derivatives with respect to x and y. If I stack together the values of these at a particular pixel, that gives us the gradient. And once we have the gradient, we can compute the orientation at any pixel.

So let me take you through how we do this convolution with minus one, zero, one. Here's an example: I have drawn a very simple image.
It's a four by four image, and it essentially corresponds to a dark left column and then three columns on the right which are slightly brighter. And suppose I want to convolve it with this minus one, zero, one filter, or, as people sometimes call it, a mask. The result is going to be a new array. The convolution process is essentially implemented as "flip and drag," which sounds kind of dumb, but if you look at the formula for convolution, one of the function arguments is inverted. So in this case we will take this minus one, zero, one mask and flip it over. Let's do that: that gives us one, zero, minus one, which I've shown here by these red entries. And now what we have to do is multiply these pointwise, so the one gets multiplied by ten, the zero gets multiplied by twenty, and the minus one gets multiplied by twenty, and then we take the sum of these. That sum is going to give me minus ten. This minus ten is an entry in the new array, which is shown here in green, and I put down this entry, minus ten. So if this was the image I, here I'm building up Ix, the partial derivative of I with respect to x. And this is how I got one entry: I put down the flipped mask, multiply pointwise, and sum. That's the definition of convolution.

Now, how do I fill in the next entry? What I have to do is slide the mask and repeat. Sliding the mask here means that I shift the one, zero, minus one mask slightly, and then I multiply pointwise again and add. So if I multiply pointwise, what do I get? One times twenty, plus zero times twenty, plus minus one times twenty; it all adds up to zero, so this zero gets entered here. And so forth; then I can apply the mask in the second row, and so on. Notice that at the end things kind of fall over, so there are edge effects, because you can't define the values there. In this example the effect is exaggerated because I have a very small image, but in general we'll have a 500 pixel by 500 pixel image and the mask is just three pixels, so we only have to worry about this phenomenon at the borders of the image, which are a small fraction of the whole, so we don't need to worry about it much.
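The flip-and-drag procedure above can be sketched in a few lines of plain Python. This is just an illustrative sketch of "valid" convolution on one row of the four by four image from the example (a dark value of ten on the left, brighter values of twenty on the right); the function and variable names are my own, not from the lecture.

```python
def convolve_row(row, mask):
    """Discrete 1D 'valid' convolution of an image row with a mask:
    flip the mask, then drag it across the row, multiplying pointwise
    and summing at each position. Edge positions are skipped."""
    flipped = mask[::-1]  # flip: [-1, 0, 1] -> [1, 0, -1]
    out = []
    for i in range(len(row) - len(mask) + 1):
        window = row[i:i + len(mask)]
        out.append(sum(f * v for f, v in zip(flipped, window)))
    return out

# The 4x4 image from the example: dark left column, brighter right columns.
image = [
    [10, 20, 20, 20],
    [10, 20, 20, 20],
    [10, 20, 20, 20],
    [10, 20, 20, 20],
]
mask = [-1, 0, 1]  # horizontal derivative mask

# Apply the mask row by row to build up Ix, the x derivative image.
Ix = [convolve_row(row, mask) for row in image]
print(Ix[0])  # first row: [-10, 0], matching the worked example
```

The first position gives one times ten plus zero times twenty plus minus one times twenty, which is minus ten, and the next position gives zero, exactly as computed by hand above. Note the output rows are shorter than the input rows; that is the edge effect mentioned in the lecture.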
So, that's the process of convolution in a discrete setting, and it is implemented in packages such as MATLAB and Octave and so forth, so you don't really have to implement it yourself. I will conclude by giving you two references to read before the next lecture. These are both references which use a representation based on histograms of orientations for object recognition. The first example is for detecting pedestrians. This paper is CVPR 2005 by [inaudible], and I have given you here a place where you can download it. There'll be lots of stuff in this paper you won't be able to understand yet, such as support vector machines; we are going to cover that later. But if you read through it, the important thing to focus on is the details of how to compute the histogram of orientations. In fact we use this term quite often, so the usual abbreviation is HOG. So if I just say HOG, that's what it means: Histogram of Oriented Gradients.

I also want you to read this other paper, which is from my group, on digit recognition. It's essentially the same methodology, but now used for the problem of classifying the digits zero, one, two, three, and so forth up to nine. This is a well studied task, and essentially the same approach works. So here again I'm giving you the reference and a tech report that you can download and read. So before the next lecture, read up on convolution so you are very comfortable with it. Then, by reading these references, you'll get a good handle on what orientation histograms are about and how they are computed, and we'll be all ready to launch our study of recognition. Thank you.
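To make the idea of an orientation histogram concrete before you read the papers, here is a minimal sketch of the core binning step: take the per-pixel gradients Ix and Iy, compute a magnitude and an orientation at each pixel, and accumulate magnitude into orientation bins. This is only the kernel of the idea; the HOG papers add cell grids, block normalization, and interpolation details not shown here, and the choice of nine unsigned bins is an assumption for illustration, not something fixed by the lecture.

```python
import math

def orientation_histogram(Ix, Iy, n_bins=9):
    """Histogram of gradient orientations over a patch, weighted by
    gradient magnitude. Orientations are folded into [0, pi), i.e.
    unsigned gradients. A simplified sketch, not the full HOG pipeline."""
    hist = [0.0] * n_bins
    for gx_row, gy_row in zip(Ix, Iy):
        for gx, gy in zip(gx_row, gy_row):
            magnitude = math.hypot(gx, gy)
            theta = math.atan2(gy, gx) % math.pi  # fold into [0, pi)
            b = min(int(theta / math.pi * n_bins), n_bins - 1)
            hist[b] += magnitude  # strong edges count for more
    return hist

# Toy patch: a vertical edge produces purely horizontal gradients,
# so all the mass should land in the bin containing orientation zero.
Ix = [[-10, 0], [-10, 0]]
Iy = [[0, 0], [0, 0]]
print(orientation_histogram(Ix, Iy))
```

Weighting each vote by gradient magnitude, as done here, means strong edges dominate the histogram; whether and how to weight and normalize is exactly the kind of detail to look for when you read the two references.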
