Let us now see what SURF is.
SURF stands for Speeded Up Robust Features. It is an algorithm which extracts unique keypoints and descriptors from an image. More details on the algorithm can be found in the original paper by Bay et al., and OpenCV's documentation includes a note on its implementation. A set of SURF keypoints and descriptors can be extracted from an image and later used to detect the same object in another image. SURF uses an intermediate image representation called the Integral Image, which is computed from the input image and is used to speed up the calculation of sums over any rectangular area. The value of the integral image at any (x, y) co-ordinate is the sum of all pixel values between the origin and that co-ordinate. This makes the computation time of a rectangular sum invariant to the size of the rectangle, which is particularly useful when working with large images (a minimal sketch of this trick is given below). The SURF detector is based on the determinant of the Hessian matrix, and the SURF descriptor describes how pixel intensities are distributed within a scale-dependent neighborhood of each interest point found by the Fast-Hessian detector.
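As an aside, here is a minimal sketch of the integral image idea in plain C++, using a toy image of ones (in OpenCV the same structure is computed by cvIntegral). The sum of any rectangle, however large, costs just four lookups:

// integral_sketch.cpp -- a minimal sketch of the integral image idea.
// integral(x, y) holds the sum of all pixels above and to the left of
// (x, y), so the sum over any rectangle needs only four lookups.
#include <cstdio>
#include <vector>

int main()
{
    const int W = 6, H = 4;
    // A toy "image" of all ones, so a rectangle's sum equals its area
    std::vector<int> img(W * H, 1);

    // Integral image with an extra leading row and column of zeros,
    // which removes the boundary checks (cvIntegral does the same)
    std::vector<long> integral((W + 1) * (H + 1), 0);
    for (int y = 0; y < H; y++)
        for (int x = 0; x < W; x++)
            integral[(y + 1) * (W + 1) + (x + 1)] =
                img[y * W + x]
                + integral[y * (W + 1) + (x + 1)]  // sum above
                + integral[(y + 1) * (W + 1) + x]  // sum to the left
                - integral[y * (W + 1) + x];       // counted twice, subtract

    // Sum of the rectangle [x0, x1) x [y0, y1) in constant time,
    // no matter how large the rectangle is
    int x0 = 1, y0 = 1, x1 = 5, y1 = 3;
    long sum = integral[y1 * (W + 1) + x1] - integral[y0 * (W + 1) + x1]
             - integral[y1 * (W + 1) + x0] + integral[y0 * (W + 1) + x0];
    printf("rectangle sum = %ld (expected %d)\n", sum, (x1 - x0) * (y1 - y0));
    return 0;
}

This constant-time rectangular sum is what lets SURF evaluate its box-filter approximations quickly at every scale.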
Object detection using SURF is scale and rotation invariant, which makes it very powerful. It also requires none of the long and tedious training needed by cascaded Haar-classifier-based detection. The detection time of SURF is a little longer than that of Haar, but in most situations this is not a problem; it hardly matters if a robot takes a few tens of milliseconds more for detection. Since the method is rotation invariant, it can successfully detect objects in any orientation. This is particularly useful for mobile robots, which may encounter objects at orientations different from the trained image: say, for example, the robot was trained with an upright image of an object and has to detect that object fallen over. Detection using Haar features fails miserably in this case. OK, let us now move from theory to practice, to the way things actually work.
The OpenCV library provides a detection example called find_obj.cpp. It can be found in the OpenCV-x.x.x/samples/c/ folder of the source tar file, where x.x.x stands for the version number. It loads two images, finds the SURF keypoints and descriptors of each, compares them and reports a match if there is one. But this sample code can be a bit tough for beginners, so let us move slowly, step by step. As a first step, we can find the SURF keypoints and descriptors in a frame captured from the webcam. The code for this is given below:
//*******************surf.cpp******************//
//********** SURF implementation in OpenCV*****//
//**loads video from webcam, grabs frames, computes SURF keypoints and descriptors**//
//** and marks them**//
//****author: achu_wilson@rediffmail.com****//
#include <stdio.h>
#include <opencv2/features2d/features2d.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc_c.h>
// Note: on OpenCV 2.4.x, cvExtractSURF and CvSURFParams moved to the
// nonfree/legacy modules, so their headers and libraries are also needed.

int main(int argc, char** argv)
{
    CvMemStorage* storage = cvCreateMemStorage(0);
    cvNamedWindow("Image", 1);
    int key = 0;
    static CvScalar red_color[] = {{0, 0, 255}}; // BGR: red

    CvCapture* capture = cvCreateCameraCapture(0);
    CvMat* image = 0;

    while (key != 'q')
    {
        IplImage* frame = cvQueryFrame(capture);
        if (!frame)
            break;

        // Allocate the grayscale buffer once, on the first frame
        if (!image)
            image = cvCreateMat(frame->height, frame->width, CV_8UC1);

        // Convert the BGR image obtained from the camera into grayscale
        cvCvtColor(frame, image, CV_BGR2GRAY);

        // Sequences for storing the SURF keypoints and descriptors
        CvSeq *imageKeypoints = 0, *imageDescriptors = 0;
        int i;

        // Extract SURF points: Hessian threshold 500, extended descriptors
        CvSURFParams params = cvSURFParams(500, 1);
        cvExtractSURF(image, 0, &imageKeypoints, &imageDescriptors, storage, params);
        printf("Image Descriptors: %d\n", imageDescriptors->total);

        // Draw a circle at each keypoint, with a radius derived from its size
        for (i = 0; i < imageKeypoints->total; i++)
        {
            CvSURFPoint* r = (CvSURFPoint*)cvGetSeqElem(imageKeypoints, i);
            CvPoint center;
            int radius;
            center.x = cvRound(r->pt.x);
            center.y = cvRound(r->pt.y);
            radius = cvRound(r->size * 1.2 / 9. * 2);
            cvCircle(frame, center, radius, red_color[0], 1, 8, 0);
        }

        cvShowImage("Image", frame);
        key = cvWaitKey(30);        // exit the loop when 'q' is pressed
        cvClearMemStorage(storage); // reuse storage instead of growing it forever
    }

    cvReleaseCapture(&capture);
    cvDestroyWindow("Image");
    return 0;
}
The explanation of the code is straightforward. It captures a frame from the camera, then converts it into grayscale (because the OpenCV SURF implementation works on grayscale images). The function cvSURFParams sets the algorithm parameters: here a Hessian threshold of 500 and extended (128-element) descriptors. The function cvExtractSURF extracts the keypoints and descriptors into the corresponding sequences. Circles are then drawn on the frame, centred on each keypoint, with a radius derived from the keypoint size. Below are some images showing the captured keypoints.
- SURF keypoints of my mobile phone
The above picture shows the SURF keypoints detected in a frame of myself holding a mobile phone. The background wall has no strong intensity variations and hence no keypoints exist there. On average, about 125 keypoints are detected in the above image, as shown in the terminal. Below are some more images.
From the above set of images it can clearly be seen that SURF keypoints are pixels whose intensity values differ greatly from those of their immediate neighbors, and through the descriptor a relation between each reliable keypoint and its neighboring pixels is obtained. Once the SURF keypoints and descriptors of two images have been calculated, they can be compared using one of many algorithms, such as nearest-neighbour matching (sketched below) or k-means clustering. SURF is used not only in object detection but also in many other applications, such as 3-D reconstruction.
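As an illustration, here is a minimal sketch of brute-force nearest-neighbour matching between two descriptor sequences produced by cvExtractSURF, modelled loosely on the naive matcher in the find_obj.cpp sample (the Laplacian-sign shortcut used there is omitted for brevity, and the 0.6 ratio is just a common choice):

// match_sketch.cpp -- a minimal sketch of brute-force descriptor matching.
// For each descriptor in the first set, find its two nearest neighbours in
// the second set and accept the match only if the best distance is clearly
// smaller than the second best (the usual ratio test).
#include <opencv2/core/core_c.h>

// Squared Euclidean distance between two descriptors of length 'len'
static double descDist2(const float* a, const float* b, int len)
{
    double d2 = 0;
    for (int k = 0; k < len; k++)
    {
        double d = a[k] - b[k];
        d2 += d * d;
    }
    return d2;
}

// Returns the index in 'descs2' of the match for descriptor 'i' of 'descs1',
// or -1 if the ratio test rejects it.
int findMatch(int i, CvSeq* descs1, CvSeq* descs2)
{
    int len = descs1->elem_size / sizeof(float); // 64 or 128 floats
    const float* d1 = (const float*)cvGetSeqElem(descs1, i);
    double best = 1e30, second = 1e30;
    int bestIdx = -1;
    for (int j = 0; j < descs2->total; j++)
    {
        const float* d2 = (const float*)cvGetSeqElem(descs2, j);
        double dist = descDist2(d1, d2, len);
        if (dist < best)       { second = best; best = dist; bestIdx = j; }
        else if (dist < second) { second = dist; }
    }
    // Accept only if clearly better than the runner-up (0.6 is a common ratio)
    return (best < 0.6 * second) ? bestIdx : -1;
}

A match between the two images can then be declared when a sufficiently large number of keypoints survive the ratio test.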
Here are a few more screenshots of object recognition using SURF:
It finds key features on the hand and the mobile, but not on the walls, because there is no variation in intensities there. How would SIFT behave in such a case?
Dear Acchu,
I have started working on embedded vision using OpenCV. Could you please name a few algorithms which are supported by or have been implemented using OpenCV, for any kind of application? Any reply would be appreciated. Thanks
Hi, thanks for sharing, it's very helpful. I have a question: when I work with the camera I can track an area of the frame, but with a small square, like 10×10 pixels or smaller, I cannot track it. Please help me. Thank you so much. Here is my email: vancdt2612@gmail.com.
Here is my code: https://www.dropbox.com/s/xbxtctp78o9rshy/SURF8.cpp
Nice.. I have implemented SURF for finding the object; for that I am using Java and OpenCV
Thanks
These are the undefined functions I've encountered: CvSURFParams, cvExtractSURF. I also included the nonfree and legacy libraries because I use OpenCV 2.4.5, but I still get this error. Need some help, please.
Hi Achuwilson, my name is Anna and I'm a beginner in computer vision. I want to learn about object recognition, like matching the watch as in the picture in your example above. Would you mind sharing the source code or a post about that? I am looking forward to your reply. Thanks so much
Hi, I used the code which comes by default with the OpenCV examples to demonstrate this
Hi achuwilson, I really need your help. When I used your code I got an error like this: (OpenCV Error: Incorrect size of input array (Non-positive width or height)). It cannot capture from the camera. What is my problem, and which version of OpenCV did you use? Is it OpenCV 2.1? My version is OpenCV 2.1.
Hi, thank you very much for this article
Hi, I'm new to computer vision, but I got a project in which a robot uses vision to detect some objects and get their positions. When I try simple segmentation such as contours, I don't get the results I expected. I need something better, like SURF (I guess). In the code above, where do you load your "template" image? I know this might be a stupid question, but I'm a beginner, so some patience would be great! 🙂
Hi Achu, can you please tell me how to do texture-based feature extraction on an image?
Hi, for texture analysis and classification, the commonly used features include Fourier transforms, edge features, the gray-level co-occurrence matrix, eigenvalues and SURF. Some machine learning/classification technique is then used to classify the textures.
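For illustration, here is a minimal sketch of one of those features: a gray-level co-occurrence matrix computed over a hypothetical toy image, with the contrast feature derived from it (the quantization level and pixel offset are arbitrary choices):

// glcm_sketch.cpp -- a minimal sketch of a gray-level co-occurrence matrix.
// The image is quantized to a few gray levels, then we count how often a
// pixel with level i has a right-hand neighbour with level j; texture
// features such as contrast are computed from the normalized counts.
#include <cstdio>

int main()
{
    const int W = 4, H = 4, LEVELS = 4;
    // A toy 4x4 "image" already quantized to 4 gray levels (0..3)
    int img[H][W] = { {0, 0, 1, 1},
                      {0, 0, 1, 1},
                      {0, 2, 2, 2},
                      {2, 2, 3, 3} };

    // Co-occurrence counts for the offset (dx, dy) = (1, 0), i.e. each
    // pixel paired with its right-hand neighbour
    double glcm[LEVELS][LEVELS] = {{0}};
    int pairs = 0;
    for (int y = 0; y < H; y++)
        for (int x = 0; x < W - 1; x++)
        {
            glcm[img[y][x]][img[y][x + 1]] += 1.0;
            pairs++;
        }

    // Normalize to joint probabilities and compute the contrast feature:
    // contrast = sum over (i, j) of (i - j)^2 * p(i, j)
    double contrast = 0;
    for (int i = 0; i < LEVELS; i++)
        for (int j = 0; j < LEVELS; j++)
        {
            glcm[i][j] /= pairs;
            contrast += (i - j) * (i - j) * glcm[i][j];
        }

    printf("contrast = %f\n", contrast);
    return 0;
}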
Hi Achu,
I want to know how I can match two images, one rotated/translated relative to the other. I think we can apply SURF detection, but after running the above program on both images, what code should I write? It would be a big help if you could provide the exact code. For translation I have already applied optical flow, but when the image is rotated, it fails.
Thanks.
Nice post. Good links. But I have a question I'm hoping you can help with.
I'm following the SURF algorithm, and I get everything except one part. Why does it use a Hessian matrix? I understand WHAT it is, but not WHY it works. What is the Hessian's significance here?
Will you please help me match the imageDescriptors of one frame (say, a reference frame) of a video against another video? Using KNN or SVM in OpenCV.....
Thanks for such a nice post :)
I want to know how we can use these imageKeypoints and imageDescriptors to recognize a particular object in a video, like in the case of Haar cascades, where we store all the features in an XML file and then use it for object detection.
Please help…
We can use these image keypoints to recognize objects. The simplest method is brute-force matching between the descriptors of the object and those of the scene, but it may not be very efficient. A better method is to train a model using machine learning techniques (such as an SVM) on the extracted keypoints.
Thanks!!!! Will you please elaborate on your answer? Please.....
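To sketch the SVM route a little further (this is not the author's actual pipeline): the variable-length set of SURF descriptors must first be reduced to one fixed-length feature vector per image, for example a bag-of-visual-words histogram, and the SVM is trained on those vectors. The feature vectors below are hypothetical placeholders:

// svm_sketch.cpp -- a rough sketch of training an SVM on fixed-length
// feature vectors (e.g. bag-of-visual-words histograms built from SURF
// descriptors). The data here is a hypothetical placeholder; a real
// pipeline must first quantize the descriptors into such vectors.
#include <opencv2/core/core.hpp>
#include <opencv2/ml/ml.hpp>
#include <cstdio>

int main()
{
    const int NUM_SAMPLES = 4, FEATURE_DIM = 3;
    // Placeholder feature vectors: one row per training image
    float trainArr[NUM_SAMPLES][FEATURE_DIM] = {
        {0.9f, 0.1f, 0.0f},  // object class
        {0.8f, 0.2f, 0.1f},  // object class
        {0.1f, 0.2f, 0.9f},  // background class
        {0.0f, 0.3f, 0.8f},  // background class
    };
    float labelArr[NUM_SAMPLES] = {1, 1, -1, -1};

    cv::Mat trainData(NUM_SAMPLES, FEATURE_DIM, CV_32FC1, trainArr);
    cv::Mat labels(NUM_SAMPLES, 1, CV_32FC1, labelArr);

    // Linear SVM, using the OpenCV 2.x ml module
    CvSVMParams params;
    params.svm_type = CvSVM::C_SVC;
    params.kernel_type = CvSVM::LINEAR;
    params.term_crit = cvTermCriteria(CV_TERMCRIT_ITER, 100, 1e-6);

    CvSVM svm;
    svm.train(trainData, labels, cv::Mat(), cv::Mat(), params);

    // Classify a new (placeholder) feature vector
    float queryArr[FEATURE_DIM] = {0.85f, 0.15f, 0.05f};
    cv::Mat query(1, FEATURE_DIM, CV_32FC1, queryArr);
    printf("predicted label: %f\n", svm.predict(query));
    return 0;
}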
How can I compare just 2 images which are saved on my computer?
Thank you very much for sharing your knowledge.
I've run the code following the find_obj.cpp detection example, and I got a good result when I matched 2 images.
But now, I cannot find the value that tells me whether the match between the object image and the scene image succeeded or failed.
Please explain that to me.
Thanks
Hi,
Does any relation exist between the query image descriptors and the training image descriptors from which I can say that the two images don't match? If not, then how can I find out whether the 2 images are the same or not?
Thank You.
Hi Achu, I want to detect an object in a video. Can we use SURF for that? The program detects all the keypoints and descriptors; how could we change it to detect only a rectangular object? Please help me solve this. Thank you.
Hello, great job!! I am a curious student and I like image processing. Could you email me or give me a link to the source code of the matching part? Is it based on scale-invariant features? Or can you give me the detailed algorithm? Is that a possibility?
wangqiuyuehnu@hotmail.com
Thanks a lot!!
Can you share the matching code or give an idea of how to match it?
Hi Rocky, I think he just compared the SURF points of both images; if the SURF points are the same, then the object is recognized. And I think this method is invariant to translation, rotation and scaling.
Can you share the matching code?
Could you share the code for the matching part? I need to match some objects..
Hi Achu, can you help me out with using SURF to detect an image in a video capture?
ok..what help do you need?
Hi,
How do I use the SURF library from here: http://www.vision.ee.ethz.ch/~surf/download.html? I don't know how to use it.
Can you help me please!!!
"Error 3 error LNK2001: unresolved external symbol _cvSURFParams". I get this error. Please help me!!!
I am using SURF to extract features. I need to print the descriptors.. can somebody help me convert the descriptor vector to a matrix? I have used the reshape command, but it's not working.. this is what I have used:
cvExtractSURF( image, 0, &imageKeypoints, &imageDescriptors, storage, params );
and I have used
CvMat desc = CvMat(imageDescriptors).reshape(0, keypoints.size());
But it's not working...
Somebody please reply..
Did you figure out how to print the descriptors? If yes, please tell me.
Hello! I am trying to implement object detection as a hobby project. I came across SURF features and wrote some code based on OpenCV’s tutorials that tries to find my object in the video coming from my laptop webcam.
It works, but the false-positive rate is too high. I mean, even if my object is not present in front of the camera, the program manages to find scores of matches.
Do you know of a remedy?
Thanks!
I have been using the SURF implementation to recognize objects in images.. Can somebody tell me how to get the descriptors themselves, rather than just the number of descriptors?
The extracted SURF descriptors are present in the sequence "imageDescriptors".
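For example, the raw values can be read out of that sequence following the same access pattern the find_obj.cpp sample uses; a minimal sketch:

#include <stdio.h>
#include <opencv2/core/core_c.h>

// Print every SURF descriptor stored in a sequence produced by
// cvExtractSURF (as in the program above). Each element is an array of
// floats: 64 values per keypoint, or 128 when extended descriptors were
// requested in cvSURFParams.
void printDescriptors(const CvSeq* imageDescriptors)
{
    int length = (int)(imageDescriptors->elem_size / sizeof(float));
    for (int i = 0; i < imageDescriptors->total; i++)
    {
        const float* d = (const float*)cvGetSeqElem(imageDescriptors, i);
        printf("descriptor %d:", i);
        for (int k = 0; k < length; k++)
            printf(" %f", d[k]);
        printf("\n");
    }
}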
I am doing a project on object recognition, i.e. to recognize objects in video samples. I tried to use the SURF algorithm to recognize the object, but as the object is not scale invariant I am not able to extract the features accurately.. is there any good algorithm to extract features even if the objects are not scale invariant?
If you don't want the detection to be scale invariant, it would be easier to implement using template matching.
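For instance, here is a minimal template-matching sketch using the same C API as the code above (the image file names are placeholders):

// template_sketch.cpp -- a minimal sketch of template matching with the
// OpenCV C API. Note that this is neither scale nor rotation invariant:
// the template must appear in the scene at roughly the same size and
// orientation as in the template image.
#include <stdio.h>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc_c.h>

int main()
{
    IplImage* scene = cvLoadImage("scene.png", CV_LOAD_IMAGE_GRAYSCALE);
    IplImage* templ = cvLoadImage("template.png", CV_LOAD_IMAGE_GRAYSCALE);
    if (!scene || !templ)
    {
        printf("could not load images\n");
        return 1;
    }

    // The result map has one entry per possible template position
    IplImage* result = cvCreateImage(
        cvSize(scene->width - templ->width + 1,
               scene->height - templ->height + 1),
        IPL_DEPTH_32F, 1);

    // Normalized correlation coefficient: best match has the highest score
    cvMatchTemplate(scene, templ, result, CV_TM_CCOEFF_NORMED);

    double minVal, maxVal;
    CvPoint minLoc, maxLoc;
    cvMinMaxLoc(result, &minVal, &maxVal, &minLoc, &maxLoc, 0);
    printf("best match at (%d, %d), score %f\n", maxLoc.x, maxLoc.y, maxVal);

    cvReleaseImage(&result);
    cvReleaseImage(&templ);
    cvReleaseImage(&scene);
    return 0;
}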
Hey, can you guide me on how to use these features for classification of objects? I am presently working on a project where I need to classify the car in an image according to its make (e.g. Ford, Fiat, Tata etc.). How can I proceed after getting the features from the car image?
Where is part 2?
Hello there. I'm from Brazil too and started using SURF just a few weeks ago. Anyway, I got some faster results using OpenMP with it (about 28% faster extraction). Congratulations on your tutorial, I'm really looking forward to the second part.
Hello, very nice post. When is part 2 coming up?
Thanks.. Part 2 will be posted soon. I am a little busy now with my studies..
Hi, I'm Brazilian, and I'm basing a program for my monograph on your work. I would like an opinion. I need to make a program that recognizes banknotes. Do you recommend that I do this with Haar training? Do you think I will succeed? Otherwise, I would like an opinion on how I could get a good result! Thank you.
Alisson, you should probably use Template Matching for bank-note recognition.
Thank you very much for sharing the code 🙂