The OpenCV Reference Manual, Release 2.4.9.0

CONTENTS

1 Introduction
    1.1 API Concepts
2 core. The Core Functionality
    2.1 Basic Structures
    2.2 Basic C Structures and Operations
    2.3 Dynamic Structures
    2.4 Operations on Arrays
    2.5 Drawing Functions
    2.6 XML/YAML Persistence
    2.7 XML/YAML Persistence (C API)
    2.8 Clustering
    2.9 Utility and System Functions and Macros
    2.10 OpenGL interoperability
3 imgproc. Image Processing
    3.1 Image Filtering
    3.2 Geometric Image Transformations
    3.3 Miscellaneous Image Transformations
    3.4 Histograms
    3.5 Structural Analysis and Shape Descriptors
    3.6 Motion Analysis and Object Tracking
    3.7 Feature Detection
    3.8 Object Detection
4 highgui. High-level GUI and Media I/O
    4.1 User Interface
    4.2 Reading and Writing Images and Video
    4.3 Qt New Functions
5 video. Video Analysis
    5.1 Motion Analysis and Object Tracking
6 calib3d. Camera Calibration and 3D Reconstruction
    6.1 Camera Calibration and 3D Reconstruction
7 features2d. 2D Features Framework
    7.1 Feature Detection and Description
    7.2 Common Interfaces of Feature Detectors
    7.3 Common Interfaces of Descriptor Extractors
    7.4 Common Interfaces of Descriptor Matchers
    7.5 Common Interfaces of Generic Descriptor Matchers
    7.6 Drawing Function of Keypoints and Matches
    7.7 Object Categorization
8 objdetect. Object Detection
    8.1 Cascade Classification
    8.2 Latent SVM
9 ml. Machine Learning
    9.1 Statistical Models
    9.2 Normal Bayes Classifier
    9.3 K-Nearest Neighbors
    9.4 Support Vector Machines
    9.5 Decision Trees
    9.6 Boosting
    9.7 Gradient Boosted Trees
    9.8 Random Trees
    9.9 Extremely Randomized Trees
    9.10 Expectation Maximization
    9.11 Neural Networks
    9.12 MLData
10 flann. Clustering and Search in Multi-Dimensional Spaces
    10.1 Fast Approximate Nearest Neighbor Search
    10.2 Clustering
11 gpu. GPU-accelerated Computer Vision
    11.1 GPU Module Introduction
    11.2 Initialization and Information
    11.3 Data Structures
    11.4 Operations on Matrices
    11.5 Per-element Operations
    11.6 Image Processing
    11.7 Matrix Reductions
    11.8 Object Detection
    11.9 Feature Detection and Description
    11.10 Image Filtering
    11.11 Camera Calibration and 3D Reconstruction
    11.12 Video Analysis
12 photo. Computational Photography
    12.1 Inpainting
    12.2 Denoising
13 stitching. Images Stitching
    13.1 Stitching Pipeline
    13.2 References
    13.3 High Level Functionality
    13.4 Camera
    13.5 Features Finding and Images Matching
    13.6 Rotation Estimation
    13.7 Autocalibration
    13.8 Images Warping
    13.9 Seam Estimation
    13.10 Exposure Compensation
    13.11 Image Blenders
14 nonfree. Non-free Functionality
    14.1 Feature Detection and Description
15 contrib. Contributed/Experimental Stuff
    15.1 Stereo Correspondence
    15.2 FaceRecognizer - Face Recognition with OpenCV
    15.3 Retina: a Bio mimetic human retina model
    15.4 OpenFABMAP
16 legacy. Deprecated Stuff
    16.1 Motion Analysis
    16.2 Expectation Maximization
    16.3 Histograms
    16.4 Planar Subdivisions (C API)
    16.5 Feature Detection and Description
    16.6 Common Interfaces of Descriptor Extractors
    16.7 Common Interfaces of Generic Descriptor Matchers
17 ocl. OpenCL-accelerated Computer Vision
    17.1 OpenCL Module Introduction
    17.2 Data Structures and Utility Functions
    17.3 Data Structures
    17.4 Operations on Matrices
    17.5 Matrix Reductions
    17.6 Image Filtering
    17.7 Image Processing
    17.8 ml. Machine Learning
    17.9 Object Detection
    17.10 Feature Detection And Description
    17.11 Video Analysis
    17.12 Camera Calibration and 3D Reconstruction
18 superres. Super Resolution
    18.1 Super Resolution
19 viz. 3D Visualizer
    19.1 Viz
    19.2 Widget
Bibliography


CHAPTER ONE

INTRODUCTION

OpenCV (Open Source Computer Vision Library: http://opencv.org) is an open-source BSD-licensed library that includes several hundreds of computer vision algorithms. The document describes the so-called OpenCV 2.x API, which is essentially a C++ API, as opposed to the C-based OpenCV 1.x API. The latter is described in opencv1x.pdf.

OpenCV has a modular structure, which means that the package includes several shared or static libraries. The following modules are available:

- core - a compact module defining basic data structures, including the dense multi-dimensional array Mat and basic functions used by all other modules.
- imgproc - an image processing module that includes linear and non-linear image filtering, geometrical image transformations (resize, affine and perspective warping, generic table-based remapping), color space conversion, histograms, and so on.
- video - a video analysis module that includes motion estimation, background subtraction, and object tracking algorithms.
- calib3d - basic multiple-view geometry algorithms, single and stereo camera calibration, object pose estimation, stereo correspondence algorithms, and elements of 3D reconstruction.
- features2d - salient feature detectors, descriptors, and descriptor matchers.
- objdetect - detection of objects and instances of the predefined classes (for example, faces, eyes, mugs, people, cars, and so on).
- highgui - an easy-to-use interface to video capturing, image and video codecs, as well as simple UI capabilities.
- gpu - GPU-accelerated algorithms from different OpenCV modules.
- ... some other helper modules, such as FLANN and Google test wrappers, Python bindings, and others.

The further chapters of the document describe the functionality of each module. But first, make sure to get familiar with the common API concepts used throughout the library.

1.1 API Concepts

cv Namespace

All the OpenCV classes and functions are placed into the cv namespace. Therefore, to access this functionality from your code, use the cv:: specifier or the using namespace cv; directive:

    #include "opencv2/core/core.hpp"
    #include "opencv2/calib3d/calib3d.hpp"  // declares findHomography
    ...
    cv::Mat H = cv::findHomography(points1, points2, CV_RANSAC, 5);
    ...

or:

    #include "opencv2/core/core.hpp"
    #include "opencv2/calib3d/calib3d.hpp"  // declares findHomography
    using namespace cv;
    ...
    Mat H = findHomography(points1, points2, CV_RANSAC, 5);
    ...

Some of the current or future OpenCV external names may conflict with STL or other libraries. In this case, use explicit namespace specifiers to resolve the name conflicts:

    Mat a(100, 100, CV_32F);
    randu(a, Scalar::all(1), Scalar::all(std::rand()));
    cv::log(a, a);
    a /= std::log(2.);

Automatic Memory Management

OpenCV handles all the memory automatically.

First of all, std::vector, Mat, and other data structures used by the functions and methods have destructors that deallocate the underlying memory buffers when needed. This means that the destructors do not always deallocate the buffers, as in the case of Mat.
They take into account possible data sharing. A destructor decrements the reference counter associated with the matrix data buffer. The buffer is deallocated if and only if the reference counter reaches zero, that is, when no other structures refer to the same buffer. Similarly, when a Mat instance is copied, no actual data is really copied. Instead, the reference counter is incremented to record that there is another owner of the same data. There is also the Mat::clone method that creates a full copy of the matrix data. See the example below:

    // create a big 8Mb matrix
    Mat A(1000, 1000, CV_64F);

    // create another header for the same matrix;
    // this is an instant operation, regardless of the matrix size.
    Mat B = A;
    // create another header for the 3-rd row of A; no data is copied either
    Mat C = B.row(3);
    // now create a separate copy of the matrix
    Mat D = B.clone();
    // copy the 5-th row of B to C, that is, copy the 5-th row of A
    // to the 3-rd row of A.
    B.row(5).copyTo(C);
    // now let A and D share the data; after that the modified version
    // of A is still referenced by B and C.
    A = D;
    // now make B an empty matrix (which references no memory buffers),
    // but the modified version of A will still be referenced by C,
    // despite that C is just a single row of the original A
    B.release();

    // finally, make a full copy of C. As a result, the big modified
    // matrix will be deallocated, since it is not referenced by anyone
    C = C.clone();

You see that the use of Mat and other basic structures is simple. But what about high-level classes or even user data types created without taking automatic memory management into account? For them, OpenCV offers the Ptr<> template class that is similar to std::shared_ptr from C++ TR1. So, instead of using plain pointers:

    T* ptr = new T(...);

you can use:

    Ptr<T> ptr = new T(...);

That is, Ptr<T> ptr encapsulates a pointer to a T instance and a reference counter associated with the pointer.
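
The sharing behaviour described above can be illustrated without OpenCV. The following is a minimal sketch, assuming C++11 and using std::shared_ptr as the reference counter. SharedBuffer, clone() and owners() are hypothetical names chosen for illustration only; they are not part of the OpenCV API, and a real Mat header additionally stores size, type, and row/column views.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a matrix header: copying it shares one
// reference-counted buffer, loosely mirroring the behaviour described for Mat.
struct SharedBuffer {
    std::shared_ptr<std::vector<double> > data;

    // Allocates a zero-filled buffer of n elements owned by this header.
    explicit SharedBuffer(std::size_t n)
        : data(std::make_shared<std::vector<double> >(n, 0.0)) {}

    // Deep copy: allocates a new buffer and copies the elements into it,
    // similar in spirit to Mat::clone.
    SharedBuffer clone() const {
        SharedBuffer copy(data->size());
        *copy.data = *data;
        return copy;
    }

    // Number of headers that currently refer to the same buffer.
    long owners() const { return data.use_count(); }
};
```

In this sketch, copy-constructing or assigning a SharedBuffer copies only the pointer, so owners() increases and a write through either header is visible through both. clone() produces an independent buffer, and the shared buffer is freed when the last header referring to it is destroyed.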
See the Ptr description for details.

Automatic Allocation of the Output Data

OpenCV deallocates the memory automatically, as well as automatically allocates the memory for output function parameters most of the time. So, if a function has one or more input arrays (cv::Mat instances) and some output arrays, the output arrays are automatically allocated or reallocated. The size and type of the output arrays are determined from the size and type of the input arrays. If needed, the functions take extra parameters that help to figure out the output array properties.

Example:

    #include "cv.h"
    #include "highgui.h"

    using namespace cv;

    int main(int, char**)
    {
        VideoCapture cap(0);
        if(!cap.isOpened()) return -1;

        Mat frame, edges;
        namedWindow("edges", 1);
        for(;;)
        {
            cap >> frame;
            cvtColor(frame, edges, CV_BGR2GRAY);
            GaussianBlur(edges, edges, Size(7,7), 1.5, 1.5);
            Canny(edges, edges, 0, 30, 3);
            imshow("edges", edges);
            if(waitKey(30) >= 0) break;
        }
        return 0;
    }

The array frame is automatically allocated by the >> operator since the video frame resolution and the bit-depth are known to the video capturing module. The array edges is automatically allocated by the cvtColor function. It has the same size and the bit-depth as the input array. The number of channels is 1 because the color conversion code CV_BGR2GRAY is passed, which means a color to grayscale conversion. Note that frame and edges are allocated only once during the first execution of the loop body since all the next video frames have the same resolution. If you somehow change the video resolution, the arrays are automatically reallocated.

The key component of this technology is the Mat::create method. It takes the desired array size and type. If the array already has the specified size and type, the method does nothing. Otherwise, it releases the previously allocated data, if any (this part involves decrementing the reference counter and comparing it with zero), and then allocates a new buffer of the required size.
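
The allocate-only-when-needed contract just described can be sketched in a few lines of standard C++. This is an illustrative model under stated assumptions, not OpenCV's implementation: Array2D and scale are hypothetical names, the element type is fixed to float (the real method also takes a type), and C++11 is assumed for std::shared_ptr.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical miniature of an output array; not the OpenCV Mat class.
struct Array2D {
    int rows;
    int cols;
    std::shared_ptr<std::vector<float> > data;  // reference-counted buffer

    Array2D() : rows(0), cols(0) {}

    // Models the described contract: keep the buffer when the requested size
    // already matches, otherwise release it and allocate a new one.
    // Returns true only when a new buffer was allocated.
    bool create(int r, int c) {
        if (data && rows == r && cols == c)
            return false;  // already the right size: nothing to do
        data.reset();      // drop our reference; freed if we were the last owner
        data = std::make_shared<std::vector<float> >(
            static_cast<std::size_t>(r) * static_cast<std::size_t>(c));
        rows = r;
        cols = c;
        return true;
    }
};

// Hypothetical output-producing function: it calls create() on its output,
// so callers may pass an empty or a previously used array.
void scale(const Array2D& src, Array2D& dst, float factor) {
    if (!src.data) return;  // empty input: leave dst untouched
    dst.create(src.rows, src.cols);
    for (std::size_t i = 0; i < src.data->size(); ++i)
        (*dst.data)[i] = (*src.data)[i] * factor;
}
```

Calling scale repeatedly with inputs of the same size reuses the buffer of dst, which corresponds to frame and edges being allocated only on the first loop iteration in the example above.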
Most functions call the Mat::create method for each output array, and so the automatic output data allocation is implemented.

Some notable exceptions from this scheme are cv::mixChannels, cv::RNG::fill, and a few other functions and methods. They are not able to allocate the output array, so you have to do this in advance.

Saturation Arithmetics

As a computer vision library, OpenCV deals a lot with image pixels that are often encoded in a compact, 8- or 16-bit per channel, form and thus have a limited value range. Furthermore, certain operations on images, like color space conversions, brightness/contrast adjustments, sharpening, and complex interpolation (bi-cubic, Lanczos), can produce values out of the available range. If you just store the lowest 8 (16) bits of the result, this results in visual artifacts and may affect further image analysis. To solve this problem, the so-called saturation arithmetics is used. For example, to store r, the result of an operation, to an 8-bit image, you find the nearest value within the 0..255 range:

    I(x, y) = min(max(round(r), 0), 255)

Similar rules are applied to 8-bit signed, 16-bit signed and unsigned types. This semantics is used everywhere in the library. In C++ code, it is done using the saturate_cast<> functions that resemble standard C++ cast operations. See below the implementation of the formula provided above:

    I.at<uchar>(y, x) = saturate_cast<uchar>(r);

where cv::uchar is an OpenCV 8-bit unsigned integer type. In the optimized SIMD code, such SSE2 instructions as paddusb, packuswb, and so on are used. They help achieve exactly the same behavior as in C++ code.

Note: Saturation is not applied when the result is a 32-bit integer.

Fixed Pixel Types. Limited Use of Templates

Templates are a great feature of C++ that enables implementation of very powerful, efficient and yet safe data structures and algorithms.
However, the extensive use of templates may dramatically increase compilation time and code size. Besides, it is difficult to separate an interface and implementation when templates are used exclusively. This could be fine for basic algorithms but not good for computer vision libraries where a single algorithm may span thousands of lines of code. Because of this, and also to simplify development of bindings for other languages, like Python, Java, and Matlab, that do not have templates at all or have limited template capabilities, the current OpenCV implementation is based on polymorphism and runtime dispatching over templates. In those places where runtime dispatching would be too slow (like pixel access operators), impossible (generic Ptr<> implementation), or just very inconvenient (saturate_cast<>()), the current implementation introduces small template classes, methods, and functions. Anywhere else in the current OpenCV version the use of templates is limited.

Consequently, there is a limited fixed set of primitive data types the library can operate on. That is, array elements should have one of the following types:

- 8-bit unsigned integer (uchar)
- 8-bit signed integer (schar)
- 16-bit unsigned integer (ushort)
- 16-bit signed integer (short)
- 32-bit signed integer (int)
- 32-bit floating-point number (float)
- 64-bit floating-point number (double)
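
To make the saturation rule and the role of a small template function concrete, here is a minimal sketch in standard C++. It is not OpenCV's saturate_cast: saturate_to is a hypothetical helper, it rounds half up via std::floor(r + 0.5) (the library's tie-breaking may differ), and it is meant only for the 8- and 16-bit integer types in the list above.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper, not OpenCV's saturate_cast: rounds a floating-point
// value to the nearest integer and clamps it to the value range of T.
template <typename T>
T saturate_to(double r) {
    const double lo = static_cast<double>(std::numeric_limits<T>::min());
    const double hi = static_cast<double>(std::numeric_limits<T>::max());
    const double v = std::floor(r + 0.5);  // round to nearest, ties go up
    if (!(v > lo)) return std::numeric_limits<T>::min();  // also catches NaN
    if (v >= hi) return std::numeric_limits<T>::max();
    return static_cast<T>(v);
}
```

For instance, saturate_to<unsigned char>(300.0) yields 255, whereas keeping only the lowest 8 bits of 300 would give 44, which is the kind of visual artifact saturation is meant to avoid. Because the target type is a template parameter, the same few lines serve every 8- and 16-bit type in the list without runtime dispatching.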