# [BlogPost 1]: Implementation of a Shape Matching procedure

In the context of the project  “Ornaments in Print” described in a previous post (see here), we are interested in creating a shape matching procedure in order to classify a large database of medieval ornaments. We investigate in this article a shape matching procedure using shape contexts, inspired from the procedure described in [1].

#### Introduction

Shape matching is an attempt to operationalize our intuitive notion of similarity between shapes: if we consider for example the two handwritten letters in figure 1, they appear rather similar to a human observer, while being radically different if compared as vectors of pixels.

One way to address this issue could be to design a notion of distance between two shapes : then, two shapes that match pretty well will be close in terms of this distance.  To this end, we need first to define in a more rigorous way what is meant by the shape of an object. In mathematical terms, the shape of an object is classically defined as an equivalence class under the group of  similarity transformations. Roughly speaking, this just means that we remove all information concerning location, scale and orientation. Thus, two objects can be said to have the same shape if they are similar in the sense of Euclidean geometry.

However, such a definition is too restrictive for our purposes: such a definition only tells us when two shapes are exactly the same. But in practice, objects rarely have the same shape, due to measurement errors. Therefore, we need need to relax a little this definition to obtain the desired distance notion.

#### Description of the procedure

In the approach we investigate here, we will, as suggested by [1], treat an object as a (possibly infinite) point set and we assume that the shape of an object is captured by a finite subset of its points. In practice, we sample the internal and external contour of the object using an edge detector (see figure 2). This gives us a set $\mathcal{P}=\{p_1,\ldots,p_n\}, p_i\in\mathbb{R}^2$, of $n$ points. These points need not to be landmarks (points of special interest for the considered shape): they can simply be chosen uniformly spaced, which is particularly convenient for an unsupervised shape matching procedure.

Figure 2: Edge detection performed on two “o” letters from different fonts. Top: original letters, bottom: edge detection of the letters.

Then, we face a correspondence problem:  for each point $p_i$ on the first shape, we want to find the “best” matching point $q_i$ on the second shape. This matching can be efficiently performed by using rich local descriptors of the shape. In our case, we will use the shape context, introduced in [1], as a rich local descriptor. The idea is the following :

Figure 3: Set of vectors originating from a point to all other sample points on the “o” shape.

Given a point $p_i$ on the shape, we consider the set of vectors originating from a point to all other sample points on a shape. These vectors express the configuration of the entire point with respect to the reference point (see figure 3). Obviously, as the number of points increase, the description of the shape becomes more precise. But the full set of vectors is much too detailed to be used as a shape descriptor, and we choose a more compact descriptor: we consider the densities of these vectors in the log-polar space.

More specifically, for each point $p_i$ on the shape, the shape context of $p_i$ is defined by the histogram $h_i$  (see figure 4):

$h_i(k)=\#\{q\in\mathcal{P},q\neq p_i : (q-p_i)\in bin(k)\},$

where the bins $bin(k)$ are taken uniformly in the log-polar space.

 Histogram in the log-polar space. Alternative representation of the same histogram.

Then, the problem is to compare the shape context of two points $p_i$ on the first shape and $q_i$ on the second shape. We use the $\chi^2$ test statistic $C_{ij}$ to determine wether or not two histograms (i.e. shape context) are significantly different:

$C_{ij}=C(p_i,q_i)=\frac{1}{2}\sum\limits_{k=1}^K\frac{(h_i(k)-h_j(k))^2}{h_i(k)+h_j(k)}.$

This statistic can be interpreted as the cost of matching $p_i$ and $q_i$.

Then, we draw a bipartite graph, with nodes  the sampled points of the first and second shape. Then,  every points $p_i$ and $q_i$ are linked by an edge with weight $C_{ij}$  if we cannot reject the hypothesis that their two shape context are the same (see figure 5). This gives us for every point on the first shape, a set of potential match on the second one, and respectively (see figure 5).

 We build a bipartite graph matching connecting points of each shape which shape context are not significantly different Potential matching points on the second shape for a point on the first shape.

Finally, we want to obtain the best bipartite graph matching, namely, we want to minimize the total cost matching,

$H(\pi)=\sum\limits_i C(p_i,q_{\pi(i)}),$

with the constraint that the matching be one-to-one,  $\pi$ is a permutation on the neighbors of $p_i$. Applied to the case of the two “o”s we obtain the best matching represented in figure 6.

#### Conclusion and milestones for this semester

We observe that the results obtained are pretty encouraging in the context of matching two simple shapes like the two”o”‘s in the figures above. We need now to test this procedure on more complex shapes such as ornaments, and see how well it performs on it. If the results are satisfactory, then we will try to implement the procedure on C++ to make it more efficient, and will to apply it to a database of ornaments. This will provide us with a graph of shapes (each ornaments) with distance between them (total cost matching). Then, we could design a classifier on this graph (a cluster analysis for example could be an approach).

SIMEONI Matthieu

#### References

[1] S. Belongie,J. malik, J. Puzicha, Shape Matching an Object Recognition using Shape Contexts, IEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 24, April 2002