This version (2017/03/22 17:37) is a draft.

Whitepaper - LTU Image Technologies

1. Introduction

JASTEC France delivers image recognition technologies through its product, LTU Engine. The solution is available either as licensed software or on the hosted platform: LTU Engine ON Premise/ON Demand.

LTU Engine includes two image recognition technologies: Visual Search and Image Processing.
Visual Search is divided into two standard recognition solutions: image matching and visual similarity search.
Image Processing offers a Fine Image Comparison solution.

In this document, you are given details about:

  • image matching technology
  • visual similarity search
  • color search
  • fine images comparison

2. Visual Search Solutions

The visual search technology finds, from a query image, identical or similar visuals in image databases. The search is based on object recognition, shape or color.

Visual similarity search

2.1.1. Overview

The image matching technology is used to find images in a database that are:

  • Exactly the same (e.g. for deduplication)
  • Edited in any way (e.g. for tracking copyrighted images)
  • Photos taken of the same visual content (e.g. for print-to-mobile applications)

Image matching example

To identify matches between a query picture and the images in a database, the technology compares image signatures. An image signature (also called image DNA) is computed from distinct local image features. Each signature is unique.

The visual search process is done in two steps:

  • Indexation: Computing the LTU image DNA and storing it in a reference database. Indexing time thus refers to the time it takes to complete this computation (usually given in time per image).
  • Retrieval: Finding the clones or similar images of a given query image in the reference database. Retrieval time thus refers to the time this computation takes (usually given in time per comparison or number of images that can be searched per second).
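
The two steps above can be illustrated with a minimal sketch. The signature used here is a hypothetical stand-in (a coarse gray-level histogram), not the LTU image DNA, and all function names are assumptions made for the illustration. It only shows the division of work: signatures are computed once at indexation, then compared at retrieval.

```python
from typing import Dict, List, Tuple


def toy_signature(pixels: List[int], bins: int = 4) -> List[float]:
    """Hypothetical stand-in for an image signature: a normalized gray-level histogram."""
    hist = [0] * bins
    for value in pixels:  # gray values expected in 0..255
        hist[min(value * bins // 256, bins - 1)] += 1
    total = float(len(pixels))
    return [count / total for count in hist]


def index_images(images: Dict[str, List[int]]) -> Dict[str, List[float]]:
    """Indexation: compute each signature once and store it in the reference database."""
    return {image_id: toy_signature(pixels) for image_id, pixels in images.items()}


def retrieve(query: List[int], reference: Dict[str, List[float]]) -> List[Tuple[str, float]]:
    """Retrieval: compare the query signature with every stored one, closest first."""
    q = toy_signature(query)
    scored = [
        (image_id, sum(abs(a - b) for a, b in zip(q, signature)))
        for image_id, signature in reference.items()
    ]
    return sorted(scored, key=lambda item: item[1])
```

With this split, indexing time is paid once per reference image, while retrieval time grows with the number of stored signatures that are compared.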

Visual search process

Image matching process

Image matching is highly configurable and thus applicable to a wide range of applications. The technology can be optimized for a given application with the following options:

  • Rotation invariance
  • Thresholds
  • Text removal
  • Flip invariance

The following terms are used to describe matching results:

  • True Positive (tp): Retrieved image that is indeed a clone of the query image.
  • False Positive (fp): Retrieved image that is not a clone of the query image.
  • False Negative (fn): Clone image that has not been retrieved.

Precision and Recall: Used to measure the performance of a system:

  • Precision = tp/(tp+fp)
  • Recall = tp/(tp+fn)
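
As a worked example of the two formulas, the sketch below computes both measures from raw counts. Returning 0.0 when a denominator is zero is a convention chosen for this illustration, not something the definitions prescribe.

```python
from typing import Tuple


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Compute precision = tp/(tp+fp) and recall = tp/(tp+fn)."""
    # Assumption: report 0.0 when nothing was retrieved or nothing was relevant.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall
```

For instance, a query that returns 10 images, 8 of which are clones, while 8 other clones in the database are missed, gives a precision of 8/10 = 0.8 and a recall of 8/16 = 0.5.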

2.1.2. Visual Distance

LTU engine computes a visual distance between the query image and the images in the reference database. It is usually preferable to consider the order (ranking) in which the returned images are presented. The visual distance is normalized such that a visual distance below 1.0 indicates a match. The distance is an indicator of the relevance of the retrieved images: the closer the value is to 0.0, the more visual content the retrieved image shares with the query image. Identical images have a distance of exactly 0.0.

An example of a ranked result is provided below.

In addition, it is also possible to explore matches with a distance greater than 1.0, i.e. matches with low confidence. This can retrieve matches that would otherwise be excluded; however, it also increases the rate of false positives. The example below depicts an image that has been found with a visual distance greater than 1.0.

The distance of this last example is greater than 1.0 because the image represents only a fraction of the query image and thus shares only a small portion of the same visual content. Such images have a low rank and therefore appear last in the results, yet they are likely to be very relevant for many applications using the technology.
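
The way an application might consume these distances can be sketched as follows. The result format (a list of image identifier and distance pairs) and the function name are assumptions made for the illustration; only the 1.0 threshold comes from the text.

```python
from typing import List, Tuple

MATCH_THRESHOLD = 1.0  # from the text: a distance below 1.0 indicates a match


def rank_results(
    results: List[Tuple[str, float]], include_low_confidence: bool = False
) -> List[Tuple[str, float]]:
    """Rank results by increasing distance and optionally keep low-confidence ones."""
    ranked = sorted(results, key=lambda item: item[1])
    if include_low_confidence:
        return ranked  # matches first, then results with a distance of 1.0 or more
    return [item for item in ranked if item[1] < MATCH_THRESHOLD]
```

Keeping the low-confidence results trades a higher retrieval rate for more false positives, so the choice depends on the application.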

2.1.3. Matching Zone

In addition to visual distance, LTU engine can return rich information for a query. For example, LTU engine can return, for each result image, the zones that have matched. This feature is useful:

  • to get visual feedback on the algorithm behavior
  • to implement a custom filtering heuristic (e.g. do not return a result if the matching zone is too small)
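
A filtering heuristic of the kind mentioned above might look like the sketch below. It assumes that a matching zone is available as a bounding box (x, y, width, height) in pixels; the actual format returned by LTU engine is not described here, and the 5% minimum is an arbitrary illustrative value.

```python
from typing import Tuple

Zone = Tuple[int, int, int, int]  # assumed format: (x, y, width, height) in pixels


def zone_is_large_enough(
    zone: Zone, image_width: int, image_height: int, min_fraction: float = 0.05
) -> bool:
    """Keep a result only if its matching zone covers enough of the image."""
    _, _, zone_width, zone_height = zone
    covered = (zone_width * zone_height) / float(image_width * image_height)
    return covered >= min_fraction
```

Such a filter can help discard results that matched only on a small shared element, at the cost of also discarding strongly cropped true matches.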

2.1.4. Image Transformations

A list of typical image transformations is provided below, followed by more detailed descriptions: geometric transformations (crop, resize), structural transformations (overlay, composite images), photometric transformations (brightness, contrast), arbitrary rotations, flips and exact copies.

Transformations: LTU engine’s image matching functionality is robust against several types of image transformations, detecting not only the exact same image, but also modified versions of the original image and object matches (photographs of same object) by computing a unique signature called an Image DNA.

This document illustrates the types of image transformations that LTU engine can handle in order to identify a match. It also details additional features of LTU engine’s image matching technology and finally provides an overview of what the JASTEC France lab is currently working on.

What are image transformations? Image transformations can be broadly divided into several groups:

  • Geometric transformations: Includes scale changes, rotations, translations, flips and projective transformations.
  • Photometric transformations: Includes color space conversions, gray level transformations, changes in hue, brightness and contrast.
  • Filtering effects: Includes noise and smoothing.
  • Recompression: Due to different compression algorithms (such as JPEG) and different image encodings, information can be lost and artifacts may appear.
  • Structural transformations: Includes composite images, i.e. images that have been overlaid on top of each other, overlaid text, added borders and cropping.

Often images may be modified with a combination of the above transformations; however, the LTU image matching technology is robust even in those instances. The image matching technology included in LTU engine uses distinct local image features to compute a unique image signature (also called image DNA). These signatures are used to identify matches between a query image and the reference database.

Geometric Transformations

LTU engine is capable of identifying image matches despite geometric distortions.

  • Resizing of the original image

  • Arbitrary rotations

JASTEC image matching technology supports rotation of images.

  • Projective distortions

Projective distortions appear when an object or a scene is pictured from different viewpoints, also referred to as perspective. JASTEC image matching technology is capable of handling some degree of perspective distortion.

Photometric Transformations

Photometric transformations include grayscale, brightness, contrast and hue. LTU’s Image matching technology can detect matching images regardless of these photometric transformations.

  • Grayscale: Color image converted to shades of gray.
  • Brightness: A change in the luminance of the image pixels. Brightness is the perceived counterpart of luminance, i.e. how light or dark the image appears to an observer.
  • Contrast: The difference between the darkest and the brightest parts of an image.
  • Color change (Hue): Changes in coloration; a hue is a color obtained by mixing the basic colors red, green and blue.

Image Filtering and Noise

Filtering effects are mainly linked with image printing, but also with modifying image metadata. Filtering transformations affect the image 'clarity'. Depending on the filter used, they can either sharpen or blur the image. JASTEC’s image matching technology processes these images without difficulty.

Structural Transformations

Structural transformations relate to changes that affect the structure of the image. These transformations do not limit the matching of images using JASTEC’s image matching technology.

2.1.5. Framed, flipped, text added, cropped

  • Addition of a border or frame: A border of uniform color is added on one, several, or all sides of the image.
  • Flip: Using a particular configuration of LTU’s image matching signature optimized for image tracking applications, the technology is capable of matching flipped images.
  • Addition of text to the image/superimposition: The addition of text to the image with or without a background. With LTU’s technology images are matched regardless of the addition of text.
  • Cropped images: Unneeded portions of an image or a page are cut out or trimmed. Image matching from JASTEC handles cropped images without difficulty.

2.1.6. Composite Images

Definition: A composite image contains several photographs or graphics in one image and often has a modified background or added text. For this kind of transformation, JASTEC’s image matching technology delivers extremely accurate results.

In addition to the visually apparent image transformations detailed above, LTU engine is capable of detecting image clones even if the format or compression of the image has changed. Different image file formats include .bmp, .gif, .jpeg, .pcx, .png, .rsb, .tga, .tif.

JPEG compression: Images are often saved in compressed file formats in order to facilitate faster downloading on the Internet. That compression alters the image slightly, but does not typically impact LTU engine’s ability to identify a match.

LTU engine can detect an image clone even when several of the transformations listed in this document are performed on the same image.

JASTEC’s image matching technology easily matched the above combination, which includes grayscale, blur, re-encoding, projective transformation and overlay composite transformations.

The image matching technology from JASTEC has been optimized to handle query images taken with a mobile device. Due to induced scale changes, motion blur, compression artifacts and usually low quality optics, queries from mobile devices can be challenging to match. JASTEC has developed an image matching DNA that is particularly robust against combinations of these types of transformations. It is recommended, however, to avoid extensive glare, steeply angled shots and very dark lighting, and to frame the object of interest accordingly.

3. Matching Performance

Local and global matching approaches are tested daily on two internal databases, as well as on real feedback and image transformation cases from our clients. One database is used for testing the retrieval capacity (i.e. the queries are transformed versions of images in the reference database) and the other for measuring the error rate (i.e. the queries are not in the reference database).

The examples below present challenges when matching, due to strong cropping with little structure or advanced composite images.

The false positive rate depends on the application into which the image matching technology is integrated.

Since image matching is very prone to detecting small common parts in images such as logos, it sometimes can result in false positives as seen below because parts of the images indeed did match.

3.2.1. When false positives are wanted

Sometimes the image matching signature will detect the same object or scene, but not the same image. According to traditional image recognition research terminology, these instances would be classified as false positives. However, these types of false positives are desirable when performing very fine similarity searches and when the objective is to match photographs taken of objects, which has proven especially relevant to mobile applications.

Sometimes an image may not be indexed. This is due either to an unknown image format (this rarely happens, as LTU engine supports more than 150 file formats) or to missing image information, i.e. uniformly colored images are rejected. Images containing very little information, i.e. having no distinct image features, may be rejected too, such as the image below.

Finally, images with dimensions less than 64×64 pixels are rejected by default (the default setting can be changed).
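
An application can apply similar checks before submitting images for indexation. The sketch below is an illustration only: the function name and pixel format are hypothetical, and it assumes that an image is too small as soon as either side is below the minimum, which the text does not specify.

```python
from typing import Optional, Sequence


def rejection_reason(
    width: int, height: int, pixels: Sequence, min_side: int = 64
) -> Optional[str]:
    """Return why an image would likely be rejected, or None if it passes these checks."""
    # Assumption: either side below the minimum is enough to reject the image.
    if width < min_side or height < min_side:
        return "too small"
    # A single distinct pixel value means the image is uniformly colored.
    if len(set(pixels)) <= 1:
        return "uniform color"
    return None
```

This sketch does not cover images that contain several colors but no distinct features, which may also be rejected.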

If the query image contains text, i.e. the scan of a magazine, a screenshot or a sign, false positives may occur for local matching. However, an optional pre-filtering step can be applied to disregard the textual part of the image, which will result in a decreased false positive rate. For instance, if the query image is a scan of a magazine page, the pre-processing step can be applied to extract only the image of interest.

4. Similarity Search

We provide a solution for finding similar images. By submitting a query image, our technology can find visually similar images. Similarity can focus on the shapes within the image, its color, or both.

The relative importance of the color can be set at each query with a color weight.

  • Color Weight 0: If the color weight is zero, then the algorithm will only focus on the similar shapes.
  • Color Weight 100: With a color weight of 100, the algorithm will only take colors into account when looking for similar images.
  • Color Weight 50: An intermediate value between zero and one hundred indicates that both shapes and colors should be taken into account.
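
One simple way to picture the effect of the color weight is a linear blend of a shape distance and a color distance. This is an assumption made for illustration: how LTU engine actually merges the two scores is not described here.

```python
def blended_distance(shape_distance: float, color_distance: float, color_weight: int) -> float:
    """Blend two distances with a color weight between 0 and 100 (assumed linear)."""
    if not 0 <= color_weight <= 100:
        raise ValueError("color_weight must be between 0 and 100")
    w = color_weight / 100.0
    # Weight 0 keeps only the shape distance, weight 100 only the color distance.
    return (1.0 - w) * shape_distance + w * color_distance
```

Under this assumption, a result with a shape distance of 0.2 and a color distance of 0.8 scores 0.2 at weight 0, 0.8 at weight 100 and 0.5 at weight 50.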

The recommended advanced signature for similarity search is Signature 4.

Signature 4
The new similarity signature, Signature 4, analyzes two characteristics: shapes and colors. These parts are independent and their scores are only merged at the end into the final score of the signature.

  • Shape: Shape similarity is very powerful and can find images at different levels of similarity.
  • Overall shape: The algorithm can find images with similar overall shapes. That means if the query image looks like a ball, we will be able to retrieve other images whose overall shape is a ball as well.
  • Texture: On a finer level, the algorithm is able to detect the kind of texture used in the image. As a result, it finds paintings by the same painter to be similar, if the painter used the same texture techniques on different paintings.
  • Color: The color part of Signature 4 is invariant to scale, rotation or any linear transformation. Color search with Signature 4 is quite flexible and can find images sharing the same colors. It also takes the proportion of colors into account.

The shape part of the signature is currently sensitive to scale and rotation. Subsequent versions are expected to be invariant to scale.

These images illustrate just one example of the retrieval results possible with similarity search:

5. Fine Comparison of Images

Fine image comparison is a specialized technology especially pertinent to media intelligence applications such as advertising identification.

Fine image comparison is designed:

  • To automate the comparison of images which match but which may contain differences
  • To provide additional details on the results of matches. The fine image comparison feature provides visual feedback about matched images including a visual highlight showing where differences are located.

The examples below are typical of the types of images to which Fine Image Comparison is applied:

These two images are identical, except for the pricing details in the lower part of the image. The whitened zones in the image at right indicate the zones in which differences are detected.

The differences between these two images are highlighted in the upper left corner.

The Fine Image Comparison process generates these elements:

  • score: a score is generated which quantifies the visual distance between the two images
  • visual indicators: two analytical images are generated for each fine comparison performed. These analytical images indicate the zones within the images in which there are variations.

In a media intelligence application, Fine Image Comparison is typically used in conjunction with LTU image matching.

  • Unidentified advertisements are compared with a database of known advertisements.
  • Certain ads are identified as definite matches.
  • Other ads are identified as possible matches that need validation (their matching scores may indicate the possibility of variations).
  • Fine Image Comparison is applied to the pairs of possible matches. The score generated by Fine Image Comparison determines whether the possible matches should be classified as definite matches or should be examined in a human validation process.
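
The workflow above can be sketched as a small decision function. All thresholds, the "no match" outcome and the `fine_compare` callable are hypothetical and would have to be tuned or replaced for a real application. The sketch assumes that lower scores mean closer images, in line with the score being a visual distance.

```python
from typing import Callable


def classify_ad(
    match_distance: float,
    fine_compare: Callable[[], float],
    definite_below: float = 0.3,     # hypothetical threshold
    match_below: float = 1.0,        # a distance below 1.0 indicates a match
    fine_accept_below: float = 0.1,  # hypothetical threshold
) -> str:
    """Classify one (unidentified ad, known ad) pair."""
    if match_distance >= match_below:
        return "no match"
    if match_distance < definite_below:
        return "definite match"
    # Possible match: run the more detailed comparison only when needed.
    fine_score = fine_compare()
    return "definite match" if fine_score < fine_accept_below else "human validation"
```

Passing the fine comparison as a callable means it only runs for possible matches, which is the only stage where the workflow applies it.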

6. Clustering

[Not available on LTU engine/ON demand]

Clustering enables the grouping of images within a given set of images. The clustering algorithm is run on each image, which is then grouped with the images that it matches in the set. The result can be presented as a graphical illustration of the complete set of images. A typical result can be seen in the figure below:

Image database presented in clusters

The illustration above provides a close-up of one of the clusters. The image in red represents the cluster center. Clustering facilitates the review and navigation of large datasets.
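
Grouping images by their matches can be pictured with the sketch below, which assumes that a cluster is a connected group of pairwise matches and uses a union-find structure to build the groups. This is a generic technique chosen for the illustration, not a description of the LTU clustering algorithm, and it does not select a cluster center.

```python
from typing import Dict, Iterable, List, Tuple


def cluster_by_matches(
    image_ids: Iterable[str], matched_pairs: Iterable[Tuple[str, str]]
) -> List[List[str]]:
    """Group images so that any two matching images end up in the same cluster."""
    ids = list(image_ids)
    parent: Dict[str, str] = {image_id: image_id for image_id in ids}

    def find(image_id: str) -> str:
        # Follow parent links to the cluster representative, shortening the path.
        while parent[image_id] != image_id:
            parent[image_id] = parent[parent[image_id]]
            image_id = parent[image_id]
        return image_id

    for a, b in matched_pairs:
        parent[find(a)] = find(b)  # merge the two clusters

    clusters: Dict[str, List[str]] = {}
    for image_id in ids:
        clusters.setdefault(find(image_id), []).append(image_id)
    return sorted(sorted(members) for members in clusters.values())
```

Images without any match form clusters of size one.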

7. Color Search

In addition to image matching, LTU engine provides LTU Color Query.

LTU Color Query is a powerful tool that analyzes the colors in an image collection and provides a set of tools to navigate this collection by color. LTU Color Query includes two kinds of functionality:

  • Indexation: This part analyzes the colors of an image collection.
  • Query: This part lets you explore an image collection by color once it has been indexed.

Both parts are fully optimized and let you index a collection of millions of images in a few hours, store it on a standard server or computer and run all kinds of queries on it almost instantly.

Unlike many existing color tools, which require human annotation of the image collection, LTU Color Query analyzes the content of your images and automatically identifies the colors present. The process is fully automatic and also very accurate: LTU engine analyzes the colors that are actually present in the images, not only a rough hue. This accuracy makes it possible to look for very specific color tints in an image collection.

By default, the signature is computed on the whole image. In some specific cases, this behavior can be problematic. For instance, in eCommerce the articles are often shown on a uniform background, so the algorithm considers the background color to be the article’s main color. To tackle this issue, LTU engine introduces a background removal algorithm that identifies uniform backgrounds and computes the signature only on the foreground image. If no uniform background is detected, the signature is computed on the whole image.
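
The fallback logic described above can be illustrated with a deliberately naive sketch. It assumes that a background is uniform when every border pixel has exactly the same color, and it treats every pixel of that color as background. The actual LTU background removal algorithm is not described here and is certainly more elaborate.

```python
from typing import List, Sequence


def foreground_pixels(image: Sequence[Sequence]) -> List:
    """Return the pixels a color signature would be computed on."""
    height, width = len(image), len(image[0])
    border = [
        image[row][col]
        for row in range(height)
        for col in range(width)
        if row in (0, height - 1) or col in (0, width - 1)
    ]
    everything = [pixel for line in image for pixel in line]
    if len(set(border)) != 1:
        return everything  # no uniform background detected: use the whole image
    background = border[0]
    # Naive assumption: any pixel with the background color is background.
    return [pixel for pixel in everything if pixel != background]
```

A limitation of this simplification is that foreground regions sharing the background color are discarded too.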

Images to the right in these two examples show in blue the detected background:

Once LTU engine has indexed an image collection, it is possible to run queries on it. There are four kinds of queries: get image colors, query by color, query by image and compute palette.

7.3.1. Get image colors

For each image in the collection, LTU engine can return the list of colors that are present in this image. This feature is typically used in combination with query by color or query by image to display the colors of the query results.

Colors returned by “get image colors”:

7.3.2. Query by color

With LTU engine you can search in an image collection using a set of colors. For example, LTU engine lets you run a query by color like “pink” or “pink and green”. LTU engine then returns a list of images that have the desired color(s), sorted by relevance. The LTU algorithm is very accurate: it is able to look for very specific tints. It is also very robust: the algorithm returns the images with the required color tints at the top positions, but it also returns images with slightly different tints further down the list (or at the top positions if none of the images contains the required color tints).

Results for a query by color “pink”:

Results for a query by color “pink and green”:

With LTU engine it is also possible to specify the desired color proportions. For instance, you can run a query like ‘look for images with 50% red and 25% yellow’.
Results for a query by color in varying proportions:
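
A query by color with proportions can be pictured as ranking images by how far their color shares are from the requested ones. The sketch below uses named colors and a simple sum of absolute differences. Both are simplifications assumed for the illustration, since the text describes matching on precise tints.

```python
from typing import Dict, List


def rank_by_color(query: Dict[str, float], collection: Dict[str, Dict[str, float]]) -> List[str]:
    """Rank image identifiers by how closely their color shares fit the query."""

    def mismatch(image_colors: Dict[str, float]) -> float:
        # Only the requested colors are scored; other colors in the image are ignored.
        return sum(abs(share - image_colors.get(color, 0.0)) for color, share in query.items())

    return sorted(collection, key=lambda image_id: mismatch(collection[image_id]))
```

For the query ‘50% red and 25% yellow’, an image that is half red and a quarter yellow ranks first regardless of its remaining colors, and images without the requested colors still appear, further down the list.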

7.3.3. Query by image

Once you have found an interesting photo (using a query by color for example) you may want to find similar photos in the collection. That is what query by image is for.

Given an input image, query by image looks for images in your collection that have similar colors.

This feature is useful when:

  • there are too many colors in an image to type them all
  • you do not know a specific color code

7.3.4. Interaction with keywords

Keywords can be assigned to each image in the collection. Keywords can then be used to restrict the query result to some specific categories. For instance it is possible to run a query by color “red with keyword sofa”. Keywords are compatible with Query by color and Query by image.
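
Restricting a result list by keyword can be sketched as a filter applied to an already ranked list. The data layout (a mapping from image identifier to a set of keywords) and the function name are assumptions made for the illustration.

```python
from typing import Dict, List, Set


def filter_by_keyword(ranked_ids: List[str], keywords: Dict[str, Set[str]], required: str) -> List[str]:
    """Keep only the results tagged with the required keyword, preserving the ranking."""
    return [image_id for image_id in ranked_ids if required in keywords.get(image_id, set())]
```

Because the filter preserves order, it can be applied in the same way to the results of a query by color or a query by image.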

Results of a query by color “red with keyword ‘sofa’”:

7.3.5. Compute Palette

LTU Color Query can analyze an image collection and return the most frequent colors. The set of the most frequent colors is what we call a color palette.

The color palettes can be used to:

  • suggest relevant queries to the user (queries that have results)
  • provide a quick overview of an image collection

An interesting feature of palettes is that they can be computed on any subset of a collection.

For instance subsets can be categories. LTU engine can compute a palette for the “Women Shirt” category. This will be different from the whole image collection palette. Some colors that are not present in this category will be removed and LTU engine will introduce color nuances for the most present colors.
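
The idea of a palette computed on a subset can be sketched by counting colors over the selected images. The sketch assumes that each image has already been reduced to a list of named colors, and it does not reproduce the introduction of finer color nuances mentioned above.

```python
from collections import Counter
from typing import Dict, Iterable, List, Optional


def compute_palette(
    image_colors: Dict[str, List[str]], subset: Optional[Iterable[str]] = None, size: int = 3
) -> List[str]:
    """Return the most frequent colors over the whole collection or over a subset."""
    ids = image_colors.keys() if subset is None else subset
    counts = Counter(color for image_id in ids for color in image_colors[image_id])
    return [color for color, _ in counts.most_common(size)]
```

Passing the identifiers of a category, or of a previous result set, as `subset` gives a palette specific to those images.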

These subsets can also be result sets. If they are used to propose queries to the user, this feature can be a powerful tool for query refinement.