Machine vision, artificial intelligence, and the limitations within


A recent article studied the problems deep learning vision algorithms have in classifying images. The major finding was that these neural networks emphasize surface texture over edges, i.e. object shapes. The underlying reason is that the textures contain an order of magnitude more information than the edges, so the texture data dominates the learning. This leaves a lot of room for camouflage: the outline of an aircraft painted with overlapping clock faces was mislabeled as a clock, and a cat painted with elephant skin texture was mislabeled as an elephant.

For classifying images, deep learning vision algorithms are trained with thousands of images that either contain or do not contain the sought elements, such as a cat. The algorithm learns to find patterns, which it then uses to label images it has never seen before. The researchers then improved the learning algorithms by splitting the images into patches which were analyzed individually.
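
The article summarized here does not say exactly how the researchers split the images, so the following is only a minimal sketch of the general idea, assuming non-overlapping square patches of a grayscale image. The function name `split_into_patches` and the toy image are hypothetical, and the classifier that would analyze each patch is left out.

```python
from typing import List

Image = List[List[int]]  # grayscale image as rows of pixel values


def split_into_patches(image: Image, patch_size: int) -> List[Image]:
    """Split an image into non-overlapping square patches, row by row.

    Edge pixels that do not fill a whole patch are dropped.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    patches = []
    for top in range(0, height - patch_size + 1, patch_size):
        for left in range(0, width - patch_size + 1, patch_size):
            # Cut the same column range out of each row of the patch.
            patch = [row[left:left + patch_size]
                     for row in image[top:top + patch_size]]
            patches.append(patch)
    return patches


# A 4x4 toy image split into four 2x2 patches.
toy_image = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
patches = split_into_patches(toy_image, patch_size=2)
print(len(patches))  # 4
print(patches[0])    # [[0, 1], [4, 5]]
```

Because each patch only shows a small piece of the object, a model looking at patches one by one has less opportunity to let one dominant texture decide the label for the whole image.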

“Neural network architecture allows for integration of different features for decision-making. It doesn’t automatically happen, however. Removing unwanted deep-rooted biases is possible, if not easy.”

This brings to mind how the human visual system works. Leaving aside the differences in photosensors and the image processing that happens in them, our brains do not work as a single neural network analyzing the images provided by the two eyes. The data is split, and the image is analyzed in different regions of the brain for different features: motion detection, edge detection leading to shape recognition, color hue recognition, etc. All this information is also compared and merged with visual memory. Most of what we think we see is actually recalled from memory; our vision is sharp (as we perceive it) only in the fovea of the retina, which senses the spot we are focusing on. There may be good reasons why evolution has split the process into different blocks.

Texture, of course, also affects humans. Camouflage paint and clothing have been used to break up the shapes of ships and soldiers to make them harder to spot. For the simpler visual systems of insects, simpler means seem to work: the striped texture of zebras has been found to disrupt the eyesight of horseflies. When they get close enough to land on a zebra, they cannot find the surface and either fly past or bump into it, unable to land. This has been demonstrated by dressing humans and horses in zebra stripes.

Checker shadow illusion

The checker shadow illusion. Although square A appears a darker shade of grey than square B, in the image the two have exactly the same luminance. Source: Wikipedia. Original: Edward H. Adelson, vectorized by Pbroks13. 


The complex operation of the human visual system has caused some problems for the display manufacturing industry. About twenty years ago, the industry moved from CRT displays to flat panels, such as LCDs and OLEDs. Quality control of a CRT display surface was relatively simple, whereas in flat panels all the pixels are essentially individual entities. Because human vision actively and effectively adapts to changes in brightness and color hue (see the checker shadow illusion above), a human quality controller could not easily see whether the color of a large display was uniform from corner to corner. Using the emerging digital photography technologies had issues, too, as possible quality problems such as optical illusions were not recorded by standard visual inspection cameras. The cameras and image processing software had to simulate the human visual system to be able to see quality problems (such as the grid illusion) that could cause headaches in humans.
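
Unlike the eye, a camera does not adapt to its surroundings, so corner-to-corner uniformity can be expressed as a number. The following is a minimal sketch, assuming a luminance map has already been captured from the display as rows of numeric values. The zone grid, the min/max ratio, and the function names are illustrative assumptions, not an industry-standard metric and not something taken from the display manufacturers mentioned above.

```python
from statistics import mean
from typing import List

LuminanceMap = List[List[float]]  # measured luminance per camera pixel


def zone_means(luminance: LuminanceMap, rows: int, cols: int) -> List[List[float]]:
    """Average the luminance inside each cell of a rows x cols grid."""
    zone_h = len(luminance) // rows
    zone_w = len(luminance[0]) // cols
    means = []
    for r in range(rows):
        row_means = []
        for c in range(cols):
            values = [luminance[y][x]
                      for y in range(r * zone_h, (r + 1) * zone_h)
                      for x in range(c * zone_w, (c + 1) * zone_w)]
            row_means.append(mean(values))
        means.append(row_means)
    return means


def uniformity_ratio(luminance: LuminanceMap, rows: int, cols: int) -> float:
    """Return darkest zone mean divided by brightest zone mean (1.0 = uniform).

    Assumes the brightest zone has non-zero luminance.
    """
    flat = [m for row in zone_means(luminance, rows, cols) for m in row]
    return min(flat) / max(flat)


# Toy 4x4 measurement where the bottom-right corner is dimmer.
measured = [
    [100, 100, 100, 100],
    [100, 100, 100, 100],
    [100, 100, 80, 80],
    [100, 100, 80, 80],
]
ratio = uniformity_ratio(measured, rows=2, cols=2)
print(ratio)  # 0.8
```

A human observer would likely adapt to such a gradual dimming and miss it, while the ratio flags it directly. Detecting perceptual effects such as the grid illusion needs a model of human vision and is beyond this sketch.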

Most of the visual inspection processes required by quality control are simpler in nature, however. In the pharmaceutical industry, examples include inspecting the number of pills in blister strips, the quality of the printed serialization codes on carton packages or bottles, and the readability of human-readable expiry date information. Medical devices also have codes, numbers, and scales whose readability must be inspected. As the context is well specified, the machine vision system is only required to measure whether the print quality is consistent with the required standards. One can, however, find more uses for the visual inspection data, such as using the noise intrinsic to the printing process as a unique fingerprint, adding robustness against falsification and malicious QR code attacks.
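
To illustrate how simple such a well-specified check can be, here is a minimal sketch of counting pills in a blister strip. It assumes the camera image has already been thresholded into a binary mask where 1 marks pill pixels. The function name `count_blobs`, the toy mask, and the expected count are hypothetical, and a production system would also need to handle lighting, broken pills, and touching objects.

```python
from collections import deque
from typing import List


def count_blobs(mask: List[List[int]]) -> int:
    """Count 4-connected groups of non-zero pixels using breadth-first search."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    count = 0
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or seen[y][x]:
                continue
            # Found an unvisited pill pixel: flood-fill the whole blob.
            count += 1
            seen[y][x] = True
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return count


# Toy mask of a strip that should hold three pills.
blister_mask = [
    [1, 1, 0, 0, 1],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
]
expected_pills = 3  # assumed specification for this strip
found = count_blobs(blister_mask)
print(found, "OK" if found == expected_pills else "REJECT")  # 3 OK
```

Because the expected count is known in advance, the inspection reduces to comparing one measured number against the specification.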

Servicepoint offers the services of an experienced full-service automation integration company: device assembly and packaging, machine vision & quality inspection, robotics, and serialization & traceability. Call us, we can help you!

Servicepoint Oy — the reliable partner for the manufacturing industry

Iiro Jantunen
CTO, Chief Technology Officer
Servicepoint Kuopio Oy