How To Hide From Automated Surveillance Cameras?
In recent years, interest in machine learning models has increased, including for the recognition of visual images and faces. Although the technology is far from perfect, it already allows you to calculate criminals, find profiles on social networks, track changes and much more.
Simen Thys and Wiebe Van Ranst proved that by making only minor changes to the input information of the convolutional neural network, you can replace the final result. In this article, we will look at visual patches for conducting attacks on recognition.
The first attacks on recognition systems were small changes in the pixels of the input image to deceive the classifier and derive the wrong class.
The goal was to create a patch capable of successfully hiding a person from the detector. The result was an attack scheme that could be used, for example, to bypass surveillance systems. Attackers can sneak unnoticed, holding in front of him a small cardboard tablet with a “patch” directed towards the surveillance camera.
The development of convolutional neural networks (CNN) has led to tremendous advances in computer vision. The data-driven end-to-end pipeline, in which SNAs are trained on images, showed the best results in a wide range of computer vision tasks. Because of the depth of these architectures, neural networks are able to study the most basic filters at the bottom of the network (where the data comes in) to achieve abstract high-level functions at the top. For this typical SNA contains millions of studied parameters. Although this approach leads to very accurate models, interpretability decreases dramatically.
In studies, various images were used to deceive observation systems, including abstract “noise” and blurring.
To create the patch, the original image was used, which underwent the following transformations:
- Rotate 20 degrees;
- Noise overlap;
- Blur;
- Brightness modification;
- Contrast modification.
Researchers conducted many Inria tests to identify the best “concealment” of a person.
To achieve the desired effect, the image of 40×40 centimeters (which is indicated in the expert report by the word patch) must be placed in the middle of the camera detection box and be constantly in its field of view. Of course, this method will not help a person to hide a face, however, the algorithm for detecting people, in principle, will not be able to detect a person in a frame, which means that subsequent recognition of facial features will not be launched either.
As a demonstration, researchers published a video demonstration of visual patch capabilities: