Highly advanced artificial intelligence (AI) and convolutional neural network (CNN) technology has made the automatic detection of a range of objects possible. Still, it will never be possible to fully eliminate erroneous classifications—one reason the reliability of automatic image processing must continue to improve. The correct classification of objects is a matter of life or death in autonomous driving, and this requires a deeper understanding of decision-making processes within the neural networks.
Pedestrian recognition: Research forms basis for improving road safety
Automatic image recognition is only capable of identifying objects—such as people—with a certain degree of probability. Errors in object recognition pose a major safety risk in autonomous driving. Photo: ARRK Engineering
Gaining a better grasp of these mechanisms is the only way to reduce misclassification to a minimum and comply with the ISO 26262 and ISO/PAS 21448 safety standards, which aim, for example, to reduce the number of unknown or unsafe scenarios. That’s why, as part of its research activities, ARRK Engineering has developed the foundations of a framework for better understanding how CNNs work, and ultimately improving their object classification capabilities. This framework makes it easy to identify and eliminate vulnerabilities in a CNN, thus minimising the risk of errors and accidents caused by incorrect classifications.
There is a strong push in the automotive industry to develop better advanced driver assistance systems (ADASs) with the help of, for instance, new hardware with more efficient and robust sensors or more powerful algorithms. In the ample research being conducted in this area, the recognition rate in automatic image processing is of central importance.
“What’s key to autonomous driving is that the algorithms for object recognition work fast and yield a minimal error rate,” explains Václav Diviš, senior engineer for ADAS & Autonomous Driving at ARRK Engineering. “But it will only be possible to develop optimal safety features for autonomous driving once we have understood neural networks down to the last detail. The ISO 26262 and ISO/PAS 21448 standards provide the general framework for this, and it will be especially important to ensure the development processes and evaluation metrics are uniform.”
To achieve this goal, ARRK Engineering has established an evaluation framework for machine learning in the form of software as part of its research activities. This software will enable deeper insight into the recognition process of neural networks. From there, it will be possible to optimise algorithms and improve automatic object recognition. The experiment also served to gain a better understanding of how neural networks work.
Training the neural network
The first step was to select a reliable generative adversarial network (GAN) architecture, consisting of two neural networks (one generator and one discriminator), to provide a basis for the framework and to augment the dataset. In this phase, the dataset used comprised more than 1,000 photos of pedestrians. “Additional images were generated using the GAN to extend the dataset,” Diviš explains.
“The GAN’s generator synthesised an image and the discriminator assessed the quality of this image. The interaction between these two neural networks allowed us to extract the features from the original objects, generate new images and extend the original dataset relatively easily.”
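The interplay between generator and discriminator can be illustrated with the loss terms that are commonly used to train a GAN. The following is a minimal sketch, not ARRK's implementation: the article does not say which loss functions were used, so binary cross-entropy for the discriminator and the non-saturating generator loss are assumptions, and all function names and probability values are hypothetical.

```python
import math

EPS = 1e-12  # guards against log(0)


def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy: genuine images are labelled 1, generated images 0.

    d_real / d_fake hold the discriminator's "is genuine" probabilities
    for a batch of real and a batch of generated images.
    """
    real_term = -sum(math.log(p + EPS) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p + EPS) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake):
    """Non-saturating generator loss: low when generated images pass as genuine."""
    return -sum(math.log(p + EPS) for p in d_fake) / len(d_fake)


# Hypothetical discriminator outputs, chosen only for illustration.
confident = discriminator_loss(d_real=[0.9, 0.8], d_fake=[0.1, 0.2])
fooled = discriminator_loss(d_real=[0.5, 0.5], d_fake=[0.5, 0.5])
print(f"discriminator loss when confident: {confident:.3f}")
print(f"discriminator loss when fooled:    {fooled:.3f}")
print(f"generator loss, fakes rejected: {generator_loss([0.1, 0.2]):.3f}")
print(f"generator loss, fakes accepted: {generator_loss([0.8, 0.9]):.3f}")
```

The two losses pull in opposite directions: the discriminator is rewarded for telling genuine and generated images apart, while the generator is rewarded when its images are scored as genuine.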
Then, the classification network was trained on the original dataset and the test results were evaluated. To achieve the best possible results, ARRK used state-of-the-art architectures for all elements in the experiment.
“The generalisation of the object represents a challenge in image processing. The basic question is: What defines a pedestrian?” says Diviš. “This can be easily answered by humans, since we generalise inductively. Neural networks, on the other hand, work deductively and require numerous examples to identify a specific object.”
Furthermore, it is important to observe “corner cases”: special cases in which pedestrians are not recognised because of a pedestrian’s unusual posture, an obstruction blocking a sensor’s view, or poor lighting due to weather conditions. Datasets typically lack suitable image material to classify these exceptional cases, but thanks to the GAN structure that has been established, ARRK has managed to supplement the dataset with computer-generated images and thus mitigate this problem.
GAN training
To train the neural networks in this framework, the GAN’s generator used an existing external dataset comprising photos of pedestrians to create computer-generated images. The discriminator then distinguished between genuine and generated images. This interaction will improve the generator’s ability to produce realistic images over time. Image: ARRK Engineering
Optimisation of object classification processes
ARRK then began comprehensive tests to gain a deeper understanding of the processes that underlie CNN training, focusing particularly on the filtering of object attributes as well as the depiction of regions of interest (ROI) in the image area being examined. The emergence of the kernel weights and the resulting ROI are essential for finding evaluation metrics and thus for automated object classification. In their analyses, experts looked at a number of processes that occur in neural networks and examined approaches to understanding the neurons’ flow of information.
“Some neurons are more associated with the identification of pedestrians and produce stronger responses than others,” explains Diviš. “That’s why we’ve tested a range of scenarios in which we deactivated certain neurons to see how they influence decision-making processes. We could confirm that not every neuron responsible for identifying pedestrians needs to be activated, and in fact removing some neurons can even lead to quicker and better results.”
The framework that was created can be used to analyse these types of changes.
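The kind of deactivation experiment described above can be sketched on a toy network. This is an illustration under stated assumptions, not ARRK's framework: the two-layer network, its weights, the input, and the names `forward` and `ablation_report` are all hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, w_out, disabled=frozenset()):
    """Return a 'pedestrian' score in (0, 1).

    Hidden neurons whose index is in `disabled` are forced to output 0.
    """
    hidden = []
    for j, weights in enumerate(w_hidden):
        activation = max(0.0, sum(w * xi for w, xi in zip(weights, x)))  # ReLU
        hidden.append(0.0 if j in disabled else activation)
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)))


def ablation_report(x, w_hidden, w_out):
    """Map each hidden neuron to the score change caused by deactivating it."""
    baseline = forward(x, w_hidden, w_out)
    return {
        j: forward(x, w_hidden, w_out, disabled={j}) - baseline
        for j in range(len(w_hidden))
    }


# Hypothetical toy weights and input, chosen only for illustration.
W_HIDDEN = [[1.0, 0.5], [0.2, -0.3], [-1.0, 1.0]]
W_OUT = [2.0, 0.0, -1.0]
X = [1.0, 2.0]

report = ablation_report(X, W_HIDDEN, W_OUT)
for neuron, delta in report.items():
    print(f"neuron {neuron}: score change {delta:+.4f}")
```

In this toy setup, deactivating neuron 0 lowers the score sharply, neuron 1 has no influence, and deactivating neuron 2 slightly raises the score, which mirrors the observation that some neurons can be removed without harm.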
All of this allows the stability of algorithms to be sustainably increased, which will serve to make autonomous driving safer. Precautions could be taken, for example, to reduce the risk of an “adversarial attack”: the external deployment of malicious code disguised as a neutral image to compromise the neural network. This code generates a disturbance and influences the decisions of certain neurons, making it impossible to correctly recognise objects. The effects of these types of external disruptions could be reduced by removing inactive neurons, as this would provide fewer targets to attack in the neural network.
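In the research literature, such attacks usually take the form of a small, deliberately chosen perturbation added to the input image. The sketch below shows one textbook variant, the fast gradient sign method (FGSM), applied to a tiny logistic classifier. The article does not name a specific attack, so FGSM, the model, its weights, and the input values are all assumptions made for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(x, w, b):
    """Probability that input x shows a pedestrian (toy logistic model)."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def sign(v):
    return (v > 0) - (v < 0)


def fgsm(x, w, b, label, eps):
    """Shift every input value by +/- eps in the direction that raises the loss.

    For a logistic model with cross-entropy loss, the gradient of the loss
    with respect to the input is (p - label) * w.
    """
    p = score(x, w, b)
    grad = [(p - label) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]


# Hypothetical model parameters and input features.
W = [1.5, -2.0, 0.5]
B = 0.1
x_clean = [0.6, 0.1, 0.4]
EPSILON = 0.3

x_adv = fgsm(x_clean, W, B, label=1, eps=EPSILON)
print(f"clean score:       {score(x_clean, W, B):.3f}")
print(f"adversarial score: {score(x_adv, W, B):.3f}")
```

Each input value moves by at most `EPSILON`, yet the score drops from above 0.5 to below it, so a classifier with a 0.5 threshold would no longer report a pedestrian.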
“We will never be able to guarantee correct object classification 100% of the time,” says Diviš. “In the automotive industry, our job is to identify and better understand vulnerabilities in neural networks. Only by doing so can we take efficient counteractive measures and ensure maximum safety.”
A system’s object classification capabilities can also be improved immensely through the evaluation and combination of various data collected by sensors such as cameras, lidar, and radar.
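One simple way to combine detections from several sensors is to add up their log-odds. The sketch below assumes the sensors are conditionally independent and that each reports a probability strictly between 0 and 1. The article does not describe a fusion method, so this approach, the function name `fuse_detections`, and the example values are hypothetical.

```python
import math


def fuse_detections(probabilities, prior=0.5):
    """Fuse per-sensor detection probabilities by summing log-odds.

    Assumes conditionally independent sensors and probabilities
    strictly between 0 and 1 (illustrative simplification).
    """
    def logit(p):
        return math.log(p / (1.0 - p))

    prior_logit = logit(prior)
    total = prior_logit + sum(logit(p) - prior_logit for p in probabilities)
    return 1.0 / (1.0 + math.exp(-total))


# Hypothetical per-sensor confidences for one object: camera, lidar, radar.
readings = [0.7, 0.6, 0.55]
fused = fuse_detections(readings)
print(f"fused pedestrian probability: {fused:.3f}")
```

Three moderately confident sensors that agree yield a fused probability higher than any single reading, while conflicting readings pull the result back towards the prior.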