Imagine the following scenario: human drivers and an autonomous car arrive at a four-way intersection with stop signs. At best, each human driver yields to the others and then makes their turn, but what does the autonomous vehicle do? Does it wait for everyone else to clear the intersection because the human drivers aren’t coming to a full stop? Now imagine another scenario: vehicles come to an intersection amid construction work. The human drivers are being waved through, but what happens to the autonomous vehicle? How can the car possibly know what to do and how to handle the situation?
The root of the problem with autonomous vehicles, and the reason they’re not readily available today, is that you can train a neural network to reach 95% accuracy with reasonable effort, but it takes many more miles of driving data to push that accuracy up to 99%. An edge case is a situation that occurs only rarely yet still requires specific attention and must be handled in a reasonable and safe manner. This is where the problem lies: neural networks have an inherent weakness in generalizing to those special corner cases.
Did you know that a human driver has an accident approximately every 165,000 miles? This implies that in 99% of cases, a human driver gets it right when driving by themselves. But when it comes to autonomous vehicles, regulators demand even better safety results than human drivers achieve, and this is why edge cases are so crucial to the success of the autonomous driving industry. It’s going to come down to scenarios like the four-way intersection, where the autonomous vehicle may be the only one abiding by the traffic rules.
Understanding Edge Cases
When it comes to autonomous driving, we’re dealing with life-critical situations that demand validation. Common edge cases include difficult weather conditions, pedestrians or cyclists behaving out of the norm, and objects on the road. The trickier part is the gap between how a human perceives an unusual occurrence and how a machine does: there can be slight differences in the way a machine registers an image.
The goal is for autonomous vehicles to eventually drive better than humans, and achieving that level of ability requires thousands of edge cases. These need to cover both kinds of scenario: those that are likely and those that are highly unlikely. Both are equally important when dispatching millions of vehicles onto the road.
How to Deal With Edge Cases
Adding millions of images isn’t what will improve your model; choosing images that add information is. Modeling the scene to deal with edge cases from day 1 also requires a different modeling architecture than the one used in most systems today. The challenge is so significant that, to date, no AV company has solved it in a reliable manner.
Regardless of the modeling architecture, one still needs those edge cases in the AI training set. AV companies probably have many edge cases in their databases, yet these are probably not entering the training flow. This is why scaling is such a challenge for many companies: they need to select the data that carries a high level of information rather than just add massive amounts of data.
But how can you get the most out of your budget and choose the images to best help your model learn?
Look for what you’re not familiar with. The most basic answer is to use an auto-encoder to identify anomalies; this is already a much better approach than randomly sampling data during collection. Once each sample is encoded into a feature vector, you can select data based on the information it carries in order to achieve optimal coverage. Additionally, your model is already familiar with your data, so you can greatly improve the quality of the vectors used to map the information coverage: the features your existing models compute can be used (even on unseen data) to generate feature vectors for smarter selection. Again, the problem here is the scarcity of edge cases. But once you’ve found some, they will expose different features that are already present in your existing models, and you can then train your model to detect these instances.
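The two ideas above, scoring samples by how poorly an auto-encoder reconstructs them and then selecting feature vectors for coverage, can be sketched roughly as follows. This is a minimal illustration and not any AV company’s pipeline: `encode` and `decode` are hypothetical stand-ins for a trained auto-encoder (here a fixed linear projection of 2-D samples), the sample values are made up, and greedy farthest-point selection is just one simple way to approximate "optimal coverage".

```python
import math

def encode(x):
    # Hypothetical stand-in for a trained encoder: project a 2-D sample
    # onto the direction (1, 1) / sqrt(2), giving a 1-D code.
    return (x[0] + x[1]) / math.sqrt(2)

def decode(z):
    # Matching decoder: map the 1-D code back into 2-D space.
    return (z / math.sqrt(2), z / math.sqrt(2))

def anomaly_score(x):
    # Reconstruction error: large when the sample is unlike what the
    # auto-encoder is able to represent.
    return math.dist(x, decode(encode(x)))

def select_for_coverage(vectors, k, start=0):
    """Greedy farthest-point selection.

    Returns the indices of up to k vectors, beginning with `start` and
    repeatedly adding the vector farthest from everything chosen so far.
    """
    if not vectors or k <= 0:
        return []
    chosen = [start]
    # Distance from each vector to its nearest already-chosen vector.
    nearest = [math.dist(v, vectors[start]) for v in vectors]
    while len(chosen) < min(k, len(vectors)):
        idx = max(range(len(vectors)), key=nearest.__getitem__)
        chosen.append(idx)
        nearest = [min(d, math.dist(v, vectors[idx]))
                   for d, v in zip(nearest, vectors)]
    return chosen

# Made-up 2-D feature vectors; most lie near the line y = x.
samples = [(1.0, 1.0), (2.0, 2.1), (3.0, 3.2), (0.0, 4.0), (1.1, 1.0)]

scores = [anomaly_score(s) for s in samples]
most_anomalous = max(range(len(samples)), key=scores.__getitem__)

# Seed the selection with the most anomalous sample, then spread out.
picked = select_for_coverage(samples, k=3, start=most_anomalous)
print(most_anomalous, picked)
```

Seeding the selection with the highest-scoring sample is a design choice made for this sketch; in practice the feature vectors would more likely come from a trained auto-encoder or from intermediate layers of an existing model, as described above.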
According to McKinsey’s Senior Partner Asutosh Padhi:
This essentially means you can’t accomplish this through physical validation and testing alone. Instead, you’ll need to “borrow” software-based simulation techniques from other industries in order to reach the level of validation you require.
The Bottom Line
In the autonomous vehicle space, it’s all about safety and foreseeing possible collisions or obstacles. Generating numerous datasets is crucial when it comes to edge cases, because it helps ensure the vehicle can safely handle unusual or out-of-the-norm conditions. An edge case may look harmless, yet it can create serious problems. For instance, in the self-driving world, misidentifying a person as an object is catastrophic. Accurate obstacle detection, regardless of lighting or weather conditions, is therefore essential to avoiding collisions. The process of generating many datasets is extremely labor-intensive and requires human involvement, which is crucial for edge cases. Find out how Foresight Automotive, a leader in the autonomous vehicle industry, eliminates failed detections with automation.