On June 9, 2020, DENSO Tech Links Tokyo #7, an event organized by DENSO Corporation, was held as a webinar.
The theme was 'Human Drivers and AI-Automated Driving from the Viewpoint of Human Characteristics.' DENSO employees who play a key role in advanced technology talked about the development of automated driving technology and research on AI, taking human characteristics into account, to realize new mobility. The next speaker was Naoki Ito, General Manager of the Applied AI R&I Dept., AI R&I Div., Advanced Research and Innovation Center. He introduced an R&D project that uses tracking, free-viewpoint image synthesis, and a DNN accelerator to achieve AI-based automated driving.
Four Automated Driving Levels and Future Mission
Naoki Ito: I will talk about DENSO's initiatives in automated driving from the viewpoint of R&D on AI.
The theme of my presentation is DENSO's initiatives on advanced driver assistance systems (ADAS) and automated driving (AD) systems. These will become a global trend in the future.
The vertical axis shows the target vehicles, including passenger cars or so-called 'privately owned vehicles,' commercial vehicles such as trucks, and shared and service cars including taxis and small buses. The horizontal axis shows the AD/ADAS levels. The level increases from left to right.
Automated driving is categorized into four levels (Level 1 to 4). Active safety, which is shown on the left, and about half of ADAS/AD functions fall under Level 1 or 2. At Levels 1 and 2, the driver is basically responsible for driving, and the system supports the driver. At Levels 3 and 4, the system performs the driving task. At Level 3, the driver must take over driving when the system cannot handle the situation.
At Level 4, the fully automated driving system achieves driverless driving of small vehicles. Automated valet parking will also be required, though this may fall outside the definition of automated driving.
At present, Level 1 and Level 2 active safety technologies, including collision avoidance braking, adaptive cruise control (ACC), and lane keep assist (LKA) on expressways, are spreading.
Level 2 and Level 3 technologies will need to be developed for arterial and general public roads and Level 4 technologies for service cars in operational design domains (ODDs). Our department serves as an R&D team, so our mission is to develop elemental technologies for Level 3 and Level 4 and contribute to the company's business.
Current Issues in Automated Driving
In research on automated driving, we must address various difficult issues. For example, sensing performance must be maintained under backlighting at the exit of a tunnel, as well as in very poor weather conditions such as heavy rain and dense fog.
Even if the sensing performance is maintained, objects must be detected by the system. Ordinary traffic participants include vehicles, pedestrians, and bicycles. Sometimes there may be unfamiliar objects on the road, and they must be detected. Collision with animals that may enter the road must also be avoided. If a road which was open yesterday is under construction today, it is necessary to make a detour. There are many difficult situations to cope with.
Human drivers can correctly understand the situation based on appropriate judgment, but this is a very difficult problem for fully automated driving.
As I explained, automated driving systems are expected to evolve gradually from simple systems to more complicated systems. Applications will spread from expressways, where it is relatively easy to achieve automated driving, to arterial and general public roads.
There are various challenging issues from the viewpoint of R&D, so this is an exciting field of research.
What Can Be Achieved by Using AI
AI is one solution to address these challenging issues. Here, AI refers to deep learning and machine learning technologies. We have been working to apply AI technologies to vehicles.
Algorithms are the main focus of R&D on AI. To improve the accuracy of AI algorithms, a large amount of data will be required, but the larger the amount of data, the more time the learning process takes. We will need technologies to efficiently use computers for the learning process.
R&D on algorithms and computer technologies is actively being conducted around the world, so I do not believe that DENSO is the global leader in these technologies.
As various technologies and papers are presented around the world, we try to identify technologies that may be used for vehicles, apply them to vehicle systems, and determine their effectiveness. As described here, we are committed to developing technologies that enable speedy implementation in systems.
Even if there is an algorithm which seems to have a high recognition performance, it may not work in real time in a vehicle system. We develop technologies to quickly attain real-time performance.
These three technologies alone are not enough to apply AI to vehicles. Specifically, the computing resources in a vehicle are limited. Embedded technologies and semiconductor technologies are also required to operate AI properly.
Quality is a very important factor in vehicle production, so it is essential to assure the quality of AI.
Regarding the bottom two items of the pentagon, we have well-established embedded technologies as well as quality assurance technologies and expertise. It is essential to harness these capabilities and work on the five elemental technologies in a comprehensive manner in order to apply AI to vehicles.
Now, I would like to briefly introduce the development process of respective elemental technologies.
First, I will talk about algorithms, which are the main topic. The block diagram that I showed earlier is shown again (in the upper part of the slide). First, let's take a look at the recognition process.
The situation in which an object in the back is hidden by another object in the front is referred to as occlusion. Occluded objects can be detected by using past tracking data. Thus, the algorithm enables relatively robust detection.
We gave a presentation about this technology at the Intelligent Vehicles Symposium about a year ago. If you are interested, you can search the web and find the details. We are conducting R&D on such technology.
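As an illustration of how past tracking data can carry an object through occlusion, here is a minimal sketch. It is not the algorithm presented at the symposium: the `Track` class, the constant-velocity motion model, and the `max_missed` limit are all assumptions chosen for simplicity.

```python
from dataclasses import dataclass


@dataclass
class Track:
    """Hypothetical single-object track with a constant-velocity model (an assumption)."""
    x: float
    y: float
    vx: float = 0.0
    vy: float = 0.0
    missed: int = 0  # consecutive frames without a detection

    def step(self, detection, dt=0.1, max_missed=10):
        """Advance one frame; return the position estimate, or None if the track is dropped."""
        # Predict the position from the motion observed in earlier frames.
        px, py = self.x + self.vx * dt, self.y + self.vy * dt
        if detection is None:
            # Occluded: keep the predicted position instead of losing the object.
            self.missed += 1
            if self.missed > max_missed:
                return None
            self.x, self.y = px, py
        else:
            # Visible: update the velocity from the displacement, then adopt the detection.
            dx, dy = detection
            self.vx, self.vy = (dx - self.x) / dt, (dy - self.y) / dt
            self.x, self.y = dx, dy
            self.missed = 0
        return (self.x, self.y)


# Toy sequence: the object is hidden in frames 3 and 4 (detection is None).
track = Track(x=0.0, y=0.0)
frames = [(1.0, 0.0), (2.0, 0.0), None, None, (5.0, 0.0)]
estimates = [track.step(d, dt=1.0) for d in frames]
print(estimates)
```

During the two occluded frames the track keeps moving at the last observed velocity, so the estimate is still close to the object when it reappears. A production tracker would typically add measurement noise handling and data association across multiple objects.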
Trajectory Prediction of Pedestrians
This technology made it possible to identify the positions of objects and obtain information about their past trajectories, so we were then motivated to predict the trajectories.
I would like to introduce our project to predict the trajectories of pedestrians. The detection and tracking results that I mentioned earlier were used as the input sources of the trajectory prediction.
Future trajectories are predicted based on past tracking data. As the first step to predicting a trajectory, we worked on predicting the destinations. This image is small and may be difficult to see, but this (green box on the slide) is the object whose trajectory is to be predicted. It shows a bicycle as an example. Multiple destinations of the bicycle are predicted.
This algorithm uses a generative adversarial network (GAN) model for trajectory prediction and a curriculum learning approach. The curriculum learning approach is used because a comprehensive learning process does not work smoothly. The process of learning the trajectories to the destinations is divided into multiple steps to enable gradual learning.
By using these techniques, the algorithm can properly estimate the trajectories of pedestrians to some extent. The research was jointly conducted with Carnegie Mellon University.
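The idea of dividing learning into gradual steps can be sketched as a schedule that starts with an easier task and ends with the full one. The sketch below is only an illustration: the talk does not say how the steps were defined, so the criterion used here (shorter prediction horizons first) and the helper names `curriculum_stages` and `truncate_targets` are assumptions, and the GAN model itself is not shown.

```python
def curriculum_stages(full_horizon, num_stages):
    """Return strictly increasing prediction horizons (in time steps), ending at full_horizon."""
    if num_stages < 1 or full_horizon < num_stages:
        raise ValueError("need 1 <= num_stages <= full_horizon")
    return [full_horizon * (k + 1) // num_stages for k in range(num_stages)]


def truncate_targets(trajectory, observed_steps, horizon):
    """Split a trajectory into the observed past and a future target of at most `horizon` steps."""
    past = trajectory[:observed_steps]
    future = trajectory[observed_steps:observed_steps + horizon]
    return past, future


# Toy path with 20 time steps; a real dataset would hold many tracked pedestrians.
trajectory = [(float(t), 0.5 * t) for t in range(20)]
for horizon in curriculum_stages(full_horizon=12, num_stages=3):
    past, future = truncate_targets(trajectory, observed_steps=8, horizon=horizon)
    # A real pipeline would train the predictor on (past, future) at this point.
    print(horizon, len(past), len(future))
```

Each stage reuses the same observed past but asks the model to predict further ahead, so the final stage corresponds to the full task.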
Examples of Trajectory Prediction Results
Let me show you a video on the demonstration results of predicting the trajectories of pedestrians. This was not a test of automated driving; the vehicle was driven by a human driver.
We created a pedestrian warning demonstration app. A hazardous area, which can be adjusted in the settings, is defined around one's own vehicle. A warning is issued if a pedestrian is likely to enter the hazardous area.
The app uses the past trajectory, the current pedestrian position, and a simple road shape. The position of the target object two seconds later is predicted by using information on the intersection shape and building positions. The image on the right shows a bird's-eye view, and the image on the left shows the result projected onto the image captured by the camera.
The app issues a warning by displaying a red bounding box if the probability of a pedestrian entering the hazardous area defined around one's own vehicle is higher than a threshold, which can also be adjusted in the settings. If a pedestrian is unlikely to enter the hazardous area, the pedestrian is indicated by a green bounding box. The green line indicates the ground truth, that is, the actual trajectory of the target pedestrian.
Here, the app issues a red warning because the probability of this pedestrian entering the yellow area of the vehicle is high.
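The warning rule described above can be sketched as follows. This is a minimal illustration, not the demonstration app's code: it assumes the predictor outputs several sampled positions two seconds ahead, that the hazardous area is an axis-aligned rectangle in vehicle coordinates, and that the entry probability is estimated as the fraction of samples inside it. The function names and the default threshold are hypothetical.

```python
def entry_probability(predicted_positions, zone):
    """Fraction of sampled future positions that fall inside the rectangular zone."""
    if not predicted_positions:
        raise ValueError("at least one predicted position is required")
    x_min, x_max, y_min, y_max = zone
    inside = sum(1 for x, y in predicted_positions
                 if x_min <= x <= x_max and y_min <= y <= y_max)
    return inside / len(predicted_positions)


def box_color(predicted_positions, zone, threshold=0.5):
    """Return "red" (warning) if the entry probability exceeds the threshold, else "green"."""
    return "red" if entry_probability(predicted_positions, zone) > threshold else "green"


# Hazardous area in vehicle coordinates (metres): 2 m to each side, 10 m ahead (assumed values).
zone = (-2.0, 2.0, 0.0, 10.0)
approaching = [(1.0, 5.0), (1.5, 6.0), (3.0, 5.0), (0.5, 4.0)]  # 3 of 4 samples inside
passing_by = [(5.0, 5.0), (6.0, 5.0), (1.0, 5.0), (7.0, 5.0)]   # 1 of 4 samples inside
print(box_color(approaching, zone), box_color(passing_by, zone))
```

Both the zone and the threshold are plain parameters, which mirrors the point that they can be changed in the settings.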
Data collection is followed by the process of creating annotations and learning datasets. We use in-house tools in this process as well. In this example, images captured by a camera and point cloud data obtained from a LiDAR sensor are used as the input sources.
The LiDAR sensor is synchronized with the camera, so the images captured by the camera and the point cloud data can be displayed at the same time. The point cloud data enables us to recognize the presence of 3D objects and add 3D bounding boxes indicating the position, attitude, and direction of the objects. We use a tool that automatically projects the 3D bounding boxes onto the 2D image in order to create data for our R&D.
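Projecting a 3D box onto the image can be sketched with a standard pinhole camera model. This is not the in-house tool: it assumes the box is already expressed in camera coordinates (X right, Y down, Z forward), which means the LiDAR-to-camera calibration has been applied beforehand, and the intrinsic parameters `fx`, `fy`, `u0`, `v0` are made-up example values.

```python
import math


def box_corners(center, size, yaw):
    """Eight corners of a 3D box in camera coordinates (X right, Y down, Z forward)."""
    cx, cy, cz = center
    w, h, l = size  # extents along X, Y, Z before rotation
    c, s = math.cos(yaw), math.sin(yaw)
    corners = []
    for dx in (-w / 2, w / 2):
        for dy in (-h / 2, h / 2):
            for dz in (-l / 2, l / 2):
                # Rotate the offset about the vertical (Y) axis, then translate to the center.
                corners.append((cx + c * dx + s * dz, cy + dy, cz - s * dx + c * dz))
    return corners


def project(point, fx, fy, u0, v0):
    """Pinhole projection of a 3D point to pixel coordinates."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point is behind the camera")
    return (fx * x / z + u0, fy * y / z + v0)


def box_2d(center, size, yaw, fx, fy, u0, v0):
    """Axis-aligned 2D box (u_min, v_min, u_max, v_max) enclosing the projected corners."""
    pts = [project(p, fx, fy, u0, v0) for p in box_corners(center, size, yaw)]
    us, vs = zip(*pts)
    return (min(us), min(vs), max(us), max(vs))


# A 2 m cube centered 10 m in front of the camera, with assumed intrinsics.
bbox = box_2d(center=(0.0, 0.0, 10.0), size=(2.0, 2.0, 2.0), yaw=0.0,
              fx=1000.0, fy=1000.0, u0=640.0, v0=360.0)
print(bbox)
```

Because the camera and the LiDAR are synchronized, the same projection can be applied to every annotated frame without manual alignment.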
After data has been accumulated, the learning process must then be performed. We use computer servers for the learning process to conduct R&D. We then implement the algorithms obtained from the learning process on this vehicle to drive it. Thus, our resources cover the entire research process.
As I mentioned before, there are two types of data: image data and point cloud data. Because they are synchronized, 3D bounding boxes are automatically projected onto the 2D image.
This is made possible by assigning IDs for tracking. Depending on the research theme, some algorithms use only images. Some members conduct R&D on algorithms that use both point cloud data obtained from a LiDAR sensor and images captured by a camera.
This is an example of our data collected on an expressway, and another example was collected in downtown areas, where there are more pedestrians and bicycles. We collect data according to our research themes and conduct R&D.