Toshiba: Visual Question Answering AI Delivers the World's Highest Accuracy

September 14, 2021 at 10:22 pm EDT

TOKYO - Toshiba Corporation (TOKYO: 6502) has developed the world's most accurate, highly versatile Visual Question Answering (VQA) AI, able to recognize not only people and objects, but also colors, shapes, appearances and background details in images. The AI overcomes the long-standing difficulty of answering questions on the positioning and appearance of people and objects, and can learn the information required to handle a wide range of questions and answers. It can be applied to a wide range of purposes without any need for customization.

In experiments using a public dataset^*1 comprising a large volume of images and text data, the VQA AI correctly answered 66.25% of questions without any pre-learning and 74.57% with pre-learning. For example, the AI can find a worker standing in a designated place when asked a question like 'Is the person on a black mat?', which requires recognition of the individual, position, shape and color. Applying it to safety monitoring systems at production sites is expected to help improve safety and to reduce workloads on onsite supervisors. It can also be used to identify specific scenes in broadcast content and surveillance video footage.

Toshiba presented the technology at ICANN2021^*2, the international conference for neural networks, on September 14.

Coming years are expected to see growing manpower shortages at production sites in Japan, a trend also becoming apparent in other advanced nations. The situation is being made worse by the emergence of COVID-19, which makes it more essential than ever to ensure worker safety and reduce workloads on site management. One solution is AI, which is increasingly being introduced to production sites. The global AI market, including software, hardware, and services, is forecast to grow 16.4% year over year in 2021 to $327.5 billion and is expected to reach $554.3 billion by 2024^*3.

Current image recognition AI supports safety inspections at the level where it can detect individual objects learned beforehand, such as people, headwear, and work clothing. This allows it to analyze camera images to determine whether or not someone is wearing a hardhat, or to detect dropped or fallen objects, helping to ensure workplace safety and reduce the site management workload.

However, getting to this point requires the creation of a determination function that provides a basis for how the AI should recognize an inspection item. For example, when checking for headgear, it must learn how to detect and determine if an individual is wearing a hat, and this has to be done for every individual item that is detected. In a workplace, it is essential to have flexibility that allows immediate changes in inspection items, but this is difficult with current AI due to the time needed to set up and adjust the determination function.

Toshiba's new AI meets the need for flexibility with the world's highest accuracy in answering questions, and it also allows questions to be changed or added quickly. Its ability to recognize not only people and objects but also image backgrounds, plus the extensive database at its disposal, ensures that it can quickly process the features of images and pre-learned questions to derive the correct answer. After learning a large set of images, questions and answers that cover the presence of people and objects, and information such as their location and status, the AI is able to provide an appropriate answer to a question from approximately 3,000 answer patterns. The AI is highly flexible and can be updated by adding inspection items, or changed to handle a different situation, by a simple 'Image and Question' process of adding new question sentences (Fig. 1).
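
The idea of choosing one answer from a fixed set of candidates can be illustrated with a minimal sketch. It is not Toshiba's implementation: the four-entry answer list, the hand-written scores, and the function names `softmax` and `pick_answer` are assumptions made purely for illustration. A real system would compute the scores from image and question features with a trained network and would use a far larger answer set (the release mentions approximately 3,000 patterns).

```python
import math

# Hypothetical miniature answer vocabulary (illustrative only).
ANSWERS = ["yes", "no", "black", "2"]

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def pick_answer(scores, answers=ANSWERS):
    """Return the highest-probability answer and its probability."""
    probs = softmax(scores)
    best = max(range(len(answers)), key=lambda i: probs[i])
    return answers[best], probs[best]

# The scores are made up; in practice a trained model would produce them
# from the image and a question such as "Is the person on a black mat?".
answer, confidence = pick_answer([2.0, 0.5, -1.0, -3.0])
print(answer, round(confidence, 3))
```

Treating the task as a choice among known answers keeps the output space small, which is one plausible reason a fixed answer set is used; the release itself does not explain the design choice.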

AI for VQA is a cutting-edge technology now being researched worldwide. The conventional approach^*4 mainly relies on the features of people and objects in an image, but Toshiba's new method also extracts background features and spatial areas, including the floors and passageways where these people and objects are to be found (Fig. 2). This feature enables the new AI to derive accurate answers.
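
One way to picture the combination of object features with background features is to place both kinds of feature vectors in a single pool and let the question weight them. The sketch below is a toy built on assumptions, not the method presented at ICANN2021: the two-dimensional vectors, the region names, the question encoding, and the function `attend` (a plain dot-product attention) are all hypothetical.

```python
import math

def attend(question_vec, regions):
    """Weight each region by the softmax of its dot product with the question."""
    names = list(regions)
    scores = [sum(q * r for q, r in zip(question_vec, regions[n])) for n in names]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return {n: e / total for n, e in zip(names, exps)}

# Hypothetical features for people and objects, as in conventional approaches.
object_features = {"person": [1.0, 0.0], "box": [0.2, 0.1]}
# Hypothetical features for background and spatial areas added to the pool.
background_features = {"black mat": [0.9, 0.8], "passageway": [0.0, 0.3]}

all_regions = {**object_features, **background_features}

# Made-up encoding of the question "Is the person on a black mat?"
question = [1.0, 1.0]
weights = attend(question, all_regions)
for name, weight in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {weight:.2f}")
```

With object features alone, the mat and the passageway would not be in the pool at all, so no amount of weighting could bring them into the answer.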

For example, the AI can answer questions such as whether there is an object on a path or whether a person is standing in a designated area, in addition to whether an object is present (Figs. 3 and 4). Applying this AI to safety monitoring at production sites is expected to improve workplace safety, reduce workloads on supervisors, and contribute to work style improvement.
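
The way inspection items become question sentences can be sketched as a small checklist. Everything here is hypothetical: `ask_vqa` stands in for whatever interface a deployed VQA system would expose, it is stubbed with canned answers so the example runs on its own, and the frame name is a placeholder.

```python
# Each inspection item is a question plus the answer that indicates a safe state.
CHECKLIST = [
    ("Is the person on a black mat?", "yes"),
    ("Is there an object on the path?", "no"),
]

def ask_vqa(image, question):
    """Hypothetical stand-in for a VQA model; returns canned answers for the demo."""
    canned = {
        "Is the person on a black mat?": "yes",
        "Is there an object on the path?": "yes",
    }
    return canned.get(question, "unknown")

def inspect(image, checklist, ask=ask_vqa):
    """Return the questions whose answers differ from the expected safe answer."""
    return [q for q, expected in checklist if ask(image, q) != expected]

# "frame_0001.jpg" is a placeholder; the stub ignores the image entirely.
alerts = inspect("frame_0001.jpg", CHECKLIST)
print(alerts)

# Adding an inspection item only means appending a new question sentence.
CHECKLIST.append(("Is the person wearing a hardhat?", "yes"))
alerts_after = inspect("frame_0001.jpg", CHECKLIST)
print(alerts_after)
```

In this arrangement, changing what is inspected is a matter of editing a list of sentences rather than building a new detector for each item.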

In a performance evaluation with a global standard public dataset, Toshiba achieved accuracy levels of 66.25% without pre-learning and 74.57% with pre-learning, the highest levels ever recorded^*5, while the results with the current methods were 65.88% and 74.00%, respectively (Fig. 5).
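
The margins over the compared methods follow directly from the figures quoted above. The short calculation below only restates that arithmetic; the dictionary layout and variable names are illustrative.

```python
# Accuracy figures (%) quoted in the release for the public-dataset evaluation.
results = {
    "without pre-learning": {"toshiba": 66.25, "conventional": 65.88},
    "with pre-learning": {"toshiba": 74.57, "conventional": 74.00},
}

# Difference in percentage points, rounded to avoid floating-point noise.
gains = {
    setting: round(r["toshiba"] - r["conventional"], 2)
    for setting, r in results.items()
}

for setting, gain in gains.items():
    print(f"{setting}: +{gain:.2f} percentage points")
```

The gains are 0.37 percentage points without pre-learning and 0.57 percentage points with pre-learning.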

Figure 1: Safety Monitoring with Question-Answering AI

Figure 2: Features of the developed AI

Figure 3: Example of Question-Answering with AI

Figure 4: Example of Question-Answering with AI

Figure 5: Accuracy Comparison with Conventional Methods

The versatility of the new AI suits it for application in searches for specific scenes in broadcast content, for specific circumstances or people in disk drive recorder and security footage, and for past near-misses in similar situations.

Toshiba will continue system development and accuracy improvement, with the aim of introducing the AI technology into safety monitoring systems in fiscal 2023.

*1: VQA-v2 data set
Antol, S., Agrawal, A., Lu, J., Mitchell, M., Batra, D., Zitnick, C.L., Parikh, D.: VQA: Visual Question Answering. In: ICCV (2015)

*2: The 30th International Conference on Artificial Neural Networks to be held online from September 14th to 17th

*3: Source: IDC Forecasts Improved Growth for Global AI Market in 2021
https://www.idc.com/getdoc.jsp?containerId=prUS47482321

*4: Li, L., Gan, Z., Cheng, Y., Liu, J.: Relation-aware Graph Attention Network for Visual Question Answering. In: ICCV (2019)

*5: As of the paper submission date (April 15, 2021)

*6: Image source: Okayama Labor Bureau, Ministry of Health, Labour and Welfare (Japanese)
https://jsite.mhlw.go.jp/okayama-roudoukyoku/news_topics/kantokusho_oshirase/niimipato.html

*7: Image source: Ministry of Health, Labour and Welfare (Japanese)
https://hatarakikatakaikaku.mhlw.go.jp/file60/

Disclaimer

Toshiba Corporation published this content on 15 September 2021 and is solely responsible for the information contained therein. Distributed by Public, unedited and unaltered, on 15 September 2021 02:21:08 UTC.
