Shimadzu Corporation, Fujitsu Limited, and Fujitsu Laboratories Ltd. are conducting collaborative research*1 to develop technology that uses AI (artificial intelligence) to process the vast amounts of data acquired from Shimadzu mass spectrometers. Analyzing these measurement results is an essential part of analytical processes.
Fujitsu, an information and communication technology company, has been exploring the creation of new businesses by using AI as an enabler of digital innovation. Shimadzu is an analytical equipment maker with customers who had voiced their hopes for automated, highly accurate analysis of complex data. Given that the aims of both companies matched, in November of last year they began collaborative research into AI for the automation of data analysis.
Mass spectrometers are used for research and quality control in areas such as establishing early detection techniques for diseases and measuring residual pesticides in foods. Because of improvements in sensitivity and speed, the amount of data they acquire is now enormous. As a result, the data analysis step called “peak picking”*2 has become the bottleneck in the workflow. Complete automation is difficult, and some manual adjustment is required. Analysis accuracy therefore differs from operator to operator, and analytical results may be affected by each operator’s practices and data alterations. In recent years, the fields of healthcare and new drug development have come to demand highly accurate automated data analysis that eliminates this dependence on individual operators.
To solve this issue using AI, the three companies investigated the application of deep learning, a neural network technology modeled on the neurons of the brain. Two problems arose: 1) training data*3 was insufficient; and 2) learning did not proceed when the output data from the analytical equipment was fed, as is, into the deep learning network. In this collaborative research, Shimadzu developed technology to generate additional data to compensate for the lack of training data, and Fujitsu and Fujitsu Laboratories developed technology to convert the features of the analytical equipment output into images. The companies also developed feature extraction technology for learning the analytical skills of experienced analysts. As a result, the deep learning network was able to learn from more than 30,000 items of generated training data. Compared with manual peak picking by an experienced operator, automated peak picking using AI had a false positive rate of 7% and an undetected rate of 9%*4. These results indicate that automated peak picking compares favorably with peak picking by an experienced operator.
In the Shimadzu medium-term business plan that started in April 2017, strong emphasis is placed on the “Advanced Healthcare” area, aiming to create innovative products and services in the fields of prevention, diagnosis, treatment of disorders, and development of new drugs. This metabolomics research is also part of that initiative and is the first application to use the AI technology developed by the company. Metabolomics technology is used to investigate cells by detecting metabolites and following their behavior. The technology is expected to be used for unraveling physiological and pathological mechanisms and for exploring disease biomarkers.
Fujitsu has been developing AI for over 30 years, and introduced knowledge and technology based on this experience as “Fujitsu Human Centric AI Zinrai” in November 2015. Since then the company has implemented AI products and services for customers in a variety of fields. This time the company has turned its attention to metabolomics, a new application for deep learning, and plans to develop AI for the automation of data analysis.
Shimadzu and Fujitsu aim to integrate the AI systems developed in this collaboration into mass spectrometer software in 2018.
*1 Professor Eiichiro Fukusaki of the Graduate School of Engineering, Osaka University, and Professor Fumio Matsuda of the Graduate School of Information Science and Technology, Osaka University, provided input about researcher requirements. Shimadzu and Osaka University have established the “Osaka University and Shimadzu Analytical Innovation Research Laboratory” to develop analysis technology for metabolomics.
*2 The process of reading the width and height of waveforms (peaks) from data acquired by a mass spectrometer.
*3 Data used for learning by a deep learning network. This is the set of data that is input to the network and the output that is expected. In the case of this technology, it would be the set of output data from the analytical instrument and the corresponding peaks picked by an experienced operator.
*4 The peak picking result obtained manually by an experienced operator is referred to as the “correct range”. It is compared with the “predicted range”, which is the peak picking result produced automatically by the AI system. If a “correct range” and a “predicted range” overlap by 50% or more, they are considered to match; otherwise there is no match. A match indicates that the peak was correctly detected. A predicted range that does not match any correct range is defined as a “false positive”, and a correct range that does not match any predicted range is defined as “undetected”. The false positive rate was calculated as false positive count / (detected count + false positive count), and the undetected rate as undetected count / (detected count + undetected count).
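The calculation described in note *4 can be sketched as follows. This is a minimal illustration, not the companies' evaluation code. The note does not say how the 50% overlap is measured, so the sketch assumes intersection length divided by union length. The function names, the (start, end) representation of a peak range, and the sample values are all hypothetical.

```python
from typing import List, Tuple

# A peak range as (start, end), e.g. in retention time (assumed representation).
Range = Tuple[float, float]


def overlap_ratio(a: Range, b: Range) -> float:
    """Intersection length divided by union length (assumed overlap measure)."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def peak_picking_rates(
    correct: List[Range], predicted: List[Range], threshold: float = 0.5
) -> Tuple[float, float]:
    """Return (false positive rate, undetected rate) as defined in note *4."""
    # Predicted ranges that match no correct range are false positives.
    false_positive = sum(
        1
        for p in predicted
        if not any(overlap_ratio(c, p) >= threshold for c in correct)
    )
    detected_predicted = len(predicted) - false_positive

    # Correct ranges that match no predicted range are undetected.
    undetected = sum(
        1
        for c in correct
        if not any(overlap_ratio(c, p) >= threshold for p in predicted)
    )
    detected_correct = len(correct) - undetected

    fp_denominator = detected_predicted + false_positive
    ud_denominator = detected_correct + undetected
    fp_rate = false_positive / fp_denominator if fp_denominator else 0.0
    ud_rate = undetected / ud_denominator if ud_denominator else 0.0
    return fp_rate, ud_rate


# Hypothetical example: three operator-picked peaks, three AI-picked peaks.
correct_ranges = [(0.0, 10.0), (20.0, 30.0), (40.0, 50.0)]
predicted_ranges = [(1.0, 10.0), (21.0, 35.0), (60.0, 70.0)]
fp_rate, ud_rate = peak_picking_rates(correct_ranges, predicted_ranges)
print(f"false positive rate: {fp_rate:.2f}, undetected rate: {ud_rate:.2f}")
```

In this made-up example, two of the three predicted ranges match a correct range, one predicted range matches nothing, and one correct range is missed, so both rates come out to one third. A different overlap measure, such as overlap relative to the correct range only, would change which pairs count as matches.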