Artefact : How did we use computer vision to help medical experts diagnose Follicular Lymphoma?

02/05/2021 | 08:32am EST
Introduction

This project is part of Artefact's contribution to Tech for Good. It was conducted in collaboration with Institut Carnot CALYM, a consortium dedicated to partnership research on lymphoma, and Microsoft.

In autumn 2019, the Institut Carnot CALYM launched a structuring programme aimed at setting up a roadmap to optimise the valorisation and exploitation of data from the clinical, translational and preclinical research conducted by the members of the consortium for more than 20 years. This project, proposed by Pr Camille Laurent (LYSA, IUCT, CHU Toulouse, France) and Pr Christiane Copie (LYSARC, Pierre-Bénite, France), both members of Institut Carnot CALYM, is part of this structuring programme.

The primary objective of this research project is to develop a deep-learning algorithm to assist pathologists in diagnosing Follicular Lymphoma. A secondary objective is to identify informative criteria that could help medical experts understand the morphological differences between Follicular Lymphoma and Follicular Hyperplasia, referred to below as FL and FH.

What is Follicular Lymphoma? What are the challenges in its diagnosis?

FL is a subtype of Lymphoma, the most frequent blood cancer in the world. There are more than 80 types of Lymphoma, and this diversity makes diagnosis difficult, even for experts. Moreover, FL closely resembles FH, which is not cancerous, making its diagnosis even more challenging.

In this article, we describe our approach to building a classifier for FL and FH using only labelled whole-slide images. Whole-slide images are high-resolution digital files of scanned microscope slides. In our case, they contain extracts of lymph nodes.

How could deep learning help in its detection?

Using whole-slide images of FL and FH, we trained a binary classifier through a patch-based approach. Our model architecture is a simple ResNet-18 trained for a few epochs (~10).

After predicting the class of an observation with the classifier, we extract the last activation layer and build a heatmap overlaid on the input image to highlight the regions that led the model to predict a given class.
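One possible way to turn a final activation tensor into such a heatmap is sketched below with NumPy. Averaging over channels and nearest-neighbour upsampling are assumptions chosen for simplicity; the article does not specify how the activations were combined. The function name activation_heatmap and the random input are hypothetical.

```python
import numpy as np


def activation_heatmap(activations: np.ndarray, out_size: int) -> np.ndarray:
    """Collapse a (channels, h, w) tensor into an (out_size, out_size) map in [0, 1]."""
    heat = activations.mean(axis=0)  # assumption: plain average over channels
    heat = heat - heat.min()         # shift so the minimum is 0
    peak = heat.max()
    if peak > 0:
        heat = heat / peak           # scale so the maximum is 1
    h, w = heat.shape
    if out_size % h or out_size % w:
        raise ValueError("out_size must be a multiple of the activation size")
    # Nearest-neighbour upsampling back to the patch resolution.
    return np.repeat(np.repeat(heat, out_size // h, axis=0), out_size // w, axis=1)


# Stand-in for real activations: a standard ResNet-18 reduces a 1024-pixel
# input to a 512 x 32 x 32 feature map before global pooling.
rng = np.random.default_rng(0)
fake_activations = rng.random((512, 32, 32))
heatmap = activation_heatmap(fake_activations, 1024)
print(heatmap.shape)
```

The resulting array can be blended with the patch as a semi-transparent colour layer.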

Why did we use a patch-based classification?

Patch-based classification is a technique in which the class of an observation is obtained by aggregating the predictions made on its components (patches). We use it here because the images are far too large to be fed directly to the model.

Whole-slide images are very large (~10⁵ pixels per side). Their size makes training a deep learning model on them almost impossible with common tools. To solve this issue, we divided them into patches of equal size, chosen according to two important criteria:

  • the patches must be big enough so that the follicles remain visible in them
  • the patches should be small enough so that training a model can be done in a reasonable amount of time
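To make the tiling concrete, the sketch below lists the top-left corners of the patches on a regular grid. It assumes non-overlapping patches and drops incomplete patches at the edges; both choices, the function name patch_origins and the slide dimensions are illustrative. Reading the actual pixels would need a slide-reading library, which is not shown.

```python
from typing import Iterator, Tuple


def patch_origins(width: int, height: int, patch_size: int) -> Iterator[Tuple[int, int]]:
    """Yield the top-left (x, y) corner of every full patch on a non-overlapping grid."""
    for y in range(0, height - patch_size + 1, patch_size):
        for x in range(0, width - patch_size + 1, patch_size):
            yield x, y


# Hypothetical slide of 100,000 x 100,000 pixels cut into 1024-pixel patches.
origins = list(patch_origins(100_000, 100_000, 1024))
print(len(origins))  # 97 * 97 = 9409 full patches
```

Even at this patch size, a single slide yields thousands of candidate patches, which is why patches are then sampled rather than all used.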

In patch-based classification, the model output can be interpreted like that of a classical classifier, except that the score is computed at the whole-slide level from the patch predictions. For example, when predicting the class of an FL slide, a score of 98% means that 98% of its patches were predicted to be FL.

At the dataset level, this slide will be predicted with a score of 0.98 for the FL class.
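The aggregation described above can be sketched as follows. The function names and the 0.5 decision threshold are assumptions for the example; the article only states that the slide score is the share of patches predicted as FL.

```python
from typing import Sequence


def slide_score(patch_labels: Sequence[str], target: str = "FL") -> float:
    """Return the fraction of patches predicted as `target`."""
    if not patch_labels:
        raise ValueError("a slide needs at least one patch prediction")
    return sum(label == target for label in patch_labels) / len(patch_labels)


def slide_class(patch_labels: Sequence[str], threshold: float = 0.5) -> str:
    """Label the slide FL when the FL fraction reaches the (assumed) threshold."""
    return "FL" if slide_score(patch_labels) >= threshold else "FH"


# 98 of 100 patches predicted FL, as in the example above.
labels = ["FL"] * 98 + ["FH"] * 2
print(slide_score(labels), slide_class(labels))
```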

PS: Our decision to divide the images into patches rests on a hypothesis drawn from medical experts' conclusions: in a whole-slide image of FL, the follicles are expected to be present everywhere.

Training Set

Our training set is composed of 58k randomly selected patches (1024 × 1024 pixels) of FL and FH, extracted from a set of 30 whole-slide images for each of the 2 classes.

Validation Set

20% of the patches were sampled to validate the model's performance during training.
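A random hold-out of this kind could look like the sketch below. It assumes the 20% is drawn from the 58k training patches and uses a fixed seed for reproducibility; the function name split_patches and the integer patch identifiers are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def split_patches(
    patch_ids: Sequence[int], val_fraction: float = 0.2, seed: int = 0
) -> Tuple[List[int], List[int]]:
    """Shuffle the patch ids and hold out `val_fraction` of them for validation."""
    shuffled = list(patch_ids)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_val = round(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]


# Assumption: 58,000 patches identified by consecutive integers.
train_ids, val_ids = split_patches(range(58_000))
print(len(train_ids), len(val_ids))
```

Because the split is made at patch level, patches from the same slide can appear in both sets, so the slide-level testing set remains the reference for comparing approaches.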

Testing Set

Our testing set is composed of 15 whole-slide images, each divided into patches. This reference set was used to compare the results of the different training approaches that we detail below.

Modelling

The global pipeline is described below:

Disclaimer

Artefact SA published this content on 05 February 2021 and is solely responsible for the information contained therein. Distributed by Public, unedited and unaltered, on 05 February 2021 13:31:07 UTC.


© Publicnow 2021
