Cervical Cancer Screening with Einstein Vision Service

Author : Rajdeep Dua
Last Updated : Aug 16 2017

Warning: Data used in this article contains graphic content that some may find disturbing.

In this article we look at how Einstein Vision Service can help classify patients' cervix types, which is a useful first step in screening for cervical cancer.

Background

Salesforce launched Einstein Vision Service a couple of months ago to help developers build services that detect and label images. In this article we look at how this service could be used in medicine to classify cervix types.

../_images/einstein_cervix.png

DataSet

We used the cervical cancer screening dataset from the Kaggle competition sponsored by Intel:

https://www.kaggle.com/c/intel-mobileodt-cervical-cancer-screening

In this article we will train a model to classify cervix types from cervical images. The different cervix types in the dataset are all considered normal (not cancerous), but since the transformation zones aren’t always visible, some patients require further testing while others don’t. Identifying the transformation zones is not an easy task for healthcare providers, so an algorithm-aided decision could significantly improve the quality and efficiency of cervical cancer screening for these patients.

../_images/cervix-types.png

We reduced each image to 200 x 200 pixels for demo purposes. You can find the resized dataset at https://www.dropbox.com/s/321da9o91d8xueg/train-200-200.zip

We uploaded the dataset using standard curl commands and then trained the model.

Creating DataSet in Einstein

We need to make an HTTP POST call to https://api.einstein.ai/v2/vision/datasets/upload/sync with the appropriate parameters, as shown below.

curl -X POST -H "Authorization: Bearer <TOKEN>" \
             -H "Cache-Control: no-cache" \
             -H "Content-Type: multipart/form-data" \
             -F "path=https://www.dropbox.com/s/321da9o91d8xueg/train-200-200.zip?raw=1" \
             -F "type=image" https://api.einstein.ai/v2/vision/datasets/upload/sync

The response will be similar to the listing below.

{"id":1009482,"name":"train-200-200; filename*=UTF-8''train-200-200",
 "createdAt":"2017-08-14T09:33:54.000+0000",
 "updatedAt":"2017-08-14T09:33:54.000+0000",
 "labelSummary":{"labels":[
                 {"id":91447,"datasetId":1009482,"name":"Type_1","numExamples":139},
                 {"id":91448,"datasetId":1009482,"name":"Type_2","numExamples":781},
                 {"id":91449,"datasetId":1009482,"name":"Type_3","numExamples":450}]
 }
}

Note the dataset id, which is 1009482 in our case.
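The upload response also carries the per-label example counts, which are worth checking before training. The following is a minimal sketch, assuming the response has been saved as a string (timestamps omitted for brevity); the helper name `summarize_labels` is hypothetical and not part of the Einstein API.

```python
import json

# Upload response from the listing above (timestamps omitted), with the
# braces balanced so that it parses as JSON.
upload_response = json.loads("""
{"id":1009482,"name":"train-200-200; filename*=UTF-8''train-200-200",
 "labelSummary":{"labels":[
   {"id":91447,"datasetId":1009482,"name":"Type_1","numExamples":139},
   {"id":91448,"datasetId":1009482,"name":"Type_2","numExamples":781},
   {"id":91449,"datasetId":1009482,"name":"Type_3","numExamples":450}]}}
""")


def summarize_labels(response):
    """Return the dataset id, total example count, and each label's share."""
    labels = response["labelSummary"]["labels"]
    total = sum(label["numExamples"] for label in labels)
    shares = {label["name"]: label["numExamples"] / total for label in labels}
    return response["id"], total, shares


dataset_id, total, shares = summarize_labels(upload_response)
print("dataset id:", dataset_id)
print("total examples:", total)
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

For this dataset, Type_2 accounts for more than half of the examples while Type_1 has the fewest, which is useful to keep in mind when reading the per-label metrics later.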

Training the DataSet

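We will make an HTTP POST call to https://api.einstein.ai/v2/vision/train, passing a name for the model and the dataset id (1009482 in our case).

curl -X POST -H "Authorization: Bearer <TOKEN>" \
             -H "Cache-Control: no-cache" \
             -H "Content-Type: multipart/form-data" \
             -F "name=cervical cancer model" \
             -F "datasetId=1009482" \
             https://api.einstein.ai/v2/vision/train

The response will be similar to the listing below.
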
{
   "datasetId":1009482,
   "datasetVersionId":0,
   "name":"cervical cancer model",
   "status":"QUEUED",
   "progress":0,
   "createdAt":"2017-08-16T06:11:19.000+0000",
   "updatedAt":"2017-08-16T06:11:19.000+0000",
   "learningRate":0.0,
   "epochs":0,
   "queuePosition":3,
   "object":"training",
   "modelId":"4IMVU4QRDCJVWP4OX2BJK3SFCA",
   "trainParams":null,
   "trainStats":null,
   "modelType":"image"
}
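Training runs asynchronously, so the QUEUED status above has to be polled until the model is ready. The following is a minimal sketch of such a polling loop. The function name `wait_for_training` and its parameters are hypothetical, and the status values other than QUEUED are assumptions based on the Einstein Vision documentation rather than anything shown in this article. The HTTP call is injected as a callable so that the sketch runs without a token.

```python
import time

# Status strings other than QUEUED are assumptions; only QUEUED appears
# in the training response shown above.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED"}


def wait_for_training(fetch_status, poll_seconds=30, max_polls=120):
    """Poll until training reaches a terminal status or the budget runs out.

    fetch_status is any zero-argument callable that returns the parsed JSON
    training status (a dict with at least a "status" key).
    """
    for _ in range(max_polls):
        status = fetch_status()
        if status["status"] in TERMINAL_STATUSES:
            return status
        time.sleep(poll_seconds)
    raise TimeoutError("training did not finish within the polling budget")


# Stand-in for a real HTTP call: three canned (made-up) status responses.
fake_responses = iter([
    {"status": "QUEUED", "progress": 0},
    {"status": "RUNNING", "progress": 0.5},
    {"status": "SUCCEEDED", "progress": 1},
])
final = wait_for_training(lambda: next(fake_responses), poll_seconds=0)
print(final["status"])
```

In practice, `fetch_status` would presumably issue a GET request for the training status of the model id returned above, using the same Authorization header as the curl commands in this article.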

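Once training completes, the model metrics can be retrieved with an HTTP GET call to https://api.einstein.ai/v2/vision/models/<MODEL_ID>. The response will be similar to the listing below.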

{
   "createdAt":"2017-08-16T06:16:49.000+0000",
   "metricsData":{
      "f1":[
         0.3902439024390243,
         0.5645161290322581,
         0.7154471544715447
      ],
      "labels":[
         "Type_1",
         "Type_2",
         "Type_3"
      ],
      "testAccuracy":0.6042,
      "trainingLoss":0.5593,
      "confusionMatrix":[
         [
            8,9,0
         ],
         [
            13,35,27
         ],
         [
            3,5,44
         ]
      ],
      "trainingAccuracy":0.7568
   }
}
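The reported testAccuracy and f1 values can be reproduced from the confusion matrix, which is a handy sanity check when reading the metrics. The sketch below uses only quantities that do not depend on whether rows hold the actual or the predicted labels: accuracy is the diagonal sum over the total, and the F1 score for a label is 2·TP / (row total + column total). The function names are hypothetical.

```python
labels = ["Type_1", "Type_2", "Type_3"]
confusion = [
    [8, 9, 0],
    [13, 35, 27],
    [3, 5, 44],
]


def accuracy(matrix):
    """Fraction of examples on the diagonal (correctly classified)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def f1_scores(matrix):
    """Per-label F1, computed as 2*TP / (row total + column total)."""
    n = len(matrix)
    scores = []
    for i in range(n):
        tp = matrix[i][i]
        row_total = sum(matrix[i])
        col_total = sum(matrix[r][i] for r in range(n))
        denom = row_total + col_total
        scores.append(2 * tp / denom if denom else 0.0)
    return scores


print(f"accuracy: {accuracy(confusion):.4f}")
for name, score in zip(labels, f1_scores(confusion)):
    print(f"{name} F1: {score:.4f}")
```

The results agree with the metrics listing: 87 of the 144 test images fall on the diagonal, and Type_1, the label with the fewest training examples, has the lowest F1 score.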

Predicting the Cervix Type

Let us take an image and try to predict its cervix type.

Prediction Command

We will send an HTTP POST request to https://api.einstein.ai/v2/vision/predict with the authorization token, the model id, and the path to the image to be classified.

curl -X POST -H "Authorization: Bearer <TOKEN>" \
             -H "Cache-Control: no-cache" \
             -H "Content-Type: multipart/form-data" \
             -F "sampleId=Photo Prediction"  \
             -F "sampleContent=@./Type_1/7.png" \
             -F "modelId=4IMVU4QRDCJVWP4OX2BJK3SFCA" \
             https://api.einstein.ai/v2/vision/predict

Results

{
   "sampleId":"Photo Prediction",
   "probabilities":[
      {
         "label":"Type_1",
         "probability":0.48673657
      },
      {
         "label":"Type_2",
         "probability":0.47965106
      },
      {
         "label":"Type_3",
         "probability":0.033612385
      }
   ],
   "object":"predictresponse"
}

Predicted Response: Type 1 Cervix
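Picking the predicted label from the response amounts to taking the entry with the highest probability. The sketch below also reports the margin over the runner-up, since a narrow margin suggests the image should be reviewed by a person. The function name `top_prediction` and the 0.10 margin threshold are hypothetical choices for illustration, not part of the Einstein API.

```python
prediction = {
    "sampleId": "Photo Prediction",
    "probabilities": [
        {"label": "Type_1", "probability": 0.48673657},
        {"label": "Type_2", "probability": 0.47965106},
        {"label": "Type_3", "probability": 0.033612385},
    ],
    "object": "predictresponse",
}


def top_prediction(response, min_margin=0.10):
    """Return (label, margin over runner-up, whether the call is ambiguous).

    min_margin is an illustrative threshold, not a value from the API.
    """
    ranked = sorted(
        response["probabilities"],
        key=lambda entry: entry["probability"],
        reverse=True,
    )
    best, runner_up = ranked[0], ranked[1]
    margin = best["probability"] - runner_up["probability"]
    return best["label"], margin, margin < min_margin


label, margin, ambiguous = top_prediction(prediction)
print(f"predicted: {label}, margin: {margin:.4f}, ambiguous: {ambiguous}")
```

For this image the top label is Type_1, but its lead over Type_2 is well under one percentage point, so the sketch flags the prediction as ambiguous.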

Summary

As we can see, the Type 1 image was correctly classified, but only narrowly: the Type_1 probability (about 0.49) is barely ahead of Type_2 (about 0.48). We believe the primary reason is that we reduced the resolution of the images. If we train on the full-resolution images (a dataset of around 6 GB), we expect the accuracy to improve substantially.

This is a great first step in screening for cervix types for cervical cancer detection.