Diabetic Retinopathy - Detecting Blindness

APTOS 2019 Blindness Detection

Diabetic Retinopathy is a disease that affects the retina of the eye, and millions of people around the world suffer from it.

Currently, diagnosis relies on fundus photography, a technique that involves photographing the rear of the eye.

Medical screening for diabetic retinopathy occurs around the world, but is more difficult for people living in rural areas.

Using machine learning and computer vision, we attempt to automate the process of diagnosis, which is currently performed manually by doctors.

On Kaggle (https://www.kaggle.com/c/aptos2019-blindness-detection/data and https://www.kaggle.com/c/diabetic-retinopathy-detection) we have access to a dataset of tens of thousands of real-world clinical images of both healthy patients and patients with the disease, labelled by trained clinicians.

Using this dataset, we'll be able to train a machine learning model to achieve a high level of accuracy when predicting occurrences of the disease in patients.

Results

We train our model on a combined dataset of approximately 40,000 images. We then perform inference against the public and private leaderboards on Kaggle for the APTOS 2019 competition. The public leaderboard is scored on approximately 30% of the test dataset and the private leaderboard on the remaining 70%. The test set contains approximately 13,000 images totalling over 20 GB.

The models are trained offline and then uploaded to a Kaggle private dataset linked to the kernel, which we use solely for inference.
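As a rough sketch of that kernel-side flow, using the helpers defined in the code section below (the dataset path and model name here are hypothetical):

# Hypothetical kernel-side inference: weights trained offline sit in a
# private dataset attached to the kernel. Path and model name are made up;
# data_bunch is built exactly as in the code section below.
model_path = '/kaggle/input/my-private-models/unf_b5_final'  # hypothetical
learner = get_cnn_learner(efficient_net('b5'), data_bunch, tofp16=False)
learner.load(model_path)  # fastai's Learner.load accepts an absolute path
preds, _ = learner.get_preds(DatasetType.Test)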

We also make sure to pre-process the test images with the same treatments that were applied to the training data, using a custom ItemList to transform the test set before running our predictions.
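The idea behind that custom ItemList, sketched minimally here (the resize body is a placeholder, not the actual treatment; imports come from the setup cell below), is to override ItemList.open so that every image, whether train, validation, or test, passes through the same pipeline:

# Minimal sketch of a pre-processing ItemList (illustrative only).
class CleanedImageList(ImageList):
    def __init__(self, *args, image_size:int=224, mode:int=0, **kwargs):
        super().__init__(*args, **kwargs)
        self.image_size, self.mode = image_size, mode
        self.copy_new += ['image_size', 'mode']  # carry params across splits

    def open(self, fn):
        img = cv2.cvtColor(cv2.imread(str(fn)), cv2.COLOR_BGR2RGB)
        img = cv2.resize(img, (self.image_size, self.image_size))  # placeholder treatment
        return Image(pil2tensor(img, np.float32).div_(255))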

Using an ensemble of B3 and B5 EfficientNets, we achieve a Quadratic Weighted Kappa score of 0.905775.

In comparison, the winning solution achieved 0.936129.
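The ensemble itself is straightforward prediction averaging; a sketch, where learner_b3 and learner_b5 are hypothetical names for the two trained learners:

# Average the regression outputs of the two models, then round and
# clip back to the 0-4 diagnosis grades for submission.
preds_b3, _ = learner_b3.get_preds(DatasetType.Test)
preds_b5, _ = learner_b5.get_preds(DatasetType.Test)
avg = (preds_b3 + preds_b5) / 2
diagnosis = avg.squeeze().round().clamp(0, 4).long()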

Contents

This notebook is organised as follows:

  1. Code is listed first, roughly sectioned into the following key parts. It is worth going through this code; as you read the experiment discussion in the next major section, you can refer back to the relevant parts.
    • Imports and Setup
    • Image processing
    • Metrics
    • Learner and Databunch
    • Predictions and Inference
    • Pipeline experimental methods
  2. Outline of experiment and results
    • Data exploration
    • Image processing baselines
    • Model and architecture baselines, deciding whether regression or classification is the better approach
    • Adding data and data augmentations
    • Increasing image size
    • Tuning other hyperparameters like dropout and weight decay
    • Progressive resizing
    • Increasing epochs and training times
    • Ensembling
  3. Appendix
    • Image pre-processing methods
    • Select experiments
    • References

Imports and Setup

The following cell contains all of the setup code needed for initialisation whenever restarting the kernel.

In [30]:
from fastai.callbacks import *
from fastai.vision import *
from fastai.metrics import error_rate

# Import Libraries here
import os
import json 
import shutil
import zipfile
import numpy as np
import pandas as pd
import PIL
import cv2

from PIL import ImageEnhance

import scipy as sp
from functools import partial
from sklearn import metrics
from sklearn.metrics import cohen_kappa_score
from sklearn.metrics import confusion_matrix

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms, utils
import torchvision.transforms.functional as TF

from torchvision.models import *

%reload_ext autoreload
%autoreload 2
%matplotlib inline
import pretrainedmodels
%load_ext jupyternotify
    

# set the random seed
np.random.seed(42)
    
    
import fastai; fastai.__version__
The jupyternotify extension is already loaded. To reload it, use:
  %reload_ext jupyternotify
Out[30]:
'1.0.57'

Datasource Selectors

In my experiments I've set up a few data sources: the 2019 dataset only, the 2015 dataset only, and the combined 2019+2015 datasets. The code below helps me switch between them so I can benchmark with various options.

In [31]:
# Downloaded from https://www.kaggle.com/benjaminwarner/resized-2015-2019-blindness-detection-images

def switch2019Only(sub_folder:str = ''):
    base_dir = '/hdd/data/blindness-detection/2015_and_2019/'

    !mkdir -p "{base_dir}"

    train_img_path = f'{base_dir}train/{sub_folder}'  # need to split this folder into train and val sets
    test_img_path = f'{base_dir}test/{sub_folder}' # images only, use to test

    df_train = pd.read_csv(base_dir + 'labels/trainLabels19.csv')
    df_train.head()
    
    return (train_img_path, base_dir, train_img_path, test_img_path, df_train)
In [32]:
# Downloaded from https://www.kaggle.com/benjaminwarner/resized-2015-2019-blindness-detection-images

def switch2015Only(sub_folder:str = ''):
    base_dir = '/hdd/data/blindness-detection/2015_and_2019/'

    !mkdir -p "{base_dir}"

    train_img_path = f'{base_dir}train/{sub_folder}'  # need to split this folder into train and val sets
    test_img_path = f'{base_dir}test/{sub_folder}' # images only, use to test

    df_train = pd.read_csv(base_dir + 'labels/trainLabels15.csv')
    df_train.columns = ['id_code', 'diagnosis']
    df_train.head()

    
    return (train_img_path, base_dir, train_img_path, test_img_path, df_train)
In [33]:
# Downloaded from https://www.kaggle.com/benjaminwarner/resized-2015-2019-blindness-detection-images

def switch2019And2015(sub_folder:str = ''):
    base_dir = '/hdd/data/blindness-detection/2015_and_2019/'

    !mkdir -p "{base_dir}"

    train_img_path = f'{base_dir}train/{sub_folder}'  # need to split this folder into train and val sets
    test_img_path = f'{base_dir}test/{sub_folder}' # images only, use to test

    df_train_15 = pd.read_csv(base_dir + 'labels/trainLabels15.csv')
    df_train_15.columns = ['id_code', 'diagnosis']
    df_train_15.head()

    df_train_19 = pd.read_csv(base_dir + 'labels/trainLabels19.csv')
    df_train_19.head()

    df_train = pd.concat([df_train_15, df_train_19])
    df_train=df_train.reset_index(drop=True)
    df_train.head()

    
    return (train_img_path, base_dir, train_img_path, test_img_path, df_train)
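Each selector returns the same five-element tuple, so swapping datasets is a one-line change downstream:

# Swap datasources without touching the rest of the pipeline; each
# selector returns (data_in_use, base_dir, train_img_path, test_img_path, df_train).
data_source = switch2019And2015()
data_in_use, base_dir, train_img_path, test_img_path, df_train = data_source
len(df_train)  # combined 2015 + 2019 labels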

    

Metrics

In [34]:
# ---------- Metrics ----------

# The competition is scored with the Quadratic Weighted Kappa metric,
# which measures agreement between predicted and actual grades.
def quadratic_kappa(y_hat, y):
    # round the regression output to the nearest grade before scoring
    return torch.tensor(cohen_kappa_score(torch.round(y_hat), y, weights='quadratic'),
                        device='cuda:0')
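# Quick sanity check on toy values: y_hat is rounded to the nearest
# grade first, so these agree perfectly and kappa = 1.0. The result
# tensor lands on cuda:0, matching the GPU training setup.
quadratic_kappa(torch.tensor([0.2, 1.7, 3.1, 4.0]),
                torch.tensor([0., 2., 3., 4.]))  # tensor(1., device='cuda:0')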
    

Learner and Data

In [68]:
from torch.utils.data.sampler import WeightedRandomSampler

class OverSamplingCallback(LearnerCallback):
    "Rebalance training batches by sampling images in inverse proportion to their class frequency."
    def __init__(self, learn:Learner, weights:torch.Tensor=None):
        super().__init__(learn)
        self.labels = self.learn.data.train_dl.dataset.y.items
        _, counts = np.unique(self.labels, return_counts=True)
        # default: weight each sample by the inverse frequency of its class
        self.weights = (weights if weights is not None else
                        torch.DoubleTensor((1/counts)[self.labels.astype(int)]))
        self.label_counts = np.bincount([self.learn.data.train_dl.dataset.y[i].data
                                         for i in range(len(self.learn.data.train_dl.dataset))])
        # draw enough samples to give every class as many as the largest class
        self.total_len_oversample = int(self.learn.data.c * np.max(self.label_counts))

    def on_train_begin(self, **kwargs):
        # swap the default batch sampler for a weighted random sampler
        self.learn.data.train_dl.dl.batch_sampler = BatchSampler(
            WeightedRandomSampler(self.weights, self.total_len_oversample),
            self.learn.data.train_dl.batch_size, False)
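# Toy illustration of the default per-sample weights (illustrative only):
# class counts here are [3, 1], so each sample gets the inverse frequency
# of its class, and the rare class is drawn about as often overall.
toy_labels = np.array([0, 0, 0, 1])
_, toy_counts = np.unique(toy_labels, return_counts=True)    # counts = [3, 1]
toy_weights = torch.DoubleTensor((1/toy_counts)[toy_labels])
print(toy_weights)  # tensor([0.3333, 0.3333, 0.3333, 1.0000], dtype=torch.float64)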
    

    
# ---------- Learner and Databunch ----------
def get_data_bunch_explore(data_source, image_size, bs=64, mode=0, use_xtra_tfms=False):
    # unpack the data source tuple
    data_in_use, base_dir, train_img_path, test_img_path, df_train = data_source
    print(f'Using data in: {train_img_path}') # print out which dataset is in use
    
    # let's start off with a small image size first 
    # and use progressive resizing to see how our initial model is performing
    sz = image_size 
    
    # 1. Set up the source with our custom pre-processing ItemList
    source = (CleanedImageList
                .from_df(df_train, train_img_path, suffix='.jpg', image_size=sz, mode=mode)
                .split_by_rand_pct(0.2, seed=42)
                .label_from_df(cols='diagnosis', label_cls=FloatList))
    
    # 2. Optionally add the heavy data augmentations
    if use_xtra_tfms:
        tfms = get_transforms(do_flip=True, 
                              flip_vert=True,
                              max_rotate=360,
                              max_zoom=False, 
                              max_lighting=0.1,
                              p_lighting=0.5,
                              xtra_tfms=zoom_crop(scale=(1.01, 1.45), do_rand=True))
        source = source.transform(tfms, size=sz)
    
    # 3. Build the databunch
    data_bunch = source.databunch(bs=bs).normalize(imagenet_stats)
    
    return data_bunch
    
def get_data_bunch(data_source, image_size, bs=64, use_xtra_tfms=False):
    
    # unpack the data source tuple
    data_in_use, base_dir, train_img_path, test_img_path, df_train = data_source
    print(f'Using data in: {train_img_path}') # print out which dataset is in use
    
    # let's start off with a small image size first 
    # and use progressive resizing to see how our initial model is performing
    sz = image_size 
    
    # 1. Set up the source
    source = (ImageList
                .from_df(df_train, train_img_path, suffix='.jpg')
                .split_by_rand_pct(0.2, seed=42)
                .label_from_df(cols='diagnosis', label_cls=FloatList))
    
    # 2. Optionally add the heavy data augmentations
    if use_xtra_tfms:
        tfms = get_transforms(do_flip=True, 
                              flip_vert=True,
                              max_rotate=360,
                              max_zoom=False, 
                              max_lighting=0.1,
                              p_lighting=0.5,
                              xtra_tfms=zoom_crop(scale=(1.01, 1.45), do_rand=True))
        source = source.transform(tfms, size=sz)
    
    # 3. Build the databunch
    data_bunch = source.databunch(bs=bs).normalize(imagenet_stats)
    
    # 4. Add the test set from the sample submission file.
    # Remember: for inference, we should apply the same image processing as we trained on!
    sample_df = pd.read_csv(base_dir + 'sample_submission.csv')
    data_bunch.add_test(ImageList.from_df(sample_df, base_dir, folder='test', suffix='.jpg'))

    return data_bunch

# "pretrained" is hardcoded to adapt to the PyTorch model function
from efficientnet_pytorch import EfficientNet
def efficient_net(b_class='b5'):
    return EfficientNet.from_pretrained(f'efficientnet-{b_class}', num_classes=1)


def get_cnn_learner(arch, data_bunch, tofp16=True, oversample=False):
    
    # 1. Choose callbacks: always plot training curves; optionally oversample
    callback_fns = [ShowGraph]
    if oversample:
        print('is oversampling')
        callback_fns = [partial(OverSamplingCallback), ShowGraph]

    # 2. Set up a new learner
    learner = Learner(data_bunch, arch, model_dir="models",
                      metrics=quadratic_kappa, callback_fns=callback_fns)
    
    # 3. Train in mixed precision by default
    if tofp16:
        learner = learner.to_fp16()

    return learner

Pipelining Helper methods

I use these general helper methods to run training. They encapsulate a lot of the benchmarking and training runs that I execute, and make it easy to pass hyperparameters through whilst abstracting away some of the CNN setup code. A usage sketch follows the code cell.

In [36]:
class Experiment():
    
    def __init__(self, name, data_source, arch, image_size, bs, wd,
                 use_xtra_tfms, oversample, pretrained_model_name=None):
        
        super().__init__()
        
        self.name = name
        self.data_source = data_source
        self.arch = arch
        self.image_size = image_size
        self.bs = bs
        self.wd = wd
        self.use_xtra_tfms = use_xtra_tfms
        self.oversample = oversample
        self.pretrained_model_name = pretrained_model_name
        
        self.data_in_use, self.base_dir, self.train_img_path, self.test_img_path, self.df_train = self.data_source
        
        # build the learner and databunch once
        self.learner, self.data_bunch = get_learner_and_databunch(
                                            self.arch, 
                                            self.data_source,
                                            image_size=self.image_size,
                                            bs=self.bs,
                                            use_xtra_tfms=self.use_xtra_tfms,
                                            oversample=self.oversample)
        
        # optionally warm-start from a previously saved model
        if self.pretrained_model_name:
            print(f'Loading pretrained model: {self.pretrained_model_name}')
            self.learner.load(self.base_dir + self.pretrained_model_name)
            self.learner.to_fp16()
            
    def find_lr(self):
        # find the inital lr for frozen training
        self.learner.lr_find(wd=self.wd)
        self.learner.recorder.plot(suggestion=True)
        
        
    def fit_frozen(self, epochs, lr):
        
        self.learner.fit_one_cycle(
            epochs, 
            lr, 
            wd=self.wd, 
            callbacks=[SaveModelCallback(self.learner, monitor='valid_loss', name=f'best_{self.name}')])
        
        %notify -m "fit_one_cycle finished"
        
        print(f'Saved model: {self.base_dir + self.name}')
        self.learner.save(self.base_dir + self.name)
        
    def unfreeze(self):
        self.learner.unfreeze()
        self.learner.lr_find()
        self.learner.recorder.plot()
        
    def fit_unfrozen(self, epochs, lr):
        
        self.learner.fit_one_cycle(
            epochs, 
            lr, 
            wd=self.wd, 
            callbacks=[SaveModelCallback(self.learner, monitor='valid_loss', name=f'best_unf_{self.name}')])
        
        %notify -m "unfrozen fit_one_cycle finished"


        self.learner.save(self.base_dir + 'unf_' + self.name)
        print(f'Saved model: {self.base_dir + "unf_" + self.name}')
        
    def load_frozen(self):
        self.learner.load(self.base_dir + self.name)
        self.learner.to_fp16()
              
    def load_best_frozen(self):
        self.learner.load(self.train_img_path + 'models/best_' + self.name)
        self.learner.to_fp16()
        print(f'Loaded best model {self.train_img_path + "models/best_" + self.name}')
        
    
        
    def get_kappa_score(self):
        get_kappa_score(self.learner)
        
    def show_batch(self):
        self.data_bunch.show_batch(4, figsize=(20,20))

# Helper to set up a learner and a databunch with baselined defaults,
# returning both as a tuple. The only required parameters are the
# architecture and the data source. For everything else
# you can pass in override values to test different hyperparameters.
def get_learner_and_databunch(
        arch, 
        data_source,
        image_size=128,
        bs=64,
        use_xtra_tfms=False,
        oversample=False):

    # data bunch
    data_bunch = get_data_bunch(
        data_source,
        image_size, 
        bs=bs, 
        use_xtra_tfms=use_xtra_tfms)

    # create a learner
    learner = get_cnn_learner(arch, data_bunch, oversample=oversample) 
    
        
    return (learner, data_bunch)


def get_kappa_score(learner):
    preds, y = learner.get_preds()
    score = quadratic_kappa(preds, y)
    print('Kappa score is {0}'.format(score))

    return score
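Putting the pieces together, a typical run through the wrapper looks like this (the experiment name and hyperparameter values here are illustrative, not the tuned settings):

# Illustrative end-to-end run with the Experiment wrapper.
exp = Experiment(
    name='b5_288_example',
    data_source=switch2019And2015(),
    arch=efficient_net('b5'),
    image_size=288, bs=32, wd=1e-2,
    use_xtra_tfms=True, oversample=True)
exp.find_lr()                        # pick a frozen-stage learning rate
exp.fit_frozen(epochs=10, lr=1e-3)
exp.unfreeze()                       # re-run lr_find for the unfrozen net
exp.fit_unfrozen(epochs=10, lr=slice(1e-5, 1e-4))
exp.get_kappa_score()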

Summary

The following is an outline of how I approached the problem, roughly in the order I tackled the project. At each stage the aim was to find the best settings that would allow me to move forward with each experiment, and I spent a lot of time getting to know the data, baselining, and trying to uncover bugs in the training process.

Roughly, the order of operations for this project is outlined below:

  1. Exploratory Data analysis
  2. Image processing baselines
  3. Model and architecture baselines, deciding whether regression or classification is the better approach
  4. Adding data and data augmentations
  5. Increasing image size
  6. Tuning other hyperparameters like dropout and weight decay
  7. Progressive resizing
  8. Increasing epochs and training times
  9. Ensembling

Common training settings

  • Transfer learning with pre-trained weights
  • The fit_one_cycle policy to vary learning rates for best results
  • Adam as our optimiser
  • Treating the task as a regression problem with MSELoss as our cost function (see the sketch after this list)
  • Oversampling the dataset
  • The expanded dataset from 2015 and 2019
  • Heavy data augmentations: flipping, rotation, zoom, crops, and lighting
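The regression framing is worth spelling out: labelling with FloatList makes fastai treat the targets as continuous values with an MSE-style loss, and at inference we map the continuous outputs back onto the 0-4 diagnosis grades. A minimal sketch, reusing the variables from the code section above:

# Minimal sketch of the regression framing. FloatList labels make
# fastai infer an MSE-style loss; predictions are rounded and clipped
# back to the 0-4 grades.
src = (ImageList
        .from_df(df_train, train_img_path, suffix='.jpg')
        .split_by_rand_pct(0.2, seed=42)
        .label_from_df(cols='diagnosis', label_cls=FloatList))
data = src.databunch(bs=32).normalize(imagenet_stats)
learn = Learner(data, efficient_net('b5'), metrics=quadratic_kappa)
preds, _ = learn.get_preds()
grades = preds.squeeze().round().clamp(0, 4).long()  # back to grades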

Things we did not attempt

  • Stratified K-folds
  • Test Time Augmentation (TTA)
  • Pseudo labelling

Exploratory Data Analysis (EDA)

Before we start any training, we try to get a good sense of the raw data, understand its distributions, and explore its features and idiosyncrasies.

Getting to know our dataset is an important first step, and helps us tune our model towards more accurate predictions.
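For example, a first look at the labels shows how skewed the 2019 data is towards grade 0 (no disease), which motivates the oversampling used later:

# Counts per diagnosis grade (0 = no DR, 4 = proliferative DR).
_, _, _, _, df_train = switch2019Only()
df_train['diagnosis'].value_counts().sort_index()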

In [19]:
data_source = switch2019Only()

learner, data_bunch = get_learner_and_databunch(
        efficient_net('b2'), 
        data_source,
        image_size=224,
        bs=32,
        use_xtra_tfms=False,
        oversample=False)

data_bunch.show_batch(5, figsize=(20,20))
Loaded pretrained weights for efficientnet-b2
Using data in: /hdd/data/blindness-detection/2015_and_2019/train/