Towards taming oral cancer data in the wild: an explainable machine learning approach

Published in ICASSP, 2020

Recommended citation: Halpern, van Son, van den Brekel (2020). "Towards taming oral cancer data in the wild: an explainable machine learning approach." ICASSP. 1(1). http://academicpages.github.io/files/paper1.pdf

This is our supplementary page for the paper: “Towards taming oral cancer data in the wild: an explainable machine learning approach”.

In the paper, we present three methods for explaining machine learning decisions with audio. The sound files below are examples of the three methods:

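| | Control (healthy) speech | Oral cancer (pathological) speech |
|---|---|---|
| Original | (audio) | (audio) |
| Activation map multiplication | (audio) | (audio) |
| Thresholded activation map multiplication | (audio) | (audio) |
| Binary thresholded activation map multiplication | (audio) | (audio) |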

The thresholded and binary thresholded versions sound very similar, and both produce a very simple acoustic signal, close to a pure tone. This is because thresholding discards information. The plain multiplication only reweights and renormalises the spectrum, so intelligibility is often entirely preserved.
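To make the difference between the three variants concrete, here is a minimal NumPy sketch. It is not the code used in the paper: the function names, the threshold value, the peak-based renormalisation, and the assumption that the activation map has the same shape as the magnitude spectrogram and is scaled to [0, 1] are all illustrative choices.

```python
import numpy as np


def renormalise(masked, reference):
    """Rescale `masked` so that its peak matches the peak of `reference`.

    Peak matching is an assumed normalisation, chosen only for illustration.
    """
    peak = masked.max()
    if peak == 0:
        return masked
    return masked * (reference.max() / peak)


def explain_variants(spec, act, threshold=0.5):
    """Return (multiplied, thresholded, binary) masked spectrograms.

    spec: magnitude spectrogram, shape (frequency bins, frames)
    act:  activation map of the same shape, assumed to lie in [0, 1]
    """
    if spec.shape != act.shape:
        raise ValueError("spectrogram and activation map must have the same shape")

    # Plain multiplication: every bin is kept, only reweighted.
    multiplied = renormalise(spec * act, spec)

    # Thresholded: bins below the threshold are zeroed, the rest keep their weight.
    keep = act >= threshold
    thresholded = renormalise(spec * np.where(keep, act, 0.0), spec)

    # Binary thresholded: bins are either kept unchanged or zeroed.
    binary = renormalise(spec * keep, spec)

    return multiplied, thresholded, binary


# Toy data standing in for a real spectrogram and activation map.
rng = np.random.default_rng(0)
spec = rng.random((128, 50))
act = rng.random((128, 50))

multiplied, thresholded, binary = explain_variants(spec, act, threshold=0.8)
print(np.count_nonzero(multiplied), np.count_nonzero(thresholded), np.count_nonzero(binary))
```

In this sketch both thresholded variants zero out exactly the same bins, which is consistent with them sounding alike, while the plain multiplication keeps every bin. Turning the masked spectrograms back into audio would additionally require phase information and an inverse transform, which is not shown here.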

Examples with binary thresholding sometimes sound as if the network is attending to the buzzing residuals of the recording microphone. Some characteristics of the original prosody are preserved in this case.