Collaborative deep learning models to handle class imbalance in FlowCam plankton imagery

Kerr, T., Clark, J. R., Fileman, E. S., Widdicombe, C. E. and Pugeault, N. (2020) Collaborative deep learning models to handle class imbalance in FlowCam plankton imagery. IEEE Access, 8, pp. 170013-170032. (doi: 10.1109/ACCESS.2020.3022242)

223144.pdf - Published Version
Available under License Creative Commons Attribution.

Abstract

Using automated imaging technologies, it is now possible to generate unprecedented volumes of plankton image data which can be used to study the composition of plankton assemblages. However, the current need to manually classify individual images introduces a bottleneck into processing chains. Although Machine Learning techniques have been used to try to address this issue, past efforts have suffered from accuracy limitations, especially in minority classes. Here we use state-of-the-art methods in Deep Learning to investigate suitable architectures for training an automated plankton classification system which achieves high efficacy for both abundant and rare taxa. We collected live plankton from Station L4 in the Western English Channel and imaged 11,371 particles covering 104 taxonomic groups using the automated plankton imaging system FlowCam. The image set contained a severe class imbalance, with some taxa represented by > 600 images while other, rarer taxa were represented by just 14. We demonstrate that by allowing multiple Deep Learning models to collaborate in a single classification system, classification accuracy improves for minority classes when compared with the best individual model. The top collaborative model achieved a 6% improvement in F1 score over the best individual model, while overall accuracy improved by 3.2%. This resulted in a 97.4% overall accuracy score and a 96.2% F1 macro score on a separate holdout test set containing 104 taxonomic groups. Based on a survey of similar studies in the literature, we believe collaborative deep learning models can significantly improve the accuracy of existing automated plankton classification systems.
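
The abstract reports both overall accuracy and a macro F1 score. As a minimal sketch of why the two can diverge under class imbalance, the following Python example computes both on a small, entirely hypothetical label set (95 images of a common taxon, 5 of a rare one). The data, the class names and the helper functions `accuracy` and `macro_f1` are illustrative assumptions and are not taken from the paper.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so rare classes count as much as common ones."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class is never seen or predicted
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data (not from the paper): a severe 95:5 class imbalance.
y_true = ["common"] * 95 + ["rare"] * 5
# A degenerate classifier that always predicts the majority class.
y_pred = ["common"] * 100

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
print(f"accuracy = {acc:.3f}, macro F1 = {f1:.3f}")
```

In this toy case the classifier scores 0.950 on accuracy but only about 0.487 on macro F1, because the rare class contributes an F1 of zero. This is one reason a macro-averaged score is a useful companion to overall accuracy when minority taxa matter.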

Item Type:Articles
Additional Information:CW, EF and JC were funded through the UK Natural Environment Research Council’s National Capability Long-term Single Centre Science Programme, Climate Linked Atlantic Sector Science, grant number NE/R015953/1; this work is a contribution to Theme 1.3 - Biological Dynamics. NP was supported by the Alan Turing Institute, grant number EP/N510129/1.
Status:Published
Refereed:Yes
Glasgow Author(s) Enlighten ID:Pugeault, Dr Nicolas
Authors: Kerr, T., Clark, J. R., Fileman, E. S., Widdicombe, C. E., and Pugeault, N.
College/School:College of Science and Engineering > School of Computing Science
Journal Name:IEEE Access
Publisher:IEEE
ISSN:2169-3536
ISSN (Online):2169-3536
Published Online:07 September 2020
Copyright Holders:Copyright © 2020 The Authors
First Published:First published in IEEE Access 8: 170013-170032
Publisher Policy:Reproduced under a Creative Commons license
