Kerr, T., Clark, J. R., Fileman, E. S., Widdicombe, C. E. and Pugeault, N. (2020) Collaborative deep learning models to handle class imbalance in FlowCam plankton imagery. IEEE Access, 8, pp. 170013-170032. (doi: 10.1109/ACCESS.2020.3022242)
Text: 223144.pdf - Published Version. Available under License Creative Commons Attribution. 4MB
Abstract
Using automated imaging technologies, it is now possible to generate unprecedented volumes of plankton image data which can be used to study the composition of plankton assemblages. However, the current need to manually classify individual images introduces a bottleneck into processing chains. Although Machine Learning techniques have been used to try to address this issue, past efforts have suffered from accuracy limitations, especially in minority classes. Here we use state-of-the-art methods in Deep Learning to investigate suitable architectures for training an automated plankton classification system which achieves high efficacy for both abundant and rare taxa. We collected live plankton from Station L4 in the Western English Channel and imaged 11,371 particles covering 104 taxonomic groups using the automated plankton imaging system FlowCam. The image set contained a severe class imbalance, with some taxa represented by more than 600 images while other, rarer taxa were represented by just 14. We demonstrate that by allowing multiple Deep Learning models to collaborate in a single classification system, classification accuracy improves for minority classes when compared with the best individual model. The top collaborative model achieved a 6% improvement in F1 score over the best individual model, while overall accuracy improved by 3.2%. This resulted in a 97.4% overall accuracy score and a 96.2% F1 macro score on a separate holdout test set containing 104 taxonomic groups. Based on a survey of similar studies in the literature, we believe collaborative deep learning models can significantly improve the accuracy of existing automated plankton classification systems.
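The abstract reports both overall accuracy and a macro-averaged F1 score. The sketch below is illustrative only and is not taken from the paper: it uses invented toy labels (`common`, `rare`) and hypothetical helper names (`accuracy`, `macro_f1`) to show why the two metrics can diverge under class imbalance. Macro F1 averages the per-class F1 scores with equal weight, so a rare taxon counts as much as an abundant one.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores over all labels seen."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined here as 0 when the class
        # is never predicted correctly and the denominator is zero.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy imbalanced set (invented data): 95 "common" images, 5 "rare" images.
y_true = ["common"] * 95 + ["rare"] * 5
# A degenerate classifier that always predicts the majority class.
y_pred = ["common"] * 100

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
print(f"accuracy = {acc:.3f}, macro F1 = {f1:.3f}")
```

In this toy case the classifier is correct on 95 of 100 samples yet never identifies the rare class, so its macro F1 is roughly half its accuracy. This is why a macro-averaged score is a more sensitive indicator of minority-class performance than overall accuracy alone.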
Item Type: | Articles |
---|---|
Additional Information: | CW, EF and JC were funded through the UK Natural Environment Research Council’s National Capability Long-term Single Centre Science Programme, Climate Linked Atlantic Sector Science, grant number NE/R015953/1; this work is a contribution to Theme 1.3 - Biological Dynamics. NP was supported by the Alan Turing Institute, grant number EP/N510129/1. |
Status: | Published |
Refereed: | Yes |
Glasgow Author(s) Enlighten ID: | Pugeault, Dr Nicolas |
Authors: | Kerr, T., Clark, J. R., Fileman, E. S., Widdicombe, C. E., and Pugeault, N. |
College/School: | College of Science and Engineering > School of Computing Science |
Journal Name: | IEEE Access |
Publisher: | IEEE |
ISSN: | 2169-3536 |
ISSN (Online): | 2169-3536 |
Published Online: | 07 September 2020 |
Copyright Holders: | Copyright © 2020 The Authors |
First Published: | First published in IEEE Access 8: 170013-170032 |
Publisher Policy: | Reproduced under a Creative Commons license |