Abstract

Insect identification by combining different neural networks

Background: Traditional insect species classification relies on taxonomic experts examining unique physical characteristics of specimens, a time-consuming and error-prone process. Machine learning (ML) offers a promising alternative by identifying subtle morphological and genetic differences computationally. However, most existing approaches classify undescribed species as outliers, which limits their utility for biodiversity monitoring.

Objective: This study aims to develop an ML method capable of simultaneously classifying described species and grouping undescribed species by genus, thereby advancing the field of automated insect classification.

Method: We propose a novel ensemble approach that combines neural networks (convolutional and attention-based) with support vector machines (SVMs), using both DNA barcodes and insect images as input data. So that the neural networks can process both data types, we transform one-dimensional feature vectors into matrices using wavelet transforms. Additionally, a transformer-based architecture integrates the DNA barcode and image features to improve classification accuracy.
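To illustrate the vector-to-matrix step, the sketch below turns a one-dimensional feature vector into a two-dimensional matrix with a continuous wavelet transform, giving one row per scale and one column per position. The abstract does not state which wavelet or transform variant is used, so the Ricker (Mexican hat) wavelet, the scale values, and the function names `ricker` and `vector_to_scalogram` are assumptions chosen for illustration, not the authors' implementation.

```python
import math


def ricker(t, scale):
    """Ricker (Mexican hat) wavelet at offset t for a given scale (assumed choice)."""
    x = t / scale
    norm = 2.0 / (math.sqrt(3.0 * scale) * math.pi ** 0.25)
    return norm * (1.0 - x * x) * math.exp(-0.5 * x * x)


def vector_to_scalogram(vector, scales):
    """Map a 1-D feature vector to a len(scales) x len(vector) matrix.

    Each row is the vector filtered by the wavelet at one scale;
    positions outside the vector are treated as zero.
    """
    n = len(vector)
    matrix = []
    for s in scales:
        # The Ricker wavelet is negligible beyond roughly five scale widths.
        half = int(math.ceil(5 * s))
        offsets = range(-half, half + 1)
        kernel = [ricker(k, s) for k in offsets]
        row = []
        for i in range(n):
            acc = 0.0
            for k, w in zip(offsets, kernel):
                j = i + k
                if 0 <= j < n:
                    acc += vector[j] * w
            row.append(acc)
        matrix.append(row)
    return matrix


# Hypothetical 8-dimensional feature vector and three illustrative scales.
features = [0.1, 0.4, 0.9, 0.3, 0.0, 0.7, 0.2, 0.5]
scalogram = vector_to_scalogram(features, scales=[1, 2, 4])
print(len(scalogram), len(scalogram[0]))  # prints "3 8"
```

The resulting matrix can be fed to a convolutional network like a single-channel image. Small scales respond to local variation between neighbouring features, while large scales capture broader trends across the vector.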

Experimental Results: Our method was evaluated on a comprehensive dataset containing paired insect images and DNA barcodes for 1,040 species across four insect orders. The results demonstrate superior performance compared to existing methods in classifying described species and grouping undescribed ones by genus.

Conclusion: The proposed approach advances automated insect classification by handling both described and undescribed species, and it has the potential to support global biodiversity monitoring. The MATLAB/PyTorch source code and the dataset are available at https://github.com/LorisNanni/Insect-identificatio.

Keywords: ensemble; convolutional neural networks; support vector machine; attention network; insect classification; DNA barcode.
