(Received: 3-Aug.-2021, Revised: 5-Sep.-2021, Accepted: 12-Sep.-2021)
Marwin B. Alejo
The advantages of the ear over other biometric modalities as a means of identification have encouraged researchers to apply state-of-the-art computing methods to ear recognition. This paper presents a deep learning pipeline for unconstrained ear recognition using transformer neural networks: the Vision Transformer (ViT) and Data-efficient image Transformers (DeiTs). The ViT-Ear and DeiT-Ear models of this study achieved recognition accuracy comparable to or higher than that of state-of-the-art CNN-based methods and other deep learning algorithms. This study also found that the Vision Transformer and Data-efficient image Transformer models outperform ResNets without using exhaustive data augmentation. Moreover, the performance of ViT-Ear is close to that reported in other ViT-based biometric studies.

