Nutritional Content Detection Using Vision Transformers: An Intelligent Approach

Authors

  • Saikat Banerjee, State Aided College Teacher, Department of Computer Applications, Vivekananda Mahavidyalaya, Haripal, Hooghly, West Bengal, India
  • Debasmita Palsani, State Aided College Teacher, Department of Nutrition, Vivekananda Mahavidyalaya, Haripal, Hooghly, West Bengal, India
  • Abhoy Chand Mondal, Professor, Department of Computer Science, The University of Burdwan, Golapbag, West Bengal, India

Keywords:

Machine Learning, Vision Transformer (ViT), Convolutional Neural Networks, Food, Nutrition

Abstract

The nutritional composition of food supports energy production, growth, and overall health while also helping to prevent disease and strengthen immunity. A balanced diet improves physical and mental health, fostering a longer, healthier life. Accurate estimation of nutritional value from food photographs is crucial for dietary monitoring, personalized nutrition, and health management. Conventional methods based on convolutional neural networks struggle to generalize across diverse food varieties, intricate presentations, and overlapping items. Vision Transformers offer a strong alternative because their self-attention mechanisms can represent global dependencies across an image. This research introduces an innovative pipeline that uses Vision Transformers to estimate calories, macronutrients such as protein and fat, and micronutrients directly from food photos. The model uses pre-trained Vision Transformers, fine-tuned on a variety of food datasets, and incorporates supplementary inputs, such as recipe details, through multimodal fusion.
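
The data flow described in the abstract (self-attention over image patches, followed by fusion with recipe information and a nutrient regression head) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than the authors' implementation: the abstract does not specify the fusion strategy, so simple concatenation is used; the attention step is single-head with identity projections, whereas a real Vision Transformer uses learned multi-head projections and stacked layers; and the names `NUTRIENTS`, `predict_nutrients`, the patch embeddings, the recipe features, and the random weights are all hypothetical placeholders.

```python
import math
import random

# Hypothetical regression targets; the abstract names calories, protein, and fat.
NUTRIENTS = ["calories_kcal", "protein_g", "fat_g"]


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Simplification (assumption): queries, keys, and values are the tokens
    themselves, with no learned projection matrices.
    """
    d = len(tokens[0])
    out = []
    for q in tokens:
        # Every patch scores every other patch, which is what lets the
        # model relate distant regions of the image.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, tokens)) for j in range(d)])
    return out


def mean_pool(tokens):
    """Average the patch tokens into a single image vector."""
    n = len(tokens)
    return [sum(t[j] for t in tokens) / n for j in range(len(tokens[0]))]


def linear_head(features, weights, bias):
    """One linear output per nutrient."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, bias)]


def predict_nutrients(patch_tokens, recipe_vec, weights, bias):
    """Attend over patches, pool, concatenate recipe features, regress."""
    image_vec = mean_pool(self_attention(patch_tokens))
    fused = image_vec + recipe_vec  # assumed fusion: plain concatenation
    return dict(zip(NUTRIENTS, linear_head(fused, weights, bias)))


# Placeholder inputs: 9 patch embeddings of dimension 4 and 3 recipe features.
rng = random.Random(0)
patch_tokens = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(9)]
recipe_vec = [1.0, 0.0, 0.5]
# Untrained random weights; a real model would learn these during fine-tuning.
weights = [[rng.uniform(-1, 1) for _ in range(7)] for _ in NUTRIENTS]
bias = [0.0, 0.0, 0.0]

prediction = predict_nutrients(patch_tokens, recipe_vec, weights, bias)
print(prediction)
```

Because the weights are random and untrained, the printed numbers carry no nutritional meaning; the sketch only shows how an image representation and auxiliary recipe features might be combined into one regression input.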

Published

2024-12-06

Section

Articles