Digital repository of Slovenian research organisations


Title:Large language models in food and nutrition science : opportunities, challenges, and the case of FoodyLLM
Authors:Gjorgjevikj, Ana, Institut "Jožef Stefan" (Author)
Martinc, Matej, Institut "Jožef Stefan" (Author)
Cenikj, Gjorgjina, Institut "Jožef Stefan" (Author)
Drole, Jan, Institut "Jožef Stefan" (Author)
Ogrinc, Nives, Institut "Jožef Stefan" (Author)
Džeroski, Sašo, Institut "Jožef Stefan" (Author)
Koroušić-Seljak, Barbara, Institut "Jožef Stefan" (Author)
Eftimov, Tome, Institut "Jožef Stefan" (Author), et al.
Files:Source URL: https://www.sciencedirect.com/science/article/pii/S2665927126000511?via%3Dihub
PDF: Presentation file (6.37 MB)
MD5: B15A5C3391C3B2DF6F3C98B691CC9D4E
Description: Research data are available at https://github.com/matejMartinc/FoodyLLM, https://huggingface.co/Matej/FoodyLLM, and https://zenodo.org/records/17798877.
 
Language:English
Typology:1.01 - Original Scientific Article
Organization:IJS - Jožef Stefan Institute
Abstract:Background: Reliable nutrient profiling and semantic interoperability are essential for scalable dietary assessment, food labeling (e.g., traffic-light schemes), and FAIR integration of food composition and consumption data. However, general-purpose large language models (LLMs) are not systematically exposed to structured recipe–nutrition mappings and food ontologies, limiting their accuracy and trustworthiness in food and nutrition tasks. Scope and approach: We review recent LLM advances in life sciences and healthcare and analyze the gap in food and nutrition applications. To address this gap, we introduce FoodyLLM, a domain-specialized LLM fine-tuned on 225k task-aligned QA pairs for (i) recipe nutrient estimation, (ii) traffic-light classification, and (iii) ontology-based entity linking to support FAIR food data interoperability. We benchmark FoodyLLM against strong general-purpose baselines (e.g., Llama 3 8B, Gemini 2.0) under zero-/few-shot prompting across five evaluation folds. Key findings: Across all tasks, FoodyLLM substantially outperforms general-purpose LLMs. For nutrient estimation across all macronutrients (fat, protein, salt, saturates, sugar), accuracy increases from 0.43–0.63 to 0.91–0.97; for traffic-light classification across all nutrients and color categories, macro F1 improves from 0.46–0.80 to 0.86–0.97; and for ontology-based food entity linking across FoodOn, SNOMED-CT, and Hansard, macro F1 increases from 0.33–0.44 (best general-purpose baseline) to 0.93–0.98 on artificial NEL data, and from 0.24–0.51 to 0.67–0.84 on real corpora (CafeteriaSA and CafeteriaFCD). Overall, our results demonstrate the practical value of domain-specialized LLMs in food and nutrition research. They enable automated dietary assessment, large-scale nutritional monitoring, and FAIR data integration, while opening new pathways toward sustainable and personalized nutrition.
Keywords:FoodyLLM, nutrient estimation, data interoperability
Publication status:Published
Publication version:Version of Record
Submitted for review:13.11.2025
Article acceptance date:13.02.2026
Publication date:16.02.2026
Publisher:Elsevier
Year of publishing:2026
Number of pages:pp. 1-26
Numbering:Vol. 12, [article no.] 101351
Source:Netherlands
PID:20.500.12556/DiRROS-27977
UDC:004.8
ISSN on article:2665-9271
DOI:10.1016/j.crfs.2026.101351
COBISS.SI-ID:270414595
Copyright:© 2026 The Authors.
Note:Title from title screen; Co-authors: Matej Martinc, Gjorgjina Cenikj, Jan Drole, Nives Ogrinc, Sašo Džeroski, Barbara Koroušić Seljak, Tome Eftimov; Resource description as of 4 March 2026.
Publication date in DiRROS:04.03.2026
Views:41
Downloads:19



Record is a part of a journal

Title:Current research in food science
Publisher:Elsevier B.V.
ISSN:2665-9271
COBISS.SI-ID:18959875

Document is financed by a project

Funder:ARIS - Slovenian Research and Innovation Agency
Project number:P2-0098-2019
Name:Computer Structures and Systems

Funder:ARIS - Slovenian Research and Innovation Agency
Project number:P2-0103-2022
Name:Knowledge Technologies

Funder:ARIS - Slovenian Research and Innovation Agency
Project number:P1-0143-2020
Name:Cycling of substances in the environment, mass balances, modelling of environmental processes, and risk assessment

Funder:ARIS - Slovenian Research and Innovation Agency
Project number:GC-0001-2024
Name:Artificial Intelligence for Science

Funder:ARIS - Slovenian Research and Innovation Agency
Project number:BI-US/24-26-081-2024
Name:Mining scientific literature to predict interactions between food, diseases, and drugs using knowledge graphs

Funder:EC - European Commission
Project number:101211695
Name:Framework for Robust and Explainable Automated Large Language Model Selection
Acronym:AutoLLMSelect

Funder:EC - European Commission
Project number:101187010
Name:Leveraging Benchmarking Data for Automated Machine Learning and Optimization
Acronym:AutoLearn-SI

Funder:EC - European Commission
Project number:101060712
Name:European integration of new technologies and social-economic solutions for increasing consumer trust and engagement in seafood products
Acronym:FishEUTrust

Funder:EC - European Commission
Project number:101254461
Name:Slovenian AI Factory
Acronym:SLAIF

Funder:EC - European Commission
Project number:101198470
Name:Large Language Models for the European Union
Acronym:LLMs4EU

Funder:ARIS - Slovenian Research and Innovation Agency
Project number:J7-70265
Name:Developing and Validating a Comprehensive Food Composition Database and a Knowledge Base Aligned with FAIR Principles, Artificial Intelligence Methods, and Large Language Models
Acronym:AI4Food

Funder:ARIS - Slovenian Research and Innovation Agency
Project number:PR-12393
Name:Young Researchers Grant

Licences

License:CC BY 4.0, Creative Commons Attribution 4.0 International
Link:http://creativecommons.org/licenses/by/4.0/
Description:This is the standard Creative Commons license that gives others maximum freedom to do what they want with the work as long as they credit the author.
Licensing start date:16.02.2026
Applies to:VoR

Secondary language

Language:Slovenian
Title:Large language models in food and nutrition science: opportunities, challenges, and the case of FoodyLLM
Keywords:FoodyLLM, data interoperability

