<?xml version="1.0"?>
<metadata xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dc="http://purl.org/dc/elements/1.1/"><dc:title>High entropy alloys database generated with large language model</dc:title><dc:creator>Chizhevskiy, Vladimir (Author)</dc:creator>
<dc:creator>Cvelbar, Uroš (Author)</dc:creator>
<dc:creator>Zavašnik, Janez (Author)</dc:creator>
<dc:creator>Nominé, Alexandre (Author)
	</dc:creator><dc:subject>high-entropy alloys</dc:subject><dc:subject>natural language processing</dc:subject><dc:subject>materials informatics</dc:subject><dc:description>High entropy alloys (HEAs) represent a promising area in materials science, but systematic analysis of the extensive literature remains a challenge. In this study, we used Natural Language Processing (NLP) techniques to analyze 4,625 scientific articles from a restricted corpus representing publisher-accessible literature, identifying and characterizing 12,427 distinct high entropy alloys. Through prompt engineering and experiments with Large Language Models (LLMs), including mamba-transformer hybrid architectures, we developed a structured database that captures important parameters such as alloy compositions, phase numbers and crystallographic structures. The analysis distinguishes between theoretical and experimental studies and records the methodological details specific to each category. For theoretical work, we systematically documented modeling approaches and key computational parameters, while experimental studies are cataloged with their synthesis methods and critical processing conditions. This database represents a large-scale, automated extraction of HEA research data. The accuracy of the data ranges from 78.7% for HEA phase identification to 94.3% for HEA composition.</dc:description><dc:publisher>Nature Publishing Group</dc:publisher><dc:date>2026</dc:date><dc:date>2026-04-30 11:59:21</dc:date><dc:type>Unknown</dc:type><dc:identifier>29238</dc:identifier><dc:identifier>UDC: 620.1/.2</dc:identifier><dc:identifier>ISSN on article: 2052-4463</dc:identifier><dc:identifier>DOI: 10.1038/s41597-026-06930-z</dc:identifier><dc:identifier>COBISS_ID: 271580163</dc:identifier><dc:source>United Kingdom</dc:source><dc:language>sl</dc:language><dc:rights>© The Author(s) 2026</dc:rights></metadata>
