| Title: | FuDoBa: fusing document and knowledge graph based representations with Bayesian optimisation |
|---|
| Authors: | Koloski, Boshko, Institut "Jožef Stefan" (Author); Pollak, Senja, Institut "Jožef Stefan" (Author); Navigli, Roberto (Author); Škrlj, Blaž, Institut "Jožef Stefan" (Author) |
| Files: | URL - Source URL, to access visit https://link.springer.com/article/10.1007/s10994-026-07008-y
PDF - Presentation file, download (7.33 MB) MD5: 9BB8D0EAE48F39924BFB52DCFE589268
|
|---|
| Language: | English |
|---|
| Typology: | 1.01 - Original scientific article |
|---|
| Organization: | IJS - Institut Jožef Stefan
|
|---|
| Abstract: | Building on the success of large language models (LLMs), LLM-based representations have dominated the document representation landscape, achieving strong performance on document embedding benchmarks. However, high-dimensional, computationally expensive LLM embeddings can be too generic or inefficient for domain-specific and resource-scarce applications. To address these limitations, we introduce FuDoBa, a Bayesian optimisation-based representation learning method that integrates LLM embeddings with domain-specific structured knowledge, sourced both locally and from external repositories such as WikiData. This fusion produces low-dimensional, task-relevant representations while reducing training complexity and yielding interpretable early-fusion weights for improved classification performance. We demonstrate the effectiveness of our approach on six datasets across two domains, showing that when paired with robust AutoML-based classifiers, our method performs on par with, or surpasses, proprietary LLM-only embedding baselines, while offering modality-wise interpretability and a smaller dimensional footprint. |
|---|
| Keywords: | document classification, Bayesian optimisation, representation learning, knowledge graphs |
|---|
| Publication status: | Published |
|---|
| Publication version: | Published version |
|---|
| Submitted for review: | 23.04.2025 |
|---|
| Article acceptance date: | 02.02.2026 |
|---|
| Publication date: | 06.03.2026 |
|---|
| Publisher: | Springer Nature |
|---|
| Year of publication: | 2026 |
|---|
| Number of pages: | pp. 1-39 |
|---|
| Numbering: | Vol. 115, article no. 61 |
|---|
| Source: | Switzerland |
|---|
| PID: | 20.500.12556/DiRROS-28309  |
|---|
| UDC: | 004.8 |
|---|
| ISSN on article: | 1573-0565 |
|---|
| DOI: | 10.1007/s10994-026-07008-y  |
|---|
| COBISS.SI-ID: | 271609091  |
|---|
| Copyright: | © The Author(s) 2026 |
|---|
| Note: | Title from title screen;
Co-authors from Slovenia: Senja Pollak, Blaž Škrlj;
Description based on the resource as of 13 March 2026;
|
|---|
| Publication date in DiRROS: | 13.03.2026 |
|---|
| Views: | 22 |
|---|
| Downloads: | 20 |
|---|