Digital Repository of Research Organisations of Slovenia

Record details

Title: Counting trees : a treebank-driven exploration of syntactic variation in speech and writing across languages
Authors: Dobrovoljc, Kaja, Jožef Stefan Institute (Author)
Files: URL - Source URL; to access, visit https://www.degruyterbrill.com/document/doi/10.1515/cllt-2025-0046/html
 
PDF - Presentation file, download (1.93 MB)
MD5: 956FA3ECE7A9C6897BC90B5C5D9AC9A7
 
Language: English
Typology: 1.01 - Original scientific article
Organization: IJS - Jožef Stefan Institute
Abstract: This paper presents a novel treebank-driven approach to comparing syntactic structures in speech and writing using dependency-parsed corpora. Adopting a fully inductive, bottom-up method, we define syntactic structures as delexicalized dependency (sub)trees and extract them from spoken and written Universal Dependencies (UD) treebanks in two syntactically distinct languages, English and Slovenian. For each corpus, we analyze the size, diversity, and distribution of syntactic inventories, their overlap across modalities, and the structures most characteristic of speech. Results show that, across both languages, spoken corpora contain fewer and less diverse syntactic structures than their written counterparts, with consistent cross-linguistic preferences for certain structural types across modalities. Strikingly, the overlap between spoken and written syntactic inventories is very limited: most structures attested in speech do not occur in writing, pointing to modality-specific preferences in syntactic organization that reflect the distinct demands of real-time interaction and elaborated writing. This contrast is further supported by a keyness analysis of the most frequent speech-specific structures, which highlights patterns associated with interactivity, context-grounding, and economy of expression. We argue that this scalable, language-independent framework offers a useful general method for systematically studying syntactic variation across corpora, laying the groundwork for more comprehensive data-driven theories of grammar in use.
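As a rough illustration of what the abstract describes, the sketch below builds delexicalized signatures for dependency subtrees and measures how much of a spoken inventory also occurs in a written one. It is a minimal sketch under stated assumptions, not the paper's implementation: the `(id, head, deprel)` token format, the function names `subtree_signature` and `inventory`, the choice to count only complete subtrees (a node with all its descendants), and the toy sentences are all hypothetical.

```python
from collections import Counter

def subtree_signature(tokens, root_id):
    """Delexicalized bracketed string for the subtree rooted at root_id.

    tokens: list of (id, head, deprel) triples for one sentence
    (a hypothetical, simplified stand-in for a parsed UD sentence).
    """
    rel = {tid: deprel for tid, _, deprel in tokens}
    children = {}
    for tid, head, _ in tokens:
        children.setdefault(head, []).append(tid)

    def build(node):
        # Dependents are kept in surface order; word forms are never consulted.
        kids = [build(child) for child in sorted(children.get(node, []))]
        return rel[node] if not kids else f"{rel[node]}({' '.join(kids)})"

    return build(root_id)

def inventory(sentences):
    """Count every complete-subtree signature in a list of sentences."""
    counts = Counter()
    for tokens in sentences:
        for tid, _, _ in tokens:
            counts[subtree_signature(tokens, tid)] += 1
    return counts

# Toy sentences invented for illustration, not taken from any UD treebank.
spoken = [[(1, 2, "nsubj"), (2, 0, "root")], [(1, 0, "root")]]
written = [[(1, 2, "nsubj"), (2, 0, "root"), (3, 2, "obj")],
           [(1, 2, "nsubj"), (2, 0, "root")]]

spoken_inv, written_inv = inventory(spoken), inventory(written)
shared = set(spoken_inv) & set(written_inv)
# Proportion of spoken structure types that are also attested in writing.
share_of_spoken = len(shared) / len(spoken_inv)
print(sorted(shared), round(share_of_spoken, 2))
```

Comparing type sets in this way gives only the inventory overlap; a keyness analysis would additionally compare the frequencies stored in the two counters.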
Keywords: register variation, dependency treebanks, syntactic structures, syntactic comparison, keyness analysis, corpus-driven linguistics
Publication status: Published
Publication version: Published version
Submitted for review: 02.06.2025
Article acceptance date: 27.01.2026
Publication date: 23.02.2026
Publisher: Mouton de Gruyter
Year of publication: 2026
Number of pages: pp. 2-37
Origin: Germany
PID: 20.500.12556/DiRROS-28231
UDC: 81'32
ISSN on article: 1613-7035
DOI: 10.1515/cllt-2025-0046
COBISS.SI-ID: 271469571
Copyright: © 2026 the author(s), published by De Gruyter.
Note: Title from title screen; description of the resource as of 12. 3. 2026;
Publication date in DiRROS: 12.03.2026
Views: 45
Downloads: 25
Metadata: XML DC-XML DC-RDF

Document is part of a journal

Title: Corpus linguistics and linguistic theory
Publisher: Mouton de Gruyter
ISSN: 1613-7035
COBISS.SI-ID: 520104729

Document is financed by a project

Funder: ARIS - Slovenian Research and Innovation Agency
Project number: Z6-4617-2022
Title: A treebank-driven approach to the study of spoken Slovenian (Na drevesnici temelječ pristop k raziskavam govorjene slovenščine)

Funder: ARIS - Slovenian Research and Innovation Agency
Project number: P6-0411-2019
Title: Language resources and technologies for the Slovenian language (Jezikovni viri in tehnologije za slovenski jezik)

Licenses

License: CC BY 4.0, Creative Commons Attribution 4.0 International
Link: http://creativecommons.org/licenses/by/4.0/deed.sl
Description: This is the standard Creative Commons license that gives users the most options for further use of the work, provided they credit the author.
Licensing start date: 26.02.2026
Applies to: VoR

Secondary language

Language: Slovenian
Keywords: registerska variacija, odvisnostni drevesniki, odvisnostno označeni korpusi, skladenjske strukture, primerjava skladnje, analiza ključnosti, korpusno gnano jezikoslovje

