Where shall I put this? Distance-to-V, length and verb disposition effects on PP placement in Belgian Dutch Annelore Willems, Gert De Sutter Faculty of Translation Studies University College Ghent – Ghent University {annelore.willems,gert.desutter}@hogent.be New Ways of Analyzing Variation 2012

(1) A multifactorial investigation of PP placement in Dutch subordinate clauses
(2) Refine common assumptions in syntactic and psycholinguistic theory: Dutch language users do not strive to maximally reduce the distance between dependent elements
Goals

Midfield: dat ik[SU] binnen een vijftiental ondernemingen van de Bel20 contacten heb[V-final].
that I[SU] within about fifteen enterprises of the Bel20 contacts have[V-final].
Postfield: dat ik[SU] contacten heb[V-final] binnen een vijftiental ondernemingen van de Bel20.
that I[SU] contacts have[V-final] within about fifteen enterprises of the Bel20.
The structural position before V-final (midfield) is the standard slot for PPs, with the slot after V-final being an expansion tank for an overloaded midfield slot (ANS 1997, Jansen 1979)
The distance between SU and V should be reduced as much as possible (Jansen 1979, Van Haeringen 1949)
Research object

Dutch Parallel Corpus (DPC): a 10-million-word, sentence-aligned parallel corpus of Dutch, English and French with basic linguistic annotations
5 different text genres, but for this presentation only journalistic texts
Data selection:
dependent clauses starting with the conjunction dat (= that)
PPs where variation between extraposition and non-extraposition is possible
Belgian Dutch
Method: Corpus and data

Logistic regression analysis and generalised linear mixed model
PP position (midfield vs. postfield) as binary response variable
Predictor variables:
Fixed effects
1. The length of the PP
2. The distance-to-V
3. The distance between V and the end of the clause
Random effects
1. Verbs
2. Prepositions
Method: Statistical evaluation

Results

Overview general distribution
Monofactorial analysis
1. Fixed effect 1: Length of PP
2. Fixed effect 2: Distance-to-V
3. Fixed effect 3: Distance between V and end
Multifactorial analysis
Overview Results

Midfield: […] dat de Belgen aan de Olympische Spelen deelnamen
[…] that the Belgians in the Olympics took part
Postfield: […] dat de Belgen deelnamen aan de Olympische Spelen
[…] that the Belgians took part in the Olympics
Distribution of PPs in midfield or postfield

Operationalised in terms of syllables
Example: […] dat de Belgen aan/de/O/lym/pi/sche/Spe/len deelnamen = 8 syllables
Also counted in terms of words
Fixed effect 1: The length of the PP

1 = 2 syllables 2 = 3 to 7 syllables 3 = 8 to 12 syllables 4 = 13 or more syllables AV = postfield MV = midfield
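The four length bins in this legend can be written as a small lookup. This is only a minimal sketch: the function name `pp_length_category` is an assumption for illustration, and only the bin boundaries come from the slide.

```python
def pp_length_category(syllables: int) -> int:
    """Map a PP's syllable count to the four length bins of the legend.

    Bin boundaries follow the slide: 2 / 3-7 / 8-12 / 13 or more.
    The function name is hypothetical, not taken from the study.
    """
    if syllables < 2:
        # The legend's lowest bin starts at 2 syllables.
        raise ValueError("the legend defines no bin below 2 syllables")
    if syllables == 2:
        return 1
    if syllables <= 7:
        return 2
    if syllables <= 12:
        return 3
    return 4


# "aan de Olympische Spelen" is counted as 8 syllables on the previous slide.
print(pp_length_category(8))  # prints 3
```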

Statistical evaluation: Fixed effect 1: The length of the PP
Length of the PP   O.R.    p-value
syllables          12.05   < 2e-16 ***
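To read an odds ratio such as 12.05, it can help to see how it shifts a probability. The sketch below uses the standard odds/probability conversion. The baseline probability of 0.20 is a made-up value, and the slide does not state whether the odds ratio applies per syllable or per length bin, so the example simply applies it once.

```python
def apply_odds_ratio(probability: float, odds_ratio: float) -> float:
    """Return the probability obtained after multiplying the odds by odds_ratio."""
    odds = probability / (1.0 - probability)
    new_odds = odds * odds_ratio
    return new_odds / (1.0 + new_odds)


baseline = 0.20  # hypothetical baseline probability, not a figure from the study
shifted = apply_odds_ratio(baseline, 12.05)
print(round(shifted, 3))  # prints 0.751
```

An odds ratio of 1 leaves the probability unchanged; values above 1 raise it and values below 1 lower it.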

Operationalised as the number of syllables between SU and V (among others Jansen 1978, Gibson 2000)
Example: dat ik binnen een vijftiental ondernemingen van de Bel20 con/tac/ten heb. = 3 syllables
Also counted in terms of words and phrases
Fixed effect 2: Distance-to-V

Fixed effect 2: Distance-to-V 1 = 0 syllables 2 = 1 to 6 syllables 3 = 7 or more syllables AV = postfield MV = midfield

Statistical evaluation: Fixed effect 2: Distance-to-V
Distance-to-V   O.R.   p-value
syllables       1.48   0.0001 ***

Operationalised in terms of syllables
Example: Dat mensen een sympathieke collega zullen verkiezen als/part/ner. = 3 syllables
Also counted in terms of words
Fixed effect 3: Distance between V and the end

1 = 0 syllables 2 = 1 or more syllables AV = postfield MV = midfield

Statistical evaluation: Fixed effect 3: Distance between V and the end
Distance-V-end   O.R.   p-value
syllables        0.43   < 2e-16 ***

Logistic regression analysis
No correlation
No interaction
Multicollinearity
C concordance = 0.75
Factor           O.R.    p-value
Length PP        12.48   < 2e-16 ***
Distance-to-V    1.62    0.0001 ***
Distance-V-end   0.48    5.47e-13 ***

Generalised mixed effect model
Verbs and prepositions as random variables
C concordance = 0.86
Random effect   Variance   Std.Dev.
Verbs           0.53       0.72
Prepositions    0.36       0.59
Factor           O.R.    p-value
Length PP        16.89   < 2e-16 ***
Distance-to-V    1.68    0.00 ***
Distance-V-end   0.43    6.15e-14 ***
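The C concordance reported for these models is the share of (postfield, midfield) observation pairs in which the model assigns the higher predicted probability to the observation that really is postfield, with ties counted as half. The sketch below computes it from scratch; the probabilities and outcomes are invented toy values, and coding postfield as 1 is an assumption for illustration.

```python
from itertools import product


def c_concordance(probabilities, outcomes):
    """Concordance index for a binary outcome (1 = event, 0 = non-event)."""
    events = [p for p, y in zip(probabilities, outcomes) if y == 1]
    non_events = [p for p, y in zip(probabilities, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    score = 0.0
    for p_event, p_non_event in product(events, non_events):
        if p_event > p_non_event:
            score += 1.0   # concordant pair
        elif p_event == p_non_event:
            score += 0.5   # tied pair counts as half
    return score / (len(events) * len(non_events))


# Toy data (hypothetical): predicted P(postfield) and observed position.
probs = [0.9, 0.7, 0.4, 0.6, 0.2]
observed = [1, 1, 1, 0, 0]
print(round(c_concordance(probs, observed), 3))  # prints 0.833
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation.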

Gries, Stefanowitsch 2004: Collostructional analysis An analysis of the verbs/prepositions that are distinctive for each construction may help us elucidate the existence and degree of fine semantic differences that might explain the different restrictions. Interpretation random effects

Collostructional analysis (Gries, Stefanowitsch 2004): PP disposition
MV                AV
in        7.33    van    5.29
binnen    2.7     voor   3.64
na        2.68    aan    1.5
tijdens   1.76    met    1.47

Collostructional analysis (Gries, Stefanowitsch 2004): Verb disposition
MV                   AV
Doen         3.02    Recht hebben            1.89
Komen        2.53    Rol spelen              1.89
Staan        1.95    Deel uitmaken           1.65
Beschikken   1.78    Bezig zijn              1.42
Halen        1.62    Tevreden zijn           1.42
lijden       1.52    Verantwoordelijk zijn   1.41
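Distinctive-collexeme scores of this kind are commonly computed as the negative base-10 logarithm of a one-tailed Fisher exact p-value on a 2x2 table (item vs. other items, construction A vs. construction B). The slides do not state which measure was used, so the sketch below is an assumption about the method, and all counts in it are invented.

```python
import math


def fisher_one_tailed(a: int, b: int, c: int, d: int) -> float:
    """P(count >= a) for the item in construction A, given fixed margins.

    a: item in construction A      b: item in construction B
    c: other items in A            d: other items in B
    """
    total = a + b + c + d
    item_total = a + b          # row margin for the item
    constr_total = a + c        # column margin for construction A
    denominator = math.comb(total, constr_total)
    p_value = 0.0
    for k in range(a, min(item_total, constr_total) + 1):
        p_value += (math.comb(item_total, k)
                    * math.comb(total - item_total, constr_total - k)
                    / denominator)
    return p_value


def distinctiveness(a: int, b: int, c: int, d: int) -> float:
    """Association strength as -log10 of the one-tailed Fisher p-value."""
    return -math.log10(fisher_one_tailed(a, b, c, d))


# Hypothetical counts: a preposition occurring 30 times in the midfield
# and 10 times in the postfield, against 170 and 290 other PPs.
print(round(distinctiveness(30, 10, 170, 290), 2))
```

Higher scores mean the item is more strongly attracted to construction A than expected from the margins alone.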

1. Postfield position is preferred more often than midfield position
2. PP placement is determined by 3 length factors and 2 random effects
Summary

Common assumption: dat ik contacten heb[V-final] binnen een vijftiental ondernemingen van de Bel20.
But the structural position before V-final (midfield) is not the standard slot for PPs.
The effect of Distance-V-end (OR 0.43, equivalent to 1/0.43 ≈ 2.33 in the opposite direction) is stronger than the effect of Distance-to-V (OR 1.68).
Implications for linguistic theory
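Odds ratios below 1 and above 1 are not directly comparable by size, because 0.43 and 1.68 point in opposite directions. One common way to compare their strength is the absolute log odds ratio, which treats an odds ratio and its reciprocal as equally strong. The sketch below uses the two values from the mixed model; the helper name `effect_magnitude` is an assumption for illustration.

```python
import math


def effect_magnitude(odds_ratio: float) -> float:
    """Absolute log odds ratio: equal for an OR and its reciprocal."""
    return abs(math.log(odds_ratio))


odds_ratios = {"Distance-V-end": 0.43, "Distance-to-V": 1.68}
for name, value in odds_ratios.items():
    # Also show the OR re-expressed as a value of at least 1.
    print(name, round(effect_magnitude(value), 3), round(max(value, 1 / value), 2))

strongest = max(odds_ratios, key=lambda name: effect_magnitude(odds_ratios[name]))
print(strongest)  # prints Distance-V-end
```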

Subject and verb in subordinate clauses are mostly not adjacent
The distance between subject and V is not reduced as much as possible in Dutch dependent clauses
Implications for linguistic theory

Thank you! For further information annelore.willems@hogent.be

The length of the PP [words] 1 = 2 or 3 words 2 = 4 to 6 words 3 = 7 to 11 words 4 = 12 or more words AV = extraposition MV = midfield

Logistic regression analysis: The length of the PP
Correlation = 0.81, p < 2.2e-16
Factor      O.R.    p-value
words       25.79   1.62e-11 ***
syllables   12.05   < 2e-16 ***

Distance-to-V [words] 1 = 0 words 2 = 1 or 2 words 3 = 3 or more words AV = extraposition MV = midfield

Distance-to-V [phrases] 1 = 0 phrases 2 = 1 phrase 3 = 2-4 phrases AV = extraposition MV = midfield

Logistic regression analysis: Distance-to-V
Correlation [words, syllables] = 0.84, p < 2.2e-16
Correlation [words, phrases] = 0.64, p < 2.2e-16
Correlation [syllables, phrases] = 0.58, p < 2.2e-16
Factor      O.R.   p-value
words       1.41   0.00 ***
syllables   1.48   0.00 ***
phrases     0.49

Verb disposition
Verbs as random variable
Mixed effect in logistic regression: C concordance = 0.85
Factor           O.R.    p-value
Length PP        15.61   < 2e-16 ***
Distance-to-V    1.6     0.00 ***
Distance-V-end   0.42    1.61e-15 ***

