Lexical Access in the Processing of Word Boundary Ambiguity

Language ambiguity results from, among other things, the vagueness of the syntactic structure of phrases and whole sentences. Numerous types of syntactic ambiguity are associated with the placement of the phrase boundary. A special case of the segmentation problem is the phenomenon of word boundary ambiguity; in spoken natural language, words coalesce, making it possible to interpret them in different ways (e.g., a name vs. an aim). The purpose of the study was to verify whether both meanings of words with boundary ambiguities are activated, or whether the semantic context primes only one of them. The study was carried out using the cross-modality semantic priming paradigm. Sentences containing phrases with word boundary ambiguities were presented auditorily to the participants. Immediately afterwards, they performed a visual lexical decision task. Results indicate that both meanings of the ambiguity are automatically activated, independently of the semantic context. When discussing the results, I refer to the autonomous and interactive models of parsing and point to other possible areas of research concerning word boundary ambiguities.

In The Cambridge Encyclopedia of Language (Crystal, 1988), an example of an ambiguous answer is given by the Oracle of Delphi when one of the generals asks whether or not he should set out on an expedition. The Oracle's answer could have been interpreted in two ways, either as Domine stes ("Master, stay") or Domi ne stes ("Do not stay at home"). In fluent speech, individual words, phrases, and sentences merge with each other, and there are no real breaks between them; isolating them is the task of the listener. As noted by Harley (2005, p. 237), in normal speech the strings "I scream" and "ice cream" sound indistinguishable. Thus, this ambiguity is linked to the problem of word boundaries; it appears in spoken language and is part of a wider phenomenon of ambiguity related to speech segmentation (Harley, 2005; Norris, McQueen, Cutler, & Butterfield, 1997).
The aim of the present study was to determine whether the ambiguity resulting from the blurring of a word boundary is immediately resolved by the sentential context in which it occurs, or whether both meanings are accessed simultaneously. Lexical access is a fundamental problem in the history of research on both lexical ambiguity (where a word has two or more possible meanings) and syntactic ambiguity (where sentences or phrases can be interpreted in two or more ways due to their grammatical structure and the syntactic function of the words they contain).
Lexical ambiguity is, on the one hand, a particular problem for the theory of the mental lexicon and the question of how meanings are stored. On the other hand, research results on this subject have influenced the creation of various models of lexical access, which explain the process of word activation during language use (recognizing and recalling words when we listen or read, and when we speak or write) (see Reeves, Hirsh-Pasek, & Golinkoff, 1998). It is often highlighted (e.g., Gleason & Ratner, 1998) that studies focused on managing syntactic ambiguity enrich knowledge about the parsing process, that is, computing the syntactic structure of the sentence. As noted by Harley (2005, p. 264), "most of the evidence that drives modern theories of parsing comes from studies of syntactic ambiguity." The primary processing questions related to ambiguity are: how do we choose the appropriate meaning, what role does context play in resolving ambiguity, and at what stage is context used? (Harley, 2005). Do we immediately select the appropriate sense (direct-access model), or do we access all of the senses and then choose between them (two-stage model)?
In the case of lexical ambiguity, the direct-access model assumes that one meaning is quickly chosen on the basis of context and meaning frequency. According to the two-stage model, all meanings of ambiguous words are initially activated, and the inappropriate ones are discarded at a later stage (i.e., the context is very quickly used to select the appropriate sense) (Davis, Marslen-Wilson, & Gaskell, 2002; Martin, Vu, Kellas, & Metcalf, 1999; Simpson, 1984).
The above-mentioned answer provided by the Oracle of Delphi is a case of syntactic ambiguity that results from the segmentation problem. The segmentation problem is related to the ambiguity of phrase boundaries and gives rise to some types of syntactic ambiguity (see Allbritton, McKoon, & Ratcliff, 1996; Harley, 2005; Lyons, 1977). A particular case of ambiguity linked to speech segmentation is word boundary ambiguity (WBA). As previously mentioned, when we speak, words coalesce, which generates different meanings (e.g., "an ice-bucket" vs. "a nice bucket"). This can be the basis for puns and jokes, such as the joke in which a teacher asks a student, "What do you know about French syntax?" and the student answers, "Gosh, I didn't know they had to pay for their fun." Understanding this joke requires discovering the heterographic homophones ("syntax" and "sin tax" sound the same) and the word boundary ambiguity (see Shultz & Scott, 1974). In everyday communication, the semantic and situational context, as well as prosodic cues (Marslen-Wilson, Tyler, Warren, Grenier, & Lee, 1992), usually help us understand the meaning of statements correctly (for a review of word segmentation, see Davis et al., 2002).
The phenomenon of word boundary ambiguity can be considered in the context of the problem of sentence processing, that is, the question of how a language user can quickly determine the structure of a sentence and understand its meaning as a whole (despite the high rate of fluent speech and the merging of individual words, as there are often no gaps between words, phrases, and sentences). One of the basic problems in research on language and communication concerns the relationship between the syntactic and semantic levels of sentence processing. There are two main positions: autonomous and interactive models of parsing. The autonomous model rests on the principle of syntactic autonomy: syntactic processing precedes and happens independently of the semantic analysis of sentences (the concept of modularity). The interactive model, in contrast, assumes that semantic processing happens at the same time as syntactic processing; with each word, the recipient processes the heard material both syntactically and semantically, to the extent possible. In the context of syntactic ambiguity, two models have dominated research on parsing: the garden path model (an autonomous, two-stage model) (see Frazier, 1987; Rayner, Carlson, & Frazier, 1983) and the constraint-based model (an interactive model) (see Harley, 2005, p. 283; MacDonald, Pearlmutter, & Seidenberg, 1994; Trueswell & Tanenhaus, 1994).
There is little empirical research on word boundary ambiguity processing. Davis et al. (2002) were interested in the issue of segmentation and the ambiguity created by embedded words. However, their research did not deal with access to meanings as related to WBA, but rather concerned recognition of words that are embedded in the onset of other words. Yan and Kliegl (2016) tested whether eye movements (saccade target selection) were influenced by the ambiguity of word boundaries during the reading of Chinese sentences. Word boundary ambiguity occurs commonly in Chinese due to the absence of explicit word spacing; this is similar to other non-segmented languages, such as Japanese (Kudo, Yamamoto, & Matsumoto, 2004). However, these studies did not deal with the issue of access to meanings related to word boundary ambiguity.
The main subject of this paper is the accessibility of meanings in this kind of ambiguity. As mentioned above, the problem of accessibility of meanings (of both words and sentences) is one of the most important issues in research concerning lexical and syntactic ambiguity. It is interesting to see whether the two-stage model (or multiple-access model) can be applied to the processing of ambiguity related to word boundaries. Of course, WBA is a form of syntactic ambiguity; in the case of WBA, ambiguity is not due to a single word form having two meanings, but instead stems from the fact that the boundary between the words is unclear, and the syntactic function of either word may thus also be unclear. The main question is whether the parsing of sentences with word boundary ambiguity proceeds according to an autonomous or an interactive model of parsing.
It is worth emphasizing that syntactic and lexical ambiguity can be treated as the same phenomenon, with similar processing mechanisms (see MacDonald et al., 1994). As noted by MacDonald and colleagues, "both lexical and syntactic ambiguity are governed by the same types of knowledge representations and processing mechanism" (p. 682), and "the syntactic ambiguities are caused by ambiguities associated with lexical items" (p. 676).

My Research Approach
The above assumption became the inspiration to design a study in which I used the cross-modality semantic priming paradigm. In a classic study on the processing of lexical ambiguity, Swinney (1979) used the cross-modal priming technique to show that hearing ambiguous words leads to activation of all of their meanings; for example, the results showed that hearing the word "bug" provides instant access to two possible meanings ("insect" and "spy gadget"), even if the word was presented in a clear context. The context is then used to select the right meaning, and the wrong meanings are quickly suppressed. This research supported the multiple-access model, which in its extreme form assumes that all meanings of ambiguous words are activated in parallel and to the same degree, and that this activation is independent of both the frequency of the meanings and the context (Onifer & Swinney, 1981; Swinney, 1979).
I decided to use the cross-modal priming technique to verify whether a phenomenon similar to that observed by Swinney (1979) would occur in the case of word boundary ambiguities. The semantic priming paradigm is based on the semantic links between particular words (Field, 2004). Semantic priming means that if words with related meanings occur sequentially, then the processing of the first word will make it easier to recognize the next one. Lexical decision tasks are often used in research on ambiguity within the semantic priming paradigm; participants have to decide whether a string of letters visible on the screen is a word or a nonword (a group of letters that is not a word in the given language). Reaction times are measured for each of the letter strings. It is assumed that faster recognition of a given word means its mental representation is more easily accessible (see Reeves et al., 1998).
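The logic of this paradigm can be sketched in a few lines of code. This is only an illustrative toy model, not the software used in the study: the lexicon, the prime-target associations, and the millisecond values are all assumptions chosen for demonstration.

```python
# Toy sketch of the cross-modal priming logic (illustrative assumptions only).
# A visual target is classified as word/nonword, and the priming prediction is
# that reaction times are shorter when the target is semantically related to
# the heard prime phrase than when it is unrelated.
LEXICON = {"ANIMAL", "CASSETTE", "HOLIDAY"}            # toy lexicon
RELATED = {"great ape": {"ANIMAL"}, "grey tape": {"CASSETTE"}}

def lexical_decision(target: str) -> str:
    # The participant's task: is the letter string a real word?
    return "word" if target in LEXICON else "nonword"

def predicted_rt(prime: str, target: str, base_ms: float = 600.0,
                 priming_ms: float = 40.0) -> float:
    # Semantic priming: related targets are recognized faster than controls.
    related = target in RELATED.get(prime, set())
    return base_ms - (priming_ms if related else 0.0)
```

On this sketch, hearing "great ape" should speed up the decision for ANIMAL relative to the control word HOLIDAY; the empirical question of the study is whether CASSETTE (the contextually inappropriate meaning) is speeded as well.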
My research approach consisted of introducing a strong biasing context related to one meaning of the word boundary ambiguity presented in Polish (an example in English: In this zoo there is the great ape or In this recorder there is the grey tape). The second meaning was embedded in the sentence. The research problem was whether both meanings of ambiguity that arise at the boundaries of words would be activated (especially the contextually inappropriate meaning of the ambiguity). The problem I sought to examine in my research lies in the context of the dispute between the autonomous and interactive models of parsing (as well as the two-stage model and the direct-access model). My assumption was that the semantic priming effect would occur, that is, that the target words related to the meaning of the phrase that is indicated by the sentence context would be recognized more quickly than the neutral words. This assumption would mean that after hearing the sentence In this zoo there is the great ape, the letter string ANIMAL will be recognized more quickly as a word compared to a control word, that is, a word unrelated to the sentence contexts, such as HOLIDAY. My main question was whether the second meaning, which was not indicated by the sentence context, would be activated (in this case, the grey tape).
In the presented material, the propositional context appears before the ambiguity and indicates the likely meaning of the final phrase, which contains the WBA. The interactive model assumes that previous semantic operations facilitate later parsing decisions. On the basis of this model, it can therefore be anticipated that only the meaning indicated by the context will be activated; that is, the target words related to the meaning of the phrase indicated by the sentence context will be recognized more quickly than the target words related to the second meaning of the ambiguous phrase. According to the autonomous model, on the other hand, the limiting influence of the context does not operate at the first stage of processing, and both meanings can be activated and influence the lexical decisions. The prediction is thus that both meanings of the ambiguous phrase will be activated, which means faster recognition of the words related to both meanings (e.g., ANIMAL and CASSETTE) than of the control words (e.g., HOLIDAY).

Participants
Overall, 180 students (124 women and 56 men) participated in the study; their ages ranged from 18 to 28 years (M = 21.0, SD = 1.8). The participants were volunteers who did not receive compensation for participation, financial or otherwise. As part of the consent procedure prior to participation, they were reminded that they were free to withdraw at any time. The procedure, including obtaining written consent, was approved by the Ethical Committee of the Institute of Psychology, Jagiellonian University.

Materials, Procedure and Design
The study was carried out using the cross-modality semantic priming paradigm. The participants had two tasks. First, they listened via headphones to Polish sentences containing word boundary ambiguities, for example, zasłona (curtain) vs. za słona (too salty), for which the sentential context suggested one of the meanings (i.e., for "too salty," the sentence was "The soup was too salty [za słona]"; for "curtain," "On the window there was a curtain [zasłona]"). The ambiguous word was always at the end of the sentence.
Subsequently, the participants were asked to look at a screen that displayed a letter string (e.g., PLATE) immediately after they heard the sentence (with an interval of 100 ms). The participants' task was to decide, by pressing the right or left Control key, whether the presented letter string was an actual word (a lexical decision task). The target letter string was either a word thematically related to one of the two meanings of the previously presented ambiguous phrase (e.g., PEPPER, related to "too salty"; DRAPERY, related to "curtain"), a control word (e.g., MONEY), or a nonword (e.g., PALTE). It is crucial to note that nonwords were used only as fillers, to prevent participants from realizing that all stimuli were words and thus to keep the task a genuine lexical decision task. Nonwords were not thematically related to other stimuli, and reaction times (RTs) for nonwords were not part of the analyses.
Ten ambiguous phrases were chosen. For each phrase, two sentences were created, each pointing to one of the two ambiguous meanings (e.g., for English a great ape vs. a grey tape, the sentences could be In this zoo there is a great ape vs. In this recorder there is a grey tape). For each ambiguous sentence, three target words (presented visually) were prepared for the lexical decision task: 1) a word related to the meaning suggested by the sentence context (e.g., MONKEY), 2) a word related to the second meaning of the ambiguous phrase (e.g., RIBBON), and 3) a control word (unrelated to either meaning). The target words were chosen so that they did not differ in length (two or three syllables) or usage frequency (word usage frequency was assessed using the Frequency Dictionary of Polish; the influence of frequency and length on lexical decisions must be kept in mind; see Whaley, 1978). All of the target words were selected on the basis of a pilot study to ensure the same baseline speed of lexical decisions. Because the time taken for the lexical decisions was measured, the issue of usage frequency concerned the target words (the lexical decision stimuli) rather than the ambiguous phrases (for which no RT was measured). Therefore, for target words, I selected words which met the following criteria: a) they were nouns in the nominative case; b) they were lexically associated with either the first or the second meaning of an ambiguous phrase, or with a neutral meaning; c) they were of similar length; d) they were of similar usage frequency based on data from a frequency dictionary; e) they showed similar lexical decision times in pilot studies.
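The selection criteria (a) through (e) amount to a filter over candidate words. The sketch below makes that filter explicit; the field names and the numeric thresholds are illustrative assumptions, since the paper does not report the exact frequency or RT cut-offs used.

```python
# Hedged sketch of the target-word selection criteria (a)-(e) above.
# Thresholds and field names are assumptions for illustration only.
from dataclasses import dataclass

@dataclass
class Candidate:
    word: str
    is_nominative_noun: bool   # criterion (a): noun in the nominative case
    association: str           # criterion (b): "meaning1", "meaning2", "neutral"
    syllables: int             # criterion (c): length control (2-3 syllables)
    frequency: float           # criterion (d): frequency-dictionary value
    pilot_rt_ms: float         # criterion (e): baseline lexical decision RT

def eligible(c: Candidate,
             freq_range=(10.0, 100.0),      # assumed acceptable frequency band
             rt_range=(500.0, 700.0)) -> bool:  # assumed acceptable pilot RTs
    return (c.is_nominative_noun
            and c.association in {"meaning1", "meaning2", "neutral"}
            and c.syllables in (2, 3)
            and freq_range[0] <= c.frequency <= freq_range[1]
            and rt_range[0] <= c.pilot_rt_ms <= rt_range[1])
```

The point of filtering on frequency, length, and pilot RT together is that any RT difference observed in the main experiment can then be attributed to priming rather than to baseline properties of the target words.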
The difficulty of Polish syntax partly concerns sentences with word boundary ambiguity. The Polish language is characterised by an elaborate and complicated inflection, which was utilized in this study: the ambiguous sentences contained longer or shorter words that formed identical auditory stimuli. The shorter word was the noun in a prepositional phrase (e.g., "na wóz / on the wagon," "za pałki / for the batons"). Combining the preposition and the noun produced a longer word, which was itself a noun ("nawóz / fertiliser," "zapałki / matches"). In this manner, word boundary ambiguities were generated; these ambiguous phrases sound identical in spoken language. For the experimental stimuli, I selected nouns that had exactly the same form in the prepositional phrase as in the nominative case (e.g., "On mocno trzyma zapałki" / "He has a tight grip on the matches" vs. "Policjanci chwycili za pałki" / "The police officers reached for their batons"). Only in two cases did the shorter word (in the prepositional phrase) belong to a different part of speech than the longer word; in those cases, the longer word was a noun ("zasłona / curtain"), while the shorter word was an adjective in the prepositional phrase ("za słona / too salty").
The fact that longer and shorter words were used is important only when considering their written form. As stated above, the auditory stimuli were identical, even though their meaning changed between the noun + preposition (two separate words) and the single "longer" word. Differentiating between these stimuli was possible only on the basis of context. The above examples merely demonstrate how the words would look if written and how the differences in meaning arose. Without context, the spoken phrase "na wóz" would be indistinguishable from "nawóz," just as the English word "hummingbird" (Trochilidae) is indistinguishable from the phrase "humming Bird" (Mr. Bird hums).
Full counterbalancing was applied so that one of the three target words (Context 1, Context 2, Unrelated) appeared after each ambiguous sentence. Thus, for the ten sentences indicating one meaning of a word, three experimental series were created (for the three counterbalancing treatments). For the second group of sentences, where the context implied the second ambiguous meaning, three experimental series were also prepared. The six variations derive from a factorial combination of two variables: the meaning indicated by the auditory stimulus (two levels) and the type of target visual stimulus (related to Meaning 1, related to Meaning 2, or unrelated; three levels). Hence, six experimental sets were created, with 30 participants in each set. For each participant, the experimental session consisted of 30 auditory sentences and 30 visual stimuli, each of which had to be evaluated as a word or a nonword, with participants responding to a single lexical decision task after each sentence. No auditory or visual stimulus was repeated for the same participant. Among the 30 presented target stimuli, only 10 were words related to either the first or the second meaning of the auditory phrase (or unrelated to both); these were the only stimuli used in the analyses. The remaining 20 auditory phrases and visual stimuli were words or nonwords combined with unambiguous phrases (fillers), included to make participants truly consider nonwords a possibility and thus to create a genuine lexical decision task. As an example, a participant received 10 ambiguous phrases (five with the context indicating Meaning 1 and five indicating Meaning 2), each followed by a visual stimulus (three meaning-consistent, three meaning-inconsistent, four meaning-unrelated), and 20 fillers, each followed by a word or a nonword (10 words, 10 nonwords), for a total of 30 sentences and lexical decision tasks. These were presented in random order, such that ambiguous sentences were mixed with fillers.
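The 2 x 3 counterbalancing scheme described above can be sketched as follows. This is a schematic reconstruction under stated assumptions (a Latin-square-style rotation of target categories across items), not the study's actual assignment code.

```python
# Sketch of the 2 x 3 counterbalancing: six experimental sets arise from
# crossing the context version of the auditory sentence (two levels) with a
# rotation of the visual target category (three levels). Assumed scheme:
# a Latin-square-style rotation so that, across rotations, each item is
# paired with every target category exactly once.
from itertools import product

CONTEXTS = ("meaning1", "meaning2")             # which meaning the sentence biases
TARGETS = ("related1", "related2", "control")   # visual target category

def experimental_sets():
    # Factorial combination: 2 context versions x 3 target rotations = 6 sets.
    return list(product(CONTEXTS, range(len(TARGETS))))

def target_for(item_index: int, rotation: int) -> str:
    # Rotate the target category per item, so no set pairs an item with the
    # same category as another rotation does.
    return TARGETS[(item_index + rotation) % len(TARGETS)]
```

With 30 participants assigned to each of the six sets, this yields the reported total of 180 participants, and every ambiguous item is tested in all six context-by-target combinations across the sample.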

Results
The purpose of the analysis was to measure the differences between mean reaction times to target words in each category in the ambiguous condition (it should be noted that there were no significant differences between reaction times for unrelated control words and for words appearing after non-ambiguous sentences). The mean reaction times for target words are presented in Table 1. A repeated measures ANOVA showed a main effect of the type of presented target word, F(2, 330) = 5.03, p = .007, η² = .03. The reaction times for target words with a meaning related to the sentence context were shorter than the reaction times to neutral words, F(1, 165) = 9.69, p = .002. In the next step, a comparison was carried out between the reaction times for lexical decisions concerning words related to the other meaning of the phrase (the one not supported by the context) and the reaction times for lexical decisions concerning neutral words. Lexical decisions were significantly faster for words related to the second meaning of the key phrase than in the control condition, F(1, 165) = 5.08, p = .025. The results also showed no significant difference between reaction times to words related to the first meaning of the presented phrase (the meaning indicated by the sentence context) and reaction times to words related to the second meaning, F(1, 165) = 0.28, p = .59. This suggests that both meanings of the phrase played a similar part in priming the lexical decision.

Discussion
When we speak, our articulation is smooth and continuous; unlike in writing, spoken words do not usually contain cues that define the beginnings and ends of words or separate the different phonetic segments. This lack of clear-cut boundaries between neighbouring forms can generate ambiguous messages. The topic of the present research was the processing of ambiguity that occurs at the word boundary as it relates to the problem of phonetic segmentation. The purpose of this study was to examine whether simultaneous access to both meanings of this kind of ambiguity occurs when the sentential context points to one of those meanings. Several decades of research on resolving lexical and syntactic ambiguity have yielded a wealth of results that are used to create lexical access models and models of syntactic structure analysis. However, it would seem that the issue of lexical access in the course of processing sentences with word boundary ambiguity has yet to be taken up in experimental research.
In the present study, a cross-modality semantic priming paradigm was employed, which has previously been used to examine the processing of lexical ambiguity. The results indicated that lexical decisions (recognizing visual words) were facilitated for words associated with each meaning of the presented ambiguity, both the meaning set by the sentential context and the embedded meaning. This suggests that both meanings of this kind of syntactic ambiguity are activated immediately and automatically, regardless of the semantic context. These results are in keeping with the multiple-access model (which concerns lexical ambiguity) and the autonomous model of parsing. According to the autonomous view, we automatically access multiple meanings of a statement; the context is then used to select the appropriate meaning of the ambiguity. According to the interactive models, semantic information can influence the syntactic processor at an early stage: the context that precedes a word has a significant influence on the speed and ease with which that word is recognized among other words, which would mean stronger activation of the meaning indicated by the context. Therefore, it seems that the obtained results support the autonomous model. These results are also consistent with the cohort model of spoken word recognition proposed by Marslen-Wilson and colleagues (Marslen-Wilson, 1989, 1990). The central idea of the model is that as we hear speech, we set up a cohort of possible items that could be represented by the word. As noted by Gaskell and Marslen-Wilson (1999), transient ambiguity (which is a critical property of the perception of spoken words) is captured by allowing the parallel activation of multiple lexical representations. It seems that the claim made by, among others, MacDonald and colleagues (1994) that both lexical and syntactic ambiguity are governed by the same types of knowledge representations and processing mechanisms (p. 682) is supported in the case of word boundary ambiguities.
Empirical studies on the processing of word boundary ambiguities may address issues similar to those investigated for lexical ambiguity. As such, numerous questions arise. Under what conditions is the influence of the semantic context observable? What is the role of the temporal distance between the ambiguous phrase and the target word, and does this factor modify the influence of the propositional context on the activation of meanings? Does access to one meaning of the ambiguous phrase depend on its usage frequency? Another possible area of interest is the role of prosody in resolving word boundary ambiguities. Other types of syntactic ambiguity are more easily resolved in speech than in writing, thanks to prosodic signals (see Marslen-Wilson et al., 1992). The ambiguity of word boundaries appears when listening, not when reading; thus, the role of prosodic cues is particularly striking.
As once noted by Simpson (1984), the incompatibility of results and the variety of models available in lexical ambiguity studies is due to the use of various research paradigms, the preference of different methods, and the use of various experimental tasks. It would be interesting to see whether the effects obtained in this study would be replicated when using methods other than lexical decision tasks.

Funding
The author has no funding to report.