to be relevant in its context, with the schema specifying the kind of information that is to be considered relevant for a particular comprehension task.

Process Model
Forming Coherent Text Bases

The model takes as its input a list of propositions that represent the meaning of a text. The derivation of this proposition list from the text is not part of the model. It has been described and justified in Kintsch (1974) and further codified by Turner and Greene (in press). With some practice, persons using this system arrive at essentially identical propositional representations, in our experience. Thus, the system is simple to use; and while it is undoubtedly too simple to capture nuances of meaning that are crucial for some linguistic and logical analyses, its robustness is well suited for a process model. Most important, this notation can be translated into other systems quite readily. For instance, one could translate the present text bases into the graphical notation of Norman and Rumelhart (1975), though the result would be quite cumbersome, or into a somewhat more sophisticated notation modeled after the predicate calculus, which employs atomic propositions, variables, and constants (as used in van Dijk, 1973). Indeed, the latter notation may eventually prove to be more suitable. The important point to note is simply that although some adequate notational system for the representation of meaning is required, the details of that system often are not crucial for our model.

Propositional Notation

The propositional notation employed below will be outlined here only in the briefest possible way. The idea is to represent the meaning of a text by means of a structured list of propositions. Propositions are composed of concepts (the names of which we shall write in capital letters, so that they will not be confused with words). The composition rule states that each proposition must include first a predicate, or relational concept, and one or more arguments.
The latter may be concepts or other embedded propositions. The arguments of a proposition fulfill different semantic functions, such as agent, object, and goal. Predicates may be realized in the surface structure as verbs, adjectives, adverbs, and sentence connectives. Each predicate constrains the nature of the argument that it may take. These constraints are imposed both by linguistic rules and general world knowledge and are assumed to be a part of a person’s knowledge or semantic memory.
Propositions are ordered in the text base according to the way in which they are expressed in the text itself. Specifically, their order is determined by the order of the words in the text that correspond to the propositional predicates.
Text bases must be coherent. One of the linguistic criteria for the semantic coherence of a text base is referential coherence. In terms of the present notational system, referential coherence corresponds to argument overlap among propositions. Specifically, (P, A, B) is referentially coherent with (R, B, C) because the two propositions share the argument B, or with (Q, D, (P, A, B)) because one proposition is embedded here as an argument into another. Referential coherence is probably the most important single criterion for the coherence of text bases. It is neither a necessary nor a sufficient criterion linguistically. However, the fact that in many texts other factors tend to be correlated with it makes it a useful indicator of coherence that can be checked easily, quickly, and reliably. We therefore propose the following hypothesis about text processing: The first step in forming a coherent text base consists in checking out its referential coherence; if a text base is found to be referentially coherent, that is, if there is some argument overlap among all of its propositions, it is accepted for further processing; if gaps are found, inference processes are initiated to close them; specifically, one or more propositions will be added to the text base that make it coherent. Note that in both cases, the argument-referent repetition rule also holds for arguments consisting of a proposition, thereby establishing relations not only between individuals but also between facts denoted by propositions.

Processing Cycles

The second major assumption is that this checking of the text base for referential coherence and the addition of inferences wherever necessary cannot be performed on the text base as a whole because of the capacity limitations of working memory. We assume that a text is processed sequentially from left to right (or, for auditory inputs, in temporal order) in chunks of several propositions at a time. Since the proposition lists of a text base are ordered according to their appearance in the text, this means that the first $n_1$ propositions are processed together in one cycle, then the next $n_2$ propositions, and so on. It is unreasonable to assume that all $n_i$s are equal. Instead, for a given text and a given comprehender, the maximum $n_i$ will be specified. Within this limitation, the precise number of propositions included in a processing chunk depends on the surface characteristics of the text. There is ample evidence (e.g., Aaronson & Scarborough, 1977; Jarvella, 1971) that sentence and phrase boundaries determine the chunking of a text in short-term memory. The maximum value of $n_i$, on the other hand, is a model parameter, depending on text as well as reader characteristics.$^{1}$

If text bases are processed in cycles, it becomes necessary to make some provision in the model for connecting each chunk to the ones already processed. The following assumptions are made. Part of working memory is a short-term memory buffer of limited size $s$. When a chunk of $n_i$ propositions is processed, $s$ of them are selected and stored in the buffer.$^{2}$ Only those $s$ propositions retained in the buffer are available for connecting the new incoming chunk with the already processed material. If a connection is found between any of the new propositions and those retained in the buffer, that is, if there exists some argument overlap between the input set and the contents of the short-term memory buffer, the input is accepted as coherent with the previous text. If not, a resource-consuming search of all previously processed propositions is made. In auditory comprehension, this search ranges over all text propositions stored in long-term memory. (The storage assumptions are detailed below.) In reading comprehension, the search includes all previous propositions because even those propositions not available from long-term memory can be located by re-reading the text itself. Clearly, we are assuming here a reader who processes all available information; deviant cases will be discussed later.
If this search process is successful, that is, if a proposition is found that shares an argument with at least one proposition in the input set, the set is accepted and processing continues. If not, an inference process is initiated, which adds to the text base one or more propositions that connect the input set to the already processed propositions. Again, we are assuming here a comprehender who fully processes the text. Inferences, like long-term memory searches, are assumed to make relatively heavy demands on the comprehender’s resources and, hence, contribute significantly to the difficulty of comprehension.
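The cycle just described (buffer check first, then a search of earlier propositions, then inference) can be sketched as follows. This is an illustrative sketch under stated assumptions: propositions are encoded as tuples, the search covers all previously processed propositions (the reading-comprehension case), the inference step is only flagged rather than actually generating bridging propositions, and the buffer is filled by a placeholder rule (keep the most recent $s$ propositions) that is not a strategy endorsed by the text.

```python
def connected(p, q):
    """Argument overlap, or one proposition embedded in the other."""
    return bool(set(p[1:]) & set(q[1:])) or p in q[1:] or q in p[1:]


def process_cycle(chunk, buffer, processed):
    """Classify how an input chunk connects to the text processed so far."""
    if not processed:
        return "start"        # first cycle: nothing to connect to yet
    if any(connected(p, b) for p in chunk for b in buffer):
        return "buffer"       # cheap, automatic match against the buffer
    if any(connected(p, old) for p in chunk for old in processed):
        return "search"       # resource-consuming search of earlier propositions
    return "inference"        # a bridging proposition would have to be inferred


def simulate(chunks, s):
    """Run all cycles; return the outcome of each one."""
    buffer, processed, outcomes = [], [], []
    for chunk in chunks:
        outcomes.append(process_cycle(chunk, buffer, processed))
        processed.extend(chunk)
        # Placeholder selection rule (assumption): keep the s most recent propositions.
        buffer = (buffer + list(chunk))[-s:] if s > 0 else []
    return outcomes


chunks = [[("P", "A", "B")], [("R", "B", "C")], [("S", "X", "Y")], [("T", "A", "Z")]]
print(simulate(chunks, s=1))  # ['start', 'buffer', 'inference', 'search']
```

Counting the "search" and "inference" outcomes gives the number of resource-consuming operations for a given text and parameter setting.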

Coherence Graphs

In this manner, the model proceeds through the whole text, constructing a network of coherent propositions. It is often useful to represent this network as a graph, the nodes of which are propositions and the connecting lines indicating shared referents. The graph can be arranged in levels by selecting that proposition for the top level that results in the simplest graph structure (in terms of some suitable measure of graph complexity). The second level in the graph is then formed by all the propositions connected to the top level; propositions that are connected to any proposition at the second level, but not to the first-level proposition, then form the third level. Lower levels may be constructed similarly. For convenience, in drawing such graphs, only connections between levels, not within levels, are indicated. In case of multiple between-level connections, we arbitrarily indicate only one (the first one established in the processing order). Thus, each set of propositions, depending on its pattern of interconnections, can be assigned a unique graph structure.
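The level assignment can be sketched as a breadth-first layering. In this minimal sketch the top proposition is supplied by the caller; choosing the top proposition that yields the simplest graph is not implemented, and the tuple encoding and function names are assumptions for illustration.

```python
def connected(p, q):
    """Argument overlap, or one proposition embedded in the other."""
    return bool(set(p[1:]) & set(q[1:])) or p in q[1:] or q in p[1:]


def coherence_levels(props, top):
    """Layer propositions by distance from `top` (the top level comes first)."""
    levels = [[top]]
    placed = {top}
    while True:
        # Next level: unplaced propositions connected to the current lowest level.
        nxt = [q for q in props
               if q not in placed and any(connected(p, q) for p in levels[-1])]
        if not nxt:
            break
        levels.append(nxt)
        placed.update(nxt)
    return levels


props = [("P", "A", "B"), ("R", "B", "C"), ("S", "C", "D"), ("T", "A", "E")]
for depth, level in enumerate(coherence_levels(props, props[0]), start=1):
    print(depth, level)
```

Here (R, B, C) and (T, A, E) land on the second level because each shares an argument with the top proposition, while (S, C, D) is connected only through (R, B, C) and therefore forms the third level. Propositions that are never reached would indicate an incoherent (disconnected) text base.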
The topmost propositions in this kind of coherence graph may represent presuppositions of their subordinate propositions due to the fact that they introduce relevant discourse referents, where presuppositions are important for the contextual interpretation and hence for discourse coherence, and where the number of subordinates may point to a macrostructural role of such presupposed propositions. Note that the procedure is strictly formal: There is no claim that topmost propositions are always most important or relevant in a more intuitive sense. That will be taken care of with the macro-operations described below.
A coherent text base is therefore a connected graph. Of course, if the long-term memory searches and inference processes required by the model are not executed, the resulting text base will be incoherent, that is, the graph will consist of several separate clusters. In the experimental situations we are concerned with, where the subjects read the text rather carefully, it is usually reasonable to assume that coherent text bases are established.
To review, so far we have assumed a cyclical process that checks on the argument overlap in the proposition list. This process is automatic, that is, it has low resource requirements. In each cycle, certain propositions are retained in a short-term buffer to be connected with the input set of the next cycle. If no connections are found, resource-consuming search and inference operations are required.

Memory Storage

In each processing cycle, there are $n_i$ propositions involved, plus the $s$ propositions held over in the short-term buffer. It is assumed that in each cycle, the propositions currently being processed may be stored in long-term memory and later reproduced in a recall or summarization task, each with probability $p$. This probability $p$ is called the reproduction probability because it combines both storage and retrieval information. Thus, for the same comprehension conditions, the value of $p$ may vary depending on whether the subject’s task is recall or summarization or on the length of the retention interval. We are merely saying that for some comprehension-production combination, a proposition is reproduced with probability $p$ for each time it has participated in a processing cycle. Since at each cycle a subset of $s$ propositions is selected and held over to the next processing cycle, some propositions will participate in more than one processing cycle and, hence, will have higher reproduction probabilities. Specifically, if a proposition is selected $k-1$ times for inclusion in the short-term memory buffer, it has $k$ chances of being stored in long-term memory, and hence, its reproduction probability will be $1-(1-p)^k$.
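As a worked illustration of the formula: a proposition fails to be reproduced only if it fails on every one of its $k$ chances, which (treating the chances as independent, as the formula implies) has probability $(1-p)^k$. The value $p = 0.3$ below is an arbitrary number chosen for illustration, not an estimate from any data.

```python
def reproduction_probability(p, k):
    """Probability of reproducing a proposition that took part in k processing cycles."""
    return 1 - (1 - p) ** k


# Arbitrary illustrative value of p; k = 1 means the proposition was never held over.
for k in (1, 2, 3):
    print(k, round(reproduction_probability(0.3, k), 3))  # 0.3, 0.51, 0.657
```

With these illustrative numbers, a proposition held over twice is more than twice as likely to be reproduced as one processed in a single cycle.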
Which propositions of a text base will be thus favored by multiple processing depends crucially on the nature of the process that selects the propositions to be held over from one processing cycle to the next. Various strategies may be employed. For instance, the $s$ buffer propositions may be chosen at random. Clearly, such a strategy would be less than optimal and result in unnecessary long-term memory searches and, hence, in poor reading performance. It is not possible at this time to specify a unique optimal strategy. However, there are two considerations that probably characterize good strategies. First, a good strategy should select propositions for the buffer that are important, in the sense that they play an important role in the graph already constructed. A proposition that is already connected to many other propositions is more likely to be relevant to the next input cycle than a proposition that has played a less pivotal role so far. Thus, propositions at the top levels of the graph to which many others are connected should be selected preferentially. Another reasonable consideration involves recency. If one must choose between two equally important propositions in the sense outlined above, the more recent one might be expected to be the one most relevant to the next input cycle. Unfortunately, these considerations do not specify a unique selection strategy but a whole family of such strategies. In the example discussed in the next section, a particular example of such a strategy will be explored in detail, not because we have any reason to believe that it is better than some alternatives, but to show that each strategy leads to empirically testable consequences. Hence, which strategy is predominantly used in a given population of subjects becomes an empirically decidable issue because each strategy specifies a somewhat different pattern of reproduction probabilities over the propositions of a text base. Experimental results may be such that only one particular selection strategy can account for them, or perhaps we shall not be able to discriminate among a class of reasonable selection strategies on the basis of empirical frequency distributions. In either case, we shall have the information the model requires.
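One member of this family of strategies can be sketched as a ranking by importance with recency as the tie-breaker. This is a hypothetical illustration only: the number of connections is used here as a stand-in for importance, and the function and parameter names are invented for the sketch. It is not the particular strategy explored in the example of the next section.

```python
def select_for_buffer(candidates, connections, s):
    """Pick s propositions: most connected first, more recent first among ties.

    candidates:  propositions in processing order (oldest first)
    connections: dict mapping a proposition to its number of connections so far
    """
    ranked = sorted(
        enumerate(candidates),
        key=lambda item: (connections.get(item[1], 0), item[0]),
        reverse=True,
    )
    return [prop for _, prop in ranked[:s]]


candidates = [("P1", "A"), ("P2", "A", "B"), ("P3", "A", "C"), ("P4", "D")]
connections = {candidates[0]: 3, candidates[1]: 1, candidates[2]: 1, candidates[3]: 0}
print(select_for_buffer(candidates, connections, s=2))
```

In this example P1 is selected for its many connections, and P3 is preferred over the equally connected P2 because it is more recent. Swapping in a different key function yields a different strategy and, through the number of cycles each proposition participates in, a different pattern of predicted reproduction probabilities.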
It is already clear that a random selection strategy will not account for paragraph recall data, and that some selection strategy incorporating an “importance” principle is required. The evidence comes from work on what has been termed the “levels effect” (Kintsch & Keenan, 1973; Kintsch et al., 1975; Meyer, 1975). It has been observed that propositions belonging to high levels of a text-base hierarchy are much better recalled (by a factor of two or three) than propositions low in the hierarchy. The text-base hierarchies in these studies were constructed as follows. The topical proposition or propositions of a paragraph (as determined by its title or otherwise by intuition) were selected as the top level of the hierarchy, and all those propositions connected to them via argument overlap were determined, forming the next level of the hierarchy, and so on. Thus, these hierarchies were based on referential coherence, that is, argument overlap among propositions. In fact, the propositional networks constructed here are identical with these hierarchies, except for changes introduced by the fact that in the present model, coherence graphs are constructed cycle by cycle.
Note that the present model suggests an interesting reinterpretation of this levels effect.
Suppose that the selection strategy, as discussed above, is indeed biased in favor of important, that is, high-level propositions. Then, such propositions will, on the average, be processed more frequently than low-level propositions and recalled better. The better recall of high-level propositions can then be explained because they are processed differently than low-level propositions. Thus, we are now able to add a processing explanation to our previous account of levels effects, which was merely in terms of a correlation between some structural aspects of texts and recall probabilities.

Although we have been concerned here only with the organization of the micropropositions, one should not forget that other processes, such as macro-operations, are going on at the same time, and that therefore the buffer must contain the information about macropropositions and presuppositions that is required to establish the global coherence of the discourse. One could extend the model in that direction. Instead, it appears to us worthwhile to study these various processing components in isolation, in order to reduce the complexity of our problem to manageable proportions. We shall remark on the feasibility of this research strategy below.

Testing the Model

The frequencies with which different text propositions are recalled provide the major possibility for testing the model experimentally. It is possible to obtain good estimates of such frequencies for a given population of readers and texts. Then, the best-fitting model can be determined by varying the three parameters of the model ($n$, the maximum input size per cycle; $s$, the capacity of the short-term buffer; and $p$, the reproduction probability) while at the same time exploring different selection strategies. Thus, which selection strategy is used by a given population of readers and for a given type of text becomes decidable empirically. For instance, in the example described below, a selection strategy has been assumed that emphasizes recency and frequency (the “leading-edge” strategy). It is conceivable that for college students reading texts rather carefully under laboratory conditions, a fairly sophisticated strategy like this might actually be used (i.e., it would lead to fits of the model detectably better than alternative assumptions). At the same time, different selection strategies might be used in other reader populations or under different reading conditions. It would be worthwhile to investigate whether there are strategy differences between good and poor readers, for instance.
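The fitting step can be illustrated for the simplest case, estimating $p$ alone. The sketch below assumes that $n$, $s$, and the selection strategy have already been fixed, so that the number of cycles each proposition participated in is known; it then finds $p$ by a least-squares grid search. The fitting criterion, the function names, and the data are assumptions for illustration: the "observed" frequencies are synthetic values generated from the model itself, not experimental results.

```python
def predicted_recall(p, cycles):
    """Predicted reproduction probability for each proposition."""
    return [1 - (1 - p) ** k for k in cycles]


def fit_p(cycles, observed, grid_steps=100):
    """Grid search for the p that minimizes the squared prediction error."""
    best_p, best_err = None, float("inf")
    for i in range(1, grid_steps):
        p = i / grid_steps
        err = sum((o - q) ** 2
                  for o, q in zip(observed, predicted_recall(p, cycles)))
        if err < best_err:
            best_p, best_err = p, err
    return best_p


# Number of processing cycles per proposition (hypothetical values).
cycles = [1, 2, 3, 1]
# Synthetic "observed" frequencies, generated from the model with p = 0.3.
observed = predicted_recall(0.3, cycles)
print(fit_p(cycles, observed))
```

A full fit would repeat this search for each combination of $n$, $s$, and selection strategy, recomputing the cycle counts each time, and keep the combination with the smallest error.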

If the model is successful, the values of the parameters $s$, $n$, and $p$, and their dependence on text and reader characteristics might also prove to be informative. A number of factors might influence $s$, the short-term memory capacity. Individuals differ in their short-term memory capacity. Perfetti and Goldman (1976), for example, have shown that good readers are capable of holding more of a text in short-term memory than poor readers. At the same time, good and poor readers did not differ on a conventional memory-span test. Thus, the differences observed in the reading task may not be due to capacity differences per se. Instead, they could be related to the observation of Hunt, Lunneborg, and Lewis (1975) that persons with low verbal abilities are slower in accessing information in short-term memory. According to the model, a comprehender continually tests input propositions against the contents of the short-term buffer; even a slight decrease in the speed with which these individual operations are performed would result in a noticeable deterioration in performance. In effect, lowering the speed of scanning and matching operations would have the same effect as decreasing the capacity of the buffer.
The buffer capacity may also depend on the difficulty of the text, however, or more precisely, on how difficult particular readers find a text. Presumably, the size of the buffer depends, within some limits, on the amount of resources that must be devoted to other aspects of processing (perceptual decoding, syntactic-semantic analyses, inference generation, and the macro-operations discussed below). The greater the automaticity of these processes and the fewer inferences required, the larger the buffer a reader has to operate with. Certainly, familiarity should have a pronounced effect on $n$, the number of propositions accepted per cycle. The process of constructing a text base is perhaps best described as apperception. That is, a reader’s knowledge determines to a large extent the meaning that he or she derives from a text. If the knowledge base is lacking, the reader will not be able to derive the same meaning that a person with adequate knowledge, reading the same text, would obtain. Unfamiliar material would have to be processed in smaller chunks than familiar material, and hence, $n$ should be directly related to familiarity. No experimental tests of this prediction appear to exist at present. However, other factors, too, might influence $n$. Since it is determined by the surface form of the text, increasing the complexity of the surface form while leaving the underlying meaning intact ought to decrease the size of the processing chunks. Again, no data are presently available, though Kintsch and Monk (1972) and King and Greeno (1974) have reported that such manipulations affect reading times.
Finally, the third parameter of the model, $p$, would also be expected to depend on familiarity for much the same reasons as $s$: The more familiar a text, the fewer resources are required for other aspects of processing, and the more resources are available to store individual propositions in memory. The value of $p$ should, however, depend above all on the task demands that govern the comprehension process as well as the later production process. If a long text is read with attention focused mainly on gist comprehension, the probability of storing individual propositions of the text base should be considerably lower than when a short paragraph is read with immediate recall instructions. Similarly, when summarizing a text, the value of $p$ should be lower than in recall because individual micropropositions are given less weight in a summary relative to macropropositions.
Note that in a model like the present one, such factors as familiarity may have rather complex effects. Not only can familiar material be processed in larger chunks and retained more efficiently in the memory buffer; if a topic is unfamiliar, there will be no frame available to organize and interpret a given proposition sequence (e.g., for the purpose of generating inferences from it), so readers might continue to pick up new propositions in the hope of finding information that may organize what they already hold in their buffer. If, however, the crucial piece of information fails to arrive, the working memory will be quickly overloaded and incomprehension will result. Thus, familiarity may have effects on comprehension not only at the level of processing considered here (the construction of a coherent text base) but also at higher processing levels.

Readability

The model’s predictions are relevant not only for recall but also for the readability of texts. This aspect of the model has been explored by Kintsch and Vipond (1978) and will be only briefly summarized here. It was noted there that conventional accounts of readability have certain shortcomings, in part because they do not concern themselves with the organization of long texts, which might be expected to be important for the ease or difficulty with which a text can be comprehended. The present model makes some predictions about readability that go beyond these traditional accounts. Comprehension in the normal case is a fully automatic process, that is, it makes low demands on resources. Sometimes, however, these normal processes are blocked, for instance, when a reader must retrieve a referent no longer available in his or her working memory, which is done almost consciously, requiring considerable processing resources. One can simulate the comprehension process according to the model and determine the number of resource-consuming operations in this process, that is, the number of long-term memory searches required and the number of inferences required. The assumption was made that each one of these operations disrupts the automatic comprehension processes and adds to the difficulty of reading. On the other hand, if these operations are not performed by a reader, the representation of the text that this reader arrives at will be incoherent. Hence, the retrieval of the text base and all performance depending on its retrieval (such as recall, summarizing, or question answering) will be poor. Texts requiring many operations that make high demands on resources should yield either increased reading times or low scores on comprehension tests. Comprehension, therefore, must be evaluated in a way that considers both comprehension time and test performance because of the trade-off between the two. Only measures such as reading time per proposition recalled (used in Kintsch et al., 1975) or reading speed adjusted for comprehension (suggested by Jackson & McClelland, 1975) can therefore be considered adequate measures of comprehension difficulty.
The factors related to readability in the model depend, of course, on the input size per cycle ( n n nn ) and the short-term memory capacity ( s s ss ) that are assumed. In addition, they depend on the nature of the selection strategy used, since that determines which propositions in each cycle are retained in the short-term buffer to be interconnected with the next set of input propositions. A reader with a poor selection strategy and a small buffer, reading unfamiliar material, might have all kinds of problems with a text that would be highly readable for a good reader. Thus, readability cannot be considered a property of texts alone, but one of the textreader interaction. Indeed, some preliminary analyses reported by Kintsch and Vipond (1978) show that the readability of some texts changes a great deal as a function of the shortterm memory capacity and the size of input chunks: Some texts were hard for all parameter combinations that were explored; others were easy in every case; still others could be processed by the model easily when short-term memory and input chunks were large, yet became very difficult for small values of these parameters.
当然,模型中与可读性有关的因素取决于每个周期的输入量($n$)和假定的短期记忆容量($s$)。此外,它们还取决于所使用的选择策略的性质,因为选择策略决定了每个周期中哪些命题被保留在短期缓冲区中,以便与下一组输入命题相互连接。一个选择策略差、缓冲区小的读者,在阅读不熟悉的材料时,可能会对一篇对优秀读者来说可读性很高的文章产生各种问题。因此,可读性不能被认为仅仅是文本的属性,而是文本与读者交互的属性。事实上,Kintsch 和 Vipond(1978 年)报告的一些初步分析表明,一些文本的可读性随短期记忆容量和输入块大小的变化而发生很大的变化:有些文本在探索的所有参数组合中都很困难;有些文本在任何情况下都很容易;还有一些文本在短期记忆和输入块较大时可以很容易地被模型处理,但在这些参数值较小时就变得非常困难。

Macrostructure of a Text
文本的宏观结构

Macro-operators transform the propositions of a text base into a set of macropropositions that represent the gist of the text. They do so by deleting or generalizing all propositions that are either irrelevant or redundant and by constructing new inferred propositions. “Delete” here does not mean “delete from memory” but “delete from the macrostructure.” Thus, a given text proposition (a microproposition) may be deleted from the text’s macrostructure but, nevertheless, be stored in memory and subsequently recalled as a microproposition.
宏运算符可将文本基础的命题转换为一组代表文本要点的宏命题。它们通过删除或概括所有不相关或多余的命题,并构建新的推断命题来实现这一目的。这里的 "删除 "不是指 "从记忆中删除",而是指 "从宏观结构中删除"。因此,一个给定的文本命题--微命题--可能会从文本的宏观结构中删除,但仍会存储在记忆中,随后作为微命题被唤起。

Role of the Schema
模式的作用

The reader’s goals in reading control the application of the macro-operators. The formal representation of these goals is the schema. The schema determines which micropropositions or generalizations of micropropositions are relevant and, thus, which parts of the text will form its gist.
读者的阅读目标控制着宏运算符的应用。这些目标的正式表述就是图式。模式决定了哪些微命题或微命题的概括是相关的,因此,文本的哪些部分将构成其要点。

It is assumed that text comprehension is always controlled by a specific schema. However, in some situations, the controlling schema may not be detailed, nor predictable. If a reader’s goals are vague, and the text that he or she reads lacks a conventional structure, different schemata might be set up by different readers, essentially in an unpredictable manner. In such cases, the macro-operations would also be unpredictable. Research on comprehension must concentrate on those cases where texts are read with clear goals that are shared among readers. Two kinds of situations qualify in this respect. First of all, there are a number of highly conventionalized text types. If a reader processes such texts in accordance with their conventional nature, specific well-defined schemata are obtained. These are shared by the members of a given cultural group and, hence, are highly suitable for research purposes. Familiar examples of such texts are stories (e.g., Kintsch & van Dijk, 1975) and psychological research reports (Kintsch, 1974). These schemata specify both the schematic categories of the texts (e.g., a research report is supposed to contain introduction, method, results, and discussion sections), as well as what information in each section is relevant to the macrostructure (e.g., the introduction of a research report must specify the purpose of the study). Predictions about the macrostructure of such texts require a thorough understanding of the nature of their schemata. Following the lead of anthropologists (e.g., Colby, 1973) and linguists (e.g., Labov & Waletzky, 1967), psychologists have so far given the most attention to story schemata, in part within the present framework (Kintsch, 1977; Kintsch & Greene, 1978; Poulsen, Kintsch, Kintsch, & Premack, in press; van Dijk, 1977b) and in part within the related “story grammar” approach (Mandler & Johnson, 1977; Rumelhart, 1975; Stein & Glenn, in press).
一般认为,文本理解总是受特定模式的控制。然而,在某些情况下,控制图式可能并不详细,也无法预测。如果读者的目标是模糊的,而他或她阅读的文本又缺乏常规结构,那么不同的读者可能会建立不同的图式,而且基本上是以不可预测的方式建立的。在这种情况下,宏观操作也是不可预测的。关于理解的研究必须集中于读者在阅读文本时有明确的共同目标的情况。在这方面,有两种情况符合条件。首先,有许多高度常规化的文本类型。如果读者按照这些文本的常规性质进行处理,就会获得特定的、定义明确的图式。这些图式是特定文化群体成员共有的,因此非常适合用于研究目的。故事(如 Kintsch & van Dijk, 1975)和心理研究报告(Kintsch, 1974)就是这类文本的熟悉例子。这些图式既规定了文本的图式类别(例如,研究报告应该包括引言、方法、结果和讨论等部分),也规定了每个部分中与宏观结构相关的信息(例如,研究报告的引言必须说明研究目的)。要预测此类文章的宏观结构,就必须透彻了解其图式的性质。在人类学家(如 Colby, 1973)和语言学家(如 Labov & Waletzky, 1967)的引领下,到目前为止,心理学家对故事图式的关注最多,其中一部分是在本框架内(Kintsch, 1977; Kintsch & Greene, 1978; Poulsen, Kintsch, Kintsch, & Premack, in press; van Dijk, 1977b),另一部分是在相关的 "故事语法 "方法内(Mandler & Johnson, 1977; Rumelhart, 1975; Stein & Glenn, in press)。
A second type of reading situation where well-defined schemata exist comprises those cases where one reads with a special purpose in mind. For instance, Hayes, Waterman, and Robinson (1977) have studied the relevance judgments of subjects who read a text with a specific problem-solving set. It is not necessary that the text be conventionally structured. Indeed, the special purpose overrides whatever text structure there is. For instance, one may read a story with the processing controlled not by the usual story schema but by some special-purpose schema established by task instructions, special interests, or the like. Thus, Decameron stories may be read not for the plot and the interesting events but because of concern with the role of women in fourteenth-century Italy or with the attitudes of the characters in the story toward morality and sin.
第二种存在明确图式的阅读情况包括带着特殊目的阅读的情况。例如,Hayes、Waterman 和 Robinson(1977 年)研究了带着解决问题的特定目的阅读文章的受试者的相关性判断。文本不一定要采用传统的结构。事实上,无论文本结构如何,其特殊目的都会压倒一切。例如,一个人在阅读一个故事时,其处理过程可能不是由通常的故事模式控制,而是由任务指示、特殊兴趣或类似情况所建立的一些特殊目的模式控制。因此,阅读《十日谈》故事可能不是为了情节和有趣的事件,而是因为关注十四世纪意大利妇女的角色或故事中人物对道德和罪的态度。
There are many situations where comprehension processes are similarly controlled. As one final example, consider the case of reading the review of a play with the purpose of deciding whether or not to go to that play. What is important in a play for the particular reader serves as a schema controlling the gist formation: There are open slots in this schema, each one standing for a property of plays that this reader finds important, positively or negatively, and the reader will try to fill these slots from the information provided in the review. Certain propositions in the text base of the review will be marked as relevant and assigned to one of these slots, while others will be disregarded. If thereby not all slots are filled, inference processes will take over, and an attempt will be made to fill in the missing information by applying available knowledge frames to the information presented directly. Thus, while the macro-operations themselves are always information reducing, the macrostructure may also contain information not directly represented in the original text base, when such information is required by the controlling schema. If the introduction to a research report does not contain a statement of the study’s purpose, one will be inferred according to the present model.
在许多情况下,理解过程也会受到类似的控制。最后一个例子是,阅读戏剧评论的目的是决定是否去看该剧。对于特定的读者来说,一部戏剧中什么是重要的,这就是控制要点形成的图式:在这个图式中存在一些空位,每一个空位都代表了该读者认为重要的戏剧属性,无论是正面的还是负面的,读者将试图从剧评中提供的信息中填补这些空位。评论文本基础中的某些命题将被标记为相关命题并分配到其中一个空格中,而其他命题将被忽略。如果不是所有的槽都被填满,推理过程就会接手,并尝试将现有的知识框架应用到直接呈现的信息中,以填补缺失的信息。因此,虽然宏观操作本身总是在减少信息,但宏观结构也可能包含原始文本库中没有直接表示的信息,如果控制模式需要这些信息的话。如果研究报告的引言没有包含研究目的的说明,那么根据本模式就可以推断出研究目的。
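The slot-filling account of the play-review example can be sketched in Python. This is only an illustration under stated assumptions: the function name `fill_schema`, the slot names, the relevance mapping, and the default fillers are all hypothetical and are not specified by the model.

```python
def fill_schema(slots, propositions, relevance, knowledge_defaults):
    """Assign relevant propositions to schema slots and infer missing fillers.

    slots:              slot names (properties the reader finds important)
    propositions:       propositions from the text base, in text order
    relevance:          hypothetical mapping from a proposition to its slot;
                        propositions absent from the mapping are disregarded
    knowledge_defaults: hypothetical fillers supplied by general knowledge
                        when the text leaves a slot open
    """
    filled = {slot: [] for slot in slots}
    for prop in propositions:
        slot = relevance.get(prop)
        if slot in filled:
            filled[slot].append(prop)
    # Inference takes over only for slots the text left empty.
    inferred = {
        slot: knowledge_defaults[slot]
        for slot, props in filled.items()
        if not props and slot in knowledge_defaults
    }
    return filled, inferred


# Hypothetical review: two relevant propositions, one irrelevant one.
slots = ["acting", "plot", "length"]
review = ["(GOOD, ACTING)", "(WEAR, ACTOR, HAT)", "(SLOW, PLOT)"]
relevance = {"(GOOD, ACTING)": "acting", "(SLOW, PLOT)": "plot"}
defaults = {"length": "(NORMAL, LENGTH)"}
filled, inferred = fill_schema(slots, review, relevance, defaults)
```

In this sketch the macrostructure can end up with a filler ("length") that never appeared in the text base, which is the point made above about schema-required inferences.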
In many cases, of course, people read loosely structured texts with no clear goals in mind. The outcome of such comprehension
当然,在很多情况下,人们在阅读结构松散的文章时并没有明确的目标。这种理解的结果

processes, as far as the resulting macrostructure is concerned, is indeterminate. We believe, however, that this is not a failure of the model: If the schema that controls the macro-operations is not well defined, the outcome will be haphazard, and we would argue that no scientific theory, in principle, can predict it.
就所产生的宏观结构而言,这些过程是不确定的。但我们认为,这并不是模型的失败:如果控制宏观运行的模式没有得到很好的定义,那么结果将是杂乱无章的,我们认为原则上任何科学理论都无法预测它。

Macro-operators 宏观操作员

The schema thus classifies all propositions of a text base as either relevant or irrelevant. Some of these propositions have generalizations or constructions that are also classified in the same way. In the absence of a general theory of knowledge, whether a microproposition is generalizable in its particular context or can be replaced by a construction must be decided on the basis of intuition at present. Each microproposition may thus be either deleted from the macrostructure or included in the macrostructure as is, or its generalization or construction may be included in the macrostructure. A proposition or its generalization/construction that is included in the macrostructure is called a macroproposition. Thus, some propositions may be both micro- and macropropositions (when they are relevant); irrelevant propositions are only micropropositions, and generalizations and constructions are only macropropositions.$^{3}$
因此,该模式将文本库中的所有命题分为相关或不相关。其中一些命题的概括或构造也以同样的方式进行分类。在缺乏一般知识理论的情况下,一个微命题在其特定语境中是否可以被概括,或者是否可以被一个构造所取代,目前必须根据直觉来决定。因此,每个微命题既可以从宏观结构中删除,也可以原封不动地包含在宏观结构中,或者将其泛化或构造包含在宏观结构中。包含在宏观结构中的命题或其概括/构造称为宏观命题。因此,有些命题既可能是微命题,也可能是宏命题(当它们相关时);不相关的命题只能是微命题,而概括和构造只能是宏命题。$^{3}$

The macro-operations are performed probabilistically. Specifically, irrelevant micropropositions never become macropropositions. But, if such propositions have generalizations or constructions, their generalizations or constructions may become macropropositions, with probability $g$ when they, too, are irrelevant, and with probability $m$ when they are relevant. Relevant micropropositions become macropropositions with probability $m$; if they have generalizations or constructions, these are included in the macrostructure, also with probability $m$. The parameters $m$ and $g$ are reproduction probabilities, just as $p$ in the previous section: They depend on both storage and retrieval conditions. Thus, $m$ may be small because the reading conditions did not encourage macroprocessing (e.g., only a short paragraph was read, with the expectation of an immediate recall test) or because the production was delayed for such a long time that
宏观操作是以概率方式进行的。具体来说,无关的微命题永远不会成为宏命题。但是,如果这些命题有概括或构造,那么它们的概括或构造就可能成为宏命题,当它们也是不相关的时候,概率为 $g$;当它们是相关的时候,概率为 $m$。相关的微命题成为宏命题的概率为 $m$;如果微命题有概括或构造,这些概括或构造就会被包含在宏结构中,概率也是 $m$。参数 $m$ 和 $g$ 是重现概率,就像上一节中的 $p$ 一样:它们取决于存储和检索条件。因此,$m$ 之所以小,可能是因为阅读条件不鼓励进行宏观处理(例如,只读了一小段,并预期会立即进行回忆测试),也可能是因为产出延迟了很长时间,以至于

even the macropropositions have been forgotten.
甚至连宏观命题也被遗忘了。
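The probabilistic rule just stated can be written out as a small Python sketch. The function names and the example values of `m` and `g` are hypothetical; the treatment of a generalization of a relevant microproposition (always probability `m`) is one reading of the text, not a confirmed detail of the model.

```python
def micro_to_macro_probability(relevant, m):
    """Probability that a microproposition itself becomes a macroproposition."""
    # Irrelevant micropropositions never enter the macrostructure.
    return m if relevant else 0.0


def derived_to_macro_probability(source_relevant, derived_relevant, m, g):
    """Probability that a generalization or construction becomes a macroproposition.

    source_relevant:  whether the underlying microproposition is relevant
    derived_relevant: whether the generalization/construction is itself relevant
    """
    if source_relevant or derived_relevant:
        return m
    # Both the microproposition and its generalization are irrelevant.
    return g


# Hypothetical parameter values, chosen only for illustration.
m, g = 0.4, 0.1
examples = [
    micro_to_macro_probability(True, m),
    micro_to_macro_probability(False, m),
    derived_to_macro_probability(False, False, m, g),
    derived_to_macro_probability(False, True, m, g),
]
```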
Macrostructures are hierarchical, and hence, macro-operations are applied in several cycles, with more and more stringent criteria of relevance. At the lowest level of the macrostructure, relatively many propositions are selected as relevant by the controlling schema and the macro-operations are performed accordingly. At the next level, stricter criteria for relevance are assumed, so that only some of the first-level macropropositions are retained as second-level macropropositions. At subsequent levels, the criterion is strengthened further until only a single macroproposition (essentially a title for that text unit) remains. A macroproposition that was selected as relevant at $k$ levels of the hierarchy has therefore a reproduction probability of $1-(1-m)^{k}$, which is the probability that at least one of the $k$ times that this proposition was processed results in a successful reproduction.
宏观结构是分层的,因此,宏观运算要分几个周期进行,相关性的标准也越来越严格。在宏观结构的最底层,控制模式会选择相对较多的相关命题,并据此进行宏观运算。在下一级,相关性标准更加严格,因此只有部分一级宏命题被保留为二级宏命题。在随后的层级中,标准会进一步加强,直到只剩下一个宏命题(基本上是该文本单元的标题)为止。因此,在 $k$ 个层级中被选为相关的宏命题的重现概率为 $1-(1-m)^{k}$,即该命题在 $k$ 次处理中至少有一次成功重现的概率。
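As a worked illustration of the formula above, the following Python sketch computes the reproduction probability for a macroproposition selected at `k` levels. The function name and the example value `m = 0.5` are hypothetical.

```python
def reproduction_probability(m, k):
    """Probability that at least one of k independent processing
    opportunities, each succeeding with probability m, yields a
    successful reproduction: 1 - (1 - m)**k."""
    if not 0.0 <= m <= 1.0:
        raise ValueError("m must be a probability between 0 and 1")
    if k < 0:
        raise ValueError("k must be non-negative")
    return 1.0 - (1.0 - m) ** k


# With a hypothetical m = 0.5, each additional level halves the
# chance that the macroproposition is never reproduced.
for k in (1, 2, 3):
    print(k, reproduction_probability(0.5, k))
```

With `m = 0.5` this prints 0.5, 0.75, and 0.875 for one, two, and three levels, so propositions that survive to higher levels of the macrostructure are more likely to be reproduced.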

Production 生产

Recall or summarization protocols obtained in experiments are texts in their own right, satisfying the general textual and contextual conditions of production and communication. A protocol is not simply a replica of a memory representation of the original discourse. On the contrary, the subject will try to produce a new text that satisfies the pragmatic conditions of a particular task context in an experiment or the requirements of effective communication in a more natural context. Thus, the language user will not produce information that he or she assumes is already known or redundant. Furthermore, the operations involved in discourse production are so complex that the subject will be unable to retrieve at any one time all the information that is in principle accessible to memory. Finally, protocols will contain information not based on what the subject remembers from the original text, but
在实验中获得的回忆或概括协议本身就是文本,符合制作和交流的一般文本和语境条件。协议并不是原始话语记忆表征的简单复制品。恰恰相反,受试者会努力制作一个新的文本,以满足实验中特定任务语境的语用条件或在更自然的语境中进行有效交际的要求。因此,语言使用者不会产生他或她认为已知或多余的信息。此外,话语生成所涉及的操作非常复杂,受试者在任何时候都不可能检索到原则上可用于记忆的所有信息。最后,协议所包含的信息不是基于主体从原文中记住的内容,而是
consisting of reconstructively added details, explanations, and various features that are the result of output constraints characterizing production in general.
由重建添加的细节、解释和各种特征组成,这些特征是一般生产限制的结果。

Optional transformations. Propositions stored in memory may be transformed in various essentially arbitrary and unpredictable ways, in addition to the predictable, schema-controlled transformations achieved through the macro-operations that are the focus of the present model. An underlying conception may be realized in the surface structure in a variety of ways, depending on pragmatic and stylistic concerns. Reproduction (or reconstruction) of a text is possible in terms of different lexical items or larger units of meaning. Transformations may be applied at the level of the microstructure, the macrostructure, or the schematic structure. Among these transformations one can distinguish reordering, explication of coherence relations among propositions, lexical substitutions, and perspective changes. These transformations may be a source of errors in protocols, too, though most of the time they preserve meaning. Whether such transformations are made at the time of comprehension, or at the time of production, or both cannot be decided at present.
可选变换。存储在内存中的命题,除了通过本模型所关注的宏观操作实现的可预测的、由图式控制的转换之外,还可以通过各种本质上任意的和不可预测的方式进行转换。一个基本概念可以通过各种方式在表层结构中实现,这取决于实用性和文体方面的考虑。可以通过不同的词项或更大的意义单位来复制(或重建)文本。转换可以应用于微观结构、宏观结构或图式结构层面。在这些转换中,我们可以区分出重新排序、阐述命题之间的连贯关系、词汇替换和视角变化。这些转换也可能是协议错误的根源,尽管它们在大多数情况下都能保留意义。至于这些转换是在理解时进行的,还是在生成时进行的,抑或是两者兼而有之,目前尚无定论。

A systematic account of optional transformations might be possible, but it is not necessary within the present model. It would make an already complex model and scoring system even more cumbersome, and therefore, we choose to ignore optional transformations in the applications reported below.
对可选转换进行系统的说明也许是可能的,但在本模型中并无必要。这会使本已复杂的模型和评分系统更加繁琐,因此,我们选择在下文报告的应用中忽略可选变换。

Reproduction. Reproduction is the simplest operation involved in text production. A subject’s memory for a particular text is a memory episode containing the following types of memory traces: (a) traces from the various perceptual and linguistic processes involved in text processing, (b) traces from the comprehension processes, and (c) contextual traces. Among the first would be memory for the typeface used or memory for particular words and phrases. The third kind of trace permits the subject to remember the circumstances in which the processing took place: the laboratory setting, his or her own reactions to the whole procedure, and so on. The model is not concerned with either of these two types of memory traces. The traces resulting from the
复制。复制是文本制作过程中最简单的操作。受试者对特定文本的记忆是一个记忆片段,包含以下类型的记忆痕迹:(a)文本处理过程中各种感知和语言过程的痕迹;(b)理解过程的痕迹;以及(c)上下文痕迹。前者包括对所用字体的记忆或对特定单词和短语的记忆。第三种痕迹允许被试记住处理过程中的环境:实验室环境、自己对整个过程的反应等。本模型不涉及这两种记忆痕迹中的任何一种。由

comprehension processes, however, are specified by the model in detail. Specifically, those traces consist of a set of micropropositions. For each microproposition in the text base, the model specifies a probability that it will be included in this memory set, depending on the number of processing cycles in which it participated and on the reproduction parameter $p$. In addition, the memory episode contains a set of macropropositions, with the probability that a macroproposition is included in that set being specified by the parameter $m$. (Note that the parameters $m$ and $p$ can only be specified as a joint function of storage and retrieval conditions; hence, the memory content on which a production is based must also be specified with respect to the task demands of a particular production situation.)
然而,理解过程是由模型详细规定的。具体来说,这些痕迹由一组微命题组成。对于文本基础中的每一个微命题,模型都会根据其参与的处理循环次数和再现参数 $p$ 来指定其被包含在这一记忆集中的概率。此外,记忆片段还包含一组宏命题,参数 $m$ 指定了宏命题被包含在该集合中的概率。(请注意,参数 $m$ 和 $p$ 只能作为存储和检索条件的联合函数来指定;因此,必须根据特定产出情境的任务要求来指定产出所依据的记忆内容。)
A reproduction operator is assumed that retrieves the memory contents as described above, so that they become part of the subject’s text base from which the output protocol is derived.
假设有一个复制操作员,负责检索上述记忆内容,使其成为受试者文本库的一部分,并从中导出输出协议。
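The reproduction operator can be illustrated by sampling a memory episode in Python. This is a sketch under assumptions that the passage does not spell out: each microproposition is taken to be retained with probability $1-(1-p)^{k}$ after participating in $k$ cycles (the same at-least-once form given for macropropositions), and all names, counts, and parameter values below are hypothetical.

```python
import random


def at_least_once(prob, k):
    """Probability of at least one success in k independent tries."""
    return 1.0 - (1.0 - prob) ** k


def sample_memory_episode(micro_cycles, macro_levels, p, m, rng):
    """Draw one simulated memory episode.

    micro_cycles: {proposition id: number of processing cycles it took part in}
    macro_levels: {macroproposition id: number of hierarchy levels it was selected at}
    p, m:         reproduction parameters for micro- and macropropositions
    rng:          a random.Random instance, passed in for reproducibility
    """
    micro = {pid for pid, k in micro_cycles.items()
             if rng.random() < at_least_once(p, k)}
    macro = {pid for pid, k in macro_levels.items()
             if rng.random() < at_least_once(m, k)}
    return micro, macro


# Hypothetical cycle counts and parameter values.
rng = random.Random(0)
micro, macro = sample_memory_episode({1: 1, 4: 5, 19: 4}, {"M1": 2}, p=0.2, m=0.4, rng=rng)
```

Repeating the draw many times and averaging would give predicted reproduction frequencies that could be compared with recall protocols.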
Reconstruction. When micro- or macroinformation is no longer directly retrievable, the language user will usually try to reconstruct this information by applying rules of inference to the information that is still available. This process is modeled with three reconstruction operators. They consist of the inverse application of the macro-operators and result in the reconstruction of some of the information deleted from the macrostructure with the aid of world knowledge: (a) addition of plausible details and normal properties, (b) particularization, and (c) specification of normal conditions, components, or consequences of events. In all cases, errors may be made. The language user may make guesses that are plausible but wrong or even give details that are inconsistent with the original text. However, the reconstruction operators are not applied blindly. They operate under the control of the schema, just as in the case of the macro-operators. Only reconstructions that are relevant in terms of this control schema are actually produced. Thus, for instance, when “Peter went to Paris by train” is a remembered macroproposition, it might be expanded by means of reconstruction operations to include such a normal component as “He went into the station to buy a ticket,”
重建。当微观或宏观信息已无法直接检索时,语言用户通常会尝试通过对仍然可用的信息应用推理规则来重建这些信息。这一过程可以用三个重构算子来模拟。它们由宏观运算符的反向应用组成,借助世界知识重建从宏观结构中删除的部分信息:(a) 添加可信的细节和正常属性,(b) 特殊化,(c) 指定正常条件、组成部分或事件后果。在所有情况下,都可能出现错误。语言使用者可能会做出似是而非的猜测,甚至给出与原文不一致的细节。然而,重构运算符并不是盲目使用的。它们在模式的控制下运行,就像宏运算符一样。只有与控制模式相关的重构才会被实际生成。因此,举例来说,当 "彼得坐火车去巴黎 "是一个记忆中的宏命题时,可以通过重构运算将其扩展为 "他去车站买票 "这样的普通成分,

but it would not include irrelevant reconstructions such as “His leg muscles contracted.” The schema in control of the output operation need not be the same as the one that controlled comprehension: It is perfectly possible to look at a house from the standpoint of a buyer and later to change one’s viewpoint to that of a prospective burglar, though different information will be produced than in the case when no such schema change has occurred (Anderson & Pichert, 1978).
但不包括 "他的腿部肌肉收缩 "等无关的重构。控制输出操作的图式不必与控制理解的图式相同:从买主的角度看一栋房子,然后再将自己的视角转变为潜在窃贼的视角,这完全是可能的,尽管产生的信息会与没有发生这种图式变化的情况不同(Anderson & Pichert,1978 年)。

Metastatements. In producing an output protocol, a subject will not only operate directly on available information but will also make all kinds of metacomments on the structure, the content, or the schema of the text. The subject may also add comments, opinions, or express his or her attitude.
元评论。在制作输出协议时,主体不仅会直接对可用信息进行操作,还会对文本的结构、内容或模式进行各种元评论。主体还可以添加评论、意见或表达自己的态度。

Production plans. In order to monitor the production of propositions and especially the connections and coherence relations, it must be assumed that the speaker uses the available macropropositions of each fragment of the text as a production plan. For each macroproposition, the speaker reproduces or reconstructs the propositions dominated by it. Similarly, the schema, once actualized, will guide the global ordering of the production process, for example, the order in which the macropropositions themselves are actualized. At the microlevel, coherence relations will determine the ordering of the propositions to be expressed, as well as the topic-comment structure of the respective sentences. Both at the micro- and macrolevels, production is guided not only by the memory structure of the discourse itself but also by general knowledge about the normal ordering of events and episodes, general principles of ordering of events and episodes, general principles of ordering of information in discourse, and schematic rules. This explains, for instance, why speakers will often transform a noncanonical ordering into a more canonical one (e.g., when summarizing scrambled stories, as in Kintsch, Mandel, & Kozminsky, 1977).
制作计划。为了监控命题的生成,尤其是连接和连贯关系,必须假定说话者使用文本每个片段的可用宏命题作为生成计划。对于每个宏命题,说话者都会再现或重构受其支配的命题。同样,模式一旦实现,将指导生产过程的整体排序,例如宏观命题本身实现的顺序。在微观层面,连贯关系将决定要表达的命题的排序,以及相应句子的主题-评论结构。无论是在微观层面还是宏观层面,语篇的生成不仅受话语本身的记忆结构的指导,而且还受有关事件和情节正常排序的一般知识、事件和情节排序的一般原则、话语中信息排序的一般原则以及图式规则的指导。例如,这就解释了为什么说话者经常会将非规范排序转化为更规范的排序(例如,在总结乱码故事时,如 Kintsch, Mandel, & Kozminsky, 1977)。

Just as the model of the comprehension process begins at the propositional level rather than at the level of the text itself, the production model will also leave off at the propositional level. The question of how the text base that underlies a subject’s protocol is
正如理解过程的模型始于命题层面而非文本层面,制作模型也将从命题层面开始。作为主体协议基础的文本基础是如何形成的?

transformed into an actual text will not be considered here, although it is probably a less intractable problem than that posed by the inverse transformation of verbal texts into conceptual bases.
在此,我们将不考虑将文本转化为实际文本的问题,尽管这个问题可能没有将语言文本反向转化为概念基础的问题那么棘手。
Text generation. Although we are concerned here with the production of second-order discourses, that is, discourses with respect to another discourse, such as free-recall protocols or summaries, the present model may eventually be incorporated into a more general theory of text production. In that case, the core propositions of a text would not be a set of propositions remembered from some specific input text, and therefore new mechanisms must be described that would generate core propositions de novo.
文本生成。虽然我们在这里关注的是二阶话语的生成,即与另一话语相关的话语,如自由回忆协议或摘要,但本模型最终可能会被纳入更一般的文本生成理论。在这种情况下,文本的核心命题就不是从某个特定输入文本中记忆出来的命题集,因此必须描述新的机制来重新生成核心命题。

Processing Simulation 加工模拟

How the model outlined in the previous section works is illustrated here with a simple example. A suitable text for this purpose must be (a) sufficiently long to ensure the involvement of macroprocesses in comprehension; (b) well structured in terms of a schema, so that predictions from the present model become possible; and (c) understandable without technical knowledge. A 1,300-word research report by Heussenstam (1971) called “Bumperstickers and the Cops,” appeared to fill these requirements. Its structure follows the conventions of psychological research reports closely, though it is written informally, without the usual subheadings indicating methods, results, and so on. Furthermore, it is concerned with a social psychology experiment that requires no previous familiarity with either psychology, experimental design, or statistics.
这里通过一个简单的例子来说明上一节概述的模型是如何运作的。为此,一篇合适的文章必须:(a) 足够长,以确保宏观过程参与到理解中;(b) 在模式方面结构合理,从而使本模式的预测成为可能;以及 (c) 无需技术知识即可理解。海森斯塔姆(Heussenstam,1971 年)撰写的一份 1300 字的研究报告《保险杠贴纸与警察》("Bumperstickers and the Cops")似乎满足了这些要求。该报告的结构严格遵循了心理学研究报告的惯例,尽管它写得很不正式,没有通常的小标题来说明方法、结果等。此外,它涉及的是一项社会心理学实验,不需要事先熟悉心理学、实验设计或统计学。
We cannot analyze the whole report here, though a tentative macroanalysis of “Bumperstickers” has been given in van Dijk (in press). Instead, we shall concentrate only on its first paragraph:
虽然 van Dijk(出版中)对 "Bumperstickers "进行了初步的宏观分析,但我们无法在此对整份报告进行分析。因此,我们只集中讨论报告的第一段:

A series of violent, bloody encounters between police and Black Panther Party members punctuated the early summer days of 1969. Soon after, a group of Black students I teach at California State College, Los Angeles, who were members of the Panther Party, began to complain of continuous harassment by law enforcement officers. Among their many grievances, they complained about receiving so many traffic
1969 年初夏,警察与黑豹党成员之间发生了一系列暴力血腥冲突。不久之后,我在洛杉矶加利福尼亚州立学院教书的一群黑人学生(他们是黑豹党成员)开始抱怨不断受到执法人员的骚扰。在他们的诸多不满中,他们抱怨受到如此多的交通管制。

citations that some were in danger of losing their driving privileges. During one lengthy discussion, we realized that all of them drove automobiles with Panther Party signs glued to their bumpers. This is a report of a study that I undertook to assess the seriousness of their charges and to determine whether we were hearing the voice of paranoia or reality. (Heussenstam, 1971, p. 32)
有些人可能会被吊销驾驶执照。在一次长时间的讨论中,我们意识到他们所有人驾驶的汽车保险杠上都粘有黑豹党的标志。这是我进行的一项研究的报告,目的是评估他们被指控的严重性,并确定我们听到的是偏执狂的声音还是现实。(海森斯塔姆,1971 年,第 32 页)
Our model does not start with this text but with its semantic representation, presented in Table 1. We follow here the conventions of Kintsch (1974) and Turner and Greene (in press). Propositions are numbered and are listed consecutively according to the order in which their predicates appear in the English text. Each proposition is enclosed by parentheses and contains a predicate (written first) plus one or more arguments. Arguments are concepts (printed in capital letters to distinguish them from words) or other embedded propositions, which are referred to by their number. Thus, as is shown in Table 1, (complain, student, 19) is shorthand for (complain, student, (harass, police, student)). The semantic cases of the arguments have not been indicated in Table 1 to keep the notation simple, but they are obvious in most instances.
我们的模型不是从文本开始的,而是从表 1 所示的语义表征开始的。我们在此沿用 Kintsch(1974 年)和 Turner 与 Greene(出版中)的惯例。命题被编号,并根据其谓词在英文文本中出现的顺序连续列出。每个命题都用括号括起来,包含一个谓词(先写)和一个或多个论据。论据是概念(用大写字母印刷,以区别于单词)或其他内嵌命题,用其编号表示。因此,如表 1 所示,(投诉、学生、19)是(投诉、学生、(骚扰、警察、学生))的简称。表 1 中没有标明参数的语义情况,以保持符号的简洁,但在大多数情况下它们是显而易见的。
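The numbering convention for embedded propositions can be made concrete with a short Python sketch. The representation chosen here (tuples of strings, with integers standing for references to other propositions) and the function names `expand` and `render` are assumptions made for illustration; only the two propositions themselves come from Table 1.

```python
def expand(prop_id, props, _seen=()):
    """Replace numeric arguments by the propositions they refer to, recursively."""
    if prop_id in _seen:
        raise ValueError(f"circular reference at proposition {prop_id}")
    predicate, *args = props[prop_id]
    expanded = [
        expand(arg, props, _seen + (prop_id,)) if isinstance(arg, int) else arg
        for arg in args
    ]
    return (predicate, *expanded)


def render(prop):
    """Format a (possibly nested) proposition in the parenthesized notation."""
    parts = [render(p) if isinstance(p, tuple) else p for p in prop]
    return "(" + ", ".join(parts) + ")"


# Two propositions from Table 1; integers are references, strings are concepts.
props = {
    17: ("COMPLAIN", "STUDENT", 19),
    19: ("HARASS", "POLICE", "STUDENT"),
}
print(render(expand(17, props)))
# prints: (COMPLAIN, STUDENT, (HARASS, POLICE, STUDENT))
```

Concepts that happen to look like numbers, such as the year in Proposition 7, would have to be stored as strings in this representation so that they are not mistaken for references.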
Some comments about this representation are in order. It is obviously very “surfacy” and, in connection with that, not at all unambiguous. This is a long way from a precise, logical formalism that would unambiguously represent “the meaning” of the text, preferably through some combination of elementary semantic predicates. The problem is not only that neither we nor anyone else has ever devised an adequate formalism of this kind; such a formalism would also be quite inappropriate for our purposes. If one wants to develop a psychological processing model of comprehension, one should start with a nonelaborated semantic representation that is close to the surface structure. It is then the task of the processing model to elaborate this representation in stages, just as a reader starts out with a conceptual analysis directly based on the surface structure but then constructs from this base more elaborate interpretations. To model some of these interpretative processes is the goal of the present enterprise. These processes change the initial semantic representation (e.g., by specifying coherence relations of a certain kind, adding required in-
关于这一表述,有必要作一些评论。它显然是非常 "表面化 "的,与此相关的是,它一点也不明确。这离精确的逻辑形式主义还有很长的路要走,而这种形式主义最好是通过一些基本语义谓词的组合来明确表示文本的 "意义"。问题不仅在于我们和其他人都没有设计出这种适当的形式主义,而且这种形式主义对于我们的目的来说也是非常不合适的。如果我们想建立一个理解的心理加工模型,就应该从一个接近表层结构的、未经加工的语义表征开始。然后,加工模型的任务就是分阶段阐释这一表征,就像读者一开始直接根据表层结构进行概念分析,然后在此基础上构建更精细的解释一样。对其中一些解释过程进行建模是本研究的目标。这些过程会改变最初的语义表征(例如,通过指定某种连贯关系、添加所需的语义内涵和语义外延)。
Table 1 表 1
Proposition List for the Bumperstickers Paragraph
标语段落的提案清单
| Proposition number 命题编号 | Proposition 命题 |
| :---: | :---: |
| 1 | (SERIES, ENCOUNTER) |
| 2 | (VIOLENT, ENCOUNTER) |
| 3 | (BLOODY, ENCOUNTER) |
| 4 | (BETWEEN, ENCOUNTER, POLICE, BLACK PANTHER) |
| 5 | (TIME: IN, ENCOUNTER, SUMMER) |
| 6 | (EARLY, SUMMER) |
| 7 | (TIME: IN, SUMMER, 1969) |
| 8 | (SOON, 9) |
| 9 | (AFTER, 4, 16) |
| 10 | (GROUP, STUDENT) |
| 11 | (BLACK, STUDENT) |
| 12 | (TEACH, SPEAKER, STUDENT) |
| 13 | (LOCATION: AT, 12, CAL STATE COLLEGE) |
| 14 | (LOCATION: AT, CAL STATE COLLEGE, LOS ANGELES) |
| 15 | (IS A, STUDENT, BLACK PANTHER) |
| 16 | (BEGIN, 17) |
| 17 | (COMPLAIN, STUDENT, 19) |
| 18 | (CONTINUOUS, 19) |
| 19 | (HARASS, POLICE, STUDENT) |
| 20 | (AMONG, COMPLAINT) |
| 21 | (MANY, COMPLAINT) |
| 22 | (COMPLAIN, STUDENT, 23) |
| 23 | (RECEIVE, STUDENT, TICKET) |
| 24 | (MANY, TICKET) |
| 25 | (CAUSE, 23, 27) |
| 26 | (SOME, STUDENT) |
| 27 | (IN DANGER OF, 26, 28) |
| 28 | (LOSE, 26, LICENSE) |
| 29 | (DURING, DISCUSSION, 32) |
| 30 | (LENGTHY, DISCUSSION) |
| 31 | (AND, STUDENT, SPEAKER) |
| 32 | (REALIZE, 31, 34) |
| 33 | (ALL, STUDENT) |
| 34 | (DRIVE, 33, AUTO) |
| 35 | (HAVE, AUTO, SIGN) |
| 36 | (BLACK PANTHER, SIGN) |
| 37 | (GLUED, SIGN, BUMPER) |
| 38 | (REPORT, SPEAKER, STUDY) |
| 39 | (DO, SPEAKER, STUDY) |
| 40 | (PURPOSE, STUDY, 41) |
| 41 | (ASSESS, STUDY, 42, 43) |
| 42 | (TRUE, 17) |
| 43 | (HEAR, 31, 44) |
| 44 | (OR, 45, 46) |
| 45 | (OF REALITY, VOICE) |
| 46 | (OF PARANOIA, VOICE) |
Note. Lines indicate sentence boundaries. Propositions are numbered for ease of reference. Numbers as propositional arguments refer to the proposition with that number.
注。横线表示句子边界。命题编号是为了便于参考。作为命题参数的数字指的是具有该数字的命题。

Cycle 2: Buffer: P3, 4, 5, 7; Input: P8-19.
周期 2:缓冲区:P3、4、5、7;输入:P8-19。

Cycle 3: Buffer: P4, 9, 15, 19; Input: P20-28.
周期 3:缓冲区:P4、9、15、19;输入:P20-28。

Cycle 4: Buffer: P4, 9, 15, 19; Input: P29-37.
周期 4:缓冲区:P4、9、15、19;输入:P29-37。

Cycle 5: Buffer: P4, 19, 36, 37; Input: P38-46.
周期 5:缓冲区:P4、19、36、37;输入:P38-46。

Figure 1. The cyclical construction of the coherence graph for the propositions (P) shown in Table 1.
图 1. 表 1 所示命题(P)的连贯图的循环构建。
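The buffer and input lists recovered from Figure 1 can be used to count how many processing cycles each proposition takes part in. The Python sketch below rests on two assumptions that are not stated in the figure residue: Cycle 1 is taken to have an empty buffer and Propositions 1-7 as input (the first sentence of the paragraph), and a proposition is counted as participating in a cycle whenever it appears in that cycle's buffer or input. The names `CYCLES` and `participation_counts` are hypothetical.

```python
from collections import Counter

# (buffer carried over, new input) for each cycle.
CYCLES = [
    (set(), range(1, 8)),             # Cycle 1: assumed, not listed in the figure text
    ({3, 4, 5, 7}, range(8, 20)),     # Cycle 2
    ({4, 9, 15, 19}, range(20, 29)),  # Cycle 3
    ({4, 9, 15, 19}, range(29, 38)),  # Cycle 4
    ({4, 19, 36, 37}, range(38, 47)), # Cycle 5
]


def participation_counts(cycles):
    """Count, for each proposition, the cycles in which it is in the buffer or input."""
    counts = Counter()
    for buffer, new_input in cycles:
        for prop_id in set(buffer) | set(new_input):
            counts[prop_id] += 1
    return counts


counts = participation_counts(CYCLES)
print(counts[4], counts[19], counts[9], counts[1])
# prints: 5 4 3 1
```

Under these assumptions Proposition 4 is processed in every cycle, which is why it would be among the best-remembered micropropositions in the model.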

ferences, and specifying a macrostructure) toward a deeper, less surface-dependent structure. Thus, the deep, elaborated semantic representation should stand at the end rather than at the beginning of a processing model, though even at that point some vagueness and ambiguity may remain. In general, neither the original text nor a person’s understanding of it is completely unambiguous.
因此,深层的、详尽的语义表征应该是处理模型的终点而不是起点,尽管即使在这一点上也可能仍然存在一些模糊性和歧义。一般来说,原文和个人对原文的理解都不是完全明确的。
This raises, of course, the question of what such a semantic representation represents. What is gained by writing (COMPLAIN, STUDENT) rather than "The students complained"? Capitalizing complain and calling it a concept rather than a word explains nothing in itself. There are two sets of reasons, however, why a model of comprehension must be based on a semantic representation rather than directly on English text. The notation clarifies which aspects of the text are important (semantic content) and which are not (e.g., surface structure) for our purpose; it provides a unit that appears appropriate for studying comprehension processes, that is, the proposition (see Kintsch & Keenan, 1973, for a demonstration that number of propositions in a sentence rather than number of words determines reading time); finally, this kind of notation greatly simplifies the scoring of experimental protocols in that it establishes fairly unambiguous equivalence classes within which paraphrases can be treated interchangeably.
当然,这就提出了这样一个问题:这样的语义表征代表了什么?写 (COMPLAIN, STUDENT) 而不是 “The students complained” 有什么好处?将 complain 大写,称其为概念而非单词,这本身并不能说明什么。然而,理解模型必须基于语义表征而不是直接基于英语文本,这其中有两组原因。对于我们的目的来说,这种符号明确了文本的哪些方面是重要的(语义内容),哪些方面是不重要的(例如,表面结构);它提供了一个看来适合研究理解过程的单位,即命题(参见 Kintsch & Keenan, 1973,该文证明决定阅读时间的是句子中的命题数而不是单词数);最后,这种符号大大简化了实验记录的评分,因为它建立了相当明确的等价类,在这些等价类中,转述可以互换处理。
However, there are other even more important reasons for the use of semantic representations. Work that goes beyond the confines of the present article, for example, on the organization of text bases in terms of fact frames and the generation of inferences from such knowledge fragments, requires the development of suitable representations. For instance, complain is merely the name of a knowledge complex that specifies the normal uses of this concept, including antecedents and consequences. Inferences can be derived from this knowledge complex, for example, that the students were probably unhappy or angry or that they complained to someone, with the text supplying the professor for that role. Once a concept like complain is elaborated in that way, the semantic notation is anything but vacuous. Although at present we are not concerned with this problem, the possibility for this kind of extension must be kept open.
然而,使用语义表征还有其他更重要的原因。超出本文范围的工作,例如根据事实框架组织文本库以及根据这些知识片段生成推理,都需要开发合适的表征。例如,"抱怨 "只是一个知识复合体的名称,它规定了这一概念的正常用法,包括前因后果。从这个知识复合体中可以得出推论,例如,学生们可能不开心或生气了,或者他们向某人抱怨了,而文本提供了教授的角色。一旦像 "抱怨 "这样的概念以这种方式得到阐述,语义符号就不再是空洞的了。虽然目前我们并不关心这个问题,但这种扩展的可能性必须保持开放。

Cycling 循环

The first step in the processing model is to organize the input propositions into a coherent graph, as illustrated in Figure 1. It was assumed for the purpose of this figure that the text is processed sentence by sentence. Since in our example the sentences are neither very short nor very long, this is a reasonable assumption and nothing more complicated is needed. It means that $n_i$, the number of input propositions processed per cycle, ranges between 7 and 12.
处理模型的第一步是将输入命题组织成一个连贯的图,如图 1 所示。在本图中,我们假设文本是逐句处理的。由于在我们的示例中,句子既不太短也不太长,因此这是一个合理的假设,不需要更复杂的假设。这意味着 $n_i$,即每个周期处理的输入命题数,介于 7 和 12 之间。
In Cycle 1 (see Figure 1), the buffer is empty, and the propositions derived from the first sentence are the input. P4 is selected as the superordinate proposition because it is the only proposition in the input set that is directly related to the title: It shares with the title the concept POLICE. (Without a title, propositions are organized in such a way that the simplest graph results.) P1, P2, P3, and P5 are directly subordinated to P4 because of the shared argument ENCOUNTER; P6 and P7 are subordinated to P5 because of the repetition of SUMMER.
在循环 1 中(见图 1),缓冲区是空的,从第一句话中得出的命题就是输入。P4 被选为上位命题,因为它是输入集合中唯一与标题直接相关的命题:它与标题共享概念 POLICE。(在没有标题的情况下,命题的组织方式会产生最简单的图。)P1、P2、P3 和 P5 由于共享参数 ENCOUNTER 而直接从属于 P4;P6 和 P7 由于 SUMMER 的重复而从属于 P5。
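
The subordination step for Cycle 1 can be sketched as a breadth-first attachment by shared arguments. This is only one possible reading of the procedure: the function name `build_graph`, the set-based encoding of arguments, and the rule that a proposition attaches to the first already-placed proposition it overlaps with are assumptions made for this example. The argument sets themselves are taken from Table 1, with predicates left out.

```python
from collections import deque

# Arguments of P1-P7 (first sentence), predicates omitted.
ARGS = {
    1: {"ENCOUNTER"},
    2: {"ENCOUNTER"},
    3: {"ENCOUNTER"},
    4: {"ENCOUNTER", "POLICE", "BLACK PANTHER"},
    5: {"ENCOUNTER", "SUMMER"},
    6: {"SUMMER"},
    7: {"SUMMER", "1969"},
}


def build_graph(args, root):
    """Attach each proposition below the first placed one it shares an argument with."""
    parent = {root: None}
    level = {root: 1}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for prop in sorted(args):
            if prop not in parent and args[prop] & args[node]:
                parent[prop] = node
                level[prop] = level[node] + 1
                queue.append(prop)
    return parent, level


# P4 is taken as the superordinate proposition, as in the text.
parent, level = build_graph(ARGS, root=4)
```

With P4 as the root, P1, P2, P3, and P5 land on Level 2 and P6 and P7 on Level 3 under P5, which matches the description of Cycle 1 above.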

At this point, we must specify the short-term memory assumptions of the model. For the present illustration, the short-term memory capacity was set at $s = 4$. Although this seems like a reasonable value to try, there is no particular justification for it. Empirical methods to determine its adequacy will be described later. We also must specify some strategy for the selection of the propositions that are to be maintained in the short-term memory buffer from one cycle to the next. As was mentioned before, a good strategy should be biased in favor of superordinate and recent propositions. The “leading-edge strategy,” originally proposed by Kintsch and Vipond (1978), does exactly that. It consists of the following scheme. Start with the top proposition in Figure 1 and pick up all propositions along the graph’s lower edge, as long as each is more recent than the previous one (i.e., the index numbers increase); next, go to the highest level possible and pick propositions in order of their recency (i.e., highest numbers first); stop whenever $s$ propositions have been selected. In Cycle 1, this means that the process first selects P4, then P5 and P7, which are along the leading edge of the graph; thereafter, it returns to Level 2 (Level 1 does not contain any other not-yet-selected propositions) and picks as a fourth and final proposition the most recent one from that level, that is, P3.
此时,我们必须明确模型的短时记忆假设。在本示例中,短时记忆容量设定为 $s = 4$。虽然这似乎是一个可以尝试的合理值,但并没有特别的理由。稍后将介绍确定其适当性的经验方法。我们还必须指定某种策略,用于选择从一个周期到下一个周期在短时记忆缓冲区中保留的命题。如前所述,一个好的策略应该偏重于上位命题和最新命题。最初由 Kintsch 和 Vipond(1978 年)提出的“前沿策略”正是如此。它包括以下方案。从图 1 中最上面的命题开始,沿着图的下边选取所有命题,只要每个命题都比前一个命题更新(即索引数字增加);接下来,回到尽可能高的层级,并按新近程度选取命题(即先选取数字最高的命题);只要选取了 $s$ 个命题,就停止选取。在循环 1 中,这意味着处理过程首先选择 P4,然后是 P5 和 P7,它们位于图的前沿;之后,它返回第 2 层(第 1 层不包含任何其他尚未选择的命题),并从该层中选择最近的命题,即 P3,作为第四个也是最后一个命题。
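
The leading-edge selection can be sketched as a two-phase procedure. The reading of "lower edge" used here (at each step, follow the most recent child as long as its index is higher than the current one) is an assumption of this example, as are the function name and the dictionary encoding of the Cycle 1 graph.

```python
def leading_edge(children, level, root, s):
    """Pick up to s propositions to carry over to the next cycle."""
    selected = [root]
    node = root
    # Phase 1: walk down the leading edge while index numbers keep increasing.
    while len(selected) < s:
        newer = [child for child in children.get(node, []) if child > node]
        if not newer:
            break
        node = max(newer)
        selected.append(node)
    # Phase 2: fill the remaining slots from the highest level (smallest level
    # number) downward, most recent proposition first within a level.
    rest = sorted(
        (p for p in level if p not in selected),
        key=lambda p: (level[p], -p),
    )
    selected.extend(rest[: s - len(selected)])
    return selected


# Cycle 1 graph as described in the text: P4 on top, P5 above P6 and P7.
CHILDREN = {4: [1, 2, 3, 5], 5: [6, 7]}
LEVEL = {4: 1, 1: 2, 2: 2, 3: 2, 5: 2, 6: 3, 7: 3}

buffer_for_cycle_2 = leading_edge(CHILDREN, LEVEL, root=4, s=4)
```

For this graph the sketch selects P4, P5, and P7 along the edge and then P3 from Level 2, which agrees with the buffer contents listed for Cycle 2 in Figure 1.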

Figure 2. The complete coherence graph. (Numbers represent propositions. The number of boxes represents the number of extra cycles required for processing.)
图 2.完整的一致性图。(数字代表命题。方框数代表处理所需的额外周期数)。
Cycle 2 starts with the propositions carried over from Cycle 1; P9 is connected to P4, and P8 is connected to P9; the next connection is formed between P4 and P15 because of the common argument BLACK PANTHER; P19 also connects to P4 because of POLICE; P10, P11, P12, and P17 contain the argument STUDENT and are therefore connected to P15, which first introduced that argument. The construction of the rest of the graph as well as the other processing cycles should now be easy to follow. The only problems arise in Cycle 5, where the input propositions do not share a common argument with any of the propositions in the buffer. This requires a long-term memory search in this case; since some of the input propositions refer back to P17 and P31, the search leads to the reinstatement of these two propositions, as shown in Figure 1. Thus, the model predicts that (for the parameter values used) some processing difficulties should occur during the comprehension of the last sentence, having to do with determining the referents for “we” and “their charges.”
循环 2 以从循环 1 转来的命题开始;P9 与 P4 相连,P8 与 P9 相连;下一个连接在 P4 和 P15 之间形成,因为有共同的参数 BLACK PANTHER;P19 也与 P4 相连,因为有 POLICE;P10、P11、P12 和 P17 包含参数 STUDENT,因此与 P15 相连,P15 首次引入了该参数。图的其余部分以及其他处理循环的构建现在应该很容易理解了。唯一的问题出现在循环 5 中,输入命题与缓冲区中的任何命题都没有共同的参数。在这种情况下,需要进行长时记忆搜索;由于有些输入命题回指 P17 和 P31,搜索会导致这两个命题的恢复,如图 1 所示。因此,该模型预测(在所使用的参数值下)在理解最后一句话时会出现一些处理困难,这些困难与确定“we”和“their charges”的指代对象有关。

Figure 3. Some components of the report schema. (Wavy brackets indicate alternatives. \$ indicates unspecified information.)
图 3. 报告模式的部分组件。(波浪形括号表示备选方案。\$ 表示未指定信息。)
The coherence graph arrived at by means of the processes illustrated in Figure 1 is shown in Figure 2. The important aspects of Figure 2 are that the graph is indeed connected, that is, the text is coherent, and that some propositions participated in more than one cycle. This extra processing is indicated in Figure 2 by the number of boxes in which each proposition is enclosed: P4 is enclosed by four boxes, meaning that it was maintained during four processing cycles after its own input cycle. Other propositions are enclosed by three, two, one, or zero boxes in a similar manner. This information is crucial for the model because it determines the likelihood that each proposition will be reproduced.
通过图 1 所示过程得出的连贯图如图 2 所示。图 2 的重要之处在于,该图确实是连通的,即文本是连贯的,而且有些命题参与了不止一个循环。图 2 中每个命题所包含的方框数量表明了这种额外的处理:P4 由四个方框围住,这意味着它在自己的输入周期之后的四个处理周期中都得到了保留。其他命题也以类似的方式被三个、两个、一个或零个方框包围。这些信息对模型至关重要,因为它决定了每个命题被重现的可能性。
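
The box counts can be recomputed from the buffer contents listed in Figure 1. The sketch below counts, for each proposition, how many later cycles held it in the buffer. It deliberately leaves out the reinstatement of P17 and P31 in Cycle 5, since how reinstated propositions are counted is not stated here; that omission and the helper name `storage_exponent` are assumptions of this example.

```python
from collections import Counter

# Buffer contents at the start of Cycles 2-5, as listed in Figure 1.
BUFFERS = {
    2: [3, 4, 5, 7],
    3: [4, 9, 15, 19],
    4: [4, 9, 15, 19],
    5: [4, 19, 36, 37],
}

# Number of extra cycles (boxes in Figure 2) per proposition.
extra_cycles = Counter(p for buffer in BUFFERS.values() for p in buffer)


def storage_exponent(proposition):
    """Total cycles a proposition took part in: its input cycle plus extras."""
    return 1 + extra_cycles[proposition]
```

Under these assumptions P4 has four extra cycles and takes part in five cycles in total, and P9 takes part in three, which agrees with the storage operations $S^{5}$ and $S^{3}$ given for them in Table 2.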

Schema 模式

Figure 3 shows a fragment of the report schema. Since only the first paragraph of a report will be analyzed, only information in the schema relevant to the introduction of a report is shown. Reports are conventionally organized into introduction, method, results, and discussion sections. The introduction section must specify that an experiment, or an observational study, or some other type of research was performed. The introduction contains three kinds of information: (a) the setting of the report, (b) literature references, and (c) a statement of the purpose (hypothesis) of the report. The setting category may further be broken down into location and time (\$ signs indicate unspecified information). The literature category is neglected here because it is not instantiated in the present example. The purpose category states that the purpose of the experiment is to find out whether some causal relation holds. The nature of this causal relation can then be further elaborated.
图 3 显示了报告模式的一个片段。由于只分析报告的第一段,因此只显示模式中与报告导言相关的信息。按照惯例,报告分为引言、方法、结果和讨论部分。引言部分必须说明进行了实验、观察研究或其他类型的研究。引言包含三类信息:(a) 报告的背景;(b) 文献参考;(c) 报告目的(假设)的陈述。背景类别可进一步细分为地点和时间(\$ 符号表示未指定的信息)。这里忽略了文献类别,因为在本例中没有实例。目的类别指出,实验的目的是找出某种因果关系是否成立。然后可以进一步阐述这种因果关系的性质。

Macro-operations 宏观操作

In forming the macrostructure, micropropositions are either deleted, generalized, replaced by a construction, or carried over unchanged. Propositions that are not generalized are deleted if they are irrelevant; if they are relevant, they may become macropropositions. Whether propositions that are generalized become macropropositions depends on the relevance of the generalizations: Those that are relevant are included in the macrostructure with a higher probability than those that are not relevant. Thus, the first macro-operation is to form all generalizations of the micropropositions. Since the present theory lacks a formal inference component, intuition must once again be invoked to provide us with the required generalizations. Table 2 lists the generalizations that we have noted in the text base of the Bumperstickers paragraph. They are printed below their respective micropropositions and indicated by an “M.” Thus, P1 (SERIES, ENCOUNTER) is generalized to M1 (SOME, ENCOUNTER).
在形成宏观结构的过程中,微观命题要么被删除,要么被泛化,要么被一个构造所取代,要么原封不动地保留下来。没有被概括的命题如果无关紧要,就会被删除;如果相关,就可能成为宏命题。被概括的命题是否成为宏命题取决于概括的相关性:与不相关的概括相比,相关的概括被纳入宏观结构的概率更高。因此,第一个宏观操作就是形成微观命题的所有概括。由于本理论缺乏形式推理部分,因此必须再次借助直觉来为我们提供所需的概括。表 2 列出了我们在 Bumperstickers 段落的文本基础中注意到的概括。它们被印在各自的微命题下面,并用“M”表示。因此,P1 (SERIES, ENCOUNTER) 被概括为 M1 (SOME, ENCOUNTER)。

In Figure 4, the report schema is applied to our text-base example in order to determine which of the propositions are relevant and which are not. The schema picks out the generalized setting statements-that the whole episode took place in California, at a college, and in the sixties. Furthermore, it selects that an experiment was done with the purpose of finding out whether Black Panther bumperstickers were the cause of students receiving traffic tickets. Associated with the bumperstickers is the fact that the students had cars with the signs on them. Associated with the tickets is many tickets, and that the students complained about police harassment. Once these propositions have been determined as relevant, a stricter relevance criterion selects not all generalized setting information, but only the major setting (in California), and not all antecedents and consequences of the basic causal relationship, but only its major components (bumperstickers lead to tickets). Finally, at the top level of the macrostructure hierarchy, all the information in the introduction is reduced to an experiment was done. Thus, some propositions appear at only one level of the hierarchy, others at two levels, and one at all three levels. Each time a proposition is selected at a particular level of the macrostructure, the likelihood that it will later be recalled increases.
在图 4 中,我们将报告模式应用于文本基础示例,以确定哪些命题是相关的,哪些是不相关的。该模式选出了概括性的背景陈述--整个事件发生在加利福尼亚州的一所大学里,时间是六十年代。此外,它还选取了一个实验,目的是找出黑豹保险杠贴纸是否是学生收到交通罚单的原因。与保险杠贴纸相关的事实是,学生们的汽车上贴有这些标志。与罚单相关的是许多罚单,以及学生抱怨警察骚扰。一旦这些命题被确定为相关,更严格的相关性标准就不会选择所有的一般环境信息,而只会选择主要环境(在加利福尼亚州),也不会选择基本因果关系的所有前因后果,而只会选择其主要组成部分(保险杠贴纸导致罚单)。最后,在宏观结构层次的顶层,引言中的所有信息都简化为做了一个实验。因此,有些命题只出现在层次结构的一个层级上,有些则出现在两个层级上,而有一个命题则出现在所有三个层级上。每当一个命题被选入宏观结构的某一层次时,它以后被回忆起的可能性就会增加。
The results of both the micro- and macroprocesses are shown in Table 2. This table contains the 46 micropropositions shown in Table 1 and their generalizations. Macropropositions are denoted by M in the table; micropropositions that are determined to be relevant by the schema and thus also function as macropropositions are denoted by MP.
表 2 显示了微观和宏观过程的结果。该表包含表 1 所示的 46 个微观命题及其概括。表中用 M 表示宏命题;被模式确定为相关、因而也可充当宏命题的微命题用 MP 表示。
The reproduction probabilities for all these propositions are also derived in Table 2. Three kinds of storage operations are distinguished. The operation $S$ is applied every time a (micro-) proposition participates in one of the cycles of Figure 1, as shown by the number of boxes in Figure 2. Different operators apply to macropropositions, depending on whether or not they are relevant. If a macroproposition is relevant, an operator $M$ applies; if it is irrelevant, the operator is $G$. Thus, the reproduction probability for the irrelevant macroproposition "some encounters" is determined by $G$; but for the relevant "in the sixties," it is determined by $M$. Some propositions can be stored either as micro- or macropropositions; for example, the operator $\mathrm{SM}^{2}$ is applied to MP23: $S$ because P23 participates in one processing cycle, and $M^{2}$ because M23 is incorporated in two levels of the macrostructure.
表 2 还推导出了所有这些命题的重现概率。存储操作分为三种。每当一个(微观)命题参与图 1 中的一个循环时,就会执行 $S$ 操作,如图 2 中方框的数量所示。根据宏命题是否相关,适用不同的运算符。如果宏命题是相关的,则适用运算符 $M$;如果宏命题是不相关的,则适用运算符 $G$。因此,不相关的宏命题 “some encounters” 的重现概率由 $G$ 决定;而相关的宏命题 “in the sixties” 的重现概率由 $M$ 决定。有些命题既可以作为微观命题也可以作为宏观命题存储;例如,运算符 $\mathrm{SM}^{2}$ 被应用于 MP23:$S$ 是因为 P23 参与了一个处理循环,$M^{2}$ 则是因为 M23 被纳入了宏观结构的两个层次。

The reproduction probabilities shown in the last column of Table 2 are directly determined by the third column of the table, with $p$, $g$, and $m$ being the probabilities that each application of $S$, $G$, and $M$, respectively, results in a successful reproduction of that proposition. It is assumed here that $S$ and $M$ are statistically independent.
表 2 最后一列显示的重现概率由表中第三列直接决定,其中 $p$、$g$ 和 $m$ 分别是每次应用 $S$、$G$ 和 $M$ 导致成功重现该命题的概率。这里假设 $S$ 和 $M$ 在统计上是独立的。
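
A minimal sketch of how a storage operation maps to a reproduction probability, assuming that every application of an operator is an independent chance to store the proposition, so that a proposition is reproduced unless all applications fail. The function name and the numeric values of $p$, $g$, and $m$ below are placeholders chosen for illustration, not estimates from the article.

```python
def reproduction_probability(p, g, m, s_count=0, g_count=0, m_count=0):
    """Probability that at least one application of S, G, or M succeeds."""
    all_fail = (1 - p) ** s_count * (1 - g) ** g_count * (1 - m) ** m_count
    return 1 - all_fail


# Placeholder parameter values, for illustration only.
p, g, m = 0.5, 0.1, 0.5

prob_s2 = reproduction_probability(p, g, m, s_count=2)              # S^2
prob_sm2 = reproduction_probability(p, g, m, s_count=1, m_count=2)  # SM^2
```

For $S^{2}$ this reduces to $1-(1-p)^{2}$, the form shown in Table 2; for $\mathrm{SM}^{2}$ the same reasoning gives $1-(1-p)(1-m)^{2}$ under the independence assumption.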

Production 产出

Figure 4. The macrostructure of the text base shown in Table 1. (Wavy brackets indicate alternatives.)
图 4. 表 1 所示文本库的宏观结构。(波浪形括号表示备选方案。)

Output protocols generated by the model are illustrated in Tables 3 and 4. Strictly speaking, only the reproduced text bases shown in these tables are generated by the model; this output was simulated by reproducing the propositions of Table 2 with the probabilities indicated. (The particular values of $p$, $g$, and $m$ used here appear reasonable in light of the data analyses to be reported in the next section.) The results shown are from a single simulation run, so that no selection is involved. For easier reading, English sentences instead of semantic representations are shown.
表 3 和表 4 展示了模型生成的输出记录。严格来说,只有这些表格中显示的再现文本基础是由模型生成的;这一输出是通过按所示概率重现表 2 中的命题来模拟的。(这里使用的 $p$、$g$ 和 $m$ 的特定值从下一节报告的数据分析来看是合理的。)显示的结果来自单次模拟运行,因此不涉及选择。为了便于阅读,这里显示的是英文句子而不是语义表示。
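
A single simulation run of this kind can be sketched as one independent random draw per proposition. The parameter values, the choice of propositions, and the function name below are placeholders for illustration; the values actually used for Tables 3 and 4 are not given in this section.

```python
import random


def simulate_protocol(probabilities, rng):
    """Return the propositions reproduced in one simulated run."""
    return [name for name, prob in probabilities.items() if rng.random() < prob]


# Placeholder parameter values, for illustration only.
p, g, m = 0.2, 0.05, 0.4

# A few rows of Table 2 with their reproduction probabilities.
PROBABILITIES = {
    "P1": p,
    "M1": g,
    "P4": 1 - (1 - p) ** 5,
    "M7": m,
}

# A seeded generator makes the single run repeatable.
protocol = simulate_protocol(PROBABILITIES, random.Random(0))
```

Metastatements and reconstructions are not produced by such a run, since the model assigns them no probability.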
The simulated protocols contain some metastatements and reconstructions, in addition to the reproduced propositional content. Nothing in the model permits one to assign a probability value to either metastatements or reconstructions. All we can say is that such statements occur, and we can identify them as metastatements or reconstructions if they are encountered in a protocol.
除了再现的命题内容外,模拟协议还包含一些转移和重构。模型中没有任何内容允许我们为转移或重构赋予一个概率值。我们只能说,这些语句会出现,如果在协议中遇到,我们可以将它们识别为转移或重构。
Tables 3 and 4 show that the model can produce recall and summarization protocols that, with the addition of some metastatements and reproductions, are probably indistinguishable from actual protocols obtained in psychological experiments.
表 3 和表 4 显示,该模型可以生成回忆和总结协议,如果再加上一些转移和重现,这些协议很可能与心理实验中获得的实际协议无异。

Preliminary Data Analyses
初步数据分析

The goal of the present section is to demonstrate how actual experimental protocols can be analyzed with the methods developed here. We have used “Bumperstickers and the Cops” in various experiments. The basic procedure of all these experiments was the same, and only the length of the retention interval varied. Subjects read the typewritten text at their own speed. Thereafter, subjects were asked to recall the whole report, as well as they could, not necessarily verbatim. They were urged to keep trying, to go over their protocols several times, and to add anything that came to mind later. Most subjects worked for at least 20 minutes at this task, some well over an hour. The subjects typed their protocols into a computer-controlled screen and were shown how to use the computer to edit and change their protocols. The computer recorded their writing times. After finishing the recall protocol, a subject was asked to write a summary of the report. The summary had to be
本节的目的是演示如何利用本节开发的方法分析实际实验方案。我们在不同的实验中使用了 "碰碰车和警察"。所有这些实验的基本程序都是一样的,只是保留时间间隔的长短有所不同。受试者以自己的速度阅读打字文本。之后,实验者被要求尽可能地回忆整篇报告,但不一定要逐字逐句地回忆。他们被要求不断尝试,多次复习他们的协议,并在之后补充任何想到的东西。大多数受试者在这项任务上至少工作了 20 分钟,有些甚至超过了一个小时。受试者将他们的方案输入计算机控制的屏幕,并向他们演示如何使用计算机编辑和更改他们的方案。计算机会记录他们的书写时间。完成回忆方案后,受试者被要求撰写报告摘要。摘要必须
Table 2 表 2
Memory Storage of Micro- and Macropropositions
微观和宏观命题的记忆存储

| Proposition number 命题编号 | Proposition 命题 | Storage operation 存储操作 | Reproduction probability 重现概率 |
| :---: | :---: | :---: | :---: |
| P1 | (SERIES, ENC) | $S$ | $p$ |
| M1 | (SOME, ENC) | $G$ | $g$ |
| P2 | (VIOL, ENC) | $S$ | $p$ |
| P3 | (BLOODY, ENC) | $S^{2}$ | $1-(1-p)^{2}$ |
| P4 | (BETW, ENC, POL, BP) | $S^{5}$ | $1-(1-p)^{5}$ |
| P5 | (IN, ENC, SUM) | $S^{2}$ | $1-(1-p)^{2}$ |
| P6 | (EARLY, SUM) | $S$ | $p$ |
| P7 | (IN, SUM, 1969) | $S^{2}$ | $1-(1-p)^{2}$ |
| M7 | (IN, EPISODE, SIXTIES) | $M$ | $m$ |
| P8 | (SOON, 9) | $S$ | $p$ |
| P9 | (AFTER, 4, 16) | $S^{3}$ | $1-(1-p)^{3}$ |
| P10 | (GROUP, STUD) | $S$ | $p$ |
| M10 | (SOME, STUD) | $G$ | $g$ |
| P11 | (BLACK, STUD) | $S$ | $p$ |
| P12 | (TEACH, SPEAK, STUD) | $S$ | $p$ |
| M12 | (HAVE, SPEAK, STUD) | $G$ | $g$ |
| P13 | (AT, 12, CSC) | $S$ | $p$ |
| M13 | (AT, EPISODE, COLLEGE) | | |