Ethnoscience 1972
Oswald Werner¹
Department of Anthropology, Northwestern University, Evanston, Illinois

Annual Review of Anthropology, Vol. 1 (1972), 271-308. Published by Annual Reviews. Copyright 1972. All rights reserved.
Stable URL: http://links.jstor.org/sici?sici=0084-6570%281972%292%3Al%3C271%3AEl%3E2.0.CO%3B2-7

Introduction

After the first flurry of successes at discovering folk classifications by relatively rigorous eliciting techniques, few major advances have been made in the direction of a better understanding of lexical/semantic fields.² Casagrande & Hale (6) provided a nearly exhaustive list of the major semantic relations, but little work followed in spite of increasing evidence that lexical/semantic relations are more readily recognizable as language universals than are semantic components (Werner 69).
A partial explanation is that work in this area is extremely difficult and sociolinguistic approaches (ethnography of communication) seem to offer easier solutions. I think this is an unrealistic estimation, since considerations of context, though certainly crucial, are also certainly more complex than so-called "context free" semantics.

Advances have been few because experimentation with large lexical fields exceeds the limitations of procedures by hand. It is simply too difficult to keep track of 1000 or more lexical items and their ramifications. The systematization of lexicography (and ethnoscience can be viewed as an attempt at systematic and nonlinear, that is, nonalphabetic, lexicography) has always been burdened by the bulk of data it had to process. The rather unsystematic variations of definitions in existing dictionaries are a living monument to this problem.

Progress in ethnoscience is slow because the purpose of the exercise was never made entirely clear. Theorizing in terms of "synthetic informants" or in terms of question-answering systems promises to fulfill for the first time Goodenough's dictum that an ethnography should allow one to behave like a native of that culture, at least for the general area of cultural verbal behavior based on cultural knowledge. It cannot be stressed sufficiently that cultural knowledge is so vast that work in this area is unrealistic without machine aid.

¹ This research was in part supported by a grant from NIMH MH 10940. I am indebted to the following people for their comments: Roy D'Andrade, Martha Evans, John Farella, Terry Strauss, and especially for the detailed analysis of Paul Friedrich. Any shortcomings that remain are my own.

² I simply fail to understand Hymes' (31) optimism or see any evidence for its justification. Compare, for example, Hymes to Roger Keesing's recent (38) opening sentence, "Whatever happened to Ethnoscience?"
Since anthropologists cannot get inside the informant's head, psychological reality is an empty concept. Mind-like mechanical verbal behavior seems best suited for the validation of our assumptions. The clarity of purpose of question-answering systems allows the investigator to formulate prerequisites in greater precision and detail than by any other method. The requirement that all of the investigator's ideas be representable as part of a computer program forces him into detailed conceptualization and planning exceeding in rigor by several orders of magnitude any similar requirements in anthropology today. In order to avoid the fate of machine translation, the emphasis should not be on quick payoffs or early practical applications. If the machine is enlisted in the solution of interesting theoretical problems, applications will inevitably follow. Answering questions by using a large data base creatively (deductively) is not easy, but no one has yet suggested a better method than attempting it by machine.

In this paper I first show that four relatively independent areas of intellectual endeavor (ethnoscience, generative semantics, computerized semantic information processing, and sociolinguistics) do indeed converge. Subsequently I try to enumerate some of the contributions made in each of these fields. Finally, I select a few topics in which I have clarified some points sufficiently for myself to share these insights.

Readers interested in more comprehensive coverage of ethnoscience and its history are referred to the excellent reviews by Hymes (30), Sturtevant (66), Colby (8), Durbin (16), Hymes (31), Friedrich (21), Werner (75) and to Tyler's (67) compendium. With the exception of Werner (75), these papers and most of Tyler's book represent a point of view different from the one I assume in this paper.
Convergences

The convergence of ethnoscience, generative semantics, computer simulation of semantic information processing, and some aspects of sociolinguistics (the ethnography of speaking) can be summarized roughly by the following diagram (Figure 1):

[Figure 1. Question-answer schema: INPUT (QUESTION) leads to OUTPUT (ANSWER).]

The specific inputs and outputs in each of the fields mentioned can be characterized as follows:

Ethnoscience. If a system of this kind is proposed, for example, for a componential analysis of a kinship terminology, the diagram can be conceived as a device which answers the following questions: "If one person is related to another in a specified manner, what does that person call the other?" or "What is the second person called by the first person?" If the analysis is a taxonomy, the question may be "What 'things' (in this culture, using its language) are considered animals?" or, if some processes were part of the analysis, "How do you make a canoe?" or "How does one conduct a funeral?" The questions can be assumed to ask about general topics (e.g. "What are all the things that have been created?") or specific ones (e.g. "How do you thank the host at the end of a meal?"). It seems that a question-and-answer approach is particularly fitting for ethnoscience: if the inputs are culturally appropriate and relevant questions, the outputs will be culturally appropriate and relevant answers.

Generative semantics. One has to distinguish between two cases:

1. The more extreme view where the semantic component of a language is truly generative in the technical sense. By this I mean that the output is a "random generation" of abstract phrase markers. A device of this sort is hardly comparable to any human-like behavior. I shall therefore exclude it from further consideration (see also p. 278).

2. If, however, the generating device is conceived more loosely as one triggered by some thought, it is easy to surmise that such thought could have been evoked by a question. Since there seem to be no restrictions on questions that can evoke thought, the device is comparable to a general question answerer, analogous to the ethnoscience examples above.

Semantic information processing. In computer experiments of semantic behavior, one possible aim is the simulation of human question-answering behavior. Thus the input is clearly a question and the output is an appropriate answer. If the device is general, that is, able to answer questions potentially in any area of a culture, the device is then analogous to the earlier examples from ethnoscience.

Sociolinguistics. Many problems in the social uses of language can be conceived as answers to questions: e.g. "What does one do when a stranger approaches camp?" "How does one behave in the presence of the president?" or "When is it proper for a young man to speak?" These are questions appropriate to an ethnography of speaking but also clearly questions that have possible answers. They are formulated in the native's language to reduce ethnocentric translation bias and, equally importantly, to avoid imposing one's own categories.

Quite likely not all socially significant contexts are explicitly open to description by native speakers. They do misjudge situations on occasion or are unable to state the precise contextual rule. However, since verbalizable contexts are by no means rare or small in number, there is no reason to exclude them from semantics (following Tyler 67). Contextual attributes can be listed with other attributes. Implicit contexts are considerably more difficult. Since no minimally adequate metalanguage has been proposed for their description, I exclude them from further consideration, although this exclusion should not be construed as a lack of appreciation of their importance on my part.
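The question-answer schema of Figure 1 can be made concrete with a toy device for the kinship question quoted under Ethnoscience above. This is only a minimal sketch: the kin-term table is a hypothetical fragment of English usage, and the function names `what_does_ego_call` and `which_relatives_are_called` are assumptions introduced for illustration, not part of any analysis cited here.

```python
# Toy question-answering device for the schema QUESTION -> ANSWER.
# The kin-term table is a hypothetical fragment of English usage,
# not data from the article.
KIN_TERMS = {
    # (relation path from ego, sex of the relative) -> term used by ego
    (("parent",), "male"): "father",
    (("parent",), "female"): "mother",
    (("parent", "sibling"), "male"): "uncle",
    (("parent", "sibling"), "female"): "aunt",
    (("parent", "parent"), "male"): "grandfather",
    (("parent", "parent"), "female"): "grandmother",
}


def what_does_ego_call(path, sex):
    """Answer: 'If one person is related to another in a specified
    manner, what does that person call the other?'"""
    return KIN_TERMS.get((tuple(path), sex), "no term elicited")


def which_relatives_are_called(term):
    """Inverse question: which relationships receive a given term?"""
    return sorted(key for key, value in KIN_TERMS.items() if value == term)


print(what_does_ego_call(["parent", "sibling"], "female"))  # aunt
```

A table lookup of this kind only covers relationships that were explicitly elicited; answering unlisted questions would require the deductive capacity discussed later in the paper.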
Typology

If the discussion thus far is minimally successful in indicating the convergence of these four fields, then why are they separate fields, how do they differ in method and theory, and what constitutes their major emphases? In order to help the reader clarify the destination of ethnoscience, I present a typology which characterizes each field. This typology is also a programmatic attempt at integration of the four fields.³

Ethnoscience. Two general approaches in the field require discussion. First is the componential approach. The extension of this method of analysis to the entire vocabulary of a language has been attempted only in spirit. Katz & Postal (33) have sharply criticized anthropologists for this timidity. Nevertheless there are no empirical investigations extant that attempt a componential analysis of all the lexical resources of a language. It has been most successfully applied to small lexical sets, especially kinship terminologies.⁴

³ Howard Maclay (44, p. 180) quotes Victor Yngve's glee in predicting the demise of linguistics as an independent discipline. Although I can empathize with Yngve's joy, I prefer to interpret the demise of "autonomous syntax" [at least in the view of some generative semanticists (e.g. as described by Lakoff 41)] optimistically. It is a positive, promising sign of the integration that Paul Friedrich (personal communication) and I believe to be the best thing that could happen to the fields concerned. I further agree with Friedrich that the most significant advancement of the 1970s will be such a unification.

⁴ I have attempted in previous publications (Werner 73, 75) to extend the notion of components to the entire lexicon by the introduction of the 'circle star' (⊛) operation. This operation, if all the components of a language are known (a considerable, or even impossible, task), "automatically" establishes all hierarchies (taxonomic levels) with paradigms on each level of the taxonomy, and enables the investigator to add easily empirical and logical constraints on the occurrence of incompatible component combinations. To the best of my knowledge no one (including myself) has ever seriously tried to apply this model to anything more than my reanalysis of Berlin and co-workers' (1) taxonomy of squashes in Tzeltal. The primary value of my model lies in its illustration of the form a whole-language componential analysis might take, especially that a very large number of empirical and logical constraints are necessary in order to exclude items not in the speaker's environment, as well as items that are conceptually impossible. The latter idea, not pursued beyond the suggestion that such a logical restriction may be possible, is sometimes raised by generative semanticists.

A subbranch of componential analysis comprises analyses which can be characterized as some aspect of decision making (e.g. Gladwins 25) and/or as the imposition of some kind of temporal order on a decision or recognition process (e.g. Geoghegan 24). It is perhaps too early to tell if such models, whose major overt characteristic is the use of flow charts, are enough of a departure to require separate treatment. Furthermore, it is not clear if the temporal order of the flow chart is an artifact of the analysis or a reflection of the capabilities of real human beings. Most flow charts can be represented as decision tables, which provide a more synchronous impression of the decision processes. More specifically, I was able to reduce the Gladwins' (25) flow chart to two decision tables, one following the other. Although I was unable to reduce it to one table, I am equally unable to decide if the supposed temporal order of the two decision tables has some reality in real time greater or lesser than that of the stepwise flow charts.
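The claim that a flow chart can be recast as a decision table may be easier to see in a small sketch. The two conditions and three outcomes below are invented for illustration and are not the Gladwins' data; the point is only that the table lists every combination of conditions at once, while the flow chart tests them in a fixed order.

```python
from itertools import product

# Hypothetical two-condition decision, written first as a stepwise flow
# chart and then as a decision table. Conditions and outcomes are invented.


def flow_chart(price_is_high, supply_is_large):
    # Stepwise version: conditions are tested in a fixed temporal order.
    if price_is_high:
        if supply_is_large:
            return "sell at market A"
        return "sell at market B"
    return "do not sell"


# Table version: every combination of conditions is listed at once,
# so no order of testing is implied.
DECISION_TABLE = {
    (True, True): "sell at market A",
    (True, False): "sell at market B",
    (False, True): "do not sell",
    (False, False): "do not sell",
}


def decision_table(price_is_high, supply_is_large):
    return DECISION_TABLE[(price_is_high, supply_is_large)]


# Check that the two representations agree on every input.
agree = all(
    flow_chart(p, s) == decision_table(p, s)
    for p, s in product([True, False], repeat=2)
)
print(agree)
```

Because the two forms give the same answers, agreement in output alone cannot show whether the temporal order of the flow chart corresponds to anything in the informant's behavior.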
Second is the lexical/semantic relations, semantic field, or lattice approach. It was first introduced in anthropology in the work of Frake (20), although any earlier attempt at drawing folk taxonomies in the strict or extended sense qualifies as well. The early origins of Field Theory are found in the works of Trier, Porzig, and Weisgerber. By taxonomy in the strict sense I mean one which applies only to the test frame "_____ is a (kind of) _____". By extended sense I mean the more usual interpretation in anthropology, which comprises any scheme of classification. For theoretical purposes, as I hope to show and have shown, especially in Werner & Fenton (74), strict taxonomies and other transitive relations (i.e. those that form hierarchies) must be separated. Thus, for example, a taxonomy of most human ethnoanatomies in the extended sense turns out to be an alternation of strict taxonomies ("The thumb is a kind of finger") and the part-to-whole relation ("The finger is part of the hand"). The notion was extended from the beginning to the entire lexicon. The list of available lexical/semantic relations constructed by Casagrande & Hale is amazingly exhaustive (6).

The relatively low level of acceptance of this model thus far is apparently due to the unfortunate selection of block diagrams for the representation of taxonomies:

[Figure 2. Block diagram of a taxonomy.]

[Figure 3. Directed graph of a taxonomy.]

While Figures 2 and 3 are for all purposes isomorphic, Figure 2 fails to show clearly that for an interpretation of the formal graph the edges (lines) of the graph need to be labeled (by the relation, here T for strict taxonomy) and directed (by arrows) to imply the asymmetry of the relation, i.e. that A T B ≠ B T A. Almost no one seems to doubt, in sharp contrast to componential analysis, that extension of the relational approach to the entire vocabulary of a language is valid.
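The labeled, directed graph described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the class name `LexicalField`, the label `PART` for the part-to-whole relation, and the extra link from "hand" to "arm" are all introduced here and do not come from the sources cited. The sketch keeps each relation separate, so that transitivity is followed along T edges or along part-to-whole edges but never across the two.

```python
from collections import defaultdict


class LexicalField:
    """Directed graph whose edges are labeled by a lexical/semantic relation."""

    def __init__(self):
        # relation label -> {source term -> set of target terms}
        self.edges = defaultdict(lambda: defaultdict(set))

    def add(self, source, relation, target):
        self.edges[relation][source].add(target)

    def holds(self, source, relation, target):
        """True if target is reachable from source along edges of ONE relation."""
        seen, stack = set(), [source]
        while stack:
            node = stack.pop()
            for nxt in self.edges[relation][node]:
                if nxt == target:
                    return True
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return False


anatomy = LexicalField()
anatomy.add("thumb", "T", "finger")    # "The thumb is a kind of finger"
anatomy.add("finger", "PART", "hand")  # "The finger is part of the hand"
anatomy.add("hand", "PART", "arm")     # hypothetical extra link

print(anatomy.holds("thumb", "T", "finger"))  # True
print(anatomy.holds("finger", "T", "thumb"))  # False: A T B does not give B T A
```

Here `anatomy.holds("finger", "PART", "arm")` is true by transitivity within one relation, while `anatomy.holds("thumb", "T", "hand")` is false because the path would mix the two relations.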
Among numerous field techniques for the collection of data that have been suggested, the elicitation frame is perhaps foremost. [See compendia of field techniques in Perchonock & Werner (52), Werner & Fenton (74), and Werner et al (75).] It is relevant because a well-chosen question will elicit an explicit statement about some of the field relations that may hold between two or more terms. The ultimate test for the existence of a particular kind of linkage is, with few exceptions, the existence of a linking sentence in the informant's language.

Componential analysis can be interpreted as an analysis subsequent to a taxonomic analysis; that is, componential paradigms are further structures imposed on a particular level of a particular taxonomy. Anyone doubting the foregoing statement should consider that the isolation of the kinship vocabulary which precedes a componential analysis of kinship addresses itself to the question, "Which terms in this language are kin-terms?" or "Is _____ a kind of kinsman?" A major shortcoming of many componential analyses is that they do not take into account intermediate taxonomic organization. In the Yankee terminology, terms like "parent," "grandparent," "child," "grandchild," "ancestor," "descendant," or even "blood relative" and "relative by marriage" are surely part of the reckoning. By imposing taxonomic organization first, the paradigms to be analyzed become considerably smaller than those usually found in the literature. Furthermore, the taxonomic structure imposes some degree of order on the components that reduces the number of possible analyses [see Burling's (5) apprehensions]. Moreover, areas with ambiguities or multiple analyses are often cases where the indeterminacy carries some cultural significance.
Surely the fact that all alternate analyses of the Yankee system contrast along one and only one of the dimensions of the analysis (namely the dimension of collateral distance) implies that such a distance measure is at best vague and highly variable, or even idiosyncratic, in view of situations in which American families find themselves today. Thus paradigmatic structuring is not alien to a lexical/semantic field type description. However, the existence of universal semantic components is debatable (see Werner 69, 76).

The unifying principle in both approaches is a common concern with the lexical resources of the languages under investigation. It is therefore possible to characterize this field as lexicographic ethnoscience.

Summary. The scope of the field of ethnoscience vis-à-vis the general phenomenon of language is relatively narrow: the primary focus is the lexicon. Anthropologists have investigated the vocabulary of weddings, curers, illness, religion, anatomy, firewood, how to do things (naming the sequences of behavior), or similarly what the steps are that take one through a day, all in a variety of cultures. Historically most anthropologists did not exclude the lexicon from language, as proponents of autonomous linguistics did, nor did they exclude the vocabulary from culture. Almost all statements of linguistic relativity or the Sapir/Whorf hypothesis are statements emphasizing the lexical resources of language.

I have argued (Werner 71) that proponents of an "autonomous linguistics" subscribe to a bias which I called "grammaticalist." In this view everything outside of phonology and grammar, supposedly the only rule-governed parts of language, is outside of linguistics. Perhaps Voegelin and Harris espoused this view most consistently in a series of articles well known to anthropologists of the early 1950s.
Most anthropologists (especially Opler as the spokesman of the other side of the Voegelin/Harris position) can be characterized as representing the "lexicologist" bias. To these anthropologists a vocabulary that was not part of language (more precisely linguistics) made no sense (and rightfully, I believe). The lexicographic interests of ethnoscience have their roots in this bias.

In addition, anthropologists were concerned about how lexical items are related to each other. Folk taxonomies emerged early as an important principle of lexical organization. Although implicitly present in many entries of standard monolingual English dictionaries, for example, they have not yet been systematized to any extent. The list of lexical/semantic relations of Casagrande & Hale, based on some 800 Papago folk definitions, is the first to be almost exhaustive. The authors postulate a relationship of constituency (part-to-whole) but find no examples. Neither do they find the Conklin/Frake relation of "_____ is a stage of _____," probably because they are dealing with people less essentially agriculturalists than the people whom Conklin and Frake studied.

It is important to mention here the work of Romney & D'Andrade (58) and D'Andrade (12, 13, and especially 14). All of these papers use a statistical approach for the validation of lexical/semantic structure. By this I mean following the typology of Campbell and Fiske, which calls for validation by maximally different procedures and replication by maximally similar procedures. Eventually it may become possible to simulate the statistical validations as part of the question-answering routines. Similar frequencies for men and for machines will surely be highly significant. More importantly, D'Andrade's latest paper (14) shows that statistical methods can be helpful in finding some strong lexical/semantic relations between pairs of terms. Some of these relations may not be elicitable with ease by any other technique.
It is interesting that D'Andrade's work also seems to imply that significant success lies in the application of modal logic to deductive systems (see p. 275). Other significant contributions to the notion of lexical/semantic fields in anthropology are Berlin et al (1), Kay (34, 37), Sanday (60), and Werner (75).

The stage is now set for a sophisticated view of the lexicon. In this view the lexicon is a complex structure, probably some sort of lattice, where large numbers of terms are intimately interrelated. The picture that emerges is close to Bierman's stars. Bierman (2), a logician, envisioned the organization of the vocabulary as a system of relations. Each distinct lexical item occupies a node in a large plane. Lexical items that are related are linked by colored strings. The color represents the type of relation. Since these relations are binary, every ray belongs to two stars, the one from which it emanates and the one that it reaches.

Presumably sentences of a language are retrievals from, or paths in, this huge network. Any sequence of nodes, if properly constrained, is a possible sentence. The problem is, however, that anthropologists have found no explicit constraints which restrict in some justifiable manner the choice of possible paths through the lattice.

The unrestricted lattice idea is obviously the major weakness of ethnoscience. In general, a most rudimentary view of syntax prevails. Sentence frames are used to discriminate vocabulary items from each other or to help in the assignment of vocabulary items to specific domains. How human beings in any culture, with any language, in spite of (so to speak) huge lexical/semantic fields are able to speak or answer questions has not been broached.

It is fair to say that language, or the cultural manifestation of language, is seen from the vantage point of a sophisticated lexicographer. The informant's universe is seen as being predominantly his lexicon.
Generative semantics. Following approximately Postal's (53, p. 261) and Maclay's (44, p. 177) representation GT-3b, generative semantics can be characterized as diagrammed in Figure 4. I am simplifying in order to serve better the purposes of this paper; for a more complex view, see Lakoff (41).

[Figure 4. Schema of generative semantics: SEMANTICS yields P(0), which enters the domain of TRANSFORMATIONS and leaves it as P(n), which is the input to PHONOLOGY, which yields the phonetic (or graphemic) representation.]

Roughly, Figure 4 can be explained as follows: SEMANTICS is considered first as truly generative, that is, randomly generating phrase markers P(0) or tree graphs which serve as structural descriptions of the arrangement of semantic units. I will not utilize this view because I do not see how it could contribute to ethnoscience. I will use the second view throughout: any output is produced by the stimulus provided by a question. This view is perhaps closer to the view represented by Binnick (3).

A second interpretation, which assumes that the character of the "device" called SEMANTICS is better known, produces one phrase marker P(0) at a time. The structure within SEMANTICS is conceived as somehow linked to a question that elicited it. That is, a P(0) may be produced in response to a question. The structure P(0) is then taken through a series of TRANSFORMATIONS which (a) insert phonological representations of semantic units and (b) change the structure P(0) into P(n), the surface structure, which is an acceptable input to the PHONOLOGY, which in turn provides the sentence with directions for pronunciation or phonetic representations.

If, according to generative semanticists, there is a boundary left between semantics and syntax, it is the boundary between SEMANTICS and the "transformations." However, what I identify here as "transformations" must be interpreted more broadly than the transformations in the Aspects (Chomsky 7) sense (see Lakoff 42).
Furthermore, as Postal (53) points out, some of the transformations must apply before the boundary that separates SEMANTICS from syntax. This is due to the fact that lexical entries themselves possess an internal structure. This structure also has the form of dependency trees. Otherwise it would be difficult to account for the underlying structure of many verbs similar to "kill." The semantic primitive of this verb is assumed to be "to cause to die." The lexical representation of the verb "to kill" is then depicted approximately as in Figure 5 (which also gives greater detail of the notational convention).

[Figure 5. Tree diagrams: "x(1) kill x(2)," with NP:x(1) and NP:x(2), is represented as "x(1) cause y(1)," with NP:x(1) and NP:y(1), where y(1) is an embedded sentence S, "x(2) die," with NP:x(2).]

In the view of generative semanticists (especially Lakoff 41, McCawley 48, and other publications), P(0) markers do not stand alone. They are accompanied by presuppositions. Lakoff (41, p. 235) gives the following example: a sentence such as "Pedro regretted being Norwegian" presupposes that "Pedro is a Norwegian." A semantic representation SR of a P(0) is conceived as

SR = (P(0), PR, Top, F, . . .)

". . . where PR is a conjunction of presuppositions, Top is an indication of the 'topic' of the sentence, and F is the indication of the focus of the sentence. We leave open the question of whether there are other elements of semantic representations that need to be accounted for" (Lakoff 41, pp. 234-35).

The presuppositions contain the speaker's and the listener's knowledge of the world. They are indistinguishable from standing sentences, that is, sentences with a truth value independent of time. In the above example it is presupposed not only that "Pedro is Norwegian," but also that "Pedro was a Norwegian" and that "Pedro will always be a (native) Norwegian."
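The two ideas above, lexical entries with internal structure and a semantic representation that carries presuppositions, can be sketched as simple data structures. This is a minimal sketch under stated assumptions: the class and field names (`Predicate`, `SemanticRepresentation`, `p0`, `presuppositions`, `topic`, `focus`) are invented here, and the encoding of "Pedro regretted being Norwegian" as `regret(Pedro, Norwegian(Pedro))` is an illustrative guess, not Lakoff's notation.

```python
from dataclasses import dataclass, field


@dataclass
class Predicate:
    """A predicate with its arguments; an argument may itself be a Predicate."""
    name: str
    args: tuple

    def __str__(self):
        return f"{self.name}({', '.join(str(a) for a in self.args)})"


# Lexical decomposition in the spirit of Figure 5: kill = cause to die.
LEXICON = {
    "kill": lambda x1, x2: Predicate("cause", (x1, Predicate("die", (x2,)))),
}


def decompose(pred):
    """Replace a predicate by its lexical decomposition, if one is listed."""
    rule = LEXICON.get(pred.name)
    return rule(*pred.args) if rule else pred


@dataclass
class SemanticRepresentation:
    """SR = (P(0), PR, Top, F, ...); the field names are illustrative only."""
    p0: Predicate
    presuppositions: list = field(default_factory=list)
    topic: str = ""
    focus: str = ""


sr = SemanticRepresentation(
    p0=Predicate("regret", ("Pedro", Predicate("Norwegian", ("Pedro",)))),
    presuppositions=[Predicate("Norwegian", ("Pedro",))],
    topic="Pedro",
)

print(decompose(Predicate("kill", ("x1", "x2"))))  # cause(x1, die(x2))
```

In this sketch the presupposition list is finite and explicit; the text's point is that in practice it draws on the speaker's and hearer's whole knowledge of the world.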
That the knowledge of all the world is involved emerges from the fact that not only do we know that "Pedro is a Norwegian," but also that "Pedro is a certain man" whom the speaker knows, but the hearer may not. We also know that "Pedro is a man's name," that "Pedro is human, animate and a physical object," that "Pedro is an unusual name for a Norwegian," that "One would expect Pedro to be a Spanish name," that "Einar, Olaf, Knut . . . are more usual Norwegian first names," etc. All of these presuppositions may come into play at one time or another in a discourse about Pedro and his regrets. When Lakoff says "Pedro regretted being Norwegian," I am tempted to ask "Who the hell is Pedro?" To which Lakoff would probably reply by providing part of his presuppositions about Pedro, e.g. "He is that nut I told you about yesterday who likes Southern California but thinks that he can't immigrate because he doesn't have a valid birth certificate." Lakoff's reminder will serve to refresh my own memory of presuppositions and/or provide new ones as well.

If a sentence contains embedded sentences, each embedded sentence requires its own presuppositions. Lakoff diagrams this as follows:

[Figure 6. Embedded sentence presuppositions: S(0) dominates S(1), and a long arrow leads from S(1) to S(2).]

where the long arrow stands for "sentence S(1) presupposes sentence S(2)."

McCawley (48) further assumes (which is at least implied by Lakoff's diagram) that semantic representations are in the form of phrase markers. Rather than resembling natural language, they resemble the predicate calculus of symbolic logic. Thus Sapir's sentence "The farmer killed the duckling" may be represented as follows:

[Figure 7. S dominates a Proposition, killed(x(1), x(2)), and two noun phrases, NP:x(1) "by the farmer" and NP:x(2) "the duckling." x(1) and x(2) are the variable arguments of the predicate "killed."]
The independence of the noun phrases NP is argued by McCawley on the basis of examples like "John says that Nancy wants to marry a Norwegian."⁵

"The sentence is ambiguous among the three senses (i) there is a person who John says Nancy wants to marry and who the speaker identifies as a Norwegian, (ii) John says that Nancy wants to marry a certain person whom John identifies as a Norwegian, and (iii) John says that Nancy wants her future husband (whoever he might be) to be a Norwegian." [McCawley continues] "It is difficult to see how these senses could be assigned different 'deep structures' unless those structures allowed Noun Phrases to occur separate from the propositions that they are involved in and to be constituents of sentences in which those propositions are embedded" (McCawley 48, p. 225).

⁵ My apologies to all Norwegians. I am sure that neither Lakoff nor McCawley intended this coincidence, which is purely accidental.

The word order in kill(x(1), x(2)) is justified by McCawley (48) on the grounds that many transformations in English can be stated considerably more simply if the verb-first order is assumed. The only alternative in English is a verb-second position, which complicates matters. In order to illustrate this point, let us look at a crude approximation of the passive transformation (articles and the introduction of the stative aspect of the verb "to be" in the passive are suppressed):

[Figure 8. Underlying structure: S dominates the Proposition kill(x(1), x(2)), NP:x(1) "by farmer," and NP:x(2) "duckling."
Active. Rule: move NP:x(1) to first (focal) position ("by" is deleted in initial position). Result: NP kill NP, "Farmer kill duckling."
Passive. Rule: move NP:x(2) to first (focal) position. Result: "Duckling kill by farmer."]

In the verb-second structure both NPs would have to be moved. Thereby an important generalization is lost: move the focal NP into initial position, where it becomes the topic of the surface sentence.

Recently it has been argued (e.g.
McCawley 47) that the distinction of the traditional parts of speech is a characteristic of relatively shallow structure. Lakoff (39) has shown that adjectives and verbs may be considered as one form class. Furthermore, transitive nouns are formally identical to transitive verbs:

[Figure 9. A Proposition, father(x(1), x(2)), with NP:x(1) "Felix" and NP:x(2) "of Roger."]

The only notion that remains for SEMANTICS is "predicate of" (more on this later). The insight that tense and modals and other verbs like "usually" are derivable as higher predicates (further up in the phrase marker) introduces important simplifications. Other work has concentrated on the nature of quantifiers, especially the similarity of the universal quantifier to the conjunction "and" and of the existential quantifier to the disjunction "or." However, there are still many more unsolved problems and summarizing may be premature.

Finally, some as yet rudimentary efforts have been directed toward the inclusion of deductive rules in the theory. Some sentences can be clearly derived by deduction from other sentences and possibly from the set of their presuppositions. Lakoff (40) in particular has made some exploratory forays into applications of modal logic, which introduces predicates of possibility ("may") and necessity ("must") and can be extended to temporal predicates (i.e. "sometimes" and "always"). His insights are at this point largely provisional.

Summary. It is apparent that generative semantics has made significant progress toward the solution of some of the problems that face ethnoscience. One of these is that lexical entries cannot be conceived simply as bundles of components. The notion of presuppositions converges with the notion of a lexical/semantic field. But while anthropologists have pointed out some of the characteristics of the lexical/semantic network, linguists have largely ignored it.
Perhaps it can be justly said that the insights of generative semantics are still rooted in the notion of an autonomous syntax, which they rightly attack. Their methods, approaches, and arguments proceed largely from a syntactic base at the expense of the structure of the lexicon and the need for some deductive capacity for natural languages.
Semantic information processing.—Although many workers in this field are associated with Minsky (49) at MIT, they do not operate with a paradigm that can be easily characterized. The efforts I want to describe are all concerned with computer-simulated question-answering systems. In recent years these studies have assumed two relatively independent directions differing primarily in their emphasis rather than in overall orientation. The first approach is an attempt to model the structures that are necessary for the representation of large memories. The second concentrates on the deductive capacities of question-answering devices.
Memory models.—Among a dozen or so experimenters in artificial intelligence, Quillian (54, 55) is unique. He deals almost exclusively with the simulation of very large memory structures "... in which newly input symbolic material would typically be put in relation to large quantities of previously stored information . . ." (54, p. 220). "Actually most simulation programs . . . have not been primarily concerned with long term memory at all but rather with cognitive [deductive] processing" (54, p. 219). Quillian's semantic memory model can be envisioned as a card file representation of the lexicon. Every dictionary entry has one card. Each card contains one entry name which Quillian calls the Type. Under the entry are paraphrases (definitions) in a special notation. Occurrences of words in these paraphrases are termed Tokens. The linkages of lexical semantic relations are as follows:
1. B names a class of which A is a subclass: A → B: corresponds to the relation of taxonomy;
2.
B modifies A: A ← B corresponds to the relation of modification;
3. A, B, and C form disjunctive sets A or B or C: corresponds to disjunction;
4. A, B, C form a conjunctive set A and B and C: corresponds to conjunction;
5 and 6. B, a grammatical subject, is related to C, a grammatical object, in the manner specified by the relation A (verb);
7. the associative link is the system of addresses (links) that relates the occurrence of a word to its occurrence in a paraphrase, that is, it connects every entry card to the occurrences of that entry in the definitions of other entries.
Thus this system of relations makes Quillian's model a general graph or network of associations. The experimenter's task is to select any pair of words from a store and to submit them to the program for comparison. He may check the machine's comparison against the comparison made by a native speaker. That is, "he considers whether or not the machine's output is one that might reasonably have been produced by a subject" (Quillian 54, p. 237). Quillian's second model (55), the TLC (Teachable Language Comprehender), is also an attempt to simulate input and output information in smooth English. As a result the canned phrases and ad hoc syntactic form tests (parsings) of the sample outputs give an extremely and deceptively human-like impression. The basic parts of the program are (a) the memory network, (b) a processor, and (c) an on-line human monitor. The memory is structured similarly to Quillian's first model but is somewhat more abstract: the dictionary part is simply a list of lexical items with a list of addresses following every entry. The addresses point to nodes in an abstract network. In my terminology one could look at his lexicon as a linear arrangement of items which connect to a giant switching circuit, the abstract network. The network consists of units which have one obligatory element, a pointer to its superset (superordinate taxon).
In addition, the unit may have pointers to several properties. These have attributes and values which are also pointers. Roughly, the units are the Types of Quillian's first model and the properties and values are the Tokens. The attributes are verbs and adjectives and the values are nouns that function as grammatical objects. Thus Quillian's second model is essentially Quillian's first with only the relations (1, 5 & 6) retained. ". . . The comprehension procedure of TLC may find a number of properties related to a piece of text and, by using adapted copies of these (i.e. representing them in Quillian's special notation), create a complex interlinked structure, in the same format as the memory (i.e. these structures are merged with the existing portion of the memory), representing the particular meaning of the current input string" (55, p. 475). By relating the meaning of the current input string to already present information, the device is capable of producing one (? all Quillian's examples contain but one) possible paraphrase (see below).
An intermediate model between Quillian's and the models emphasizing the deductive capabilities rather than memory organization is PROTOSYNTHEX III by Schwartz et al (61). This structure consists of four parts. First are the ordered triplets of the form X R Y where the R are generally verbs and the X and Y can be simple or complex. Second, there are special relations. These contain the relations of taxonomy, and every concept is in a taxonomic relation to at least one other concept. Concepts may also be in the relation of equivalence to each other. The primitives are the most general concepts of the system. Third are the complex relations which form the backbone of the deductive system. These are abbreviations of complex systems (relations of relations) of the special relations and are called inference rules.
Finally, there are the semantic event forms whose function is similar to Katz & Postal's selection restrictions. That is, they are called upon to resolve ambiguities (?). While the deductive capabilities of the system are impressive (it can, for example, solve by deduction the problem of the monkey, a stick, and the bananas above the monkey and out of his reach), it seems in spirit closer to Raphael's and Black's systems of limited deductive capabilities. At no point is there an attempt to correlate a large lexical/semantic field. All examples given start with some simple input sentence inserted by hand, and the deductions are based on such very limited input. For lexical/semantic fields the inference rules appear to be unnecessarily complex. For example, if A R1 B stands for "A is brother to B" and C R2 B for "C is father of B," then the inference rule A[R1 C/P R2]B is constructed, that is, "A is uncle of B" (more precisely, "A is a kind of uncle of B"). It is not clear how this would constitute a simplification over: If (A)T(brother of C) and (C)T(father of B), then (A)T(uncle of B), where T stands for the taxonomic relation of the earlier formulation. This solution requires neither new symbols nor new relations. The program searches a question like "Who lost the battle of Waterloo?" in two steps: (a) "Who lost the battle?" and (b) "Battle of Waterloo." In order for the device to function properly, auxiliary propositions are necessary if it is to solve more complex situations. If the input sentence was "Napoleon was beaten in the Battle of Waterloo," then the above match will not work. Optimally one would expect the device to respond "if 'was beaten in' is equivalent to 'lost' then the answer is 'Napoleon.' Please confirm first part of proposition." It would be the human monitor's task to make such confirmations. After validation of the equivalence, the two verbs become part of the permanent capability of the system. In this sense Quillian (55) is right in making his device teachable.
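The confirmation dialogue just described can be made concrete with a small sketch. It is only an illustration under my own assumptions, not a reconstruction of TLC or PROTOSYNTHEX III: the class TeachableStore, its method names, and the triple format are hypothetical, and the two-step matching of the question is collapsed into a single lookup on the object phrase.

```python
class TeachableStore:
    """Toy store of (subject, verb, object) facts plus confirmed verb equivalences.

    Hypothetical illustration only; not the data structure of any program cited.
    """

    def __init__(self):
        self.facts = []        # (subject, verb, object) triples entered by hand
        self.equivalent = {}   # verb -> set of verbs the human monitor has confirmed

    def add_fact(self, subject, verb, obj):
        self.facts.append((subject, verb, obj))

    def confirm(self, verb_a, verb_b):
        """Record the monitor's confirmation that two verbs are equivalent."""
        self.equivalent.setdefault(verb_a, set()).add(verb_b)
        self.equivalent.setdefault(verb_b, set()).add(verb_a)

    def who(self, verb, obj):
        """Answer 'Who <verb> <obj>?' as (status, subject, verb_needing_confirmation)."""
        accepted = {verb} | self.equivalent.get(verb, set())
        candidates = [(s, v) for s, v, o in self.facts if o == obj]
        for subject, stored_verb in candidates:
            if stored_verb in accepted:
                return ("answer", subject, None)
        if candidates:
            # No verb matches yet: offer a hypothesis for the monitor to confirm.
            subject, stored_verb = candidates[0]
            return ("confirm", subject, stored_verb)
        return ("unknown", None, None)


store = TeachableStore()
store.add_fact("Napoleon", "was beaten in", "Battle of Waterloo")
first = store.who("lost", "Battle of Waterloo")    # asks for confirmation
store.confirm("lost", "was beaten in")             # the monitor validates the equivalence
second = store.who("lost", "Battle of Waterloo")   # now answered directly
print(first)
print(second)
```

Once the equivalence has been confirmed it stays in the store, so the same question is answered without further help from the monitor.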
Teachability, that is, the capacity to ask questions (or the confirmation of hypotheses as above) in the process of question answering, seems the only economical way to build such devices. If none of these models represents a "great leap forward," it should be remembered that at present "almost any program able to perform some task previously limited to humans will represent an advance in the psychological theory of that performance" (Quillian 55, p. 459).
Deductive models.—Most of the models in this category deal with some form of symbolic logic. Much of the work in the field is brought together by Minsky (49). Later developments are described by Simmons (63). Instead of going into a detailed exposition of symbolic logic, let me try to indicate by an example roughly what is involved. Part of the following example is from the R2 program (Biss et al 4) which contains in a verb-first logical notation a data base of some 2000 sentences of the Illinois Driver's Manual Rules of the Road. The device has thus by far the largest memory on which to base its deduction. Logical operators include "and," "or," "not," and "implies." Quantification may occur over any variables of this artificial language. Following Biss et al (4), if the system receives the question (conjecture):
(i) Do cars always yield to pedestrians?
the resolution principle is invoked for the solution. Following Slagle (64) this means that "... a clause implies its factors . . . [and therefore the solution is found by] . . . working back from the conclusion toward the hypotheses of the theorem to be proved." In the above example this works approximately as follows: First, the conjecture (question) (i) is negated ('always' assumed for all examples), which yields the 'clause':
(ii) Cars do not yield to pedestrians.
This is compared to the 'knowledge' of the system which states:
(iiia) If X yields to Y then Y does not yield to X.
or for this example (insertion of x and y):
(iiib) If a pedestrian yields to a car then a car does not yield to a pedestrian.
Because of the formula p → q ⇔ ~p ∨ q (p implies q if and only if not p or q), (iiib) can be restated:
(iv) Pedestrians do not yield to cars or cars do not yield to pedestrians.
It follows from the identity of (ii) and the second half of (iv) that
(v) Pedestrians do not yield to cars.
But the 'knowledge' of the system includes the statement:
(vi) Pedestrians yield to cars in crosswalks.
Comparison of (v) and (vi) leads to a contradiction and the system replies to question (i): "No."
This problem as presented is almost trivial. However, with a large number of propositions (axioms) to select from, heuristic methods have to be found. Such heuristic programs operate generally as follows: At every step in the proof procedure there are numerous alternatives. The program tries each alternative on the first level (usually there are many levels). Some evaluation measure exists that assigns to the first step of each possible alternative some numerical value. The program selects the alternative with the highest value and checks all the alternatives at this point and so on. In some very sophisticated programs values of past evaluations are stored so that if a lengthy backtrack is necessary, the evaluations do not have to be recalculated. The device goes back and simply chooses the alternative with the next highest evaluation. Most chess-playing programs operate this way.
Summary.—The workers in the field of automatic answering systems usually use some simplified form of ordinary English. There is usually some kind of representation of propositions in a central memory. The notation systems are usually variations of notations of the first-order predicate calculus. All processes of deduction use variants of the propositional calculus of symbolic logic; some use the full power of the first-order predicate calculus.
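The heuristic procedure described above (evaluate each alternative, pursue the one with the highest value, and fall back on stored evaluations when a line of attack fails) can be sketched as a best-first search. This is one common way of realizing the idea and is not claimed to be the method of any of the programs cited; the function name, the toy graph, and the scores are assumptions made for the illustration.

```python
import heapq


def best_first_search(start, is_goal, successors, evaluate):
    """Return a path from start to a goal state, or None if there is none.

    evaluate(state) gives a number; higher means more promising. Each state is
    evaluated once and its value stays on the frontier, so falling back to the
    next-best alternative requires no recalculation.
    """
    counter = 0  # tie-breaker, so the states themselves are never compared
    frontier = [(-evaluate(start), counter, start, [start])]
    seen = {start}
    while frontier:
        _, _, state, path = heapq.heappop(frontier)  # highest stored value first
        if is_goal(state):
            return path
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                counter += 1
                heapq.heappush(frontier, (-evaluate(nxt), counter, nxt, path + [nxt]))
    return None


# Hypothetical proof space: "a" looks best at first but leads to a dead end.
graph = {"start": ["a", "b"], "a": ["dead end"], "b": ["goal"], "dead end": [], "goal": []}
scores = {"start": 0, "a": 5, "b": 3, "dead end": 1, "goal": 10}

path = best_first_search(
    "start",
    lambda s: s == "goal",
    lambda s: graph[s],
    lambda s: scores[s],
)
print(path)
```

In the toy space the search first follows "a", finds only a poorly valued dead end, and then resumes from the stored evaluation of "b" without computing it again.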
As soon as the problems reach some degree of complexity exhaustive searches are out of the question and heuristic programs have to be applied. These, stated most simply, use some evaluation procedure to reduce the number of cases which the device has to inspect in its path toward a solution. Both in the complexity of their deductions and the sophistication of their heuristic programs, the workers in this field are exceedingly advanced. However, their notions of the nature of the lexical field, the nature of lexical/semantic relations, and the fact that many of their solutions are simply "to solve the immediate problem" (i.e. ad hoc) rather than motivated by insights into the nature of language weaken their position. Much of the work in this area creates the impression that quick payoffs are the primary goal at the expense of theoretical insights into the nature of language. In this sense, although raising interesting questions, Dreyfus' (15) critique of "artificial intelligence" suffers from his tacit assumption that we have before us the ultimate in sophisticated computers, and more importantly, that our theories of language are anywhere near adequate to the task of answering questions in a humanlike manner. It is premature to talk seriously about the limitations of computers except as part of the limitation of deductive systems (see Werner 72, in preparation). The relatively slow progress and failure to meet predicted achievements (Dreyfus 15) are poor indicators of the ultimate potential and/or limitations.
For anthropologists the cross-cultural validity of symbolic logic is an important question. Unfortunately, in anthropology the statement "The logic of the ABC" (where ABC is the name of a tribe or nation) is far from clear.
It may refer to the fact that the ABC have a different set of basic axioms (propositions) about various parts of their universe than we do (see e.g., Malinowski 44a), or that the ABC use a different style of argumentation and do not accept some of ours. Finally, perhaps their rules of inference are different from ours. However, the evidence to support or refute the latter hypothesis is at best scanty. A very strong commitment to occidental symbolic logic and its extensions to modal logic seems to be the safest approach to the cross-cultural problem. The stronger the commitment and the more rigorous the application of European deductive systems to native systems of knowledge, the sooner will discrepancies surface and demonstrate beyond doubt that the deductive logic in question is truly different.
Sociolinguistics.—According to Fishman (19), there are two aspects of sociolinguistics: micro sociolinguistics, which is synonymous with the ethnography of communication, and macro sociolinguistics. In this paper only the former is relevant. Perhaps the most fundamental contribution of sociolinguistics is the emphasis on not merely the speech act but on the context in which it takes place. Although some work in this field assumes context as given or as easily inferable by the observer, a more sophisticated view is closer to our aim. According to Hymes (29, p. 27), "native terms are one guide," and Fishman (19, p. 43) makes "verification from within the speech community" (emphasis his) requisite. In other words, context is a series of native language texts which following Ruesch & Bateson (59, p. 276) describe at least the following four contributing factors: "(1) 'Perception of the other's perception,' or the establishment of the unit of communication. (2) The position of each participant and his function as observing reporter. (3) Identification of the rules pertaining to a social situation. (4) Identification of the roles in a social situation."
Except for contexts that are difficult to verbalize, descriptions of context in the native language are propositions about the use of language, hence, an integral part of a system of propositions which form lexical/semantic fields. Tyler's (67, p. 268) notion that "context itself is a part of the semantic system" is therefore correct. However, features of context in a lexical/semantic field are not separate from other semantic facts. An occasional sentence containing the referent "red" (in the literal sense) requires some red object to be present. Similarly, it implies that certain listeners are involved, or that these correlate to each other in some way, or what mode they use to communicate, or their style of delivery and the topic of discourse (based on Hymes 29, p. 216) depend on the presence of these factors. That these are modifiers analogous to "red" can be seen from the observation that the speaker and listener(s) must have past experiences against which to judge these external clues. Such internal representations are probably identical with attributes. It is not necessary that attributes be simple, especially since the elicitation of contexts may be derived from the elicitation of long texts. But if higher order predicates (like "may") modify an entire sentence, there is no reason to deny the possibility that relatively simple sentences can be modified by several complex interlinked sentences. Since a general question-answering device should be capable of answering all kinds of questions, and since questions about the social, psychological, ecological, spatial, and temporal context are part of the method of getting at cultural rules (including rules for breaking rules, i.e. systems of priorities) there is no rational way to separate contextual variation from other kinds of variation. At least in part (as I show below) it is possible to control variation of the topic of discourse by a partitioning of the vocabulary.
Summary.—Micro sociolinguistics or the ethnography of communication is a young field, although Malinowski ought to be considered an important ancestor. It does not have an explicit metalanguage for the description of its major concern, namely "context." The demonstration of the importance of context is a major contribution to our understanding of some of the variations of meaning. Nevertheless, the field is at this point primarily descriptive rather than highly theoretical, and although it can as yet contribute little to question-answering systems, its lesson, that context is semantic and needs to be represented, or at least be representable, is of the utmost importance.
Problems and Comments
In this section I select a few problem areas raised in my exposition and elaborate on them. The selection is nearly random and reflects mostly my own interests among those which may some day lead to a working question-answering device:
Predicate calculus.—According to Reichenbach (56), the notational convention of f(x) for intransitive sentences (f = predicate, x = argument) and f(x,y) (y = second argument) for transitive sentences rests on the fact that although both can be written a [f,x], or a [f,x,y] respectively, the function a is a constant. I will demonstrate, however, that in fact a does assume different values. In many languages the relations of taxonomy, synonymy, and attribution (modification) are all expressed by the same surface syntax. For example, "A doctor is a professional" (taxonomy); "A doctor is a physician" (synonymy); "A doctor is well educated" (attribution). This can be restated briefly as follows: all three relations are cases of attribution. What remains to be explained is what kind of relation of attribution the relation of taxonomy is, or why it should be considered one. First, it is well known that the taxonomic relation is the result of taking a superordinate taxon and modifying it in some way.
Thus, according to Denisson's Loose Leaf Dictionary, "A lion is a wild animal," or (lion)T( (animal)M (wild) ) where T is the relation of taxonomy and M is the relation of modification. By adding attributes (modifiers) a term is created that refers to a subclass of the superordinate taxon. It can then be argued that the superordinate taxon's function is also an attribute of the taxon it governs. That is, lion not only possesses the attributes of "wildness," it also has the attributes of "animalness." There is one more consideration before summarizing. In componential semantic analyses one encounters another kind of attributive (modification) relation. Note that in the more usual lexical modification the order is crucial especially when nouns are involved: "A dog house is not a house dog." However, in componential analysis or modification by adjectives there is little or no difference. For most purposes a "big red ball" is equivalent to "red big ball."6 The above can be restated as follows. There are basically at least four relations of predication (attribution) that differ from each other in the following manner:
1. Attribution a(1), symbolized by M(1), is
(1a) Reflexive: (x) M(1) (x)
(1b) Symmetric: If (x) M(1) (y) then (y) M(1) (x).
(1c) Intransitive: It is not true that if (x) M(1) (y) and (y) M(1) (z) then (x) M(1) (z).
2. Attribution a(2), symbolized by M(2), is
(2a) Reflexive: (see above, 1a).
(2b) Asymmetric: If (x) M(2) (y) then (y) M(2) (x) may or may not be true.
(2c) Intransitive: (see above, 1c).
3. Synonymy a(3), symbolized by S, is
(3a) Reflexive: (see above, 1a)
(3b) Symmetric: (see above, 1b)
(3c) Transitive: If (x)S(y) and (y)S(z) then (x)S(z).
4. Taxonomy a(4), symbolized by T, is
(4a) Reflexive: (see above, 1a)
(4b) Asymmetric: (see above, 2b)
(4c) Transitive: (see above, 3c)
Many properties of lexical/semantic fields can be characterized by these four relations. Applying this notation to McCawley's verb-first propositional form (p.
300 fn) and using M for M(2) and Polish parenthesis-free notation:
[Figure 10. S dominating Proposition M M kill x(1) x(2), with the noun phrases NP:x(1) and NP:x(2); terminal elements: duckling, by farmer.]
Note that the order of the arguments is changed. The first M links the verb to its object, the second links the resulting verb phrase to the subject.7
6 Although I do not want to ignore semantic subtleties that may point to differences of the two phrases, these are at present beyond the range of semantic sophistication available in anthropology or elsewhere.
7 The simplest justification of this arrangement of the relation of modification is obtained by nominalizing the above sentences three ways: The killing of the duckling by the farmer ... The duckling killed by the farmer ... The farmer who killed the duckling ... Farmer is deletable in the first two instances where it is not focal. There seems to be a general rule that a verb can be focal only in the nominalized form of a sentence, while only noun phrases can be focal in the declarative mode.
Compound relations.—There are at least two reasons for discussing what I call here compound relations: (a) there is a need to show how all of Casagrande & Hale's relations can be explained by taxonomy, synonymy and modification; and (b) some notational convention is needed to account for the speaker's intuition that such compound relations are perceived very similarly to the primitive ones, or in other words, that lexical/semantic fields can be constructed from simple as well as from complex relations. The procedure I want to follow is to show the nature of complex relations and a proposed notational convention first on one example and then to generalize it to others that are possible.
Part-to-whole.—This relation is based on a sentence frame in English roughly as follows: "_____1 is part of _____2." There are two ways to arrive at the same result: (1) By making the above sentence into a taxonomic statement in the strict sense: "_____1 is a (kind of) _____2-part." Thus by extension the first element is also a kind of a part, that is, the second element modifies 'part' and the first element is taxonomically (transitively) related to the second element plus part; therefore T M (part) (_____2) (_____1). (2) Treating 'part' as a transitive predicate leads to the same conclusion: the second element is analogous to the object, the first to the subject; therefore, by considering transitivity, T M (part) (_____2) (_____1). A very similar argument can be made for most Casagrande & Hale relations (6).
Some of the Casagrande & Hale (6) relations translate into the three basic relations T, S, M (for M(2)) as follows:
Class inclusion: (_____1)T(_____2) or T(_____2) (_____1) is equivalent to T M (kind) (_____2) (_____1).
Spatial: (usually of the form) "_____1 Preposition _____2" or M M (Preposition) (_____2) (_____1). Some of these 'prepositions' require dual or plural objects in position 2, e.g. M M (between) (_____2 & _____2) (_____1).
Attributive: (_____1)M(_____2), or M(_____2) (_____1).
Function and Operation: Although the precise meaning of these designations is not clear, I have argued in (74) that the function relation is that between subject and verb, therefore M; and similarly, the relation of operation between verb and object, also M. Thus the sentence "Brewers make beer" becomes M M (make) (beer) (by brewers).
Comparison: "_____ is like _____," or M M (like) (_____) (_____).
Exemplification: "_____ is an example of _____," or "_____ is exemplified by _____." The first is equivalent to the taxonomic relation T, and since it is transitive is also equivalent to T M (example) (_____) (_____). (The transitivity needs amplification, i.e. "Grass is an example of green" is probably incomplete and should read fully "Grass is an example of a green object" and "Green is an example of color" is extended to "A green object is an example of a colored object.") The second is the inverse of the taxonomic relation (or T).
Grading: This is a relation not reducible to the basic T S M relations. It involves temporal and spatial order (the symbol for this relation is Q for Queuing):
Figure 11. Queuing relation.
                                              Left to Right   Right to Left
Proximal neighborhood (immediate successor)   Q(1)            Q(2)
Distal neighborhood (eventual successor)      Q(3)            Q(4)
Here Q(1) is the inverse of Q(2) and Q(3) of Q(4), and Q(1) implies Q(3) and Q(2) implies Q(4) but not vice versa. However, before further commitment more work is needed in the explication of this relation.
Provenience: "_____ comes from _____," "_____ is made of _____," and possibly "_____ is a stage of _____" (or "matures into _____"). This looks, at this point, like variations of the queuing relation: stage and provenience are related to the distal Q(2). Apparently some more or less complex process intervenes between the first and second term. Again more work is necessary.
Synonymy: (_____1)S(_____2) or S(_____2) (_____1) is reserved for identity. It is possible that some other relation may be necessary for functional identity: for example, Lounsbury type expansion/reduction rules. In the Yankee kinship system M M (_____) (uncle) (vocative) is in direct address functionally equivalent to 'uncle.' In other words, any modifier of uncle is dropped in direct address.
Antonymy: Lyons (43) lists three types: 1. Complementarity, either "_____ implies not _____," or "_____ is equivalent to not _____" (or if not = N, unary relation, and implies = F, a binary one) F N (_____2) (_____1) or possibly S N (_____2) (_____1). Whether the two cases are equivalent remains to be seen.
2.
Antonymy par excellence, a relation not well understood at present but somehow linked to queuing: big Q(1) bigger Q(1) biggest, implies small Q(2) smaller Q(2) smallest. 3. Converseness, also somehow connected to queuing: "A sold B to C" implies "(A owned B) Q(1) (C owns B)" and "C bought B from A" implies "(C owns B) Q(2) (A owned B)." Paul Kay's notion of 'contrast' (37) also sheds some light on the problem. However, much more needs to be done.
Partitioning of the vocabulary.—Partitioning of the vocabulary for question-answering machines is extremely important. It accounts for the notion of context of discourse, and possibly can be extended to account for sociolinguistic partitionings and the reduction of searches to manageable proportions both in space and in time. The details will become apparent with the example in Figure 12 of a taxonomic representation: a hypothetical three-level taxonomy, highly regular and with four branches on each level in a tree representation (drawn only partially).
[Figure 12. Hypothetical three-level taxonomy.]
This taxonomy can be coded into a matrix. A three-level taxonomy was selected because it can be represented in three dimensions. Obviously for matrices of greater depth n-dimensional matrices will be necessary. These are simple generalizations of the three-dimensional case. The three-dimensional matrix is represented as a cube. The rationale for this representation is to construct a space in such a way that every sibling and especially every ancestor node is an immediate neighbor (i.e. of one link distance). This is a condition of the transitivity of the relation of taxonomy. Since node 0 dominates all subordinate nodes, it occupies an entire plane in the space (note the 0 plane in Figure 13; two rows have been excised in order to reveal part of the internal structure). The next level of the taxonomy is perpendicular to this plane and consists of the rows of 1, 2, 3, and 4, i.e.
1 T 0 (or 1 is a kind of 0), 2 T 0, 3 T 0, and 4 T 0. The next level of the taxonomy is again perpendicular to these rows. These are the single cells (for example, 5, 6, 7, and 8 with 5 T 1, 6 T 1, 7 T 1, and 8 T 1), but since each of these is also dominated by 0 they are also perpendicular to the 0 plane. The illustration shows the terminal elements of the taxonomy being perpendicular to the element 7, i.e. 29 T 7, 30 T 7, 31 T 7, and 32 T 7. Note that the terminal nodes are also perpendicular to the 0 plane and the 1 row, as well as to element 7.
[Figure 13. Three-dimensional matrix of three-level taxonomy.]
The major advantages of this representation are:
1. The transitivity of the taxonomic relation is a characteristic of the spatial arrangement of elements. In other words, syllogism, the simplest deductive capacity of the device, is part of the spatial arrangement.
2. Perhaps more importantly, partitioning of a large taxonomy, that is, the application of different search strategies, is simply a partitioning of the n-dimensional taxonomic space (three-dimensional in the example). A depth-first search is, in the three-dimensional example, any proper sub-cube. For example, the following illustration shows the cube which represents the right-most branch of the tree in Figure 12. Searches that are depth first with a fanning out on the lower levels are sub-oblongs of the 5 × 5 × 5 taxonomic cube (see dotted line in Figure 14).
[Figure 14. Three-dimensional depth-first search.]
A breadth-first search is the row shown in Figure 15. It includes only the first level of the tree. A breadth-first subtree of two levels is the plane represented in Figure 16. A subtree search on a subtree of n-1 depth is a hyperplane (true plane in the three-dimensional illustration) parallel to the hyperplane of the top node. Figure 17 illustrates a subtree of Figure 13.
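A small sketch may make the coding of a taxonomy into a matrix more tangible. It rests on assumptions of my own: nodes are numbered breadth-first as in the example (1 to 4 under 0, 5 to 8 under 1, 29 to 32 under 7), each node receives one coordinate per level, and 0 on an axis means "not specified at this level." This is only one possible coding and does not reproduce the exact plane and row layout of Figure 13; the function names are hypothetical.

```python
def coordinates(node, branching=4, depth=3):
    """Cell of a breadth-first numbered node in a (branching + 1)^depth matrix.

    A coordinate of 0 means the node is not specified at that level, so node 0
    is (0, 0, 0) and node 7 (third branch under node 1) is (1, 3, 0).
    """
    path = []
    while node > 0:
        path.append((node - 1) % branching + 1)  # which branch under the parent
        node = (node - 1) // branching           # move up to the parent
    path.reverse()
    return tuple(path + [0] * (depth - len(path)))


def is_kind_of(node, ancestor, branching=4, depth=3):
    """True if node T ancestor, read off the coordinates alone (reflexive)."""
    n = coordinates(node, branching, depth)
    a = coordinates(ancestor, branching, depth)
    return all(x == 0 or x == y for x, y in zip(a, n))


def subtree(ancestor, branching=4, depth=3):
    """All nodes dominated by ancestor: a sub-block of the matrix."""
    total = sum(branching ** level for level in range(depth + 1))  # 85 for 4 and 3
    return [n for n in range(total) if is_kind_of(n, ancestor, branching, depth)]


print(coordinates(29))    # terminal node under 7
print(is_kind_of(29, 1))  # follows from 29 T 7 and 7 T 1 without chaining
print(subtree(7))
```

In this coding transitivity needs no separate inference step: 29 T 1 is read directly from the coordinates, and a subtree search is a scan over a block of cells that share their leading coordinates.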
Subtrees can be found in any search as subfigures of larger figures (sub-matrices of larger matrices). Thus Figure 18a is a subtree of Figure 16 and Figure 18b is one of Figure 15.
[Figure 15. Breadth-first search.]
[Figure 16. Two-level breadth-first search.]
The need for partitioning taxonomies can be explained in detail as follows: It is a well-known fact about human information retrieval that it is fast rather than accurate; it is not as exhaustive as computer retrievals tend to be. Frequently used items tend to be more in focal awareness and hence more easily accessible than little-used items. Items in the device are moved up to a more focal position every time they are used for retrieval or in deductions. The corner facing the viewer in Figure 13 represents such a more focal position. Thus if the branch of node 2 should become more used or of greater interest, it could be moved to the left of the branch of node 1 (that is, branch of 1 and branch of 2 would change places, and so on). This moving to the left procedure explains what happens if discourse is domain limited. For example, if branch 1 is dealing with chemistry and if the discussion is about chemistry, the term "radical" would be interpreted in its chemical sense. If the branch of node 2 deals with politics, then an exchange of the branches of node 1 with the branch of node 2 brings politics into focus and gives an appropriate interpretation to the term "radical." Thus partitioning of the vocabulary accounts for use of speech in contexts bound to particular isolable cultural domains. Similarly, partitioning can be extended to include sociolinguistic contexts, e.g. intimate discourse vs formal discourse. It is conceivable that one could partition along two parameters at the same time: e.g. intimate talk about the domain of sex versus formal medical talk about the same domain.
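The effect of moving a branch into focal position can be illustrated with a short sketch. It is a simplification under stated assumptions: branches are kept in an ordered list whose first position is the most focal, a used branch is simply moved to the front (with two branches this amounts to the exchange of places described above), and the class name, method names, and sense glosses are hypothetical.

```python
class FocalLexicon:
    """Ordered branches of a vocabulary; index 0 is the most focal position."""

    def __init__(self, branches):
        # branches: list of (domain, {term: sense}) pairs, most focal first
        self.branches = list(branches)

    def bring_into_focus(self, domain):
        """Move the named branch to the focal (leftmost) position."""
        for i, (name, _) in enumerate(self.branches):
            if name == domain:
                self.branches.insert(0, self.branches.pop(i))
                return
        raise KeyError(domain)

    def interpret(self, term):
        """Return (domain, sense) from the most focal branch that holds the term."""
        for name, senses in self.branches:
            if term in senses:
                return name, senses[term]
        return None


lexicon = FocalLexicon([
    ("chemistry", {"radical": "group of atoms behaving as a unit"}),  # gloss assumed
    ("politics", {"radical": "advocate of fundamental change"}),      # gloss assumed
])
before = lexicon.interpret("radical")   # chemistry is focal
lexicon.bring_into_focus("politics")    # the discussion turns to politics
after = lexicon.interpret("radical")
print(before[0], after[0])
```

The search stops at the first branch containing the term, so retrieval is fast rather than exhaustive, in keeping with the behavior the partitioning is meant to model.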
In addition, the partitioning schema accounts for the indeterminacy of the length of paraphrases in the following way: the length of a paraphrase to a particular question is determined by the maximum amount of available information, i.e. the maximum extent of the subtree dealing with a particular topic plus its associations (i.e. the rest of the lattice), and by the available time to answer the question. The available time is gauged by some perceived implicit sociolinguistic criteria. The computer device, however, would have to request clarification by an explicit "impatience factor" or other such indicators. The response of the computer to the answer to this request for a time limit is also representable as a partition of the response matrix to the most commonly used pathways, i.e. some subsolid of the cube in Figure 13.

Figure 17. Subtree search.

Since the mechanisms of such subpartitioning are not very well understood, the n-dimensional hypercube representation of a taxonomy seems particularly well suited for the investigation of this phenomenon. By varying different parameters of partitioning, comparisons can be made with the behavior of human beings.

Experimental work in psychology (e.g. Mandler 45) seems to indicate that the size of the taxonomies is on the order of 5 ± 2 items on each level. Assuming the maximum number, i.e. 7, the requirement is an $8^n$ matrix. For a maximal depth of $n = 7$, $8^7 = 2{,}097{,}152$, which is ample for extremely large vocabularies [even considering that 262,144 (or $8^6$) cells are needed for the most general term; however, what seems wasted in space is gained in speed]. A reduction of storage requirements is achievable by breaking the large taxonomies into a series of interlinked smaller ones. This could account for the fact, borne out by observation, that genus terms and species terms are more easily accessible than intermediate and very much more general terms.
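The storage estimate can be checked with a few lines of arithmetic. The sketch below assumes, following the 5 × 5 × 5 cube used for a four-way example, that each axis needs one index more than the number of items per level; the function name `hypercube_cells` is hypothetical.

```python
def hypercube_cells(items_per_level, depth):
    """Number of cells in a matrix with `depth` axes of side items_per_level + 1."""
    side = items_per_level + 1   # assumed: one extra index (0) on every axis
    return side ** depth

total_cells = hypercube_cells(7, 7)      # the 8^n matrix for n = 7
top_term_cells = hypercube_cells(7, 6)   # hyperplane taken by the most general term
example_cube = hypercube_cells(4, 3)     # the 5 x 5 x 5 cube of the illustration
```

These reproduce the figures in the text: 2,097,152 cells in all, of which 262,144 (one eighth) belong to the hyperplane of the most general term.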
In Perchonock & Werner (52) we showed that the term "food" and specifically named foods were easiest to elicit. This argues for a relatively independent subtaxonomy of "food" loosely linked to the taxonomic aspect of the Navajo universe (Werner & Begishe 70). Such an arrangement makes the retrieval of items in subtaxonomies like food, plants, or animals very efficient.

Figure 18 a, b. Subtrees.

Attributes can be incorporated into the present schema in the following fashion: First I expand the previously given "definition" of (lion) in the following way: "A lion is an African or Asian wild animal and a big cat." Let us mark modifiers with a suffix M and as an edge pointing to the term they modify. Further we assume that every modifier with the modified repre-