Computer modelling is a methodology that enables cognitive psychologists to represent and test, precisely and explicitly, theories of how and why specific cognitive processes operate in people (Mulholland and Watt, 2001). The aim is to produce computer models that simulate human cognitive performance, both in how knowledge is represented and in the processes involved in using it (Cohen, 2002).
This technique has proved very effective in providing detailed and explicit cognitive theories and in discarding those that cannot be implemented as working models, but it is not without weaknesses. In this essay, we shall attempt to evaluate the strengths and shortcomings of this method for studying aspects of cognition, using illustrative applications from prominent theoretical models.
An example of a comprehensive theory of cognition implemented through computer modelling is Anderson’s (1983 in Kiss, 1993) ACT* (Adaptive Control of Thought) system, which has been used to simulate many aspects of human cognitive behaviour such as memory recall, skill learning, language comprehension and learning, and problem solving. This system builds on the schema theory of human cognition, which postulates that incoming information is understood in relation to prior knowledge stored in memory in the form of schemas, each representing what is known about a particular subject matter from past experience (Cohen, 2002). Schemas have slots that may take fixed (compulsory) values, specific (optional) values determined by the particular instance, or default values supplied by the schema, in the absence of specific information, according to the category membership of the particular instance (Mulholland and Watt, 2001).
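The slot-filling behaviour described above can be sketched in a few lines of Python. The ‘restaurant’ schema, its slot names and its default values below are purely hypothetical illustrations, not drawn from the ACT* literature.

```python
# A minimal, hypothetical sketch of schema slots with default values:
# the schema supplies defaults for any slot not filled by the specific instance.

def instantiate(schema_defaults, instance_values):
    """Fill a schema's slots: instance-specific values win, defaults fill gaps."""
    filled = dict(schema_defaults)   # start from the schema's default values
    filled.update(instance_values)   # override with instance-specific values
    return filled

# Hypothetical "restaurant visit" schema.
restaurant_schema = {
    "food": "unspecified",
    "payment": "pay at the end",
    "seating": "at a table",
}

# A specific episode fills one slot; the rest are inferred by default.
episode = instantiate(restaurant_schema, {"food": "pizza"})
print(episode["food"])     # "pizza"           (specific value)
print(episode["payment"])  # "pay at the end"  (default value)
```

The point of the sketch is only that a reader of the episode can answer questions (how was the bill settled?) that the specific instance never stated, because the schema supplies defaults.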
ACT* in particular assumes the existence of a general computational unit, called a production system, which consists of production rules specifying the conditions that trigger particular actions (Kiss, 1993). The model comprises three memories: working memory holds the information currently accessible to the system; (long-term) declarative memory stores permanent knowledge about the world, represented as schemas; and (long-term) procedural memory contains the production rules. Declarative memory constitutes a network structure in which information retrieval is accomplished through spreading activation: activation spreads over the network, and at any one time the most active nodes constitute the content of working memory (Cohen, 2002). Through a serial pattern-matching mechanism, this content is matched against the condition part of every stored production rule, and the action part of the best match is triggered, producing either an overt action or a modification of internal structure. Goal structures, comprising hierarchically organized sets of productions, represent plans of action and regulate the progress of cognitive processing. Learning is modelled as a process of adding and modifying productions in procedural memory (Kahney, 1993).
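As a rough illustration of the recognise-act cycle just described, a production system can be reduced to repeatedly matching rule conditions against working memory and firing a matching rule. This is a toy sketch, not ACT* itself: the rules and working-memory elements are invented, and the crude "only fire if something new is added" test stands in for ACT*'s far more elaborate conflict-resolution machinery.

```python
# Toy recognise-act cycle in the spirit of a production system (not ACT* itself).
# Working memory is a set of elements; each rule pairs a condition (a set that
# must be present in working memory) with an action that returns new elements.

def run(rules, working_memory, max_cycles=10):
    """Fire the first matching rule that adds something new, until quiescent."""
    for _ in range(max_cycles):
        for condition, action in rules:
            if condition <= working_memory:            # condition matched in WM
                new = action(working_memory) - working_memory
                if new:                                # crude refractoriness
                    working_memory |= new              # act: modify WM
                    break
        else:
            break                                      # no rule fired: halt
    return working_memory

# Hypothetical rules for a trivial arithmetic goal.
rules = [
    ({"goal: add", "seen: 2+3"}, lambda wm: {"answer: 5"}),
    ({"answer: 5"},              lambda wm: {"done"}),
]
wm = run(rules, {"goal: add", "seen: 2+3"})
print("done" in wm)  # True: the goal was achieved in two cycles
```

Note how the cycle is serial, exactly as the text describes: one rule fires per cycle, and its action changes working memory, which in turn changes what matches on the next cycle.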
ACT* has been tested against the implications of human memory experiments: it contains features comparable to the various memory systems proposed by experimental researchers (short-term working memory, long-term memory, prospective memory, schemas, separate memory for actions and plans) and to fundamental memory processes (encoding, retention/forgetting, retrieval). The model’s capability to account for experimental findings that are not built into it also strengthens the psychological plausibility of the underlying theory. For example, the improved memory performance resulting from elaboration on the to-be-remembered material is explained by the combination of the production system and spreading activation mechanisms (Kiss, 1993).
In the field of learning, ACT* computer models have simulated human performance in various domains such as language learning (Anderson, 1983), geometry (Anderson et al., 1981) and computer programming (Anderson et al., 1984; Pirolli and Anderson, 1985), producing elaborate descriptions of learning and problem solving in these domains. They showed how the learning system can acquire new procedures fitted to the specific conditions encountered, starting from declarative knowledge about the particular domain and from general problem-solving procedures stored in memory (Kahney, 1993).
The major advantage of this production system architecture is modularity: production rules can be added as needed, to accommodate further cognitive processes, with no major impact on the functioning of the overall system. However, this flexibility (in accommodating potentially any cognitive aspect) and this lack of limitations (in creating new productions, and in working memory capacity) contrast with the actualities of human cognition. Moreover, the explanations of experimental findings generally rest on a single aspect of the ACT* architecture (the spreading activation mechanism), so they do not provide adequate testing of the overall framework (Kiss, 1993). Furthermore, ACT* implementations frequently display better performance than humans, or depict ideal rather than average human performance in learning and problem solving (Kahney, 1993).
Despite these shortcomings, however, ACT* constitutes a unitary theory of cognition which accounts for several cognitive aspects in fine detail, and it represents just a transient stage in continuing work towards more integrated models of the human cognitive system (Kiss, 1993).
The above illustration of the ACT* system raises important issues that apply to computer modelling as a methodology in general. One concerns testability: evaluating how closely a computer model matches human cognitive functioning. Producing a working model does not demonstrate that people function as the model assumes, since another model postulating different operations might display similar performance (Cohen, 2002). For example, while ACT* assumes localized knowledge structures and serial processing, PDP models, which we shall consider below, account for memory and learning processes on the basis of distributed knowledge representation and parallel processing.
In addition, computer models contrast with human cognitive processes in various respects. First, computer models represent specific parts of the cognitive system that perform particular tasks. We cannot fully assess the validity of such snapshots unless they are merged into an overall cognitive structure and still display similar performance, which is not feasible at present (Cohen, 2002).
Second, depending on the nature of the task, computers can be faster or slower than humans, and their performance speed varies with the capacity of the particular hardware, so direct comparison with human reaction times is not meaningful (Mulholland and Watt, 2001). The comparison can instead be made in terms of trends in the number of steps taken to run the model. Even better would be to associate each step with a predicted human-equivalent time and compare the model’s performance with actual experimental data. Thus, using the times taken for individual moves towards a successful solution of the four-ring Towers of Hanoi problem, the predictions of the ACT-R model (Anderson and Lebiere, 1998 in Mulholland and Watt, 2001) corresponded closely to experimental findings, thereby supporting the model and the theory it embodies.
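The step-counting strategy can be illustrated with a short Python sketch: the optimal move sequence for the four-ring Towers of Hanoi is generated, and each model step is then mapped to a hypothetical per-step time. The two-second figure below is an invented placeholder for illustration, not a value taken from ACT-R.

```python
# Count model steps for the Towers of Hanoi, then convert each step into a
# hypothetical predicted human-equivalent time for comparison with data.

def hanoi_moves(n, src="A", aux="B", dst="C"):
    """Return the optimal move sequence for n rings, as (from_peg, to_peg)."""
    if n == 0:
        return []
    return (hanoi_moves(n - 1, src, dst, aux)   # park n-1 rings on the spare peg
            + [(src, dst)]                      # move the largest ring
            + hanoi_moves(n - 1, aux, src, dst))  # restack the n-1 rings on it

moves = hanoi_moves(4)
print(len(moves))   # 15: the four-ring problem takes 2**4 - 1 moves

# Hypothetical mapping: assume each model step costs 2 seconds of human time.
SECONDS_PER_STEP = 2.0
print(len(moves) * SECONDS_PER_STEP)  # 30.0: predicted total solution time
```

The comparison the text describes would then be between such per-move predictions and the move times actually recorded from participants, rather than between raw machine speed and human reaction time.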
Third, the inputs and outputs of computers differ from those of humans, since computers lack sensory and motor organs.
Finally, computer models disregard prominent aspects of human cognition such as emotions, motives, intentions and consciousness (Cohen, 2002).
While all these factors potentially limit the extent to which a computer model simulates human cognitive processes, the assessment is usually made by comparing the model’s performance with that of humans. For example, Burton and Bruce’s (1990 in Cohen, 2002) IAC model of person identification and naming showed effects on performance similar to those found in human participants when name frequency was manipulated. This provided support for the model and the theory it represents.
The psychological plausibility of a computer model is also assessed through people’s verbal reports about how they perform the task in question (Cohen, 2002).
Thus, computer modelling is often combined with other methodologies: experimental findings and introspective reports about cognitive operations are employed to build the model, while the findings about the model’s performance are fed back into the underlying theory, providing for its potential refinement. Since these methodologies are mutually dependent, they must be assessed together (Cohen, 2002). Nonetheless, the process of amending and revising a model until it actually works is highly constructive, as it entails an accurate and detailed specification of the underlying theory: computer modelling researchers have to specify, in the respective computer program, every single operation performed by the system they postulate, whereas experimental researchers are not forced to be that precise. This process also enables discarding theories that cannot be modelled, or whose models do not work properly (Cohen, 2002). For example, a running model (in the Hank language) of Collins and Quillian’s (1969) theory of conceptual hierarchies, applied to animal categories and constructed for the purposes of the summer school project, implemented category membership verification, inheritance of properties, and the handling of exceptions. It challenged Collins and Quillian’s assumptions about retrieval times from semantic memory: their predictions that property judgements should take longer than category-membership ones, and that reaction times should increase with the number of hierarchical levels to be searched, were verified only for true responses. This outcome indicates the incompleteness of Collins and Quillian’s model in explaining how humans process false statements and casts doubt on the underlying theory.
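The Hank model itself is not reproduced here, but the inheritance-with-exceptions mechanism it implemented can be sketched in Python. The hierarchy and property lists below are illustrative, and the count of levels searched stands in as a proxy for Collins and Quillian’s predicted retrieval time.

```python
# Toy semantic hierarchy in the spirit of Collins and Quillian (1969):
# properties are stored at the highest applicable level and inherited downwards;
# exceptions are stored locally and override inherited values.

hierarchy = {                       # child -> parent
    "canary": "bird",
    "ostrich": "bird",
    "bird": "animal",
}
properties = {
    "animal":  {"breathes": True},
    "bird":    {"can_fly": True, "has_wings": True},
    "ostrich": {"can_fly": False},  # local exception overrides "bird"
}

def lookup(concept, prop):
    """Climb the hierarchy; return (value, levels_searched)."""
    levels = 0
    while concept is not None:
        if prop in properties.get(concept, {}):
            return properties[concept][prop], levels
        concept = hierarchy.get(concept)   # move one level up
        levels += 1
    return None, levels

print(lookup("canary", "can_fly"))   # (True, 1): one level up, at "bird"
print(lookup("ostrich", "can_fly"))  # (False, 0): local exception, no climbing
print(lookup("canary", "breathes"))  # (True, 2): inherited from "animal"
```

On Collins and Quillian’s account, the levels_searched figure should predict relative reaction times; the finding reported above is that this held only for true statements, and a mechanism like this one says nothing about how false statements (e.g. "a canary is a fish") are rejected.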
Another general issue in computer modelling is whether computers running valid cognitive models can be regarded as ‘thinking’ in the sense that humans do (Mulholland and Watt, 2001). This is the claim of the strong artificial intelligence tradition, which is concerned with building programs whose performance displays human intelligence (e.g. weather forecasting), as opposed to the weak artificial intelligence approach reflected in cognitive modelling, which aims to develop programs that mimic the way humans perform particular tasks (Cohen, 2002).
Given this distinction, some researchers have claimed that the production system architecture is based on the power of artificial information-processing machines and is therefore not a realistic way of modelling human cognition; for instance, the limited capacity of human working memory contradicts the limitless ‘working memory’ of the ACT* system. It has also been argued that the rule-based approach does not explain how knowledge is actually represented in humans, or how (and why) basic cognitive capacities such as generalization (storing generalized information about similar instances rather than the specific details of each instance) operate (Le Voi, 1993). In an attempt to counterbalance these shortcomings, the PDP (parallel distributed processing) approach introduced neural network models which resemble the brain’s neural anatomy and rely on parallel distributed processing rather than serial processing (as ACT* and schema models do). In this architecture, knowledge representation is distributed across collections of interconnected computational units by means of patterns of connection strengths (weights), instead of being local (as in ACT* and schema theory) (Cohen, 2002). Such PDP models exhibited various properties that characterize human cognition and that were not explicitly built in: spontaneous generalization (similar patterns are classified as instances of the same category, and a few cues can induce generalization to the schema representation of the stimuli, which provides for handling new instances); content-addressability (presenting part of the stored content is enough to produce recall of the whole record); and graceful degradation (performance is impaired gradually, rather than abruptly and destructively, when part of the system is damaged) (Le Voi, 1993).
These emergent properties of PDP models provided a potential ground for explaining aspects of human cognition, such as schema-based representations, in relation to the brain’s neural architecture: assuming that the human brain is a type of PDP network, these aspects are likely to result naturally from the underlying parallel distributed organization of information processing and representation (Le Voi, 1993). Moreover, PDP models have been used successfully to simulate and systematically explore aspects of human performance such as reading English words. For example, Seidenberg and McClelland’s (1989 in Le Voi, 1993) model, which comprises a suitable representational scheme for spelling-to-sound correspondence, after being trained to pronounce English words simulated a wide range of experimental results on human reading (e.g. Waters and Seidenberg, 1985; Brown, 1987 in Le Voi, 1993), with regard to naming latencies and phonological errors on different kinds of words, e.g. high- and low-frequency words, regular and exception words.
One problem with PDP models is that they resemble the brain’s neural architecture only superficially, as they rest on mathematical models of computational units whose characteristics differ from those of real neurons. Moreover, particular PDP models cannot accurately simulate human performance in every respect; for example, unlike humans, they cannot learn a word’s pronunciation from a single trial (Le Voi, 1993).
Another problem concerns the extent to which such models are applicable to every aspect of human cognition, particularly to those already accounted for by the rule-based approach, such as language and problem-solving. For example, PDP researchers consider language regularities as a product of the general properties of parallel distributed processors whose functioning does not rely on explicit rules, whereas linguists consider that human language rests on the action of linguistic rules. Thus, Seidenberg & McClelland’s PDP simulation of the process of learning the pronunciation of English words would be implemented by a rule-based model in terms of creating increasingly more specific pronunciation rules from general ones, in order to handle irregular words (Le Voi, 1993).
In general, critics of the PDP approach argue that the emergent properties of PDP models explain only low-level aspects of human cognition, such as content-addressability, schema-based processing and pattern matching, as opposed to high-level functions like language and problem solving, for which no matching emergent properties have been found in PDP models and which are more amenable to the rule-based approach (Le Voi, 1993). Moreover, rule-based systems have also exhibited emergent properties, indeed of a higher level than those of PDP systems; for instance, the AM rule-based system discovered mathematical knowledge that was not built in, by applying heuristic rules to pre-stored knowledge (Lenat, 1977 in Le Voi, 1993).
It seems then, that at present, neither of these architectures can successfully account for every cognitive aspect; each is better fitted to particular areas of cognitive behaviour, contributing its insights to the common goal of explaining human cognition. Consequently, rather than trying to assess which framework provides the most thorough account for human cognition, it might be better to consider the different cognitive architectures as complementary (Cohen, 2002).
Thus, for example, models based on schemas focus on issues of knowledge representation and organization, while production systems concentrate on cognitive procedures. However, both assume that human cognition relies on symbolic representation of information which is processed through the operation of rules. By contrast, PDP models do not rest on symbols and rules, and aim at simulating prominent features of the brain such as the capability of parallel distributed processing. They challenge the universality of the rule-based explanation for particular cognitive aspects and instead propose an account which links the brain’s neuronal anatomy with the processing of symbolic representations (Cohen, 2002).
Considering that, in general, low-level processes, like object recognition, are automatic (essentially unconscious) whereas high-level ones, like problem-solving, are consciously controlled, it follows that PDP models are better at explaining automatic processing while rule-based models (ACT*, schema theory) are better at conscious controlled processing (Cohen, 2002).
Miller (1981), however, raises the controversial issue of whether cognitive simulation in general can accommodate the distinction made above between unconscious and conscious processes. He seems to conclude that the computer modelling approach is incomplete in that it cannot generate true ‘consciousness’. However, consciousness has not yet been explicitly defined, so we cannot take this incompleteness for granted; we can only be certain that, at present, there is no model which is conscious and can provide proof of this. Even so, computer models are very useful when considered as descriptions rather than as minds in themselves, or, in Searle’s terminology, as simulations rather than replications of the actual cognitive processes, so that, for instance, a computer simulation of a fire will not actually burn anything (Mulholland and Watt, 2001).
To summarize the major points discussed in this essay, computer modelling is an excellent technique for testing theories of human cognition, as it entails precise specification of the underlying theoretical model in order for it to work as a computer simulation of the cognitive aspect that the theory is attempting to explain. In general, this methodology is not autonomous. Computer models build on experimental findings and introspective evidence, and their performance informs and revises the underlying theories; so the assessment should involve all the interdependent methodologies. At present, no cognitive architecture implemented through computer modelling can by itself provide a universal account of human cognition. Considering that each one provides important insights into particular aspects of cognition, the different architectures should be viewed as complementary rather than competitive. Overall, despite its shortcomings in terms of testability, divergence from actual human functioning and performance, and generating true consciousness, the computer modelling approach constitutes a valuable tool for studying cognitive behaviour.
Article Author: Panagiota Kypraiou MSc Health Psychology, MBPsS - Body & Gestalt Psychotherapist (ECP) - Body Psychotherapy Supervisor - Parents' Education Groups Coordinator https://www.psychotherapeia.net.gr
References
Cohen, G. (2002) D309 Course Overview: Themes and Issues, The Open University.
Collins, A. & Quillian, M.R. (1969) ‘Retrieval Time from Semantic Memory’, Journal of Verbal Learning and Verbal Behavior, 8, pp. 240–247.
Kahney, H. (1993) ‘Introduction to Problem Solving’ in Kahney, H. (ed.) Problem Solving, Buckingham: Open University Press.
Kahney, H. (1993) ‘Analogical Problem Solving and the Development of Expertise’ in Kahney, H. (ed.) Problem Solving, Buckingham: Open University Press.
Kiss, G. (1993) ‘Memory Systems: The Computer Modelling Approach’ in Cohen, G., Kiss, G. & Le Voi, M. (eds) Memory, Buckingham: Open University Press.
Le Voi, M. (1993) ‘Parallel Distributed Processing and Its Application in Models of Memory’ in Cohen, G., Kiss, G. & Le Voi, M. (eds) Memory, Buckingham: Open University Press.
Miller, G. (1981) ‘Trends and Debates in Cognitive Psychology’ Cognition, 10, 215-225.
Mulholland, P. & Watt, S. (2001) D309 TMA 3, Cognitive Modelling Project, The Open University.