Anderson, J. A. (1977). Neural models with cognitive implications. In LaBerge, D. and Samuels, S. J., editors, Basic Processes in Reading: Perception and Comprehension, pages 27–90. Erlbaum, Hillsdale, NJ.
Anderson, J. A. (1983). Cognitive and psychological computation with neural models. IEEE Transactions on Systems, Man, and Cybernetics, 13:799–815.
Blake, A. (1983). The least disturbance principle and weak constraints. Pattern Recognition Letters, 1:393–399.
Cleeremans, A. and McClelland, J. L. (1991). Learning the structure of event sequences. Journal of Experimental Psychology: General, 120:235–253.
Dilkina, K., McClelland, J. L., and Plaut, D. C. (2008). A single-system account of semantic and lexical deficits in five semantic dementia patients. Cognitive Neuropsychology, 25:136–164.
Elman, J. L. (1990). Finding structure in time. Cognitive Science, 14:179–211.
Elman, J. L. (1991). Distributed representations, simple recurrent networks, and grammatical structure. Machine Learning, 7:195–224.
Elman, J. L. (1993). Learning and development in neural networks: The importance of starting small. Cognition, 48:71–99.
Farah, M. J. and McClelland, J. L. (1991). A computational model of semantic memory impairment: Modality specificity and emergent category specificity. Journal of Experimental Psychology: General, 120:339–357.
Feldman, J. A. (1981). A connectionist model of visual memory. In Hinton, G. E. and Anderson, J. A., editors, Parallel Models of Associative Memory, chapter 2. Erlbaum, Hillsdale, NJ.
Fukushima, K. (1975). Cognitron: A self-organizing multilayered neural network. Biological Cybernetics, 20:121–136.
Geman, S. and Geman, D. (1984). Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-6:721–741.
Grossberg, S. (1976). Adaptive pattern classification and universal recoding: Part I: Parallel development and coding of neural feature detectors. Biological Cybernetics, 23:121–134.
Grossberg, S. (1978). A theory of visual coding, memory, and development. In Leeuwenberg, E. L. J. and Buffart, H. F. J. M., editors, Formal Theories of Visual Perception. John Wiley & Sons, New York.
Grossberg, S. (1980). How does the brain build a cognitive code? Psychological Review, 87:1–51.
Hebb, D. O. (1949). The Organization of Behavior. Wiley, New York.
Hertz, J. A., Krogh, A., and Palmer, R. G. (1991). Introduction to the Theory of Neural Computation. Westview Press.
Hinton, G. E. (1977). Relaxation and Its Role in Vision. PhD thesis, University of Edinburgh.
Hinton, G. E. and Anderson, J. A., editors (1981). Parallel models of associative memory. Erlbaum, Hillsdale, NJ.
Hinton, G. E. and Sejnowski, T. J. (1983). Optimal perceptual inference. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Washington, DC.
Hopfield, J. J. (1982). Neural networks and physical systems with emergent collective computational abilities. Proceedings of the National Academy of Sciences, USA, 79:2554–2558.
Hopfield, J. J. (1984). Neurons with graded response have collective computational properties like those of two-state neurons. Proceedings of the National Academy of Sciences, USA, 81:3088–3092.
James, W. (1890/1950). The Principles of Psychology. Dover, New York.
Jenkins, W. M., Merzenich, M. M., Ochs, M. T., Allard, T., and Guíc-Robles, E. (1990). Functional reorganization of primary somatosensory cortex in adult owl monkeys after behaviorally controlled tactile stimulation. Journal of Neurophysiology, 63(1):82–104.
Kohonen, T. (1977). Associative memory: A system theoretical approach. Springer, New York.
Kohonen, T. (1982). Self-organized formation of topologically correct feature maps. Biological Cybernetics, 43:59–69.
Lambon Ralph, M. A., McClelland, J. L., Patterson, K., Galton, C. J., and Hodges, J. R. (2001). No right to speak? The relationship between object naming and semantic impairment: Neuropsychological evidence and a computational model. Journal of Cognitive Neuroscience, 13:341–356.
Levin, J. A. (1976). Proteus: An activation framework for cognitive process models. Technical Report ISI/WP-2, University of Southern California, Information Sciences Institute, Marina del Rey, CA.
McClelland, J. L. (1981). Retrieving general and specific information from stored knowledge of specifics. In Proceedings of the Third Annual Conference of the Cognitive Science Society, pages 170–172, Berkeley, CA.
McClelland, J. L. (1991). Stochastic interactive activation and the effect of context on perception. Cognitive Psychology, 23:1–44.
McClelland, J. L. and Patterson, K. (2002). ‘Words or Rules’ cannot exploit the regularity in exceptions. Trends in Cognitive Sciences, 6:464–465.
McClelland, J. L. and Rogers, T. T. (2003). The parallel distributed processing approach to semantic cognition. Nature Reviews Neuroscience, 4:310–322.
McClelland, J. L. and Rumelhart, D. E. (1981). An interactive activation model of context effects in letter perception: Part 1. An account of basic findings. Psychological Review, 88:375–407.
McClelland, J. L. and Rumelhart, D. E. (1988). Explorations in Parallel Distributed Processing: A Handbook of Models, Programs, and Exercises. MIT Press, Cambridge, MA.
McClelland, J. L., Rumelhart, D. E., and the PDP Research Group (1986). Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Volume 2: Psychological and Biological Models. MIT Press, Cambridge, MA.
Minsky, M. and Papert, S. (1969). Perceptrons: An Introduction to Computational Geometry. MIT Press, Cambridge, MA.
Pinker, S. and Prince, A. (1988). On language and connectionism: Analysis of a parallel distributed processing model of language acquisition. Cognition, 28:73–193.
Pinker, S. and Ullman, M. T. (2002). The past and future of the past tense. Trends in Cognitive Sciences, 6:456–463.
Plaut, D. C., McClelland, J. L., Seidenberg, M. S., and Patterson, K. (1996). Understanding normal and impaired word reading: Computational principles in quasi-regular domains. Psychological Review, 103:56–115.
Plaut, D. C. and Shallice, T. (1993). Deep dyslexia: A case study of connectionist neuropsychology. Cognitive Neuropsychology.
Reber, A. S. (1976). Implicit learning of synthetic languages: The role of instructional set. Journal of Experimental Psychology: Human Learning and Memory, 2:88–94.
Rogers, T. T., Lambon Ralph, M. A., Garrard, P., Bozeat, S., McClelland, J. L., Hodges, J. R., and Patterson, K. (2004). The structure and deterioration of semantic memory: A neuropsychological and computational investigation. Psychological Review, 111:205–235.
Rogers, T. T. and McClelland, J. L. (2004). Semantic Cognition: A Parallel Distributed Processing Approach. MIT Press, Cambridge, MA.
Rohde, D. (1999). Lens: The light, efficient network simulator. Technical Report CMU-CS-99-164, Carnegie Mellon University, Department of Computer Science, Pittsburgh, PA.
Rohde, D. and Plaut, D. C. (1999). Language acquisition in the absence of explicit negative evidence: How important is starting small? Cognition, 72:67–109.
Rosenblatt, F. (1959). Two theorems of statistical separability in the perceptron. In Mechanisation of Thought Processes: Proceedings of a Symposium Held at the National Physical Laboratory, November 1958, Volume 1, pages 421–456, London. HM Stationery Office.
Rosenblatt, F. (1962). Principles of neurodynamics. Spartan, New York.
Rumelhart, D. E. and McClelland, J. L. (1982). An interactive activation model of context effects in letter perception: Part 2. The contextual enhancement effect and some tests and extensions of the model. Psychological Review, 89:60–94.
Rumelhart, D. E., McClelland, J. L., and the PDP Research Group (1986). Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Volume 1: Foundations. MIT Press, Cambridge, MA.
Rumelhart, D. E. and Todd, P. M. (1993). Learning and connectionist representations. In Meyer, D. E. and Kornblum, S., editors, Attention and Performance XIV: Synergies in Experimental Psychology, Artificial Intelligence, and Cognitive Neuroscience, pages 3–30. MIT Press, Cambridge, MA.
Rumelhart, D. E. and Zipser, D. (1985). Feature discovery by competitive learning. Cognitive Science, 9:75–112.
Servan-Schreiber, D., Cleeremans, A., and McClelland, J. L. (1991). Graded state machines: The representation of temporal contingencies in simple recurrent networks. Machine Learning, 7:161–193.
Smolensky, P. (1983). Schema selection and stochastic inference in modular environments. In Proceedings of the National Conference on Artificial Intelligence AAAI-83, pages 109–113.
Sutton, R. S. (1988). Learning to predict by the methods of temporal differences. Machine Learning, 3(1):9–44.
Sutton, R. S. and Barto, A. G. (1998). Reinforcement Learning: An Introduction. MIT Press.
Tabor, W., Juliano, C., and Tanenhaus, M. K. (1997). Parsing in a dynamical system: An attractor-based account of the interaction of lexical and structural constraints in sentence processing. Language and Cognitive Processes, 12(2):211–271.
Tesauro, G. (1992). Practical issues in temporal difference learning. Machine Learning, 8(3-4):257–277.
Tesauro, G. (2002). Programming backgammon using self-teaching neural nets. Artificial Intelligence, 134(1-2):181–199.
von der Malsburg, C. (1973). Self-organization of orientation sensitive cells in the striate cortex. Kybernetik, 14:85–100.
Weisstein, N., Ozog, G., and Szoc, R. (1975). A comparison and elaboration of two models of metacontrast. Psychological Review, 82:325–343.
Widrow, B. and Hoff, M. E. (1960). Adaptive switching circuits. In Institute of Radio Engineers, Western Electronic Show and Convention, Convention Record, Part 4, pages 96–104, New York. IRE.
Williams, R. J. and Zipser, D. (1995). Gradient-based learning algorithms for recurrent networks and their computational complexity. In Chauvin, Y. and Rumelhart, D. E., editors, Back-propagation: Theory, Architectures and Applications. Erlbaum.