# Mathematical Perspectives on Neural Networks

By Paul Smolensky; Michael C. Mozer et al.

19 Parametric Statistical Estimation with Artificial Neural Networks

Halbert White, University of California, San Diego

1. INTRODUCTION

Learning in artificial neural networks is a process by which experience arising from exposure to measurements of empirical phenomena is converted to knowledge, embodied in network weights. This process can be viewed formally as statistical estimation of the parameters of a parametrized probability model. In this chapter we exploit this formal viewpoint to obtain a unified theory of learning in artificial neural networks. The theory is sufficiently general to encompass both supervised and unsupervised learning in either feedforward or recurrent networks.
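As a minimal illustration of this viewpoint (a sketch under stated assumptions, not the chapter's own construction), the Python code below treats a single linear unit as a parametrized model and estimates its weights from noisy observations with delta-rule updates. The function names `simulate` and `learn`, the generating weights (2.0, -1.0), the Gaussian noise, and the learning rate are all hypothetical choices made for the example.

```python
import random

def simulate(n, w_true=(2.0, -1.0), noise_sd=0.1, seed=0):
    """Draw n observations (x_t, y_t) with y_t = w0 + w1 * x_t + noise.

    The linear-Gaussian form is an assumption made for illustration only.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        y = w_true[0] + w_true[1] * x + rng.gauss(0.0, noise_sd)
        data.append((x, y))
    return data

def learn(data, rate=0.05, epochs=50):
    """Delta-rule (stochastic gradient) updates for one linear unit.

    Each update moves the weights down the gradient of the squared
    prediction error for a single observation.
    """
    w0, w1 = 0.0, 0.0
    for _ in range(epochs):
        for x, y in data:
            err = y - (w0 + w1 * x)   # prediction error for this observation
            w0 += rate * err          # bias weight update
            w1 += rate * err * x      # input weight update
    return w0, w1

data = simulate(500)
w_hat = learn(data)
print(w_hat)
```

Under these assumed settings the learned weights should settle close to the generating values, which is the sense in which network weights act as parameter estimates.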

2. DATA GENERATING PROCESSES AS OBJECTS OF LEARNING

The basis of any learning is data. Observed data can be considered as a sequence of realizations of an underlying stochastic process. For simplicity, we assume here that observations on a specific phenomenon of interest (e.g., a physical system, an economy, a group of patients) are made in discrete fashion. A formal assumption that embodies these heuristics in a way convenient for our purposes is the following.

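Assumption A.1. The observed data are the realization of a stochastic process $\{Z_t : \Omega \to \mathbb{R}^{\nu},\ t = 1, 2, \ldots\}$, $\nu \in \mathbb{N}$, on the complete probability space $(\Omega, \mathcal{F}, P_0)$. For convenience and without loss of generality, we take $\Omega = \mathbb{R}^{\nu\infty}$, $\mathcal{F} = \mathcal{B}^{\nu\infty} \equiv \mathcal{B}(\mathbb{R}^{\nu\infty})$ and let $\{Z_t\}$ be the coordinate variable process: for $\omega = \{z_t\} \in \mathbb{R}^{\nu\infty}$, $Z_t(\omega) = z_t$, $t = 1, 2, \ldots$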
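The coordinate-process construction in Assumption A.1 can be sketched in Python as follows. A sample point is a whole sequence of observation vectors, and the t-th coordinate variable simply reads off its t-th element. The names `draw_omega`, `Z`, and `nu` are hypothetical, and the i.i.d. standard normal draws are only a stand-in for the unspecified probability law P0. Only a finite prefix of the infinite sequence can be materialized in code.

```python
import random
from itertools import islice

def draw_omega(nu=2, seed=0):
    """Lazily generate one sample point omega = (z_1, z_2, ...).

    Each z_t is a vector of nu real numbers. The i.i.d. Gaussian law
    is an assumption; Assumption A.1 does not specify P0.
    """
    rng = random.Random(seed)
    while True:
        yield tuple(rng.gauss(0.0, 1.0) for _ in range(nu))

def Z(t, omega_prefix):
    """Coordinate variable: Z_t(omega) = z_t, with t counted from 1."""
    return omega_prefix[t - 1]

# Materialize the first five coordinates of one realization.
omega_prefix = list(islice(draw_omega(nu=2, seed=0), 5))
z3 = Z(3, omega_prefix)
print(z3)
```

The observed data set then corresponds to a prefix such as `omega_prefix`: one realization of the process, seen up to a finite time.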


