Neural Networks Part 1: A Simple Proof of the Universal Approximation Theorem

Joe Klein
Published in Good Audience
Aug 2, 2017


From a chaotically wavy sample function, a neural network can learn to reproduce the curve almost exactly. A neural network with a single hidden layer can accurately represent any continuous function imaginable.

A feedforward network with a single layer is sufficient to represent any function, but the layer may be infeasibly large and may fail to learn and generalize correctly.

— Ian Goodfellow, Deep Learning Book

A grand statement, declaring that a neural network with just one hidden layer can represent almost any function.

The Universal Approximation Theorem sparked the potential and functionality of the neural networks we see today. A simple neural network with only a single hidden layer can approximate any continuous function. The sample function mentioned above, a randomized combination of sine functions, can easily be replicated by a properly configured single-hidden-layer network. By adjusting the weights (the connections between neurons) and the biases (a value stored inside each hidden neuron), the network can take any input x and produce an output that traces the chaotically wavy curve almost exactly.
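To make this concrete, here is a minimal sketch of the idea in Python (not code from this article): a single hidden layer of tanh neurons, trained by plain gradient descent in NumPy, fitting a randomized-looking combination of sine functions. The target function, layer width, learning rate, and step count are all illustrative assumptions.

```python
# Minimal sketch: one hidden layer fit to a wavy sum of sines with NumPy.
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the article's "chaotically wavy" sample function.
def target(x):
    return np.sin(3 * x) + 0.5 * np.sin(7 * x) + 0.3 * np.sin(13 * x)

# Training data on a fixed interval.
x = np.linspace(-np.pi, np.pi, 400).reshape(-1, 1)
y = target(x)

# One hidden layer with n neurons: hidden weights W1 and biases b1,
# output weights W2 and bias b2.
n = 50
W1 = rng.normal(scale=1.0, size=(1, n))
b1 = np.zeros((1, n))
W2 = rng.normal(scale=0.1, size=(n, 1))
b2 = np.zeros((1, 1))

lr = 0.01
for step in range(20000):
    # Forward pass: hidden activations h, prediction y_hat.
    h = np.tanh(x @ W1 + b1)
    y_hat = h @ W2 + b2

    # Mean squared error between prediction and target.
    err = y_hat - y
    loss = np.mean(err ** 2)

    # Backward pass: manual gradients for this two-layer network.
    grad_y_hat = 2 * err / len(x)
    grad_W2 = h.T @ grad_y_hat
    grad_b2 = grad_y_hat.sum(axis=0, keepdims=True)
    grad_h = grad_y_hat @ W2.T
    grad_z = grad_h * (1 - h ** 2)  # derivative of tanh
    grad_W1 = x.T @ grad_z
    grad_b1 = grad_z.sum(axis=0, keepdims=True)

    # Gradient descent update on all weights and biases.
    W1 -= lr * grad_W1
    b1 -= lr * grad_b1
    W2 -= lr * grad_W2
    b2 -= lr * grad_b2

    if step % 5000 == 0:
        print(f"step {step:5d}  mse {loss:.5f}")

print(f"final mean squared error: {loss:.5f}")
```

Running it drives the mean squared error toward zero; widening the hidden layer (increasing n) generally lets the fit hug the wavy target more closely, which is exactly the knob the theorem is about.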

By including n neurons in a single hidden layer, a neural network can approximate any continuous function f(x) of its input x. The function must be continuous: if it is not, a single-hidden-layer network does not easily or accurately approximate f(x). (We can fix this by adding another layer to mimic the output of a discontinuous function.)
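Written out, the function such a network computes is just a weighted sum of n activated neurons. The form below is a sketch of the standard statement (roughly Cybenko's 1989 formulation for sigmoid-like activations σ), not notation taken from this article:

```latex
% Output of a single-hidden-layer network with n neurons:
% hidden weights w_i and biases b_i, output weights v_i.
F(x) = \sum_{i=1}^{n} v_i \, \sigma(w_i x + b_i)

% Universal approximation: for any continuous f on a closed interval [a, b]
% and any tolerance \epsilon > 0, there exist n, v_i, w_i, b_i such that
|F(x) - f(x)| < \epsilon \quad \text{for all } x \in [a, b].
```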

In other words, a single-hidden-layer neural network can approximate any continuous function of x to any desired degree of precision. This theorem fostered our understanding of neural networks, giving rise to the networks of today and, potentially, tomorrow.
