Supervision: Konstantinos Pitas

Project type: Semester project (master) / Master thesis

Available

How much information do deep neural network weights contain after training? One way of answering this question is to model the weights as samples from the posterior distribution of a stochastic classifier [1]. The information in the weights can then be found by minimizing the error of the stochastic classifier while keeping the KL divergence between the posterior and a prior distribution small. This "optimal" KL divergence can, in a certain sense, be taken as a measure of the complexity, or information content, of the resulting classifier.
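As a rough illustration (not the project's prescribed method), the trade-off above can be written as a single training objective: the expected classification loss of the stochastic classifier plus a weighted KL penalty between the posterior and the prior. The sketch below, in PyTorch, assumes the KL term has already been computed elsewhere; the function name and the beta parameter are illustrative.

import torch
import torch.nn.functional as F

def information_objective(logits, targets, kl_q_p, beta=1.0):
    """Cross-entropy of the stochastic classifier (a surrogate for its error)
    plus a weighted KL(posterior || prior) term; at the optimum, the KL value
    is read off as the information in the weights."""
    nll = F.cross_entropy(logits, targets)   # error of the stochastic classifier
    return nll + beta * kl_q_p               # trade-off controlled by beta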

For simple posterior distributions, such as Gaussians with diagonal covariance (the mean-field approximation), minimizing this KL term is relatively easy and can be done in a number of ways [2]. For more complex distributions, the problem becomes computationally hard.
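For instance, in the mean-field case with a diagonal-Gaussian posterior N(mu, sigma^2) and a standard-normal prior, the KL term has a closed form, and the pre-activations of a linear layer can be sampled directly using the local reparameterization trick of [2]. The minimal PyTorch sketch below assumes this setup; the class name and the initialization values are illustrative.

import torch
import torch.nn as nn

class MeanFieldLinear(nn.Module):
    def __init__(self, in_features, out_features):
        super().__init__()
        # Posterior parameters: one mean and one log-variance per weight
        self.mu = nn.Parameter(torch.zeros(in_features, out_features))
        self.log_sigma2 = nn.Parameter(torch.full((in_features, out_features), -6.0))

    def forward(self, x):
        # Local reparameterization: sample the pre-activations, not the weights
        mean = x @ self.mu
        var = (x ** 2) @ self.log_sigma2.exp()
        return mean + var.sqrt() * torch.randn_like(mean)

    def kl_to_standard_normal(self):
        # Closed-form KL( N(mu, sigma^2) || N(0, 1) ), summed over all weights
        sigma2 = self.log_sigma2.exp()
        return 0.5 * (sigma2 + self.mu ** 2 - 1.0 - self.log_sigma2).sum()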

In this project the student will work on computing lower bounds on the amount of information contained in the weights of trained deep neural networks, for complicated posterior distributions and large architectures.

The student should be highly motivated and should have good knowledge of TensorFlow/Keras and/or PyTorch. Ideally, the student should have experience working with large architectures such as VGG-16.

The project is 20% theory and 80% application.

[1] Where is the information in a deep neural network? https://arxiv.org/abs/1905.12213

[2] Variational Dropout and the Local Reparameterization Trick https://papers.nips.cc/paper/5666-variational-dropout-and-the-local-reparameterization-trick.pdf