For the seminar recording, please click here
Talk Title:
Modern Hopfield Networks for Novel Transformer Architectures
Abstract:
Modern Hopfield Networks, or Dense Associative Memories, are recurrent neural networks with fixed point attractor states that are described by an energy function. In contrast to conventional Hopfield Networks, which were popular in the 1980s, their modern versions have a very large memory storage capacity, which makes them appealing tools for many problems in machine learning, cognitive science, and neuroscience. In this talk I will introduce the intuition behind this class of models and their mathematical formulation, and will give examples of problems in AI that can be tackled using these new ideas. In particular, I will introduce an architecture called the Energy Transformer, which replaces the conventional attention mechanism with a recurrent Dense Associative Memory model. I will explain the theoretical principles behind this architectural choice and show promising empirical results on challenging computer vision and graph network tasks.
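The abstract itself contains no equations; purely as an illustrative sketch (not taken from the talk), the snippet below shows one commonly used retrieval rule for continuous modern Hopfield networks, in which the state is iterated toward a fixed point given by a softmax-weighted combination of the stored patterns. The function and variable names (`retrieve`, `stored_patterns`, `beta`) are placeholders chosen for this example.

```python
import numpy as np

def retrieve(stored_patterns, query, beta=8.0, num_steps=3):
    """Illustrative sketch only: a common softmax-based update rule for
    continuous modern Hopfield networks. The state is repeatedly replaced
    by a softmax-weighted average of the stored patterns, converging to a
    fixed point near the closest memory."""
    # stored_patterns: (num_patterns, dim); query: (dim,)
    state = query
    for _ in range(num_steps):
        scores = beta * stored_patterns @ state   # similarity to each stored memory
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()                  # softmax over memories
        state = stored_patterns.T @ weights       # fixed-point iteration step
    return state

# Toy usage: store a few random patterns and recover one from a noisy cue.
rng = np.random.default_rng(0)
patterns = rng.standard_normal((5, 16))
cue = patterns[2] + 0.3 * rng.standard_normal(16)
recovered = retrieve(patterns, cue)
print(np.allclose(recovered, patterns[2], atol=0.1))  # retrieval succeeds
```

With a sufficiently sharp softmax (large `beta`), each stored pattern becomes an attractor, which is the large-capacity behavior the abstract alludes to.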