Machine learning processes information, but what do we mean by "information"? One view comes from information theory, which was originally developed to address problems of communication. But learning problems are different, aren't they?
This project explores a notion of information that makes sense for machine learning problems. It turns out that there is a large family of measures of information (subsuming Shannon information as a special case) which arise very "naturally" in machine learning. The justification for this naturalness is the key result, demonstrated in the paper linked below, that there is a 1:1 correspondence between measures of information and the best possible expected risk of a learning problem, which depends upon the loss function chosen.
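To make the link between risk and information concrete, here is a minimal Python sketch of one well-known special case; it is not the paper's general construction. It assumes a small, made-up joint distribution over a binary feature X and binary label Y, and it takes "information" to mean the gap between the best expected risk without X and the best expected risk with X. All function names and the numbers in `joint` are hypothetical. Under log loss this gap is the Shannon mutual information; under 0-1 loss the same recipe yields a different measure.

```python
import math

# Hypothetical joint distribution: joint[x][y] = P(X = x, Y = y).
joint = [[0.4, 0.1],
         [0.1, 0.4]]


def marginal_y(joint):
    """Marginal distribution of the label Y."""
    return [sum(col) for col in zip(*joint)]


def prior_risk_log(joint):
    """Best expected log loss when X is ignored: the entropy H(Y), in nats."""
    return -sum(p * math.log(p) for p in marginal_y(joint) if p > 0)


def posterior_risk_log(joint):
    """Best expected log loss when X is observed: H(Y | X), in nats."""
    risk = 0.0
    for row in joint:
        px = sum(row)
        for pxy in row:
            if pxy > 0:
                risk -= pxy * math.log(pxy / px)
    return risk


def prior_risk_01(joint):
    """Best expected 0-1 loss when X is ignored: always guess the likeliest label."""
    return 1.0 - max(marginal_y(joint))


def posterior_risk_01(joint):
    """Best expected 0-1 loss when X is observed: guess the likeliest label per x."""
    return 1.0 - sum(max(row) for row in joint)


def mutual_information(joint):
    """Shannon mutual information I(X; Y) in nats, computed directly."""
    px = [sum(row) for row in joint]
    py = marginal_y(joint)
    return sum(
        pxy * math.log(pxy / (px[i] * py[j]))
        for i, row in enumerate(joint)
        for j, pxy in enumerate(row)
        if pxy > 0
    )


# "Information" as the reduction in best achievable risk, one value per loss.
info_log = prior_risk_log(joint) - posterior_risk_log(joint)
info_01 = prior_risk_01(joint) - posterior_risk_01(joint)

print("risk gap under log loss:", info_log)
print("mutual information     :", mutual_information(joint))
print("risk gap under 0-1 loss:", info_01)
```

The first two printed values agree up to floating-point error, while the third differs: changing the loss function changes the measure of information, which is the dependence on the loss described above.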
A consequence of this new, more general definition of information is that the classical information processing inequality (a basic result in information theory) becomes an information processing equality. This is quite a surprise, and it offers interesting insights into the role of hypothesis classes in machine learning algorithms.