Francis Bach – Machine Learning Research Blog

Unraveling spectral properties of kernel matrices – II

Posted on March 24, 2025 by Francis Bach

This month, we pursue our exploration of spectral properties of kernel matrices. As mentioned in a previous post, understanding how eigenvalues decay is not only fun but also key to understanding algorithmic and statistical properties of many learning methods (see, e.g., chapter 7 of my book “Learning Theory from First Principles”). This month, we look…

My book is (at last) out!

Posted on December 21, 2024 by Francis Bach

Just in time for Christmas, two days ago I received the first hard copies of my book! It is a mix of relief and pride after 3 years of work. As most book writers will probably acknowledge, it took much longer than I expected when I started, but overall it was an enriching…

Scaling laws of optimization

Posted on October 5, 2024 (updated October 21, 2024) by Francis Bach

Scaling laws have been one of the key achievements of theoretical analysis in various fields of applied mathematics and computer science, answering the following key question: How fast does my method or my algorithm converge as a function of (potentially partially) observable problem parameters? For supervised machine learning and statistics, probably the simplest and oldest…

Unraveling spectral properties of kernel matrices – I

Posted on January 7, 2024 (updated January 12, 2024) by Francis Bach

Since my early PhD years, I have plotted and studied eigenvalues of kernel matrices. In the simplest setting, take independent and identically distributed (i.i.d.) data, such as in the cube below in 2 dimensions, take your favorite kernels, such as the Gaussian or Abel kernels, plot eigenvalues in decreasing order, and see what happens. The…
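The experiment described in this excerpt can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the code used in the post: the function name `kernel_eigenvalues`, the sample size, the choice of the cube [-1, 1]^2, the bandwidth, and the normalization of the kernel matrix by n are all assumptions made here.

```python
import numpy as np


def kernel_eigenvalues(n=200, d=2, kernel="gaussian", bandwidth=1.0, seed=0):
    """Eigenvalues (in decreasing order) of the normalized kernel matrix K / n.

    Hypothetical helper: data are drawn i.i.d. uniformly in the cube [-1, 1]^d,
    which is an assumption about the setting, not taken from the post.
    """
    rng = np.random.default_rng(seed)
    X = rng.uniform(-1.0, 1.0, size=(n, d))

    # Pairwise Euclidean distances between all data points.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    if kernel == "gaussian":
        # Gaussian kernel: exp(-||x - y||^2 / bandwidth^2)
        K = np.exp(-((dist / bandwidth) ** 2))
    elif kernel == "abel":
        # Abel (exponential) kernel: exp(-||x - y|| / bandwidth)
        K = np.exp(-dist / bandwidth)
    else:
        raise ValueError(f"unknown kernel: {kernel!r}")

    # eigvalsh returns eigenvalues of a symmetric matrix in ascending order.
    eigenvalues = np.linalg.eigvalsh(K / n)
    return eigenvalues[::-1]


if __name__ == "__main__":
    for name in ("gaussian", "abel"):
        eig = kernel_eigenvalues(kernel=name)
        print(name, eig[:5])
```

Plotting the returned arrays on a logarithmic vertical axis (for instance with matplotlib) is one way to compare the decay of the two kernels visually.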

Revisiting the classics: Jensen’s inequality

Posted on March 13, 2023 (updated March 15, 2023) by Francis Bach

There are a few mathematical results that any researcher in applied mathematics uses on a daily basis. One of them is Jensen’s inequality, which allows bounding expectations of functions of random variables. This really happens a lot in any probabilistic arguments but also as a tool to generate inequalities and optimization algorithms. In this blog…

Non-convex quadratic optimization problems

Posted on February 2, 2023 (updated March 14, 2023) by Francis Bach

Among continuous optimization problems, convex problems (with convex objectives and convex constraints) define a class that can be solved efficiently with a variety of algorithms and with arbitrary precision. This is not true more generally when the convexity assumption is removed (see this post). This of course does not mean that (1) nobody should attempt…

Sums-of-squares for dummies: a view from the Fourier domain

Posted on November 16, 2022 by Francis Bach

Over the last two years, I have been intensively studying sum-of-squares relaxations for optimization, learning a lot from many great research papers [1, 2], review papers [3], books [4, 5, 6, 7, 8], and even websites. Much of the literature focuses on polynomials as the de facto starting point. While this leads to deep connections…

Information theory with kernel methods

Posted on April 4, 2022 (updated December 11, 2023) by Francis Bach

In last month's blog post, I presented the von Neumann entropy. It is defined as a spectral function on positive semi-definite (PSD) matrices, and leads to a Bregman divergence called the von Neumann relative entropy (or matrix Kullback-Leibler divergence), with interesting convexity properties and applications in optimization (mirror descent, or smoothing) and probability (concentration…

Playing with positive definite matrices – II: entropy edition

Posted on March 7, 2022 (updated March 22, 2022) by Francis Bach

Symmetric positive semi-definite (PSD) matrices come up in a variety of places in machine learning, statistics, and optimization, and more generally in most domains of applied mathematics. When estimating or optimizing over the set of such matrices, several geometries can be used. The most direct one is to consider PSD matrices as a convex set…

Playing with positive definite matrices – I: matrix monotony and convexity

Posted on February 17, 2022 (updated January 27, 2023) by Francis Bach

In a series of a few blog posts, I will present classical and non-classical results on symmetric positive definite matrices. Beyond being mathematically exciting, they arise naturally a lot in machine learning and optimization, as Hessians of twice continuously differentiable convex functions and through kernel methods. In this post, I will focus on the benefits…