Machine Learning Research Blog

Francis Bach

Category: Machine learning

Machine learning concepts or tools

Gradient descent for wide two-layer neural networks – I : Global convergence

Posted on June 1, 2020 (updated November 15, 2022) by Francis Bach

Supervised learning methods come in a variety of flavors. While local averaging techniques such as nearest-neighbors or decision trees are often used with low-dimensional inputs where they can adapt to any potentially non-linear relationship between inputs and outputs, methods based on empirical risk minimization are the most commonly used in high-dimensional settings. Their principle is…
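
To make the principle concrete, here is a minimal sketch of empirical risk minimization (with hypothetical least-squares data and a linear model, not taken from the post): pick the predictor in a given family that minimizes the average loss over the training sample.

```python
import numpy as np

# Minimal ERM sketch (hypothetical data and linear model, not from the post):
# fit a linear predictor by minimizing the average squared loss over n samples.
rng = np.random.default_rng(0)
n, d = 200, 10
X = rng.standard_normal((n, d))
y = X @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

# Closed-form least-squares solution of argmin_w (1/n) * ||X w - y||^2.
w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.mean((X @ w_hat - y) ** 2))   # the minimized empirical risk
```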

Effortless optimization through gradient flows

Posted on May 1, 2020 (updated May 22, 2020) by Francis Bach

Optimization algorithms often rely on simple intuitive principles, but their analysis quickly leads to a lot of algebra in which the original idea is no longer transparent. In last month’s post, Adrien Taylor explained how convergence proofs could be automated. This month, I will show how proof sketches can be obtained easily for algorithms based on gradient…
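
As a flavor of what follows, here is a minimal sketch (with a hypothetical quadratic objective, not taken from the post): gradient descent with a small step size is the explicit Euler discretization of the gradient flow \( \dot w = -\nabla f(w) \), along which the objective decreases.

```python
import numpy as np

# Minimal sketch (hypothetical quadratic objective, not from the post):
# gradient descent with a small step is the Euler discretization of the
# gradient flow dw/dt = -grad f(w), along which f(w) decreases.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
A = A @ A.T + np.eye(5)              # symmetric positive definite
f = lambda w: 0.5 * w @ A @ w
grad = lambda w: A @ w

w, dt = rng.standard_normal(5), 1e-3
for _ in range(20_000):
    w = w - dt * grad(w)             # one Euler step along the flow
print(f(w))                          # close to the minimal value 0
```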

Computer-aided analyses in optimization

Posted on April 3, 2020 (updated October 14, 2020) by Adrien Taylor

In this blog post, I want to illustrate how computers can be great allies in designing (and verifying) convergence proofs for first-order optimization methods. This task can be daunting and highly non-trivial, but it is nevertheless usually unavoidable when performing complexity analyses. A notable example is probably the convergence analysis of the stochastic average gradient (SAG) [1],…

On the unreasonable effectiveness of Richardson extrapolation

Posted on March 1, 2020 by Francis Bach

This month, I will follow up on last month’s blog post and describe classical techniques from numerical analysis that aim at accelerating the convergence of a vector sequence to its limit, by combining only elements of the sequence, without detailed knowledge of the iterative process that has led to this sequence. Last month,…
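
To give a concrete flavor of the idea, consider a hypothetical scalar sequence (not taken from the post) with \( x_t = x_* + a/t + O(1/t^2) \): the combination \( 2 x_{2t} - x_t \) cancels the leading \( 1/t \) term and converges to \( x_* \) at rate \( O(1/t^2) \).

```python
# Toy illustration of Richardson extrapolation on a hypothetical sequence
# x_t = x_star + a/t + b/t**2 (not taken from the post itself).
x_star, a, b = 3.0, 2.0, -5.0
x = lambda t: x_star + a / t + b / t ** 2

for t in [10, 100, 1000]:
    plain = abs(x(2 * t) - x_star)                     # O(1/t) error
    extrapolated = abs(2 * x(2 * t) - x(t) - x_star)   # 1/t terms cancel: O(1/t^2)
    print(t, plain, extrapolated)
```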

Are all kernels cursed?

Posted on October 8, 2019 (updated October 28, 2019) by Francis Bach

The word “kernel” appears in many areas of science (it is even worse in French with “noyau”); it can have different meanings depending on context (see here for a nice short historical review for mathematics). Within machine learning and statistics, kernels are used in two related but different contexts, with different definitions and some kernels…

The “η-trick” reloaded: multiple kernel learning

Posted on August 5, 2019 by Francis Bach

In my previous post, I described various (potentially non-smooth) functions that have quadratic (and thus smooth) variational formulations, a possibility that I referred to as the η-trick. For example, in its simplest formulation, we have \( \displaystyle |w| = \min_{ \eta \geq 0} \frac{1}{2} \frac{w^2}{\eta} + \frac{1}{2} \eta\). While it seems most often used for…
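
A quick numerical sanity check of this variational formulation (a minimal sketch, with an arbitrary value of \( w \) chosen for illustration): the minimum over \( \eta \geq 0 \) equals \( |w| \) and is attained at \( \eta = |w| \).

```python
import numpy as np

# Numerical check of |w| = min_{eta >= 0} w**2 / (2 * eta) + eta / 2,
# with an arbitrary w chosen for illustration (not from the post).
w = 1.7
etas = np.linspace(1e-6, 5.0, 200_001)
values = 0.5 * w ** 2 / etas + 0.5 * etas
print(values.min(), abs(w))        # both approximately 1.7
print(etas[np.argmin(values)])     # the minimizer, approximately |w| = 1.7
```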

Recent Posts

  • Closed-form dynamics beyond quadratics
  • Beyond Power Laws: Scaling Laws for Next-Token Prediction
  • Revisiting scaling laws via the z-transform
  • Unraveling spectral properties of kernel matrices – II
  • My book is (at last) out!

About

I am Francis Bach, a researcher at INRIA in the Computer Science department of Ecole Normale Supérieure, in Paris, France. I have been working on machine learning since 2000, with a focus on algorithmic and theoretical contributions, in particular in optimization. All of my papers can be downloaded from my web page or my Google Scholar page. I also have a Twitter account. I recently published a book, “Learning Theory from First Principles”.
