Machine Learning Research Blog

Francis Bach

Category: Tools

Cute mathematical tools

On the unreasonable effectiveness of Richardson extrapolation

Posted on March 1, 2020 by Francis Bach

This month, I will follow up on last month’s blog post and describe classical techniques from numerical analysis that aim at accelerating the convergence of a vector sequence to its limit, by combining only elements of the sequence, without detailed knowledge of the iterative process that produced it. Last month,…

Acceleration without pain

Posted on February 4, 2020 (updated May 31, 2021) by Francis Bach

I don’t know of any user of iterative algorithms who has not complained one day about their convergence speed. Whether the data are too big or the processors not fast or numerous enough, waiting for an algorithm to converge unfortunately remains a core practical component of computer science and applied mathematics. This was already a concern…

The sum of a geometric series is all you need!

Posted on January 6, 2020 (updated March 10, 2020) by Francis Bach

I sometimes joke with my students about one of the main tools I have been using in the last ten years: the explicit sum of a geometric series. Why is this?

From numbers to operators

The simplest version of this basic result for real numbers is the following: $$ \forall r \neq 1, \ \forall…

Polynomial magic II : Jacobi polynomials

Posted on December 2, 2019 (updated April 16, 2020) by Francis Bach

Following up on my last post on Chebyshev polynomials, here is another piece of polynomial magic this month. This time, Jacobi polynomials will be the main players. Since definitions and various formulas are not as intuitive as for Chebyshev polynomials, I will start with the machine learning / numerical analysis motivation, which is an elegant refinement of Chebyshev…

Polynomial magic I : Chebyshev polynomials

Posted on November 4, 2019 (updated December 1, 2019) by Francis Bach

Orthogonal polynomials pop up everywhere in applied mathematics and in particular in numerical analysis. Within machine learning and optimization, typically (a) they provide natural basis functions which are easy to manipulate, or (b) they can be used to model various acceleration mechanisms. In this post, I will describe one class of such polynomials, the Chebyshev…

The Gumbel trick

Posted on September 2, 2019 (updated March 28, 2022) by Francis Bach

Quantities of the form \(\displaystyle \log \Big( \sum_{i=1}^n \exp( x_i) \Big)\) for \(x \in \mathbb{R}^n\), often referred to as “log-sum-exp” functions, are ubiquitous in machine learning, as they appear in normalizing constants of exponential families, and thus in many supervised learning formulations such as softmax regression, but also more generally in (Bayesian or frequentist) probabilistic…
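
As a small illustration of the quantity in this excerpt, here is a minimal Python sketch that evaluates the log-sum-exp function. The function name `log_sum_exp` is hypothetical, and subtracting the maximum before exponentiating is a standard numerical-stability device that I am assuming here; it is not the Gumbel trick itself and is not taken from the post.

```python
import math


def log_sum_exp(x):
    """Evaluate log(sum_i exp(x_i)) for a non-empty list of floats.

    Assumption (not from the post): shift by the maximum so that the
    largest exponent is exp(0) = 1, which avoids overflow.
    """
    m = max(x)
    # log(sum_i exp(x_i)) = m + log(sum_i exp(x_i - m))
    return m + math.log(sum(math.exp(xi - m) for xi in x))


if __name__ == "__main__":
    print(log_sum_exp([0.0, 0.0]))        # equals log(2)
    print(log_sum_exp([1000.0, 1000.0]))  # naive exp(1000.0) would overflow
```

The shifted form is algebraically identical to the formula above; it only changes how the sum is evaluated in floating point.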

The “η-trick” reloaded: multiple kernel learning

Posted on August 5, 2019 by Francis Bach

In my previous post, I described various (potentially non-smooth) functions that have quadratic (and thus smooth) variational formulations, a possibility that I referred to as the η-trick. For example, in its simplest formulation, we have \( \displaystyle |w| = \min_{ \eta \geq 0} \frac{1}{2} \frac{w^2}{\eta} + \frac{1}{2} \eta\). While it seems most often used for…
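
The identity quoted in this excerpt can be checked numerically. The sketch below is only an illustration under my own assumptions: the names `eta_objective` and `abs_via_eta_trick` are hypothetical, and the minimizer \(\eta = |w|\) is obtained by setting the derivative of the objective in \(\eta\) to zero.

```python
def eta_objective(w, eta):
    """Quadratic-in-w objective (1/2) w^2 / eta + (1/2) eta, for eta > 0."""
    return 0.5 * w * w / eta + 0.5 * eta


def abs_via_eta_trick(w):
    """Recover |w| as the minimum over eta >= 0 of eta_objective(w, eta).

    Setting the derivative in eta to zero gives eta = |w|. For w = 0 the
    infimum 0 is approached as eta -> 0, so that case is handled separately.
    """
    eta = abs(w)
    if eta == 0.0:
        return 0.0
    return eta_objective(w, eta)


if __name__ == "__main__":
    for w in (-3.0, 0.0, 2.0):
        print(w, abs_via_eta_trick(w))
    # Any other eta gives a value at least as large as |w|.
    print([eta_objective(2.0, e) for e in (0.5, 1.0, 2.0, 4.0)])
```

For a fixed \(\eta\), the objective is quadratic in \(w\), which is what makes the reformulation convenient for reweighted least-squares schemes.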

The “η-trick” or the effectiveness of reweighted least-squares

Posted on July 1, 2019 (updated July 21, 2022) by Francis Bach

Optimizing a quadratic function is often considered “easy” as it is equivalent to solving a linear system, for which many algorithms exist. Thus, reformulating a non-quadratic optimization problem into a sequence of quadratic problems is a natural idea. While the standard generic approach is Newton’s method, which is suited to smooth (at least twice-differentiable) functions,…

Recent Posts

  • Unraveling spectral properties of kernel matrices – II
  • My book is (at last) out!
  • Scaling laws of optimization
  • Unraveling spectral properties of kernel matrices – I
  • Revisiting the classics: Jensen’s inequality

About

I am Francis Bach, a researcher at INRIA in the Computer Science department of Ecole Normale Supérieure, in Paris, France. I have been working on machine learning since 2000, with a focus on algorithmic and theoretical contributions, in particular in optimization. All of my papers can be downloaded from my web page or my Google Scholar page. I also have a Twitter account.
