H.B. Keller Colloquium
We introduce a new swarm-based gradient descent (SBGD) method for non-convex optimization. The swarm consists of agents, identified with positions x and masses m.
There are three key aspects to the SBGD dynamics:
(i) persistent transfer of mass from higher to lower ground;
(ii) a random choice of marching direction, aligned with the orientation of steepest descent; and
(iii) a time stepping protocol, h(x,m), which decreases with m.
The interplay between positions and masses leads to a dynamic distinction between 'leaders' and 'explorers'. Heavier agents lead the swarm near local minima with small time steps.
Lighter agents explore the landscape in random directions with large time steps, leading to improved positions, i.e., reducing the 'loss' for the swarm.
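The dynamics above can be sketched in a few lines of code. This is a minimal illustrative sketch, not the authors' exact scheme: the mass-transfer fraction, the noise model for the marching direction, and the step-size protocol h(x,m) below are schematic stand-ins chosen only to exhibit the leader/explorer behavior.

```python
import numpy as np

def sbgd_sketch(loss, grad, x0, steps=300, seed=0):
    """Schematic swarm-based gradient descent (illustrative only).

    Agents carry positions x and masses m. Each step:
    (i)  a fraction of every non-leader's mass flows to the current leader;
    (ii) each agent marches in a randomly perturbed descent direction;
    (iii) the step size h decreases with the agent's mass, so heavy
          leaders take small steps and light explorers take large ones.
    """
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)        # positions, shape (n_agents, dim)
    m = np.full(len(x), 1.0 / len(x))    # masses, normalized to sum to 1
    for _ in range(steps):
        f = np.array([loss(xi) for xi in x])
        leader = np.argmin(f)
        # (i) transfer mass from higher ground toward the leader
        for i in range(len(x)):
            if i != leader:
                dm = 0.05 * m[i]
                m[i] -= dm
                m[leader] += dm
        # (ii) + (iii): noisy descent direction, step size shrinking in m
        for i in range(len(x)):
            g = grad(x[i])
            d = -g + 0.5 * (1.0 - m[i]) * rng.standard_normal(g.shape)
            h = 0.1 * (1.0 - m[i]) + 0.01  # lighter agents step farther
            x[i] += h * d
    best = np.argmin([loss(xi) for xi in x])
    return x[best]
```

For example, on the quadratic loss f(x) = |x|^2, a small swarm of random initial agents concentrates near the global minimizer: the leader's growing mass damps both its step size and its random perturbation, while the light explorers continue to sample the landscape.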
Convergence analysis and numerical simulations demonstrate the effectiveness of the SBGD method as a global optimizer.