Gradient-Free Approaches to Update Weights in Deep Learning

Gradient Descent for Non-Differentiable Functions

  1. Smooth Approximation
  2. ε-subgradient method
  3. Cutting plane method
  4. Subgradient method (a sketch follows this list)
  5. Evolutionary algorithms (EA)
  6. Conjugate gradient method
  7. Hessian-free optimization method
  8. Quasi-Newton method
  9. Genetic algorithms (GA)
  10. BFGS
  11. Simulated Annealing
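
As a concrete illustration of the subgradient method (item 4 above), here is a minimal NumPy sketch that minimizes a non-differentiable convex function with a diminishing step size. The objective, step schedule, and iteration count are illustrative choices, not prescriptions.

```python
import numpy as np

# Non-differentiable convex objective: f(x) = |x1 - 3| + 2|x2 + 1|
def f(x):
    return abs(x[0] - 3.0) + 2.0 * abs(x[1] + 1.0)

# A valid subgradient of f at x. np.sign returns 0 at the kinks,
# which is admissible because 0 lies in the subdifferential there.
def subgrad(x):
    return np.array([np.sign(x[0] - 3.0), 2.0 * np.sign(x[1] + 1.0)])

x = np.zeros(2)
best_x, best_f = x.copy(), f(x)
for k in range(1, 2001):
    alpha = 0.5 / np.sqrt(k)             # diminishing step: alpha_k -> 0, sum alpha_k = inf
    x = x - alpha * subgrad(x)
    if f(x) < best_f:                    # subgradient iterates are not monotone,
        best_x, best_f = x.copy(), f(x)  # so track the best point seen so far

print(best_x, best_f)                    # approaches the minimizer (3, -1)
```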

Gradient-Free Optimization Algorithms:

  1. Bayesian optimization
  2. Coordinate descent and adaptive coordinate descent
  3. Cuckoo search
  4. Beetle Antennae Search (BAS)
  5. DONE
  6. Evolution strategies and natural evolution strategies (CMA-ES, xNES, SNES); a sketch follows this list
  7. Genetic algorithms
  8. MCS algorithm
  9. Nelder-Mead method
  10. Particle swarm optimization
  11. Pattern search
  12. Random search (including Luus–Jaakola)
  13. Simulated annealing
  14. Stochastic optimization
  15. Subgradient method
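
To tie the list back to the title, below is a minimal evolution-strategies sketch (item 6 above) that updates the weights of a tiny network using only forward-pass evaluations, no backpropagation. The task, architecture, and hyperparameters are illustrative assumptions; the update is the standard Monte Carlo gradient estimate over Gaussian perturbations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny regression task: fit y = sin(x) with a one-hidden-layer tanh net,
# treating all weights as a single flat parameter vector theta.
X = np.linspace(-3, 3, 64).reshape(-1, 1)
Y = np.sin(X)
H = 16                                   # hidden width (illustrative)
n_params = H + H + H + 1                 # W1, b1, W2, b2

def loss(theta):
    W1 = theta[:H].reshape(1, H)
    b1 = theta[H:2 * H]
    W2 = theta[2 * H:3 * H].reshape(H, 1)
    b2 = theta[3 * H:]
    pred = np.tanh(X @ W1 + b1) @ W2 + b2
    return np.mean((pred - Y) ** 2)

theta = rng.normal(scale=0.1, size=n_params)
sigma, lr, pop = 0.1, 0.05, 50           # ES hyperparameters (illustrative)

for step in range(500):
    eps = rng.normal(size=(pop, n_params))                          # perturbations
    rewards = np.array([-loss(theta + sigma * e) for e in eps])
    rewards = (rewards - rewards.mean()) / (rewards.std() + 1e-8)   # fitness shaping
    # Monte Carlo estimate of the gradient of expected reward w.r.t. theta
    theta += lr / (pop * sigma) * eps.T @ rewards

print(loss(theta))                       # MSE should be well below its initial value
```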

Further Reading

  1. Difference Target Propagation https://arxiv.org/pdf/1412.7525.pdf
  2. The HSIC Bottleneck (Hilbert-Schmidt Independence Criterion) https://arxiv.org/pdf/1908.01580v1.pdf
  3. Online Alternating Minimization with Auxiliary Variables https://arxiv.org/pdf/1806.09077.pdf
  4. Decoupled Neural Interfaces Using Synthetic Gradients https://arxiv.org/pdf/1608.05343.pdf
  5. Accelerated Stochastic Gradient-free and Projection-free Methods
  6. On Correctness of Automatic Differentiation for Non-Differentiable Functions
  7. Direct Feedback Alignment Provides Learning in Deep Neural Networks (sketched below)
  8. Cubature Kalman filtering for training deep neural networks
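
Reference 7 above is directly about backprop-free credit assignment, so a minimal NumPy sketch of direct feedback alignment (DFA) may help: each hidden layer receives the output error through its own fixed random matrix instead of through backpropagated gradients. The toy task, layer sizes, and learning rate here are illustrative assumptions, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy classification task: targets come from a random linear teacher,
# so they are learnable (all sizes are illustrative choices).
X = rng.normal(size=(256, 20))
labels = (X @ rng.normal(size=(20, 3))).argmax(axis=1)
T = np.eye(3)[labels]                          # one-hot targets

# Two-hidden-layer network
W1 = rng.normal(scale=0.1, size=(20, 64)); b1 = np.zeros(64)
W2 = rng.normal(scale=0.1, size=(64, 64)); b2 = np.zeros(64)
W3 = rng.normal(scale=0.1, size=(64, 3));  b3 = np.zeros(3)

# Fixed random feedback matrices: DFA sends the output error straight
# to each hidden layer through B1, B2 instead of the transposed weights.
B1 = rng.normal(scale=0.1, size=(3, 64))
B2 = rng.normal(scale=0.1, size=(3, 64))

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

lr, n = 0.5, len(X)
for epoch in range(300):
    a1 = np.tanh(X @ W1 + b1)                  # forward pass
    a2 = np.tanh(a1 @ W2 + b2)
    y = softmax(a2 @ W3 + b3)
    e = y - T                                  # output error (softmax + cross-entropy)

    d2 = (e @ B2) * (1.0 - a2 ** 2)            # direct feedback to layer 2
    d1 = (e @ B1) * (1.0 - a1 ** 2)            # direct feedback to layer 1

    W3 -= lr * a2.T @ e / n;  b3 -= lr * e.mean(axis=0)
    W2 -= lr * a1.T @ d2 / n; b2 -= lr * d2.mean(axis=0)
    W1 -= lr * X.T @ d1 / n;  b1 -= lr * d1.mean(axis=0)

# Accuracy from the last forward pass; should be well above the 1/3 chance level
print("training accuracy:", (y.argmax(axis=1) == labels).mean())
```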

Other: the Nelder-Mead simplex algorithm, Powell's method, and the Hooke-Jeeves method; a quick SciPy usage sketch follows.
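
For the first two methods just named, SciPy's `minimize` already ships derivative-free `Nelder-Mead` and `Powell` solvers (Hooke-Jeeves is not included, though it is a closely related pattern-search method). The Rosenbrock objective below is a stand-in test function, not anything from the original post.

```python
import numpy as np
from scipy.optimize import minimize

# Classic Rosenbrock function: minimum at (1, 1) with value 0
def rosen(x):
    return (1.0 - x[0]) ** 2 + 100.0 * (x[1] - x[0] ** 2) ** 2

x0 = np.array([-1.2, 1.0])
for method in ("Nelder-Mead", "Powell"):
    res = minimize(rosen, x0, method=method)   # derivative-free: no jac argument needed
    print(method, res.x, res.fun)              # both should approach (1, 1)
```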