Notes from the Wired
This is a website where I write articles on various topics that interest me, carving out a bit of cyberspace for myself.
You shouldn't believe anything I talk about — I use words entirely recreationally.
Pinned
- March 16, 2025
Do not miss the spectacle that is the VTLeague. My money is on /hag/; they destroyed /vtwbg/.
Most Recent
Mar. 30
Learning to Optimize Neural Nets
Paper Title: Learning to Optimize Neural Nets
Link to Paper: https://arxiv.org/abs/1703.00441
Date: March 1, 2017
Paper Type: Meta-Learning, Gradient Descent, Neural Networks
Short Abstract: This is a follow-up paper to Learning to Optimize, in which reinforcement learning was used to learn an optimizer. In this paper, the authors apply this framework to learning optimizers for shallow neural networks.
1. Introduction
The philosophy of machine learning is that, in general, algorithms learned from data perform better than handcrafted algorithms. This idea can also be applied to the algorithms used for learning, specifically optimization algorithms.
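To make that framing concrete, here is a minimal sketch (my own, not code from the paper) of what "the optimizer is a learned policy" means: the policy observes recent optimization history (here, a window of gradients) and outputs the parameter update, replacing the hand-designed rule "-learning_rate * gradient". The names `objective`, `learned_policy`, and `weights` are placeholders I made up; in the paper the policy is a neural network trained with reinforcement learning, not the fixed linear rule used below.

```python
# Sketch of an optimizer-as-policy loop (assumed interface, not the paper's code).
import numpy as np

def objective(theta):
    # Toy quadratic objective standing in for a small neural-net loss.
    return 0.5 * np.sum(theta ** 2), theta  # (value, gradient)

def learned_policy(history, weights):
    # The "optimizer": maps a window of recent gradients to an update step.
    # In the paper this would be a neural network trained with RL;
    # `weights` here is a hypothetical stand-in for its parameters.
    features = np.concatenate(history, axis=0)        # stack recent gradients
    return features @ weights                         # placeholder linear update rule

rng = np.random.default_rng(0)
dim, window = 5, 3
theta = rng.normal(size=dim)
weights = -0.1 * np.tile(np.eye(dim), (window, 1))    # behaves roughly like SGD over the window
history = [np.zeros(dim)] * window

for step in range(50):
    value, grad = objective(theta)
    history = history[1:] + [grad]                    # sliding window of gradients
    theta = theta + learned_policy(history, weights)  # policy replaces "-lr * grad"

print("final objective:", objective(theta)[0])
```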
Mar. 29
Learning to Optimize
Paper Title: Learning to Optimize
Link to Paper: https://arxiv.org/abs/1606.01885
Date: June 6, 2016
Paper Type: Meta-Learning, Gradient Descent, Neural Networks
Short Abstract: Designing algorithms by hand takes time and requires many iterations. This paper focuses on exploring optimization algorithms that are learned rather than handcrafted.
1. Introduction
Our current approach to designing algorithms is time-consuming and difficult. It requires a mix of intuition and theoretical/empirical insight, followed by performance analysis and iterative refinement. Thus, automating this process would be beneficial.
Mar. 29
Learning to learn by gradient descent by gradient descent
Paper Title: Learning to learn by gradient descent by gradient descent
Link to Paper: https://arxiv.org/abs/1606.04474
Date: June 14, 2016
Paper Type: Meta-Learning, Gradient Descent, Neural Networks
Short Abstract: One of the reasons machine learning became so successful is a paradigm shift: instead of building algorithms by hand and finely tuning them, we let the computer learn the algorithm from data.
Optimizers like SGD or Adam, however, are still handcrafted. In this paper, the authors use machine learning to train an optimizer that outperforms these hand-designed ones.
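As a rough illustration (my own sketch under assumptions, not the authors' code), the paper's core move is to replace the hand-designed update "theta -= lr * grad" with a small recurrent network that reads the gradient and proposes the step, applied to each coordinate independently. The RNN below uses random placeholder weights just to show the interface, so it will not actually minimize the loss; in the paper, the optimizer (an LSTM) is trained by backpropagating the optimizee's loss through the unrolled optimization.

```python
# Sketch of a learned, coordinatewise recurrent optimizer (assumed, untrained).
import numpy as np

rng = np.random.default_rng(0)
hidden = 8

# Placeholder parameters of the learned optimizer (a tiny vanilla RNN,
# standing in for the paper's LSTM).
W_in = rng.normal(scale=0.1, size=(hidden, 1))
W_h = rng.normal(scale=0.1, size=(hidden, hidden))
W_out = rng.normal(scale=0.1, size=(1, hidden))

def learned_update(grad, h):
    # Applied to each coordinate independently, so the same small network
    # scales to optimizees with any number of parameters.
    g = grad.reshape(1, -1)              # shape (1, n_params)
    h = np.tanh(W_in @ g + W_h @ h)      # recurrent state, shape (hidden, n_params)
    step = (W_out @ h).reshape(-1)       # one proposed update per coordinate
    return step, h

def loss_and_grad(theta):
    return 0.5 * np.sum(theta ** 2), theta  # toy quadratic optimizee

theta = rng.normal(size=10)
h = np.zeros((hidden, theta.size))
for t in range(100):
    _, grad = loss_and_grad(theta)
    step, h = learned_update(grad, h)
    theta = theta + step                 # update proposed by the optimizer, not "-lr * grad"

print("final loss:", loss_and_grad(theta)[0])
```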