We are happy to share our recent publication:
Self-Improvement for Neural Combinatorial Optimization: Sample Without Replacement, but Improvement
Our latest paper “Self-Improvement for Neural Combinatorial Optimization: Sample Without Replacement, but Improvement” has been published in Transactions on Machine Learning Research (TMLR). This is impressive work by Jonathan Pirnay and was awarded a “Featured” Certification.
What is the paper about?
The emerging field of Neural Combinatorial Optimization uses neural networks to solve complex planning problems by learning patterns and strategies from data. Typically, these neural networks are trained either by supervised learning from expert data, which is often unavailable, or by reinforcement learning, which can be unstable and computationally inefficient.
In this work, we bridge these approaches with a novel strategy that trains the network on “pseudo-expert” output it generates itself, without any external guidance. This aligns with the recent success of self-improvement methods in reasoning language models, and it enables the training of deep transformer models in settings where reinforcement learning is too slow. While the paper presents our method from a general machine learning perspective, we are already applying this approach successfully to various problems in chemistry, chemical engineering, and genomics. Stay tuned for exciting papers to come!
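To make the idea concrete, here is a minimal, self-contained sketch of such a self-improvement loop on a toy problem. This is our illustration under simplifying assumptions, not the code from the paper: the ToyPolicy, the toy objective, and all hyperparameters are invented for this example, and for brevity the candidates are sampled independently rather than without replacement as in the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

n = 8  # toy problem size: build a permutation of n items step by step

class ToyPolicy(nn.Module):
    """Scores the n items at each construction step from a 0/1 mask of the
    items already chosen (a tiny stand-in for a real transformer)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(n, 64), nn.ReLU(), nn.Linear(64, n))

    def forward(self, chosen_mask):
        logits = self.net(chosen_mask)
        # Forbid re-selecting items that are already part of the solution.
        return logits.masked_fill(chosen_mask.bool(), float("-inf"))

def sample_solution(policy):
    """Autoregressively sample one complete permutation from the policy."""
    mask = torch.zeros(n)
    actions = []
    with torch.no_grad():
        for _ in range(n):
            probs = F.softmax(policy(mask), dim=-1)
            a = torch.multinomial(probs, 1).item()
            actions.append(a)
            mask[a] = 1.0
    return actions

def objective(actions, weights):
    """Toy cost to minimize: weighted position of each item."""
    return sum(weights[a].item() * pos for pos, a in enumerate(actions))

policy = ToyPolicy()
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)
weights = torch.rand(n)  # defines the (random) toy problem instance

for step in range(200):
    # 1) Sample a pool of candidate solutions from the current policy.
    #    (The paper samples *without* replacement via stochastic beam
    #    search; independent sampling is used here for brevity.)
    candidates = [sample_solution(policy) for _ in range(16)]

    # 2) Keep the best candidate as the "pseudo-expert" solution.
    best = min(candidates, key=lambda sol: objective(sol, weights))

    # 3) Imitate the pseudo-expert with a supervised cross-entropy loss,
    #    exactly as if it had come from an external expert.
    mask = torch.zeros(n)
    loss = torch.tensor(0.0)
    for a in best:
        logits = policy(mask)
        loss = loss + F.cross_entropy(logits.unsqueeze(0), torch.tensor([a]))
        mask = mask.clone()  # avoid in-place edits to a tensor used above
        mask[a] = 1.0
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The key point is step 3: the network's own best sample is treated like expert data and imitated with a plain supervised loss, avoiding the high-variance gradient estimates of reinforcement learning.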
Original Publication (Open Access)
Pirnay, J. & Grimm, D. G. (2024). Self-Improvement for Neural Combinatorial Optimization: Sample Without Replacement, but Improvement. Transactions on Machine Learning Research (TMLR). https://openreview.net/forum?id=agT8ojoH0X