New Paper: Policy-Based Self-Competition for Planning Problems

New paper at the International Conference on Learning Representations (ICLR): “Policy-Based Self-Competition for Planning Problems”. AlphaZero-type algorithms may stop improving on single-player tasks if the value network guiding the tree search cannot approximate the outcome of an episode sufficiently well. One technique to address this problem is to transform the single-player task through self-competition. The main idea is to compute a scalar baseline from the agent’s historical performance and to reshape an episode’s reward into a binary output indicating whether the baseline has been exceeded. However, this baseline carries only limited information for the agent about how to improve. We leverage the idea of self-competition and directly incorporate a historical policy into the planning process instead of its scalar performance. Based on the recently introduced Gumbel AlphaZero (GAZ), we propose our algorithm GAZ ‘Play-to-Plan’ (GAZ PTP), in which the agent learns to find strong trajectories by planning against possible strategies of its past self. We show the effectiveness of our approach on two well-known combinatorial optimization problems, the Traveling Salesman Problem and the Job-Shop Scheduling Problem. With only half of the simulation budget for search, GAZ PTP consistently outperforms all selected single-player variants of GAZ.
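The scalar-baseline form of self-competition described above can be sketched in a few lines. This is only an illustration of the reward reshaping, not the GAZ PTP algorithm and not code from the paper: the exponential moving average, the `decay` value, the +1/-1 encoding, the convention that a higher outcome is better, and the names `ScalarBaseline` and `binary_reward` are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScalarBaseline:
    """Running scalar baseline over past episode outcomes (hypothetical helper)."""

    decay: float = 0.9             # assumed smoothing factor, not taken from the paper
    value: Optional[float] = None  # stays None until the first episode has been seen

    def update(self, outcome: float) -> None:
        """Fold a new episode outcome into the exponential moving average."""
        if self.value is None:
            self.value = outcome
        else:
            self.value = self.decay * self.value + (1.0 - self.decay) * outcome


def binary_reward(outcome: float, baseline: float) -> int:
    """Return +1 if the outcome exceeds the baseline, otherwise -1 (assumed encoding)."""
    return 1 if outcome > baseline else -1


# Usage: outcomes are assumed to be "higher is better" (e.g. negative tour length).
baseline = ScalarBaseline(decay=0.5)
rewards = []
for outcome in [10.0, 12.0, 11.0, 14.0]:
    if baseline.value is not None:
        # Compare against the baseline built from earlier episodes only.
        rewards.append(binary_reward(outcome, baseline.value))
    baseline.update(outcome)
```

The agent here only learns whether it beat a single number, which is the limitation the paper points to: the baseline says nothing about which decisions led to the better or worse outcome.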

New Paper: ForeTiS – a comprehensive time series forecasting framework in Python

New paper in Machine Learning with Applications: “ForeTiS: A comprehensive time series forecasting framework in Python”. Time series forecasting is a research area with applications in various domains, yet no predominant method has emerged so far. We present ForeTiS, a comprehensive and open-source Python framework that allows rigorous training, comparison, and analysis of state-of-the-art time series forecasting approaches. Our framework includes fully automated yet configurable data preprocessing and feature engineering. In addition, we use advanced Bayesian optimization for automatic hyperparameter search. ForeTiS is easy to use, even for non-programmers, requiring only a single line of code to apply state-of-the-art time series forecasting. Various prediction models, ranging from classical forecasting approaches to machine learning techniques and deep learning architectures, are already integrated. More importantly, as a key benefit for researchers aiming to develop new forecasting models, ForeTiS is designed to allow for rapid integration and fair benchmarking in a reliable framework. Thus, we provide a powerful framework for both end users and forecasting experts.

Josef joins the Team as Research Assistant

Josef joins the team as a research assistant. He will work on novel machine learning methods for time series forecasting within the project “Digital management support systems for small and medium-sized enterprises in value chains of ornamental plants, perennials and cut flowers (PlantGrid)”, funded by the Federal Office of Food and Agriculture.