Goku Technologies recently published a paper on Step-wise Adaptive Integration of Supervised fine-tuning and Reinforcement Learning (SASR), which is currently available on arXiv and will be submitted to NeurIPS. SASR is the next step in the evolution of training methods for LLMs. Early LLMs relied on supervised fine-tuning (SFT), while DeepSeek separated SFT and reinforcement learning (RL) into distinct stages. SASR is a new framework that uses an adaptive algorithm to dynamically adjust the relative training weights of SFT and RL throughout a single continuous training process, which further enhances the efficiency of LLMs.
AI has been integral to Goku’s investment process since 2018. While this paper is not directly focused on quantitative investing, our R&D deepens our understanding of how we can better use tools such as LLMs to further improve risk-adjusted returns for our investors.
You can find the paper at the link below.