A pure MLX-based training pipeline for fine-tuning LLMs with GRPO (Group Relative Policy Optimization) on Apple Silicon.