Home
Archives
About
Friend
Tag
Reinforcement Learning
Blog
Diffusion Models
Docker
Graphics
Input
Linux
LLM
Machine Learning
Reinforcement Learning
Web Tech
Blog
Linux
LLM Post Training
Machine Learning
Software
2026
2026-04
LLM Post-Training (5): GRPO and DPO
2026-04
LLM Post-Training (4): RLHF-PPO
2026-04
LLM Post-Training (3): The PPO Algorithm
2026-04
LLM Post-Training (2): Value Functions
2026-04
LLM Post-Training (1): Reinforcement Learning
MIMI
Posts: 13
Categories: 5
Tags: 10