Slides tagged RLHF | Docswell

Slides tagged #RLHF

[Introduction to Large Language Models II] Chapter 12, Section 12.1

Kyoto University Artificial Intelligence Research Society KaiRA 1K

[DL Reading Group] Understanding the performance gap between online and offline alignment algorithms

Deep Learning JP 2.6K

[DL Reading Group] Alignment Algorithms for Diffusion Models

Deep Learning JP 7.7K

[Introduction to Large Language Models] Chapter 4, Sections 4.4-4.6

Kyoto University Artificial Intelligence Research Society KaiRA 2.6K

[DL Reading Group] Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback

Deep Learning JP 11.8K

#RLHF

#大規模言語モデル (large language models)

#DPO