(yet) Study on Preference-based Learning
12 Feb 2024
< Table of Contents >
tmp
tmp
Reference
- Papers
- PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via Relabeling Experience and Unsupervised Pre-training
- Learning Multimodal Rewards from Rankings
- Reward Uncertainty for Exploration in Preference-based Reinforcement Learning
- Few-Shot Preference Learning for Human-in-the-Loop RL
- Inverse Preference Learning: Preference-based RL without a Reward Function
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model
- Lectures