Offered By: IBM Skills Network
Advanced Fine-Tuning for Large Language Models (LLMs)
Fine-tune Large Language Models (LLMs) to enhance AI accuracy and optimize performance with cutting-edge skills employers seek
Course
Artificial Intelligence
At a Glance
- In-demand generative AI engineering skills in fine-tuning LLMs that employers are actively looking for, in just 2 weeks
- Instruction-tuning and reward modeling with Hugging Face, plus LLMs as policies and reinforcement learning from human feedback (RLHF); a brief instruction-tuning sketch follows this list
- Direct preference optimization (DPO) with the partition function and Hugging Face, and how to derive the optimal solution to a DPO problem; a DPO sketch also follows this list
- How to use proximal policy optimization (PPO) with Hugging Face to create a scoring function and perform dataset tokenization
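
To give a feel for the hands-on labs, here is a minimal instruction-tuning (supervised fine-tuning) sketch using the Hugging Face TRL library. It is not the course's lab code: the model and dataset names are illustrative choices, and the `SFTTrainer`/`SFTConfig` usage shown reflects recent TRL releases.

```python
# Minimal instruction-tuning (SFT) sketch with Hugging Face TRL.
# Assumptions: recent TRL API; model and dataset below are illustrative only.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# A public chat-style dataset of instruction/response conversations.
train_dataset = load_dataset("trl-lib/Capybara", split="train[:1%]")

training_args = SFTConfig(
    output_dir="sft-sketch",
    per_device_train_batch_size=2,
    num_train_epochs=1,
)

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-0.5B",  # recent TRL accepts a model name or a loaded model
    args=training_args,
    train_dataset=train_dataset,
)
trainer.train()
```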
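Similarly, a hedged sketch of direct preference optimization with TRL's `DPOTrainer` is shown below. The base model, preference dataset, and hyperparameters are again illustrative, not taken from the course.

```python
# Minimal DPO sketch with Hugging Face TRL (not the course's lab code).
# Assumptions: an instruct model with a chat template and a public preference
# dataset of chosen vs. rejected responses; names below are illustrative.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

model_name = "Qwen/Qwen2-0.5B-Instruct"  # illustrative small instruct model
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Preference pairs: each example holds a preferred ("chosen") and a dispreferred
# ("rejected") response; DPO trains the model to widen the margin between them.
train_dataset = load_dataset("trl-lib/ultrafeedback_binarized", split="train[:1%]")

training_args = DPOConfig(
    output_dir="dpo-sketch",
    beta=0.1,  # strength of the implicit KL penalty toward the reference policy
    per_device_train_batch_size=2,
    num_train_epochs=1,
    logging_steps=10,
)

trainer = DPOTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    processing_class=tokenizer,  # recent TRL releases; older ones take tokenizer=
)
trainer.train()
```
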
Course Syllabus
Module 0: Welcome
- Video: Course Introduction
- Specialization Overview
- General Information
- Learning Objectives and Syllabus
- Helpful Tips for Course Completion
- Grading Scheme
Module 1: Different Approaches to Fine-Tuning
- Reading: Module Introduction and Learning Objectives
- Video: Basics of Instruction-Tuning
- Video: Instruction-Tuning with Hugging Face
- Reading: Instruction-Tuning
- Lab: Instruction Fine-Tuning LLMs
- Video: Reward Modeling: Response Evaluation
- Video: Reward Model Training
- Video: Reward Modeling with Hugging Face
- Reading: Reward Modeling & Response Evaluation
- Lab: Reward Modeling
- Practice Quiz: Different Approaches to Fine-Tuning
- Reading: Module Summary and Highlights
- Graded Quiz: Different Approaches to Fine-Tuning
Module 2: Fine-Tuning Causal LLMs with Human Feedback and Direct Preference
- Reading: Module Introduction and Learning Objectives
- Video: Large Language Models (LLMs) as Distributions
- Video: From Distributions to Policies
- Video: Reinforcement Learning from Human Feedback (RLHF)
- Video: Proximal Policy Optimization (PPO)
- Video: PPO with Hugging Face
- Video: PPO Trainer
- Lab: Reinforcement Learning from Human Feedback Using PPO
- Video: DPO – Partition Function
- Video: DPO – Optimal Solution
- Video: From Optimal Policy to DPO
- Video: DPO with Hugging Face
- Lab: Direct Preference Optimization (DPO) Using Hugging Face
- Reading: Fine-Tune LLMs Locally with InstructLab
- Reading: Module Summary and Highlights
- Practice Quiz: Fine-Tuning Causal LLMs with Human Feedback and Direct Preference
- Graded Quiz: Fine-Tuning Causal LLMs with Human Feedback and Direct Preference
- Reading: Cheat Sheet – Generative AI Advanced Fine-Tuning for LLMs
- Reading: Glossary – Generative AI Advanced Fine-Tuning for LLMs
- Reading: Course Conclusion
- Reading: Congratulations and Next Steps
- Reading: Teams and Acknowledgments
- Copyright and Trademarks
Recommended Skills Prior to Taking this Course
Estimated Effort
8 Hours
Level
Intermediate
Skills You Will Learn
Direct Preference Optimization (DPO), Hugging Face, Instruction-tuning, Proximal Policy Optimization (PPO), Reinforcement Learning
Language
English
Course Code
AI0212EN