This blog post is based on Datawhale materials and an excellent survey.
Following pre-training, LLMs acquire general capabilities for addressing a wide range of tasks. However, a growing body of research shows that these abilities can be further tailored to specific objectives. This blog presents two primary methods for adapting pre-trained LLMs: instruction tuning and alignment tuning. The former primarily seeks to enhance or unlock the capabilities of LLMs, while the latter aims to align their behaviors with human values or preferences. In addition, this blog covers efficient tuning and quantization for model adaptation in resource-constrained environments. These topics are broad, so readers who want to study them in more depth should consult the survey.
Essentially, instruction tuning involves fine-tuning pre-trained LLMs using a set of formatted instances in natural language, which is closely related to supervised fine-tuning and multi-task prompted training. To carry out instruction tuning, the first step is to gather or create instances formatted as instructions. These formatted instances are then used to fine-tune LLMs in a supervised learning manner, such as training with sequence-to-sequence loss. Following instruction tuning, LLMs can exhibit enhanced abilities to generalize to unseen tasks, even in multilingual settings.
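To make the procedure concrete, below is a minimal sketch of this supervised objective, assuming a Hugging Face causal LM; the model name and the instruction instance are only illustrative, and a real run would loop over many instances with an optimizer.

```python
# A minimal sketch of instruction tuning with a sequence-to-sequence style loss,
# assuming a Hugging Face causal LM (the checkpoint and data are illustrative).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; any causal LM checkpoint works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# One formatted instance: an instruction (optionally with an input) plus the desired output.
instruction = "Translate the following sentence to French."
task_input = "The weather is nice today."
output = "Il fait beau aujourd'hui."
prompt = f"Instruction: {instruction}\nInput: {task_input}\nResponse: "

# Tokenize the prompt and the full sequence; mask the prompt tokens so the loss
# is computed only on the response (the usual supervised fine-tuning objective).
prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
full_ids = tokenizer(prompt + output, return_tensors="pt").input_ids
labels = full_ids.clone()
labels[:, : prompt_ids.shape[1]] = -100  # ignore prompt positions in the loss

loss = model(input_ids=full_ids, labels=labels).loss
loss.backward()  # in practice, wrap this in an optimizer loop over many instances
```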
Instruction tuning covers:

- Formatted Instance Construction
- Instruction Tuning Strategies
- The Effect of Instruction Tuning
- Empirical Analysis for Instruction Tuning
This section first provides an overview of alignment, including its definition and criteria, then discusses how human feedback data are collected for aligning LLMs, and finally presents the key technique of reinforcement learning from human feedback (RLHF) for alignment tuning; a small sketch of the reward-model objective used in RLHF follows the list below.

Alignment tuning covers:

- Alignment Criteria
- Collecting Human Feedback
- Reinforcement Learning from Human Feedback
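As a concrete illustration of one stage of the RLHF pipeline, here is a minimal sketch of the pairwise (Bradley-Terry style) loss commonly used to train the reward model on human preference data; the backbone model and the example preference pair are placeholders, not the setup of any particular paper.

```python
# A minimal sketch of the pairwise reward-model objective used in RLHF,
# assuming a Hugging Face sequence-classification head as the reward model
# (the backbone and the example preference pair are illustrative).
import torch
import torch.nn.functional as F
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "distilbert-base-uncased"  # placeholder reward-model backbone
tokenizer = AutoTokenizer.from_pretrained(model_name)
reward_model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=1)

prompt = "Explain why the sky is blue."
chosen = prompt + " Sunlight is scattered by air molecules, and blue light scatters the most."
rejected = prompt + " Because it just is."

def score(text):
    # The reward model maps a prompt-response pair to a scalar reward.
    ids = tokenizer(text, return_tensors="pt", truncation=True)
    return reward_model(**ids).logits.squeeze(-1)

# Bradley-Terry style loss: the human-preferred response should get a higher reward.
loss = -F.logsigmoid(score(chosen) - score(rejected)).mean()
loss.backward()  # in practice, iterate over a dataset of human preference pairs
```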
Prior research has paid significant attention to parameter-efficient fine-tuning, which seeks to reduce the number of trainable parameters while keeping performance as close as possible to that of full fine-tuning.
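As an illustration, below is a minimal sketch of one popular parameter-efficient method, LoRA, using the Hugging Face PEFT library; the checkpoint, rank, and target modules are illustrative choices rather than recommended settings.

```python
# A minimal sketch of parameter-efficient fine-tuning with LoRA via the PEFT library,
# assuming a GPT-2 checkpoint (rank and target modules are illustrative).
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("gpt2")

# Freeze the base model and inject small low-rank adapter matrices into the
# attention projections; only these adapters receive gradient updates.
lora_config = LoraConfig(
    r=8,                        # rank of the low-rank update
    lora_alpha=16,              # scaling factor for the update
    target_modules=["c_attn"],  # GPT-2's fused attention projection layer
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of all parameters
```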
Because of their enormous number of parameters, LLMs have a large memory footprint at inference time, which makes real-world deployment costly. Model quantization addresses this by storing weights (and sometimes activations) in lower-precision formats.
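As an illustration, here is a minimal sketch of loading a model with 4-bit weight quantization through the transformers + bitsandbytes integration; the checkpoint and quantization settings are placeholders, and running it requires a GPU with bitsandbytes installed.

```python
# A minimal sketch of post-training quantization for inference, assuming the
# Hugging Face transformers + bitsandbytes integration (checkpoint is illustrative).
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # store weights in 4-bit precision
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16 for stability
)

# Loading the weights in 4-bit roughly quarters the memory footprint
# compared with 16-bit weights, at some cost in accuracy.
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-1.3b",  # placeholder checkpoint
    quantization_config=bnb_config,
    device_map="auto",
)
```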
Finally, I have collected several surveys on these topics; readers interested in this field can read them for further study:
| Domain | Title | Paper URL | Project URL | Release Month |
| --- | --- | --- | --- | --- |
| Instruction Tuning | Is Prompt All You Need? No. A Comprehensive and Broader View of Instruction Learning | https://arxiv.org/pdf/2303.10475.pdf | https://github.com/RenzeLou/awesome-instruction-learning | 2023.03 |
| Instruction Tuning | Instruction Tuning for Large Language Models: A Survey | https://arxiv.org/pdf/2308.10792.pdf | None | 2023.08 |
| Instruction Tuning | Vision-Language Instruction Tuning: A Review and Analysis | https://arxiv.org/pdf/2311.08172.pdf | https://github.com/palchenli/VL-Instruction-Tuning | 2023.11 |
| Human Alignment for LLM | Aligning Large Language Models with Human: A Survey | https://arxiv.org/pdf/2307.12966.pdf | https://github.com/GaryYufei/AlignLLMHumanSurvey | 2023.07 |
| Human Alignment for LLM | From Instructions to Intrinsic Human Values – A Survey of Alignment Goals for Big Models | https://arxiv.org/pdf/2308.12014.pdf | https://github.com/ValueCompass/Alignment-Goal-Survey | 2023.08 |
| Human Alignment for LLM | Large Language Model Alignment: A Survey | https://arxiv.org/pdf/2309.15025.pdf | None | 2023.09 |
| Human Alignment for LLM | AI Alignment: A Comprehensive Survey | https://arxiv.org/pdf/2310.19852 | https://www.alignmentsurvey.com/ | 2023.10 |
| Efficient LLMs | The Efficiency Spectrum of Large Language Models: An Algorithmic Survey | https://arxiv.org/pdf/2310.10844.pdf | https://github.com/tding1/Efficient-LLM-Survey | 2023.12 |
| Efficient LLMs | Efficient Large Language Models: A Survey | https://arxiv.org/pdf/2312.03863 | https://github.com/AIoT-MLSys-Lab/Efficient-LLMs-Survey | 2023.12 |