【Datawhale LLM Fundamentals】Chapter 10: Adaptation of Large Language Models

Published: December 23, 2023

Chapter 10: Adaptation of Large Language Models

This post is based on the Datawhale course materials and an excellent survey of large language models.

Following pre-training, LLMs acquire general capabilities for addressing a wide range of tasks. However, a growing body of research indicates that their abilities can be further tailored to specific objectives. This post presents two primary methods for adapting pre-trained LLMs: instruction tuning and alignment tuning. The former primarily seeks to enhance or unlock the capabilities of LLMs, while the latter aims to align their behaviors with human values or preferences. In addition, this post covers efficient tuning and quantization for model adaptation in resource-constrained environments. The topic is broad, so readers who want to study it further can consult the survey.

10.1 Instruction Tuning

Essentially, instruction tuning involves fine-tuning pre-trained LLMs using a set of formatted instances in natural language, which is closely related to supervised fine-tuning and multi-task prompted training. To carry out instruction tuning, the first step is to gather or create instances formatted as instructions. These formatted instances are then used to fine-tune LLMs in a supervised learning manner, such as training with sequence-to-sequence loss. Following instruction tuning, LLMs can exhibit enhanced abilities to generalize to unseen tasks, even in multilingual settings.
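
To make this concrete, here is a minimal sketch of supervised instruction tuning with PyTorch and Hugging Face transformers. The prompt template, the "gpt2" placeholder checkpoint, and the helper build_example are illustrative assumptions rather than part of the original post; the key point is that the formatted instance is tokenized once and the loss is computed only on the output (response) tokens.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Illustrative prompt template in the spirit of common instruction-tuning datasets.
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)

def build_example(tokenizer, instruction: str, output: str, max_len: int = 512):
    """Tokenize one formatted instance; the loss is only taken on the response tokens."""
    prompt = PROMPT_TEMPLATE.format(instruction=instruction)
    prompt_ids = tokenizer(prompt, add_special_tokens=False)["input_ids"]
    output_ids = tokenizer(output + tokenizer.eos_token, add_special_tokens=False)["input_ids"]
    input_ids = (prompt_ids + output_ids)[:max_len]
    labels = ([-100] * len(prompt_ids) + output_ids)[:max_len]  # -100 = ignored by the loss
    return torch.tensor([input_ids]), torch.tensor([labels])

model_name = "gpt2"  # placeholder checkpoint; any causal LM works the same way
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

input_ids, labels = build_example(
    tokenizer,
    instruction="Translate 'hello' into French.",
    output="Bonjour.",
)
loss = model(input_ids=input_ids, labels=labels).loss  # standard sequence-to-sequence (LM) loss
loss.backward()                                        # one supervised fine-tuning step (optimizer omitted)
```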

Instruction tuning covers the following topics:

  • Formatted Instance Construction

    • Formatting NLP Task Datasets
    • Formatting Daily Chat Data
    • Formatting Synthetic Data
    • Key Factors for Instance Construction
      • Scaling the instructions
      • Formatting design
  • Instruction Tuning Strategies

    • Balancing the Data Distribution
    • Combining Instruction Tuning and Pre-Training
    • Multi-stage Instruction Tuning
    • Other Practical Tricks
      • Efficient training for multi-turn chat data (see the sketch after this list)
      • Establishing self-identification for LLM
  • The Effect of Instruction Tuning

    • Performance Improvement
    • Task Generalization
    • Domain Specialization
  • Empirical Analysis for Instruction Tuning

    • Task-specific instructions are better suited for the QA environment but may not be applicable in a chat context
    • Increasing the intricacy and variety of instructions results in enhanced model performance
    • A larger model size results in improved performance in following instructions
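
As an illustration of the "efficient training for multi-turn chat data" trick listed above, the sketch below encodes a whole conversation in a single pass and masks the loss on user turns, so each assistant reply does not need its own training example with a re-encoded prefix. The role tags and the "gpt2" tokenizer are illustrative assumptions.

```python
import torch
from transformers import AutoTokenizer

def encode_chat(tokenizer, turns, max_len: int = 1024):
    """turns: list of (role, text) pairs, with role in {"user", "assistant"}.

    The whole dialogue is encoded once; labels are set to -100 for user turns so
    the loss only covers the assistant replies, instead of re-encoding the shared
    prefix once per assistant turn.
    """
    input_ids, labels = [], []
    for role, text in turns:
        ids = tokenizer(f"{role}: {text}\n", add_special_tokens=False)["input_ids"]
        input_ids += ids
        labels += ids if role == "assistant" else [-100] * len(ids)
    return torch.tensor([input_ids[:max_len]]), torch.tensor([labels[:max_len]])

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # placeholder checkpoint
conversation = [
    ("user", "What is instruction tuning?"),
    ("assistant", "Fine-tuning an LLM on instruction-formatted data."),
    ("user", "Why is it useful?"),
    ("assistant", "It improves generalization to unseen tasks."),
]
input_ids, labels = encode_chat(tokenizer, conversation)
# input_ids / labels can be fed to a causal LM exactly as in the previous sketch.
```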

10.2 Alignment Tuning

This section initially provides an overview of alignment, including its definition and criteria, then delves into the acquisition of human feedback data for aligning LLMs, and ultimately explores the pivotal technique of reinforcement learning from human feedback (RLHF) for alignment tuning.

Alignment tuning covers the following topics:

  • Alignment Criteria

    • Helpfulness
    • Honesty
    • Harmlessness
  • Collecting Human Feedback

    • Human Labeler Selection
    • Human Feedback Collection
      • Ranking-based approach
      • Question-based approach
      • Rule-based approach
  • Reinforcement Learning from Human Feedback
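
RLHF typically proceeds in three steps: supervised fine-tuning, training a reward model on human preference comparisons, and optimizing the LLM against that reward model with an RL algorithm such as PPO. Below is a minimal sketch of the reward-modeling step; the pairwise ranking loss is the standard formulation, while the tiny MLP scoring head is purely illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RewardModel(nn.Module):
    """Toy reward model: maps a pooled response representation to a scalar score.

    In practice this scoring head sits on top of a pre-trained LM; a small MLP is
    used here only to keep the sketch self-contained.
    """
    def __init__(self, hidden_dim: int = 64):
        super().__init__()
        self.score = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim), nn.Tanh(), nn.Linear(hidden_dim, 1)
        )

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        return self.score(features).squeeze(-1)

reward_model = RewardModel()
optimizer = torch.optim.AdamW(reward_model.parameters(), lr=1e-4)

# Dummy pooled features for the human-preferred ("chosen") and dispreferred
# ("rejected") responses to the same batch of prompts.
chosen, rejected = torch.randn(8, 64), torch.randn(8, 64)

# Pairwise ranking loss: push the chosen score above the rejected score.
loss = -F.logsigmoid(reward_model(chosen) - reward_model(rejected)).mean()
loss.backward()
optimizer.step()
```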

10.3 Parameter-Efficient Fine-Tuning

Prior research has devoted significant attention to parameter-efficient fine-tuning, which aims to reduce the number of trainable parameters while keeping performance as close as possible to full fine-tuning.


  • Adapter Tuning
  • Prefix Tuning
  • Prompt Tuning
  • Low-Rank Adaptation (LoRA)
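
As a concrete example of these techniques, LoRA freezes the pre-trained weight matrix W and learns a low-rank update ΔW = BA, so only the two small matrices A and B are trained. Below is a minimal sketch of a LoRA-wrapped linear layer; the rank, scaling factor, and initialization constants are illustrative choices, not prescribed values.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen pre-trained linear layer plus a trainable low-rank update BA."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():          # freeze the pre-trained weights
            p.requires_grad_(False)
        self.lora_A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, rank))  # zero init: no change at start
        self.scale = alpha / rank                 # scaling factor for the low-rank update

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # y = x W^T + (x A^T) B^T * scale, i.e. the effective weight is W + scale * B A
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scale

layer = LoRALinear(nn.Linear(768, 768), rank=8)
out = layer(torch.randn(2, 16, 768))              # only lora_A / lora_B receive gradients
```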

10.4 Memory-Efficient Model Adaptation

Because of the substantial number of model parameters, LLMs require a significant memory footprint for inference, making deployment in real-world applications costly. Model quantization is the main approach discussed here for reducing that footprint.

  • Post-Training Quantization (PTQ) (a minimal sketch follows after this list)
    • Mixed-precision decomposition
    • Fine-grained quantization
    • Balancing the quantization difficulty
    • Layerwise quantization
  • Other Quantization Methods
    • Efficient fine-tuning enhanced quantization
    • Quantization-aware training (QAT) for LLMs
  • Important Findings from Existing Work
    • INT8 weight quantization frequently produces excellent results for LLMs, whereas the effectiveness of lower precision weight quantization relies on specific methods.
    • Quantizing activations is more challenging than quantizing weights
    • Efficient fine-tuning enhanced quantization is a good option to enhance the performance of quantized LLMs
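
To make post-training quantization concrete, the sketch below performs the simplest form of PTQ for weights: symmetric per-output-channel INT8 quantization. Methods such as mixed-precision decomposition and fine-grained (group-wise) quantization refine this basic recipe; the function names and tensor sizes here are illustrative.

```python
import torch

def quantize_per_channel_int8(weight: torch.Tensor):
    """Symmetric per-output-channel INT8 quantization of a 2-D weight matrix."""
    # One scale per output channel (row), chosen so the largest magnitude maps to 127.
    max_abs = weight.abs().amax(dim=1, keepdim=True).clamp(min=1e-8)
    scales = max_abs / 127.0
    q = torch.clamp(torch.round(weight / scales), -128, 127).to(torch.int8)
    return q, scales

def dequantize(q: torch.Tensor, scales: torch.Tensor) -> torch.Tensor:
    return q.to(torch.float32) * scales

w = torch.randn(1024, 1024)            # stands in for a pre-trained weight matrix
q, scales = quantize_per_channel_int8(w)
w_hat = dequantize(q, scales)
print("mean absolute quantization error:", (w - w_hat).abs().mean().item())
```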

Finally, I have collected some surveys on this topic; readers interested in the field can read further:

| Domain | Title | Paper URL | Project URL | Release Month |
| --- | --- | --- | --- | --- |
| Instruction Tuning | Are Prompts All the Story? No. A Comprehensive and Broader View of Instruction Learning | https://arxiv.org/pdf/2303.10475.pdf | https://github.com/RenzeLou/awesome-instruction-learning | 2023.03 |
| Instruction Tuning | Instruction Tuning for Large Language Models: A Survey | https://arxiv.org/pdf/2308.10792.pdf | None | 2023.08 |
| Instruction Tuning | Vision-Language Instruction Tuning: A Review and Analysis | https://arxiv.org/pdf/2311.08172.pdf | https://github.com/palchenli/VL-Instruction-Tuning | 2023.11 |
| Human Alignment for LLM | Aligning Large Language Models with Human: A Survey | https://arxiv.org/pdf/2307.12966.pdf | https://github.com/GaryYufei/AlignLLMHumanSurvey | 2023.07 |
| Human Alignment for LLM | From Instructions to Intrinsic Human Values – A Survey of Alignment Goals for Big Model | https://arxiv.org/pdf/2308.12014.pdf | https://github.com/ValueCompass/Alignment-Goal-Survey | 2023.08 |
| Human Alignment for LLM | Large Language Model Alignment: A Survey | https://arxiv.org/pdf/2309.15025.pdf | None | 2023.09 |
| Human Alignment for LLM | AI Alignment: A Comprehensive Survey | https://arxiv.org/pdf/2310.19852 | https://www.alignmentsurvey.com/ | 2023.10 |
| Efficient LLMs | The Efficiency Spectrum of Large Language Models: An Algorithmic Survey | https://arxiv.org/pdf/2310.10844.pdf | https://github.com/tding1/Efficient-LLM-Survey | 2023.12 |
| Efficient LLMs | Efficient Large Language Models: A Survey | https://arxiv.org/pdf/2312.03863 | https://github.com/AIoT-MLSys-Lab/Efficient-LLMs-Survey | 2023.12 |

END

Source: https://blog.csdn.net/qq_52370024/article/details/135164392