Boosting Efficiency: A Deep Dive into Medical Model Fine-Tuning with Unsloth, TRL, and PEFT

On my journey to make my language model better at medical tasks, I am fine-tuning the OpenChat3.5 model. To get there, I rely on a few powerful libraries: Unsloth, PEFT, and TRL.

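Unsloth, our first companion, is more than a library: it is a toolkit that bundles a number of features. It supports Direct Preference Optimization (DPO), integrates with Hugging Face, and is compatible with models such as Llama, Yi, and Mistral, while aiming for no loss of accuracy. It is designed to run on Linux and Windows with NVIDIA GPUs and to manage hardware resources efficiently. The library also offers faster model downloads from Hugging Face and supports both 4-bit and 16-bit fine-tuning. Through Llama-Factory, a user-friendly fine-tuning UI, and an open-source version that is reported to train 5x faster than the original, Unsloth is a versatile tool for machine learning enthusiasts.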

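Now, let's talk about PEFT: Parameter-Efficient Fine-Tuning. This technique is a game changer for adapting pretrained language models such as OpenChat3.5 to different applications. Unlike traditional fine-tuning, which updates every model parameter, PEFT trains only a small subset of parameters, which greatly reduces the compute cost. State-of-the-art PEFT methods reach performance comparable to full fine-tuning, making PEFT a resource-efficient choice that saves significant time and hardware.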
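
To make "training only a small subset" concrete, here is a minimal pure-Python sketch of LoRA, the PEFT method configured later in this article. It is an illustration under stated assumptions, not the PEFT library's implementation: the helper names (`matvec`, `lora_forward`, `lora_param_count`) and the toy dimensions are made up for this example. The frozen weight `W` is left untouched, and only the two small matrices `A` and `B` would be trained.

```python
import random

def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]

def lora_forward(W, A, B, x, alpha, r):
    """Compute y = W x + (alpha / r) * B (A x). W stays frozen; A and B are trainable."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]

def lora_param_count(d_in, d_out, r):
    """Trainable parameters added by one LoRA pair: A is r x d_in, B is d_out x r."""
    return r * (d_in + d_out)

# Toy dimensions, chosen only for illustration.
random.seed(0)
d_out, d_in, r, alpha = 6, 8, 2, 2
W = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]  # frozen
A = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]   # trainable
B = [[0.0] * r for _ in range(d_out)]                                  # trainable, zero-initialised
x = [1.0] * d_in

# With B initialised to zero, the adapted layer starts out identical to the base layer.
print(lora_forward(W, A, B, x, alpha, r) == matvec(W, x))

# For a 4096 x 4096 projection with r = 16, compare full and LoRA parameter counts.
print(4096 * 4096, lora_param_count(4096, 4096, 16))
```

For a single 4096 x 4096 projection, the LoRA pair at rank 16 has 131,072 parameters against 16,777,216 in the full matrix, which is under 1%.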

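Finally, we have TRL: Transformer Reinforcement Learning. TRL is a full-stack library that provides reinforcement learning tools for training transformer language models and stable diffusion models. From Supervised Fine-Tuning (SFT) to Reward Modeling (RM) and Proximal Policy Optimization (PPO), TRL covers the whole pipeline. It is built on top of the Hugging Face transformers library and supports most decoder and encoder-decoder architectures.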

Together, these libraries form a powerful trio, each making its own contribution to refining OpenChat3.5 for medical language tasks.

Before you start writing code, make sure you complete the following installation step:

%%capture
# Running on Kaggle. The %%capture cell magic must be the first line of the cell.
!mamba install cudatoolkit xformers bitsandbytes pytorch pytorch-cuda=11.8 \
    -c pytorch -c nvidia -c xformers -c conda-forge -y
!pip install "unsloth[kaggle] @ git+https://github.com/unslothai/unsloth.git"
!pip uninstall datasets -y
!pip install datasets

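Set up the base model: use the second code snippet to import the necessary libraries and configure your base model. Think of this step as readying your ship, the OpenChat3.5 model, for the voyage.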

from unsloth import FastMistralModel
import torch

#Define your hugging face token
hf_token = 'add_here_secret_hf_token'
max_seq_length = 4096 
dtype = None # None for auto detection. Float16 for Tesla T4, V100, Bfloat16 for Ampere+
load_in_4bit = True # Use 4bit quantization to reduce memory usage. Can be False.


#Load the model and tokenizer
model, tokenizer = FastMistralModel.from_pretrained(
    model_name = "openchat/openchat_3.5", # You can change this to any Llama model!
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
    token = hf_token, # use one if using gated models like meta-llama/Llama-2-7b-hf
)
model.config.use_cache = False
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"


#Get the peft model
model = FastMistralModel.get_peft_model(
    model,
    r = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",],
    lora_alpha = 16,
    lora_dropout = 0, # Currently only supports dropout = 0
    bias = "none",    # Currently only supports bias = "none"
    use_gradient_checkpointing = True,
    random_state = 3407,
    max_seq_length = max_seq_length,
)
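
To get a feel for how small the trainable subset is with the settings above (r = 16 on all seven projection modules), the adapter parameters can be counted by hand. This is a rough sketch, not output from Unsloth or PEFT. The layer dimensions and the base parameter count are assumptions taken from the Mistral-7B architecture that OpenChat3.5 builds on; the article itself does not state them.

```python
# Assumed Mistral-7B dimensions (not stated in the article).
HIDDEN = 4096          # hidden size
KV_DIM = 1024          # key/value projection width under grouped-query attention
INTERMEDIATE = 14336   # MLP intermediate size
NUM_LAYERS = 32
BASE_PARAMS = 7_241_732_096  # assumed total parameter count of the base model

# (input width, output width) of each module listed in target_modules.
MODULE_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}

def lora_params(d_in, d_out, r):
    """One LoRA pair adds an r x d_in matrix and a d_out x r matrix."""
    return r * (d_in + d_out)

def total_lora_params(r=16):
    """Sum the adapter parameters over all target modules and all layers."""
    per_layer = sum(lora_params(d_in, d_out, r) for d_in, d_out in MODULE_SHAPES.values())
    return per_layer * NUM_LAYERS

trainable = total_lora_params(r=16)
print(f"trainable LoRA parameters: {trainable:,}")
print(f"share of the base model: {trainable / BASE_PARAMS:.2%}")
```

Under these assumptions the adapters add roughly 42 million trainable parameters, well under 1% of the base model.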

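Load and format the dataset: use the third code snippet to load the Llama2-MedTuned-Instructions dataset. This is like gathering all the maps and navigation tools needed for the journey.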

# load the dataset
from datasets import load_dataset
dataset = load_dataset("nlpie/Llama2-MedTuned-Instructions",token = hf_token)

# Split the dataset into train and test sets
train_test_split = dataset["train"].train_test_split(test_size=0.1, shuffle=True)

# Get the train and test sets
train_set = train_test_split["train"]
test_set = train_test_split["test"]


def formatting_func(example):
    formatted_messages = []

    # Assuming 'instruction', 'input', and 'output' are keys in your example
    instruction = example['instruction']
    formatted_messages.append({"role": "assistant", "content": instruction})

    input_data = example['input']
    formatted_messages.append({"role": "user", "content": input_data})

    output_data = example['output']
    formatted_messages.append({"role": "assistant", "content": output_data})

    return {"text": "\n".join([str(msg) for msg in formatted_messages])}

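Format the dataset: run the code below to apply the formatting function. This step is essential for clear navigation: it gives the model a consistent structure of instruction, input, and output.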

train_sft = train_set.map(formatting_func)
test_sft = test_set.map(formatting_func)

# Example usage
formatted_example = formatting_func(train_set[0])
print(formatted_example['text'])

{'role': 'user', 'content': 'you are a docter...'}
{'role': 'assistant', 'content':'i have pain in my head. ):'}

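Set the parameters: adjust the learning rate, batch size, and other parameters as shown in the code. Think of it as charting a course for your ship across a vast ocean of data.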

per_device_train_batch_size = 4
gradient_accumulation_steps = 4
learning_rate = 2e-4
weight_decay = 0.01
max_steps = 500
warmup_steps = 10
lr_scheduler_type = "linear"
optimizer = "adamw_8bit"
use_gradient_checkpointing = True
random_state = 3407
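
To see what these numbers imply, the sketch below works out the effective batch size and traces a linear learning-rate schedule with warmup. It is a hand-written approximation of the usual linear-with-warmup rule, not code taken from transformers; the function name `linear_lr` is hypothetical, and the sample count assumes a single GPU.

```python
def linear_lr(step, base_lr=2e-4, warmup_steps=10, max_steps=500):
    """Linear warmup from 0 to base_lr, then linear decay to 0 at max_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = (max_steps - step) / max(1, max_steps - warmup_steps)
    return base_lr * max(0.0, remaining)

# Values taken from the settings above.
per_device_train_batch_size = 4
gradient_accumulation_steps = 4
max_steps = 500

# Sequences contributing to each optimizer step (single-GPU assumption).
effective_batch = per_device_train_batch_size * gradient_accumulation_steps
# Sequences processed over the whole run.
sequences_seen = effective_batch * max_steps

print("effective batch size:", effective_batch)
print("sequences seen in total:", sequences_seen)
for step in (0, 5, 10, 255, 500):
    print(step, linear_lr(step))
```

The rate climbs to its peak of 2e-4 at step 10 and then decays linearly to zero at step 500.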

Begin the journey: finally, run the code to start the fine-tuning process.

from trl import SFTTrainer
from transformers import TrainingArguments
import torch

# True on GPUs with bfloat16 support (Ampere and newer); otherwise fall back to fp16
HAS_BFLOAT16 = torch.cuda.is_bf16_supported()

argument = TrainingArguments(
        per_device_train_batch_size = 4,
        gradient_accumulation_steps = 4,
        warmup_steps = warmup_steps,
        max_steps = 500,
        learning_rate = learning_rate,
        fp16 = not HAS_BFLOAT16,
        bf16 = HAS_BFLOAT16,
        logging_steps = 10,
        output_dir = "outputs",
        optim = optimizer,
        report_to = "none",
        weight_decay = weight_decay,
        lr_scheduler_type = lr_scheduler_type,
        seed = random_state,
    )
trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer, # reuse the tokenizer configured above (pad token, padding side)
    train_dataset=train_sft,
    eval_dataset=test_sft,
    dataset_text_field = "text",
    max_seq_length = max_seq_length,
    args = argument,
)

trainer_stats = trainer.train()

The model was trained for 2 hours 39 minutes on a T4 GPU, with 18,000 training samples and 2,000 test samples, using a context length of 4096.

Here is the link to the model: Imran1/MedChat3.5.

Model inference
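
The article does not show how the prompts below were passed to the model. The replies all start with "GPT4 Correct Assistant:" and end with "<|end_of_turn|>", which matches OpenChat3.5's single-turn chat template, so one plausible way to build a prompt and clean up a reply is sketched here. Treat it as an assumption rather than the author's actual inference code; the helper names `build_openchat_prompt` and `extract_reply` are hypothetical.

```python
END_OF_TURN = "<|end_of_turn|>"
ASSISTANT_TAG = "GPT4 Correct Assistant:"

def build_openchat_prompt(user_message: str) -> str:
    """Wrap one user message in the single-turn OpenChat3.5 template (assumed here)."""
    return f"GPT4 Correct User: {user_message}{END_OF_TURN}{ASSISTANT_TAG}"

def extract_reply(generated_text: str) -> str:
    """Keep only the text after the last assistant tag and before the end-of-turn marker."""
    reply = generated_text.rsplit(ASSISTANT_TAG, 1)[-1]
    return reply.split(END_OF_TURN, 1)[0].strip()

prompt = build_openchat_prompt("What role does gastric acid play in digestion?")
print(prompt)

# A made-up generation, only to show the post-processing step.
fake_generation = prompt + " It helps break down food." + END_OF_TURN
print(extract_reply(fake_generation))
```

The prompt string would then be tokenized and passed to the model's generation call in the usual way.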

{'role': 'user', 'content': 'you are stomach specialist.'}
{'role': 'assistant', 'content':'What role does gastric acid play in the process of digestion, and how does the stomach regulate its secretion to maintain a healthy digestive environment?'}

GPT4 Correct Assistant:
Gastric acid plays a crucial role in the process of digestion by breaking down food into its basic components. It is secreted by the cells lining the stomach, known as parietal cells, in response to the presence of food in the stomach.
The stomach regulates the secretion of gastric acid through a series of mechanisms that maintain a healthy digestive environment. The primary mechanism is the release of gastrin, a hormone produced by the stomach's G-cells in response to the presence of food. Gastrin stimulates the parietal cells to secrete gastric acid, which in turn aids in the breakdown of food.
The stomach also regulates the secretion of gastric acid through the release of histamine, which is produced by the ECL cells in response to the presence of food. Histamine acts on the parietal cells to stimulate gastric acid secretion.
Another mechanism involves the production of intrinsic factor, a protein produced by the stomach's mucous cells. Intrinsic factor is essential for the absorption of vitamin B12 in the small intestine. The production of intrinsic factor is regulated by gastric acid, which helps maintain a healthy balance of this essential nutrient.
Additionally, the stomach regulates the secretion of gastric acid through the release of somatostatin, a hormone produced by the D-cells of the stomach. Somatostatin inhibits gastric acid secretion, helping to maintain a healthy balance between acid production and neutralization.
In summary, the stomach regulates the secretion of gastric acid through a series of mechanisms that maintain a healthy digestive environment. These mechanisms include the release of gastrin, histamine, and intrinsic factor, as well as the release of somatostatin. By maintaining a balance between acid production and neutralization, the stomach ensures that the digestive environment remains conducive to proper digestion and absorption of nutrients.<|end_of_turn|>

{'role': 'user', 'content': 'Your role as a doctor requires you to answer the medical questions taking into account the patients description. Analyze the question and answer with the best option.'}
{'role': 'user', 'assistant': 'Which of the following is most sensitive to radiation Options: A. Stem cells B. Skin C. Lymphocyte D. Bone'}
#Output

#GPT4 Correct Assistant:
#The answer is option A: Stem cells

#Explanation:
#Radiation therapy is the use of ionizing radiation to treat various forms of cancer. The effectiveness of radiation therapy depends on the sensitivity of the target cells to radiation. Stem cells are particularly sensitive to radiation and are used as a target for radiation therapy.

#Skin, lymphocytes, and bone are also affected by radiation, but to a lesser extent than stem cells. The skin may develop erythema, dryness, and desquamation (peeling) as a result of radiation exposure. Lymphocytes are affected by radiation, but they are less sensitive than stem cells. Bone is also affected by radiation, but it is less sensitive than other tissues.

#Therefore, stem cells are the most sensitive to radiation among the given options.<|end_of_turn|>'

As a kidney specialist

{'role': 'user', 'content': '  your are a kidney specialist docter.'}
{'role': 'user', 'assistant': 'i have pain in my kidney what should i need to do?'}


#GPT4 Correct Assistant:

#Hi, and welcome to Chat Doctor.

#I understand your concern.

#Based on the information provided, it seems that you might have kidney stones.

#Kidney stones are hard, small, stone-like deposits that form inside the kidneys. The stones can cause pain as they move through the urinary system.

#You should:

#- Drink plenty of water to help flush the stones out of your kidneys and through the urinary system.
#- Eat a healthy diet low in salt and high in potassium, magnesium, and calcium.
#- Take pain relievers such as acetaminophen or ibuprofen to help relieve the pain.
#- Take medication that will help you pass the stones.
#- Consult a urologist for further treatment.

#I hope I have answered your query.
#If you have any other questions, please do not hesitate to write to us.

#Wishing you good health,
#Chat Doctor.<|end_of_turn|>

For named entities in medicine

{'role': 'user', 'content': ' In the provided text, your objective is to recognize and tag gene-related Named Entities using the BIO labeling scheme. Start by labeling the initial word of a gene-related phrase as B (Begin), and then mark the following words in the same phrase as I (Inner). Any words'} 
{'role': 'user', 'assistant': 'Following the oxidation of ascorbate, the ensuing ascorbate free radical becomes a pivotal oxidizing agent in biological environments, exerting a range of effects on cellular functions, particularly through its involvement in the plasma membrane redox system.'} 

#GPT4 Correct Assistant:
#Following : O
#the : O
#oxidation : O
#of : O
#ascorbate : O
#, : O
#the : O
#ensuing : O
#ascorbate : B
#free : I
#radical : I
#becomes : O
#a : O
#pivotal : O
#oxidizing : O
#agent : O
#in : O
#biological : O
#environments : O
#, : O
#exerting : O
#a : O
#range : O
#of : O
#effects : O
#on : O
#cellular : O
#functions : O
#, : O
#particularly : O
#through : O
#its : O
#involvement : O
#in : O
#the : O
#plasma : B
#membrane : I
#redox : I
#system : I
#. : O<|end_of_turn|>
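
Token-per-line BIO output like the above is easier to use once it is collapsed back into entity phrases. The sketch below is one simple way to do that, assuming each line has the form "token : label", optionally prefixed with "#" as in the listing above. The function name `decode_bio` is hypothetical and not part of any library used in this article.

```python
def decode_bio(lines):
    """Group consecutive B/I-labelled tokens into entity phrases."""
    entities, current = [], []
    for raw in lines:
        # Drop the leading '#' and the end-of-turn marker used in the listing.
        line = raw.strip().lstrip("#").replace("<|end_of_turn|>", "").strip()
        if " : " not in line:
            continue  # skip headers such as "GPT4 Correct Assistant:"
        token, label = line.rsplit(" : ", 1)
        label = label.strip()
        if label == "B":
            # A new entity starts; close any entity that was still open.
            if current:
                entities.append(" ".join(current))
            current = [token]
        elif label == "I" and current:
            current.append(token)
        else:
            # "O" (or a stray "I" with no open entity) closes the current entity.
            if current:
                entities.append(" ".join(current))
            current = []
    if current:
        entities.append(" ".join(current))
    return entities

sample = [
    "#GPT4 Correct Assistant:",
    "#the : O", "#ensuing : O",
    "#ascorbate : B", "#free : I", "#radical : I",
    "#becomes : O", "#, : O",
    "#plasma : B", "#membrane : I", "#redox : I", "#system : I",
    "#. : O<|end_of_turn|>",
]
print(decode_bio(sample))
```

Applied to the full output above, this groups the tagged tokens into the phrases "ascorbate free radical" and "plasma membrane redox system".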