Automatic Prompt Engineering (Auto Prompt): Reproducing and Reading the LMOps Code

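ProTeGi (Prompt Optimization with Textual Gradients) is an LLM-based approach to automatic prompt engineering. Many thanks to the authors for the idea and for sharing their work. The original paper and repository are listed below for anyone who wants to try it.
Project repository
Paper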

This chapter collects the notes and explanations I wrote while reproducing the results from that repository.

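Parameters

n_gradients: number of rounds of error-reason finding and feedback run for one original prompt
errors_per_gradient: number of misclassified examples used for each gradient
gradients_per_error: number of error reasons the LLM generates for each set of sampled error examples
steps_per_gradient: number of new prompts generated from one original prompt plus its error examples and an error reason
mc_samples_per_step: number of prompts generated from one original prompt by paraphrasing
max_expansion_factor: maximum number of new prompts one original prompt may expand into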
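As a rough sketch of how these settings multiply, the hypothetical helper below (not part of the repository) counts the reasons and gradient-based candidate prompts that one original prompt could yield. It assumes every LLM call returns exactly the requested number of items and that each reason is used on its own to request steps_per_gradient prompts; the values passed in are illustrative only.

```python
from typing import Tuple


def expansion_counts(n_gradients: int,
                     gradients_per_error: int,
                     steps_per_gradient: int) -> Tuple[int, int]:
    """Return (number of reasons, number of gradient-based candidate prompts).

    Assumption: each of the n_gradients rounds yields gradients_per_error
    reasons, and each reason yields steps_per_gradient new prompts.
    """
    n_reasons = n_gradients * gradients_per_error
    n_candidates = n_reasons * steps_per_gradient
    return n_reasons, n_candidates


# Illustrative values only.
reasons, candidates = expansion_counts(
    n_gradients=4, gradients_per_error=4, steps_per_gradient=1
)
print(reasons, candidates)
```

These are upper bounds before deduplication; max_expansion_factor then caps how many candidates survive.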

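Core components

task: the data loader for each task (loads the training and test sets separately; tasks with the same file format can share one)
score: the scorer, which checks whether the generated label matches the gold label; tasks of the same type share the rule (for classification, the check is label == pred)
gpt4: calls the generation model as a service to produce pred for a given prompt and data
evaluator: the policy component, which balances exploration and exploitation under different strategies to select and refine actions (prompts)
bf_eval: brute-force evaluation, used to quickly filter the expanded new prompts down to the specified max_expansion_factor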
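For a classification task, the score component described above reduces to an exact-match check. A minimal sketch, assuming labels and predictions are plain comparable values; the function name accuracy is hypothetical and not taken from the repository:

```python
from typing import Hashable, Sequence


def accuracy(labels: Sequence[Hashable], preds: Sequence[Hashable]) -> float:
    """Fraction of examples whose predicted label equals the gold label."""
    if len(labels) != len(preds):
        raise ValueError("labels and preds must have the same length")
    if not labels:
        return 0.0  # convention chosen here for an empty batch
    correct = sum(1 for l, p in zip(labels, preds) if l == p)
    return correct / len(labels)


# Two of the four predictions match their labels.
print(accuracy([1, 0, 1, 1], [1, 1, 1, 0]))  # 0.5
```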

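Ways to generate new prompts

Gradient method: the LLM writes new prompts based on the reasons the current prompt predicted wrongly, and the LLM produces those reasons itself. get_gradients essentially uses the LLM to find reasons why the current prompt mispredicts.
Paraphrase method: generate new prompts from the current prompt without changing its meaning.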
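The paraphrase method only needs a prompt that asks the LLM for rewordings. The template below is a hypothetical sketch: its wording and the helper name build_paraphrase_prompt are assumptions made for illustration, not the text used in the repository.

```python
def build_paraphrase_prompt(prompt: str, n: int) -> str:
    """Build a request for n meaning-preserving rewrites of `prompt`.

    The wording is illustrative; only the <START>/<END> tagging convention
    is borrowed from the prompts shown in this article.
    """
    return (
        "Rewrite the following instruction in different words "
        "while keeping its meaning.\n"
        f"Produce {n} variations and wrap each one with <START> and <END>.\n\n"
        f'Instruction:\n"{prompt}"'
    )


print(build_paraphrase_prompt("Classify the text as spam or not spam.", 2))
```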

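Detailed steps for generating new prompts, with a code walkthrough:

Given a prompt and a batch of training data, obtain pred.
Compare label with pred to collect the misclassified samples, error_idxs.
Sample n (errors_per_gradient) misclassified samples from error_idxs.
Assemble the sampled misclassified samples into a string, error_string, in a fixed format: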

  # Fragment of a function body: sample_texts, sample_labels and sample_preds
  # hold the sampled misclassified examples.
  error_string = ''
  for i, (t, l, p) in enumerate(zip(sample_texts, sample_labels, sample_preds)):
      # Number the examples from 1 using the loop index.
      error_string += f'## Example {i + 1}\n'
      error_string += f'Text: "{t.strip()}"\nLabel: {task.stringify_prediction(l)}\nPrediction: {task.stringify_prediction(p)}\n\n'
  return error_string.strip()
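The comparison and sampling steps that come before this formatting can be sketched as follows. This is a minimal illustration under assumptions: the helper name sample_errors and the seed parameter are hypothetical, and labels and predictions are assumed to be directly comparable.

```python
import random
from typing import List, Optional, Sequence, Tuple


def sample_errors(texts: Sequence[str],
                  labels: Sequence[int],
                  preds: Sequence[int],
                  n: int,
                  seed: Optional[int] = None) -> Tuple[List[str], List[int], List[int]]:
    """Pick up to n misclassified examples at random."""
    rng = random.Random(seed)
    # Indices where the prediction disagrees with the label.
    error_idxs = [i for i, (l, p) in enumerate(zip(labels, preds)) if l != p]
    # Never ask for more samples than there are errors.
    sample_idxs = rng.sample(error_idxs, min(len(error_idxs), n))
    return (
        [texts[i] for i in sample_idxs],
        [labels[i] for i in sample_idxs],
        [preds[i] for i in sample_idxs],
    )


texts = ["good movie", "bad movie", "fine", "awful"]
sample_texts, sample_labels, sample_preds = sample_errors(
    texts, labels=[1, 0, 1, 0], preds=[1, 1, 0, 0], n=4, seed=0
)
print(sorted(sample_texts))  # the two misclassified texts
```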

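Using the error examples assembled into error_string above, together with a hand-written prompt that asks for error reasons, have the LLM give num_feedbacks reasons why the predictions were wrong (the error-reason feedback). Note: the prompt that elicits this feedback must be written in advance for each task.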

gradient_prompt = f"""
        I'm trying to write a zero-shot classifier prompt.
    
        My current prompt is:
        "{prompt}"

        But this prompt gets the following examples wrong:
        {error_string}

        give {num_feedbacks} reasons why the prompt could have gotten these examples wrong.
        Wrap each reason with <START> and <END>
        """
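Because the prompt asks the model to wrap each reason in <START> and <END>, the reply has to be split back into individual reasons. A minimal regex-based sketch, assuming the tags are never nested; the helper name and the sample reply are made up for illustration, and the repository's own parser may work differently:

```python
import re
from typing import List


def parse_tagged_text(text: str,
                      start_tag: str = "<START>",
                      end_tag: str = "<END>") -> List[str]:
    """Return the stripped text found between each start/end tag pair."""
    # Non-greedy match so each pair is captured separately;
    # DOTALL lets a reason span several lines.
    pattern = re.escape(start_tag) + r"(.*?)" + re.escape(end_tag)
    matches = re.findall(pattern, text, flags=re.DOTALL)
    return [m.strip() for m in matches if m.strip()]


reply = (
    "<START>The prompt does not define the labels.<END>\n"
    "<START> Sarcasm is\nnot mentioned. <END>"
)
print(parse_tagged_text(reply))
```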

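In total, one prompt yields n_gradients * num_feedbacks reasons; this is the final size of prompt_feedbacks (num_feedbacks corresponds to gradients_per_error).
Embed these feedbacks into the feedback prompt to generate new prompts. Note: the feedback prompt also has to be written in advance.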

transformation_prompt = f"""
        I'm trying to write a zero-shot classifier.
        
        My current prompt is:
        "{prompt}"

        But it gets the following examples wrong:
        {error_str}

        Based on these examples the problem with this prompt is that {feedback_str}

        Based on the above information, I wrote {steps_per_gradient} different improved prompts.
        Each prompt is wrapped with <START> and <END>.

        The {steps_per_gradient} new prompts are:
        """

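Obtain the new prompts generated from the gradient feedback.
Obtain new prompts by paraphrasing.
Merge the two sets of new prompts and deduplicate them to get new_sections.
Filter new_sections according to the expansion cap max_expansion_factor, as follows:
a. When reject_on_errors=True:
i. Find the samples in the batch (default 64) that the previous prompt predicted wrongly.
ii. Brute-force test the new prompts on those samples and compute each prompt's mean score (for speed, only a sampled subset is evaluated, twice max_expansion_factor; the rest are discarded without evaluation).
iii. Keep the max_expansion_factor highest-scoring new prompts as tmp_new_prompts.
b. When reject_on_errors=False:
Randomly sample max_expansion_factor prompts from all the new prompts.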
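The merge, deduplication and filtering steps above can be sketched as below. This is an illustration under assumptions rather than the repository's code: select_candidates, score_fn and seed are hypothetical names, and score_fn stands in for "mean score of a prompt on the previously misclassified samples", which in practice requires LLM calls.

```python
import random
from typing import Callable, List, Optional, Sequence


def select_candidates(candidates: Sequence[str],
                      score_fn: Callable[[str], float],
                      max_expansion_factor: int,
                      reject_on_errors: bool = True,
                      seed: Optional[int] = None) -> List[str]:
    """Deduplicate candidate prompts and keep at most max_expansion_factor."""
    rng = random.Random(seed)
    # Deduplicate while preserving first-seen order.
    unique = list(dict.fromkeys(candidates))
    if len(unique) <= max_expansion_factor:
        return unique
    if not reject_on_errors:
        # Plain random subsample of the new prompts.
        return rng.sample(unique, max_expansion_factor)
    # For speed, score only a sample of twice the cap; drop the rest unscored.
    pool = rng.sample(unique, min(len(unique), 2 * max_expansion_factor))
    ranked = sorted(pool, key=score_fn, reverse=True)
    return ranked[:max_expansion_factor]


# Toy scorer: prompt length stands in for the real evaluation.
kept = select_candidates(["a", "bb", "ccc", "bb", "dddd"], len, 2)
print(kept)
```

Sampling before scoring trades some selection quality for fewer evaluations, which matters when every score requires model calls.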
