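Chapter 2: Prompt Engineering (2)

2.3 Example Selectors

When you have a large pool of examples, you may need to choose which ones to include in the prompt. ExampleSelector is the class responsible for making that choice.

The base interface is defined as follows: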

from abc import ABC, abstractmethod
from typing import Dict, List

class BaseExampleSelector(ABC):
    """Interface for selecting examples to include in prompts."""

    @abstractmethod
    def select_examples(self, input_variables: Dict[str, str]) -> List[dict]:
        """Select which examples to use based on the inputs."""

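The only method an ExampleSelector has to expose is select_examples. It takes the input variables and returns a list of examples. How those examples are chosen is up to each concrete implementation. A few selectors are shown below.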
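To make the select_examples contract concrete, here is a minimal sketch of a custom selector. It does not import LangChain: BaseExampleSelector is re-declared locally as a stand-in for the interface above, and FirstLetterExampleSelector, its fallback parameter, and the selection rule are all hypothetical, chosen only for illustration.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class BaseExampleSelector(ABC):
    """Local stand-in that mirrors the interface shown above."""

    @abstractmethod
    def select_examples(self, input_variables: Dict[str, str]) -> List[dict]:
        """Select which examples to use based on the inputs."""


class FirstLetterExampleSelector(BaseExampleSelector):
    """Hypothetical rule: keep examples whose word starts with the input's first letter."""

    def __init__(self, examples: List[dict], fallback: int = 1) -> None:
        self.examples = examples
        self.fallback = fallback

    def select_examples(self, input_variables: Dict[str, str]) -> List[dict]:
        first = input_variables["word"][:1].lower()
        matches = [ex for ex in self.examples if ex["word"][:1].lower() == first]
        # Fall back to the first few examples so the prompt is never empty.
        return matches or self.examples[: self.fallback]


selector = FirstLetterExampleSelector(
    [
        {"word": "happy", "antonym": "sad"},
        {"word": "tall", "antonym": "short"},
        {"word": "sunny", "antonym": "gloomy"},
    ]
)
print(selector.select_examples({"word": "short"}))
print(selector.select_examples({"word": "big"}))
```

Any object with this one method can play the selector role; the rule inside select_examples is where implementations differ.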

from langchain.prompts import PromptTemplate
from langchain.prompts.example_selector import LengthBasedExampleSelector

# These are a lot of examples of a pretend task of creating antonyms.
examples = [
    {"word": "happy", "antonym": "sad"},
    {"word": "tall", "antonym": "short"},
    {"word": "energetic", "antonym": "lethargic"},
    {"word": "sunny", "antonym": "gloomy"},
    {"word": "windy", "antonym": "calm"},
]

# Next, we specify the template to format the examples we have provided.
# We use the `PromptTemplate` class for this.
example_formatter_template = """
Word: {word}
Antonym: {antonym}\n
"""
example_prompt = PromptTemplate(
    input_variables=["word", "antonym"],
    template=example_formatter_template,
)

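2.3.1 Selecting examples by length

If you have a large number of examples, you can use an ExampleSelector to pick the most informative subset, which makes the resulting prompt more likely to elicit a good response from the language model.

The example below uses LengthBasedExampleSelector, which selects examples based on the length of the input. This is useful when you are worried that the prompt you build will exceed the context window. For longer inputs it includes fewer examples; for shorter inputs it includes more.

In this example we create a prompt that generates the antonym of a word, and we use LengthBasedExampleSelector to choose the examples: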

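from langchain.chains import LLMChain
from langchain.prompts import FewShotPromptTemplate

example_selector = LengthBasedExampleSelector(
    examples=examples,
    example_prompt=example_prompt,
    # Maximum length of the formatted examples.
    # Length is the value returned by the get_text_length function.
    max_length=25,
)

# Create a `FewShotPromptTemplate` from the `example_selector`.
dynamic_prompt = FewShotPromptTemplate(
    # Pass an ExampleSelector instead of a fixed list of examples.
    example_selector=example_selector,
    example_prompt=example_prompt,
    # Chinese prefix, meaning "Give the Chinese antonym of the input".
    prefix="根据输入给出中文反义词",
    # Chinese labels: "词语" is "word", "反义词" is "antonym".
    suffix="词语: {input}\n反义词:",
    input_variables=["input"],
    example_separator=" ",
)

# Generate the prompt with the `format` method.
print(dynamic_prompt.format(input="big"))

# `llm` is assumed to be a language model instance created earlier.
llm_chain = LLMChain(
    prompt=dynamic_prompt,
    llm=llm,
)
llm_chain.run("高尚")  # "高尚" means "noble"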

Output:

根据输入给出中文反义词
Word: happy
Antonym: sad
Word: tall
Antonym: short
Word: energetic
Antonym: lethargic
词语: big
反义词:

低俗

The first part is the output of print(dynamic_prompt.format(input="big")); the last line ("低俗", meaning "vulgar") is the result of llm_chain.run("高尚").

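By contrast, if we pass a very long input, LengthBasedExampleSelector includes fewer examples in the prompt. The total length must stay within max_length=25, so the number of examples is sacrificed.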

long_string = "big and huge and massive and large and gigantic and tall and much bigger than everything else"

print(dynamic_prompt.format(input=long_string))

Output:

根据输入给出中文反义词
Word: happy
Antonym: sad
词语: big and huge and massive and large and gigantic and tall and much bigger than everything else
反义词:
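The two prompts above (three examples for "big", one for the long string) can be reproduced with a small budget calculation. The sketch below is plain Python, not the library's code: it assumes length is measured as the number of pieces obtained by splitting on spaces and newlines, and that examples are added in order until the budget would go negative. The names text_length and select_by_length are hypothetical.

```python
import re
from typing import Dict, List

# Same shape as the example template used in this section.
EXAMPLE_TEMPLATE = "\nWord: {word}\nAntonym: {antonym}\n\n"

examples = [
    {"word": "happy", "antonym": "sad"},
    {"word": "tall", "antonym": "short"},
    {"word": "energetic", "antonym": "lethargic"},
    {"word": "sunny", "antonym": "gloomy"},
    {"word": "windy", "antonym": "calm"},
]


def text_length(text: str) -> int:
    # Assumed length measure: pieces after splitting on spaces and newlines.
    return len(re.split(r"\n| ", text))


def select_by_length(
    pool: List[Dict[str, str]], user_input: str, max_length: int
) -> List[Dict[str, str]]:
    # Start with the budget left after accounting for the user's input.
    remaining = max_length - text_length(user_input)
    selected = []
    for example in pool:
        remaining -= text_length(EXAMPLE_TEMPLATE.format(**example))
        if remaining < 0:
            break  # this example would overflow the budget
        selected.append(example)
    return selected


long_string = (
    "big and huge and massive and large and gigantic and tall "
    "and much bigger than everything else"
)
print([ex["word"] for ex in select_by_length(examples, "big", 25)])
print([ex["word"] for ex in select_by_length(examples, long_string, 25)])
```

Under these assumptions each formatted example costs 7 units, so an input of length 1 leaves room for three examples and an input of length 17 leaves room for only one.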

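2.3.2 Selecting examples by similarity

SemanticSimilarityExampleSelector selects examples according to how similar they are to the input. It does this by finding the examples whose embeddings have the greatest cosine similarity with the embedding of the input.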
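As a rough illustration of ranking by cosine similarity, the sketch below uses hand-written two-dimensional vectors in place of a real embedding model. TOY_EMBEDDINGS and select_most_similar are hypothetical, and the real selector embeds whole examples through an embedding class and a vector store rather than looking up single words.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings: axis 0 stands for "emotion", axis 1 for "body shape".
TOY_EMBEDDINGS: Dict[str, List[float]] = {
    "happy": [0.9, 0.1],
    "tall": [0.1, 0.9],
    "worried": [0.8, 0.2],
    "fat": [0.2, 0.8],
}

toy_examples = [
    {"word": "happy", "antonym": "sad"},
    {"word": "tall", "antonym": "short"},
]


def select_most_similar(pool: List[dict], query: str, k: int = 1) -> List[dict]:
    query_vec = TOY_EMBEDDINGS[query]
    # Rank examples from most to least similar to the query vector.
    ranked = sorted(
        pool,
        key=lambda ex: cosine_similarity(TOY_EMBEDDINGS[ex["word"]], query_vec),
        reverse=True,
    )
    return ranked[:k]


print(select_most_similar(toy_examples, "worried"))
print(select_most_similar(toy_examples, "fat"))
```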

from langchain.embeddings import HuggingFaceEmbeddings
from langchain.prompts import FewShotPromptTemplate
from langchain.prompts.example_selector import SemanticSimilarityExampleSelector
from langchain.vectorstores import Chroma

example_selector = SemanticSimilarityExampleSelector.from_examples(
    # This is the list of examples available to select from.
    examples,
    # This is the embedding class used to produce embeddings which are used to measure semantic similarity.
    HuggingFaceEmbeddings(),
    # This is the VectorStore class that is used to store the embeddings and do a similarity search over.
    Chroma,
    # This is the number of examples to produce.
    k=1,
)
similar_prompt = FewShotPromptTemplate(
    # We provide an ExampleSelector instead of examples.
    example_selector=example_selector,
    example_prompt=example_prompt,
    prefix="Give the antonym of every input",
    suffix="Input: {adjective}\nOutput:",
    input_variables=["adjective"],
)
print(similar_prompt.format(adjective="worried"))

Output:

Give the antonym of every input
Word: happy
Antonym: sad
Input: worried
Output:

The variable adjective is set to worried, an emotion word, so the similarity selector picks the example that is also about emotion: happy/sad.

print(similar_prompt.format(adjective="fat"))

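Output:

Give the antonym of every input
Word: tall
Antonym: short
Input: fat
Output:

The variable adjective is set to fat, a body-shape word, so the similarity selector picks the example that is also about body shape: tall/short. You can also add more examples, as follows: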

# Use the same keys as the existing examples so that example_prompt can format the new one.
similar_prompt.example_selector.add_example({"word": "enthusiastic", "antonym": "apathetic"})

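There are other selectors as well, such as n-gram overlap. NGramOverlapExampleSelector selects and orders examples by their n-gram overlap score, which indicates how similar an example is to the input. The score is a float between 0.0 and 1.0. The selector lets you set a threshold: examples whose score is less than or equal to the threshold are excluded. By default the threshold is -1.0, so no examples are excluded and they are only reordered. Setting the threshold to 0.0 excludes examples that share no n-grams with the input. See the official documentation for details.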
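The threshold behavior described above can be sketched in plain Python. This is a deliberately simplified stand-in, not the library's scoring: it assumes a unigram overlap ratio as the score, and the names unigram_overlap and select_by_overlap, along with the sample sentences, are hypothetical.

```python
from typing import Dict, List


def unigram_overlap(candidate: str, reference: str) -> float:
    """Fraction of candidate words that also occur in the reference (0.0 to 1.0)."""
    cand = candidate.lower().split()
    ref = set(reference.lower().split())
    if not cand:
        return 0.0
    return sum(1 for word in cand if word in ref) / len(cand)


def select_by_overlap(
    pool: List[Dict[str, str]], query: str, threshold: float = -1.0
) -> List[Dict[str, str]]:
    scored = [(unigram_overlap(query, ex["input"]), ex) for ex in pool]
    # Examples scoring less than or equal to the threshold are excluded.
    kept = [(score, ex) for score, ex in scored if score > threshold]
    # Highest overlap first; the sort is stable, so ties keep their original order.
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [ex for _, ex in kept]


sentences = [
    {"input": "see spot run"},
    {"input": "my dog barks"},
    {"input": "spot can run"},
]
print(select_by_overlap(sentences, "spot can run fast"))
print(select_by_overlap(sentences, "spot can run fast", threshold=0.0))
```

With the default threshold of -1.0 all three sentences are kept and only reordered; with 0.0 the sentence that shares no words with the query is dropped.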

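2.4 Output parsers

Language models output text. Often, though, you want information that is more structured than plain text. This is where output parsers come in.

Output parsers are classes that help structure a language model's response. An output parser must implement two main methods:

get_format_instructions() -> str: returns a string with instructions on how the language model's output should be formatted.

parse(str) -> Any: takes a string (assumed to be the language model's response) and parses it into some structure.

There is also one optional method:

parse_with_prompt(str, PromptValue) -> Any: takes a string (assumed to be the language model's response) and a prompt (assumed to be the prompt that produced that response) and parses it into some structure. The prompt is supplied mainly so that the OutputParser has enough information to retry or otherwise fix the output.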
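The two required methods can be illustrated with a small parser written in plain Python. KeyValueOutputParser is a hypothetical class that does not inherit from any LangChain base class; it only shows how the format instructions and the parsing logic pair up.

```python
from typing import Dict


class KeyValueOutputParser:
    """Hypothetical parser exposing the two required methods."""

    def get_format_instructions(self) -> str:
        # This text would be injected into the prompt sent to the model.
        return "Answer with one `key: value` pair per line and nothing else."

    def parse(self, text: str) -> Dict[str, str]:
        result: Dict[str, str] = {}
        for line in text.strip().splitlines():
            if not line.strip():
                continue  # ignore blank lines in the response
            key, sep, value = line.partition(":")
            if not sep:
                raise ValueError(f"Malformed line: {line!r}")
            result[key.strip()] = value.strip()
        return result


kv_parser = KeyValueOutputParser()
print(kv_parser.get_format_instructions())
print(kv_parser.parse("name: Ada\nrole: engineer"))
```

The instructions tell the model what shape to produce, and parse assumes that shape; raising on malformed input is what gives a retry or fixing step something to react to.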

2.4.1 Comma-separated list output parser

This parser returns the output as a list of comma-separated items.

from langchain.output_parsers import CommaSeparatedListOutputParser
from langchain.prompts import PromptTemplate

output_parser = CommaSeparatedListOutputParser()
format_instructions = output_parser.get_format_instructions()
prompt = PromptTemplate(
    # Chinese template, meaning "List 7 {subject} in Chinese."
    template="使用中文列出7个 {subject}.\n{format_instructions}",
    input_variables=["subject"],
    partial_variables={"format_instructions": format_instructions},
)

# "世界七大洲" means "the seven continents of the world".
_input = prompt.format(subject="世界七大洲")
#llm.temperature=0.0
output = llm(_input)
output_parser.parse(output)

Output:

['世界七大洲如下：\n\n非洲大陆，亚洲大陆，北美洲，南美洲，欧洲大陆，大洋洲，南极洲。']
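Note that the parsed result above is a list with a single element rather than seven. One plausible explanation, assuming the parser simply splits the text on the ASCII sequence ", " (comma plus space), is that the model answered with full-width Chinese commas (，). The sketch below uses plain Python rather than LangChain to show the difference; split_on_any_comma is a hypothetical workaround, not part of the library.

```python
from typing import List


def split_on_ascii_comma(text: str) -> List[str]:
    # Assumed parser behaviour: strip the text, then split on comma + space.
    return text.strip().split(", ")


def split_on_any_comma(text: str) -> List[str]:
    # Hypothetical workaround: normalise full-width commas before splitting.
    normalised = text.replace("，", ",")
    return [item.strip() for item in normalised.split(",") if item.strip()]


raw = "非洲，亚洲，北美洲，南美洲，欧洲，大洋洲，南极洲"
print(split_on_ascii_comma(raw))
print(split_on_any_comma(raw))
```

Another option is to tighten the prompt so that the model is told to separate items with ASCII commas and to return nothing but the list.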

2.4.2 PydanticOutputParser

from langchain.output_parsers import OutputFixingParser, PydanticOutputParser
from pydantic import BaseModel, Field, validator

# Define your desired data structure.
class Joke(BaseModel):
    setup: str = Field(description="question to set up a joke")
    punchline: str = Field(description="answer to resolve the joke")

    # You can add custom validation logic easily with Pydantic.
    @validator('setup')
    def question_ends_with_question_mark(cls, field):
        if field[-1] != '?':
            raise ValueError("Badly formed question!")
        return field

# Set up a parser + inject instructions into the prompt template.
parser = PydanticOutputParser(pydantic_object=Joke)
prompt = PromptTemplate(
    template="Answer the user query.\n{format_instructions}\n{query}\n",
    input_variables=["query"],
    partial_variables={"format_instructions": parser.get_format_instructions()},
)

# And a query intended to prompt a language model to populate the data structure.
joke_query = "Tell me a joke."
_input = prompt.format_prompt(query=joke_query)
output = llm(_input.to_string())

# Wrap the parser so that malformed output is sent back to the LLM to be fixed.
new_parser = OutputFixingParser.from_llm(parser=parser, llm=llm)
new_parser.parse(output)

Output:

Joke(setup="What do you call a cow that can't jump over the fence?", punchline='A dairy product!')
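To give a feel for what happens between the raw model text and the Joke object, here is a rough standard-library sketch of the parse-then-validate step. It is not how PydanticOutputParser is implemented: a dataclass stands in for the Pydantic model, and parse_joke is a hypothetical helper that assumes the model's reply contains a single JSON object.

```python
import json
from dataclasses import dataclass


@dataclass
class Joke:
    setup: str
    punchline: str

    def __post_init__(self) -> None:
        # Same rule as the validator above: the setup must end with a question mark.
        if not self.setup.endswith("?"):
            raise ValueError("Badly formed question!")


def parse_joke(text: str) -> Joke:
    # Take the outermost {...} span so that surrounding chatter is ignored.
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end == -1:
        raise ValueError("No JSON object found in model output")
    data = json.loads(text[start : end + 1])
    return Joke(setup=data["setup"], punchline=data["punchline"])


reply = 'Sure! {"setup": "Why did the cow cross the road?", "punchline": "To get to the udder side."}'
print(parse_joke(reply))
```

A failure in either step (no JSON, or a setup that is not a question) raises an error, which is the kind of signal a fixing wrapper can use to ask the model for a corrected answer.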
