Configuration files
1. credentials.yml
Defines the connection details for the other services the assistant talks to, such as chat and voice channels.
2. config.yml
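This file defines the NLU processing pipeline and the Core policies, which together predict the next action from the user's input.
The `pipeline` key configures the NLU processing steps, and the `policies` key configures the dialogue policies. For example: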
language: zh
pipeline:
  - name: "MitieNLP"
    model: "data/total_word_feature_extractor_zh.dat"
  - name: "JiebaTokenizer"
    dictionary_path: "dict.txt"
  - name: "MitieEntityExtractor"
  - name: "EntitySynonymMapper"
  - name: "RegexFeaturizer"
  - name: "MitieFeaturizer"
  - name: "SklearnIntentClassifier"
policies:
  - name: TEDPolicy
    epochs: 100
    max_history: 5
  - name: MemoizationPolicy
    max_history: 5
  - name: RulePolicy
    core_fallback_threshold: 0.3
    enable_fallback_prediction: True
1) pipeline
- "MitieNLP": loads the word vectors
- model "data/total_word_feature_extractor_zh.dat": the word-vector file
- "JiebaTokenizer": Jieba word segmentation
- "MitieEntityExtractor": entity recognition
- "EntitySynonymMapper": maps entity synonyms to a single value
- "RegexFeaturizer": regular-expression feature extractor
- "MitieFeaturizer": MITIE feature extractor (converts text to a word-vector representation)
- "SklearnIntentClassifier": scikit-learn intent classifier
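2) policies
The default policy priorities are listed below. When different policies predict with the same confidence, the prediction from the policy with the higher priority is used.
- 6 - RulePolicy
- 3 - MemoizationPolicy or AugmentedMemoizationPolicy
- 1 - TEDPolicy: machine-learning policy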
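The tie-breaking rule described above can be illustrated with a small sketch. This is not Rasa's actual ensemble code; the function `pick_prediction` and the `(policy, action, confidence)` tuple layout are assumptions made for illustration, and only the priority numbers come from the list above.

```python
# Hypothetical sketch of confidence-then-priority selection.
# Priorities are the defaults listed above; everything else is illustrative.
DEFAULT_PRIORITY = {
    "RulePolicy": 6,
    "MemoizationPolicy": 3,
    "AugmentedMemoizationPolicy": 3,
    "TEDPolicy": 1,
}


def pick_prediction(predictions):
    """Return the (policy, action, confidence) tuple that wins.

    The highest confidence wins; priority only breaks ties.
    """
    return max(
        predictions,
        key=lambda p: (p[2], DEFAULT_PRIORITY.get(p[0], 0)),
    )


predictions = [
    ("TEDPolicy", "utter_greet", 1.0),
    ("MemoizationPolicy", "utter_ask_howcanhelp", 1.0),
    ("RulePolicy", "utter_greet", 0.0),
]
print(pick_prediction(predictions))
```

Here TEDPolicy and MemoizationPolicy tie on confidence, so the higher-priority MemoizationPolicy is selected; RulePolicy has the highest priority but loses because its confidence is lower.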
TEDPolicy
TEDPolicy is a multi-task architecture for next-action prediction and entity recognition. See the TED paper for details.
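MemoizationPolicy
MemoizationPolicy memorizes the stories in your training data. It checks whether the current conversation matches a story in stories.yml. If it does, the policy predicts the next action from the matching story with a confidence of 1.0. If no matching conversation is found, the policy predicts None with a confidence of 0.0.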
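The lookup behaviour described above can be sketched as a dictionary keyed on the most recent events. This is a simplified illustration, not Rasa's implementation (which featurizes tracker states); the `build_memory` and `predict` functions and the `"intent:..."` / `"action:..."` string encoding are assumptions.

```python
# Hypothetical sketch: memorize (recent history -> next action) pairs
# from training stories, then look them up at prediction time.
MAX_HISTORY = 5


def build_memory(stories, max_history=MAX_HISTORY):
    """Map each window of preceding events to the action that followed it."""
    memory = {}
    for events in stories:
        for i, event in enumerate(events):
            if i > 0 and event.startswith("action:"):
                key = tuple(events[max(0, i - max_history):i])
                memory[key] = event.split(":", 1)[1]
    return memory


def predict(memory, history, max_history=MAX_HISTORY):
    """Return (action, 1.0) on an exact match, otherwise (None, 0.0)."""
    key = tuple(history[-max_history:])
    if key in memory:
        return memory[key], 1.0
    return None, 0.0


stories = [[
    "intent:greet",
    "action:utter_ask_howcanhelp",
    "intent:inform",
    "action:utter_on_it",
]]
memory = build_memory(stories)
print(predict(memory, ["intent:greet"]))
print(predict(memory, ["intent:goodbye"]))
```

A conversation that matches a memorized history gets the stored action with confidence 1.0; anything else falls through to `(None, 0.0)`, leaving the decision to the other policies.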
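FormPolicy
An extension of MemoizationPolicy that handles form filling. Once a FormAction is called, FormPolicy keeps predicting that FormAction until all required slots are filled. More information is available in the Rasa documentation.
As of Rasa 2.4, FormPolicy is deprecated.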
3. data/nlu.yml
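Defines the intents and their example sentences. Entities in the example sentences can be annotated inline.
A brief introduction to entity annotation follows.
Format 1: [entity text]{"entity": "entity name", "value": "entity value"}
What's the balance on my [credit card account]{"entity":"account","value":"credit"}
Format 2: [entity text](entity name)
What's my [credit](account) balance?
In format 2 the entity has no explicit value, so the value is the annotated text itself.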
version: "2.0"
nlu:
- intent: greet
  examples: |
    - Hey
    - Hi
    - hey there [Sara](name)
- intent: check_balance
  examples: |
    - What's my [credit](account) balance?
    - What's the balance on my [credit card account]{"entity":"account","value":"credit"}
- intent: faq/language
  examples: |
    - What language do you speak?
    - Do you only handle english?
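To make the two annotation formats concrete, the sketch below parses them with a regular expression. It is an illustration only and not Rasa's own parser; the `ANNOTATION` pattern and the `parse_entities` function are assumptions, and the pattern does not handle nested brackets or braces.

```python
import json
import re

# Hypothetical pattern: [text](name) or [text]{"entity": ..., "value": ...}
ANNOTATION = re.compile(
    r"\[(?P<text>[^\]]+)\]"
    r"(?:\((?P<name>[^)]+)\)|(?P<meta>\{[^}]+\}))"
)


def parse_entities(example):
    """Extract entity name, value, and annotated text from one example."""
    entities = []
    for match in ANNOTATION.finditer(example):
        text = match.group("text")
        if match.group("meta"):
            # Format 1: the JSON object names the entity and may set a value.
            meta = json.loads(match.group("meta"))
            entities.append({
                "entity": meta["entity"],
                "value": meta.get("value", text),
                "text": text,
            })
        else:
            # Format 2: no explicit value, so the value is the text itself.
            entities.append({
                "entity": match.group("name"),
                "value": text,
                "text": text,
            })
    return entities


print(parse_entities("What's my [credit](account) balance?"))
print(parse_entities(
    'What\'s the balance on my [credit card account]'
    '{"entity":"account","value":"credit"}'
))
```

Both examples yield an `account` entity with the value `credit`; only the annotated text differs.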
Synonyms
nlu:
- synonym: credit
  examples: |
    - credit card account
    - credit account
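The effect of a synonym definition is that several surface forms are normalized to one entity value. A minimal sketch of that idea, assuming a plain dictionary lookup (the `SYNONYMS` table and `normalize` function are hypothetical, and the case-insensitive match is an assumption rather than documented behaviour):

```python
# Hypothetical synonym table built from the definition above.
SYNONYMS = {
    "credit card account": "credit",
    "credit account": "credit",
}


def normalize(value):
    """Map an extracted entity value to its canonical form, if one exists."""
    return SYNONYMS.get(value.lower(), value)


print(normalize("credit card account"))
print(normalize("savings"))
```

Values without a synonym entry pass through unchanged. Note that synonym mapping only rewrites values of entities that were already extracted; it does not extract entities by itself.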
Regular expressions
nlu:
- regex: account_number
  examples: |
    - \d{10,12}
- intent: inform
  examples: |
    - my account number is [1234567891](account_number)
    - This is my account number [1234567891](account_number)
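The pattern above can be checked in isolation with Python's `re` module. The sketch below only shows what the pattern matches; the `regex_feature` function is a hypothetical stand-in for the kind of binary "pattern matched" signal a regex featurizer can feed to downstream components, not Rasa's code.

```python
import re

# The pattern from the regex definition above: 10 to 12 consecutive digits.
ACCOUNT_NUMBER = re.compile(r"\d{10,12}")


def regex_feature(text):
    """Return 1 if the pattern occurs anywhere in the text, else 0."""
    return 1 if ACCOUNT_NUMBER.search(text) else 0


print(regex_feature("my account number is 1234567891"))
print(regex_feature("call me at 12345"))
```

Because the pattern has no word boundaries, a longer digit run also matches on its first 12 digits; add `\b` on both sides if that is not wanted.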
Lookup tables
nlu:
- lookup: country
  examples: |
    - Afghanistan
    - Albania
    - ...
    - Zambia
    - Zimbabwe
4. data/rules.yml
rules:
- rule: Say `hello` whenever the user sends a message with intent `greet`
  steps:
  - intent: greet
  - action: utter_greet
5. domain.yml
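Key | Description |
---|---|
intents | Intents the assistant can recognize |
actions | Actions the assistant can run |
responses | Response templates sent to the user |
entities | Entities extracted from user messages |
slots | Slots that store information during the conversation |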
6. data/stories.yml
stories:
- story: collect restaurant booking info  # name of the story - just for debugging
  steps:
  - intent: greet  # user message with no entities
  - action: utter_ask_howcanhelp
  - intent: inform  # user message with entities
    entities:
    - location: "rome"
    - price: "cheap"
  - action: utter_on_it  # action that the bot should execute
  - action: utter_ask_cuisine
  - intent: inform
    entities:
    - cuisine: "spanish"
  - action: utter_ask_num_people
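A few dozen training stories are enough to start development.
To build a production-grade assistant, you need at least a few hundred training stories.
The domain has a key called actions. Actions are the operations the assistant executes next, including the messages returned to the user. Rasa Core has three types of actions:
1) default actions: actions provided by the system
2) utter actions: names start with utter_; these actions can only send a message back to the user
3) custom actions: user-defined actions that can perform any operation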
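The three action types can be told apart by name alone, which the sketch below illustrates. The `action_type` function is hypothetical, and `DEFAULT_ACTIONS` lists only a few of the built-in action names, so treat it as a non-exhaustive assumption.

```python
# A few built-in action names; this set is deliberately not exhaustive.
DEFAULT_ACTIONS = {
    "action_listen",
    "action_restart",
    "action_session_start",
    "action_default_fallback",
    "action_back",
}


def action_type(name):
    """Classify an action name as 'default', 'utter', or 'custom'."""
    if name in DEFAULT_ACTIONS:
        return "default"
    if name.startswith("utter_"):
        return "utter"
    return "custom"


for name in ["action_listen", "utter_greet", "action_check_balance"]:
    print(name, "->", action_type(name))
```

In this sketch `action_check_balance` is an invented custom action name; anything that is neither built in nor prefixed with `utter_` is treated as custom.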