
Add Attention Fuse pass #809

Closed

Conversation


@wjj19950828 wjj19950828 commented Jul 15, 2022

This PR contributes the following:

1. Add an Attention fuse pass (supported for ONNX Runtime only).
2. Add an enable_extra_ort_opt flag that controls whether extra ORT-only optimizations are applied; it defaults to false.

To enable it, run a command like:

paddle2onnx --model_dir msra_ner_pruned_infer_model/ --model_filename float32.pdmodel --params_filename float32.pdiparams --save_file ner_model_test_0713.onnx --opset_version 13 --enable_onnx_checker True --enable_dev_version True --enable_extra_ort_opt True

Performance tests

  • GPU: 100 warmup runs, then the average over 1000 inference runs
  • CPU: the variance with 10 warmup runs + 100 repeats was too large, so the measurement was changed to 100 warmup runs + 2000 repeats; details below:
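The warmup-then-average methodology above can be sketched in Python; here `run` stands in for one inference call (for example an onnxruntime `session.run`, which is not shown), so the numbers below are illustrative only:

```python
import time

def benchmark_ms(run, warmup, repeats):
    """Return the mean latency in ms of run() after `warmup` untimed calls."""
    for _ in range(warmup):
        run()  # untimed warmup iterations
    start = time.perf_counter()
    for _ in range(repeats):
        run()  # timed iterations
    return (time.perf_counter() - start) / repeats * 1000.0

# GPU setting in this PR: 100 warmup runs, mean over 1000 timed runs.
mean_ms = benchmark_ms(lambda: sum(range(1000)), warmup=100, repeats=1000)
```

Using `time.perf_counter` (monotonic, high resolution) and a large repeat count is what reduces the variance the CPU measurement initially suffered from.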

1. Unpruned model
With the paddle2onnx Attention fuse pass, the exported ONNX model contains an Attention node.
GPU:
100 warmup runs, average over 1000 inference runs; the measured time covers run + data copy, reported as (mean) in ms; w/o and w indicate whether the attention pass was hit:
[image: GPU benchmark table]
CPU:
100 warmup runs, average over 2000 inference runs, single thread; the measured time covers run + data copy, reported as (mean) in ms; w/o and w indicate whether the attention pass was hit:
[image: CPU benchmark table]

2. Pruned model
With the paddle2onnx Attention fuse pass, the exported ONNX model contains an Attention node.
GPU:
100 warmup runs, average over 1000 inference runs; the measured time covers run + data copy, reported as (mean) in ms; w/o and w indicate whether the attention pass was hit:
[image: GPU benchmark table]
CPU:
100 warmup runs, average over 2000 inference runs, single thread; the measured time covers run + data copy, reported as (mean) in ms; w/o and w indicate whether the attention pass was hit:
[image: CPU benchmark table]

#include "paddle2onnx/optimizer/fuse_constant_cast.h"
#include "paddle2onnx/optimizer/fuse_constant_reshape.h"
#include "paddle2onnx/optimizer/fuse_constant_unsqueeze.h"
#include "paddle2onnx/optimizer/fuse_paddle_conv_bias.h"
#include "paddle2onnx/optimizer/fuse_unsqueeze_conv2d_squeeze.h"

namespace paddle2onnx {
-MapperHelper* MapperHelper::helper = nullptr;
+MapperHelper *MapperHelper::helper = nullptr;
Collaborator:
The code style was reformatted here; please change it back.

Contributor Author:
Done.


std::string getPassName() const override { return "fuse_attention"; }

bool patternMatchPredicate(Node *node) override {
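For context, `patternMatchPredicate` is the cheap per-node check that decides whether a node might anchor the subgraph to fuse; a toy Python analogue of the idea (the dict-based graph and the specific check are illustrative, not the paddle2onnx or onnx-optimizer API):

```python
def pattern_match_predicate(node):
    # Anchor check only: in this sketch a candidate attention root is a
    # MatMul whose output feeds a Softmax; the full match walks further
    # up and down the graph before rewriting anything.
    return node["op"] == "MatMul" and "Softmax" in node["consumers"]

toy_graph = [
    {"op": "MatMul", "consumers": ["Softmax"]},  # candidate anchor
    {"op": "MatMul", "consumers": ["Add"]},      # rejected: no Softmax consumer
    {"op": "Relu",   "consumers": ["MatMul"]},   # rejected: wrong op type
]
anchors = [n for n in toy_graph if pattern_match_predicate(n)]
```

Keeping the predicate cheap matters because it runs on every node; the expensive structural match only runs on the few anchors that pass.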
Collaborator:
The original code left-aligns the pointer symbol (e.g. `Node* node`); newly added code should follow the same alignment.

Contributor Author:
Done.

// +------|---+
// | |
// Add

Collaborator:
There is a full before/after comparison diagram of the fuse here.

In the code below, also annotate the key nodes at each step of the fuse (rather than describing them only as QKV WEIGHT).

Contributor Author:
Done. Added the relevant comments.

@@ -35,6 +35,7 @@ struct OptimizerOption {
     passes.push_back("fuse_matmul_add_bias_into_gemm");
     passes.push_back("eliminate_identity");
     passes.push_back("eliminate_deadend");
+    passes.push_back("fuse_attention");
Collaborator:
This is not a mandatory fuse, so it needs an extra flag to control it.

Contributor Author:
Done. Added the enable_extra_ort_opt flag, defaulting to False.
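The flag-gated pass selection described here can be sketched as follows (Python for illustration only; the actual implementation is C++ inside paddle2onnx, and the pass names are taken from the diff context above):

```python
def build_pass_list(enable_extra_ort_opt=False):
    # Always-on passes (a subset, from the diff context above).
    passes = [
        "fuse_matmul_add_bias_into_gemm",
        "eliminate_identity",
        "eliminate_deadend",
    ]
    if enable_extra_ort_opt:
        # fuse_attention only benefits ONNX Runtime (it emits an ORT
        # contrib Attention op), so it is opt-in rather than default.
        passes.append("fuse_attention")
    return passes
```

Keeping the fuse behind a default-off flag means models exported for other runtimes are unaffected unless the user explicitly opts in.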

@jiangjiajun (Collaborator):

This PR needs the final experimental data and results.

@wjj19950828 (Contributor Author):

> This PR needs the final experimental data and results.

Added the relevant experimental data to the description; increased the repeat count on CPU to address the large variance.
