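After a SAS program runs, you need to check the SAS log to confirm that it ran successfully. In essence, the check is keyword screening: output every log record that contains a keyword you do not want to see.
Different companies handle this in different ways. Ours currently uses a script. I found the script's run-and-output interface not very friendly, so I implemented the check in SAS instead. The complete macro is given in Section 2.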
1. Approach
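The log check has two parts. The first reads the contents of the log files into a SAS dataset; the second outputs the log records that contain specific keywords.
For the first part, see the post "SAS编程:如何批量读入某路径下外部文档数据?" (SAS Programming: How to Batch-Read External Files Under a Given Path).
For the second part, the list of log keywords should be much the same from company to company. Some entries omit their first letter (for example "ivision by zero"), presumably so that they match whether or not that letter is capitalized. An example list: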
ERROR
WARNING
_ERROR_
already on the library
already sorted
appears on a DELETE
At least
Cartesian product joins
ERROR DETECTED
has no effect
ignored
will be overwritten
Invalid
is not valid
is invalid
ivision by zero
Mathematical
Message
Missing values were gen
misspelled
more than one data set with repeats of BY values
nreferenced label
outside the axis range
pparent
requires remerging
roups are not created
stopped due to looping
syntax error
The query as specified involves
uninitialized
values have been converted
went to a new line
INTERRUPTION
Combining the two parts gives the macro that checks SAS logs for issues.
2. The macro
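Windows and UNIX use different slashes in file paths, and a folder path may be entered with or without a trailing slash. The macro therefore infers the slash type from the input path and always strips a trailing slash from it.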
%macro check_log(outdt=, dirpath=);
    **Author: Jihai;
    **Date: 2022-06-01;
    %if "&dirpath." ne "" %then %do;
        %local dirpath_tmp slash;

        **Get the slash: drop letters, digits, colons, underscores and blanks, then take the first remaining character;
        %let slash = %substr(%sysfunc(compress(&dirpath., : _ , a d)), 1, 1);

        **Remove trailing slash;
        %if "%substr(&dirpath.,%length(&dirpath.),1)" = "&slash." %then %let dirpath_tmp=%substr(&dirpath.,1,%length(&dirpath.)-1);
        %else %let dirpath_tmp = &dirpath.;

        **Get filepath: one row per .log file in the folder;
        data _tmp1;
            length direct filename filepath $200;
            fileres = filename("dirpath", "&dirpath_tmp.");
            dirid = dopen("dirpath");
            if dirid > 0 then do;
                num = dnum(dirid);
                do i = 1 to num;
                    direct = "&dirpath_tmp.";
                    filename = dread(dirid, i);
                    filepath = catx("&slash.", direct, filename);
                    if upcase(scan(filename, -1, ".")) in ("LOG") then output;
                end;
                **Close the directory handle once all members are read;
                rc = dclose(dirid);
            end;
            keep filename filepath;
        run;

        proc sort data=_tmp1;
            by filename;
        run;
        **Output Issue records;
        data &outdt.;
            set _tmp1;
            infile dummy filevar=filepath end=lastrec truncover;
            do while(not lastrec);
                input text $1000.;
                if index(text, "ERROR") or index(text, "WARNING") or index(text, "_ERROR_") or index(text, "already on the library") or index(text, "already sorted") or index(text, "appears on a DELETE") or index(text, "At least") or
                   index(text, "Cartesian product joins") or index(text, "ERROR DETECTED") or index(text, "has no effect") or index(text, "ignored") or index(text, "will be overwritten") or index(text, "Invalid") or index(text, "is not valid") or index(text, "is invalid") or
                   index(text, "ivision by zero") or index(text, "Message") or index(text, "Missing values were gen") or index(text, "Mathematical") or index(text, "misspelled") or index(text, "more than one data set with repeats of BY values") or index(text, "nreferenced label") or
                   index(text, "outside the axis range") or index(text, "pparent") or index(text, "requires remerging") or index(text, "roups are not created") or index(text, "stopped due to looping") or
                   index(text, "syntax error") or index(text, "The query as specified involves") or index(text, "uninitialized") or index(text, "values have been converted") or index(text, "went to a new line") or
                   index(text, "INTERRUPTION")
                then output;
            end;
        run;
    %end;
    %else %put Dirpath is missing ! ;
%mend check_log;
***Invoke the macro;
%check_log(
outdt= check_log
,dirpath= E:\99_Test\Test\test5\
);
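The slash detection in the macro keeps the first character that is not a letter, digit, colon, underscore or blank. As far as I can tell, a path such as ./logs would therefore yield "." rather than "/". A minimal alternative sketch, assuming a path never mixes the two separators: treat any backslash as a sign of a Windows path and fall back to a forward slash otherwise. The macro name get_slash and the path ./logs are hypothetical and not part of the original program.

```sas
/* Hypothetical helper for illustration only; not part of check_log. */
%macro get_slash(dirpath=);
    %local slash;

    /* Any backslash in the path is taken to mean a Windows-style path. */
    %if %index(%superq(dirpath),\) > 0 %then %let slash = \;
    %else %let slash = /;

    /* Write the detected separator to the log. */
    %put NOTE: separator for %superq(dirpath) is &slash.;
%mend get_slash;

%get_slash(dirpath=E:\99_Test\Test\test5\);
%get_slash(dirpath=./logs);
```

If this behaves as intended on your system, the two %if/%else lines could replace the %let slash = ... statement inside check_log; the rest of the macro would stay the same.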
Note that the keywords this macro filters on assume an English-language environment. If an issue is reported in Chinese, it will not be caught.
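The output dataset holds one row per flagged log line, with the log's file name (filename) and the text of the line (text).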
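For a quick overview of which logs need attention, the flagged lines can be counted per file. This is a small sketch that assumes the macro was invoked with outdt=check_log, as in the call above.

```sas
/* Count flagged log lines per log file. */
proc freq data=check_log;
    tables filename / nocum nopercent;
run;
```

Logs with no flagged lines do not appear in this table at all, because they contribute no rows to the output dataset.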
Summary
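The main difficulty in the macro is the first part, reading the external files; the record-filtering step is simple and direct. To add or remove keywords, edit the if condition directly.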
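If the keyword list changes often, one option is to keep the keywords in a temporary array and loop over them, so that adding or removing a keyword is a one-line change. The sketch below shows the idea on a handful of keywords. The dataset name check_log_demo, the array kw, the variables keyword and found, and the three sample log lines are all made up for illustration.

```sas
data check_log_demo;
    length text $1000 keyword $60;

    /* Keyword list: edit this array to add or remove entries. */
    array kw[5] $60 _temporary_
        ("ERROR" "WARNING" "uninitialized" "ivision by zero" "Invalid");

    infile datalines truncover;
    input text $char1000.;

    /* Stop at the first keyword found in the line. */
    found = 0;
    do k = 1 to dim(kw) until (found);
        /* trim() drops the padding blanks of the fixed-length array element. */
        if index(text, trim(kw[k])) > 0 then do;
            keyword = kw[k];
            found = 1;
        end;
    end;

    /* Keep only the lines that matched a keyword. */
    if found then output;
    drop k found;
    datalines;
NOTE: The data set WORK.A has 10 observations and 3 variables.
WARNING: Variable x already exists on file WORK.A.
NOTE: Variable y is uninitialized.
;
run;
```

Inside check_log, the same loop could stand in for the long if condition, with the input statement reading from the log files instead of datalines. The extra keyword column records which entry triggered the match.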