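Here is a small SAS plotting tip. In a grouped plot, some observations may have a missing group value. SAS then adds a gray legend entry for the missing group, which is usually not what we want. How do we remove that entry?
Taking the sashelp.class data set as an example, we draw a scatter plot grouped by sex, with height on the X axis and weight on the Y axis.
/* Set a few rows to missing (rows 1, 5, and 7) */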
data class;
set sashelp.class;
if _n_ in (1,5,7) then call missing(sex,height,weight);
run;
/* Draw the scatter plot */
proc sgplot data=class;
styleattrs datacontrastcolors=(green blue);
scatter x=height y=weight/ group=sex markerattrs=(symbol=starfilled);
xaxis values=(55 to 75 by 5);
yaxis values=(40 to 160 by 20);
run;
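Filter out observations with a missing group
The first method is very simple: use a WHERE statement to exclude the observations whose group value is missing.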
proc sgplot data=class;
where sex>'';
styleattrs datacontrastcolors=(green blue);
scatter x=height y=weight/ group=sex markerattrs=(symbol=starfilled);
xaxis values=(55 to 75 by 5);
yaxis values=(40 to 160 by 20);
run;
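There are other cases, though. In a more complex graph that combines several plot statements, the observations with a missing group may be unused by one plot statement yet needed by another. Filtering the data does not work there, because the WHERE statement applies to every plot statement in the procedure.
Plot statement options
SAS provides plot statement options that keep missing group values out of the plot: INCLUDEMISSINGGROUP=FALSE in GTL and NOMISSINGGROUP in SGPLOT.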
proc sgplot data=class;
styleattrs datacontrastcolors=(green blue);
scatter x=height y=weight/ group=sex markerattrs=(symbol=starfilled) nomissinggroup;
xaxis values=(55 to 75 by 5);
yaxis values=(40 to 160 by 20);
run;
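The multi-plot situation mentioned earlier can be sketched as follows. This is a minimal illustration under assumptions, not part of the original example: the data set name class2 and the legend name "s" are made up here, and only sex is blanked so that the affected rows still carry height and weight. The REG statement uses all rows, while NOMISSINGGROUP keeps the rows with a missing sex out of the grouped scatter and its legend.

```sas
/* Hypothetical data: blank only the group variable, keep height and weight */
data class2;
  set sashelp.class;
  if _n_ in (1,5,7) then call missing(sex);
run;

proc sgplot data=class2;
  styleattrs datacontrastcolors=(green blue);
  /* Grouped scatter: rows with a missing sex are left out of the plot and legend */
  scatter x=height y=weight / group=sex name="s"
          markerattrs=(symbol=starfilled) nomissinggroup;
  /* Ungrouped regression line: fitted on every row, including those with a missing sex */
  reg x=height y=weight / nomarkers;
  /* Show only the scatter groups in the legend */
  keylegend "s" / title="Sex";
run;
```

A WHERE filter on sex would have removed those rows from the regression fit as well, which is why the option is set on the plot statement instead.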
proc template;
define statgraph scatterplot;
begingraph /backgroundcolor = white designwidth = 800px designheight = 440px;
layout lattice / columns=1;
layout overlay / walldisplay=all
yaxisopts = (linearopts = (viewmin=40 viewmax=160
tickvaluesequence = (start = 40 end = 160 increment = 20))
display = (tickvalues ticks label line))
xaxisopts = (linearopts = (viewmin=55 viewmax=75
tickvaluesequence = (start = 55 end = 75 increment = 5) tickvaluefitpolicy=none)
display = (tickvalues ticks label line));
scatterplot x=height y=weight/ name="scatter" group=sex markerattrs=(symbol=starfilled) includemissinggroup=false;
discretelegend "scatter"/ title="Sex" location = outside;
endlayout;
endlayout;
endgraph;
end;
run;
proc sgrender data=class template=scatterplot;
run;
Legend options
Add EXCLUDE=(" ") to the legend statement. In SGPLOT the legend statement is KEYLEGEND; in GTL the same option goes on DISCRETELEGEND.
proc sgplot data=class;
styleattrs datacontrastcolors=(green blue);
scatter x=height y=weight/ group=sex markerattrs=(symbol=starfilled);
xaxis values=(55 to 75 by 5);
yaxis values=(40 to 160 by 20);
keylegend / exclude=(" ");
run;
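For the GTL side, a minimal sketch might look like the following. The template name scatter_exclude is hypothetical, the axis settings from the earlier template are omitted for brevity, and it assumes a SAS release in which DISCRETELEGEND accepts EXCLUDE=.

```sas
proc template;
  define statgraph scatter_exclude;   /* hypothetical template name */
    begingraph;
      layout overlay;
        scatterplot x=height y=weight / name="scatter" group=sex
                    markerattrs=(symbol=starfilled);
        /* Drop the legend entry whose label is blank (the missing group) */
        discretelegend "scatter" / title="Sex" exclude=(" ");
      endlayout;
    endgraph;
  end;
run;

/* Render with the class data set created at the start */
proc sgrender data=class template=scatter_exclude;
run;
```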
Custom legends with LEGENDITEM
SGPLOT code example
proc sgplot data=class;
styleattrs datacontrastcolors=(green blue);
scatter x=height y=weight/ group=sex markerattrs=(symbol=starfilled);
xaxis values=(55 to 75 by 5);
yaxis values=(40 to 160 by 20);
legenditem type=marker name="m_marker"/ markerattrs=(color=blue symbol=starfilled) label="Male";
legenditem type=marker name="f_marker"/ markerattrs=(color=green symbol=starfilled) label="Female";
keylegend "m_marker" "f_marker"/title="Sex";
run;
GTL code example
/* Attribute map: VALUE must match the raw data values of sex (M and F) */
data attrmap;
length id $9 value markercolor markersymbol $100;
id='sex';value='M';markercolor='blue';markersymbol='starfilled';output;
id='sex';value='F';markercolor='green';markersymbol='starfilled';output;
run;
proc template;
define statgraph scatterplot;
begingraph /backgroundcolor = white designwidth = 800px designheight = 440px;
legenditem type=marker name="m_marker"/ markerattrs=(color=blue symbol=starfilled) label="Male";
legenditem type=marker name="f_marker"/ markerattrs=(color=green symbol=starfilled) label="Female";
layout lattice / columns=1;
layout overlay / walldisplay=all
yaxisopts = (linearopts = (viewmin=40 viewmax=160
tickvaluesequence = (start = 40 end = 160 increment = 20))
display = (tickvalues ticks label line))
xaxisopts = (linearopts = (viewmin=55 viewmax=75
tickvaluesequence = (start = 55 end = 75 increment = 5) tickvaluefitpolicy=none)
display = (tickvalues ticks label line));
scatterplot x=height y=weight/ name="scatter" group=sex;
discretelegend "m_marker" "f_marker"/title="Sex" location = outside;
endlayout;
endlayout;
endgraph;
end;
run;
proc sgrender data=class template=scatterplot dattrmap=attrmap;
dattrvar sex='sex';
run;
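LEGENDITEM lets us build the legend entirely to our own specification, which suits special cases. Besides removing the missing-group entry, it can also hide the entries of ordinary groups. It also helps when a group variable has x possible values but the data contains only x-1 or x-2 of them: legends generated by other statements show only the groups that actually appear in the plot, whereas LEGENDITEM can force all x entries to be drawn.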
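As a rough sketch of forcing every expected entry into the legend, the example below plots a subset that contains only one of the two groups. The data set name girls is made up for illustration, and the colors are assigned by hand to match the earlier examples.

```sas
/* Hypothetical subset: only one of the two expected groups is present */
data girls;
  set sashelp.class;
  where sex='F';
run;

proc sgplot data=girls;
  /* Only one group is drawn, so a single contrast color is enough */
  styleattrs datacontrastcolors=(green);
  scatter x=height y=weight / group=sex markerattrs=(symbol=starfilled);
  /* Legend entries are defined by hand, so both appear regardless of the data */
  legenditem type=marker name="m_marker" / markerattrs=(color=blue symbol=starfilled) label="Male";
  legenditem type=marker name="f_marker" / markerattrs=(color=green symbol=starfilled) label="Female";
  keylegend "m_marker" "f_marker" / title="Sex";
run;
```

Because the legend no longer comes from the plot itself, the colors and symbols in each LEGENDITEM have to be kept in step with the plot statement manually.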
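The final output looks like this:
Feel free to choose whichever of these methods fits your needs. If you know other approaches, leave a comment or send a message!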