AGRO 6005

Lecture 4
Using Graphics Procedures
A good introduction to graphics in SAS is in section 1.9 of the book by Der and Everitt. See also
the graphical description of a data set in chapter 2 of the same book.
The GCHART and GPLOT Procedures
PROC GCHART DATA=_______;
BY ________;
VBAR _________ / options;
HBAR _________ / options;
PIE _________ / options;
Options:
levels = 5
midpoints = 10 20 30
midpoints = 10 to 100 by 10
subgroup = (variable that subdivides each bar)
type = freq | pct | sum | mean
sumvar = (variable whose mean or sum is computed)
group = (variable that defines side-by-side charts)
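The options above can be tried on a small data set. The following is a minimal sketch, assuming the SASHELP.CLASS sample data set that ships with SAS (it is not part of the lecture data); the choice of variables is only illustrative.

```sas
/* Sketch only: SASHELP.CLASS is an assumed sample data set, not the lecture data */
proc gchart data=sashelp.class;
   /* mean height for each sex; sumvar= names the variable that is averaged */
   vbar sex / type=mean sumvar=height;
   /* frequency chart of weight with 5 midpoints, one set of bars per sex */
   vbar weight / type=freq levels=5 group=sex;
   /* explicit midpoints; each bar is subdivided by sex */
   hbar age / midpoints=11 to 16 by 1 subgroup=sex;
run;
quit;
```

Each VBAR or HBAR statement produces its own chart, so the three requests can be compared side by side in the output.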
PROC GPLOT DATA=__________;
BY ______;
PLOT yvar*xvar=variable
yvar*xvar=n (n is the number of a SYMBOL statement)
yvar*(xvar1 xvar2)
yvar1*xvar1=n1 yvar2*xvar2=n2 / overlay;
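As an illustration of these plot requests, here is a minimal sketch that assumes the SASHELP.CLASS sample data set shipped with SAS. The SYMBOL definitions are an assumption added for the example: in PROC GPLOT the plotting symbol is chosen through SYMBOL statements and referenced by number.

```sas
/* Sketch only: SASHELP.CLASS and the symbol choices are assumptions */
symbol1 value=dot  color=blue interpol=none;
symbol2 value=plus color=red  interpol=none;

proc gplot data=sashelp.class;
   /* one plotting symbol per level of the third variable */
   plot weight*height=sex;
   /* two separate plots: weight*height and weight*age */
   plot weight*(height age);
   /* two requests drawn on the same axes, using SYMBOL1 and SYMBOL2 */
   plot weight*age=1 height*age=2 / overlay;
run;
quit;
```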
Menu-driven graphics:
SAS/ASSIST (Solutions > ASSIST)
SAS/SPECTRAVIEW (Solutions > Analysis > 3D Visual Analysis)
The following material is taken from the SAS online manual.
Proc Univariate: Generating Line Printer Plots
The PLOTS option in the PROC UNIVARIATE statement provides up to four diagnostic line printer plots
to examine the data distribution. These plots are the stem-and-leaf plot or horizontal bar chart, the box
plot, the normal probability plot, and the side-by-side box plots. If you specify the WEIGHT statement,
PROC UNIVARIATE provides a weighted histogram, a weighted box plot based on the weighted
quantiles, and a weighted normal probability plot.
Box Plot
The box plot, also known as a schematic plot, appears beside the stem-and-leaf plot. Both plots use the
same vertical scale. The box plot provides a visual summary of the data and identifies outliers. The
bottom and top edges of the box correspond to the sample 25th (Q1) and 75th (Q3) percentiles. The
box length is one interquartile range (Q3 - Q1). The center horizontal line with asterisk endpoints
corresponds to the sample median. The central plus sign (+) corresponds to the sample mean. If the
mean and median are equal, the plus sign falls on the line inside the box. The vertical lines that project
out from the box, called whiskers, extend as far as the data extend, up to a distance of 1.5 interquartile
ranges. Values farther away are potential outliers. The procedure identifies the extreme values with a
zero or an asterisk (*). If zero appears, the value is between 1.5 and 3 interquartile ranges from the top
or bottom edge of the box. If an asterisk appears, the value is more extreme. To generate box plots
using high-resolution graphics, use the BOXPLOT procedure in SAS/STAT software.
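The rules in this paragraph can be reproduced by hand. The following is a minimal sketch, assuming the SASHELP.CLASS sample data set that ships with SAS; the data set names q, fences and flagged and the new variable names are hypothetical. It computes the quartiles, the 1.5 and 3 interquartile-range limits, and marks each observation the way the line printer box plot would.

```sas
/* Sketch only: SASHELP.CLASS and all new names are assumptions */
proc means data=sashelp.class noprint;
   var height;
   output out=q q1=q1 q3=q3 median=med mean=avg;
run;

data fences;
   set q;
   iqr = q3 - q1;                 /* box length */
   lower_inner = q1 - 1.5*iqr;    /* whiskers stop at or inside these limits */
   upper_inner = q3 + 1.5*iqr;
   lower_outer = q1 - 3*iqr;      /* beyond these limits the mark is an asterisk */
   upper_outer = q3 + 3*iqr;
run;

data flagged;
   if _n_ = 1 then set fences;    /* carry the limits onto every observation */
   set sashelp.class;
   length flag $ 1;
   if height < lower_outer or height > upper_outer then flag = '*';
   else if height < lower_inner or height > upper_inner then flag = '0';
   else flag = ' ';
run;

proc print data=flagged;
   var name height flag;
run;

/* the line printer box plot itself */
proc univariate data=sashelp.class plot;
   var height;
run;
```

Observations with a blank flag fall within the whiskers; the flagged ones should correspond to the zeros and asterisks drawn by the PLOT option, provided both procedures use the same percentile definition.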
Normal Probability Plot
The normal probability plot is a quantile-quantile plot of the data. The procedure plots the empirical
quantiles against the quantiles of a standard normal distribution. Asterisks (*) indicate the data values.
The plus signs (+) provide a straight reference line that is drawn by using the sample mean and
standard deviation. If the data are from a normal distribution, the asterisks tend to fall along the
reference line.
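A minimal sketch of how such a plot can be built step by step, assuming the SASHELP.CLASS sample data set and SAS/GRAPH; the data set and variable names (scores, stats, qq, nscore, reference) are hypothetical. PROC RANK supplies approximate standard normal quantiles (Blom scores), and the reference line is the sample mean plus the sample standard deviation times the normal quantile.

```sas
/* Sketch only: SASHELP.CLASS and all new names are assumptions */
proc rank data=sashelp.class normal=blom out=scores;
   var height;
   ranks nscore;                  /* normal quantile for each observation */
run;

proc means data=sashelp.class noprint;
   var height;
   output out=stats mean=m std=s;
run;

data qq;
   if _n_ = 1 then set stats;     /* carry mean and std onto every observation */
   set scores;
   reference = m + s*nscore;      /* straight reference line */
run;

proc sort data=qq;
   by nscore;
run;

symbol1 value=star interpol=none color=black;   /* data values */
symbol2 value=none interpol=join color=red;     /* reference line */

proc gplot data=qq;
   plot height*nscore=1 reference*nscore=2 / overlay;
run;
quit;
```

If the data are close to normal, the stars should lie near the red line.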
Side-by-Side Box Plots
When you use a BY statement with the PLOT option, PROC UNIVARIATE produces full-page
side-by-side box plots, one for each BY group. The box plots (also known as schematic plots) use a
common scale that allows you to compare the data distribution across BY groups. This plot appears
after the univariate analyses of all BY groups. Use the NOBYPLOT option to suppress this plot.
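A minimal sketch of a BY-group request, assuming the SASHELP.CLASS sample data set; the output data set name class is hypothetical. The data must be sorted by the BY variable first.

```sas
/* Sketch only: SASHELP.CLASS is an assumed sample data set */
proc sort data=sashelp.class out=class;
   by sex;
run;

proc univariate data=class plot;
   by sex;
   var height;
run;
```

Depending on the SAS version and on whether ODS Graphics is enabled, the plots may be rendered as graphics rather than as line printer output.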
Generating High-Resolution Graphics
If your site licenses SAS/GRAPH software, you can use the HISTOGRAM statement, PROBPLOT
statement, and QQPLOT statement to create high-resolution graphs.
The HISTOGRAM statement generates histograms and comparative histograms that allow you to
examine the data distribution. You can optionally fit families of density curves and superimpose kernel
density estimates on the histograms.
The QQPLOT statement generates a quantile-quantile plot, which compares ordered values of a
variable with quantiles of a specified theoretical distribution. Thus, you can use these plots to determine
how well a theoretical distribution models a set of measures.
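A minimal sketch of the three statements, assuming the SASHELP.CLASS sample data set; the choice of a normal distribution with estimated parameters is only an illustration.

```sas
/* Sketch only: SASHELP.CLASS is an assumed sample data set */
proc univariate data=sashelp.class noprint;
   /* histogram with a fitted normal curve and a kernel density estimate */
   histogram height / normal kernel;
   /* probability and quantile-quantile plots against a normal distribution
      whose parameters are estimated from the data */
   probplot height / normal(mu=est sigma=est);
   qqplot height / normal(mu=est sigma=est);
run;
```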
Controlling Graphics and Output in SAS
To send graphical output to destinations other than the graph window, we must set the
graphics device with the GOPTIONS statement. Some useful settings are:
goptions device=activex;
goptions device=java;
goptions device=gif;
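A minimal sketch of how a device setting combines with an ODS destination, assuming the SASHELP.CLASS sample data set; the output directory '.' and the file name chart.html are hypothetical and should be replaced with a writable location.

```sas
/* Sketch only: path and file name are hypothetical */
goptions reset=all device=gif;          /* charts are written as GIF images */
ods html path='.' body='chart.html';    /* HTML page that references the image */

proc gchart data=sashelp.class;
   vbar age / discrete;
run;
quit;

ods html close;
```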
data maiz;
   input local & $10. siembra : mmddyy8. hibrido $ repet peso
         rendim altura longmazo diammazo;
   format siembra date9.;
   /* declare the length so that 2.MEDIO is not truncated to 6 characters */
   length catrend $ 7;
   messiem=month(siembra);   /* planting month */
   if .<longmazo<16 then mazorca='MAZORCA LARGA';
   else mazorca='MAZORCA CORTA';
   /* yield categories; missing yields are left blank */
   if rendim>7000 then catrend='1.ALTO';
   if 5000<rendim<=7000 then catrend='2.MEDIO';
   if .<rendim<=5000 then catrend='3.BAJO';
   /* the & modifier needs at least two blanks after the location name */
   datalines;
Juana Diaz  03/20/96 MLxHU3 1 70 2735 214 17.5 4.3
Juana Diaz  03/20/96 TTxHU5 1 69 4109 220 18.9 4.1
…
Isabela  12/12/96 TTxHU5 4 76.5 2003 190 15.8 4.5
Isabela  12/12/96 MLxTT1 4 75.5 5056 210 17 5
Isabela  12/12/96 JRExRE 4 76 4485 210 16.3 4.6
;
proc sort data=maiz;
by local hibrido;
ods rtf ;
ods html ;
goptions device=activex;
proc univariate noprint data=maiz;
by local;
histogram rendim / normal;
qqplot altura / normal(mu=est sigma=est color=red)
pctlminor; /* requests minor tick marks for the percentile axis */
proc boxplot data=maiz;
by local;
plot altura*hibrido;
run;
proc gchart data=maiz;
vbar hibrido / type=mean sumvar=diammazo discrete;
vbar rendim / type=freq group=local levels=8;
pie catrend / type=freq ;
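/* PROC GPLOT selects plotting symbols through numbered SYMBOL statements */
symbol1 value=dot color=blue;
symbol2 value=plus color=red;
proc gplot data=maiz;
   plot longmazo*diammazo=local;
   /* SYMBOL1 marks longmazo and SYMBOL2 marks diammazo */
   plot longmazo*siembra=1 diammazo*siembra=2 / overlay legend;
run;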
data hat;
do x=-5 to 5 by 0.25;
do y=-5 to 5 by 0.25;
z=sin(sqrt(x*x+y*y));
output;
end;
end;
run;
proc g3d data=hat;
plot y*x=z;
run;
ods html close;
ods rtf close;