Uploaded by evaalvareztimsit

Lecture 1. Class Notes

Introduction to SAS
MSc Banking and Risk
Lecture 1. Class Notes.
The objective of this lecture is to become familiar with the SAS environment and learn how to create and
run a first basic program. We will also learn how to import and export files. These notes
complement the slides followed in class.
1. Starting out.
First, set up a directory where you want your work to be. (This is highly
recommended when you work on a public computer.)
a. Create a folder called IntroSASworkspace in your workspace.
b. Open SAS:
- All programs > SAS > SAS 9.4
You can see three main windows or areas:
- Explorer
- Log
- Editor
You can arrange these windows as you wish. Other windows may appear in
the future. For example, the results window.
c. Set the folder IntroSASworkspace as the default library, User.
- In the Editor window, write or copy this line of code:
LIBNAME user 'M:\IntroSASworkspace';
- Select it.
- Right click on the selected area.
- Click on “Submit Selection”.
You will see a note in the Log window confirming that the libref USER was successfully assigned.
Common errors:
- Misspelling the name/path of the folder.
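- Forgetting to create the folder before submitting the code.
From now on, when you create datasets, they will be saved in the folder
IntroSASworkspace. This happens because the libref is named user: SAS stores
datasets with one-level names in the User library instead of the temporary Work library.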
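As a quick check that the library assignment worked, the following minimal sketch can be submitted after the LIBNAME statement. It assumes the folder M:\IntroSASworkspace already exists; the wording of the %PUT message is only illustrative.

```sas
* Assign the libref, as in step c;
libname user 'M:\IntroSASworkspace';

* The LIBREF function returns 0 when the libref is assigned;
%put NOTE: libref check returned %sysfunc(libref(user));

* List the datasets currently stored in the library in the Log window;
proc datasets library=user;
quit;
```

If the folder is still empty, PROC DATASETS simply reports that no members were found.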
2. First SAS program.
In this section we are going to learn the basic structure of a SAS program and how
to run it.
a. Download the file myfirstSASprogram.sas into the folder
IntroSASworkspace.
b. Open the file in SAS.
- Option 1. Drag the file into the Editor window.
- Option 2. File > Open program > …
You can see a new Editor window with the content of the file.
This program has two parts: the Data Step and the Proc Step. You can also
see some comments.
c. Create a data set.
- Select the command lines corresponding to the Data Step and click on
“Submit Selection”.
Note that if the line “run;” is not among the submitted lines, then the
code will not actually run. The commands submitted will stay in the SAS
memory until “run;” is submitted.
We can now check that the dataset has been created. In the Explorer window,
go to Libraries and then User; you will find the dataset Myfirst there. The
dataset is also visible as a file in your working folder M:\IntroSASworkspace.
Out of curiosity, you can open the dataset in a new window by double clicking it
in the Explorer window. Another window called VIEWTABLE opens
with the data. I would not recommend this option for large datasets. Instead,
you can print the dataset using the PRINT procedure (PROC PRINT).
d. Printing a dataset.
- Go back to the Editor window with the program MyFirstSASprogram.
- Select the command lines corresponding to the PROC PRINT and
click on “Submit Selection”.
A new window called Results Viewer shows up. Moreover, the window
Results will show up on the left part, overlapping the Explorer window (you
can move from one to the other using the tabs at the bottom of the window).
The window Results can be used to navigate through the results when you
have many.
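To make the two-part structure concrete, here is a minimal sketch of a program with a Data Step and a Proc Step. It is not the content of myfirstSASprogram.sas, which may differ: the dataset name myfirst_sketch, the variable name, and the two observations are hypothetical and only illustrate the layout.

```sas
* Data Step: create a small dataset from inline data;
data myfirst_sketch;
    input name $ age gender $;
    datalines;
Ana 34 F
Tom 41 M
;
run;

* Proc Step: print the dataset in the Results Viewer;
proc print data=myfirst_sketch;
run;
```

Each step ends with run;, so the Data Step and the Proc Step can be selected and submitted separately, as described above.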
EXERCISE:

Modify the program ‘myfirstSASprogram.sas’:
- Add an observation using information about yourself.
- Delete the gender variable.
- Add a variable called ‘continent’.
- Print the dataset, but only the variables age and continent.
- Run the program.
3. Importing and exporting.
In practice, creating a SAS dataset from scratch is very uncommon. Usually, data are
downloaded from other sources in a variety of formats: Excel, CSV, Access... You may
also be interested in exporting your SAS dataset to another format, so that it can be used by
another program.
There are two ways of importing/exporting SAS datasets: using the Data Step or using
the Import/Export procedures.
For this section, download the following files into the IntroSASworkspace folder:
- auto.sas7bdat
- Home.xlsx
- stress.txt
3.1 Importing data from a text file
We can easily use the Data Step to import data that is in a text or a csv file.
*Importing data from a text file;
data stressImport;
infile 'M:\IntroSASworkspace\stress.txt';
input ID Name $ RestHR MaxHR RecHR TimeMin TimeSec Tolerance $;
run;
There are four statements in this step:
- data stressImport; indicates the name of the new SAS dataset.
- infile 'M:\IntroSASworkspace\stress.txt'; indicates the path of
the text file to be read. You can add options to this statement. For
instance:
  - delimiter=',' indicates that the data are delimited by commas (if
  the delimiter is a blank, this option is not needed).
  - dsd indicates that two consecutive delimiters with nothing in between
  represent a missing value.
- input ID Name $ RestHR MaxHR RecHR TimeMin TimeSec Tolerance $;
indicates the SAS names of the variables. If a variable is non-numeric,
its name is followed by “$”.
- run; indicates that the step is finished: all instructions are given and
SAS should run the program at this stage.
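The infile options described above can be combined as in the following sketch. It assumes a comma-delimited version of the stress data with a header line; the file name stress.csv and the dataset name stressCsv are hypothetical and are not among the course files.

```sas
* Sketch: reading a comma-delimited file with the Data Step;
* stress.csv is a hypothetical file name used only for illustration;
data stressCsv;
    * firstobs=2 skips the first line, assumed here to hold column names;
    infile 'M:\IntroSASworkspace\stress.csv' delimiter=',' dsd firstobs=2;
    input ID Name $ RestHR MaxHR RecHR TimeMin TimeSec Tolerance $;
run;
```

If the file has no header line, the firstobs=2 option should be dropped so that the first record is not lost.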
EXERCISE:
The file HighSchoolAbsenceRecords contains statistics about the attendance of students in the
different high schools in Edinburgh. It contains the following variables:
- School Name
- Attendance
- Authorized Absence
- Sickness Absence
- Late Arrivals
- Exceptional Reasons
- Unauthorized Absence
- Truancy Level
- Exclusions
Import the file into SAS using the DATA step. Print the data.
3.2 Exporting data into a text file
The following program creates a txt file from an existing SAS dataset.
data _null_;
set auto;
file 'M:\IntroSASworkspace\autoExport.txt';
put make price length;
run;
There are five statements in this step:
- data _null_; indicates that we are not going to create a new SAS
dataset.
- set auto; indicates the name of the SAS dataset to be read.
- file 'M:\IntroSASworkspace\autoExport.txt'; indicates the
name of the new file to be created.
- put make price length; indicates the SAS names of the variables to
be exported (you can export all variables, or select the few you are
interested in).
- run; indicates that the step is finished.
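By default, the put statement separates values with blanks. As a hedged variation on the step above, the sketch below writes a comma-delimited file with a header line instead. The output file name autoExportCsv.txt is hypothetical, and the use of the dlm= option is one possible choice, not something prescribed in these notes.

```sas
* Sketch: exporting selected variables as comma-delimited text;
data _null_;
    set auto;
    * dlm=',' separates the values written by put with commas;
    file 'M:\IntroSASworkspace\autoExportCsv.txt' dlm=',';
    * write a header line once, before the first observation;
    if _n_ = 1 then put 'make,price,length';
    put make price length;
run;
```

The automatic variable _n_ counts the iterations of the Data Step, so the header is written only on the first pass.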
EXERCISE:
Export to a txt file the dataset created in Section 3.1, keeping only the following information:
- School Name
- Attendance
- Authorized Absence
- Late Arrivals
- Truancy Level
- Exclusions
Open the file to check everything is ok.
3.3 Importing data from Excel
In this example we will learn how to import data from Excel (a Microsoft Excel 2007 or
2010 workbook in the .xlsx file format; for other versions, the code may vary).
a. Check the type of file.
  a. Go to folder IntroSASworkspace.
  b. Right click on the file Home.
  c. Left click on Properties.
  d. Check the version at ‘Type of file’: Microsoft Excel Worksheet (.xlsx)
Note that the extension of this file is xlsx.
Older versions of Excel have different extensions. For example, you may
find that the Type of file is Microsoft Excel 97-2003 Worksheet (.xls).
b. Check the file.
  a. Open the file Home in Excel.
  b. Note that the data is in the worksheet labelled “homesheet”.
  c. Note that the names of the variables are in the first row.
  d. Note that the data starts at the first row and first column.
c. IMPORTANT: close the file in Excel before running the SAS code.
d. Run the following code in SAS.
*Importing data from an Excel file;
proc import
datafile='M:\IntroSASworkspace\Home.xlsx'
out=homesas
dbms=xlsx
replace;
sheet='homesheet'; /* optional statement */
getnames=yes; /* the first row contains the variable names */
range='homesheet$A1:H118'; /* range where the data is hosted */
run;
e. Check the log. How many variables and observations does the dataset have? If there is
an error, check that the file was closed before running the program.
f. Check the SAS dataset. Go to the library and find the new SAS dataset there.
g. Open the dataset with VIEWTABLE. Why do some cells contain just a period (.)?
Check the corresponding cells in the Excel file.
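The checks in steps e to g can also be done with code rather than through the windows. The sketch below assumes the import above ran without errors and created the dataset homesas; the choice of five observations is arbitrary.

```sas
* Report the number of observations and the list of variables;
proc contents data=homesas;
run;

* Print only the first five observations instead of opening VIEWTABLE;
proc print data=homesas (obs=5);
run;
```

PROC CONTENTS also shows the type of each variable, which helps to spot columns that were read as character when numeric values were expected.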
EXERCISE:
Import the information about the primary schools available in the file “EdinburghSchools.xlsx”.
3.4 Exporting data into Excel
The following program creates an Excel file from an existing SAS dataset.
*Exporting data to an Excel file;
proc export
data=auto
dbms=xlsx
outfile='M:\IntroSASworkspace\autotoxls.xlsx'
replace;
run;
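When only some variables are needed in the Excel file, one possible approach is to add the keep= dataset option to the data= argument. This is a sketch under assumptions: the output file name autoSubset.xlsx is hypothetical, and the variables make and price are taken from the text export example in Section 3.2.

```sas
* Sketch: exporting only selected variables to an Excel file;
proc export
    data=auto (keep=make price)
    dbms=xlsx
    outfile='M:\IntroSASworkspace\autoSubset.xlsx'
    replace;
run;
```

The original dataset auto is not modified; the keep= option only restricts what is written to the new file.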
EXERCISE:
Export to an Excel file the dataset created in Section 3.1, keeping only the following information:
- School Name
- Attendance
- Authorized Absence
- Late Arrivals
- Truancy Level
- Exclusions
Open the file to check everything is ok.