Subsetting in SAS Part 1 – (If Statement)
Welcome back to Mindful Data Minds! In this session, we’ll learn how to subset datasets in SAS. Subsetting means selecting only the data you need, instead of working with the entire dataset.
Watch the Full Tutorial
What You Will Learn
- The difference between horizontal and vertical subsetting.
- How to use the IF statement to filter rows.
- How to fix issues with spaces using TRIM/STRIP.
- How to handle case sensitivity with UPCASE/LOWCASE.
- How to combine conditions with AND and OR.
Types of Subsetting
In SAS, you can subset data in two ways:
- Horizontal Subsetting → Filtering rows (like choosing only cars from a certain brand).
- Vertical Subsetting → Adding or dropping columns.
In this session, we’ll focus on horizontal subsetting using the IF statement.
Example Dataset
We’ll use the built-in dataset SASHELP.CARS. It contains information about cars, including make, model, type, origin, and more.
data a;
set sashelp.cars;
run;
This creates dataset a with all 428 rows and 15 columns.
Using IF Statement
Filter by Car Make – Suppose we only want cars made by Acura:
data a1;
set sashelp.cars;
if make = "Acura";
run;
Result: Only rows where make = Acura are included.
Filter by Make and Model
Now let’s filter for Acura MDX:
data a2;
set sashelp.cars;
if make = "Acura";
if model = "MDX";
run;
❌ Problem: No records appear! Why?
Because the MODEL column has a length of 40, meaning extra spaces are stored after “MDX.”
Fixing Character Filters
To handle extra spaces, SAS provides two functions:
- TRIM → Removes trailing spaces.
- STRIP → Removes both leading and trailing spaces.
💡Quick Tip – Use STRIP for character variables.
if strip(model) = "MDX";
Handling Case Sensitivity
SAS is case-sensitive. If you type "acura" instead of "Acura", the filter won’t work.
💡Quick Tip – Use functions to standardize text:
- LOWCASE(variable) → Converts to lowercase.
- UPCASE(variable) → Converts to uppercase.
if lowcase(strip(make))='acura';
if lowcase(strip(model))="mdx";
Combining Conditions
You can combine multiple conditions with AND or OR.
Example: Acura cars with MSRP > 30000
data a3;
set sashelp.cars;
if lowcase(strip(make))='acura';
if msrp>30000;
run;
Above code can also be written as:
data a3;
set sashelp.cars;
if lowcase(strip(make))='acura' and msrp>30000;
run;
Result: Only Acura cars priced above 30,000.
Example: Acura OR cars priced above 30000
data a4;
set sashelp.cars;
if lowcase(strip(make))='acura' or if msrp>30000;
run;
Result: All Acura cars plus any car with MSRP greater than 30,000.
Next Step
Continue learning by exploring the next tutorial in this series. Also subscribe to get notified about new lessons.
Have a Question?
Drop your doubts in the comments below or contact us.
