Subsetting in SAS – Part 3 (KEEP Statement)

Welcome back to Mindful Data Minds! In the last two sessions, we learned about horizontal subsetting using IF and WHERE. Now, let’s move to vertical subsetting — which means selecting specific columns instead of all of them.

In SAS, vertical subsetting is done using two statements:

  • KEEP → Keeps only the columns you specify.
  • DROP → Removes the columns you specify.

In this session, we’ll focus on the KEEP statement.

Watch the Full Tutorial

What You Will Learn

  • How to use KEEP to select specific columns.
  • How to keep all numeric or character variables.
  • How to use ranges (–) and sequences (-) for column selection.
  • How to use KEEP as a dataset option for efficiency.
  • How to apply KEEP when creating new columns or multiple datasets.

Starting Point

We’ll use the built-in dataset SASHELP.CARS, which has 428 rows and 15 columns. Some of the columns are: Make, Model, Type, Origin, Drivetrain, MSRP, Invoice, EngineSize, Cylinders, Horsepower, MPG_City, MPG_Highway, Weight, Wheelbase, Length

Example 1: Keep Specific Columns

Suppose we only want Make and Model:

data c1;
   set sashelp.cars;
   keep make model;
run;

Result: Dataset c1 with only 2 columns (Make and Model) and 428 rows.

Example 2: Keep All Numeric Columns

SAS provides shortcuts to select all numeric or character variables.

data c2a;
   set sashelp.cars;
   keep _numeric_;
run;

Result: Dataset with all 10 numeric columns.

Similarly, for character variables:

data c2b;
   set sashelp.cars;
   keep _character_;
run;

Result: Dataset with 5 character columns (Make, Model, Type, Origin, Drivetrain).

Example 3: Keep a Range of Columns

You can use double dash (–) to keep all columns between two variables.

data c3;
   set sashelp.cars;
   keep make--origin;
run;

Result: Dataset with columns Make, Model, Type, and Origin.

Example 4: Keep Sequentially Named Columns

data c4;
   set sashelp.appliance;
   keep units_1-units_13 cycle;
run;

Result: Dataset with Units_1 through Units_13 plus Cycle.

Example 5: Keep Columns by Name Pattern

You can keep all variables starting with a letter. For example, keep all variables starting with M:

data c5;
   set sashelp.cars;
   keep m:;
run;

Result: Columns like Make, Model, MSRP, MPG_City, MPG_Highway are kept.

Example 6: Using KEEP as a Dataset Option

Instead of writing KEEP inside the Data Step, you can use it as a dataset option. This is more efficient because SAS only loads the required columns into memory.

Example:

data c6;
   set sashelp.cars(keep=make model);
run;

Result: Dataset with only Make and Model, using less memory and time.

Example 7: Keep While Creating New Columns

You can also use KEEP when creating new columns:

data c7;
   set sashelp.cars;
   new_col = MSRP * 10;
   keep make new_col;
run;

Result: Dataset with Make and the new calculated column.

Example 8: Different KEEP Options for Different Outputs

If you’re creating multiple datasets in one step, you can apply KEEP differently:

data c8(keep=make new_col) c9(keep=make _numeric_);
   set sashelp.cars;
   new_col = MSRP * 10;
run;

Result:

  • c8 → Make and new_col.
  • c9 → Make and all numeric columns.

Next Step

Continue learning by exploring the next tutorial in this series. Also subscribe to get notified about new lessons.

Have a Question?

Drop your doubts in the comments below or contact us.

Leave a Reply

Your email address will not be published. Required fields are marked *