Analyzing Data with PROC MEANS in SAS

Welcome back to Mindful Data Minds! In this session, we’ll explore one of the most commonly used procedures in SAS: PROC MEANS. It’s a powerful tool for calculating descriptive statistics such as the count, sum, mean, median, percentiles, quartiles, standard deviation, and variance, and it can even report a one-sample t statistic (a test of whether a mean differs from zero).

What You Will Learn

  • How to use PROC MEANS for descriptive statistics.
  • How to select specific variables and statistics.
  • The difference between CLASS and BY.
  • How to save results into a dataset with OUTPUT.
  • How to interpret the _TYPE_ variable in grouped outputs.

General Syntax

proc means data=<dataset>;
   by <variable>;
   class <variable>;
   var <numeric variables>;
   output out=<dataset> <statistics>;
run;

  • DATA= → Input dataset
  • BY / CLASS → Grouping variables
  • VAR → Numeric variables to analyze
  • OUTPUT → Save results into a dataset

Default Behavior

By default, PROC MEANS reports the following for every numeric variable in the dataset:

  • Number of observations (N)
  • Mean
  • Standard deviation
  • Minimum
  • Maximum

proc means data=sashelp.cars;
run;

Produces summary statistics for every numeric variable in sashelp.cars.

Selecting Specific Variables

You can restrict analysis to certain variables:

proc means data=sashelp.cars;
   var msrp invoice enginesize;
run;

Only MSRP, Invoice, and Engine Size are analyzed.

Selecting Specific Statistics

You can choose which statistics to display:

proc means data=sashelp.cars n mean;
   var msrp invoice enginesize;
run;

Displays only N and Mean.
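
Other statistic keywords work the same way. As a small sketch (the variable markup and the dataset work.cars_markup are made up for this illustration), the first step asks for the median, quartiles, and 90th percentile, and the second uses the T and PROBT keywords to test whether the average markup differs from zero:

proc means data=sashelp.cars n mean median q1 q3 p90 maxdec=2;
   var msrp invoice enginesize;
run;

/* Hypothetical helper dataset: markup = sticker price minus invoice price */
data work.cars_markup;
   set sashelp.cars;
   markup = msrp - invoice;
run;

/* T and PROBT give the t statistic and p-value for H0: mean = 0 */
proc means data=work.cars_markup n mean stderr t probt;
   var markup;
run;

MAXDEC=2 only limits the number of decimals shown in the printed table.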

Grouping Data: CLASS vs. BY

CLASS Statement

proc means data=sashelp.cars;
   class origin;
   var msrp invoice enginesize;
run;

Produces one table with statistics grouped by Origin.

BY Statement

proc sort data=sashelp.cars out=a;
   by origin;
run;

proc means data=a;
   by origin;
   var msrp invoice enginesize;
run;

Produces separate tables for each Origin.

Note: The input data must be sorted by the BY variable (Origin here) before the BY statement is used, which is why PROC SORT runs first.

Difference:

  • CLASS → One combined table; no sorting required.
  • BY → Multiple separate tables; requires sorted input.
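
The two statements can also be combined. A minimal sketch, assuming you want one table per Origin with a row for each vehicle Type inside it (cars_sorted is just an illustrative dataset name):

proc sort data=sashelp.cars out=cars_sorted;
   by origin;
run;

proc means data=cars_sorted n mean;
   by origin;      /* one table per Origin; needs sorted input */
   class type;     /* rows within each table; no sorting needed */
   var msrp;
run;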

Saving Results with OUTPUT

You can save results into a dataset:

proc means data=sashelp.cars n mean median min max;
   class origin;
   var msrp invoice enginesize;
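   output out=cars_class n= mean= median= min= max= / autoname;
run;

Creates a dataset cars_class with the requested statistics for each variable, automatically named (e.g., msrp_mean, invoice_min). The OUTPUT statement needs its own statistic keywords, because the ones on the PROC MEANS line only control the printed report. The AUTONAME option goes after a slash.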
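
If you prefer to choose the column names yourself, the OUTPUT statement also accepts explicit names. A short sketch, where cars_named, avg_msrp, and max_msrp are names invented for this example:

proc means data=sashelp.cars noprint;
   class origin;
   var msrp;
   /* statistic(variable)=new-name */
   output out=cars_named mean(msrp)=avg_msrp max(msrp)=max_msrp;
run;

proc print data=cars_named;
run;

NOPRINT suppresses the printed report, so only the dataset is created. Besides the statistics, the dataset also contains the automatic variables _TYPE_ and _FREQ_.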

Understanding the _TYPE_ Variable

When using CLASS with OUTPUT, each row of the output dataset carries a _TYPE_ value:

  • _TYPE_=0 → Overall statistics (no grouping).
  • _TYPE_=1 → Grouped statistics (by Origin).

With BY, only grouped statistics are produced: the output dataset has one row per BY group, each with _TYPE_=0, and no overall row.
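
With more than one CLASS variable, _TYPE_ takes more values. A sketch assuming two class variables, Origin and Type (cars_types and avg_msrp are illustrative names):

proc means data=sashelp.cars noprint;
   class origin type;
   var msrp;
   output out=cars_types mean=avg_msrp;
run;

/* _TYPE_=0: overall
   _TYPE_=1: by Type only
   _TYPE_=2: by Origin only
   _TYPE_=3: by Origin and Type together */
proc print data=cars_types;
   where _type_ = 3;
run;

If you only need the most detailed rows, adding the NWAY option to the PROC MEANS statement keeps just the highest _TYPE_ value.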

Next Step

Continue learning by exploring the next tutorial in this series. Also subscribe to get notified about new lessons.

Have a Question?

Drop your doubts in the comments below or contact us.
