Summarizing Categorical Data with PROC FREQ in SAS

Welcome back to Mindful Data Minds! In this session, we’ll explore PROC FREQ, one of the most frequently used SAS procedures. It’s designed to summarize categorical variables, providing frequency counts, percentages, cumulative values, cross-tabulations, chi-square tests, and plots.

What You Will Learn

  • How to use PROC FREQ for categorical summaries.
  • How to control output with options like NOCUM and NOPERCENT.
  • How to generate cross-tabulations and list outputs.
  • How to save results into datasets.
  • How to perform chi-square tests.
  • How to create bar charts and dot plots.
  • How to handle missing values and sort results.

General Syntax

proc freq data=<dataset>;
   tables <variable(s)> / options;
run;

  • DATA= → Input dataset
  • TABLES → Variables to analyze; join two with * to request a cross-tabulation
  • Options → Listed after the slash; control output (e.g., nocum, nopercent, chisq, plots=)

Default Output

By default, PROC FREQ produces:

  • Frequency
  • Percentage
  • Cumulative frequency
  • Cumulative percentage
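
proc freq data=sashelp.cars;
run;

With no TABLES statement, PROC FREQ produces a one-way frequency table for every variable in the dataset, numeric ones included (MSRP, Invoice, Horsepower, etc.), not only categorical ones such as Make, Model, Type, and Origin. On a dataset like this the output gets long, so you will usually want a TABLES statement.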

Selecting Specific Variables

You can restrict analysis to one variable:

proc freq data=sashelp.cars;
   tables type;
run;

Shows frequency distribution for Type only.
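
To analyze a few variables at once, several names can go in one TABLES statement. A minimal sketch, assuming the same sashelp.cars dataset; each variable listed should get its own one-way table:

```sas
/* Three separate one-way frequency tables from a single TABLES statement. */
proc freq data=sashelp.cars;
   tables type origin drivetrain;
run;
```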

Controlling Output

No cumulative values:

tables type / nocum;

No percentages:

tables type / nopercent;
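
The two options can also be combined. As a small sketch, the complete step below should leave only the Frequency column in the table for Type:

```sas
/* Counts only: NOCUM drops the cumulative columns,
   NOPERCENT drops the percentage columns. */
proc freq data=sashelp.cars;
   tables type / nocum nopercent;
run;
```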

Cross-Tabulation

PROC FREQ can generate contingency tables:

proc freq data=sashelp.cars;
   tables type*origin;
run;

Produces a two-way table of Type by Origin in which each cell shows the frequency, overall percent, row percent, and column percent.

You can also cross-tab multiple pairs:

tables type*origin type*drivetrain;
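
If the four numbers per cell make the table hard to read, the cell statistics can be switched off individually. A minimal sketch using the NOROW, NOCOL, and NOPERCENT options of the TABLES statement, which should leave only the counts:

```sas
/* Cross-tabulation with counts only:
   NOROW     suppresses row percentages,
   NOCOL     suppresses column percentages,
   NOPERCENT suppresses overall percentages. */
proc freq data=sashelp.cars;
   tables type*origin / norow nocol nopercent;
run;
```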

List Format

To display results in list form:

tables type*origin / list;

Outputs one row per Type and Origin combination with its frequency, percent, cumulative frequency, and cumulative percent, instead of a grid.

Distinct Values

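To see how many unique values a variable has, add NLEVELS to the PROC FREQ statement (it is a procedure option, not a TABLES option):

proc freq data=sashelp.cars nlevels;
   tables type;

Adds a "Number of Variable Levels" table with the count of distinct values of Type; the frequency table itself lists each distinct category with its count.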

Saving Results

You can save frequency tables into a dataset:

proc freq data=sashelp.cars;
   tables type*origin / out=cars_cross_freq;
run;

Creates cars_cross_freq with one row per Type and Origin combination found in the data, plus the variables COUNT and PERCENT.
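
When the dataset is all you need, the printed table can be suppressed and the row and column percentages stored as well. A hedged sketch: NOPRINT and OUTPCT are standard PROC FREQ options, while the PROC PRINT step is only added here to inspect the result:

```sas
/* NOPRINT suppresses the displayed table; OUTPCT adds the row and column
   percentages (PCT_ROW, PCT_COL) to the OUT= dataset. */
proc freq data=sashelp.cars noprint;
   tables type*origin / out=cars_cross_freq outpct;
run;

/* Inspect the saved dataset. */
proc print data=cars_cross_freq noobs;
run;
```

Combinations with a zero count are left out of the OUT= dataset by default; adding the SPARSE option to the TABLES statement should include them.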

Chi-Square Test

PROC FREQ can test independence between categorical variables:

proc freq data=sashelp.cars;
   tables type*origin / chisq;
   output out=cars_chisq chisq;
run;

Displays the chi-square test results and saves the statistics (chi-square values, degrees of freedom, and p-values) in cars_chisq.
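
To check what was saved, the dataset can be printed. A minimal sketch, assuming the step above has already run in the same session so that cars_chisq exists; the three variables selected are the Pearson chi-square statistic, its degrees of freedom, and its p-value:

```sas
/* Pearson chi-square results from the OUTPUT statement's dataset:
   _PCHI_  = chi-square statistic
   DF_PCHI = degrees of freedom
   P_PCHI  = p-value */
proc print data=cars_chisq noobs;
   var _pchi_ df_pchi p_pchi;
run;
```

A small p-value suggests that Type and Origin are not independent. If the log warns that many cells have low expected counts, treat the chi-square result with caution.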

Visualizations

Enable graphics with ODS:

ods graphics on;
proc freq data=sashelp.cars;
   tables type / plots=freqplot;
run;
ods graphics off;

Generates a bar chart of frequencies.

For dot plots:

tables type / plots=freqplot(type=dotplot);
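
Plot options can be combined inside the parentheses. The complete step below is a sketch using the documented FREQPLOT options TYPE=, ORIENT=, and SCALE=; it should draw a horizontal dot plot of percentages rather than counts:

```sas
ods graphics on;

/* Horizontal dot plot of Type, scaled as percentages instead of counts. */
proc freq data=sashelp.cars;
   tables type / plots=freqplot(type=dotplot orient=horizontal scale=percent);
run;

ods graphics off;
```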

Handling Missing Data

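By default, missing values are left out of the table and the percentages, and only a "Frequency Missing" count is printed below the table. To treat them as a category of their own: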

proc freq data=sashelp.heart;
   tables deathcause / missing;
run;

Displays missing values as a row alongside the valid categories and includes them in the percentages.
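
If you want to see the missing row without letting it change the percentages, the MISSPRINT option is an alternative. A minimal sketch on the same dataset:

```sas
/* MISSPRINT shows the frequency of missing values in the table
   but leaves them out of the percentage calculations. */
proc freq data=sashelp.heart;
   tables deathcause / missprint;
run;
```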

Sorting Results

To sort by frequency (descending):

proc freq data=sashelp.heart order=freq;
   tables deathcause / missing;
run;

Lists the categories from most to least frequent instead of in the default order by value.
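
ORDER= accepts other values besides FREQ (DATA, FORMATTED, and INTERNAL). As a small sketch, ORDER=DATA should list the categories in the order they first appear in the input dataset:

```sas
/* ORDER=DATA keeps the levels in order of first appearance in sashelp.cars
   rather than sorting them by value or by frequency. */
proc freq data=sashelp.cars order=data;
   tables type;
run;
```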

Next Step

Continue learning by exploring the next tutorial in this series. Also subscribe to get notified about new lessons.
