Creating Our Own Dataset in SAS Using CARDS or DATALINES

Welcome back to Mindful Data Minds! In this session, we’ll learn how to create a dataset in SAS using text data that we already have — with the help of CARDS or DATALINES statements.

Why Use CARDS or DATALINES?

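    Sometimes we don’t want to import data from an external file. Instead, we want to quickly create a dataset by typing values directly into the SAS program. For this, we use CARDS or DATALINES. The two statements do the same thing: CARDS is the older name and DATALINES is the newer one, so you can use whichever you prefer.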

    What You Will Learn

    • How to create datasets using CARDS/DATALINES.
    • How SAS handles numeric vs. character variables.
    • How to define lengths for character variables.
    • How to use colon (:) and ampersand (&) modifiers.

    Step 1: Define the Data Step

    Start by writing a Data Step:

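    data a;
       /* EmpID is numeric; a $ after a name marks that variable as character */
       input EmpID Name $ Department $ City $;
       /* Data records start on the next line and end at the lone semicolon */
       datalines;
    1 A Sales Delhi
    2 B HR Mumbai
    3 C Analytics Chandigarh
    ;
    run;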
    

    Explanation:

    • data a; → Creates a dataset named a.
    • input → Defines the variables (EmpID, Name, Department, City).
    • $ → Marks character variables.
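    • datalines; → Indicates that the data records start on the next line.
    • The rows below are the actual data records, one observation per line, with values separated by spaces.
    • ; → A semicolon on its own line marks the end of the data.
    • run; → Ends the step so that SAS runs it.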
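Since the title mentions both statements, here is a minimal sketch of the same step written with CARDS instead of DATALINES, followed by a PROC PRINT to check the result. The dataset name a_cards is only an illustrative choice, not something from the tutorial.

```sas
/* Same step as above, written with CARDS instead of DATALINES. */
/* The dataset name a_cards is a made-up name for this sketch.  */
data a_cards;
   input EmpID Name $ Department $ City $;
   cards;
1 A Sales Delhi
2 B HR Mumbai
3 C Analytics Chandigarh
;
run;

/* Print the dataset to check what SAS actually stored */
proc print data=a_cards noobs;
run;
```

When you look at the printed output, pay attention to the longer values in the third row. They are a good way to see the default character length, which Step 3 deals with.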

    Step 2: Understanding Data Types

    • By default, SAS treats variables as numeric unless you add a $ for character.
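    • Numeric missing values are shown as a dot (.).
    • Character missing values are shown as blank.
    • When typing the data lines, use a single dot as a placeholder for any missing value, numeric or character, so that the remaining values still line up with their variables.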
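To see how missing values behave, here is a small sketch. The dataset name missing_demo and the Salary variable are assumptions added only for this illustration. A single dot in the data lines stands in for a missing value in both a numeric and a character position.

```sas
/* missing_demo and Salary are hypothetical names used only here */
data missing_demo;
   input EmpID Name $ Salary;
   datalines;
1 A 50000
2 . 42000
3 C .
;
run;

/* Row 2 should show a blank Name; row 3 should show a dot for Salary */
proc print data=missing_demo noobs;
run;
```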

    Step 3: Handling Lengths

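    By default, SAS gives a character variable read this way a length of 8, so longer values such as Analytics or Chandigarh are cut off after 8 characters. To fix this, declare the lengths with a LENGTH statement placed before the INPUT statement: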

    length Name $ 15 Department $ 20 City $ 20;
    

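    This lets SAS store text values up to the declared lengths. Keep in mind that variables named in LENGTH are created first, so they also come first in the dataset's column order.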
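Here is a sketch of where the LENGTH statement sits in the full step, using the same sample data as Step 1. The dataset name a_len is just an illustrative choice.

```sas
/* a_len is a made-up dataset name for this sketch */
data a_len;
   /* LENGTH must come before INPUT so the lengths are set first */
   length Name $ 15 Department $ 20 City $ 20;
   input EmpID Name $ Department $ City $;
   datalines;
1 A Sales Delhi
2 B HR Mumbai
3 C Analytics Chandigarh
;
run;

/* VARNUM lists the variables in creation order along with their lengths */
proc contents data=a_len varnum;
run;
```

PROC CONTENTS is a quick way to confirm that the lengths were applied as intended.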

      Step 4: Using Modifiers

      SAS provides modifiers to handle special cases:

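      • Colon (:) → Lets you attach an informat such as $15. to a variable. SAS reads the value until it reaches a delimiter (like a space) or the informat's width, whichever comes first.
      • Ampersand (&) → Allows single spaces inside a value (e.g., first name + last name), so multiple words are read into one variable. The value ends when SAS finds two or more spaces in a row, so leave at least two spaces after it in the data.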
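A minimal sketch of the colon modifier, assuming the same sample data as Step 1 (the dataset name a_colon is hypothetical). Each informat width also sets the variable's length here, because no LENGTH statement comes before it.

```sas
/* a_colon is a made-up dataset name for this sketch */
data a_colon;
   /* :$20. reads up to the next space, storing at most 20 characters */
   input EmpID Name :$15. Department :$20. City :$20.;
   datalines;
1 A Sales Delhi
2 B HR Mumbai
3 C Analytics Chandigarh
;
run;

proc print data=a_colon noobs;
run;
```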

      Example with ampersand:

      input EmpID Name & $20. Department $ City $;
      

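      This ensures “John Doe” is read as one value for Name, as long as at least two spaces follow it in the data line. Note the period at the end of $20. because SAS needs it to recognize an informat.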
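Here is a sketch of the ampersand modifier in a complete step. The dataset name a_amp and the second sample name are made up for illustration. Notice the two spaces after each full name in the data lines.

```sas
/* a_amp and the sample names are hypothetical, for illustration only */
data a_amp;
   /* & lets Name contain single spaces; two spaces end the value */
   input EmpID Name & $20. Department $ City $;
   datalines;
1 John Doe  Sales Delhi
2 Jane Roe  HR Mumbai
;
run;

proc print data=a_amp noobs;
run;
```

If only one space follows the name, SAS keeps reading the next word into Name, which shifts every later value into the wrong variable.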

      Final Dataset

      After running the code, you’ll get a dataset with:

      • Employee IDs (numeric)
      • Names (character, with full names supported)
      • Departments (character)
      • Cities (character)
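Putting the pieces together, one possible version of the full program could look like the sketch below. The dataset name employees and the sample rows are assumptions for this example, and this is only one of several ways to combine the steps.

```sas
/* employees and the sample rows are hypothetical, for illustration only */
data employees;
   /* & allows spaces inside Name; : applies an informat to the others */
   input EmpID Name & $20. Department :$20. City :$20.;
   datalines;
1 John Doe  Sales Delhi
2 Jane Roe  HR Mumbai
3 Sam Lee  Analytics Chandigarh
;
run;

/* Check variable types and lengths, then look at the rows */
proc contents data=employees varnum;
run;

proc print data=employees noobs;
run;
```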

        Next Step

        Continue learning by exploring the next tutorial in this series. Also subscribe to get notified about new lessons.
