Using the WHERE statement may improve the efficiency of your SAS programs because SAS is not required to read all observations from the input data set.
The WHERE statement cannot be executed conditionally; that is, you cannot use it as part of an IF-THEN statement.
WHERE statements can contain multiple WHERE expressions that are joined by logical operators.
Note: Using indexed SAS data sets can significantly improve performance when you use WHERE expressions to access a subset of the observations in a SAS data set. For more information about indexes, see "Indexes" in the "SAS Data Files" chapter of SAS Language Reference: Concepts for a complete discussion of WHERE-expression processing with indexed data sets and a list of guidelines to consider before you index your SAS data sets.
In DATA Steps
The WHERE statement applies to all data sets in the preceding SET, MERGE, MODIFY, or UPDATE statement, and variables that are used in the WHERE statement must appear in all of those data sets. You cannot use the WHERE statement with programming statements that select observations by observation number, such as the OBS= data set option and the POINT= option in the SET and MODIFY statements. When you use the WHERE statement, the FIRSTOBS= data set option must be 1. You cannot use the WHERE statement to select records from an external file that contains raw data, nor can you use the WHERE statement within the same DATA step in which you read in-stream data with a CARDS or DATALINES statement.
For each iteration of the DATA step, the first operation the SAS System performs in each execution of a SET, MERGE, MODIFY, or UPDATE statement is to determine whether the observation in the input data set meets the condition
of the WHERE statement. The WHERE statement takes effect immediately after the input data set options are applied and before any other statement in the DATA step is executed. If a DATA step combines observations using a WHERE statement with a MERGE, MODIFY, or UPDATE statement, SAS selects observations from each input data set before it combines them.
WHERE and BY in a DATA Step
If a DATA step contains both a WHERE statement and a BY statement, the WHERE statement executes before BY groups are created. Therefore, BY groups reflect groups of observations in the subset of observations that are selected by the WHERE statement, not the actual BY groups of observations in the original input data set.
For a complete discussion of BY-group processing, see "BY-Group Processing" in SAS Language Reference: Concepts.
In PROC Steps
You can use the WHERE statement with any SAS procedure that reads a SAS data set. The WHERE statement is useful for subsetting the original data set for processing by the procedure. The SAS Procedures Guide documents the action of the WHERE statement only in those procedures for which you can specify more than one data set. In all other cases, the WHERE statement performs as documented here.
Use of Indexes
A DATA or PROC step attempts to use an available index to optimize the selection of data when an indexed variable is used in combination with one of the following:
- the BETWEEN-AND operator
SUBSTR requires the following arguments: An index is used in processing when the arguments of the SUBSTR function meet all of the following conditions: