The most general way to subset a data frame by rows and columns is the base R Extract function, called by d[rows, cols], where d is the data frame. To use this function, pass to the rows parameter the selected rows, identified either by their indices or by their row names. For the cols parameter, pass the column indices of the selected columns, or their names.
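As a minimal sketch of these two ways of identifying rows and columns, consider the following base R example. The small data frame is a stand-in built for illustration: the first two rows borrow values from the listing later in this section, and the third row is hypothetical.

```r
# Toy data frame for illustration only; the third row is hypothetical
d <- data.frame(
  Years  = c(7, 15, 2),
  Gender = c("M", "M", "F"),
  Post   = c(92, 97, 85),
  row.names = c("Ritchie, Darnell", "Hoang, Binh", "Doe, Jane")
)

# Select rows and columns by integer index
by_index <- d[c(1, 2), c(1, 3)]

# Select the same rows by row name and the same columns by quoted name
by_name <- d[c("Ritchie, Darnell", "Hoang, Binh"), c("Years", "Post")]

print(by_index)
print(by_name)
```

Both calls select the same two rows and the same two columns, so the choice between indices and names is a matter of convenience.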
In many uses, perhaps most uses, the indices are not directly provided, but obtained through some logical specification. The annoying feature of the rows parameter is that any reference to the variables in the logical specification must contain the name of the data frame followed by a $. But this name has already been specified in the function call, and now is simply repeated for every variable reference.
The column specification is usually provided as a variable list. Here the annoying aspect is that all variable names must be quoted, and no variable ranges are permitted. Quite clumsy.
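To make the two complaints concrete, here is a sketch in base R on a hypothetical toy data frame (the values in the third row are invented for illustration). It shows the logical vector that the rows specification produces, and one possible workaround for the missing variable range, which looks up the column positions with match(). The names keep, first, last, and cols are chosen here only for the example.

```r
# Toy data frame for illustration only; the third row is hypothetical
d <- data.frame(
  Years  = c(7, 15, 2),
  Gender = c("M", "M", "F"),
  Dept   = c("ADMN", "SALE", "ACCT"),
  Salary = c(53788.26, 111074.86, 40000),
  Post   = c(92, 97, 85)
)

# Rows: every variable reference repeats the data frame name and $
keep <- d$Gender == "M" & d$Post > 90
print(keep)

# Columns: d[, Years:Salary] does not work in base R, because Years and
# Salary are not objects in the calling environment.
# One workaround is to translate the names to positions first.
first <- match("Years", names(d))
last  <- match("Salary", names(d))
cols  <- c(first:last, match("Post", names(d)))

result <- d[keep, cols]
print(result)
```

The workaround runs, but it takes three extra lines to express what Years:Salary would say directly.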
To address these deficiencies, lessR provides two functions, ir() for index rows, and ic() for index columns. The general form of the subsetting includes these two functions.
d[ir(), ic()]
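Both functions, only callable within the Extract function, use what R refers to as non-standard evaluation. That basically means that the arguments are captured as unevaluated expressions and then evaluated in the context of the data frame, so variable names can be written directly, without the d$ prefix and without quotes. The annoying restrictions are removed.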
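For readers unfamiliar with non-standard evaluation, the following base R sketch shows the general mechanism. It is not the lessR implementation: rows_where() is a hypothetical helper written for this example, and unlike ir() it takes the data frame as an explicit argument. The toy data frame is also invented for illustration.

```r
# Hypothetical helper, NOT the lessR implementation of ir()
rows_where <- function(data, condition) {
  # Capture the condition as an unevaluated expression
  expr <- substitute(condition)
  # Evaluate it with the columns of data visible as variables,
  # falling back to the caller's environment for anything else
  keep <- eval(expr, data, parent.frame())
  # Convert the logical vector to row indices, dropping any NA
  which(keep)
}

# Toy data frame for illustration only
d <- data.frame(
  Gender = c("M", "F", "M"),
  Post   = c(92, 95, 80)
)

# Variables are referenced by bare name, with no d$ prefix
idx <- rows_where(d, Gender == "M" & Post > 90)
print(d[idx, ])
```

The base R function subset() relies on the same substitute() and eval() pattern, which is why it also accepts bare variable names.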
To illustrate, use the Employee data set contained in lessR, here read into the d data frame.
Subset the data frame by only listing observations with a Gender of “M” with scores on Post larger than 90. Only list columns for the variables in the range from Years to Salary, and Post. Referring back to the output of Read(), the variable range includes Years, Gender, Dept, and Salary.
##                   Years Gender Dept    Salary Post
## Ritchie, Darnell      7      M ADMN  53788.26   92
## Hoang, Binh          15      M SALE 111074.86   97
## Pham, Scott          13      M SALE  81871.05   94
## Correll, Trevon      21      M SALE 134419.23   94
## Langston, Matthew     5      M SALE  49188.96   93
## Anderson, David       9      M ACCT  69547.60   91
Following is the traditional R call for subsetting.
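Based on the description above, the traditional call presumably takes the form sketched here. So that the example runs on its own, d is a stand-in built from the six rows in the listing plus two hypothetical rows that fail the condition. With the actual Employee data in d, only the last statement would be needed.

```r
# Stand-in for the Employee data: six rows copied from the listing,
# plus two hypothetical rows that should be filtered out
d <- data.frame(
  Years  = c(7, 15, 13, 21, 5, 9, 3, 10),
  Gender = c("M", "M", "M", "M", "M", "M", "M", "F"),
  Dept   = c("ADMN", "SALE", "SALE", "SALE", "SALE", "ACCT", "MKTG", "FINC"),
  Salary = c(53788.26, 111074.86, 81871.05, 134419.23,
             49188.96, 69547.60, 50000, 60000),
  Post   = c(92, 97, 94, 94, 93, 91, 80, 95),
  row.names = c("Ritchie, Darnell", "Hoang, Binh", "Pham, Scott",
                "Correll, Trevon", "Langston, Matthew", "Anderson, David",
                "Hypothetical, A", "Hypothetical, B")
)

# Traditional base R subsetting: d$ on every variable, every name quoted
result <- d[d$Gender == "M" & d$Post > 90,
            c("Years", "Gender", "Dept", "Salary", "Post")]
print(result)
```

The data frame name appears three times in the call, and the range from Years to Salary has to be spelled out name by name.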
##                   Years Gender Dept    Salary Post
## Ritchie, Darnell      7      M ADMN  53788.26   92
## Hoang, Binh          15      M SALE 111074.86   97
## Pham, Scott          13      M SALE  81871.05   94
## Correll, Trevon      21      M SALE 134419.23   94
## Langston, Matthew     5      M SALE  49188.96   93
## Anderson, David       9      M ACCT  69547.60   91