Wednesday, January 14, 2009

R wish list

I was part of an R discussion recently. I haven't really used R except for the vars package, and I've been trying to make the switch from SAS, but it hasn't worked because of all the pre-processing I have to do to get a data set ready for analysis.

My wish list for R is as follows (these features may already exist; I just haven't found them with my mediocre knowledge or quick Google searches of the R discussion list):
1. An input statement for processing text files, like the one in SAS. This is key to reading public use files, which are usually very large, without having to read the entire file using read.table or the Fortran-style syntax for reading files.
2. A documented way to read files without using data frames. Several commenters noted that this is possible, but I was not able to find a reference to it on the R discussion list. I'm thinking that this is achieved using vectors or matrices, but I haven't quite figured it out yet.
3. A first. and last. ("first dot" and "last dot") syntax similar to SAS, or an egen command similar to Stata's.
4. A keep or drop option in the R foreign package, so that I don't have to read the entire data set into memory. I tried to read the public use version of the World Values Survey data, which was in Stata xpt format, but my computer didn't have enough memory to handle it.
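
For what it's worth, here is a rough sketch of how items 1, 3, and 4 might be approximated in base R. It is only an illustration under made-up assumptions: the fixed-width layout (id in columns 1-3, year in 4-7, one unwanted column, score in 9-10), the sample file, and the helper name read_fixed_chunks are all hypothetical, and the tiny chunk size is just for demonstration. It reads a few lines at a time from a connection, keeps only the columns that are asked for, and then builds first./last. style flags and an egen-like group mean.

```r
# Hypothetical fixed-width layout (an assumption for this sketch):
#   id = columns 1-3, year = columns 4-7, column 8 = unwanted, score = columns 9-10
path <- tempfile(fileext = ".txt")
writeLines(c("0012005X12",
             "0012006X15",
             "0022005X09",
             "0032005X20",
             "0032006X22",
             "0032007X25"), path)

# Hypothetical helper: read the file a few lines at a time and keep only the
# requested column ranges, so the full raw file is never held in memory at once.
read_fixed_chunks <- function(path, starts, ends, col_names, chunk_size = 2) {
  con <- file(path, open = "r")
  on.exit(close(con))
  pieces <- list()
  repeat {
    lines <- readLines(con, n = chunk_size)
    if (length(lines) == 0) break            # end of file reached
    fields <- lapply(seq_along(starts),
                     function(i) substr(lines, starts[i], ends[i]))
    names(fields) <- col_names
    pieces[[length(pieces) + 1]] <- as.data.frame(fields, stringsAsFactors = FALSE)
  }
  do.call(rbind, pieces)
}

# "Keep" only id, year, and score; column 8 is never stored.
dat <- read_fixed_chunks(path,
                         starts = c(1, 4, 9),
                         ends = c(3, 7, 10),
                         col_names = c("id", "year", "score"))
dat$year <- as.integer(dat$year)
dat$score <- as.integer(dat$score)

# first./last. style flags; this assumes the data are already sorted by id.
dat$first_id <- !duplicated(dat$id)
dat$last_id <- !duplicated(dat$id, fromLast = TRUE)

# An egen-like group statistic: mean score within each id.
dat$mean_score <- ave(dat$score, dat$id)

print(dat)
```

Base R's read.fwf can also skip unwanted columns by giving them negative widths, which may be simpler when the file fits comfortably in memory; the connection-based loop above is just one way to avoid holding the whole raw file at once.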

I realize that R is not a data processing package and that something like Perl could also work, but it's always nice to have everything integrated instead of having to deal with two languages and porting back and forth between them to do what I consider basic data processing tasks. I consider data analysis to be 90 percent processing and 10 percent analysis, and then another 100 percent fooling around with different packages to get the results I want.
