python - Deleting all but a few Variables from a Dataset in SPSS




I have a data set with about 1000 variables. I want to work on a data set that contains only a small subset of these variables. Are there convenient ways to delete the variables that are not needed?

Using the delete variables command, such as

delete variables var1 var13 var15 var17 var35 ...

would be quite annoying and error-prone.

Sample data

data list list (",") / make (a18) cost (f4) mpg (f2) rep78 (f1) hdroom (comma1.1) trunk (f2) weight (f4) length (f3) turn (f2) displ (f3) gratio (comma2.2) foreign (f1) .
begin data.
amc concord, 4099, , 3, 2.5, 11, 2930, 186, 40, 121, 3.58, 0
amc pacer, 4749, , 3, 3.0, 11, 3350, 173, 40, 258, 2.53, 0
amc spirit, 3799, , 3, .0, 12, 2640, 168, 35, 121, 3.08, 0
audi 5000, 9690, 17, 5, 3.0, 15, 2830, 189, 37, 131, 3.20, 1
audi fox, 6295, 23, 3, 2.5, 11, 2070, 174, 36, 97, 3.70, 1
bmw 320i, 9735, 25, 4, 2.5, 12, 2650, 177, 34, 121, 3.64, 1
buick century, 4816, 20, 3, 4.5, 16, 3250, 196, 40, 196, 2.93, 0
buick electra, 7827, 15, 4, 4.0, 20, 4080, 222, 43, 350, 2.41, 0
buick lesabre, 5788, 18, 3, 4.0, 21, 3670, 218, 43, 231, 2.73, 0
end data.
dataset name cars.

Let's say I want to extract the variables make, trunk and turn.

Save file (and get file)

If you want to keep the reduced data set for later use, the obvious way is to use the /keep option of the save command.

save outfile='cars.sav' /keep make trunk turn.

On the other hand, if you want to open an existing data set with only a subset of its variables, you can use the get file command with the /keep option.
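As a rough sketch of how such a command could be put together when the keep list is long, the Python helper below only builds the syntax text. The function name build_get_keep and the file name cars.sav are assumptions for illustration; inside SPSS the resulting string could be passed to spss.Submit.

```python
def build_get_keep(path, keep):
    """Return GET FILE syntax that opens `path` with only the variables in `keep`."""
    if not keep:
        raise ValueError("keep list must not be empty")
    # A single quote inside a single-quoted SPSS string is written twice.
    quoted = path.replace("'", "''")
    return "GET FILE='%s' /KEEP=%s." % (quoted, " ".join(keep))


# Hypothetical file name; the variable names come from the sample data.
command = build_get_keep("cars.sav", ["make", "trunk", "turn"])
print(command)
```

Building the text separately makes it easy to inspect the command before it is run.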

Match files

This method is useful if you want to work temporarily with the reduced data set without storing it on the hard drive.

match files /file * /keep make trunk turn.

Python VariableList class

The methods above may take some time on a huge data set, because they read (and write) the data. In that case it might be helpful to manipulate the data dictionary directly using Python. Well, that's the idea. This is what I have tried so far, more or less successfully.

begin program python.
import spss

# Variables to keep; everything else is deleted.
keeplist = ["make", "trunk", "turn"]

# Read the variable names from the data dictionary inside a data step.
spss.StartDataStep()
datasetObj = spss.Dataset('cars')
varListObj = datasetObj.varlist
varlist = [var.name.encode('utf8') for var in varListObj]
datasetObj.close()
spss.EndDataStep()

# Delete every variable that is not in the keep list.
deletelist = [item for item in varlist if item not in keeplist]
spss.Submit("delete variables %s." % " ".join(deletelist))
end program.

Note: this piece of code only works if the list of variables to delete is not much longer than 100 (this bug should be fixed in SPSS v23 or higher). Otherwise you have to split the list into separate pieces. I tried replacing the spss.Submit line with:

chunks = 100
for i in xrange(0, len(deletelist), chunks):
    spss.Submit("delete variables %s." % " ".join(deletelist[i:i+chunks]))
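The chunking idea can also be sketched as a small pure-Python helper that only builds the command strings, so it can be checked outside SPSS. The name delete_commands, the var1..var250 names and the case-insensitive comparison are assumptions added for illustration (SPSS treats variable names as case-insensitive); each returned string would then be passed to spss.Submit.

```python
def delete_commands(all_vars, keep, chunk_size=100):
    """Build DELETE VARIABLES commands for every variable not in `keep`."""
    # Compare names case-insensitively (assumption, see the text above).
    keep_lower = set(name.lower() for name in keep)
    to_delete = [name for name in all_vars if name.lower() not in keep_lower]
    commands = []
    # One command per slice of at most `chunk_size` variable names.
    for start in range(0, len(to_delete), chunk_size):
        chunk = to_delete[start:start + chunk_size]
        commands.append("delete variables %s." % " ".join(chunk))
    return commands


# Hypothetical data set: the three variables to keep plus var1..var250.
all_vars = ["make", "trunk", "turn"] + ["var%d" % i for i in range(1, 251)]
commands = delete_commands(all_vars, ["Make", "trunk", "turn"])
print(len(commands))
```

With 250 variables to delete and a chunk size of 100, this yields three commands.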

The problem, though: the program block itself runs fast on big data sets, but it turned out that after the block every subsequent command slowed down painfully, maybe because of a memory leak.

python spss
