xml - Congressional Bills into R data table



I've read many articles, Stack Overflow answers, and examples on how to extract information from XML files into a table in R. I have been unsuccessful in my attempts; perhaps it's because of the XML files themselves? I'm posting an example XML file below. If anyone could take a look and point me in the right direction for getting these files into a table, that would be helpful.

    <bill session="113" type="s" number="12" updated="2014-09-20t07:17:52-04:00">
      <state datetime="2013-02-26">referred</state>
      <status>
        <introduced datetime="2013-02-26"/>
      </status>
      <introduced datetime="2013-02-26"/>
      <titles>
        <title type="short" as="introduced">naval vessel transfer deed of 2013</title>
        <title type="official" as="introduced">a bill provide transfer of naval vessels foreign recipients.</title>
      </titles>
      <sponsor id="402675"/>
      <cosponsors>
        <cosponsor id="412491" joined="2013-11-05"/>
      </cosponsors>
      <actions>
        <action datetime="2013-02-26" state="referred">
          <text>read twice , referred commission on foreign relations.</text>
        </action>
      </actions>
      <committees>
        <committee code="ssfr" name="senate foreign relations" activity="referral, in committee"/>
      </committees>
      <relatedbills>
        <bill relation="unknown" session="113" type="s" number="1683"/>
      </relatedbills>
      <subjects>
        <term name="international affairs"/>
        <term name="asia"/>
        <term name="buy american requirements"/>
        <term name="latin america"/>
        <term name="marine , inland water transportation"/>
        <term name="mexico"/>
        <term name="military assistance, sales, , agreements"/>
        <term name="military facilities , property"/>
        <term name="taiwan"/>
        <term name="thailand"/>
      </subjects>
      <amendments/>
      <summary>2/26/2013--introduced. naval vessels transfer deed of 2013 - authorizes president transfer on grant basis to: (1) mexico, oliver hazard perry class guided missile frigates curts , mcclusky; , (2) thailand, oliver hazard perry class guided missile frigates rentz , vandegrift. authorizes president transfer on sale basis oliver hazard perry class guided missile frigates taylor, gary, carr, , elrod taipei economic , cultural representative office of united states of america (which taiwan instrumentality designated pursuant taiwan relations act).
    states that: (1) value of such vessels transferred on grant basis shall not counted against aggregate value of excess defense articles transferred countries in financial year under foreign assistance deed of 1961; (2) transfer costs shall charged recipient; , (3) maximum extent practicable, country vessel transferred shall have necessary vessel repair , refurbishment carried out @ u.s. shipyards (including u.s. navy shipyards). terminates transfer authorization 3 years after enactment of act.</summary>
    </bill>

You could try splitting the XML into separate bills (avoiding the related bills) and use XPath queries to select whatever columns you need, using lapply or a loop.

    library(XML)

    doc <- xmlParse("lotsofbills.xml")
    # Select top-level bills only; <bill> elements nested under <relatedbills> are skipped
    nodes <- getNodeSet(doc, "//bill[not(ancestor::bill)]")

    x <- lapply(nodes, function(x) {
      data.frame(
        bill_session    = xpathSApply(x, ".", xmlGetAttr, "session"),
        short_title     = xpathSApply(x, ".//title[@type='short']", xmlValue),
        action_datetime = xpathSApply(x, ".//actions/action", xmlGetAttr, "datetime"),
        action          = xpathSApply(x, ".//actions/action/text", xmlValue),
        # Collapse all subject terms into a single string per bill
        subjects        = paste(xpathSApply(x, ".//subjects/term", xmlGetAttr, "name"),
                                collapse = "; ")
      )
    })
    do.call("rbind", x)

      bill_session                        short_title action_datetime
    1          113 naval vessel transfer deed of 2013      2013-02-26
                                                      action
    1 read twice , referred commission on foreign relations.
      subjects
    1 international affairs; asia; buy american requirements; latin america; marine , inland water transportation; mexico; military assistance, sales, , agreements; military facilities , property; taiwan; thailand
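To see what the not(ancestor::bill) predicate does, here is a minimal sketch on a made-up two-bill document. The wrapping `<bills>` element and the second bill (number 13) are assumptions for illustration only; I don't know how your real files are wrapped.

```r
library(XML)

# Hypothetical input: two top-level bills, the first one nesting a related <bill>
txt <- '<bills>
  <bill session="113" type="s" number="12">
    <relatedbills>
      <bill relation="unknown" session="113" type="s" number="1683"/>
    </relatedbills>
  </bill>
  <bill session="113" type="s" number="13"/>
</bills>'

doc <- xmlParse(txt, asText = TRUE)

# Every <bill> element, including the one nested under <relatedbills>
all_bills <- getNodeSet(doc, "//bill")

# Only <bill> elements that have no <bill> ancestor, i.e. the top-level ones
top_bills <- getNodeSet(doc, "//bill[not(ancestor::bill)]")

length(all_bills)
length(top_bills)

# Bill numbers of the top-level bills only
top_numbers <- sapply(top_bills, xmlGetAttr, "number")
top_numbers
```

The plain `//bill` query picks up the related bill as well, which would give you an extra, mostly empty row in the final table.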

And for comparison, here's a loop, which may be easier to use if you are unfamiliar with the XML file.

    x <- vector("list", length(nodes))
    for (i in seq_along(nodes)) {
      # Copy the node into its own document so absolute XPath queries stay within this bill
      subdoc <- xmlDoc(nodes[[i]])
      bill_session    <- xpathSApply(subdoc, "/bill", xmlGetAttr, "session")
      short_title     <- xpathSApply(subdoc, "//title[@type='short']", xmlValue)
      action_datetime <- xpathSApply(subdoc, "//actions/action", xmlGetAttr, "datetime")
      action          <- xpathSApply(subdoc, "//actions/action/text", xmlValue)
      subjects        <- paste(xpathSApply(subdoc, "//subjects/term", xmlGetAttr, "name"),
                               collapse = "; ")
      x[[i]] <- data.frame(bill_session, short_title, action_datetime, action, subjects)
      # Release the memory held by the temporary document
      free(subdoc)
    }
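One thing to watch for with either version: if a bill has no match for a query (for example, no short title), xpathSApply returns an empty list and data.frame() stops with a "differing number of rows" error. A possible guard is sketched below. The helper name `first_or_na` and the sample XML are my own, not part of the XML package or your data, and the helper keeps only the first match, so it suits single-valued fields like the title rather than multi-row fields like actions.

```r
library(XML)

# Hypothetical helper: first match of an XPath query, or NA if nothing matches
first_or_na <- function(node, path, fun, ...) {
  res <- xpathSApply(node, path, fun, ...)
  if (length(res) == 0) NA_character_ else res[[1]]
}

# Made-up input: the second bill has no short title
txt <- '<bills>
  <bill session="113"><titles><title type="short">A</title></titles></bill>
  <bill session="113"><titles><title type="official">B</title></titles></bill>
</bills>'

doc <- xmlParse(txt, asText = TRUE)
nodes <- getNodeSet(doc, "//bill[not(ancestor::bill)]")

rows <- lapply(nodes, function(n) {
  data.frame(
    bill_session = xmlGetAttr(n, "session"),
    short_title  = first_or_na(n, ".//title[@type='short']", xmlValue),
    stringsAsFactors = FALSE
  )
})

# One row per bill, with NA where the short title is missing
bills <- do.call("rbind", rows)
bills
```

The same do.call("rbind", ...) call combines the list built by the loop into a single data frame.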

