java - Extracting some nodes from XML Files -




I need to extract nodes from an XML file formatted this way:

<collection sentiment="negativo">
  <comment>
    <sentiment>...</sentiment>
    <chars>...</chars>
    <words>...</words>
    <text>blabla</text>
    <lang>english</lang>
  </comment>

Now assume there are other <comment> elements that have <lang>spanish</lang> in the same XML file. I need to create two separate XML files: the first one with the nodes having the child <lang>english</lang> (let's call it eng.xml), and the second one with the nodes having <lang>spanish</lang> (let's call it spa.xml).

Here is my Java code:

public void getEnglishRows() throws IOException {
    OutputStreamWriter f = new OutputStreamWriter(new FileOutputStream("c:/eclipse/neg_eng.xml"));
    BufferedWriter buff;
    // puts all the row nodes (which in turn contain elements) into a list
    NodeList current_row = doc.getElementsByTagName("comment");
    NodeList tmp;
    Node nodo = null;
    buff = new BufferedWriter(f);
    for (int i = 0; i < current_row.getLength(); i++) {
        tmp = current_row.item(i).getChildNodes();
        for (int k = 0; k < tmp.getLength(); k++) {
            nodo = tmp.item(k);
            if ("english".equals(nodo.getTextContent()))
                System.out.println("if english");
            buff.write(current_row.item(i).getNodeValue());
        }
    }
    buff.close();
}

I don't know if that is clear, but I hope so.

So I have one XML file with lots of <comment></comment> elements. I have to extract every <comment></comment> that has <lang>english</lang> and write that node (with its children) to an XML file. Same behaviour for <lang>spanish</lang>.

The expected content of eng.xml is:

<comment>
  <sentiment>...</sentiment>
  <chars>...</chars>
  <words>...</words>
  <text>blabla</text>
  <lang>english</lang>
</comment>

The expected content of spa.xml is:

<comment>
  <sentiment>...</sentiment>
  <chars>...</chars>
  <words>...</words>
  <text>blabla</text>
  <lang>spanish</lang>
</comment>

I hope I'm clear. The problem is that I can extract the text of the nodes, but I cannot maintain the XML tags!

Please help me!
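A note on why the tags get lost: in the DOM API, getNodeValue() returns null for element nodes, and getTextContent() returns only the concatenated text of the descendants, so neither call gives the markup back. One way to keep the tags is to serialize the node with an identity Transformer. The following is only a minimal sketch under that assumption; the class NodeToXml and its methods are hypothetical names, not part of the original code.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper (illustration only, not from the original post).
final class NodeToXml {

    // Serializes a node together with its tags and children.
    static String serialize(Node node) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            // Leave out the <?xml ...?> header, since this is only a fragment.
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(node), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not serialize node", e);
        }
    }

    // Parses the XML string and returns its first <comment> element as markup.
    static String firstCommentAsXml(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Node comment = doc.getElementsByTagName("comment").item(0);
            // comment.getNodeValue() would be null here: it is an element node.
            return serialize(comment);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }
}
```

In the loop from the question, writing serialize(current_row.item(i)) instead of getNodeValue() would emit the whole <comment> element, tags included.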

Why not try to delete the comments that are not in English? My suggestion is to search for the <lang> tags and look for the non-English ones. Go to the parent element that contains each such node (the <comment> element) and delete it. This preserves the original file structure.

Try this code. It worked for me :)

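import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public void getEnglishRows() throws IOException, SAXException, ParserConfigurationException, TransformerException {
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    DocumentBuilder db = dbf.newDocumentBuilder();
    Document doc = db.parse(new File("c:/eclipse/neg_eng.xml"));

    // search the lang elements
    NodeList current_row = doc.getElementsByTagName("lang");
    // The NodeList is live: removing a comment also drops its lang entry from the list,
    // so iterate backwards to avoid skipping the element that follows a removal.
    for (int i = current_row.getLength() - 1; i >= 0; i--) {
        String lang = current_row.item(i).getTextContent();
        if (!lang.equalsIgnoreCase("english")) {
            // delete the comment that is not in English
            Element comment = (Element) current_row.item(i).getParentNode();
            comment.getParentNode().removeChild(comment);
        }
    }
    doc.normalize();

    // write the content to the XML file; try-with-resources closes the writer,
    // and UTF-8 matches the encoding the Transformer declares by default
    try (OutputStreamWriter f = new OutputStreamWriter(
            new FileOutputStream("./eng_sent.xml"), StandardCharsets.UTF_8)) {
        TransformerFactory transformerFactory = TransformerFactory.newInstance();
        Transformer transformer = transformerFactory.newTransformer();
        DOMSource source = new DOMSource(doc);
        StreamResult result = new StreamResult(f);
        transformer.transform(source, result);
    }
}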

The resulting file looks like this:

<collection sentiment="negativo">
  <comment>
    <sentiment>...</sentiment>
    <chars>...</chars>
    <words>...</words>
    <text>eng3</text>
    <lang>english</lang>
  </comment>
  <comment>
    <sentiment>...</sentiment>
    <chars>...</chars>
    <words>...</words>
    <text>eng1</text>
    <lang>english</lang>
  </comment>
  <comment>
    <sentiment>...</sentiment>
    <chars>...</chars>
    <words>...</words>
    <text>eng2</text>
    <lang>english</lang>
  </comment>
</collection>

where the original XML file was:

<collection sentiment="negativo">
  <comment>
    <sentiment>...</sentiment>
    <chars>...</chars>
    <words>...</words>
    <text>eng3</text>
    <lang>english</lang>
  </comment>
  <comment>
    <sentiment>...</sentiment>
    <chars>...</chars>
    <words>...</words>
    <text>spa2</text>
    <lang>spanish</lang>
  </comment>
  <comment>
    <sentiment>...</sentiment>
    <chars>...</chars>
    <words>...</words>
    <text>eng1</text>
    <lang>english</lang>
  </comment>
  <comment>
    <sentiment>...</sentiment>
    <chars>...</chars>
    <words>...</words>
    <text>eng2</text>
    <lang>english</lang>
  </comment>
  <comment>
    <sentiment>...</sentiment>
    <chars>...</chars>
    <words>...</words>
    <text>spa1</text>
    <lang>spanish</lang>
  </comment>
</collection>
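Since the question asks for both eng.xml and spa.xml, the same delete-the-others idea can be parameterized by language. This is a rough sketch, not a definitive implementation: the class LangFilter and the method keepLanguage are hypothetical names, and it works on strings instead of files so that it stays self-contained. It assumes every <lang> element sits directly inside its <comment>.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper (illustration only, not from the original answer).
final class LangFilter {

    // Returns the document with only the <comment> elements whose <lang> equals `lang`.
    static String keepLanguage(String xml, String lang) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            NodeList langs = doc.getElementsByTagName("lang");
            // The list is live, so walk it backwards while removing nodes.
            for (int i = langs.getLength() - 1; i >= 0; i--) {
                Node langNode = langs.item(i);
                if (!lang.equalsIgnoreCase(langNode.getTextContent().trim())) {
                    // Assumption: the parent of <lang> is the <comment> to drop.
                    Node comment = langNode.getParentNode();
                    comment.getParentNode().removeChild(comment);
                }
            }

            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not filter XML", e);
        }
    }
}
```

Calling keepLanguage once with "english" and once with "spanish" on the same input, and writing each result to its own file, would give the two files asked for. The input is parsed again on every call, so the removals for one language never affect the other.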

Hope this helps! Happy hacking ;-)

java xml
