http - C# to get data from a website
I need to get information from this website and store it in a dictionary.
Basically, these are prices and quantities of financial instruments.
I have the source code of the page (here is an extract of the whole text):
<tr> <td class="quotesmaxtime1414148558" id="notation115602071"><span>4,000.00</span></td> <td><span>0</span></td> <td class="icon red"><span id="domhandler:8.consumer:value-2cclass.comp:prev.gt:green.eq:zero.lt:red.resetlt:.resetgt:.reseteq:zero.mdgobj:prices-2fquote-3fversion-3d2-26code_selector_previous_last-3dlatest-26id_type_performance-3d7-26id_type_price-3d1-26id_quality_price-3d5-26id_notation-3d115602071.attr:performance_pct.wtkm:options_options_snapshot_1">-3.87%</span></td> <td><span id="domhandler:9.consumer:value-2cclass.comp:prev.gt:green.eq:zero.lt:red.resetlt:.resetgt:.reseteq:zero.mdgobj:prices-2fquote-3fversion-3d2-26code_selector_previous_last-3dlatest-26id_type_performance-3d7-26id_type_price-3d1-26id_quality_price-3d5-26id_notation-3d115602071.attr:price.wtkm:options_options_snapshot_1">960.40</span></td> </tr>
Now I want to extract the following information:
the value "4000" from the second line; the value "-3.87%" from the 4th line; the value "960.40" from the 5th line. I have tried to use the following to extract the first piece of information (the value 4000):
string url = "http://www.eurexchange.com/action/exchange-en/4744-19066/19068/quotessingleviewoption.do?callput=put&maturitydate=201411";
var webGet = new HtmlWeb();
var document = webGet.Load(url);
var firstData = from x in document.DocumentNode.Descendants()
                where x.Name == "td" && x.Attributes.Contains("class")
                select x.InnerText;
But firstData doesn't contain the information I want (the value 4000); instead I get this:
System.Linq.Enumerable+WhereSelectEnumerableIterator`2[HtmlAgilityPack.HtmlNode,System.String]
How can I get these data? I need to repeat the task several times, because the page has more than one line containing similar information. Is Html Agility Pack useful in this context? Thanks.
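A note on the output quoted above: it looks like the type name of the LINQ query object itself, which is what you get when a query is printed before it has been enumerated. Here is a minimal sketch of that behaviour using only System.Linq. The `Cells` array and the `DeferredQueryDemo` class are made-up stand-ins for the page's cell texts, not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredQueryDemo
{
    // Hypothetical stand-in for the InnerText values of the page's <td> cells.
    public static readonly string[] Cells = { "4,000.00", "0", "-3.87%", "960.40" };

    public static IEnumerable<string> BuildQuery()
    {
        // Nothing is evaluated here: this only describes the query.
        return from x in Cells
               where x.Contains(".")
               select x.Trim();
    }

    public static void Main()
    {
        var query = BuildQuery();

        // Calls ToString() on the query object, so the iterator's
        // type name is printed rather than the values.
        Console.WriteLine(query);

        // Enumerating (ToList, foreach, First, ...) actually runs the query.
        List<string> values = query.ToList();
        Console.WriteLine(string.Join(" | ", values));
        Console.WriteLine(values.First());
    }
}
```

So the query in the question probably does produce cell texts; it just needs to be enumerated (for example with `ToList()` or a `foreach`) before the values can be read.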
This may be ugly, since it was thrown together quickly and could be cleaned up greatly, but it returns all of the values you are looking for from the Prices/Quotes table found on that page. Hope this helps.
var url = "http://www.eurexchange.com/action/exchange-en/4744-19066/19068/quotessingleviewoption.do?callput=put&maturitydate=201411";
var webGet = new HtmlWeb();
var document = webGet.Load(url);

// Find the section titled "prices/quotes", then the first data table inside it.
// The class names and title text are as given here; they must match the page's exact casing.
var pricesAndQuotesDataTable =
    (from elem in document.DocumentNode.Descendants()
         .Where(d => d.Attributes["class"] != null
                  && d.Attributes["class"].Value == "toggletitle"
                  && d.ChildNodes.Any(h => h.InnerText != null && h.InnerText == "prices/quotes"))
     select elem.Descendants()
         .FirstOrDefault(d => d.Attributes["class"] != null
                           && d.Attributes["class"].Value == "datatable")).FirstOrDefault();

if (pricesAndQuotesDataTable != null)
{
    // Only the body rows carry data; header rows are skipped.
    var dataRows = from elem in pricesAndQuotesDataTable.Descendants()
                   where elem.Name == "tr" && elem.ParentNode.Name == "tbody"
                   select elem;

    var dataPoints = new List<object>();
    foreach (var row in dataRows)
    {
        var dataColumns = (from col in row.ChildNodes.Where(n => n.Name == "td") select col).ToList();
        dataPoints.Add(
            new
            {
                StrikePrice = dataColumns[0].InnerText,
                DifferenceToPreviousDay = dataColumns[9].InnerText,
                LastPrice = dataColumns[10].InnerText
            });
    }
}
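Since the question asks for a dictionary, here is a rough sketch of how the same rows could be collected into one, keyed by strike price. It parses a shortened copy of the row from the question (the long id/class attributes are trimmed) instead of fetching the page, so the column positions 0, 2 and 3 apply to that extract only; on the full page they would be 0, 9 and 10 as in the code above. The `QuoteTableSketch` class, `SampleHtml` and `Parse` are hypothetical names, and the `decimal` key is just one possible choice.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using HtmlAgilityPack;

public static class QuoteTableSketch
{
    // Shortened copy of the row from the question, wrapped in a table (assumption).
    public const string SampleHtml =
        "<table><tbody><tr>" +
        "<td><span>4,000.00</span></td>" +
        "<td><span>0</span></td>" +
        "<td class=\"icon red\"><span>-3.87%</span></td>" +
        "<td><span>960.40</span></td>" +
        "</tr></tbody></table>";

    // Returns strike price -> (change vs. previous day as text, last price).
    public static Dictionary<decimal, Tuple<string, decimal>> Parse(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var result = new Dictionary<decimal, Tuple<string, decimal>>();

        // SelectNodes returns null, not an empty collection, when nothing matches.
        var rows = doc.DocumentNode.SelectNodes("//tbody/tr");
        if (rows == null) return result;

        foreach (var row in rows)
        {
            var cells = row.SelectNodes("./td");
            if (cells == null || cells.Count < 4) continue;

            decimal strike, last;
            // Skip rows whose strike or last price is not a plain number.
            if (!decimal.TryParse(cells[0].InnerText.Trim(), NumberStyles.Number,
                                  CultureInfo.InvariantCulture, out strike)) continue;
            if (!decimal.TryParse(cells[3].InnerText.Trim(), NumberStyles.Number,
                                  CultureInfo.InvariantCulture, out last)) continue;

            result[strike] = Tuple.Create(cells[2].InnerText.Trim(), last);
        }
        return result;
    }
}
```

With this, `QuoteTableSketch.Parse(QuoteTableSketch.SampleHtml)[4000m]` would hold the change text and the last price for the 4,000.00 strike. Rows with empty or non-numeric cells are skipped instead of throwing, which may or may not be what you want.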
c# http web-scraping screen-scraping html-agility-pack