ruby - Scraping a webpage with Mechanize and Nokogiri and storing data in an XML doc
I am trying to scrape a website and store the data in XML using Mechanize and Nokogiri. I didn't set up a Rails project; I am only using Ruby and IRB.
I wrote this method:
    def mechanize_club
      agent = Mechanize.new
      agent.get("http://www.rechercheclub.applipub-fft.fr/rechercheclub/")
      form = agent.page.forms.first
      form.field_with(:name => 'codeligue').options[0].select
      form.submit
      page2 = agent.get('http://www.rechercheclub.applipub-fft.fr/rechercheclub/club.do?codeclub=01670001&millesime=2015')
      body = page2.body
      html_body = Nokogiri::HTML(body)
      codeclub = html_body.search('.form').children("tr:first").children("th:first").to_i
      @codeclubs << codeclub
      filepath = '/davidgeismar/documents/codeclubs.xml'
      builder = Nokogiri::XML::Builder.new(encoding: 'utf-8') do |xml|
        xml.root {
          xml.codeclubs {
            @codeclubss.each do |c|
              xml.codeclub {
                xml.code_ c.code
              }
            end
          }
        }
      end
      puts builder.to_xml
    end
My first problem is that I don't know how to test my code. I call ruby webscrapper.rb in the console; the file is processed, I think, but it doesn't create the XML file in the specified path. Beyond that, I am quite sure my code is wrong, because I haven't had a chance to test it.
Basically, I am trying to submit a form several times:
    agent = Mechanize.new
    agent.get("http://www.rechercheclub.applipub-fft.fr/rechercheclub/")
    form = agent.page.forms.first
    form.field_with(:name => 'codeligue').options[0].select
    form.submit
I think this code is OK, but I don't want to select only options[0]. I want to select an option, scrape the data I need, go back to the page, select options[1], and so on until there are no more options (an iteration, I guess).
"The file is processed, I think, but it doesn't create the XML file in the specified path."
There is nothing in your code that creates a file. You print some output, but you don't open or write a file.
Perhaps you should read the IO and File documentation and review how you are using your filepath variable?
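As a minimal sketch of what writing the file could look like: the snippet below uses a hard-coded XML string as a stand-in for builder.to_xml (so it runs without Nokogiri) and a hypothetical output path in the system temp directory. Both are assumptions for illustration, not part of the original code.

```ruby
require 'tmpdir'

# Stand-in for builder.to_xml; in the real script this string would come
# from the Nokogiri builder. The club code is the one from the question's URL.
xml_string = <<~XML
  <?xml version="1.0" encoding="utf-8"?>
  <root>
    <codeclubs>
      <codeclub><code>01670001</code></codeclub>
    </codeclubs>
  </root>
XML

# Hypothetical output location; replace it with the path you actually want.
filepath = File.join(Dir.tmpdir, 'codeclubs.xml')

# File.write creates (or overwrites) the file and returns the number of
# bytes written. The directory itself must already exist.
bytes_written = File.write(filepath, xml_string)
puts "Wrote #{bytes_written} bytes to #{filepath}"
```

In the method from the question, the equivalent change would be to replace puts builder.to_xml with File.write(filepath, builder.to_xml), provided filepath points to a directory that exists.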
The second problem is that you don't call your method anywhere. Though it's defined, and Ruby will see it and parse the method, it has no idea what you want to do with it unless you invoke the method:
    def mechanize_club
      ...
    end

    mechanize_club()
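The question also asks how to repeat the form submission for every option rather than only options[0]. One possible sketch is shown below. The helper name submit_each_option is hypothetical, and it assumes the form object behaves like a Mechanize::Form (field_with, options, select, submit). It has not been run against the site in the question.

```ruby
# Submits `form` once per option of the select list named `field_name`
# and yields each option together with the page returned by that submission.
# Returns an array of whatever the block returns for each option.
def submit_each_option(form, field_name)
  select_list = form.field_with(name: field_name)
  raise ArgumentError, "no field named #{field_name}" if select_list.nil?

  select_list.options.map do |option|
    option.select              # choose this option before submitting
    result_page = form.submit  # submit the form with that option selected
    yield option, result_page
  end
end

# Usage with Mechanize (needs the mechanize gem and network access):
#
#   require 'mechanize'
#   agent = Mechanize.new
#   page  = agent.get('http://www.rechercheclub.applipub-fft.fr/rechercheclub/')
#   form  = page.forms.first
#   results = submit_each_option(form, 'codeligue') do |option, result_page|
#     [option.value, result_page.title]
#   end
```

Keeping the loop in a small method that receives the form makes it possible to try the logic in IRB with any object that responds to the same calls, before pointing it at the real site.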