How to scrape data from a website?

How do I scrape data off a page displaying live scores on the CricInfo website, using BeautifulSoup, lxml (XPath or cssselect), and Python?

  • I want to scrape the teams, score, batsmen, bowlers, fall of wickets, etc. This screenshot is an example.

  • Answer:

    This is the approach I generally use for scraping:

    1. Study the source code of the page in question.
    2. Work out the tree path to the nodes you want, noting their ids, classes, tags, and attributes.
    3. In the script, read the whole page source into a variable.
    4. Use a parsing library to walk that structure and pull out the values identified in step 2.
    5. Since you want real-time data, repeat steps 3 and 4 in a while loop, with a delay of X seconds between iterations.

    I hope this helps.
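
Steps 2 to 4 of the approach above might look roughly like the sketch below. To keep it runnable without extra packages it uses the standard library's html.parser in place of BeautifulSoup or lxml; the same idea carries over to either. The sample markup and the class names team-name and team-score are made up for illustration and are not taken from the real CricInfo page, so inspect the actual source first (step 1) and substitute what you find there.

```python
from html.parser import HTMLParser

# Tags that never get a closing tag, so they must not be tracked as "open".
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}

# Hypothetical markup standing in for the page source read in step 3.
SAMPLE_HTML = """
<div class="match">
  <span class="team-name">Team A</span> <span class="team-score">250/4</span>
  <br>
  <span class="team-name">Team B</span> <span class="team-score">180/10</span>
</div>
"""


class ClassTextExtractor(HTMLParser):
    """Collect the text of every element that carries one of the wanted classes."""

    def __init__(self, wanted_classes):
        super().__init__()
        self.wanted = set(wanted_classes)
        self.results = {name: [] for name in wanted_classes}
        self._stack = []  # one entry per open tag: the matched class, or None

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        match = next((c for c in classes if c in self.wanted), None)
        self._stack.append(match)
        if match:
            self.results[match].append("")  # start a new text buffer

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self._stack:
            self._stack.pop()

    def handle_data(self, data):
        # Give the text to the innermost open element that matched a class.
        for match in reversed(self._stack):
            if match:
                self.results[match][-1] += data
                break


def extract(html, wanted_classes):
    """Step 4: parse the stored page source and return the values of interest."""
    parser = ClassTextExtractor(wanted_classes)
    parser.feed(html)
    return {name: [text.strip() for text in texts]
            for name, texts in parser.results.items()}


if __name__ == "__main__":
    print(extract(SAMPLE_HTML, ["team-name", "team-score"]))
```

With BeautifulSoup or lxml the extractor class collapses to a one-line CSS or XPath query per field, which is the main reason to prefer them once the node paths from step 2 are known.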

Akshat Goel at Quora

Other answers

Use Mechanize (https://pypi.python.org/pypi/mechanize/) to refresh the HTML source every 8-10 seconds. That way you always have the HTML source with the updated score. Then scrape it using BeautifulSoup (http://www.crummy.com/software/BeautifulSoup/).
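
The refresh loop could be sketched as follows. This is only an illustration under a few assumptions: it fetches with the standard library's urllib.request rather than Mechanize so that it runs without extra packages, and the names fetch_html and poll are invented here. The parse argument is whatever function you write to pull the score out of the HTML, for example one built on BeautifulSoup.

```python
import time
import urllib.request


def fetch_html(url, timeout=10):
    """Download the page and return its source as text.

    Stand-in for a Mechanize browser; swap in Mechanize here if you prefer it.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def poll(fetch, parse, interval=10, max_polls=None, sleep=time.sleep):
    """Refresh the page repeatedly and yield the parsed value when it changes.

    fetch     -- callable with no arguments that returns the HTML source
    parse     -- callable that turns the HTML source into the value of interest
    interval  -- seconds to wait between refreshes
    max_polls -- stop after this many refreshes (None means run until stopped)
    sleep     -- injectable so the loop can be exercised without really waiting
    """
    last = object()  # sentinel that never compares equal to a parsed value
    polls = 0
    while max_polls is None or polls < max_polls:
        current = parse(fetch())
        if current != last:
            yield current
            last = current
        polls += 1
        if max_polls is None or polls < max_polls:
            sleep(interval)
```

In use, something like `for score in poll(lambda: fetch_html(url), parse_score, interval=10): print(score)` would print each new score as it appears, where url and parse_score are placeholders you supply yourself. Keeping the interval at 8-10 seconds or longer avoids hammering the site.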

Jason Estibeiro

<script src="http://l.yimg.com/rt/pps/listbadge_1.8.js">{"pipe_id":"tMcVGcqn3BGvsT__2R2EvQ","_btype":"list"}</script> Use this code and you will get live scores.

Chuno Chunno
