How do you scrape websites that use services like Brassring (e.g., GE Careers)? Also, how do you use the scraping to navigate to separate pages?
Brassring serves its job listings from a database, which seems harder to scrape than a static page: simply passing parameters to readLines or the RCurl package in R does not get at the data. What methods (preferably in R or Python) would you use to scrape it, and could you please share your code?
Answer:
I wrote a distributed crawler for just these cases. It is open source and written in Ruby: https://github.com/CalculatedContent/cloud-crawler
Charles H Martin at Quora
Other answers
I ran into a similar problem when I was writing my own program to search for jobs. One of the sites, Monster, changed its front-end code, which made it much harder to parse the data I was fetching with Requests (I parsed it with BeautifulSoup). I ended up having to scrap the project and start from scratch.
That's when I found Selenium (http://selenium-python.readthedocs.org/). It drives a real web browser, Firefox or Chrome, to visit a site:
    from selenium import webdriver
    from selenium.webdriver.common.keys import Keys
    driver = webdriver.Firefox()  # opens a Firefox window
    driver.get("http://www.ge.com/")
    ...
Or you can run it headless (no browser window), either with virtualization using xvfb or with PhantomJS:
    driver = webdriver.PhantomJS()
You can program Selenium to click on various elements to navigate to where you want to be, and it has functions that let you scrape data. Here's a primer that will let you hit the ground running: https://automatetheboringstuff.com/chapter11/
You can also treat it much like the Requests module and just ask for the link directly (at least with the GE website). For example:
    driver = webdriver.Firefox()
    driver.get('http://www.ge.com/careers/opportunities?keyword=&country=United+States&state=California&func=Asset+Management&business=TG_SEARCH_ALL&experience_level=Co-Op%2FIntern')
Now, Selenium isn't perfect. It uses a lot of resources on your computer, so if you're doing a massive project it might not be the best module. Still, I definitely recommend checking it out as a possible solution to your problem.
If you're interested in seeing how I've used Selenium with http://Monster.com, http://SimplyHired.com, and http://Indeed.com, here's my GitHub: https://github.com/michaelverano/AutomatedTools/tree/master/searchJobs2
As a warning: I don't see myself as a computer programmer (yet), so don't expect the best or most efficient code, but perhaps it can give you an idea of how to approach your problem. Good luck!
Michael Verano
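The two ideas in the answer above (asking for a search link directly, and clicking through to further pages) might be combined roughly as follows. This is only a sketch, not the answerer's code: build_search_url and collect_links are hypothetical helper names, and the CSS selectors passed to collect_links are placeholders that would have to be read off the real page in the browser's inspector. It assumes Selenium 4 with Firefox and geckodriver installed, and that clicking the "next" control replaces the page content.

```python
from urllib.parse import urlencode

BASE_URL = "http://www.ge.com/careers/opportunities"  # URL from the answer above


def build_search_url(base_url, **filters):
    """Return base_url with the filters encoded as a query string."""
    return base_url + "?" + urlencode(filters)


def collect_links(url, link_selector="a", next_selector=None, max_pages=3):
    """Open url in headless Firefox and gather (text, href) pairs.

    link_selector and next_selector are hypothetical CSS selectors;
    the real ones depend on the markup of the site being scraped.
    """
    # Imported here so build_search_url works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.firefox.options import Options
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = Options()
    options.add_argument("-headless")  # no visible browser window
    driver = webdriver.Firefox(options=options)
    results = []
    try:
        driver.get(url)
        for _ in range(max_pages):
            # Wait until the JavaScript-rendered links exist in the DOM.
            WebDriverWait(driver, 15).until(
                EC.presence_of_element_located((By.CSS_SELECTOR, link_selector))
            )
            for link in driver.find_elements(By.CSS_SELECTOR, link_selector):
                results.append((link.text.strip(), link.get_attribute("href")))
            if next_selector is None:
                break
            next_buttons = driver.find_elements(By.CSS_SELECTOR, next_selector)
            if not next_buttons:
                break  # no further page to visit
            old_button = next_buttons[0]
            old_button.click()
            # Assumption: the click replaces the page, so the old button
            # goes stale once the new content has started loading.
            WebDriverWait(driver, 15).until(EC.staleness_of(old_button))
    finally:
        driver.quit()  # always close the browser, even on errors
    return results


if __name__ == "__main__":
    # Rebuilds the search link quoted in the answer from its parameters.
    print(build_search_url(
        BASE_URL,
        keyword="",
        country="United States",
        state="California",
        func="Asset Management",
        business="TG_SEARCH_ALL",
        experience_level="Co-Op/Intern",
    ))
```

Building the link from named parameters makes it easy to loop over states or job functions, and collect_links(url, link_selector=..., next_selector=...) can then be called on each link once the real selectors are known.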
Web scraping can retrieve both static and dynamic web pages. There are numerous scraping tools available online, and one of the most popular is Easy Data Feed. Powered by ShoppingCartElit, Easy Data Feed is data extraction software designed to quickly download inventory, pricing and product information from your drop ship supplier's online portal into a usable spreadsheet, without relying on the drop shipper. It was built specifically for online retailers who are dissatisfied with their drop ship supplier's digital data for inventory, pricing and even universal product information. Disclosure: I am a specialist in ecommerce platforms for businesses.
Junior Johnson