Need to set up an automatic download because humans are forgetful and lazy.
I'm looking to do an automated download from a website that requires a login and clicking a JavaScript button, and the usual tricks are failing me. Ideas? We've got to pull a file overnight every weekday from a website where the data is posted. The site itself requires a login/password, and once you're in, you get a table with the most recent file always being the same element at the top. This is an unusual one for us, as every other place we download data of this nature from provides an SFTP site to pull it from, but these guys don't. They know we are going to automate the process and they're fine with that, but they're also taking the "it's good enough for everyone else" position and are not willing to help automate it. In similar situations I've used wget, but it doesn't seem to be the tool for the job here: the login plus the JavaScript button are frustrating my attempts to use it. Anyone have any ideas? Something that can be run from a cron job is ideal.
Answer:
I'd recommend http://watir.com/. It's based on Ruby, but even if you've never used Ruby the installation and examples are straightforward. It interfaces with the web through a real browser, so anything you can do through a browser you can do via Watir. It is admittedly overkill from a functionality perspective, but it makes it trivial to deal with cookie sessions, JavaScript redirects and other things that can throw wrinkles into wget, curl and mechanize solutions.
barc0001 at Ask.Metafilter.Com
Other answers
Just because the button uses javascript doesn't mean that deep down it's not just a regular HTTP request at the core, which can be simulated with an appropriate curl/wget command line. Your job is to find the actual HTTP request. There are numerous tools to do this: Firebug, Tamper Data, Wireshark, etc.
Rhomboid
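One cheap first step toward finding that request, before reaching for a sniffer, is to list the login page's form fields. The sketch below is only an illustration: it assumes the login is a plain HTML form, and the helper names (FormFieldParser, list_forms) are made up for this example. If the JavaScript button builds its request some other way, the traffic tools mentioned above are still needed.

```python
from html.parser import HTMLParser


class FormFieldParser(HTMLParser):
    """Collect the action, method and named <input> fields of each form."""

    def __init__(self):
        super().__init__()
        self.forms = []  # one dict per <form> found on the page

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self.forms.append({
                "action": attrs.get("action") or "",
                "method": (attrs.get("method") or "get").lower(),
                "fields": {},
            })
        elif tag == "input" and self.forms and attrs.get("name"):
            # Hidden inputs matter: the server usually expects them back.
            self.forms[-1]["fields"][attrs["name"]] = attrs.get("value") or ""


def list_forms(html):
    """Return a list of form descriptions parsed from an HTML string."""
    parser = FormFieldParser()
    parser.feed(html)
    return parser.forms
```

Feeding it the saved source of the login page shows which field names and hidden values a POST would have to carry, and which URL the form submits to.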
oh, here I was thinking an imacro from firefox. you're all unix-ey though.
TomMelee
What's the javascript button do?
soma lkzx
Perhaps with a spare XP/7 machine (or a VM with a Windows guest) and AutoIt (http://www.autoitscript.com/)? Apparently it works with WINE as well. There's a forum thread (http://www.autoitscript.com/forum/index.php?showtopic=42691) dealing with Java apps, and http://technologyyogi.blogspot.com/2008/12/generic-web-user-interface-login.html is someone using it to log in to websites.
dozo
http://www.youtube.com/watch?v=FxDOlhysFcM&feature=player_embedded lets you program scripts based on mini-screenshots.
Brent Parker
I've had success in the past with http://htmlunit.sourceforge.net/, which is a Java 'browser' which executes JavaScript. If you prefer Ruby to Java, it's available (via JRuby) as http://celerity.rubyforge.org/. As others have said, though, it's generally preferable to avoid this and go HTTP-only if that's in any way simpler.
smcg
I'll give Autoit a go, as it seems like the quickest way to get this up and running and then do a more bulletproof version later if need be. Watir also looks intriguing for not only this but some other things we were thinking of doing as well. I tried digging out the actual request URL with Firebug for a while yesterday but had no luck so far. Might give Wireshark a go in that regard. Thanks for the ideas everyone, this definitely gives me a few other options to try.
barc0001
2nding Rhomboid. wget can handle the POST requests and cookies associated with a login. Here's an example:

    wget --post-data='hiddenfield=yes&username=cowbellemoo2&password=secret' \
         --save-cookies=my-cookies.txt --keep-session-cookies \
         http://example.com/login.php

    wget --load-cookies=my-cookies.txt --save-cookies=my-cookies.txt \
         --keep-session-cookies http://example.com/file-needed.doc

The hard part is figuring out what to put into POST to login successfully and where your destination file is. Like Rhomboid suggested, firebug and wireshark can help you do that. They'll see through the AJAX or whatever into the network requests and reveal the actual location of the file you want.
cowbellemoo
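For anyone who would rather have a script than a pair of wget calls, the same two steps can be sketched with Python's standard library. This is a rough equivalent, not a tested recipe for the site in question: the URLs and field names are the placeholders from the wget example, and the function names are invented here.

```python
import http.cookiejar
import urllib.parse
import urllib.request

LOGIN_URL = "http://example.com/login.php"       # placeholder from the wget example
FILE_URL = "http://example.com/file-needed.doc"  # placeholder from the wget example


def build_login_request(url, fields):
    """Return a POST request carrying the form fields, URL-encoded."""
    body = urllib.parse.urlencode(fields).encode("ascii")
    return urllib.request.Request(url, data=body, method="POST")


def make_session():
    """Return an opener that keeps cookies between requests."""
    jar = http.cookiejar.CookieJar()
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))


def fetch_file(fields, destination, login_url=LOGIN_URL, file_url=FILE_URL):
    """Log in, then download file_url to destination with the same cookies."""
    session = make_session()
    # Step 1: log in; any session cookie the server sets lands in the jar.
    with session.open(build_login_request(login_url, fields), timeout=60) as resp:
        resp.read()
    # Step 2: fetch the file; the opener sends the stored cookies back.
    with session.open(file_url, timeout=60) as resp, open(destination, "wb") as out:
        out.write(resp.read())
```

A call such as fetch_file({"hiddenfield": "yes", "username": "cowbellemoo2", "password": "secret"}, "file-needed.doc") mirrors the wget pair. An HTTP error status raises urllib.error.HTTPError, so a failed run exits non-zero under cron. For the overnight weekday schedule, a crontab time field like "30 2 * * 1-5" runs a job at 02:30 Monday through Friday.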
Are the filenames at all consistent? I've done similar things in the past with shell scripts and expect (http://www.nist.gov/mel/msid/expect.cfm) to automate FTP and telnet sessions. You could basically telnet to port 80 of the web server and feed it the appropriate HTTP commands to get the file of the day.
jjb
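To make the telnet idea concrete, here is roughly what "feeding it HTTP commands" amounts to, sketched with a plain socket instead of expect. The host, path and cookie value are hypothetical, the function names are invented for this example, and it only covers unencrypted HTTP on port 80; a site served over HTTPS would need a TLS connection instead.

```python
import socket


def build_get_request(host, path, cookie=None):
    """Return the bytes one would otherwise type into telnet for a GET."""
    lines = [
        f"GET {path} HTTP/1.0",  # HTTP/1.0: the server closes the connection when done
        f"Host: {host}",
    ]
    if cookie:
        lines.append(f"Cookie: {cookie}")
    # Each line ends in CRLF; an empty line ends the header block.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def raw_get(host, path, cookie=None, port=80, timeout=60):
    """Send the request and return (header_text, body_bytes)."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_get_request(host, path, cookie))
        chunks = []
        while True:
            data = sock.recv(65536)
            if not data:  # server closed the connection: response complete
                break
            chunks.append(data)
    head, _, body = b"".join(chunks).partition(b"\r\n\r\n")
    return head.decode("iso-8859-1"), body
```

The login would be a similar hand-built POST whose Set-Cookie response header supplies the cookie value. In practice that bookkeeping is exactly what wget or a cookie-aware HTTP library does for you, so this is mostly useful for seeing what goes over the wire.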