Kyle R. Burton on Wed, 19 Jun 2002 10:53:49 +0200
> I'm not a web programmer - I have just enough knowledge to be dangerous.
> Hopefully, somebody here can provide me with the clues to answer this
> question!
>
> I've got a web service that I pay for. They require me to log in on one web
> page and then click a button on the next page to download a file. Since I'd
> like to put this download in a script, I want to use curl or some similar
> program. Unfortunately, I cannot see the request that the browser sends to the
> site to initiate the download. I've examined the source for the web page and
> I see the following code in the source:
>
> -----code excerpt------
>
> <form action=http://www.site.com/cgi-bin/download.cgi method=post>
> <input type=hidden name="action" value="download">
> <input type=hidden name="file" value="20020610">
> <input type=hidden name="area" value="1">
> <input type=hidden name="login" value="userlogin">
> <input type=hidden name="PIN" value="password">
> <tr align=center><td><input type=submit value="20020610.zip"></td>
> </form>
>
> -----end code excerpt------
>
> I'm not exactly sure what I'm looking at but I believe that the input(s)
> that are hidden are transmitted to the site along with the request. The
> input with type=submit actually initiates the request.
>
> Is there a way for me to capture the request that the browser sends?
> I run SuSE 7.3 - KDE 2.2.1. The browser is Konqueror.
>
> I believe that, eventually, I'll have to write a perl program to simulate
> the browser's actions (if it cannot be done simply using curl).
>
> Any help greatly appreciated.

When doing my own web development, I've found netcat to be indispensable:

  http://www.bgw.org/tutorials/utilities/nc.php

The basic sequence of events I typically use is to save the form to disk,
add a <base href="http://that.site/path/to/page/">, change the form action
to point to localhost:8888, run "netcat -l -p 8888 < /dev/null", load the
HTML page from disk into the browser, and submit it.
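
As a rough picture of what that listener should receive for the form quoted above, here is a small Python sketch that builds a minimal POST request by hand. It is only an approximation: the function name build_post_request and the FORM_FIELDS list are made up for illustration, the header set is a bare minimum, and a real browser will send more headers (User-Agent, Accept, and so on) and may use a different HTTP version. The field names and values are copied from the hidden inputs in the quoted form.

```python
from urllib.parse import urlencode, urlsplit


def build_post_request(action_url, fields):
    """Return the raw text of a minimal HTTP POST for a form submission.

    Hypothetical helper: it approximates what a browser sends, it is not
    a byte-for-byte copy of any particular browser's request.
    """
    parts = urlsplit(action_url)
    # Form fields are sent as application/x-www-form-urlencoded: name=value
    # pairs joined with '&', in document order.
    body = urlencode(fields)
    lines = [
        f"POST {parts.path or '/'} HTTP/1.0",
        f"Host: {parts.netloc}",
        "Content-Type: application/x-www-form-urlencoded",
        f"Content-Length: {len(body.encode('ascii'))}",
        "",  # blank line separates headers from the body
        body,
    ]
    return "\r\n".join(lines)


# Hidden inputs from the quoted form. The submit button has no name
# attribute, so it contributes nothing to the request body.
FORM_FIELDS = [
    ("action", "download"),
    ("file", "20020610"),
    ("area", "1"),
    ("login", "userlogin"),
    ("PIN", "password"),
]

if __name__ == "__main__":
    # Assumes the form action was edited to point at the local listener.
    print(build_post_request("http://localhost:8888/cgi-bin/download.cgi",
                             FORM_FIELDS))
```

Comparing this against the real capture is a quick way to spot anything extra the browser adds, such as cookies set by the login page.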
Netcat should then capture the data that the browser would have sent to the
remote system.

If the hidden variables change from request to request, you'll probably end
up having to dynamically fetch the page and then parse out the form
elements. If you're comfortable with Perl, have a look at LWP::UserAgent.
It's basically a web browser that you can control from your Perl code.
There are HTTP libraries for other languages as well: for Java, try
HttpClient from the Apache Jakarta Commons project; for C/C++, try libghttp
(comes with many Linux distributions); for Python, try httplib.py
(http://www.lyra.org/greg/python/httplib.py).

HTH

Kyle

-- 
------------------------------------------------------------------------------
Wisdom and Compassion are inseparable.
        -- Christmas Humphreys
mortis@voicenet.com                            http://www.voicenet.com/~mortis
------------------------------------------------------------------------------

______________________________________________________________________
Philadelphia Linux Users Group       -      http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mail/listinfo/plug-announce
General Discussion  -  http://lists.phillylinux.org/mail/listinfo/plug