In keeping with the theme of reinventing the wheel and writing everything from scratch on this site, I created my own CAPTCHA solution several years ago to prevent spambots from emailing me via the contact form on the home page. It's fairly simple, but since it's completely unique to my site, I figured there's no way anybody would bother taking the time to write a bot just to solve my little custom CAPTCHA.
It seemed to be working fine. Every once in a while I'd get some spam, but I suspect the spammers were actually paying some poor schmuck to fill out the CAPTCHA form manually. Not a big deal - a CAPTCHA isn't going to stop a real person.
But this week I started getting a series of test messages, identifying as “XRumerTest”. A quick Google search reveals that XRumer is an advanced spambot capable of solving basic AI-based CAPTCHAs such as mine. These were test messages rather than actual spam attempts, but they were clearly sent by a spambot which had solved my CAPTCHA.
What interesting times we live in. Anyway, it recently came to my attention that Google's reCAPTCHA has gotten less obnoxious to use, and I've implemented the older version of it before without any trouble, so I've switched to that.
So it turns out that uploading files over HTTP from a Perl script using LWP isn't as obvious and straightforward as you might expect it to be. This stuff should be plainly documented, and it's not. All the examples I could find online for how to do it are about ten years old, and each of them is slightly wrong in a different way!
So first off, yes, the right way to do this is probably WWW::Mechanize.
But if you really want to use LWP::UserAgent and do things manually, here are the undocumented secrets:
use LWP::UserAgent;
use HTTP::Request;
use HTTP::Message;
# $url and $filecontents are assumed to be set already.
my $ua=LWP::UserAgent->new;
# You MUST specify the boundary explicitly here. You can use whatever
# unique string you like here, and it will do the right thing.
my @headers=('Content-type'=>'multipart/form-data; boundary=xYzZY');
# Notice that we're passing an arrayref, not a hashref,
# even though the contents look like a hash.
my $request=HTTP::Request->new('POST'=>$url,\@headers);
# If you have other form fields you need to submit in addition
# to the file you're uploading, here's how you attach them. This is
# equivalent to:
# <input name="field1" value="value1">
$request->add_part(HTTP::Message->new([
'Content-disposition'=>'form-data; name="field1"'
],'value1'));
# Next, attach the file you're uploading:
$request->add_part(HTTP::Message->new([
'Content-disposition'=>'form-data; name="myfile"; filename="upload.csv"',
'Content-type'=>'text/csv',
],$filecontents));
# Finally, submit the request:
my $response=$ua->request($request);
Obviously you should choose the right MIME type for the type of file you're uploading, only use valid characters for the filename, and that sort of thing. I actually haven't tested this with binary data; I assume it works but it's possible there may be some escaping required somewhere that I've skipped here. Please let me know if you notice something I've missed, but this is working for me!
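For comparison, here is a minimal sketch of the same upload built with HTTP::Request::Common, which can assemble a multipart/form-data body (boundary included) when given Content_Type => 'form-data'. The URL and CSV contents below are hypothetical placeholders, not values from this site, and the sketch only builds and prints the request rather than sending it.

```perl
use strict;
use warnings;
use HTTP::Request::Common qw(POST);

# Hypothetical values, for illustration only.
my $url          = 'http://example.com/upload';
my $filecontents = "id,name\n1,frog\n";

# Content_Type => 'form-data' asks HTTP::Request::Common to build a
# multipart/form-data body and choose a boundary itself.
my $request = POST(
    $url,
    Content_Type => 'form-data',
    Content      => [
        # A plain key/value pair becomes an ordinary form field.
        field1 => 'value1',
        # An arrayref value is a file upload: [ path, filename, headers... ].
        # With an undef path, the "Content" pseudo-header supplies the data.
        myfile => [
            undef, 'upload.csv',
            'Content-Type' => 'text/csv',
            Content        => $filecontents,
        ],
    ],
);

# Inspect the generated request without sending it.
print $request->as_string;

# To actually send it:
#   require LWP::UserAgent;
#   my $response = LWP::UserAgent->new->request($request);
```

Printing the request first is a cheap way to compare the generated parts against what the manual add_part approach produces before pointing either one at a real server.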
WHATWG says they're dropping the version number from HTML, and calling the specification a “living document” that will continually evolve without the need for specific version numbers (like “HTML 5”). Like everything else I've seen from WHATWG, this new position reflects reality pretty well, at least for now.
Critics are complaining that a standard should be a complete and consistent target that implementers can aim for, and if there are no version numbers, you can never be sure when you've achieved conformance with the spec, and different browsers that implement different revisions of the spec will be incompatible. However, these people have an idealistic view of how things should work - they have no idea how things actually do work in this field. No browser has ever been 100% compliant with one specific version of the HTML specification; browsers implement whichever bits and pieces of the latest spec that they like best (plus other bits and pieces they made up themselves). Thanks largely to WHATWG's work, this situation has rapidly improved over the last several years, but it will be many more years before the Web is stable enough that trying to implement a particular version of HTML actually becomes useful.
Meanwhile, W3C will still put version numbers on everything. They still cling to the fantasy that someday somebody somewhere will try to implement a spec just because the W3C says they should.