Write an Online Bulk Image Downloader Using BASH
So my girlfriend tells me that I need to download several large image files from a photographer’s website. I manage her online portfolio/website so I’m used to these types of requests… but this time I was going to find a better way to “leech them all.” So anyway, she IM’s me the URL for her most recent shoot…
I was greeted by a nice default Apache index page (and the Photog spelled her name wrong, ugh).
How am I going to grab about 70 image files of five to seven megabytes each?
I could click each one and then save it, or I could use the Unlinker Firefox Add-On to convert all the links to images. The latter would load all 350 MB of photos on one page. Most certainly my FX-55 single-core processor and 1 GB of DDR RAM wouldn’t appreciate that very much.
Being the huge open source fan that I am, I decided to write a Bash script to accomplish this without hogging all my computer’s resources. If you know the first and last image filenames in a particular folder, you can download everything in between using the seq command in a Bash for loop. Let’s say the first and last images’ URLs are in this format:
http://any_photographer.com/jenn_thomas/full_size/JT_0019.jpg
http://any_photographer.com/jenn_thomas/full_size/JT_1214.jpg
We can assume the images between them are numbered 0020, 0021, 0022, and so on, up to 1214. A simple Bash script therefore looks like this:
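#!/bin/bash
# Loop over the zero-padded numbers 0019 through 1214.
for i in $(seq -f "%04g" 19 1214)
do
    # -c resumes a partially downloaded file instead of starting over.
    wget -c "http://any_photographer.com/jenn_thomas/full_size/JT_$i.jpg"
done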
seq accepts a printf-like format through its -f option. Here, -f "%04g" tells seq to print every number at least four digits wide, padding with leading zeros, and 19 and 1214 set the range. After that, wget downloads each URL. That’s how I got JT_0353.jpg at the top of this post. Pretty simple, isn’t it?
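If you want to check the padding before you start pulling down hundreds of megabytes, here is a minimal sketch that only prints the URLs the loop would request. The function name build_urls and its three arguments are my own invention for illustration, not part of seq or wget.

```shell
#!/bin/bash
# Hypothetical helper: print one URL per line for image numbers first..last.
# base is the URL prefix up to and including "JT_".
build_urls() {
    local base=$1 first=$2 last=$3 n
    for n in $(seq -f "%04g" "$first" "$last"); do
        printf '%s%s.jpg\n' "$base" "$n"
    done
}

# Preview the first three URLs without downloading anything.
build_urls "http://any_photographer.com/jenn_thomas/full_size/JT_" 19 21
```

Once the list looks right, you could save it to a file and hand it to wget with its -i option, which reads URLs from a file.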
You can run Bash scripts on Windows too if you have Cygwin installed. But bear in mind that not all images can be downloaded with this technique. Certain sites pad image filenames with random characters, which defeats this simple script.
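Even when the names are plain sequential numbers, some numbers in the range may not exist on the server (I am assuming a photographer deletes a few shots now and then). Here is a rough sketch of a loop that keeps going and reports the numbers that failed. The fetch_range function and the FETCH variable are hypothetical names I made up. FETCH defaults to wget -c and exists only so you can swap in another command.

```shell
#!/bin/bash
# Assumed override hook: the command used to fetch one URL.
FETCH=${FETCH:-wget -c}

# Hypothetical helper: try every number in first..last, keep going on
# errors, and print the numbers whose download failed.
fetch_range() {
    local base=$1 first=$2 last=$3 n
    for n in $(seq -f "%04g" "$first" "$last"); do
        # wget exits non-zero when the server answers with an error (e.g. 404).
        $FETCH "${base}${n}.jpg" || echo "missing: $n"
    done
}
```

Calling fetch_range "http://any_photographer.com/jenn_thomas/full_size/JT_" 19 1214 > missing.txt should leave the failed numbers in missing.txt, since wget writes its own progress messages to standard error.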
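UPDATE: A reader suggested using curl as an alternative. curl expands the bracketed range [0019-1214] in the URL by itself, and #1 in the output name is replaced by the current number. Quote both arguments so the shell leaves the brackets alone:
curl -o "JT_01_#1" "http://any_photographer.com/jenn_thomas/full_size/JT_[0019-1214].jpg"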