Wget

Usage

sh this_script.sh <URL>

Example

sh this_script.sh http://www.affymetrix.com/Auth/support/downloads/library_files/HuEx-1_0-st-v2_libraryfile.zip

Download a file in a website which requires login

wget

#need wget version > 1.10

#1. get cookie

~/bin/bin/wget -v -c --no-check-certificate --post-data="logon=augix.com%40gmail.com&password=PASSWORD"  \
--save-cookies=cookies.txt --keep-session-cookies http://www.affymetrix.com/site/processLogin.jsp

#2. download file $1

~/bin/bin/wget --no-check-certificate  --cookies=on --load-cookies=cookies.txt \
--keep-session-cookies $1

curl

curl -D headers_and_cookies -d "logon=augix.com%40gmail.com&password=PASSWORD&x=0&y=0" http://www.affymetrix.com/site/processLogin.jsp

curl -b headers_and_cookies http://www.affymetrix.com/Auth/support/downloads/demo_data/HuEx-1_0-st-v2.tissue-mixture-data-set.gcos-files.zip

Download all the files form FTP site by Wget

echo "wget -r -l2 --no-parent -A.bz2 ftp://ftp.hgsc.bcm.tmc.edu/pub/data/Rmacaque/fasta/" | at now

You want to download all the GIFs from a directory on an HTTP server. You tried

wget http://www.server.com/dir/*.gif

but that didn't work because HTTP retrieval does not support globbing. In that case, use:

wget -r -l1 --no-parent -A.gif http://www.server.com/dir/

More verbose, but the effect is the same. -r -l1' means to retrieve recursively (see section Recursive Retrieval), with maximum depth of 1. --no-parent' means that references to the parent directory are ignored (see section Directory-Based Limits), and -A.gif' means to download only the GIF files. -A "*.gif"' would have worked too.

wget -r -l1 --no-parent -A.gz ftp://hgdownload.cse.ucsc.edu/goldenPath/hg18/vsPanTro2/axtNet/ -o wget.log

Download the whole Website

wget -r -p -np -k url

Avoid disruption

Reference

Mastering Wget - Lifehacker.com

http://ftp.gnu.org/gnu/Manuals/wget-1.8.1/html_chapter/wget_7.html

Homepage
Comments

Hide Comments