.\" Automatically generated by Pod::Man 2.28 (Pod::Simple 3.29)
.\"
.\" Standard preamble:
.\" ========================================================================
.de Sp \" Vertical space (when we can't use .PP)
.if t .sp .5v
.if n .sp
..
.de Vb \" Begin verbatim text
.ft CW
.nf
.ne \\$1
..
.de Ve \" End verbatim text
.ft R
.fi
..
.\" Set up some character translations and predefined strings. \*(-- will
.\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left
.\" double quote, and \*(R" will give a right double quote. \*(C+ will
.\" give a nicer C++. Capital omega is used to do unbreakable dashes and
.\" therefore won't be available. \*(C` and \*(C' expand to `' in nroff,
.\" nothing in troff, for use with C<>.
.tr \(*W-
.ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p'
.ie n \{\
. ds -- \(*W-
. ds PI pi
. if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
. if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch
. ds L" ""
. ds R" ""
. ds C` ""
. ds C' ""
'br\}
.el\{\
. ds -- \|\(em\|
. ds PI \(*p
. ds L" ``
. ds R" ''
. ds C`
. ds C'
'br\}
.\"
.\" Escape single quotes in literal strings from groff's Unicode transform.
.ie \n(.g .ds Aq \(aq
.el .ds Aq '
.\"
.\" If the F register is turned on, we'll generate index entries on stderr for
.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index
.\" entries marked with X<> in POD. Of course, you'll have to process the
.\" output yourself in some meaningful fashion.
.\"
.\" Avoid warning from groff about undefined register 'F'.
.de IX
..
.nr rF 0
.if \n(.g .if rF .nr rF 1
.if (\n(rF:(\n(.g==0)) \{
. if \nF \{
. de IX
. tm Index:\\$1\t\\n%\t"\\$2"
..
. if !\nF==2 \{
. nr % 0
. nr F 2
. \}
. \}
.\}
.rr rF
.\" ========================================================================
.\"
.IX Title "WGET 1"
.TH WGET 1 "2018-05-08" "GNU Wget 1.17.1" "GNU Wget"
.\" For nroff, turn off justification. Always turn off hyphenation; it makes
.\" way too many mistakes in technical documents.
.if n .ad l
.nh
.SH "NAME"
Wget \- The non\-interactive network downloader.
.SH "SYNOPSIS"
.IX Header "SYNOPSIS"
wget [\fIoption\fR]... [\fI\s-1URL\s0\fR]...
.SH "DESCRIPTION"
.IX Header "DESCRIPTION"
\&\s-1GNU\s0 Wget is a free utility for non-interactive download of files from
the Web. It supports \s-1HTTP, HTTPS,\s0 and \s-1FTP\s0 protocols, as
well as retrieval through \s-1HTTP\s0 proxies.
.PP
Wget is non-interactive, meaning that it can work in the background,
while the user is not logged on. This allows you to start a retrieval
and disconnect from the system, letting Wget finish the work. By
contrast, most Web browsers require the user's constant presence,
which can be a great hindrance when transferring a lot of data.
.PP
Wget can follow links in \s-1HTML, XHTML,\s0 and \s-1CSS\s0 pages, to
create local versions of remote web sites, fully recreating the
directory structure of the original site. This is sometimes referred to
as \*(L"recursive downloading.\*(R" While doing that, Wget respects the Robot
Exclusion Standard (\fI/robots.txt\fR). Wget can be instructed to
convert the links in downloaded files to point at the local files, for
offline viewing.
.PP
Wget has been designed for robustness over slow or unstable network
connections; if a download fails due to a network problem, it will
keep retrying until the whole file has been retrieved. If the server
supports regetting, it will instruct the server to continue the
download from where it left off.
.PP
Wget does not support Certificate Revocation Lists (CRLs), so the \s-1HTTPS\s0
certificate of the site you are connecting to might have been revoked by
the site owner.
.SH "OPTIONS"
.IX Header "OPTIONS"
.SS "Option Syntax"
.IX Subsection "Option Syntax"
Since Wget uses \s-1GNU\s0 getopt to process command-line arguments, every
option has a long form along with the short one. Long options are
more convenient to remember, but take longer to type. You may freely
mix different option styles, or specify options after the command-line
arguments. Thus you may write:
.PP
.Vb 1
\& wget \-r \-\-tries=10 http://fly.srk.fer.hr/ \-o log
.Ve
.PP
The space between the option accepting an argument and the argument may
be omitted. Instead of \fB\-o log\fR you can write \fB\-olog\fR.
.PP
You may put several options that do not require arguments together,
like:
.PP
.Vb 1
\& wget \-drc <URL>
.Ve
.PP
This is completely equivalent to:
.PP
.Vb 1
\& wget \-d \-r \-c <URL>
.Ve
.PP
Since the options can be specified after the arguments, you may
terminate them with \fB\-\-\fR. So the following will try to download
\&\s-1URL \s0\fB\-x\fR, reporting failure to \fIlog\fR:
.PP
.Vb 1
\& wget \-o log \-\- \-x
.Ve
.PP
The options that accept comma-separated lists all respect the convention
that specifying an empty list clears its value. This can be useful to
clear the \fI.wgetrc\fR settings. For instance, if your \fI.wgetrc\fR
sets \f(CW\*(C`exclude_directories\*(C'\fR to \fI/cgi\-bin\fR, the following
example will first reset it, and then set it to exclude \fI/~nobody\fR
and \fI/~somebody\fR. You can also clear the lists in \fI.wgetrc\fR.
.PP
.Vb 1
\& wget \-X "" \-X /~nobody,/~somebody
.Ve
.PP
Most options that do not accept arguments are \fIboolean\fR options,
so named because their state can be captured with a yes-or-no
(\*(L"boolean\*(R") variable. For example, \fB\-\-follow\-ftp\fR tells Wget
to follow \s-1FTP\s0 links from \s-1HTML\s0 files and, on the other hand,
\&\fB\-\-no\-glob\fR tells it not to perform file globbing on \s-1FTP\s0 URLs. A
boolean option is either \fIaffirmative\fR or \fInegative\fR
(beginning with \fB\-\-no\fR). All such options share several
properties.
.PP
Unless stated otherwise, it is assumed that the default behavior is
the opposite of what the option accomplishes. For example, the
documented existence of \fB\-\-follow\-ftp\fR assumes that the default
is to \fInot\fR follow \s-1FTP\s0 links from \s-1HTML\s0 pages.
.PP
Affirmative options can be negated by prepending \fB\-\-no\-\fR to
the option name; negative options can be negated by omitting the
\&\fB\-\-no\-\fR prefix. This might seem superfluous\-\-\-if the default for
an affirmative option is to not do something, then why provide a way
to explicitly turn it off? But the startup file may in fact change
the default. For instance, using \f(CW\*(C`follow_ftp = on\*(C'\fR in
\&\fI.wgetrc\fR makes Wget \fIfollow\fR \s-1FTP\s0 links by default, and
using \fB\-\-no\-follow\-ftp\fR is the only way to restore the factory
default from the command line.
.SS "Basic Startup Options"
.IX Subsection "Basic Startup Options"
.IP "\fB\-V\fR" 4
.IX Item "-V"
.PD 0
.IP "\fB\-\-version\fR" 4
.IX Item "--version"
.PD
Display the version of Wget.
.IP "\fB\-h\fR" 4
.IX Item "-h"
.PD 0
.IP "\fB\-\-help\fR" 4
.IX Item "--help"
.PD
Print a help message describing all of Wget's command-line options.
.IP "\fB\-b\fR" 4
.IX Item "-b"
.PD 0
.IP "\fB\-\-background\fR" 4
.IX Item "--background"
.PD
Go to background immediately after startup. If no output file is
specified via \fB\-o\fR, output is redirected to \fIwget-log\fR.
.IP "\fB\-e\fR \fIcommand\fR" 4
.IX Item "-e command"
.PD 0
.IP "\fB\-\-execute\fR \fIcommand\fR" 4
.IX Item "--execute command"
.PD
Execute \fIcommand\fR as if it were a part of \fI.wgetrc\fR. A command thus
invoked will be executed \fIafter\fR the commands in \fI.wgetrc\fR, thus
taking precedence over them. If you need to specify more than one wgetrc
command, use multiple instances of \fB\-e\fR.
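.Sp
As an illustrative sketch (the \s-1URL\s0 is a placeholder, and the two
settings are chosen only for the example), several wgetrc commands can be
passed with separate \fB\-e\fR options:
.Sp
.Vb 1
\& wget \-e tries=3 \-e limit_rate=20k http://example.com/file.tar.gz
.Ve
.Sp
Assuming \fI.wgetrc\fR sets different values for \f(CW\*(C`tries\*(C'\fR or
\&\f(CW\*(C`limit_rate\*(C'\fR, the values given with \fB\-e\fR should win,
since they are executed afterwards.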
.SS "Logging and Input File Options"
.IX Subsection "Logging and Input File Options"
.IP "\fB\-o\fR \fIlogfile\fR" 4
.IX Item "-o logfile"
.PD 0
.IP "\fB\-\-output\-file=\fR\fIlogfile\fR" 4
.IX Item "--output-file=logfile"
.PD
Log all messages to \fIlogfile\fR. The messages are normally reported
to standard error.
.IP "\fB\-a\fR \fIlogfile\fR" 4
.IX Item "-a logfile"
.PD 0
.IP "\fB\-\-append\-output=\fR\fIlogfile\fR" 4
.IX Item "--append-output=logfile"
.PD
Append to \fIlogfile\fR. This is the same as \fB\-o\fR, only it appends
to \fIlogfile\fR instead of overwriting the old log file. If
\&\fIlogfile\fR does not exist, a new file is created.
.IP "\fB\-d\fR" 4
.IX Item "-d"
.PD 0
.IP "\fB\-\-debug\fR" 4
.IX Item "--debug"
.PD
Turn on debug output, meaning various information important to the
developers of Wget if it does not work properly. Your system
administrator may have chosen to compile Wget without debug support, in
which case \fB\-d\fR will not work. Please note that compiling with
debug support is always safe\-\-\-Wget compiled with the debug support will
\&\fInot\fR print any debug info unless requested with \fB\-d\fR.
.IP "\fB\-q\fR" 4
.IX Item "-q"
.PD 0
.IP "\fB\-\-quiet\fR" 4
.IX Item "--quiet"
.PD
Turn off Wget's output.
.IP "\fB\-v\fR" 4
.IX Item "-v"
.PD 0
.IP "\fB\-\-verbose\fR" 4
.IX Item "--verbose"
.PD
Turn on verbose output, with all the available data. The default output
is verbose.
.IP "\fB\-nv\fR" 4
.IX Item "-nv"
.PD 0
.IP "\fB\-\-no\-verbose\fR" 4
.IX Item "--no-verbose"
.PD
Turn off verbose without being completely quiet (use \fB\-q\fR for
that), which means that error messages and basic information still get
printed.
.IP "\fB\-\-report\-speed=\fR\fItype\fR" 4
.IX Item "--report-speed=type"
Output bandwidth as \fItype\fR. The only accepted value is \fBbits\fR.
.IP "\fB\-i\fR \fIfile\fR" 4
.IX Item "-i file"
.PD 0
.IP "\fB\-\-input\-file=\fR\fIfile\fR" 4
.IX Item "--input-file=file"
.PD
Read URLs from a local or external \fIfile\fR. If \fB\-\fR is
specified as \fIfile\fR, URLs are read from the standard input.
(Use \fB./\-\fR to read from a file literally named \fB\-\fR.)
.Sp
If this function is used, no URLs need be present on the command
line. If there are URLs both on the command line and in an input
file, those on the command line will be the first ones to be
retrieved. If \fB\-\-force\-html\fR is not specified, then \fIfile\fR
should consist of a series of URLs, one per line.
.Sp
However, if you specify \fB\-\-force\-html\fR, the document will be
regarded as \fBhtml\fR. In that case you may have problems with
relative links, which you can solve either by adding \f(CW\*(C`<base
href="\f(CIurl\f(CW">\*(C'\fR to the documents or by specifying
\&\fB\-\-base=\fR\fIurl\fR on the command line.
.Sp
If the \fIfile\fR is an external one, the document will be automatically
treated as \fBhtml\fR if the Content-Type matches \fBtext/html\fR.
Furthermore, the \fIfile\fR's location will be implicitly used as base
href if none was specified.
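.Sp
A brief sketch of the input modes described above. The file names
\fIurls.txt\fR and \fIlinks.html\fR and the base \s-1URL\s0 are
hypothetical and stand in for your own:
.Sp
.Vb 6
\& # One URL per line, read from a local file
\& wget \-i urls.txt
\& # The same list, read from standard input
\& cat urls.txt | wget \-i \-
\& # A local HTML file whose relative links need a base URL
\& wget \-\-force\-html \-\-base=http://example.com/docs/ \-i links.html
.Ve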
.IP "\fB\-\-input\-metalink=\fR\fIfile\fR" 4
.IX Item "--input-metalink=file"
Downloads files covered in local Metalink \fIfile\fR. Metalink versions 3
and 4 are supported.
.IP "\fB\-\-metalink\-over\-http\fR" 4
.IX Item "--metalink-over-http"
Issues an \s-1HTTP HEAD\s0 request instead of \s-1GET\s0 and extracts Metalink
metadata from the response headers. Then it switches to Metalink download.
If no valid Metalink metadata is found, it falls back to ordinary \s-1HTTP\s0 download.
.IP "\fB\-\-preferred\-location\fR" 4
.IX Item "--preferred-location"
Set preferred location for Metalink resources. This takes effect if multiple
resources with the same priority are available.
.IP "\fB\-F\fR" 4
.IX Item "-F"
.PD 0
.IP "\fB\-\-force\-html\fR" 4
.IX Item "--force-html"
.PD
When input is read from a file, force it to be treated as an \s-1HTML\s0
file. This enables you to retrieve relative links from existing
\&\s-1HTML\s0 files on your local disk, by adding \f(CW\*(C`<base
href="\f(CIurl\f(CW">\*(C'\fR to \s-1HTML,\s0 or using the \fB\-\-base\fR command-line
option.
.IP "\fB\-B\fR \fI\s-1URL\s0\fR" 4
.IX Item "-B URL"
.PD 0
.IP "\fB\-\-base=\fR\fI\s-1URL\s0\fR" 4
.IX Item "--base=URL"
.PD
Resolves relative links using \fI\s-1URL\s0\fR as the point of reference,
when reading links from an \s-1HTML\s0 file specified via the
\&\fB\-i\fR/\fB\-\-input\-file\fR option (together with
\&\fB\-\-force\-html\fR, or when the input file was fetched remotely from
a server describing it as \s-1HTML\s0). This is equivalent to the
presence of a \f(CW\*(C`BASE\*(C'\fR tag in the \s-1HTML\s0 input file, with
\&\fI\s-1URL\s0\fR as the value for the \f(CW\*(C`href\*(C'\fR attribute.
.Sp
For instance, if you specify \fBhttp://foo/bar/a.html\fR for
\&\fI\s-1URL\s0\fR, and Wget reads \fB../baz/b.html\fR from the input file, it
would be resolved to \fBhttp://foo/baz/b.html\fR.
.IP "\fB\-\-config=\fR\fI\s-1FILE\s0\fR" 4
.IX Item "--config=FILE"
Specify the location of a startup file you wish to use.
.IP "\fB\-\-rejected\-log=\fR\fIlogfile\fR" 4
.IX Item "--rejected-log=logfile"
Logs all \s-1URL\s0 rejections to \fIlogfile\fR as comma-separated values. The values
include the reason for rejection, the \s-1URL,\s0 and the parent \s-1URL\s0 it was found in.
.SS "Download Options"
.IX Subsection "Download Options"
.IP "\fB\-\-bind\-address=\fR\fI\s-1ADDRESS\s0\fR" 4
.IX Item "--bind-address=ADDRESS"
When making client \s-1TCP/IP\s0 connections, bind to \fI\s-1ADDRESS\s0\fR on
the local machine. \fI\s-1ADDRESS\s0\fR may be specified as a hostname or \s-1IP\s0
address. This option can be useful if your machine is bound to multiple
IPs.
.IP "\fB\-t\fR \fInumber\fR" 4
.IX Item "-t number"
.PD 0
.IP "\fB\-\-tries=\fR\fInumber\fR" 4
.IX Item "--tries=number"
.PD
Set number of tries to \fInumber\fR. Specify 0 or \fBinf\fR for
infinite retrying. The default is to retry 20 times, with the exception
of fatal errors like \*(L"connection refused\*(R" or \*(L"not found\*(R" (404),
which are not retried.
.IP "\fB\-O\fR \fIfile\fR" 4
.IX Item "-O file"
.PD 0
.IP "\fB\-\-output\-document=\fR\fIfile\fR" 4
.IX Item "--output-document=file"
.PD
The documents will not be written to the appropriate files, but all
will be concatenated together and written to \fIfile\fR. If \fB\-\fR
is used as \fIfile\fR, documents will be printed to standard output,
disabling link conversion. (Use \fB./\-\fR to print to a file
literally named \fB\-\fR.)
.Sp
Use of \fB\-O\fR is \fInot\fR intended to mean simply "use the name
\&\fIfile\fR instead of the one in the \s-1URL\s0;" rather, it is
analogous to shell redirection:
\&\fBwget \-O file http://foo\fR is intended to work like
\&\fBwget \-O \- http://foo > file\fR; \fIfile\fR will be truncated
immediately, and \fIall\fR downloaded content will be written there.
.Sp
For this reason, \fB\-N\fR (for timestamp-checking) is not supported
in combination with \fB\-O\fR: since \fIfile\fR is always newly
created, it will always have a very new timestamp. A warning will be
issued if this combination is used.
.Sp
Similarly, using \fB\-r\fR or \fB\-p\fR with \fB\-O\fR may not work as
you expect: Wget won't just download the first file to \fIfile\fR and
then download the rest to their normal names: \fIall\fR downloaded
content will be placed in \fIfile\fR. This was disabled in version
1.11, but has been reinstated (with a warning) in 1.11.2, as there are
some cases where this behavior can actually have some use.
.Sp
A combination with \fB\-nc\fR is only accepted if the given output
file does not exist.
.Sp
Note that a combination with \fB\-k\fR is only permitted when
downloading a single document, as in that case it will just convert
all relative URIs to external ones; \fB\-k\fR makes no sense for
multiple URIs when they're all being downloaded to a single file;
\&\fB\-k\fR can be used only when the output is a regular file.
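.Sp
The shell-redirection analogy can be sketched as follows; the \s-1URL\s0s
and file names are placeholders. The first two commands are intended to
produce the same \fIpage.html\fR, and the third streams an archive to
another program without writing an intermediate file:
.Sp
.Vb 3
\& wget \-O page.html http://example.com/
\& wget \-O \- http://example.com/ > page.html
\& wget \-q \-O \- http://example.com/archive.tar.gz | tar \-xzf \-
.Ve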
.IP "\fB\-nc\fR" 4
.IX Item "-nc"
.PD 0
.IP "\fB\-\-no\-clobber\fR" 4
.IX Item "--no-clobber"
.PD
If a file is downloaded more than once in the same directory, Wget's
behavior depends on a few options, including \fB\-nc\fR. In certain
cases, the local file will be \fIclobbered\fR, or overwritten, upon
repeated download. In other cases it will be preserved.
.Sp
When running Wget without \fB\-N\fR, \fB\-nc\fR, \fB\-r\fR, or
\&\fB\-p\fR, downloading the same file in the same directory will result
in the original copy of \fIfile\fR being preserved and the second copy
being named \fIfile\fR\fB.1\fR. If that file is downloaded yet
again, the third copy will be named \fIfile\fR\fB.2\fR, and so on.
(This is also the behavior with \fB\-nd\fR, even if \fB\-r\fR or
\&\fB\-p\fR are in effect.) When \fB\-nc\fR is specified, this behavior
is suppressed, and Wget will refuse to download newer copies of
\&\fIfile\fR. Therefore, "\f(CW\*(C`no\-clobber\*(C'\fR" is actually a
misnomer in this mode\-\-\-it's not clobbering that's prevented (as the
numeric suffixes were already preventing clobbering), but rather the
multiple version saving that's prevented.
.Sp
When running Wget with \fB\-r\fR or \fB\-p\fR, but without \fB\-N\fR,
\&\fB\-nd\fR, or \fB\-nc\fR, re-downloading a file will result in the
new copy simply overwriting the old. Adding \fB\-nc\fR will prevent
this behavior, instead causing the original version to be preserved
and any newer copies on the server to be ignored.
.Sp
When running Wget with \fB\-N\fR, with or without \fB\-r\fR or
\&\fB\-p\fR, the decision as to whether or not to download a newer copy
of a file depends on the local and remote timestamp and size of the
file. \fB\-nc\fR may not be specified at the
same time as \fB\-N\fR.
.Sp
A combination with \fB\-O\fR/\fB\-\-output\-document\fR is only accepted
if the given output file does not exist.
.Sp
Note that when \fB\-nc\fR is specified, files with the suffixes
\&\fB.html\fR or \fB.htm\fR will be loaded from the local disk and
parsed as if they had been retrieved from the Web.
.IP "\fB\-\-backups=\fR\fIbackups\fR" 4
.IX Item "--backups=backups"
Before (over)writing a file, back up an existing file by adding a
\&\fB.1\fR suffix (\fB_1\fR on \s-1VMS\s0) to the file name. Such backup
files are rotated to \fB.2\fR, \fB.3\fR, and so on, up to
\&\fIbackups\fR (and lost beyond that).
.IP "\fB\-c\fR" 4
.IX Item "-c"
.PD 0
.IP "\fB\-\-continue\fR" 4
.IX Item "--continue"
.PD
Continue getting a partially-downloaded file. This is useful when you
want to finish up a download started by a previous instance of Wget, or
by another program. For instance:
.Sp
.Vb 1
\& wget \-c ftp://sunsite.doc.ic.ac.uk/ls\-lR.Z
.Ve
.Sp
If there is a file named \fIls\-lR.Z\fR in the current directory, Wget
will assume that it is the first portion of the remote file, and will
ask the server to continue the retrieval from an offset equal to the
length of the local file.
.Sp
Note that you don't need to specify this option if you just want the
current invocation of Wget to retry downloading a file should the
connection be lost midway through. This is the default behavior.
\&\fB\-c\fR only affects resumption of downloads started \fIprior\fR to
this invocation of Wget, and whose local files are still sitting around.
.Sp
Without \fB\-c\fR, the previous example would just download the remote
file to \fIls\-lR.Z.1\fR, leaving the truncated \fIls\-lR.Z\fR file
alone.
.Sp
Beginning with Wget 1.7, if you use \fB\-c\fR on a non-empty file, and
it turns out that the server does not support continued downloading,
Wget will refuse to start the download from scratch, which would
effectively ruin existing contents. If you really want the download to
start from scratch, remove the file.
.Sp
Also beginning with Wget 1.7, if you use \fB\-c\fR on a file which is of
equal size as the one on the server, Wget will refuse to download the
file and print an explanatory message. The same happens when the file
is smaller on the server than locally (presumably because it was changed
on the server since your last download attempt)\-\-\-because \*(L"continuing\*(R"
is not meaningful, no download occurs.
.Sp
On the other side of the coin, while using \fB\-c\fR, any file that's
bigger on the server than locally will be considered an incomplete
download and only \f(CW\*(C`(length(remote) \- length(local))\*(C'\fR bytes will be
downloaded and tacked onto the end of the local file. This behavior can
be desirable in certain cases\-\-\-for instance, you can use \fBwget \-c\fR
to download just the new portion that's been appended to a data
collection or log file.
.Sp
However, if the file is bigger on the server because it's been
\&\fIchanged\fR, as opposed to just \fIappended\fR to, you'll end up
with a garbled file. Wget has no way of verifying that the local file
is really a valid prefix of the remote file. You need to be especially
careful of this when using \fB\-c\fR in conjunction with \fB\-r\fR,
since every file will be considered as an \*(L"incomplete download\*(R" candidate.
.Sp
Another instance where you'll get a garbled file if you try to use
\&\fB\-c\fR is if you have a lame \s-1HTTP\s0 proxy that inserts a
\&\*(L"transfer interrupted\*(R" string into the local file. In the future a
\&\*(L"rollback\*(R" option may be added to deal with this case.
.Sp
Note that \fB\-c\fR only works with \s-1FTP\s0 servers and with \s-1HTTP\s0
servers that support the \f(CW\*(C`Range\*(C'\fR header.
.IP "\fB\-\-start\-pos=\fR\fI\s-1OFFSET\s0\fR" 4
.IX Item "--start-pos=OFFSET"
Start downloading at zero-based position \fI\s-1OFFSET\s0\fR. Offset may be expressed
in bytes, kilobytes with the `k' suffix, or megabytes with the `m' suffix, etc.
.Sp
\&\fB\-\-start\-pos\fR takes precedence over \fB\-\-continue\fR. When
\&\fB\-\-start\-pos\fR and \fB\-\-continue\fR are both specified, wget will emit a
warning then proceed as if \fB\-\-continue\fR was absent.
.Sp
Server support for continued download is required, otherwise \fB\-\-start\-pos\fR
cannot help. See \fB\-c\fR for details.
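.Sp
For illustration only (the \s-1URL\s0 and the offset are placeholders, and
the server is assumed to support continued downloads), the following asks
for the remote file from the 10 megabyte mark onward, skipping everything
before that position:
.Sp
.Vb 1
\& wget \-\-start\-pos=10m http://example.com/server.log
.Ve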
.IP "\fB\-\-progress=\fR\fItype\fR" 4
.IX Item "--progress=type"
Select the type of the progress indicator you wish to use. Legal
indicators are \*(L"dot\*(R" and \*(L"bar\*(R".
.Sp
The \*(L"bar\*(R" indicator is used by default. It draws an \s-1ASCII\s0 progress
bar (a.k.a. \*(L"thermometer\*(R" display) indicating the status of
retrieval. If the output is not a \s-1TTY,\s0 the \*(L"dot\*(R" indicator will be
used by default.
.Sp
Use \fB\-\-progress=dot\fR to switch to the \*(L"dot\*(R" display. It traces
the retrieval by printing dots on the screen, each dot representing a
fixed amount of downloaded data.
.Sp
The progress \fItype\fR can also take one or more parameters. The parameters
vary based on the \fItype\fR selected. Parameters to \fItype\fR are passed by
appending them to the type separated by a colon (:) like this:
\&\fB\-\-progress=\fR\fItype\fR\fB:\fR\fIparameter1\fR\fB:\fR\fIparameter2\fR.
.Sp
When using the dotted retrieval, you may set the \fIstyle\fR by
specifying the type as \fBdot:\fR\fIstyle\fR. Different styles assign
different meaning to one dot. With the \f(CW\*(C`default\*(C'\fR style each dot
represents 1K, there are ten dots in a cluster and 50 dots in a line.
The \f(CW\*(C`binary\*(C'\fR style has a more \*(L"computer\*(R"\-like orientation\-\-\-8K
dots, 16\-dot clusters and 48 dots per line (which makes for 384K
lines). The \f(CW\*(C`mega\*(C'\fR style is suitable for downloading large
files\-\-\-each dot represents 64K retrieved, there are eight dots in a
cluster, and 48 dots on each line (so each line contains 3M).
If \f(CW\*(C`mega\*(C'\fR is not enough then you can use the \f(CW\*(C`giga\*(C'\fR
style\-\-\-each dot represents 1M retrieved, there are eight dots in a
cluster, and 32 dots on each line (so each line contains 32M).
.Sp
With \fB\-\-progress=bar\fR, there are currently two possible parameters,
\&\fIforce\fR and \fInoscroll\fR.
.Sp
When the output is not a \s-1TTY,\s0 the progress bar always falls back to \*(L"dot\*(R",
even if \fB\-\-progress=bar\fR was passed to Wget during invocation. This
behaviour can be overridden and the \*(L"bar\*(R" output forced by using the \*(L"force\*(R"
parameter as \fB\-\-progress=bar:force\fR.
.Sp
By default, the \fBbar\fR style progress bar scrolls the name of the file from
left to right for the file being downloaded if the filename exceeds the maximum
length allotted for its display. In certain cases, such as with
\&\fB\-\-progress=bar:force\fR, one may not want the scrolling filename in the
progress bar. By passing the \*(L"noscroll\*(R" parameter, Wget can be forced to
display as much of the filename as possible without scrolling through it.
.Sp
Note that you can set the default style using the \f(CW\*(C`progress\*(C'\fR
command in \fI.wgetrc\fR. That setting may be overridden from the
command line. For example, to force the bar output without scrolling,
use \fB\-\-progress=bar:force:noscroll\fR.
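.Sp
A short sketch of the two indicator types, using a placeholder \s-1URL\s0
and a hypothetical log file name. The first command prints \*(L"mega\*(R"
dots, so each full line of 48 dots at 64K per dot accounts for 3M of data;
the second forces the bar even though standard error is redirected to a
file rather than a \s-1TTY:\s0
.Sp
.Vb 2
\& wget \-\-progress=dot:mega http://example.com/big.iso
\& wget \-\-progress=bar:force:noscroll http://example.com/big.iso 2> transfer.log
.Ve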
.IP "\fB\-\-show\-progress\fR" 4
.IX Item "--show-progress"
Force wget to display the progress bar in any verbosity.
.Sp
By default, wget only displays the progress bar in verbose mode. One may,
however, want wget to display the progress bar on screen in conjunction with
any other verbosity modes like \fB\-\-no\-verbose\fR or \fB\-\-quiet\fR. This
is often a desired property when invoking wget to download several small/large
files. In such a case, wget could simply be invoked with this parameter to get
a much cleaner output on the screen.
.Sp
This option will also force the progress bar to be printed to \fIstderr\fR when
used alongside the \fB\-\-output\-file\fR option.
.IP "\fB\-N\fR" 4
.IX Item "-N"
.PD 0
.IP "\fB\-\-timestamping\fR" 4
.IX Item "--timestamping"
.PD
Turn on time-stamping.
.IP "\fB\-\-no\-if\-modified\-since\fR" 4
.IX Item "--no-if-modified-since"
Do not send the If-Modified-Since header in \fB\-N\fR mode. Send a preliminary
\&\s-1HEAD\s0 request instead. This only has an effect in \fB\-N\fR mode.
.IP "\fB\-\-no\-use\-server\-timestamps\fR" 4
.IX Item "--no-use-server-timestamps"
Don't set the local file's timestamp by the one on the server.
.Sp
By default, when a file is downloaded, its timestamps are set to
match those from the remote file. This allows the use of
\&\fB\-\-timestamping\fR on subsequent invocations of wget. However, it
is sometimes useful to base the local file's timestamp on when it was
actually downloaded; for that purpose, the
\&\fB\-\-no\-use\-server\-timestamps\fR option has been provided.
.IP "\fB\-S\fR" 4
.IX Item "-S"
.PD 0
.IP "\fB\-\-server\-response\fR" 4
.IX Item "--server-response"
.PD
Print the headers sent by \s-1HTTP\s0 servers and responses sent by
\&\s-1FTP\s0 servers.
.IP "\fB\-\-spider\fR" 4
.IX Item "--spider"
When invoked with this option, Wget will behave as a Web \fIspider\fR,
which means that it will not download the pages, just check that they
are there. For example, you can use Wget to check your bookmarks:
.Sp
.Vb 1
\& wget \-\-spider \-\-force\-html \-i bookmarks.html
.Ve
.Sp
This feature needs much more work for Wget to get close to the
functionality of real web spiders.
.IP "\fB\-T seconds\fR" 4
.IX Item "-T seconds"
.PD 0
.IP "\fB\-\-timeout=\fR\fIseconds\fR" 4
.IX Item "--timeout=seconds"
.PD
Set the network timeout to \fIseconds\fR seconds. This is equivalent
to specifying \fB\-\-dns\-timeout\fR, \fB\-\-connect\-timeout\fR, and
\&\fB\-\-read\-timeout\fR, all at the same time.
.Sp
When interacting with the network, Wget can check for timeout and
abort the operation if it takes too long. This prevents anomalies
like hanging reads and infinite connects. The only timeout enabled by
default is a 900\-second read timeout. Setting a timeout to 0 disables
it altogether. Unless you know what you are doing, it is best not to
change the default timeout settings.
.Sp
All timeout-related options accept decimal values, as well as
subsecond values. For example, \fB0.1\fR seconds is a legal (though
unwise) choice of timeout. Subsecond timeouts are useful for checking
server response times or for testing network latency.
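.Sp
To make the equivalence concrete, the two commands below are intended to
behave the same way; the 30\-second value and the \s-1URL\s0 are arbitrary
choices for the sketch, not recommended settings:
.Sp
.Vb 2
\& wget \-T 30 http://example.com/
\& wget \-\-dns\-timeout=30 \-\-connect\-timeout=30 \-\-read\-timeout=30 http://example.com/
.Ve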
.IP "\fB\-\-dns\-timeout=\fR\fIseconds\fR" 4
.IX Item "--dns-timeout=seconds"
Set the \s-1DNS\s0 lookup timeout to \fIseconds\fR seconds. \s-1DNS\s0 lookups that
don't complete within the specified time will fail. By default, there
is no timeout on \s-1DNS\s0 lookups, other than that implemented by system
libraries.
.IP "\fB\-\-connect\-timeout=\fR\fIseconds\fR" 4
.IX Item "--connect-timeout=seconds"
Set the connect timeout to \fIseconds\fR seconds. \s-1TCP\s0 connections that
take longer to establish will be aborted. By default, there is no
connect timeout, other than that implemented by system libraries.
.IP "\fB\-\-read\-timeout=\fR\fIseconds\fR" 4
.IX Item "--read-timeout=seconds"
Set the read (and write) timeout to \fIseconds\fR seconds. The
\&\*(L"time\*(R" of this timeout refers to \fIidle time\fR: if, at any point in
the download, no data is received for more than the specified number
of seconds, reading fails and the download is restarted. This option
does not directly affect the duration of the entire download.
.Sp
Of course, the remote server may choose to terminate the connection
sooner than this option requires. The default read timeout is 900
seconds.
.IP "\fB\-\-limit\-rate=\fR\fIamount\fR" 4
.IX Item "--limit-rate=amount"
Limit the download speed to \fIamount\fR bytes per second. Amount may
be expressed in bytes, kilobytes with the \fBk\fR suffix, or megabytes
with the \fBm\fR suffix. For example, \fB\-\-limit\-rate=20k\fR will
limit the retrieval rate to 20KB/s. This is useful when, for whatever
reason, you don't want Wget to consume the entire available bandwidth.
.Sp
This option allows the use of decimal numbers, usually in conjunction
with power suffixes; for example, \fB\-\-limit\-rate=2.5k\fR is a legal
value.
.Sp
Note that Wget implements the limiting by sleeping the appropriate
amount of time after a network read that took less time than specified
by the rate. Eventually this strategy causes the \s-1TCP\s0 transfer to slow
down to approximately the specified rate. However, it may take some
time for this balance to be achieved, so don't be surprised if limiting
the rate doesn't work well with very small files.
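.Sp
As a rough worked example (placeholder \s-1URL,\s0 and a file size assumed
purely for the arithmetic): at 20KB/s, a 100MB file is 102400KB, so the
transfer should take on the order of 102400 / 20 = 5120 seconds, or about
85 minutes, once the rate has settled:
.Sp
.Vb 1
\& wget \-\-limit\-rate=20k http://example.com/big.iso
.Ve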
696.IP "\fB\-w\fR \fIseconds\fR" 4
697.IX Item "-w seconds"
698.PD 0
699.IP "\fB\-\-wait=\fR\fIseconds\fR" 4
700.IX Item "--wait=seconds"
701.PD
702Wait the specified number of seconds between the retrievals. Use of
703this option is recommended, as it lightens the server load by making the
requests less frequent. Instead of seconds, the time can be
specified in minutes using the \f(CW\*(C`m\*(C'\fR suffix, in hours using the
\&\f(CW\*(C`h\*(C'\fR suffix, or in days using the \f(CW\*(C`d\*(C'\fR suffix.
707.Sp
708Specifying a large value for this option is useful if the network or the
709destination host is down, so that Wget can wait long enough to
reasonably expect the network error to be fixed before the retry. The
waiting interval specified by this option is affected by
\&\f(CW\*(C`\-\-random\-wait\*(C'\fR, described below.
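.Sp
As a brief illustration, the two invocations below are meant to be
equivalent, each pausing two minutes between retrievals; the file name
\&\fIurls.txt\fR is a placeholder for your own list of URLs:
.Sp
.Vb 2
\&        wget \-\-wait=120 \-i urls.txt
\&        wget \-\-wait=2m \-i urls.txt
.Ve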
713.IP "\fB\-\-waitretry=\fR\fIseconds\fR" 4
714.IX Item "--waitretry=seconds"
715If you don't want Wget to wait between \fIevery\fR retrieval, but only
716between retries of failed downloads, you can use this option. Wget will
717use \fIlinear backoff\fR, waiting 1 second after the first failure on a
718given file, then waiting 2 seconds after the second failure on that
719file, up to the maximum number of \fIseconds\fR you specify.
720.Sp
721By default, Wget will assume a value of 10 seconds.
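.Sp
As a worked illustration of the schedule described above, and assuming it
applies exactly as stated, \fB\-\-waitretry=4\fR would give waits of 1, 2, 3
and 4 seconds after the first four failures on a file, and 4 seconds after
each later failure. A sketch with a placeholder host (\fB\-\-tries\fR sets
the number of attempts):
.Sp
.Vb 1
\&        wget \-\-tries=8 \-\-waitretry=4 http://example.com/file.tar.gz
.Ve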
722.IP "\fB\-\-random\-wait\fR" 4
723.IX Item "--random-wait"
724Some web sites may perform log analysis to identify retrieval programs
725such as Wget by looking for statistically significant similarities in
726the time between requests. This option causes the time between requests
727to vary between 0.5 and 1.5 * \fIwait\fR seconds, where \fIwait\fR was
728specified using the \fB\-\-wait\fR option, in order to mask Wget's
729presence from such analysis.
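.Sp
For instance, with \fB\-\-wait=10\fR the pause before each request would
fall between 5 and 15 seconds (0.5 * 10 and 1.5 * 10). A minimal sketch,
using a placeholder host:
.Sp
.Vb 1
\&        wget \-r \-\-wait=10 \-\-random\-wait http://example.com/docs/
.Ve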
730.Sp
731A 2001 article in a publication devoted to development on a popular
732consumer platform provided code to perform this analysis on the fly.
733Its author suggested blocking at the class C address level to ensure
734automated retrieval programs were blocked despite changing DHCP-supplied
735addresses.
736.Sp
737The \fB\-\-random\-wait\fR option was inspired by this ill-advised
738recommendation to block many unrelated users from a web site due to the
739actions of one.
740.IP "\fB\-\-no\-proxy\fR" 4
741.IX Item "--no-proxy"
742Don't use proxies, even if the appropriate \f(CW*_proxy\fR environment
743variable is defined.
744.IP "\fB\-Q\fR \fIquota\fR" 4
745.IX Item "-Q quota"
746.PD 0
747.IP "\fB\-\-quota=\fR\fIquota\fR" 4
748.IX Item "--quota=quota"
749.PD
750Specify download quota for automatic retrievals. The value can be
751specified in bytes (default), kilobytes (with \fBk\fR suffix), or
752megabytes (with \fBm\fR suffix).
753.Sp
Note that quota will never affect downloading a single file. So if you
specify \fBwget \-Q10k ftp://wuarchive.wustl.edu/ls\-lR.gz\fR, the whole of
\&\fIls\-lR.gz\fR will be downloaded. The same goes even when several
URLs are specified on the command line. However, quota is
respected when retrieving either recursively or from an input file.
Thus you may safely type \fBwget \-Q2m \-i sites\fR; the download will be
aborted when the quota is exceeded.
.Sp
Setting quota to 0 or to \fBinf\fR removes the download quota limit.
763.IP "\fB\-\-no\-dns\-cache\fR" 4
764.IX Item "--no-dns-cache"
765Turn off caching of \s-1DNS\s0 lookups. Normally, Wget remembers the \s-1IP\s0
766addresses it looked up from \s-1DNS\s0 so it doesn't have to repeatedly
767contact the \s-1DNS\s0 server for the same (typically small) set of hosts it
768retrieves from. This cache exists in memory only; a new Wget run will
769contact \s-1DNS\s0 again.
770.Sp
771However, it has been reported that in some situations it is not
772desirable to cache host names, even for the duration of a
773short-running application like Wget. With this option Wget issues a
774new \s-1DNS\s0 lookup (more precisely, a new call to \f(CW\*(C`gethostbyname\*(C'\fR or
775\&\f(CW\*(C`getaddrinfo\*(C'\fR) each time it makes a new connection. Please note
776that this option will \fInot\fR affect caching that might be
777performed by the resolving library or by an external caching layer,
778such as \s-1NSCD.\s0
779.Sp
780If you don't understand exactly what this option does, you probably
781won't need it.
782.IP "\fB\-\-restrict\-file\-names=\fR\fImodes\fR" 4
783.IX Item "--restrict-file-names=modes"
784Change which characters found in remote URLs must be escaped during
785generation of local filenames. Characters that are \fIrestricted\fR
786by this option are escaped, i.e. replaced with \fB\f(CB%HH\fB\fR, where
787\&\fB\s-1HH\s0\fR is the hexadecimal number that corresponds to the restricted
788character. This option may also be used to force all alphabetical
789cases to be either lower\- or uppercase.
790.Sp
791By default, Wget escapes the characters that are not valid or safe as
792part of file names on your operating system, as well as control
793characters that are typically unprintable. This option is useful for
794changing these defaults, perhaps because you are downloading to a
795non-native partition, or because you want to disable escaping of the
796control characters, or you want to further restrict characters to only
797those in the \s-1ASCII\s0 range of values.
798.Sp
799The \fImodes\fR are a comma-separated set of text values. The
800acceptable values are \fBunix\fR, \fBwindows\fR, \fBnocontrol\fR,
801\&\fBascii\fR, \fBlowercase\fR, and \fBuppercase\fR. The values
802\&\fBunix\fR and \fBwindows\fR are mutually exclusive (one will
803override the other), as are \fBlowercase\fR and
804\&\fBuppercase\fR. Those last are special cases, as they do not change
805the set of characters that would be escaped, but rather force local
806file paths to be converted either to lower\- or uppercase.
807.Sp
808When \*(L"unix\*(R" is specified, Wget escapes the character \fB/\fR and
809the control characters in the ranges 0\-\-31 and 128\-\-159. This is the
810default on Unix-like operating systems.
811.Sp
812When \*(L"windows\*(R" is given, Wget escapes the characters \fB\e\fR,
813\&\fB|\fR, \fB/\fR, \fB:\fR, \fB?\fR, \fB"\fR, \fB*\fR, \fB<\fR,
814\&\fB>\fR, and the control characters in the ranges 0\-\-31 and 128\-\-159.
815In addition to this, Wget in Windows mode uses \fB+\fR instead of
816\&\fB:\fR to separate host and port in local file names, and uses
817\&\fB@\fR instead of \fB?\fR to separate the query portion of the file
818name from the rest. Therefore, a \s-1URL\s0 that would be saved as
819\&\fBwww.xemacs.org:4300/search.pl?input=blah\fR in Unix mode would be
820saved as \fBwww.xemacs.org+4300/search.pl@input=blah\fR in Windows
821mode. This mode is the default on Windows.
822.Sp
823If you specify \fBnocontrol\fR, then the escaping of the control
824characters is also switched off. This option may make sense
825when you are downloading URLs whose names contain \s-1UTF\-8\s0 characters, on
826a system which can save and display filenames in \s-1UTF\-8 \s0(some possible
827byte values used in \s-1UTF\-8\s0 byte sequences fall in the range of values
828designated by Wget as \*(L"controls\*(R").
829.Sp
830The \fBascii\fR mode is used to specify that any bytes whose values
831are outside the range of \s-1ASCII\s0 characters (that is, greater than
832127) shall be escaped. This can be useful when saving filenames
833whose encoding does not match the one used locally.
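.Sp
Because the modes are comma-separated, several can be combined. The
following sketch (with a placeholder \s-1URL\s0) asks for Windows-safe
escaping and lowercase local names in one invocation:
.Sp
.Vb 1
\&        wget \-r \-\-restrict\-file\-names=windows,lowercase http://example.com/Docs/
.Ve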
834.IP "\fB\-4\fR" 4
835.IX Item "-4"
836.PD 0
837.IP "\fB\-\-inet4\-only\fR" 4
838.IX Item "--inet4-only"
839.IP "\fB\-6\fR" 4
840.IX Item "-6"
841.IP "\fB\-\-inet6\-only\fR" 4
842.IX Item "--inet6-only"
843.PD
844Force connecting to IPv4 or IPv6 addresses. With \fB\-\-inet4\-only\fR
845or \fB\-4\fR, Wget will only connect to IPv4 hosts, ignoring \s-1AAAA\s0
846records in \s-1DNS,\s0 and refusing to connect to IPv6 addresses specified in
847URLs. Conversely, with \fB\-\-inet6\-only\fR or \fB\-6\fR, Wget will
848only connect to IPv6 hosts and ignore A records and IPv4 addresses.
849.Sp
Neither option should normally be needed. By default, an IPv6\-aware
Wget will use the address family specified by the host's \s-1DNS\s0 record.
If the \s-1DNS\s0 responds with both IPv4 and IPv6 addresses, Wget will try
them in sequence until it finds one it can connect to. (See also the
\&\f(CW\*(C`\-\-prefer\-family\*(C'\fR option described below.)
855.Sp
856These options can be used to deliberately force the use of IPv4 or
857IPv6 address families on dual family systems, usually to aid debugging
858or to deal with broken network configuration. Only one of
859\&\fB\-\-inet6\-only\fR and \fB\-\-inet4\-only\fR may be specified at the
860same time. Neither option is available in Wget compiled without IPv6
861support.
862.IP "\fB\-\-prefer\-family=none/IPv4/IPv6\fR" 4
863.IX Item "--prefer-family=none/IPv4/IPv6"
When given a choice of several addresses, connect to the addresses
with the specified address family first. The address order returned by
\&\s-1DNS\s0 is used without change by default.
867.Sp
868This avoids spurious errors and connect attempts when accessing hosts
869that resolve to both IPv6 and IPv4 addresses from IPv4 networks. For
870example, \fBwww.kame.net\fR resolves to
871\&\fB2001:200:0:8002:203:47ff:fea5:3085\fR and to
872\&\fB203.178.141.194\fR. When the preferred family is \f(CW\*(C`IPv4\*(C'\fR, the
873IPv4 address is used first; when the preferred family is \f(CW\*(C`IPv6\*(C'\fR,
874the IPv6 address is used first; if the specified value is \f(CW\*(C`none\*(C'\fR,
875the address order returned by \s-1DNS\s0 is used without change.
876.Sp
877Unlike \fB\-4\fR and \fB\-6\fR, this option doesn't inhibit access to
878any address family, it only changes the \fIorder\fR in which the
879addresses are accessed. Also note that the reordering performed by
880this option is \fIstable\fR\-\-\-it doesn't affect order of addresses of
881the same family. That is, the relative order of all IPv4 addresses
882and of all IPv6 addresses remains intact in all cases.
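.Sp
As a sketch using the host from the example above, the first command
tries the IPv6 address before the IPv4 one, while the second refuses
IPv6 altogether:
.Sp
.Vb 2
\&        wget \-\-prefer\-family=IPv6 http://www.kame.net/
\&        wget \-4 http://www.kame.net/
.Ve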
883.IP "\fB\-\-retry\-connrefused\fR" 4
884.IX Item "--retry-connrefused"
885Consider \*(L"connection refused\*(R" a transient error and try again.
886Normally Wget gives up on a \s-1URL\s0 when it is unable to connect to the
887site because failure to connect is taken as a sign that the server is
888not running at all and that retries would not help. This option is
889for mirroring unreliable sites whose servers tend to disappear for
890short periods of time.
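.Sp
A sketch of how this might be combined with the retry options described
elsewhere in this manual; the host and the specific values are
illustrative assumptions:
.Sp
.Vb 2
\&        wget \-\-retry\-connrefused \-\-tries=20 \-\-waitretry=30 \e
\&             \-m http://example.com/
.Ve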
891.IP "\fB\-\-user=\fR\fIuser\fR" 4
892.IX Item "--user=user"
893.PD 0
894.IP "\fB\-\-password=\fR\fIpassword\fR" 4
895.IX Item "--password=password"
896.PD
897Specify the username \fIuser\fR and password \fIpassword\fR for both
898\&\s-1FTP\s0 and \s-1HTTP\s0 file retrieval. These parameters can be overridden
899using the \fB\-\-ftp\-user\fR and \fB\-\-ftp\-password\fR options for
900\&\s-1FTP\s0 connections and the \fB\-\-http\-user\fR and \fB\-\-http\-password\fR
901options for \s-1HTTP\s0 connections.
902.IP "\fB\-\-ask\-password\fR" 4
903.IX Item "--ask-password"
Prompt for a password for each connection established. This option cannot
be combined with \fB\-\-password\fR; the two are mutually exclusive.
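.Sp
For example, to supply the user name on the command line and type the
password at the prompt (the user name and host are placeholders):
.Sp
.Vb 1
\&        wget \-\-user=alice \-\-ask\-password http://example.com/private/report.pdf
.Ve
.Sp
This keeps the password out of the process listing and the shell history.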
906.IP "\fB\-\-no\-iri\fR" 4
907.IX Item "--no-iri"
908Turn off internationalized \s-1URI \s0(\s-1IRI\s0) support. Use \fB\-\-iri\fR to
909turn it on. \s-1IRI\s0 support is activated by default.
910.Sp
911You can set the default state of \s-1IRI\s0 support using the \f(CW\*(C`iri\*(C'\fR
912command in \fI.wgetrc\fR. That setting may be overridden from the
913command line.
914.IP "\fB\-\-local\-encoding=\fR\fIencoding\fR" 4
915.IX Item "--local-encoding=encoding"
916Force Wget to use \fIencoding\fR as the default system encoding. That affects
917how Wget converts URLs specified as arguments from locale to \s-1UTF\-8\s0 for
918\&\s-1IRI\s0 support.
919.Sp
Wget uses the function \f(CW\*(C`nl_langinfo()\*(C'\fR and then the \f(CW\*(C`CHARSET\*(C'\fR
environment variable to get the locale. If that fails, \s-1ASCII\s0 is used.
922.Sp
923You can set the default local encoding using the \f(CW\*(C`local_encoding\*(C'\fR
924command in \fI.wgetrc\fR. That setting may be overridden from the
925command line.
926.IP "\fB\-\-remote\-encoding=\fR\fIencoding\fR" 4
927.IX Item "--remote-encoding=encoding"
928Force Wget to use \fIencoding\fR as the default remote server encoding.
929That affects how Wget converts URIs found in files from remote encoding
to \s-1UTF\-8\s0 during a recursive fetch. This option is only useful for
\&\s-1IRI\s0 support, for the interpretation of non-ASCII characters.
932.Sp
933For \s-1HTTP,\s0 remote encoding can be found in \s-1HTTP \s0\f(CW\*(C`Content\-Type\*(C'\fR
934header and in \s-1HTML \s0\f(CW\*(C`Content\-Type http\-equiv\*(C'\fR meta tag.
935.Sp
936You can set the default encoding using the \f(CW\*(C`remoteencoding\*(C'\fR
937command in \fI.wgetrc\fR. That setting may be overridden from the
938command line.
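.Sp
As an illustration only (the encodings and host are assumptions chosen
for the example), a recursive fetch of a site whose pages are served in
\&\s-1ISO\-8859\-2\s0 from a terminal that uses \s-1UTF\-8\s0 might look like this:
.Sp
.Vb 2
\&        wget \-r \-\-local\-encoding=UTF\-8 \e
\&             \-\-remote\-encoding=ISO\-8859\-2 http://example.com/
.Ve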
939.IP "\fB\-\-unlink\fR" 4
940.IX Item "--unlink"
Force Wget to unlink an existing file instead of clobbering it. This
option is useful when downloading to a directory that contains hard links.
943.SS "Directory Options"
944.IX Subsection "Directory Options"
945.IP "\fB\-nd\fR" 4
946.IX Item "-nd"
947.PD 0
948.IP "\fB\-\-no\-directories\fR" 4
949.IX Item "--no-directories"
950.PD
951Do not create a hierarchy of directories when retrieving recursively.
952With this option turned on, all files will get saved to the current
953directory, without clobbering (if a name shows up more than once, the
954filenames will get extensions \fB.n\fR).
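.Sp
A minimal sketch with a placeholder host: every file found under
\&\fIpics/\fR, at any depth, is written directly into the current directory.
.Sp
.Vb 1
\&        wget \-r \-nd http://example.com/pics/
.Ve
.Sp
If two remote directories each contain a file named \fIthumb.jpg\fR, the
second copy would be saved as \fIthumb.jpg.1\fR.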
955.IP "\fB\-x\fR" 4
956.IX Item "-x"
957.PD 0
958.IP "\fB\-\-force\-directories\fR" 4
959.IX Item "--force-directories"
960.PD
961The opposite of \fB\-nd\fR\-\-\-create a hierarchy of directories, even if
962one would not have been created otherwise. E.g. \fBwget \-x
963http://fly.srk.fer.hr/robots.txt\fR will save the downloaded file to
964\&\fIfly.srk.fer.hr/robots.txt\fR.
965.IP "\fB\-nH\fR" 4
966.IX Item "-nH"
967.PD 0
968.IP "\fB\-\-no\-host\-directories\fR" 4
969.IX Item "--no-host-directories"
970.PD
971Disable generation of host-prefixed directories. By default, invoking
972Wget with \fB\-r http://fly.srk.fer.hr/\fR will create a structure of
973directories beginning with \fIfly.srk.fer.hr/\fR. This option disables
974such behavior.
975.IP "\fB\-\-protocol\-directories\fR" 4
976.IX Item "--protocol-directories"
977Use the protocol name as a directory component of local file names. For
978example, with this option, \fBwget \-r http://\fR\fIhost\fR will save to
979\&\fBhttp/\fR\fIhost\fR\fB/...\fR rather than just to \fIhost\fR\fB/...\fR.
980.IP "\fB\-\-cut\-dirs=\fR\fInumber\fR" 4
981.IX Item "--cut-dirs=number"
982Ignore \fInumber\fR directory components. This is useful for getting a
983fine-grained control over the directory where recursive retrieval will
984be saved.
985.Sp
986Take, for example, the directory at
987\&\fBftp://ftp.xemacs.org/pub/xemacs/\fR. If you retrieve it with
988\&\fB\-r\fR, it will be saved locally under
989\&\fIftp.xemacs.org/pub/xemacs/\fR. While the \fB\-nH\fR option can
990remove the \fIftp.xemacs.org/\fR part, you are still stuck with
991\&\fIpub/xemacs\fR. This is where \fB\-\-cut\-dirs\fR comes in handy; it
992makes Wget not \*(L"see\*(R" \fInumber\fR remote directory components. Here
993are several examples of how \fB\-\-cut\-dirs\fR option works.
994.Sp
.Vb 4
\&        No options        \-> ftp.xemacs.org/pub/xemacs/
\&        \-nH               \-> pub/xemacs/
\&        \-nH \-\-cut\-dirs=1  \-> xemacs/
\&        \-nH \-\-cut\-dirs=2  \-> .
\&
\&        \-\-cut\-dirs=1      \-> ftp.xemacs.org/xemacs/
\&        ...
.Ve
1004.Sp
1005If you just want to get rid of the directory structure, this option is
similar to a combination of \fB\-nd\fR and \fB\-P\fR. However, unlike
\&\fB\-nd\fR, \fB\-\-cut\-dirs\fR does not lose subdirectories: for
instance, with \fB\-nH \-\-cut\-dirs=1\fR, a \fIbeta/\fR subdirectory will
be placed in \fIxemacs/beta\fR, as one would expect.
1010.IP "\fB\-P\fR \fIprefix\fR" 4
1011.IX Item "-P prefix"
1012.PD 0
1013.IP "\fB\-\-directory\-prefix=\fR\fIprefix\fR" 4
1014.IX Item "--directory-prefix=prefix"
1015.PD
1016Set directory prefix to \fIprefix\fR. The \fIdirectory prefix\fR is the
1017directory where all other files and subdirectories will be saved to,
1018i.e. the top of the retrieval tree. The default is \fB.\fR (the
1019current directory).
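.Sp
As a sketch that builds on the \fB\-\-cut\-dirs\fR example above, the
following would place the retrieved tree directly under the (hypothetical)
directory \fI/tmp/xemacs\fR:
.Sp
.Vb 2
\&        wget \-r \-nH \-\-cut\-dirs=2 \-P /tmp/xemacs \e
\&             ftp://ftp.xemacs.org/pub/xemacs/
.Ve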
1020.SS "\s-1HTTP\s0 Options"
1021.IX Subsection "HTTP Options"
1022.IP "\fB\-\-default\-page=\fR\fIname\fR" 4
1023.IX Item "--default-page=name"
1024Use \fIname\fR as the default file name when it isn't known (i.e., for
1025URLs that end in a slash), instead of \fIindex.html\fR.
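.Sp
For example, assuming a placeholder host whose directory index should be
stored as \fIhome.html\fR rather than \fIindex.html\fR:
.Sp
.Vb 1
\&        wget \-\-default\-page=home.html http://example.com/docs/
.Ve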
1026.IP "\fB\-E\fR" 4
1027.IX Item "-E"
1028.PD 0
1029.IP "\fB\-\-adjust\-extension\fR" 4
1030.IX Item "--adjust-extension"
1031.PD
1032If a file of type \fBapplication/xhtml+xml\fR or \fBtext/html\fR is
1033downloaded and the \s-1URL\s0 does not end with the regexp
1034\&\fB\e.[Hh][Tt][Mm][Ll]?\fR, this option will cause the suffix \fB.html\fR
1035to be appended to the local filename. This is useful, for instance, when
1036you're mirroring a remote site that uses \fB.asp\fR pages, but you want
1037the mirrored pages to be viewable on your stock Apache server. Another
1038good use for this is when you're downloading CGI-generated materials. A \s-1URL \s0
1039like \fBhttp://site.com/article.cgi?25\fR will be saved as
1040\&\fIarticle.cgi?25.html\fR.
1041.Sp
Note that filenames changed in this way will be re-downloaded every time
you re-mirror a site, because Wget can't tell that the local
\&\fI\fIX\fI.html\fR file corresponds to remote \s-1URL \s0\fIX\fR (since
it doesn't yet know that the \s-1URL\s0 produces output of type
\&\fBtext/html\fR or \fBapplication/xhtml+xml\fR).
1047.Sp
1048As of version 1.12, Wget will also ensure that any downloaded files of
1049type \fBtext/css\fR end in the suffix \fB.css\fR, and the option was
1050renamed from \fB\-\-html\-extension\fR, to better reflect its new
1051behavior. The old option name is still acceptable, but should now be
1052considered deprecated.
1053.Sp
1054At some point in the future, this option may well be expanded to
1055include suffixes for other types of content, including content types
1056that are not parsed by Wget.
1057.IP "\fB\-\-http\-user=\fR\fIuser\fR" 4
1058.IX Item "--http-user=user"
1059.PD 0
1060.IP "\fB\-\-http\-password=\fR\fIpassword\fR" 4
1061.IX Item "--http-password=password"
1062.PD
1063Specify the username \fIuser\fR and password \fIpassword\fR on an
1064\&\s-1HTTP\s0 server. According to the type of the challenge, Wget will
1065encode them using either the \f(CW\*(C`basic\*(C'\fR (insecure),
1066the \f(CW\*(C`digest\*(C'\fR, or the Windows \f(CW\*(C`NTLM\*(C'\fR authentication scheme.
1067.Sp
Another way to specify username and password is in the \s-1URL\s0 itself.
Either method reveals your password to anyone who
bothers to run \f(CW\*(C`ps\*(C'\fR. To prevent the passwords from being seen,
store them in \fI.wgetrc\fR or \fI.netrc\fR, and make sure to protect
those files from other users with \f(CW\*(C`chmod\*(C'\fR. If the passwords are
really important, do not leave them lying in those files either: edit
the files and delete the passwords after Wget has started the download.
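.Sp
As a minimal sketch of the \fI.netrc\fR approach, with a placeholder host,
user name and password, the file is first restricted to its owner and Wget
is then run without any credentials on the command line:
.Sp
.Vb 4
\&        # One\-line ~/.netrc entry (placeholder host and credentials):
\&        #   machine server.example.com login alice password s3cret
\&        chmod 600 ~/.netrc
\&        wget http://server.example.com/private/report.html
.Ve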
1074.IP "\fB\-\-no\-http\-keep\-alive\fR" 4
1075.IX Item "--no-http-keep-alive"
1076Turn off the \*(L"keep-alive\*(R" feature for \s-1HTTP\s0 downloads. Normally, Wget
1077asks the server to keep the connection open so that, when you download
1078more than one document from the same server, they get transferred over
1079the same \s-1TCP\s0 connection. This saves time and at the same time reduces
1080the load on the server.
1081.Sp
1082This option is useful when, for some reason, persistent (keep-alive)
1083connections don't work for you, for example due to a server bug or due
1084to the inability of server-side scripts to cope with the connections.
1085.IP "\fB\-\-no\-cache\fR" 4
1086.IX Item "--no-cache"
1087Disable server-side cache. In this case, Wget will send the remote
1088server an appropriate directive (\fBPragma: no-cache\fR) to get the
1089file from the remote service, rather than returning the cached version.
1090This is especially useful for retrieving and flushing out-of-date
1091documents on proxy servers.
1092.Sp
1093Caching is allowed by default.
1094.IP "\fB\-\-no\-cookies\fR" 4
1095.IX Item "--no-cookies"
1096Disable the use of cookies. Cookies are a mechanism for maintaining
1097server-side state. The server sends the client a cookie using the
1098\&\f(CW\*(C`Set\-Cookie\*(C'\fR header, and the client responds with the same cookie
1099upon further requests. Since cookies allow the server owners to keep
1100track of visitors and for sites to exchange this information, some
1101consider them a breach of privacy. The default is to use cookies;
1102however, \fIstoring\fR cookies is not on by default.
1103.IP "\fB\-\-load\-cookies\fR \fIfile\fR" 4
1104.IX Item "--load-cookies file"
1105Load cookies from \fIfile\fR before the first \s-1HTTP\s0 retrieval.
1106\&\fIfile\fR is a textual file in the format originally used by Netscape's
1107\&\fIcookies.txt\fR file.
1108.Sp
1109You will typically use this option when mirroring sites that require
1110that you be logged in to access some or all of their content. The login
1111process typically works by the web server issuing an \s-1HTTP\s0 cookie
1112upon receiving and verifying your credentials. The cookie is then
1113resent by the browser when accessing that part of the site, and so
1114proves your identity.
1115.Sp
1116Mirroring such a site requires Wget to send the same cookies your
1117browser sends when communicating with the site. This is achieved by
1118\&\fB\-\-load\-cookies\fR\-\-\-simply point Wget to the location of the
1119\&\fIcookies.txt\fR file, and it will send the same cookies your browser
1120would send in the same situation. Different browsers keep textual
1121cookie files in different locations:
1122.RS 4
1123.ie n .IP """Netscape 4.x.""" 4
1124.el .IP "\f(CWNetscape 4.x.\fR" 4
1125.IX Item "Netscape 4.x."
1126The cookies are in \fI~/.netscape/cookies.txt\fR.
1127.ie n .IP """Mozilla and Netscape 6.x.""" 4
1128.el .IP "\f(CWMozilla and Netscape 6.x.\fR" 4
1129.IX Item "Mozilla and Netscape 6.x."
1130Mozilla's cookie file is also named \fIcookies.txt\fR, located
1131somewhere under \fI~/.mozilla\fR, in the directory of your profile.
1132The full path usually ends up looking somewhat like
1133\&\fI~/.mozilla/default/\fIsome-weird-string\fI/cookies.txt\fR.
1134.ie n .IP """Internet Explorer.""" 4
1135.el .IP "\f(CWInternet Explorer.\fR" 4
1136.IX Item "Internet Explorer."
1137You can produce a cookie file Wget can use by using the File menu,
1138Import and Export, Export Cookies. This has been tested with Internet
1139Explorer 5; it is not guaranteed to work with earlier versions.
1140.ie n .IP """Other browsers.""" 4
1141.el .IP "\f(CWOther browsers.\fR" 4
1142.IX Item "Other browsers."
1143If you are using a different browser to create your cookies,
1144\&\fB\-\-load\-cookies\fR will only work if you can locate or produce a
1145cookie file in the Netscape format that Wget expects.
1146.RE
1147.RS 4
1148.Sp
1149If you cannot use \fB\-\-load\-cookies\fR, there might still be an
1150alternative. If your browser supports a \*(L"cookie manager\*(R", you can use
1151it to view the cookies used when accessing the site you're mirroring.
1152Write down the name and value of the cookie, and manually instruct Wget
1153to send those cookies, bypassing the \*(L"official\*(R" cookie support:
1154.Sp
.Vb 1
\&        wget \-\-no\-cookies \-\-header "Cookie: <name>=<value>"
.Ve
1158.RE
1159.IP "\fB\-\-save\-cookies\fR \fIfile\fR" 4
1160.IX Item "--save-cookies file"
1161Save cookies to \fIfile\fR before exiting. This will not save cookies
1162that have expired or that have no expiry time (so-called \*(L"session
1163cookies\*(R"), but also see \fB\-\-keep\-session\-cookies\fR.
1164.IP "\fB\-\-keep\-session\-cookies\fR" 4
1165.IX Item "--keep-session-cookies"
1166When specified, causes \fB\-\-save\-cookies\fR to also save session
1167cookies. Session cookies are normally not saved because they are
1168meant to be kept in memory and forgotten when you exit the browser.
1169Saving them is useful on sites that require you to log in or to visit
1170the home page before you can access some pages. With this option,
1171multiple Wget runs are considered a single browser session as far as
1172the site is concerned.
1173.Sp
1174Since the cookie file format does not normally carry session cookies,
1175Wget marks them with an expiry timestamp of 0. Wget's
1176\&\fB\-\-load\-cookies\fR recognizes those as session cookies, but it might
1177confuse other browsers. Also note that cookies so loaded will be
1178treated as other session cookies, which means that if you want
1179\&\fB\-\-save\-cookies\fR to preserve them again, you must use
1180\&\fB\-\-keep\-session\-cookies\fR again.
1181.IP "\fB\-\-ignore\-length\fR" 4
1182.IX Item "--ignore-length"
Unfortunately, some \s-1HTTP\s0 servers (\s-1CGI\s0 programs, to be more
precise) send out bogus \f(CW\*(C`Content\-Length\*(C'\fR headers, which confuses
Wget into thinking that the document was not fully retrieved. You can spot
this syndrome if Wget retries getting the same document again and again,
each time claiming that the (otherwise normal) connection has closed on
the very same byte.
1189.Sp
1190With this option, Wget will ignore the \f(CW\*(C`Content\-Length\*(C'\fR header\-\-\-as
1191if it never existed.
1192.IP "\fB\-\-header=\fR\fIheader-line\fR" 4
1193.IX Item "--header=header-line"
1194Send \fIheader-line\fR along with the rest of the headers in each
1195\&\s-1HTTP\s0 request. The supplied header is sent as-is, which means it
1196must contain name and value separated by colon, and must not contain
1197newlines.
1198.Sp
1199You may define more than one additional header by specifying
1200\&\fB\-\-header\fR more than once.
1201.Sp
.Vb 3
\&        wget \-\-header=\*(AqAccept\-Charset: iso\-8859\-2\*(Aq \e
\&             \-\-header=\*(AqAccept\-Language: hr\*(Aq \e
\&             http://fly.srk.fer.hr/
.Ve
1207.Sp
1208Specification of an empty string as the header value will clear all
1209previous user-defined headers.
1210.Sp
1211As of Wget 1.10, this option can be used to override headers otherwise
1212generated automatically. This example instructs Wget to connect to
1213localhost, but to specify \fBfoo.bar\fR in the \f(CW\*(C`Host\*(C'\fR header:
1214.Sp
.Vb 1
\&        wget \-\-header="Host: foo.bar" http://localhost/
.Ve
1218.Sp
1219In versions of Wget prior to 1.10 such use of \fB\-\-header\fR caused
1220sending of duplicate headers.
1221.IP "\fB\-\-max\-redirect=\fR\fInumber\fR" 4
1222.IX Item "--max-redirect=number"
1223Specifies the maximum number of redirections to follow for a resource.
1224The default is 20, which is usually far more than necessary. However, on
1225those occasions where you want to allow more (or fewer), this is the
1226option to use.
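.Sp
A sketch that lowers the limit to five redirections (placeholder \s-1URL\s0):
.Sp
.Vb 1
\&        wget \-\-max\-redirect=5 http://example.com/short\-link
.Ve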
1227.IP "\fB\-\-proxy\-user=\fR\fIuser\fR" 4
1228.IX Item "--proxy-user=user"
1229.PD 0
1230.IP "\fB\-\-proxy\-password=\fR\fIpassword\fR" 4
1231.IX Item "--proxy-password=password"
1232.PD
1233Specify the username \fIuser\fR and password \fIpassword\fR for
1234authentication on a proxy server. Wget will encode them using the
1235\&\f(CW\*(C`basic\*(C'\fR authentication scheme.
1236.Sp
1237Security considerations similar to those with \fB\-\-http\-password\fR
1238pertain here as well.
1239.IP "\fB\-\-referer=\fR\fIurl\fR" 4
1240.IX Item "--referer=url"
1241Include `Referer: \fIurl\fR' header in \s-1HTTP\s0 request. Useful for
1242retrieving documents with server-side processing that assume they are
1243always being retrieved by interactive web browsers and only come out
1244properly when Referer is set to one of the pages that point to them.
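.Sp
A sketch with placeholder \s-1URL\s0s, fetching an image as if the link had
been followed from the gallery page that points to it:
.Sp
.Vb 2
\&        wget \-\-referer=http://example.com/gallery/ \e
\&             http://example.com/gallery/photo1.jpg
.Ve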
1245.IP "\fB\-\-save\-headers\fR" 4
1246.IX Item "--save-headers"
1247Save the headers sent by the \s-1HTTP\s0 server to the file, preceding the
1248actual contents, with an empty line as the separator.
1249.IP "\fB\-U\fR \fIagent-string\fR" 4
1250.IX Item "-U agent-string"
1251.PD 0
1252.IP "\fB\-\-user\-agent=\fR\fIagent-string\fR" 4
1253.IX Item "--user-agent=agent-string"
1254.PD
1255Identify as \fIagent-string\fR to the \s-1HTTP\s0 server.
1256.Sp
1257The \s-1HTTP\s0 protocol allows the clients to identify themselves using a
1258\&\f(CW\*(C`User\-Agent\*(C'\fR header field. This enables distinguishing the
1259\&\s-1WWW\s0 software, usually for statistical purposes or for tracing of
1260protocol violations. Wget normally identifies as
1261\&\fBWget/\fR\fIversion\fR, \fIversion\fR being the current version
1262number of Wget.
1263.Sp
1264However, some sites have been known to impose the policy of tailoring
1265the output according to the \f(CW\*(C`User\-Agent\*(C'\fR\-supplied information.
1266While this is not such a bad idea in theory, it has been abused by
1267servers denying information to clients other than (historically)
1268Netscape or, more frequently, Microsoft Internet Explorer. This
1269option allows you to change the \f(CW\*(C`User\-Agent\*(C'\fR line issued by Wget.
1270Use of this option is discouraged, unless you really know what you are
1271doing.
1272.Sp
1273Specifying empty user agent with \fB\-\-user\-agent=""\fR instructs Wget
1274not to send the \f(CW\*(C`User\-Agent\*(C'\fR header in \s-1HTTP\s0 requests.
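.Sp
For illustration, the agent string below is made up; substitute whatever
identification is appropriate for your use:
.Sp
.Vb 1
\&        wget \-\-user\-agent="ExampleFetcher/1.0" http://example.com/
.Ve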
1275.IP "\fB\-\-post\-data=\fR\fIstring\fR" 4
1276.IX Item "--post-data=string"
1277.PD 0
1278.IP "\fB\-\-post\-file=\fR\fIfile\fR" 4
1279.IX Item "--post-file=file"
1280.PD
Use \s-1POST\s0 as the method for all \s-1HTTP\s0 requests and send the specified
data in the request body. \fB\-\-post\-data\fR sends \fIstring\fR as
data, whereas \fB\-\-post\-file\fR sends the contents of \fIfile\fR.
Other than that, they work in exactly the same way. Both are
intended for content of the form \f(CW\*(C`key1=value1&key2=value2\*(C'\fR,
with percent-encoding for special characters; the only difference is
that one takes its content as a command-line parameter and the other
reads its content from a file. Note that \fB\-\-post\-file\fR is
\&\fInot\fR for transmitting files as form attachments: those must
appear as \f(CW\*(C`key=value\*(C'\fR data (with appropriate percent-encoding) just
like everything else. Wget does not currently support
\&\f(CW\*(C`multipart/form\-data\*(C'\fR for transmitting \s-1POST\s0 data, only
\&\f(CW\*(C`application/x\-www\-form\-urlencoded\*(C'\fR. Only one of
\&\fB\-\-post\-data\fR and \fB\-\-post\-file\fR should be specified.
1295.Sp
Please note that Wget does not require the content to be of the form
\&\f(CW\*(C`key1=value1&key2=value2\*(C'\fR, and neither does it test for it. Wget will
simply transmit whatever data is provided to it. Most servers, however, expect
the \s-1POST\s0 data to be in the above format when processing \s-1HTML\s0 forms.
1300.Sp
1301When sending a \s-1POST\s0 request using the \fB\-\-post\-file\fR option, Wget treats
1302the file as a binary file and will send every character in the \s-1POST\s0 request
1303without stripping trailing newline or formfeed characters. Any other control
1304characters in the text will also be sent as-is in the \s-1POST\s0 request.
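.Sp
Because the file is sent byte for byte, one way to avoid posting a stray
trailing newline is to write the data with \f(CW\*(C`printf\*(C'\fR rather than an
editor. A sketch, in which \fIpost\-data.txt\fR is a hypothetical file
name and the \s-1URL\s0 and credentials are the placeholders used elsewhere
in this section:
.Sp
.Vb 2
\&        printf \*(Aq%s\*(Aq \*(Aquser=foo&password=bar\*(Aq > post\-data.txt
\&        wget \-\-post\-file=post\-data.txt http://server.com/auth.php
.Ve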
1305.Sp
1306Please be aware that Wget needs to know the size of the \s-1POST\s0 data in
1307advance. Therefore the argument to \f(CW\*(C`\-\-post\-file\*(C'\fR must be a regular
1308file; specifying a \s-1FIFO\s0 or something like \fI/dev/stdin\fR won't work.
1309It's not quite clear how to work around this limitation inherent in
1310\&\s-1HTTP/1.0. \s0 Although \s-1HTTP/1.1\s0 introduces \fIchunked\fR transfer that
1311doesn't require knowing the request length in advance, a client can't
1312use chunked unless it knows it's talking to an \s-1HTTP/1.1\s0 server. And it
1313can't know that until it receives a response, which in turn requires the
1314request to have been completed \*(-- a chicken-and-egg problem.
1315.Sp
Note: As of version 1.15, if Wget is redirected after the \s-1POST\s0 request is
completed, its behaviour will depend on the response code returned by the
server. In case of a 301 Moved Permanently, 302 Moved Temporarily or
307 Temporary Redirect, Wget will, in accordance with \s-1RFC2616,\s0 continue
to send a \s-1POST\s0 request.
If a server wants the client to change the request method upon
redirection, it should send a 303 See Other response code.
1323.Sp
1324This example shows how to log in to a server using \s-1POST\s0 and then proceed to
1325download the desired pages, presumably only accessible to authorized
1326users:
1327.Sp
1328.Vb 4
1329\& # Log in to the server. This can be done only once.
1330\& wget \-\-save\-cookies cookies.txt \e
1331\& \-\-post\-data \*(Aquser=foo&password=bar\*(Aq \e
1332\& http://server.com/auth.php
1333\&
1334\& # Now grab the page or pages we care about.
1335\& wget \-\-load\-cookies cookies.txt \e
1336\& \-p http://server.com/interesting/article.php
1337.Ve
1338.Sp
1339If the server is using session cookies to track user authentication,
1340the above will not work because \fB\-\-save\-cookies\fR will not save
1341them (and neither will browsers) and the \fIcookies.txt\fR file will
1342be empty. In that case use \fB\-\-keep\-session\-cookies\fR along with
1343\&\fB\-\-save\-cookies\fR to force saving of session cookies.
1344.IP "\fB\-\-method=\fR\fIHTTP-Method\fR" 4
1345.IX Item "--method=HTTP-Method"
1346For the purpose of RESTful scripting, Wget allows sending of other \s-1HTTP\s0 Methods
1347without the need to explicitly set them using \fB\-\-header=Header\-Line\fR.
1348Wget will use whatever string is passed to it after \fB\-\-method\fR as the \s-1HTTP\s0
1349Method to the server.
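.Sp
As a minimal sketch, a RESTful delete could look like the following. The host
\&\fIserver.example\fR and the resource path are hypothetical placeholders, not
part of any real service:
.Sp
.Vb 2
\&        # Ask the (hypothetical) server to delete one resource.
\&        wget \-\-method=DELETE http://server.example/api/items/42
.Ve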
1350.IP "\fB\-\-body\-data=\fR\fIData-String\fR" 4
1351.IX Item "--body-data=Data-String"
1352.PD 0
1353.IP "\fB\-\-body\-file=\fR\fIData-File\fR" 4
1354.IX Item "--body-file=Data-File"
1355.PD
1356Must be set when additional data needs to be sent to the server along with the
1357Method specified using \fB\-\-method\fR. \fB\-\-body\-data\fR sends \fIstring\fR as
1358data, whereas \fB\-\-body\-file\fR sends the contents of \fIfile\fR. Other than that,
1359they work in exactly the same way.
1360.Sp
1361Currently, \fB\-\-body\-file\fR is \fInot\fR for transmitting files as a whole.
1362Wget does not currently support \f(CW\*(C`multipart/form\-data\*(C'\fR for transmitting data;
1363only \f(CW\*(C`application/x\-www\-form\-urlencoded\*(C'\fR. In the future, this may be changed
1364so that wget sends the \fB\-\-body\-file\fR as a complete file instead of sending its
contents to the server. Please be aware that Wget needs to know the size of the
\&\s-1BODY\s0 data in advance, and hence the argument to \fB\-\-body\-file\fR should be a
regular file. See \fB\-\-post\-file\fR for a more detailed explanation.
1368Only one of \fB\-\-body\-data\fR and \fB\-\-body\-file\fR should be specified.
1369.Sp
1370If Wget is redirected after the request is completed, Wget will
1371suspend the current method and send a \s-1GET\s0 request till the redirection
1372is completed. This is true for all redirection response codes except
1373307 Temporary Redirect which is used to explicitly specify that the
1374request method should \fInot\fR change. Another exception is when
1375the method is set to \f(CW\*(C`POST\*(C'\fR, in which case the redirection rules
1376specified under \fB\-\-post\-data\fR are followed.
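.Sp
For illustration, the following sketch sends a \s-1PUT\s0 request with a short
form-encoded body. The \s-1URL\s0 and the field names are made-up placeholders:
.Sp
.Vb 4
\&        # Update a (hypothetical) resource with form\-encoded data.
\&        wget \-\-method=PUT \e
\&             \-\-body\-data=\*(Aqname=foo&colour=blue\*(Aq \e
\&             http://server.example/api/items/42
.Ve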
1377.IP "\fB\-\-content\-disposition\fR" 4
1378.IX Item "--content-disposition"
1379If this is set to on, experimental (not fully-functional) support for
1380\&\f(CW\*(C`Content\-Disposition\*(C'\fR headers is enabled. This can currently result in
1381extra round-trips to the server for a \f(CW\*(C`HEAD\*(C'\fR request, and is known
1382to suffer from a few bugs, which is why it is not currently enabled by default.
1383.Sp
1384This option is useful for some file-downloading \s-1CGI\s0 programs that use
1385\&\f(CW\*(C`Content\-Disposition\*(C'\fR headers to describe what the name of a
1386downloaded file should be.
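.Sp
A hedged usage sketch (the download script \s-1URL\s0 below is hypothetical); the
quotes keep the shell from interpreting the \f(CW\*(C`?\*(C'\fR in the \s-1URL\s0:
.Sp
.Vb 2
\&        # Let the server\-suggested name, if any, determine the local file name.
\&        wget \-\-content\-disposition \*(Aqhttp://server.example/download.cgi?id=42\*(Aq
.Ve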
1387.IP "\fB\-\-content\-on\-error\fR" 4
1388.IX Item "--content-on-error"
If this is set to on, Wget will not skip the content when the server responds
with an \s-1HTTP\s0 status code that indicates an error.
1391.IP "\fB\-\-trust\-server\-names\fR" 4
1392.IX Item "--trust-server-names"
If this is set to on, on a redirect the last component of the
redirection \s-1URL\s0 will be used as the local file name. By default, the
last component of the original \s-1URL\s0 is used.
1396.IP "\fB\-\-auth\-no\-challenge\fR" 4
1397.IX Item "--auth-no-challenge"
1398If this option is given, Wget will send Basic \s-1HTTP\s0 authentication
1399information (plaintext username and password) for all requests, just
1400like Wget 1.10.2 and prior did by default.
1401.Sp
Use of this option is not recommended, and is intended only to support
a few obscure servers that never send \s-1HTTP\s0 authentication
challenges but accept unsolicited auth info, say, in addition to
form-based authentication.
1406.SS "\s-1HTTPS \s0(\s-1SSL/TLS\s0) Options"
1407.IX Subsection "HTTPS (SSL/TLS) Options"
1408To support encrypted \s-1HTTP \s0(\s-1HTTPS\s0) downloads, Wget must be compiled
1409with an external \s-1SSL\s0 library. The current default is GnuTLS.
1410In addition, Wget also supports \s-1HSTS \s0(\s-1HTTP\s0 Strict Transport Security).
1411If Wget is compiled without \s-1SSL\s0 support, none of these options are available.
1412.IP "\fB\-\-secure\-protocol=\fR\fIprotocol\fR" 4
1413.IX Item "--secure-protocol=protocol"
1414Choose the secure protocol to be used. Legal values are \fBauto\fR,
1415\&\fBSSLv2\fR, \fBSSLv3\fR, \fBTLSv1\fR, \fBTLSv1_1\fR, \fBTLSv1_2\fR
1416and \fB\s-1PFS\s0\fR. If \fBauto\fR is used, the \s-1SSL\s0 library is given the
1417liberty of choosing the appropriate protocol automatically, which is
1418achieved by sending a TLSv1 greeting. This is the default.
1419.Sp
1420Specifying \fBSSLv2\fR, \fBSSLv3\fR, \fBTLSv1\fR, \fBTLSv1_1\fR or
1421\&\fBTLSv1_2\fR forces the use of the corresponding protocol. This is
1422useful when talking to old and buggy \s-1SSL\s0 server implementations that
1423make it hard for the underlying \s-1SSL\s0 library to choose the correct
1424protocol version. Fortunately, such servers are quite rare.
1425.Sp
Specifying \fB\s-1PFS\s0\fR enforces the use of the so-called Perfect Forward
Secrecy cipher suites. In short, \s-1PFS\s0 adds security by creating a one-time
key for each \s-1SSL\s0 connection. It has a bit more \s-1CPU\s0 impact on client and server.
Wget uses ciphers known to be secure (e.g. no \s-1MD4\s0) and the \s-1TLS\s0 protocol.
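.Sp
For example, to insist on TLSv1.2 when talking to a server (the host name is a
placeholder):
.Sp
.Vb 1
\&        wget \-\-secure\-protocol=TLSv1_2 https://server.example/file.tar.gz
.Ve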
1430.IP "\fB\-\-https\-only\fR" 4
1431.IX Item "--https-only"
1432When in recursive mode, only \s-1HTTPS\s0 links are followed.
1433.IP "\fB\-\-no\-check\-certificate\fR" 4
1434.IX Item "--no-check-certificate"
1435Don't check the server certificate against the available certificate
1436authorities. Also don't require the \s-1URL\s0 host name to match the common
1437name presented by the certificate.
1438.Sp
1439As of Wget 1.10, the default is to verify the server's certificate
1440against the recognized certificate authorities, breaking the \s-1SSL\s0
1441handshake and aborting the download if the verification fails.
1442Although this provides more secure downloads, it does break
1443interoperability with some sites that worked with previous Wget
1444versions, particularly those using self-signed, expired, or otherwise
1445invalid certificates. This option forces an \*(L"insecure\*(R" mode of
1446operation that turns the certificate verification errors into warnings
1447and allows you to proceed.
1448.Sp
1449If you encounter \*(L"certificate verification\*(R" errors or ones saying
1450that \*(L"common name doesn't match requested host name\*(R", you can use
1451this option to bypass the verification and proceed with the download.
1452\&\fIOnly use this option if you are otherwise convinced of the
1453site's authenticity, or if you really don't care about the validity of
1454its certificate.\fR It is almost always a bad idea not to check the
1455certificates when transmitting confidential or important data.
1456For self\-signed/internal certificates, you should download the certificate
1457and verify against that instead of forcing this insecure mode.
If you are really sure that you do not want any certificate verification, you
can specify \fB\-\-check\-certificate=quiet\fR to tell Wget not to print any
warning about invalid certificates, although in most cases this is the
wrong thing to do.
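.Sp
The recommended alternative for a self-signed or internal certificate can be
sketched as follows. The file name \fIinternal\-ca.pem\fR and the host are
assumptions for illustration; the certificate is presumed to have been
obtained and checked through a channel you trust:
.Sp
.Vb 3
\&        # Verify the server against a locally stored PEM certificate
\&        # instead of disabling verification altogether.
\&        wget \-\-ca\-certificate=internal\-ca.pem https://intranet.example/report.pdf
.Ve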
1462.IP "\fB\-\-certificate=\fR\fIfile\fR" 4
1463.IX Item "--certificate=file"
1464Use the client certificate stored in \fIfile\fR. This is needed for
1465servers that are configured to require certificates from the clients
1466that connect to them. Normally a certificate is not required and this
1467switch is optional.
1468.IP "\fB\-\-certificate\-type=\fR\fItype\fR" 4
1469.IX Item "--certificate-type=type"
1470Specify the type of the client certificate. Legal values are
1471\&\fB\s-1PEM\s0\fR (assumed by default) and \fB\s-1DER\s0\fR, also known as
1472\&\fB\s-1ASN1\s0\fR.
1473.IP "\fB\-\-private\-key=\fR\fIfile\fR" 4
1474.IX Item "--private-key=file"
1475Read the private key from \fIfile\fR. This allows you to provide the
1476private key in a file separate from the certificate.
1477.IP "\fB\-\-private\-key\-type=\fR\fItype\fR" 4
1478.IX Item "--private-key-type=type"
1479Specify the type of the private key. Accepted values are \fB\s-1PEM\s0\fR
1480(the default) and \fB\s-1DER\s0\fR.
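.Sp
Putting the client-certificate options together, a sketch might look like this
(the file names and the \s-1URL\s0 are hypothetical):
.Sp
.Vb 4
\&        # Present a PEM client certificate with its key kept in a separate file.
\&        wget \-\-certificate=client.pem \-\-certificate\-type=PEM \e
\&             \-\-private\-key=client.key \-\-private\-key\-type=PEM \e
\&             https://secure.example/protected/data.txt
.Ve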
1481.IP "\fB\-\-ca\-certificate=\fR\fIfile\fR" 4
1482.IX Item "--ca-certificate=file"
1483Use \fIfile\fR as the file with the bundle of certificate authorities
1484(\*(L"\s-1CA\*(R"\s0) to verify the peers. The certificates must be in \s-1PEM\s0 format.
1485.Sp
1486Without this option Wget looks for \s-1CA\s0 certificates at the
1487system-specified locations, chosen at OpenSSL installation time.
1488.IP "\fB\-\-ca\-directory=\fR\fIdirectory\fR" 4
1489.IX Item "--ca-directory=directory"
1490Specifies directory containing \s-1CA\s0 certificates in \s-1PEM\s0 format. Each
1491file contains one \s-1CA\s0 certificate, and the file name is based on a hash
1492value derived from the certificate. This is achieved by processing a
1493certificate directory with the \f(CW\*(C`c_rehash\*(C'\fR utility supplied with
1494OpenSSL. Using \fB\-\-ca\-directory\fR is more efficient than
1495\&\fB\-\-ca\-certificate\fR when many certificates are installed because
1496it allows Wget to fetch certificates on demand.
1497.Sp
1498Without this option Wget looks for \s-1CA\s0 certificates at the
1499system-specified locations, chosen at OpenSSL installation time.
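.Sp
A minimal sketch of preparing such a directory, assuming the \s-1PEM\s0 files have
already been copied into the hypothetical directory \fI/etc/wget\-ca\fR:
.Sp
.Vb 5
\&        # Create the hash\-named links that the SSL library looks up.
\&        c_rehash /etc/wget\-ca
\&
\&        # Point Wget at the prepared directory.
\&        wget \-\-ca\-directory=/etc/wget\-ca https://server.example/
.Ve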
1500.IP "\fB\-\-crl\-file=\fR\fIfile\fR" 4
1501.IX Item "--crl-file=file"
Specifies a \s-1CRL\s0 file in \fIfile\fR. This is needed for certificates
that have been revoked by the CAs.
1504.IP "\fB\-\-random\-file=\fR\fIfile\fR" 4
1505.IX Item "--random-file=file"
1506[OpenSSL and LibreSSL only]
1507Use \fIfile\fR as the source of random data for seeding the
1508pseudo-random number generator on systems without \fI/dev/urandom\fR.
1509.Sp
1510On such systems the \s-1SSL\s0 library needs an external source of randomness
1511to initialize. Randomness may be provided by \s-1EGD \s0(see
1512\&\fB\-\-egd\-file\fR below) or read from an external source specified by
1513the user. If this option is not specified, Wget looks for random data
1514in \f(CW$RANDFILE\fR or, if that is unset, in \fI\f(CI$HOME\fI/.rnd\fR.
1515.Sp
1516If you're getting the \*(L"Could not seed OpenSSL \s-1PRNG\s0; disabling \s-1SSL.\*(R" \s0
1517error, you should provide random data using some of the methods
1518described above.
1519.IP "\fB\-\-egd\-file=\fR\fIfile\fR" 4
1520.IX Item "--egd-file=file"
1521[OpenSSL only]
1522Use \fIfile\fR as the \s-1EGD\s0 socket. \s-1EGD\s0 stands for \fIEntropy
1523Gathering Daemon\fR, a user-space program that collects data from
1524various unpredictable system sources and makes it available to other
1525programs that might need it. Encryption software, such as the \s-1SSL\s0
1526library, needs sources of non-repeating randomness to seed the random
1527number generator used to produce cryptographically strong keys.
1528.Sp
OpenSSL allows the user to specify their own source of entropy using the
\&\f(CW\*(C`RANDFILE\*(C'\fR environment variable. If this variable is unset, or
if the specified file does not produce enough randomness, OpenSSL will
read random data from the \s-1EGD\s0 socket specified using this option.
1533.Sp
1534If this option is not specified (and the equivalent startup command is
1535not used), \s-1EGD\s0 is never contacted. \s-1EGD\s0 is not needed on modern Unix
1536systems that support \fI/dev/urandom\fR.
1537.IP "\fB\-\-no\-hsts\fR" 4
1538.IX Item "--no-hsts"
1539Wget supports \s-1HSTS \s0(\s-1HTTP\s0 Strict Transport Security, \s-1RFC 6797\s0) by default.
1540Use \fB\-\-no\-hsts\fR to make Wget act as a non-HSTS-compliant \s-1UA.\s0 As a
1541consequence, Wget would ignore all the \f(CW\*(C`Strict\-Transport\-Security\*(C'\fR
1542headers, and would not enforce any existing \s-1HSTS\s0 policy.
1543.IP "\fB\-\-hsts\-file=\fR\fIfile\fR" 4
1544.IX Item "--hsts-file=file"
1545By default, Wget stores its \s-1HSTS\s0 database in \fI~/.wget\-hsts\fR.
You can use \fB\-\-hsts\-file\fR to override this. Wget will use
the supplied file as the \s-1HSTS\s0 database. The file must conform to the
\&\s-1HSTS\s0 database format used by Wget. If Wget cannot parse the provided
file, the behaviour is unspecified.
1550.Sp
Wget's \s-1HSTS\s0 database is a plain text file. Each line contains an \s-1HSTS\s0 entry
(i.e. a site that has issued a \f(CW\*(C`Strict\-Transport\-Security\*(C'\fR header and that
therefore has specified a concrete \s-1HSTS\s0 policy to be applied). Lines starting with
a hash (\f(CW\*(C`#\*(C'\fR) are ignored by Wget. Please note that, in spite of this convenient
human-readability, hand-hacking the \s-1HSTS\s0 database is generally not a good idea.
.Sp
An \s-1HSTS\s0 entry line consists of several fields separated by one or more whitespace characters:
1558.Sp
1559\&\f(CW\*(C`<hostname> SP [<port>] SP <include subdomains> SP <created> SP <max\-age>\*(C'\fR
1560.Sp
1561The \fIhostname\fR and \fIport\fR fields indicate the hostname and port to which
1562the given \s-1HSTS\s0 policy applies. The \fIport\fR field may be zero, and it will, in
1563most of the cases. That means that the port number will not be taken into account
1564when deciding whether such \s-1HSTS\s0 policy should be applied on a given request (only
1565the hostname will be evaluated). When \fIport\fR is different to zero, both the
1566target hostname and the port will be evaluated and the \s-1HSTS\s0 policy will only be applied
1567if both of them match. This feature has been included for testing/development purposes only.
1568The Wget testsuite (in \fItestenv/\fR) creates \s-1HSTS\s0 databases with explicit ports
1569with the purpose of ensuring Wget's correct behaviour. Applying \s-1HSTS\s0 policies to ports
1570other than the default ones is discouraged by \s-1RFC 6797 \s0(see Appendix B \*(L"Differences
1571between \s-1HSTS\s0 Policy and Same-Origin Policy\*(R"). Thus, this functionality should not be used
in production environments and \fIport\fR will typically be zero. The last three fields
do what their names suggest. The field \fIinclude_subdomains\fR can be either \f(CW1\fR
or \f(CW0\fR and it signals whether the subdomains of the target domain should be
part of the given \s-1HSTS\s0 policy as well. The \fIcreated\fR and \fImax-age\fR fields
hold the timestamp of when the entry was created (first seen by Wget) and the
HSTS-defined value 'max\-age', which states how long that \s-1HSTS\s0 policy should remain active,
measured in seconds elapsed since the timestamp stored in \fIcreated\fR. Once that time
has passed, that \s-1HSTS\s0 policy will no longer be valid and will eventually be removed
from the database.
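.Sp
As an illustration only, an entry following the layout above might look like
the second line below. The values are invented: the hostname is a placeholder,
the port is zero, subdomains are included, and \fImax-age\fR is one year
(31536000 seconds):
.Sp
.Vb 2
\&        # <hostname>  <port>  <include subdomains>  <created>  <max\-age>
\&        example.com   0       1                     1546459200 31536000
.Ve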
1581.Sp
If you supply your own \s-1HSTS\s0 database via \fB\-\-hsts\-file\fR, be aware that Wget
may modify the provided file if the \s-1HSTS\s0 policies requested by the remote
servers differ from those in the file. When Wget exits,
it effectively updates the \s-1HSTS\s0 database by rewriting the database file with the new entries.
1586.Sp
1587If the supplied file does not exist, Wget will create one. This file will contain the new \s-1HSTS\s0
1588entries. If no \s-1HSTS\s0 entries were generated (no \f(CW\*(C`Strict\-Transport\-Security\*(C'\fR headers
1589were sent by any of the servers) then no file will be created, not even an empty one. This
1590behaviour applies to the default database file (\fI~/.wget\-hsts\fR) as well: it will not be
1591created until some server enforces an \s-1HSTS\s0 policy.
1592.Sp
1593Care is taken not to override possible changes made by other Wget processes at
1594the same time over the \s-1HSTS\s0 database. Before dumping the updated \s-1HSTS\s0 entries
1595on the file, Wget will re-read it and merge the changes.
1596.Sp
Using a custom \s-1HSTS\s0 database and/or modifying an existing one is discouraged.
For more information about the potential security threats arising from such practice,
see section 14 \*(L"Security Considerations\*(R" of \s-1RFC 6797,\s0 especially section 14.9
\&\*(L"Creative Manipulation of \s-1HSTS\s0 Policy Store\*(R".
1601.IP "\fB\-\-warc\-file=\fR\fIfile\fR" 4
1602.IX Item "--warc-file=file"
1603Use \fIfile\fR as the destination \s-1WARC\s0 file.
1604.IP "\fB\-\-warc\-header=\fR\fIstring\fR" 4
1605.IX Item "--warc-header=string"
Insert \fIstring\fR into the warcinfo record.
1607.IP "\fB\-\-warc\-max\-size=\fR\fIsize\fR" 4
1608.IX Item "--warc-max-size=size"
1609Set the maximum size of the \s-1WARC\s0 files to \fIsize\fR.
1610.IP "\fB\-\-warc\-cdx\fR" 4
1611.IX Item "--warc-cdx"
1612Write \s-1CDX\s0 index files.
1613.IP "\fB\-\-warc\-dedup=\fR\fIfile\fR" 4
1614.IX Item "--warc-dedup=file"
1615Do not store records listed in this \s-1CDX\s0 file.
1616.IP "\fB\-\-no\-warc\-compression\fR" 4
1617.IX Item "--no-warc-compression"
1618Do not compress \s-1WARC\s0 files with \s-1GZIP.\s0
1619.IP "\fB\-\-no\-warc\-digests\fR" 4
1620.IX Item "--no-warc-digests"
1621Do not calculate \s-1SHA1\s0 digests.
1622.IP "\fB\-\-no\-warc\-keep\-log\fR" 4
1623.IX Item "--no-warc-keep-log"
1624Do not store the log file in a \s-1WARC\s0 record.
1625.IP "\fB\-\-warc\-tempdir=\fR\fIdir\fR" 4
1626.IX Item "--warc-tempdir=dir"
1627Specify the location for temporary files created by the \s-1WARC\s0 writer.
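.Sp
A sketch combining several of the \s-1WARC\s0 options above (the archive name
\&\fIexample\-crawl\fR and the \s-1URL\s0 are placeholders):
.Sp
.Vb 4
\&        # Crawl recursively, record the traffic in WARC files of limited
\&        # size, and write a CDX index alongside them.
\&        wget \-r \-\-warc\-file=example\-crawl \-\-warc\-max\-size=1G \-\-warc\-cdx \e
\&             http://server.example/
.Ve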
1628.SS "\s-1FTP\s0 Options"
1629.IX Subsection "FTP Options"
1630.IP "\fB\-\-ftp\-user=\fR\fIuser\fR" 4
1631.IX Item "--ftp-user=user"
1632.PD 0
1633.IP "\fB\-\-ftp\-password=\fR\fIpassword\fR" 4
1634.IX Item "--ftp-password=password"
1635.PD
1636Specify the username \fIuser\fR and password \fIpassword\fR on an
1637\&\s-1FTP\s0 server. Without this, or the corresponding startup option,
1638the password defaults to \fB\-wget@\fR, normally used for anonymous
1639\&\s-1FTP.\s0
1640.Sp
1641Another way to specify username and password is in the \s-1URL\s0 itself. Either method reveals your password to anyone who
1642bothers to run \f(CW\*(C`ps\*(C'\fR. To prevent the passwords from being seen,
1643store them in \fI.wgetrc\fR or \fI.netrc\fR, and make sure to protect
1644those files from other users with \f(CW\*(C`chmod\*(C'\fR. If the passwords are
1645really important, do not leave them lying in those files either\-\-\-edit
1646the files and delete them after Wget has started the download.
1647.IP "\fB\-\-no\-remove\-listing\fR" 4
1648.IX Item "--no-remove-listing"
1649Don't remove the temporary \fI.listing\fR files generated by \s-1FTP\s0
1650retrievals. Normally, these files contain the raw directory listings
1651received from \s-1FTP\s0 servers. Not removing them can be useful for
1652debugging purposes, or when you want to be able to easily check on the
1653contents of remote server directories (e.g. to verify that a mirror
1654you're running is complete).
1655.Sp
1656Note that even though Wget writes to a known filename for this file,
1657this is not a security hole in the scenario of a user making
1658\&\fI.listing\fR a symbolic link to \fI/etc/passwd\fR or something and
1659asking \f(CW\*(C`root\*(C'\fR to run Wget in his or her directory. Depending on
1660the options used, either Wget will refuse to write to \fI.listing\fR,
1661making the globbing/recursion/time\-stamping operation fail, or the
1662symbolic link will be deleted and replaced with the actual
1663\&\fI.listing\fR file, or the listing will be written to a
1664\&\fI.listing.\fInumber\fI\fR file.
1665.Sp
Even though this situation isn't a problem, \f(CW\*(C`root\*(C'\fR should
never run Wget in a non-trusted user's directory. A user could do
something as simple as linking \fIindex.html\fR to \fI/etc/passwd\fR
and asking \f(CW\*(C`root\*(C'\fR to run Wget with \fB\-N\fR or \fB\-r\fR so the file
will be overwritten.
1671.IP "\fB\-\-no\-glob\fR" 4
1672.IX Item "--no-glob"
1673Turn off \s-1FTP\s0 globbing. Globbing refers to the use of shell-like
1674special characters (\fIwildcards\fR), like \fB*\fR, \fB?\fR, \fB[\fR
1675and \fB]\fR to retrieve more than one file from the same directory at
1676once, like:
1677.Sp
1678.Vb 1
1679\& wget ftp://gnjilux.srk.fer.hr/*.msg
1680.Ve
1681.Sp
1682By default, globbing will be turned on if the \s-1URL\s0 contains a
1683globbing character. This option may be used to turn globbing on or off
1684permanently.
1685.Sp
1686You may have to quote the \s-1URL\s0 to protect it from being expanded by
1687your shell. Globbing makes Wget look for a directory listing, which is
1688system-specific. This is why it currently works only with Unix \s-1FTP\s0
1689servers (and the ones emulating Unix \f(CW\*(C`ls\*(C'\fR output).
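.Sp
For instance, single quotes keep the shell from expanding the wildcard before
Wget sees it (the server name is a placeholder):
.Sp
.Vb 1
\&        wget \*(Aqftp://ftp.example.com/pub/*.msg\*(Aq
.Ve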
1690.IP "\fB\-\-no\-passive\-ftp\fR" 4
1691.IX Item "--no-passive-ftp"
1692Disable the use of the \fIpassive\fR \s-1FTP\s0 transfer mode. Passive \s-1FTP\s0
1693mandates that the client connect to the server to establish the data
1694connection rather than the other way around.
1695.Sp
1696If the machine is connected to the Internet directly, both passive and
1697active \s-1FTP\s0 should work equally well. Behind most firewall and \s-1NAT\s0
1698configurations passive \s-1FTP\s0 has a better chance of working. However,
1699in some rare firewall configurations, active \s-1FTP\s0 actually works when
1700passive \s-1FTP\s0 doesn't. If you suspect this to be the case, use this
1701option, or set \f(CW\*(C`passive_ftp=off\*(C'\fR in your init file.
1702.IP "\fB\-\-preserve\-permissions\fR" 4
1703.IX Item "--preserve-permissions"
1704Preserve remote file permissions instead of permissions set by umask.
1705.IP "\fB\-\-retr\-symlinks\fR" 4
1706.IX Item "--retr-symlinks"
1707By default, when retrieving \s-1FTP\s0 directories recursively and a symbolic link
1708is encountered, the symbolic link is traversed and the pointed-to files are
1709retrieved. Currently, Wget does not traverse symbolic links to directories to
1710download them recursively, though this feature may be added in the future.
1711.Sp
1712When \fB\-\-retr\-symlinks=no\fR is specified, the linked-to file is not
1713downloaded. Instead, a matching symbolic link is created on the local
1714filesystem. The pointed-to file will not be retrieved unless this recursive
1715retrieval would have encountered it separately and downloaded it anyway. This
1716option poses a security risk where a malicious \s-1FTP\s0 Server may cause Wget to
1717write to files outside of the intended directories through a specially crafted
1718\&.LISTING file.
1719.Sp
1720Note that when retrieving a file (not a directory) because it was
1721specified on the command-line, rather than because it was recursed to,
1722this option has no effect. Symbolic links are always traversed in this
1723case.
1724.SS "\s-1FTPS\s0 Options"
1725.IX Subsection "FTPS Options"
1726.IP "\fB\-\-ftps\-implicit\fR" 4
1727.IX Item "--ftps-implicit"
1728This option tells Wget to use \s-1FTPS\s0 implicitly. Implicit \s-1FTPS\s0 consists of initializing
1729\&\s-1SSL/TLS\s0 from the very beginning of the control connection. This option does not send
1730an \f(CW\*(C`AUTH TLS\*(C'\fR command: it assumes the server speaks \s-1FTPS\s0 and directly starts an
1731\&\s-1SSL/TLS\s0 connection. If the attempt is successful, the session continues just like
1732regular \s-1FTPS \s0(\f(CW\*(C`PBSZ\*(C'\fR and \f(CW\*(C`PROT\*(C'\fR are sent, etc.).
1733Implicit \s-1FTPS\s0 is no longer a requirement for \s-1FTPS\s0 implementations, and thus
1734many servers may not support it. If \fB\-\-ftps\-implicit\fR is passed and no explicit
1735port number specified, the default port for implicit \s-1FTPS, 990,\s0 will be used, instead
1736of the default port for the \*(L"normal\*(R" (explicit) \s-1FTPS\s0 which is the same as that of \s-1FTP,
173721.\s0
1738.IP "\fB\-\-no\-ftps\-resume\-ssl\fR" 4
1739.IX Item "--no-ftps-resume-ssl"
1740Do not resume the \s-1SSL/TLS\s0 session in the data channel. When starting a data connection,
1741Wget tries to resume the \s-1SSL/TLS\s0 session previously started in the control connection.
1742\&\s-1SSL/TLS\s0 session resumption avoids performing an entirely new handshake by reusing
1743the \s-1SSL/TLS\s0 parameters of a previous session. Typically, the \s-1FTPS\s0 servers want it that way,
1744so Wget does this by default. Under rare circumstances however, one might want to
1745start an entirely new \s-1SSL/TLS\s0 session in every data connection.
1746This is what \fB\-\-no\-ftps\-resume\-ssl\fR is for.
1747.IP "\fB\-\-ftps\-clear\-data\-connection\fR" 4
1748.IX Item "--ftps-clear-data-connection"
1749All the data connections will be in plain text. Only the control connection will be
1750under \s-1SSL/TLS.\s0 Wget will send a \f(CW\*(C`PROT C\*(C'\fR command to achieve this, which must be
1751approved by the server.
1752.IP "\fB\-\-ftps\-fallback\-to\-ftp\fR" 4
1753.IX Item "--ftps-fallback-to-ftp"
1754Fall back to \s-1FTP\s0 if \s-1FTPS\s0 is not supported by the target server. For security reasons,
1755this option is not asserted by default. The default behaviour is to exit with an error.
1756If a server does not successfully reply to the initial \f(CW\*(C`AUTH TLS\*(C'\fR command, or in the
1757case of implicit \s-1FTPS,\s0 if the initial \s-1SSL/TLS\s0 connection attempt is rejected, it is
1758considered that such server does not support \s-1FTPS.\s0
1759.SS "Recursive Retrieval Options"
1760.IX Subsection "Recursive Retrieval Options"
1761.IP "\fB\-r\fR" 4
1762.IX Item "-r"
1763.PD 0
1764.IP "\fB\-\-recursive\fR" 4
1765.IX Item "--recursive"
1766.PD
1767Turn on recursive retrieving. The default maximum depth is 5.
1768.IP "\fB\-l\fR \fIdepth\fR" 4
1769.IX Item "-l depth"
1770.PD 0
1771.IP "\fB\-\-level=\fR\fIdepth\fR" 4
1772.IX Item "--level=depth"
1773.PD
1774Specify recursion maximum depth level \fIdepth\fR.
1775.IP "\fB\-\-delete\-after\fR" 4
1776.IX Item "--delete-after"
1777This option tells Wget to delete every single file it downloads,
1778\&\fIafter\fR having done so. It is useful for pre-fetching popular
1779pages through a proxy, e.g.:
1780.Sp
1781.Vb 1
1782\& wget \-r \-nd \-\-delete\-after http://whatever.com/~popular/page/
1783.Ve
1784.Sp
1785The \fB\-r\fR option is to retrieve recursively, and \fB\-nd\fR to not
1786create directories.
1787.Sp
1788Note that \fB\-\-delete\-after\fR deletes files on the local machine. It
1789does not issue the \fB\s-1DELE\s0\fR command to remote \s-1FTP\s0 sites, for
1790instance. Also note that when \fB\-\-delete\-after\fR is specified,
1791\&\fB\-\-convert\-links\fR is ignored, so \fB.orig\fR files are simply not
1792created in the first place.
1793.IP "\fB\-k\fR" 4
1794.IX Item "-k"
1795.PD 0
1796.IP "\fB\-\-convert\-links\fR" 4
1797.IX Item "--convert-links"
1798.PD
1799After the download is complete, convert the links in the document to
1800make them suitable for local viewing. This affects not only the visible
1801hyperlinks, but any part of the document that links to external content,
1802such as embedded images, links to style sheets, hyperlinks to non-HTML
1803content, etc.
1804.Sp
1805Each link will be changed in one of the two ways:
1806.RS 4
1807.IP "\(bu" 4
1808The links to files that have been downloaded by Wget will be changed to
1809refer to the file they point to as a relative link.
1810.Sp
1811Example: if the downloaded file \fI/foo/doc.html\fR links to
1812\&\fI/bar/img.gif\fR, also downloaded, then the link in \fIdoc.html\fR
1813will be modified to point to \fB../bar/img.gif\fR. This kind of
1814transformation works reliably for arbitrary combinations of directories.
1815.IP "\(bu" 4
1816The links to files that have not been downloaded by Wget will be changed
1817to include host name and absolute path of the location they point to.
1818.Sp
1819Example: if the downloaded file \fI/foo/doc.html\fR links to
1820\&\fI/bar/img.gif\fR (or to \fI../bar/img.gif\fR), then the link in
1821\&\fIdoc.html\fR will be modified to point to
1822\&\fIhttp://\fIhostname\fI/bar/img.gif\fR.
1823.RE
1824.RS 4
1825.Sp
1826Because of this, local browsing works reliably: if a linked file was
1827downloaded, the link will refer to its local name; if it was not
1828downloaded, the link will refer to its full Internet address rather than
1829presenting a broken link. The fact that the former links are converted
1830to relative links ensures that you can move the downloaded hierarchy to
1831another directory.
1832.Sp
1833Note that only at the end of the download can Wget know which links have
1834been downloaded. Because of that, the work done by \fB\-k\fR will be
1835performed at the end of all the downloads.
1836.RE
1837.IP "\fB\-\-convert\-file\-only\fR" 4
1838.IX Item "--convert-file-only"
1839This option converts only the filename part of the URLs, leaving the rest
1840of the URLs untouched. This filename part is sometimes referred to as the
1841\&\*(L"basename\*(R", although we avoid that term here in order not to cause confusion.
1842.Sp
1843It works particularly well in conjunction with \fB\-\-adjust\-extension\fR, although
1844this coupling is not enforced. It proves useful to populate Internet caches
1845with files downloaded from different hosts.
1846.Sp
1847Example: if some link points to \fI//foo.com/bar.cgi?xyz\fR with
1848\&\fB\-\-adjust\-extension\fR asserted and its local destination is intended to be
1849\&\fI./foo.com/bar.cgi?xyz.css\fR, then the link would be converted to
1850\&\fI//foo.com/bar.cgi?xyz.css\fR. Note that only the filename part has been
1851modified. The rest of the \s-1URL\s0 has been left untouched, including the net path
1852(\f(CW\*(C`//\*(C'\fR) which would otherwise be processed by Wget and converted to the
effective scheme (i.e. \f(CW\*(C`http://\*(C'\fR).
1854.IP "\fB\-K\fR" 4
1855.IX Item "-K"
1856.PD 0
1857.IP "\fB\-\-backup\-converted\fR" 4
1858.IX Item "--backup-converted"
1859.PD
1860When converting a file, back up the original version with a \fB.orig\fR
1861suffix. Affects the behavior of \fB\-N\fR.
1862.IP "\fB\-m\fR" 4
1863.IX Item "-m"
1864.PD 0
1865.IP "\fB\-\-mirror\fR" 4
1866.IX Item "--mirror"
1867.PD
1868Turn on options suitable for mirroring. This option turns on recursion
1869and time-stamping, sets infinite recursion depth and keeps \s-1FTP\s0
1870directory listings. It is currently equivalent to
1871\&\fB\-r \-N \-l inf \-\-no\-remove\-listing\fR.
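.Sp
For example, to mirror a (hypothetical) site and make the local copy browsable
offline:
.Sp
.Vb 1
\&        wget \-\-mirror \-\-convert\-links http://server.example/docs/
.Ve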
1872.IP "\fB\-p\fR" 4
1873.IX Item "-p"
1874.PD 0
1875.IP "\fB\-\-page\-requisites\fR" 4
1876.IX Item "--page-requisites"
1877.PD
1878This option causes Wget to download all the files that are necessary to
1879properly display a given \s-1HTML\s0 page. This includes such things as
1880inlined images, sounds, and referenced stylesheets.
1881.Sp
1882Ordinarily, when downloading a single \s-1HTML\s0 page, any requisite documents
1883that may be needed to display it properly are not downloaded. Using
1884\&\fB\-r\fR together with \fB\-l\fR can help, but since Wget does not
1885ordinarily distinguish between external and inlined documents, one is
1886generally left with \*(L"leaf documents\*(R" that are missing their
1887requisites.
1888.Sp
1889For instance, say document \fI1.html\fR contains an \f(CW\*(C`<IMG>\*(C'\fR tag
1890referencing \fI1.gif\fR and an \f(CW\*(C`<A>\*(C'\fR tag pointing to external
1891document \fI2.html\fR. Say that \fI2.html\fR is similar but that its
1892image is \fI2.gif\fR and it links to \fI3.html\fR. Say this
1893continues up to some arbitrarily high number.
1894.Sp
1895If one executes the command:
1896.Sp
1897.Vb 1
1898\& wget \-r \-l 2 http://<site>/1.html
1899.Ve
1900.Sp
1901then \fI1.html\fR, \fI1.gif\fR, \fI2.html\fR, \fI2.gif\fR, and
1902\&\fI3.html\fR will be downloaded. As you can see, \fI3.html\fR is
1903without its requisite \fI3.gif\fR because Wget is simply counting the
1904number of hops (up to 2) away from \fI1.html\fR in order to determine
1905where to stop the recursion. However, with this command:
1906.Sp
1907.Vb 1
1908\& wget \-r \-l 2 \-p http://<site>/1.html
1909.Ve
1910.Sp
1911all the above files \fIand\fR \fI3.html\fR's requisite \fI3.gif\fR
1912will be downloaded. Similarly,
1913.Sp
1914.Vb 1
1915\& wget \-r \-l 1 \-p http://<site>/1.html
1916.Ve
1917.Sp
1918will cause \fI1.html\fR, \fI1.gif\fR, \fI2.html\fR, and \fI2.gif\fR
1919to be downloaded. One might think that:
1920.Sp
1921.Vb 1
1922\& wget \-r \-l 0 \-p http://<site>/1.html
1923.Ve
1924.Sp
1925would download just \fI1.html\fR and \fI1.gif\fR, but unfortunately
1926this is not the case, because \fB\-l 0\fR is equivalent to
1927\&\fB\-l inf\fR\-\-\-that is, infinite recursion. To download a single \s-1HTML\s0
1928page (or a handful of them, all specified on the command-line or in a
1929\&\fB\-i\fR \s-1URL\s0 input file) and its (or their) requisites, simply leave off
1930\&\fB\-r\fR and \fB\-l\fR:
1931.Sp
1932.Vb 1
1933\& wget \-p http://<site>/1.html
1934.Ve
1935.Sp
1936Note that Wget will behave as if \fB\-r\fR had been specified, but only
1937that single page and its requisites will be downloaded. Links from that
1938page to external documents will not be followed. Actually, to download
1939a single page and all its requisites (even if they exist on separate
1940websites), and make sure the lot displays properly locally, this author
1941likes to use a few options in addition to \fB\-p\fR:
1942.Sp
1943.Vb 1
1944\& wget \-E \-H \-k \-K \-p http://<site>/<document>
1945.Ve
1946.Sp
1947To finish off this topic, it's worth knowing that Wget's idea of an
1948external document link is any \s-1URL\s0 specified in an \f(CW\*(C`<A>\*(C'\fR tag, an
1949\&\f(CW\*(C`<AREA>\*(C'\fR tag, or a \f(CW\*(C`<LINK>\*(C'\fR tag other than \f(CW\*(C`<LINK
1950REL="stylesheet">\*(C'\fR.
1951.IP "\fB\-\-strict\-comments\fR" 4
1952.IX Item "--strict-comments"
1953Turn on strict parsing of \s-1HTML\s0 comments. The default is to terminate
1954comments at the first occurrence of \fB\-\->\fR.
1955.Sp
1956According to specifications, \s-1HTML\s0 comments are expressed as \s-1SGML
1957\&\s0\fIdeclarations\fR. A declaration is special markup that begins with
1958\&\fB<!\fR and ends with \fB>\fR, such as \fB<!DOCTYPE ...>\fR, that
1959may contain comments between a pair of \fB\-\-\fR delimiters. \s-1HTML\s0
1960comments are \*(L"empty declarations\*(R", \s-1SGML\s0 declarations without any
1961non-comment text. Therefore, \fB<!\-\-foo\-\->\fR is a valid comment, and
1962so is \fB<!\-\-one\*(-- \-\-two\-\->\fR, but \fB<!\-\-1\-\-2\-\->\fR is not.
1963.Sp
1964On the other hand, most \s-1HTML\s0 writers don't perceive comments as anything
1965other than text delimited with \fB<!\-\-\fR and \fB\-\->\fR, which is not
1966quite the same. For example, something like \fB<!\-\-\-\-\-\-\-\-\-\-\-\->\fR
1967works as a valid comment as long as the number of dashes is a multiple
1968of four (!). If not, the comment technically lasts until the next
1969\&\fB\-\-\fR, which may be at the other end of the document. Because of
1970this, many popular browsers completely ignore the specification and
1971implement what users have come to expect: comments delimited with
1972\&\fB<!\-\-\fR and \fB\-\->\fR.
1973.Sp
1974Until version 1.9, Wget interpreted comments strictly, which resulted in
1975missing links in many web pages that displayed fine in browsers, but had
1976the misfortune of containing non-compliant comments. Beginning with
1977version 1.9, Wget has joined the ranks of clients that implement
1978\&\*(L"naive\*(R" comments, terminating each comment at the first occurrence of
1979\&\fB\-\->\fR.
1980.Sp
1981If, for whatever reason, you want strict comment parsing, use this
1982option to turn it on.
1983.SS "Recursive Accept/Reject Options"
1984.IX Subsection "Recursive Accept/Reject Options"
1985.IP "\fB\-A\fR \fIacclist\fR \fB\-\-accept\fR \fIacclist\fR" 4
1986.IX Item "-A acclist --accept acclist"
1987.PD 0
1988.IP "\fB\-R\fR \fIrejlist\fR \fB\-\-reject\fR \fIrejlist\fR" 4
1989.IX Item "-R rejlist --reject rejlist"
1990.PD
1991Specify comma-separated lists of file name suffixes or patterns to
1992accept or reject. Note that if
1993any of the wildcard characters, \fB*\fR, \fB?\fR, \fB[\fR or
1994\&\fB]\fR, appear in an element of \fIacclist\fR or \fIrejlist\fR,
1995it will be treated as a pattern, rather than a suffix.
1996In this case, you have to enclose the pattern in quotes to prevent
1997your shell from expanding it, as in \fB\-A \*(L"*.mp3\*(R"\fR or \fB\-A '*.mp3'\fR.
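.Sp
As a quick illustration of why the quoting matters, the sketch below uses a
hypothetical helper, show_args, which only prints the arguments a command
would receive; it does not run Wget. The pattern list and the \s-1URL\s0 are
placeholders, not values taken from this manual.

```shell
# show_args is a hypothetical helper for illustration only: it prints
# each argument on its own line, exactly as a program would receive it.
show_args() {
    printf '%s\n' "$@"
}

# Quoted: the pattern list reaches the program unchanged, whatever
# files happen to be in the current directory.
show_args wget -r -A '*.mp3,*.ogg' 'http://<site>/music/'
```

Without the quotes, a shell that finds matching file names in the current
directory would substitute them for the pattern before Wget ever sees it.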
1998.IP "\fB\-\-accept\-regex\fR \fIurlregex\fR" 4
1999.IX Item "--accept-regex urlregex"
2000.PD 0
2001.IP "\fB\-\-reject\-regex\fR \fIurlregex\fR" 4
2002.IX Item "--reject-regex urlregex"
2003.PD
2004Specify a regular expression to accept or reject the complete \s-1URL.\s0
2005.IP "\fB\-\-regex\-type\fR \fIregextype\fR" 4
2006.IX Item "--regex-type regextype"
2007Specify the regular expression type. Possible types are \fBposix\fR or
2008\&\fBpcre\fR. Note that to be able to use \fBpcre\fR type, Wget has to be
2009compiled with libpcre support.
2010.IP "\fB\-D\fR \fIdomain-list\fR" 4
2011.IX Item "-D domain-list"
2012.PD 0
2013.IP "\fB\-\-domains=\fR\fIdomain-list\fR" 4
2014.IX Item "--domains=domain-list"
2015.PD
2016Set domains to be followed. \fIdomain-list\fR is a comma-separated list
2017of domains. Note that it does \fInot\fR turn on \fB\-H\fR.
2018.IP "\fB\-\-exclude\-domains\fR \fIdomain-list\fR" 4
2019.IX Item "--exclude-domains domain-list"
2020Specify the domains that are \fInot\fR to be followed.
2021.IP "\fB\-\-follow\-ftp\fR" 4
2022.IX Item "--follow-ftp"
2023Follow \s-1FTP\s0 links from \s-1HTML\s0 documents. Without this option,
2024Wget will ignore all the \s-1FTP\s0 links.
2025.IP "\fB\-\-follow\-tags=\fR\fIlist\fR" 4
2026.IX Item "--follow-tags=list"
2027Wget has an internal table of \s-1HTML\s0 tag / attribute pairs that it
2028considers when looking for linked documents during a recursive
2029retrieval. If a user wants only a subset of those tags to be
2030considered, however, he or she should specify such tags in a
2031comma-separated \fIlist\fR with this option.
2032.IP "\fB\-\-ignore\-tags=\fR\fIlist\fR" 4
2033.IX Item "--ignore-tags=list"
2034This is the opposite of the \fB\-\-follow\-tags\fR option. To skip
2035certain \s-1HTML\s0 tags when recursively looking for documents to download,
2036specify them in a comma-separated \fIlist\fR.
2037.Sp
2038In the past, this option was the best bet for downloading a single page
2039and its requisites, using a command-line like:
2040.Sp
2041.Vb 1
2042\& wget \-\-ignore\-tags=a,area \-H \-k \-K \-r http://<site>/<document>
2043.Ve
2044.Sp
2045However, the author of this option came across a page with tags like
2046\&\f(CW\*(C`<LINK REL="home" HREF="/">\*(C'\fR and came to the realization that
2047specifying tags to ignore was not enough. One can't just tell Wget to
2048ignore \f(CW\*(C`<LINK>\*(C'\fR, because then stylesheets will not be downloaded.
2049Now the best bet for downloading a single page and its requisites is the
2050dedicated \fB\-\-page\-requisites\fR option.
2051.IP "\fB\-\-ignore\-case\fR" 4
2052.IX Item "--ignore-case"
2053Ignore case when matching files and directories. This influences the
2054behavior of the \fB\-R\fR, \fB\-A\fR, \fB\-I\fR, and \fB\-X\fR options, as well as globbing
2055implemented when downloading from \s-1FTP\s0 sites. For example, with this
2056option, \fB\-A \*(L"*.txt\*(R"\fR will match \fBfile1.txt\fR, but also
2057\&\fBfile2.TXT\fR, \fBfile3.TxT\fR, and so on.
2058The quotes in the example are to prevent the shell from expanding the
2059pattern.
2060.IP "\fB\-H\fR" 4
2061.IX Item "-H"
2062.PD 0
2063.IP "\fB\-\-span\-hosts\fR" 4
2064.IX Item "--span-hosts"
2065.PD
2066Enable spanning across hosts when doing recursive retrieval.
2067.IP "\fB\-L\fR" 4
2068.IX Item "-L"
2069.PD 0
2070.IP "\fB\-\-relative\fR" 4
2071.IX Item "--relative"
2072.PD
2073Follow relative links only. Useful for retrieving a specific home page
2074without any distractions, not even those from the same hosts.
2075.IP "\fB\-I\fR \fIlist\fR" 4
2076.IX Item "-I list"
2077.PD 0
2078.IP "\fB\-\-include\-directories=\fR\fIlist\fR" 4
2079.IX Item "--include-directories=list"
2080.PD
2081Specify a comma-separated list of directories you wish to follow when
2082downloading. Elements
2083of \fIlist\fR may contain wildcards.
2084.IP "\fB\-X\fR \fIlist\fR" 4
2085.IX Item "-X list"
2086.PD 0
2087.IP "\fB\-\-exclude\-directories=\fR\fIlist\fR" 4
2088.IX Item "--exclude-directories=list"
2089.PD
2090Specify a comma-separated list of directories you wish to exclude from
2091download. Elements of
2092\&\fIlist\fR may contain wildcards.
2093.IP "\fB\-np\fR" 4
2094.IX Item "-np"
2095.PD 0
2096.IP "\fB\-\-no\-parent\fR" 4
2097.IX Item "--no-parent"
2098.PD
2099Do not ever ascend to the parent directory when retrieving recursively.
2100This is a useful option, since it guarantees that only the files
2101\&\fIbelow\fR a certain hierarchy will be downloaded.
2102.SH "ENVIRONMENT"
2103.IX Header "ENVIRONMENT"
2104Wget supports proxies for both \s-1HTTP\s0 and \s-1FTP\s0 retrievals. The
2105standard way to specify proxy location, which Wget recognizes, is using
2106the following environment variables:
2107.IP "\fBhttp_proxy\fR" 4
2108.IX Item "http_proxy"
2109.PD 0
2110.IP "\fBhttps_proxy\fR" 4
2111.IX Item "https_proxy"
2112.PD
2113If set, the \fBhttp_proxy\fR and \fBhttps_proxy\fR variables should
2114contain the URLs of the proxies for \s-1HTTP\s0 and \s-1HTTPS\s0
2115connections respectively.
2116.IP "\fBftp_proxy\fR" 4
2117.IX Item "ftp_proxy"
2118This variable should contain the \s-1URL\s0 of the proxy for \s-1FTP\s0
2119connections. It is quite common that \fBhttp_proxy\fR and
2120\&\fBftp_proxy\fR are set to the same \s-1URL.\s0
2121.IP "\fBno_proxy\fR" 4
2122.IX Item "no_proxy"
2123This variable should contain a comma-separated list of domain extensions
2124the proxy should \fInot\fR be used for. For instance, if the value of
2125\&\fBno_proxy\fR is \fB.mit.edu\fR, the proxy will not be used to retrieve
2126documents from \s-1MIT.\s0
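.PP
A minimal sketch of setting these variables in a POSIX shell before invoking
Wget. The proxy host proxy.example.com, the port 3128 and the second domain in
the no_proxy list are placeholders chosen for illustration; substitute the
values used on your own network.

```shell
# Placeholder proxy address; substitute the proxy used on your network.
http_proxy='http://proxy.example.com:3128/'
https_proxy="$http_proxy"
ftp_proxy="$http_proxy"
# Hosts in these domains are fetched directly, bypassing the proxy.
no_proxy='.mit.edu,.example.org'
export http_proxy https_proxy ftp_proxy no_proxy

# Any Wget started from this shell now inherits the settings, e.g.:
#   wget http://<site>/<document>
```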
2127.SH "EXIT STATUS"
2128.IX Header "EXIT STATUS"
2129Wget may return one of several error codes if it encounters problems.
2130.ie n .IP "0" 4
2131.el .IP "\f(CW0\fR" 4
2132.IX Item "0"
2133No problems occurred.
2134.ie n .IP "1" 4
2135.el .IP "\f(CW1\fR" 4
2136.IX Item "1"
2137Generic error code.
2138.ie n .IP "2" 4
2139.el .IP "\f(CW2\fR" 4
2140.IX Item "2"
2141Parse error\-\-\-for instance, when parsing command-line options, the
2142\&\fB.wgetrc\fR or \fB.netrc\fR...
2143.ie n .IP "3" 4
2144.el .IP "\f(CW3\fR" 4
2145.IX Item "3"
2146File I/O error.
2147.ie n .IP "4" 4
2148.el .IP "\f(CW4\fR" 4
2149.IX Item "4"
2150Network failure.
2151.ie n .IP "5" 4
2152.el .IP "\f(CW5\fR" 4
2153.IX Item "5"
2154\&\s-1SSL\s0 verification failure.
2155.ie n .IP "6" 4
2156.el .IP "\f(CW6\fR" 4
2157.IX Item "6"
2158Username/password authentication failure.
2159.ie n .IP "7" 4
2160.el .IP "\f(CW7\fR" 4
2161.IX Item "7"
2162Protocol errors.
2163.ie n .IP "8" 4
2164.el .IP "\f(CW8\fR" 4
2165.IX Item "8"
2166Server issued an error response.
2167.PP
2168With the exceptions of 0 and 1, the lower-numbered exit codes take
2169precedence over higher-numbered ones, when multiple types of errors
2170are encountered.
2171.PP
2172In versions of Wget prior to 1.12, Wget's exit status tended to be
2173unhelpful and inconsistent. Recursive downloads would virtually always
2174return 0 (success), regardless of any issues encountered, and
2175non-recursive fetches only returned the status corresponding to the
2176most recently attempted download.
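.PP
A script that wraps Wget can turn the exit status into a readable message.
The function below is a sketch only: describe_wget_status is a hypothetical
helper, not something Wget provides, and its messages simply repeat the
descriptions listed above.

```shell
# describe_wget_status is a hypothetical helper, not part of Wget.
# It translates an exit status into the description listed above.
describe_wget_status() {
    case "$1" in
        0) echo "No problems occurred" ;;
        1) echo "Generic error code" ;;
        2) echo "Parse error" ;;
        3) echo "File I/O error" ;;
        4) echo "Network failure" ;;
        5) echo "SSL verification failure" ;;
        6) echo "Username/password authentication failure" ;;
        7) echo "Protocol errors" ;;
        8) echo "Server issued an error response" ;;
        *) echo "Unknown exit status: $1" ;;
    esac
}

# Typical use after a download attempt (placeholder URL):
#   wget -q 'http://<site>/<document>'
#   describe_wget_status "$?"
```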
2177.SH "FILES"
2178.IX Header "FILES"
2179.IP "\fB/etc/wgetrc\fR" 4
2180.IX Item "/etc/wgetrc"
2181Default location of the \fIglobal\fR startup file.
2182.IP "\fB.wgetrc\fR" 4
2183.IX Item ".wgetrc"
2184User startup file.
2185.SH "BUGS"
2186.IX Header "BUGS"
2187You are welcome to submit bug reports via the \s-1GNU\s0 Wget bug tracker (see
2188<\fBhttp://wget.addictivecode.org/BugTracker\fR>).
2189.PP
2190Before actually submitting a bug report, please try to follow a few
2191simple guidelines.
2192.IP "1." 4
2193Please try to ascertain that the behavior you see really is a bug. If
2194Wget crashes, it's a bug. If Wget does not behave as documented,
2195it's a bug. If things work strangely, but you are not sure about the way
2196they are supposed to work, it might well be a bug, but you might want to
2197double-check the documentation and the mailing lists.
2198.IP "2." 4
2199Try to repeat the bug in as simple circumstances as possible. E.g. if
2200Wget crashes while downloading \fBwget \-rl0 \-kKE \-t5 \-\-no\-proxy
2201http://yoyodyne.com \-o /tmp/log\fR, you should try to see if the crash is
2202repeatable, and if it will occur with a simpler set of options. You might
2203even try to start the download at the page where the crash occurred to
2204see if that page somehow triggered the crash.
2205.Sp
2206Also, while I will probably be interested to know the contents of your
2207\&\fI.wgetrc\fR file, just dumping it into the debug message is probably
2208a bad idea. Instead, you should first try to see if the bug repeats
2209with \fI.wgetrc\fR moved out of the way. Only if it turns out that
2210\&\fI.wgetrc\fR settings affect the bug, mail me the relevant parts of
2211the file.
2212.IP "3." 4
2213Please start Wget with the \fB\-d\fR option and send us the resulting
2214output (or relevant parts thereof). If Wget was compiled without
2215debug support, recompile it\-\-\-it is \fImuch\fR easier to trace bugs
2216with debug support on.
2217.Sp
2218Note: please make sure to remove any potentially sensitive information
2219from the debug log before sending it to the bug address. The
2220\&\f(CW\*(C`\-d\*(C'\fR option won't go out of its way to collect sensitive information,
2221but the log \fIwill\fR contain a fairly complete transcript of Wget's
2222communication with the server, which may include passwords and pieces
2223of downloaded data. Since the bug address is publicly archived, you
2224may assume that all bug reports are visible to the public.
2225.IP "4." 4
2226If Wget has crashed, try to run it in a debugger, e.g. \f(CW\*(C`gdb \`which
2227wget\` core\*(C'\fR and type \f(CW\*(C`where\*(C'\fR to get the backtrace. This may not
2228work if the system administrator has disabled core files, but it is
2229safe to try.
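.PP
For the advice in guideline 3 about removing sensitive information, a first
pass over the debug log can be scripted. The sketch below is an assumption-laden
starting point, not a complete scrubber: redact_wget_log is a hypothetical
helper, and it assumes that the headers in question appear one per line,
starting at the beginning of the line. Passwords embedded in URLs and pieces of
downloaded data are not touched, so the result still needs a manual review.

```shell
# redact_wget_log is a hypothetical helper, not part of Wget.
# It reads a debug log on standard input and blanks the values of
# header lines that commonly carry credentials or session data.
redact_wget_log() {
    sed -E 's/^((Proxy-)?Authorization|Cookie|Set-Cookie):.*/\1: [REDACTED]/'
}

# Typical use (placeholder URL and file names):
#   wget -d -o wget-debug.log 'http://<site>/<document>'
#   redact_wget_log < wget-debug.log > wget-debug.clean.log
```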
2230.SH "SEE ALSO"
2231.IX Header "SEE ALSO"
2232This is \fBnot\fR the complete manual for \s-1GNU\s0 Wget.
2233For more complete information, including more detailed explanations of
2234some of the options, and a number of commands available
2235for use with \fI.wgetrc\fR files and the \fB\-e\fR option, see the \s-1GNU\s0
2236Info entry for \fIwget\fR.
2237.SH "AUTHOR"
2238.IX Header "AUTHOR"
2239Originally written by Hrvoje Nikšić <hniksic@xemacs.org>.
2240.SH "COPYRIGHT"
2241.IX Header "COPYRIGHT"
2242Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003,
22432004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2015 Free Software
2244Foundation, Inc.
2245.PP
2246Permission is granted to copy, distribute and/or modify this document
2247under the terms of the \s-1GNU\s0 Free Documentation License, Version 1.3 or
2248any later version published by the Free Software Foundation; with no
2249Invariant Sections, with no Front-Cover Texts, and with no Back-Cover
2250Texts. A copy of the license is included in the section entitled
2251\&\*(L"\s-1GNU\s0 Free Documentation License\*(R".