Downloading every file from a website while excluding HTML pages can be done with wget's filtering options. Below is a common approach.

wget is a command-line downloader that supports HTTP, HTTPS, and FTP. To fetch all non-HTML files, we can combine its recursive mode with its accept and reject lists.
The command looks like this:

```bash
wget -r -l inf -A pdf,jpg,png,mp3 -nd -np -R html,htm http://example.com
```
The parameters used:

- `-r`: enables recursive downloading; wget starts from the given URL and follows links to fetch resources.
- `-l inf`: sets the recursion depth to infinite.
- `-A pdf,jpg,png,mp3`: the accept list; only files with these extensions are kept.
- `-nd`: no directory hierarchy; all downloaded files are saved directly in the current directory.
- `-np`: never ascend to the parent directory when following links.
- `-R html,htm`: the reject list; ensures no HTML files are kept.
- `http://example.com`: the target website URL.
With this configuration, wget recursively downloads the listed file types from the target site. Note that wget must still fetch HTML pages temporarily in order to discover links during recursion; files matching the reject list are deleted after they are parsed, so no HTML files remain on disk.
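If the target server is slow or you want to avoid hammering it, the same command can be made gentler. This is a sketch using standard wget throttling options; the URL is a placeholder:

```bash
# Same recursive filter as above, but polite: pause between requests,
# randomize the pause, and cap bandwidth at roughly 200 KB/s.
wget -r -l inf -A pdf,jpg,png,mp3 -nd -np -R html,htm \
     --wait=1 --random-wait --limit-rate=200k \
     http://example.com
```

`--random-wait` multiplies the base `--wait` interval by a random factor, which makes the traffic pattern look less like an automated crawl.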
For example, to grab all lecture materials and audio files from a music school's website, where the content is mostly PDF and MP3, you would adjust the URL and trim the accept list so that only the formats you actually need are downloaded. This approach is simple and works well in practice.
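As a sketch of that use case, assuming the materials live under a hypothetical `https://music.example.edu/courses/` path:

```bash
# Recursively fetch only PDFs and MP3s into the current directory.
# With -A alone, anything not on the accept list (including the HTML
# pages wget fetches to follow links) is deleted automatically, so an
# explicit -R html,htm is optional here.
wget -r -l inf -nd -np -A pdf,mp3 https://music.example.edu/courses/
```

Keeping `-np` is important in a case like this: it stops wget from climbing above `/courses/` and downloading unrelated parts of the site.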