wget website recursively from localhost without using bandwidth

logax asked:

I’m looking to recursively download my WordPress website as a static site using wget. The problem is that whenever I do that, it uses too much bandwidth (3.5 GB), even though I end up downloading only 20 MB, which is weird. So I’m looking to download via localhost instead, but when I use wget with localhost I only get the index page. We all know that WordPress saves the website URL in the database, so how am I supposed to download using localhost? I already set it up in the Apache configuration; I just want to download without using so much bandwidth.

I tried using the -N option to reduce bandwidth, but I keep getting an error that the files don’t have a Last-Modified header, so it is not helping.

This is the command I’m using:

wget -N --recursive --no-clobber --page-requisites --html-extension --convert-links --restrict-file-names=windows --domains website website -P /opt/

Thank you,

I used /etc/hosts to point the website at the localhost IP, but it still redirects back to the original IP, and even then it only downloads the index page.

Is there a way to tell the server to add a Last-Modified header to all WordPress files?

My answer:

You should not use the --page-requisites option.

This option causes wget to download all images, CSS, scripts, etc., for each page, including ones from external sites. This isn’t necessary for your case because all of the internal ones would already be downloaded.
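
As a rough sketch of that suggestion, the command from the question could be reduced as follows. The host name `www.example.com` is a hypothetical placeholder for the real site, `/opt/` is the output directory from the question, and -N is left out only because the question reports that it did not help. The snippet prints the command for review rather than running it.

```shell
# Hypothetical placeholders: replace with the real host name and output directory.
site="www.example.com"
dest="/opt/"

# Same options as the command in the question, minus --page-requisites
# (and minus -N, which the question reports is not helping).
set -- \
  --recursive \
  --no-clobber \
  --html-extension \
  --convert-links \
  --restrict-file-names=windows \
  --domains "$site" \
  -P "$dest" \
  "http://$site/"

# Print the command for review. To actually run it: wget "$@"
echo "wget $*"
```

Once the printed command looks right, replacing the final `echo` line with `wget "$@"` runs the crawl with exactly those arguments.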

You should also consider a more reasonable way to export your site. If you mean for it to remain static forever, and you will never change it again (which is very unlikely!), then this approach is fine. But consider instead exporting it in a format suitable for importing into a static site generator such as Hugo or Jekyll. That way you get static web pages but can still maintain them when it becomes necessary later.

View the full question and any other answers on Server Fault.

This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.