How are web pages scraped, and how can you protect against it?
I'm not talking about extracting text or downloading a single web page, but I
see people downloading whole web sites. For example, there is a directory
called "example" that isn't even linked anywhere on the site; how do I know
it's there? How do I download *all* pages of a website? And how do I protect
against that?
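To make the question concrete, here is a minimal sketch of what such a crawler does: start from one URL, extract the links from its HTML, and follow every same-host link breadth-first until nothing new is found. Note that this only discovers *linked* pages; an unlinked directory like "example" is usually found differently, e.g. by guessing common names from a wordlist (directory enumeration) or from hints in files like robots.txt. All names below are illustrative, and `fetch` is left pluggable (in practice it would wrap something like `urllib.request.urlopen`).

```python
# Minimal breadth-first site crawler sketch (stdlib only).
# Only finds pages reachable by links; unlinked paths are not discovered.
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from collections import deque

class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag in a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(start_url, fetch, max_pages=100):
    """Breadth-first crawl of pages on the same host as start_url.
    `fetch` maps a URL to its HTML text (illustrative; plug in urllib)."""
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        try:
            html = fetch(url)
        except Exception:
            continue  # skip unreachable pages
        pages[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href).split("#", 1)[0]
            # stay on the same host, never visit a URL twice
            if urlparse(absolute).netloc == host and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return pages
```

Real site-downloaders (e.g. `wget --recursive` or HTTrack) work on this same principle, plus handling for images, CSS, rate limits, and robots.txt.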
This question is not language-specific; I would be happy with just a link
that explains the techniques involved, or with a detailed answer.
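On the protection side, one common defence is per-client rate limiting: a crawler fetching hundreds of pages per minute stands out from a human reader. A minimal sliding-window sketch follows; the class and parameter names are illustrative, not from any particular framework, and a real deployment would sit in the web server or a reverse proxy rather than in application code.

```python
# Sketch of a sliding-window rate limiter, one possible anti-scraping defence.
import time
from collections import defaultdict, deque

class RateLimiter:
    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window = window_seconds
        self.hits = defaultdict(deque)  # client IP -> recent request timestamps

    def allow(self, client_ip, now=None):
        """Return True if this request is within the client's budget."""
        now = time.monotonic() if now is None else now
        q = self.hits[client_ip]
        # drop timestamps that have fallen outside the window
        while q and now - q[0] > self.window:
            q.popleft()
        if len(q) >= self.max_requests:
            return False  # too many requests: throttle or block this client
        q.append(now)
        return True
```

Rate limiting alone won't stop a patient scraper, so it is usually combined with other signals (User-Agent checks, honeypot links, CAPTCHAs), but nothing can fully prevent copying of content a browser can already see.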