.net - Is there a way to get files from a webserver when directory listing is deactivated?


I am trying to build a "crawler" or "automatic downloader" that fetches every file found on a web server / web page.

So in my opinion there are two cases:

1) Directory listing is enabled. This is easy: read out the data in the listing and download every file I see.

2) Directory listing is disabled. What then? The only idea I have is to brute-force filenames and watch the reaction of the server (e.g. 404 means no file, 403 means a directory was found, and returned data means a file was found).

Is my idea right? Is there a better way?

You can parse the HTML and follow ('crawl') the links you get. This is the way most crawlers are implemented.

Check out these libraries for it:

  1. .NET: Html Agility Pack

  2. Python: Beautiful Soup

  3. PHP: htmlsimpledom
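To illustrate the parse-and-follow idea, here is a minimal sketch in Python. It uses only the standard library's `html.parser` rather than any of the libraries listed above, so the parsing is deliberately simple. The names `LinkCollector`, `extract_links` and `crawl` are made up for this example, and `fetch` is assumed to be any callable you supply that returns a page's HTML as a string, or `None` when the URL is not an HTML page.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects href/src targets found in a page (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                # Resolve relative links against the URL of the page.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return absolute links from html that stay on the same host."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    host = urlparse(base_url).netloc
    return [url for url in collector.links if urlparse(url).netloc == host]


def crawl(start_url, fetch, max_pages=100):
    """Breadth-first crawl; fetch(url) is an assumed user-supplied callable."""
    seen = {start_url}
    queue = deque([start_url])
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)
        html = fetch(url)
        if html is None:
            # Not an HTML page (e.g. a file to download): nothing to parse.
            continue
        for link in extract_links(html, url):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited
```

Note that this approach only finds files that are linked from some page. Files that exist on the server but are never linked will not be discovered this way.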

Always look for a robots.txt in the site's root, and make sure you respect the site's rules on which pages are allowed to be crawled.
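One possible way to honor those rules in Python is the standard library's `urllib.robotparser`. The sketch below assumes you have already downloaded the text of robots.txt yourself. The function name `build_robots_checker` and the user agent string `example-crawler` are placeholders for this example, not part of any library.

```python
from urllib.robotparser import RobotFileParser


def build_robots_checker(robots_txt, user_agent="example-crawler"):
    """Return a function that tells whether a URL may be crawled.

    robots_txt is the already-downloaded content of the site's robots.txt;
    user_agent is a placeholder name for your crawler.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return lambda url: parser.can_fetch(user_agent, url)


rules = "User-agent: *\nDisallow: /private/\n"
allowed = build_robots_checker(rules)
print(allowed("http://example.com/index.html"))
print(allowed("http://example.com/private/data.zip"))
```

A crawler would call the returned function before requesting each URL and skip any URL for which it returns `False`.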

