
Spiders

Discussion in 'General Internet Marketing' started by mastermind97, Apr 26, 2013.

  1. mastermind97

    mastermind97 Affiliate affiliate

    Hello,
    I created a site a week ago, and according to the StatPress plugin it has had a lot of visits from spiders: about 400-500 per day over the last 2-3 days. Is this alarming, and what does it mean? Is it good or bad, and why? Please let me know.
     
  3. Marc

    Marc 武士- Spamurai moderator affiliate

    Web crawlers (also called spiders or bots) are a normal part of the Internet, and they make up about 40% of Internet traffic. So there is nothing special about that.

    If you want to exclude crawlers from parts of your website, you have to place a "robots.txt" file in the root directory of your site on your server, so that it is reachable at /robots.txt. The content of this "robots.txt" file (a plain text file you can create in any simple editor) is:

    User-agent: *
    Disallow: /yoursubpage

    Instead of "yoursubpage", insert the path of the subpage (or subpages) that you want to exclude from crawler access, with one Disallow line per path.

    A simple method to exclude unwanted crawlers from certain parts of a website :)
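
    To check what a rule like the one above actually blocks before uploading it, a minimal sketch using Python's standard urllib.robotparser module could look like this. The host example.com, the agent name "AnyBot" and the helper is_allowed are placeholders for illustration only, not part of the original post.

```python
from urllib.robotparser import RobotFileParser

# The two-line example from the post; "/yoursubpage" is still a placeholder path.
ROBOTS_TXT = """\
User-agent: *
Disallow: /yoursubpage
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules let user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines, so no download is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "AnyBot" and example.com are hypothetical names used for the check.
    for path in ("/", "/yoursubpage", "/yoursubpage/more", "/other"):
        url = "http://example.com" + path
        print(path, is_allowed(ROBOTS_TXT, "AnyBot", url))
```

    Disallow matches by path prefix, so everything below /yoursubpage is excluded as well. Keep in mind that robots.txt is only a request: well-behaved crawlers follow it, but nothing forces a bot to obey.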
     
  4. oldman

    oldman Affiliate affiliate

    I have also had problems with bot traffic on my WordPress sites. Here is a copy of the robots.txt file I used to solve my problems.
    Just copy this into your robots.txt file:
    ==============================

    User-agent: Alexibot
    Disallow: /

    User-agent: Aqua_Products
    Disallow: /

    User-agent: asterias
    Disallow: /

    User-agent: b2w/0.1
    Disallow: /

    User-agent: BackDoorBot/1.0
    Disallow: /

    User-agent: BlowFish/1.0
    Disallow: /

    User-agent: Bookmark search tool
    Disallow: /

    User-agent: BotALot
    Disallow: /

    User-agent: BotRightHere
    Disallow: /

    User-agent: BuiltBotTough
    Disallow: /

    User-agent: Bullseye/1.0
    Disallow: /

    User-agent: BunnySlippers
    Disallow: /

    User-agent: CheeseBot
    Disallow: /

    User-agent: CherryPicker
    Disallow: /

    User-agent: CherryPickerElite/1.0
    Disallow: /

    User-agent: CherryPickerSE/1.0
    Disallow: /

    User-agent: Copernic
    Disallow: /

    User-agent: CopyRightCheck
    Disallow: /

    User-agent: cosmos
    Disallow: /

    User-agent: Crescent Internet ToolPak HTTP OLE Control v.1.0
    Disallow: /

    User-agent: Crescent
    Disallow: /

    User-agent: DittoSpyder
    Disallow: /

    User-agent: EmailCollector
    Disallow: /

    User-agent: EmailSiphon
    Disallow: /

    User-agent: EmailWolf
    Disallow: /

    User-agent: EroCrawler
    Disallow: /

    User-agent: ExtractorPro
    Disallow: /

    User-agent: FairAd Client
    Disallow: /

    User-agent: Flaming AttackBot
    Disallow: /

    User-agent: Foobot
    Disallow: /

    User-agent: Gaisbot
    Disallow: /

    User-agent: GetRight/4.2
    Disallow: /

    User-agent: Harvest/1.5
    Disallow: /

    User-agent: hloader
    Disallow: /

    User-agent: httplib
    Disallow: /

    User-agent: HTTrack 3.0
    Disallow: /

    User-agent: humanlinks
    Disallow: /

    User-agent: InfoNaviRobot
    Disallow: /

    User-agent: Iron33/1.0.2
    Disallow: /

    User-agent: JennyBot
    Disallow: /

    User-agent: Kenjin Spider
    Disallow: /

    User-agent: Keyword Density/0.9
    Disallow: /

    User-agent: larbin
    Disallow: /

    User-agent: LexiBot
    Disallow: /

    User-agent: libWeb/clsHTTP
    Disallow: /

    User-agent: LinkextractorPro
    Disallow: /

    User-agent: LinkScan/8.1a Unix
    Disallow: /

    User-agent: LinkWalker
    Disallow: /

    User-agent: LNSpiderguy
    Disallow: /

    User-agent: lwp-trivial/1.34
    Disallow: /

    User-agent: lwp-trivial
    Disallow: /

    User-agent: Mata Hari
    Disallow: /

    User-agent: Microsoft URL Control - 5.01.4511
    Disallow: /

    User-agent: Microsoft URL Control - 6.00.8169
    Disallow: /

    User-agent: Microsoft URL Control
    Disallow: /

    User-agent: MIIxpc/4.2
    Disallow: /

    User-agent: MIIxpc
    Disallow: /

    User-agent: Mister PiX
    Disallow: /

    User-agent: moget/2.1
    Disallow: /

    User-agent: moget
    Disallow: /

    User-agent: Mozilla/4.0 (compatible; BullsEye; Windows 95)
    Disallow: /

    User-agent: MSIECrawler
    Disallow: /

    User-agent: NetAnts
    Disallow: /

    User-agent: NICErsPRO
    Disallow: /

    User-agent: Offline Explorer
    Disallow: /

    User-agent: Openbot
    Disallow: /

    User-agent: Openfind data gatherer
    Disallow: /

    User-agent: Openfind
    Disallow: /

    User-agent: Oracle Ultra Search
    Disallow: /

    User-agent: PerMan
    Disallow: /

    User-agent: ProPowerBot/2.14
    Disallow: /

    User-agent: ProWebWalker
    Disallow: /

    User-agent: psbot
    Disallow: /

    User-agent: Python-urllib
    Disallow: /

    User-agent: QueryN Metasearch
    Disallow: /

    User-agent: Radiation Retriever 1.1
    Disallow: /

    User-agent: RepoMonkey Bait & Tackle/v1.01
    Disallow: /

    User-agent: RepoMonkey
    Disallow: /

    User-agent: RMA
    Disallow: /

    User-agent: searchpreview
    Disallow: /

    User-agent: SiteSnagger
    Disallow: /

    User-agent: SpankBot
    Disallow: /

    User-agent: spanner
    Disallow: /

    User-agent: suzuran
    Disallow: /

    User-agent: Szukacz/1.4
    Disallow: /

    User-agent: Teleport
    Disallow: /

    User-agent: TeleportPro
    Disallow: /

    User-agent: Telesoft
    Disallow: /

    User-agent: The Intraformant
    Disallow: /

    User-agent: TheNomad
    Disallow: /

    User-agent: TightTwatBot
    Disallow: /

    User-agent: toCrawl/UrlDispatcher
    Disallow: /

    User-agent: True_Robot/1.0
    Disallow: /

    User-agent: True_Robot
    Disallow: /

    User-agent: turingos
    Disallow: /

    User-agent: TurnitinBot/1.5
    Disallow: /

    User-agent: TurnitinBot
    Disallow: /

    User-agent: URL Control
    Disallow: /

    User-agent: URL_Spider_Pro
    Disallow: /

    User-agent: URLy Warning
    Disallow: /

    User-agent: VCI WebViewer VCI WebViewer Win32
    Disallow: /

    User-agent: VCI
    Disallow: /

    User-agent: Web Image Collector
    Disallow: /

    User-agent: WebAuto
    Disallow: /

    User-agent: WebBandit/3.50
    Disallow: /

    User-agent: WebBandit
    Disallow: /

    User-agent: WebCapture 2.0
    Disallow: /

    User-agent: WebCopier v.2.2
    Disallow: /

    User-agent: WebCopier v3.2a
    Disallow: /

    User-agent: WebCopier
    Disallow: /

    User-agent: WebEnhancer
    Disallow: /

    User-agent: WebSauger
    Disallow: /

    User-agent: Website Quester
    Disallow: /

    User-agent: Webster Pro
    Disallow: /

    User-agent: WebStripper
    Disallow: /

    User-agent: WebZip/4.0
    Disallow: /

    User-agent: WebZIP/4.21
    Disallow: /

    User-agent: WebZIP/5.0
    Disallow: /

    User-agent: WebZip
    Disallow: /

    User-agent: Wget/1.5.3
    Disallow: /

    User-agent: Wget/1.6
    Disallow: /

    User-agent: Wget
    Disallow: /

    User-agent: wget
    Disallow: /

    User-agent: WWW-Collector-E
    Disallow: /

    User-agent: Xenu's Link Sleuth 1.1c
    Disallow: /

    User-agent: Xenu's
    Disallow: /

    User-agent: Zeus 32297 Webster Pro V2.9 Win32
    Disallow: /

    User-agent: Zeus Link Scout
    Disallow: /

    User-agent: Zeus
    Disallow: /

    User-agent: Adsbot-Google
    Disallow:

    User-agent: Googlebot
    Disallow:

    User-agent: Mediapartners-Google
    Disallow:

    User-agent: *
    Disallow: /cgi-bin/
    Disallow: /wp-admin/
    Disallow: /wp-includes/
    Disallow: /wp-content/plugins/
    Disallow: /wp-content/cache/
    Disallow: /wp-content/themes/
    Disallow: /wp-login.php
    Disallow: /wp-register.php
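
    The file above combines three kinds of groups: named bots that are blocked completely, Google crawlers with an empty Disallow (which means nothing is blocked for them), and a catch-all group for everyone else. A rough sketch of how a parser resolves these groups, using Python's standard urllib.robotparser on a shortened excerpt, might look like this. The host example.com, the agent "SomeOtherBot" and the helper names are assumptions for illustration only.

```python
from urllib.robotparser import RobotFileParser

# Shortened excerpt of the file above: one blocked bot, one explicitly
# allowed bot, and the catch-all group.
EXCERPT = """\
User-agent: Wget
Disallow: /

User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /wp-admin/
Disallow: /wp-login.php
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content given as a string."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed(parser: RobotFileParser, user_agent: str, path: str) -> bool:
    """Check one path on a hypothetical host for one user agent."""
    return parser.can_fetch(user_agent, "http://example.com" + path)


if __name__ == "__main__":
    parser = build_parser(EXCERPT)
    # "SomeOtherBot" stands in for any crawler without its own group.
    for agent in ("Wget", "Googlebot", "SomeOtherBot"):
        for path in ("/", "/wp-admin/"):
            print(agent, path, allowed(parser, agent, path))
```

    With this parser, a crawler that has its own group uses only that group, so Googlebot may fetch /wp-admin/ even though the catch-all group disallows it. If Google should stay out of the WordPress folders too, the same Disallow lines would have to be repeated in its group.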
     
