Month: March 2018

  • What are the best SEO crawlers for a huge website?

    You want to crawl millions of URLs and run a full SEO audit?

    Here are the best SEO crawlers for large to very large websites:

    • Desktop SEO crawlers (starting at roughly $200 / year / unlimited crawls)
      • Screaming Frog (Linux / Windows / Mac OS) -> The big plus: native features to cross-reference data from Google Analytics, Search Console, MajesticSEO, Ahrefs… and you can also cross-reference with your log files! Plan for roughly 60 GB of disk space per 1 million URLs crawled. Read this post to set up Screaming Frog on a Remote Desktop Ubuntu cloud instance.
      • Sitebulb (Windows / Mac OS) -> quite feature-rich! Interesting visualization of the internal linking structure.
      • Hextrakt (Windows) -> URL segmentation is a real plus when it comes to analyzing big websites, and Hextrakt does the job!
      • Xenu (Windows) -> only for very basic checks, like finding 404s.
    • Open-source SEO crawlers (Python / Java, etc.)
      • Scrapy
      • Crowl (An Open Source crawler based on Scrapy)
      • Nutch
      • => These solutions aren't cost-effective in most cases, since they require a lot of development and maintenance compared to a SaaS solution, for instance.
      • => Nevertheless, if you want to discover how a search engine works, you will learn a lot! 🙂 See the minimal Scrapy sketch below.
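
    To give an idea of where an open-source crawl starts, here is a minimal Scrapy spider sketch that collects basic on-page SEO fields (status code, title, meta description, canonical, H1) and follows internal links. The spider name, the example.com domain and the chosen fields are illustrative placeholders, not taken from any of the tools above.

```python
import scrapy


class SeoAuditSpider(scrapy.Spider):
    """Minimal sketch: crawl one domain and record basic SEO fields per URL."""
    name = "seo_audit"                        # hypothetical spider name
    allowed_domains = ["example.com"]         # placeholder domain to audit
    start_urls = ["https://example.com/"]

    def parse(self, response):
        # One item per crawled URL, with the fields an audit usually needs.
        yield {
            "url": response.url,
            "status": response.status,
            "title": response.css("title::text").get(),
            "meta_description": response.css(
                'meta[name="description"]::attr(content)').get(),
            "canonical": response.css(
                'link[rel="canonical"]::attr(href)').get(),
            "h1": response.css("h1::text").get(),
        }
        # Follow links found on the page; the default OffsiteMiddleware keeps
        # the crawl inside allowed_domains and the dupefilter skips repeats.
        for href in response.css("a::attr(href)").getall():
            yield response.follow(href, callback=self.parse)
```

    Saved as seo_audit.py, it can be run with `scrapy runspider seo_audit.py -o audit.csv` to get a flat file you can pivot like any other crawl export. Scaling this to millions of URLs is exactly where the development and maintenance cost mentioned above kicks in (politeness settings, job persistence, storage, log analysis, etc.).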