Sitemap Creator 0.2 beta

Sitemap Creator crawls (spiders) your website and creates XML sitemaps compatible with the standard sitemaps.org protocol supported by Google, Yahoo!, MSN and MoreOver. The script pings Google, Yahoo!, MSN and MoreOver so that their bots download the sitemap file, then tracks each bot, emails you every time your sitemap is scanned, and gives you a full report of the search engine's response.
Sitemaps are created from a CSV file, which can easily be edited in any text editor before the sitemap is created. Sitemap Creator has three built-in ranking mechanisms that decide the priorities of your pages based on the number and placement of link backs, on which links were crawled first, or on URL structure. You can also limit the crawler by memory, run time or number of URLs.
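
To illustrate the CSV-to-sitemap step described above, here is a minimal Python sketch. It is not the script's own code (Sitemap Creator is written in PHP), and the two-column `url,priority` CSV layout and the function name `csv_to_sitemap` are assumptions made for the example; the real CSV format may differ. Only the `urlset`, `url`, `loc` and `priority` elements and the namespace come from the sitemaps.org protocol.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def csv_to_sitemap(csv_text):
    """Build a sitemaps.org <urlset> document from CSV rows.

    Assumed (hypothetical) row layout: url,priority
    The priority column is optional.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for row in csv.reader(io.StringIO(csv_text)):
        if not row or not row[0].strip():
            continue  # skip blank lines
        url = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as & in the URL for us.
        ET.SubElement(url, "loc").text = row[0].strip()
        if len(row) > 1 and row[1].strip():
            ET.SubElement(url, "priority").text = f"{float(row[1]):.1f}"
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(csv_to_sitemap("http://example.com/,1.0\nhttp://example.com/about,0.5\n"))
```

Because the CSV is the only input, editing a row by hand (for example lowering a priority or deleting a URL) changes the generated sitemap without another crawl.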

    beta info

  • cURL is no longer needed; all requests are processed through fsockopen.
  • Big fix to the link back ranking functions.
  • Limit the crawler to a number of URLs.
  • Disable crawling of specific directories or links. Regular expressions are supported.
  • Limit the number of links shown on the start page.
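
The URL limit and the regular-expression exclusions from the list above can be pictured with a short Python sketch. This is an assumption about how such filtering might work, not the script's actual PHP implementation; the function name `select_urls` and its parameters are hypothetical.

```python
import re


def select_urls(candidates, exclude_patterns, max_urls):
    """Return up to max_urls candidates that match none of the exclude patterns."""
    compiled = [re.compile(p) for p in exclude_patterns]
    selected = []
    for url in candidates:
        if len(selected) >= max_urls:
            break  # URL limit reached, stop considering further links
        if any(rx.search(url) for rx in compiled):
            continue  # excluded directory or link
        selected.append(url)
    return selected


if __name__ == "__main__":
    found = [
        "http://example.com/",
        "http://example.com/admin/login",
        "http://example.com/a",
        "http://example.com/b.pdf",
        "http://example.com/c",
    ]
    # Skip everything under /admin/ and all PDF files, keep at most two URLs.
    print(select_urls(found, [r"/admin/", r"\.pdf$"], 2))
```

Excluded links do not count toward the limit in this sketch; whether the real crawler counts them is not stated in the release notes.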

The script was tested on PHP 5; let me know how it works on PHP 4.
An online demo might be available in the next release.
Download (build 20080514):
sitemap_creator.tar.gz - sitemap_creator.zip


This entry was posted on Thursday, May 15th, 2008 at 4:24 pm and is filed under News, Programs.

15 Responses to “Sitemap Creator 0.2 beta”

  1. Sitemap Creator 0.2 beta Released « jared.brodsky Says:

    [...] You need a sitemap!  Lucky for you GadElKareem created a little script written in PHP called Sitemap Creator.  So what does it do?  Upon logging into the sitemap admin section and clicking “Crawl [...]

  2. gtnman Says:

    What is the URL you are supposed to submit to Google for the sitemap?

  3. wkarim Says:

    @gtnman
    Please use “Add reference to robots.txt” to show the default sitemap URL; remember to chmod 666 robots.txt.
    It should look something like:
    http://www.greatertalent.com/sitemap.php?do=showsitemap&sm=sitemap.xml.gz

  4. gtnman Says:

    Interesting note: I have tested this script on two VPS servers, both running the latest version of Plesk, on two different web hosts. It seems that the Unix call utime (which touch() uses) is not available under Plesk. Is touch() vital to the script, or will is_writable() do the same thing? (I made this change, and robots.txt was generated perfectly.)
    Does the cron job regenerate robots.txt, or is this something the user must do?

  5. gtnman Says:

    Another note: I tested this on my Ubuntu box, where utime does exist, and it worked fine.

  6. wkarim Says:

    @gtnman
    - touch() is not related to Plesk; you need permission to modify files in that directory. I cannot find any relation between the touch() function and utime. Make sure you 'chmod 777' the data directory.
    - robots.txt is not regenerated every time the sitemap is created; you only need to modify it once.

  7. abyzn Says:

    hi
    help me
    i see this error:

    No Pages were crawled, Please make sure you have set your site domain correctly and you have valid connection to host

  8. abyzn Says:

    Where is the right setting for the site?

  9. wkarim Says:

    @abyzn
    Can you enable the debug mode from the configuration file and give me the results?

  10. Ferran Says:

    I use your Sitemap Creator for a site and everything goes OK: it shows a table with the URLs and no error messages. But when I click “Create Sitemaps” the XML is malformed and incorrect. Any suggestions?
    The site is in ISO-8859-1 and the sitemap is in UTF-8…

  11. wkarim Says:

    @Ferran
    Can you give me a link to your sitemap XML file?

  12. Ferran Says:

    http://cordigual.com/__sitemap/sitemap.php?do=showsitemap&sm=sitemap.xml.gz

  13. links for 2008-06-18 « Free Open Source Directory Says:

    [...] Sitemap Creator 0.2 beta :: GadElKareem (tags: Sitemap Creator 0.2 beta :: GadElKareem) [...]

  14. wkarim Says:

    @Ferran
    Can you change the constant SMC_GSS to false to disable GSS and try again?

  15. Ferran Says:

    @wkarim
    I set SMC_GSS to false and the result is the same…
    When I look at the CSV or the table in sitemap.php, the links are shown… Could it be the encoding of the files? I tried it on my sites that are in UTF-8 and it works perfectly, but when I try it on this site, which is in ISO-8859-1, the XML is corrupt… I don't understand it; the functions are correct and the config too…

 
