Sitemap Creator 0.2a: Create Sitemaps 0.9 valid for Google, Yahoo, MSN, Ask.com and Moreover sitemaps

New Sitemap Creator Beta available
Sitemap Creator 0.2a differs from version 0.1: the script can now crawl (spider) your website, create your sitemaps, ping Google, Yahoo, MSN, Ask.com and moreover.com with the location of your sitemaps, and email you alerts when sitemaps are created or crawled by a search bot. The crawler saves sitemap data into an easy-to-edit CSV file.

Download (build 20070109) :
sitemap_creator.tar.gz - sitemap_creator.zip


This entry was posted on Monday, December 10th, 2007 at 5:42 am and is filed under Blog, News, Programs, Solutions.

93 Responses to “Sitemap Creator 0.2a: Create Sitemaps 0.9 valid for Google, Yahoo, MSN, Ask.com and Moreover sitemaps”

  1. Sitemap Creator 0.1 : Create Sitemaps 0.9 valid for Google, Yahoo! and MSN Sitemaps :: GadElKareem Says:

    [...] New Sitemap Creator 0.2a available [...]

  2. Jerry Says:

    Has anyone got this to work? Have spent several hours on it. No luck, can not get it to redirect, I think.

  3. wkarim Says:

    @Jerry : Can you describe the error?

  4. Mike Says:

    I've also tried with no success (although I'm not proficient with PHP). Are there more detailed instructions available?

  5. Flash Buddy Says:

    I get:
    Parse error: syntax error, unexpected '=', expecting ')' in /home/hqcodec/public_html/.function.inc.php on line 99

  6. wkarim Says:

    @Flash Buddy:
    What PHP version are you using?
    A quick fix would be to remove the '&' before the '$val' on that line.
    Let me know if it works.

  7. Chris Says:

    Had the same error as Jerry, removed the "&" in various lines with errors.
    Now the program starts but if I click on "Crawl" I get the following error: "Call to undefined function: curl_setopt_array() in /www/htdocs/v031207/.function.inc.php on line 218"
    My php-version: 4.4.8

  8. wkarim Says:

    @Chris
    You should either recompile PHP with cURL library support, or set 'SMC_USE_CURL' to false in the configuration file.

  9. wkarim Says:

    The script is tested against PHP v5.2.5.
    However, I tried to make it as compatible as possible with previous versions of PHP.
    The error
    "Parse error: syntax error, unexpected '=', expecting ')' in /.function.inc.php on line 99" is due to a default value on a parameter passed by reference, which is not supported on PHP 4 (such as 4.2.2).

    Solution :
    download Sitemap Creator 0.2 for 4.2.2
    fixed on the new build
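
    To illustrate the parse error described above: PHP 4 rejects a default value on a by-reference parameter, while PHP 5 accepts it. The sketch below is only an assumption about what a PHP 4-friendly rewrite could look like; the function name smc_trim_all is hypothetical and is not taken from .function.inc.php.

```php
<?php
// Hypothetical helper, NOT the script's real code.
// PHP 5+ only (parse error on PHP 4):
//     function smc_trim_all(&$val = null) { ... }

// PHP 4-compatible form: take the value by copy and return the result.
function smc_trim_all($val)
{
    if (is_array($val)) {
        // Recurse into nested arrays, keeping the keys.
        foreach ($val as $key => $item) {
            $val[$key] = smc_trim_all($item);
        }
        return $val;
    }
    return trim($val);
}

$clean = smc_trim_all(array(' a ', array(' b')));
```

    Returning the value instead of modifying it in place avoids the reference entirely, at the cost of changing every call site to use the return value.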

  10. nit Says:

    After I installed it, edited the config file, went to the web page and entered the password I put in the config file, I got this:

    Warning: Cannot modify header information - headers already sent by (output started at /home/osgames/public_html/sitemap/sitemap/sitemap.php:54) in /home/osgames/public_html/sitemap/sitemap/.function.inc.php on line 703

    when i click on the crawl site link it puts me back to the password screen and it just goes on and on..

    php -v
    PHP 5.2.1 (cli) (built: Feb 23 2007 08:00:24)
    Copyright (c) 1997-2007 The PHP Group
    Zend Engine v2.2.0, Copyright (c) 1998-2007 Zend Technologies
    with Zend Extension Manager v1.2.0, Copyright (c) 2003-2006, by Zend Technologies
    with Zend Optimizer v3.2.2, Copyright (c) 1998-2006, by Zend Technologies

    mysql is 4.1.22

    thanks..

  11. nit Says:

    forgot to add;

    the redirect doesnt work too. it loads for a moment then its a blank page..

    every link puts me back to the login screen

  12. wkarim Says:

    @nit
    Thank you for reporting this error; I'm not sure how I didn't get that on my server!
    A new build is available with all reported errors fixed.

  13. nit Says:

    np thx for this nice script..

  14. Paddy Says:

    i don't know what i have done but i get lots of errors:

    Warning: Division by zero in C:\xampp\htdocs\New Folder\sitemap\.config.inc.php on line 52

    Warning: Cannot modify header information - headers already sent by (output started at C:\xampp\htdocs\New Folder\sitemap\.config.inc.php:52) in C:\xampp\htdocs\New Folder\sitemap\sitemap.php on line 49

    Warning: file(C:\xampp\htdocs\New Folder\sitemap/data/sites/) [function.file]: failed to open stream: No such file or directory in C:\xampp\htdocs\New Folder\sitemap\.function.inc.php on line 570

    Warning: Invalid argument supplied for foreach() in C:\xampp\htdocs\New Folder\sitemap\.function.inc.php on line 572

    I don't know what's happening. Can anyone help me?

  15. wkarim Says:

    Please re-download and extract the files and run them without any editing; it appears that you've added stray characters to the config file.

  16. Reiner Says:

    Hi,
    How do I exclude the directories I don't want to be listed or crawled BEFORE running the script, and where is the blacklist file or directory?

  17. wkarim Says:

    @Reiner
    The current version of Sitemap Creator 0.2a has no exclude-directory feature; you can edit the generated CSV file with any text editor to do that.
    The blacklist directory path depends on your configuration: the default is sitemap/data/errors/, or you can change it in the config file by changing SMC_DATA_ERRORS.

  18. shaun Says:

    I click crawl site. and it says crawling please wait and after 4 seconds I get this error.

    Fatal error: Call to undefined function: stripos() in /home/shaun/public_html/site/sitemap/.function.inc.php on line 233

  19. shaun Says:

    Ahh, I'm running PHP 4.4.8, and stripos() is for PHP 5. I changed stripos to strpos and now it is running; however, it isn't going to be case insensitive now. Not sure what the result will be, but it is running.

  20. wkarim Says:

    @shaun
    yeah sorry for that one…you can use this instead
    strpos(strtolower($header['content_type']), 'text')
    I guess the script might need PHP5 to work, anyway let me know if it worked with this fix
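
    An alternative to editing each call site is to define stripos() only when the running PHP lacks it. This is a sketch under assumptions, not part of the script: the sample $header value is made up for illustration.

```php
<?php
// Define stripos() only if this PHP version does not provide it (it was added in PHP 5).
if (!function_exists('stripos')) {
    function stripos($haystack, $needle, $offset = 0)
    {
        // Lower-case both sides so the match is case insensitive.
        return strpos(strtolower($haystack), strtolower($needle), $offset);
    }
}

// Hypothetical header value, for illustration only.
$header = array('content_type' => 'Text/HTML; charset=UTF-8');
$is_text = stripos($header['content_type'], 'text') !== false;
```

    On PHP 5 and later the built-in function is used and the fallback is skipped.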

  21. Mafiozy Says:

    Does this thing work???

  22. wkarim Says:

    @Mafiozy
    yes it’s working

  23. Pfff Says:

    Hello,

    i need some help to install this script.
    http://www.webynux.net/sitemap.php is a blank page …

    i think i've problem to configure the .config.inc.php.

    Could you please help me ??

  24. wkarim Says:

    you need to access the sitemap through
    /sitemap/sitemap.php
    I think you have disabled error reporting in your php.ini file; you need to enable it to see what error you have.
    Also, please let me know your PHP version.

  25. Pfff Says:

    Warning: tempnam() [function.tempnam]: open_basedir restriction in effect. File(/tmp) is not within the allowed path(s): (/home/www/tmp/:/usr/local/lib/php:/usr/local/bin:/home/www/pfff) in /home/www/pfff/www/sitemap/.function.inc.php on line 87

    Warning: curl_setopt_array() [function.curl-setopt-array]: CURLOPT_FOLLOWLOCATION cannot be activated when in safe_mode or an open_basedir is set in /home/www/pfff/www/sitemap/.function.inc.php on line 218
    ERROR6: Couldn't resolve host 'wwwwebynuxnet' for URL http://wwwwebynuxnet/

    No Pages were crawled, Please make sure you have set your site domain correctly and you have valid connection to host

    Why are the "." missing between www, webynux and net?

  26. Pfff Says:

    Hello,

    I just solved the problem by uploading the config file without any changes, BUT now my website can't be crawled …

    here’s the error message:

    No Pages were crawled, Please make sure you have set your site domain correctly and you have valid connection to host

    but I don't know how to set the domain …

    Please help ;-)

  27. wkarim Says:

    @pfff
    I think you do not have enough privileges to run curl on your server; please disable CURL:
    define('SMC_USE_CURL', false);
    Then clean your cache and blacklist folders, as the index page might have been blacklisted.

  28. Ruud Says:

    Hi,
    Get the following message:

    No Pages were crawled, Please make sure you have set your site domain correctly and you have valid connection to host

    I've index.php in the root folder which redirects to http://www.bestplaces2be.com/md/index.php

    index.php

    any idee what goes wrong here?

  29. Ruud Says:

    some additional info:
    CURL :
    define('SMC_USE_CURL', false);
    Then clean your cache and blacklist folders as the index page might have been blacklisted

    I did this also.

  30. wkarim Says:

    @Ruud
    You have your domain set with 'www' at the beginning, while no link on the pages begins with 'www'. The script is able to crawl up-level sub-domains but not the reverse.
    Set your domain without 'www'.

  31. webmaster Says:

    Warning: Cannot modify header information - headers already sent by (output started at /var/www/virtual/luvshades.com/htdocs/sitemap.php:54) in /var/www/virtual/luvshades.com/htdocs/.function.inc.php on line 703

    When I try any function, it goes back to sign-in page.

  32. wkarim Says:

    @webmaster
    please download the new build, it should fix this problem

  33. Ruud Says:

    okay removed the www now still get this:
    NOTICE: Document type is text/html for URL http://bestplaces2be.com/md/index.php
    NOTICE: Document type is text/html for URL http://bestplaces2be.com/
    No Pages were crawled, Please make sure you have set your site domain correctly and you have valid connection to host

  34. wkarim Says:

    @Ruud
    please replace the files with the original ones and empty cache and error folders.
    That’s not a possible error!

  35. webmaster Says:

    Hi,

    I have downloaded the most recent script and still get the same error.

  36. wkarim Says:

    @webmaster
    can you copy and paste line 54 on sitemap.php

  37. webmaster Says:

    script language="javascript" type="text/javascript

    I had to remove the brackets to post.

  38. wkarim Says:

    @webmaster
    That’s not the same line as in the last build; please re-download.

  39. Ruud Says:

    please replace the files with the original ones
    What do you mean with this?
    I changed everywhere the Http://www.bestplaces2be.com into http://bestpalces2be.com

    cache and errors empty, result the same

  40. webmaster Says:

    Hi,

    Now I get:

    Warning: tempnam() [function.tempnam]: open_basedir restriction in effect. File(/tmp) is not within the allowed path(s): (/var/www/virtual/anysite.com/:/usr/share/php/:/tmp/) in /var/www/virtual/anysite.com/htdocs/sitemap/.function.inc.php on line 87

    Warning: file_exists() [function.file-exists]: open_basedir restriction in effect. File(/smc_cookies) is not within the allowed path(s): (/var/www/virtual/anysite.com/:/usr/share/php/:/tmp/) in /var/www/virtual/anysite.com/htdocs/sitemap/.function.inc.php on line 143

    Also, I have two questions:

    1. I see a folder with sitemaps. Is there supposed to be a separate sitemaps.php for the root and then a separate folder called sitemap?

    2. The script keeps stating that my "data" folder may not exist or might not be writable. However, it does exist and is writable.

  41. wkarim Says:

    @Ruud
    Remove all edited php files and replace them with the ones from the recent build.
    Do not add 'http://' at the beginning of your domain

  42. wkarim Says:

    @webmaster
    For the warnings, you either need to fix permissions or disable the use of curl in the config file.

    1. Yes, there are two sitemap.php files, and you are supposed to use the one inside the sitemap folder
    to extract directly on your server do
    tar xzf sitemap_creator.tar.gz

    2. same as previous

  43. Ruud Says:

    I use your last build
    Sitemap Creator 0.2 alpha build 20080109

    and use this in config.inc.php
    define('SMC_SITE', $_SERVER['HTTP_HOST']);

    if ( isset($header['content_type']) && strpos(strtolower($header['content_type']), 'text') === false ){
    _error("Document type is {$header['content_type']} for URL {$url}");

    after this it stops I think

  44. wkarim Says:

    @Ruud
    If you're using $_SERVER['HTTP_HOST'], then you need to access the script from http://example.com, not http://www.example.com

  45. Ruud Says:

    this way I access the script:

    http://bestplaces2be.com/sitemap/sitemap.php

  46. wkarim Says:

    @Ruud
    Yes, that should work.

  47. neeraj Says:

    hello
    I have installed it at mmmec.com, but I have a forum under mmmec.com/forum. How can I make it map my forum as well, and where should I set the link depth for it?

  48. wkarim Says:

    @neeraj
    As long as there’s a link to the forums, it should crawl them with no problem.

  49. Dhyar IRdiansyah Says:

    I get the error "No Pages were crawled, Please make sure you have set your site domain correctly and you have valid connection to host". What must I do?

  50. wkarim Says:

    @Dhyar IRdiansyah
    make sure the links on the page have the same domain as your localhost or make changes on the config file

  51. Andrew Says:

    Guys and Girls..I downloaded the most recent version on hotscripts.com and edited config.php.inc. All I changed was the prefix for my email and the password (did not touch the HTTP_HOST). I uploaded the files to my site, chmod 777 the data folder, and accessed the site via this link: http://domain.com/sitemap/sitemap.php (NO WWW)

    It went through and indexed everything fine. Works like a charm, thanks webmaster!

  52. wkarim Says:

    @Andrew
    Thank you for your comment.

  53. Darren Says:

    Is there a way to limit the depth in a single directory. I have a calendar and once it goes into that area it fills it up fast.

  54. wkarim Says:

    @Darren
    I am afraid this is not available; I will try to include it as an option in the config file in upcoming releases.
    A workaround is to edit the CSV file and delete the links you do not want to include in the sitemap. Or, if you do not want to crawl that directory at all, you may blacklist it.

  55. Tammy Says:

    I keep getting this error when crawling. I've tried raising the timeout to no avail

    WARNING: Connection failed (0)
    WARNING: Connection failed (0)
    NOTICE: Document type is application/xml for URL http://mysite.com/clients/announcements.xml
    WARNING: Connection failed (0)
    NOTICE: Document type is application/zip for URL http://mysite.com/themes/dreamland.zip
    NOTICE: Document type is application/zip for URL http://mysite.com/themes/Green-Glow_000.zip
    NOTICE: Document type is application/zip for URL http://mysite.com/themes/mountain-dawn.zip
    NOTICE: Document type is application/zip for URL http://mysite.com/themes/yalla.zip

  56. wkarim Says:

    @Tammy
    This might be related to a concurrent connection or bandwidth limit per session on your server.
    To solve this problem, try commenting out the line:
    $get .= "Connection: close\r\n\r\n";
    in the function.inc.php file.

  57. Tammy Says:

    Thank you wkarim, I did that, but now when it crawls it goes through the whole 350 seconds and then tells me:
    No Pages were crawled, Please make sure you have set your site domain correctly and you have valid connection to host

  58. wkarim Says:

    @Tammy
    Please change SMC_USE_WWW to true and add www to your domain, as all links displayed there include www.

  59. Sitemap Creator - Create sitemaps for Google, Yahoo, MSN and Ask.comphp and javascript Says:

    [...] URL: http://gadelkareem.com/2007/12/10/sitemap-creator-02a-create-sitemaps-09-valid-for-google-yahoo-and-... [...]

  60. shawn Says:

    This is just what I was looking for, a week and 50 scripts later. Yep, works like a charm. My site is huge, so I upped the memory to 600 and the timeout to 1800, and it runs like a Swiss watch.

    I CAN NOT SAY THANK YOU ENOUGH!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

  61. wkarim Says:

    @shawn
    glad it helped, thanks for nice words

  62. shawn Says:

    wkarim,

    can you tell me if the
    /*internal settings*/
    define('SMC_VERSION', '0.2a');
    define('SMC_DATA_CACHE', SMC_DATA.'cache/');
    define('SMC_DATA_SITES', SMC_DATA.'sites/');
    define('SMC_DATA_ERRORS', SMC_DATA.'errors/');
    define('SMC_DATA_SITEMAPS', SMC_DATA_SITES.SMC_SITE.'_sitemaps/');
    $pings = array (
    'Google' => 'http://www.google.com/webmasters/sitemaps/ping?sitemap=',
    'Yahoo' => 'http://search.yahooapis.com/SiteExplorerService/V1/updateNotification?appid=SitemapWriter&url=',
    'Live Search' => 'http://webmaster.live.com/ping.aspx?siteMap=',
    'Ask.com' => 'http://submissions.ask.com/ping?sitemap=',
    'MoreOver' => 'http://api.moreover.com/ping?u=',
    );
    Does any of this have to be edited, and will this script automatically ping the search engines? Google asks for sitemap.xml in your root directory, and the script does not save one there that I can see. Please explain how this works! The script is awesome.
    Also, in the admin section it would be nice if the admin could edit or add to the robots.txt file.
    I have set up a cron task to run every hour and it seems to work, but does the cron task save a new sitemap or just update the existing one? I have an auction site, so it will be changing all the time.

  63. shawn Says:

    Sitemap : http://floafieds.com/sitemap.php?do=showsitemap&sm=sitemap.xml.gz
    when google runs this they get a server internal error is there any fix?

  64. wkarim Says:

    @shawn
    - You do not need to edit the internal settings.
    - The script provides another page, 'sitemap.php', which should be placed in the root directory, where it redirects the bot to the sitemap location.
    - There is an option in the admin area to add the sitemap URL to robots.txt.
    - The script creates a new sitemap when the crawl time is a day apart from the last created sitemap; you can create a cron job to delete old sitemaps.
    - The sitemap at the posted URL does not exist; please regenerate it.
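
    For the cron cleanup of old sitemaps mentioned above, a minimal sketch follows. It is not part of Sitemap Creator: the function name smc_prune_sitemaps, the 7-day retention and the assumption that sitemaps are stored as *.xml.gz files in a single directory are all illustrative choices.

```php
<?php
// Hypothetical cleanup helper (not shipped with the script).
// Deletes *.xml.gz files in $dir whose modification time is older than
// $maxAgeDays and returns the list of paths that were removed.
function smc_prune_sitemaps($dir, $maxAgeDays)
{
    $removed = array();
    $cutoff  = time() - $maxAgeDays * 86400;

    $files = glob(rtrim($dir, '/') . '/*.xml.gz');
    if ($files === false) {
        return $removed; // glob() failed; nothing to do
    }

    foreach ($files as $file) {
        if (filemtime($file) < $cutoff && unlink($file)) {
            $removed[] = $file;
        }
    }
    return $removed;
}
```

    A cron job could call it with the directory your configuration uses for SMC_DATA_SITEMAPS, for example smc_prune_sitemaps($sitemapDir, 7), where $sitemapDir is whatever path applies on your server.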

  65. shawn Says:

    So do you need to add anything to these?
    'Google' => 'http://www.google.com/webmasters/sitemaps/ping?sitemap=',
    'Yahoo' => 'http://search.yahooapis.com/SiteExplorerService/V1/updateNotification?appid=SitemapWriter&url=',
    'Live Search' => 'http://webmaster.live.com/ping.aspx?siteMap=',
    'Ask.com' => 'http://submissions.ask.com/ping?sitemap=',
    'MoreOver' => 'http://api.moreover.com/ping?u=',

  66. shawn Says:

    Sorry, I do have this sitemap.php file and it is in the root directory, and there is also a reference to it in robots.txt!

    I have 3 domains and want them all crawled separately. They seem to be working, but what do I need to tell Google my sitemap is? sitemap.php?

    If we get this working properly, I will help you promote it by placing a link on my sites!

  67. wkarim Says:

    @shawn
    - No, you mostly do not need to edit this.
    - To crawl more than one domain, keep SMC_SITE as $_SERVER['HTTP_HOST'] and run the script from each of these domains; every domain will produce its own sitemap, which you can submit to Google webmaster’s CP.
    - You are welcome to show a link back to Sitemap Creator.

  68. sl Says:

    stripos() and curl_setopt_array() are not PHP 4.x compliant, so I fixed the following lines in the file .function.inc.php.

    line 217:
    $ch = curl_init($url);

    the curl_setopt_array() call that follows it (line 218) changed to:
    foreach ($options as $myoption => $myvalue) {
        curl_setopt($ch, $myoption, $myvalue);
    }

    line 233:
    if ( isset($header['content_type']) && stripos($header['content_type'], 'text') === false ){

    changed to:
    if ( isset($header['content_type']) && strpos(strtolower($header['content_type']), 'text') === false ){
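
    The curl_setopt_array() workaround can also be kept in one place so the same file runs on both old and new PHP versions. A minimal sketch, assuming a wrapper named smc_curl_setopt_array (a hypothetical name; the script itself calls curl_setopt_array() directly) and an example URL and options chosen only for illustration:

```php
<?php
// Hypothetical wrapper, not part of the original script.
function smc_curl_setopt_array($ch, $options)
{
    // Use the native function where it exists (PHP 5.1.3 and later).
    if (function_exists('curl_setopt_array')) {
        return curl_setopt_array($ch, $options);
    }
    // Fallback for older PHP: apply the options one at a time.
    foreach ($options as $option => $value) {
        if (!curl_setopt($ch, $option, $value)) {
            return false;
        }
    }
    return true;
}

// Only touch cURL when the extension is actually loaded.
if (extension_loaded('curl')) {
    $ch = curl_init('http://example.com/');
    $ok = smc_curl_setopt_array($ch, array(
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_TIMEOUT        => 10,
    ));
}
```

    Like the native function, the fallback stops at the first option that cannot be set and returns false.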

  69. Gsm files center Says:

    Thanks for the sitemap script, it’s working great
    nice work ..

  70. cyberswept Says:

    sl I changed the .function.inc.php as you described and it now crawls my site, but when I go to create the sitemaps it says, please crawl domain.com first. Any suggestions?

  71. gtnman Says:

    I assume that this script does not work w/ PHP safemode?

  72. gtnman Says:

    I turned off safe mode, however when I crawl the site it crawls perfect then I click create sitemap, and it tell me I must crawl the site first.

    I have curl disabled and www set to true.

  73. wkarim Says:

    @bispak
    Make sure the script finishes crawling before you create the sitemap

  74. bispak Says:

    I'm sure crawling has finished, with the result:
    Finished crawling http://www.tukarinfobispak.com, Crawled 72 links
    Took 26.69 Seconds, using 2MB of memory

    and then at the end mention about the url to add to crontab or schedule tasks.

    Still, when I click "Create Sitemaps" I get the answer: "Please crawl tukarinfobispak.com first".

    I use PHP Version 5.2.5

  75. bispak Says:

    When I click on "Display CSV File" the result is the same as when I click on "Create Sitemaps":
    "Please crawl tukarinfobispak.com first"

  76. bispak Says:

    same result happened with or without “www".
    success in crawl but csv file and sitemap not created.
    what’s wrong with me? am i not lucky?
    :(

  77. wkarim Says:

    @bispak
    please use the new version sitemap creator 0.2b

  78. bispak Says:

    thanks, i'll download and try sitemap creator 0.2b

  79. bispak Says:

    when i download, install and run sitemap creator 0.2b
    Voila.. it’s works well.. yippeeee….

    sitemap20080711.xml.gz Created successfully

    Where i can see sitemap20080711.xml.gz ?

    When I go to submit the sitemap to Google as SMC advises, I don't know what to submit, since I can't find the XML file under the root or the sitemap folder.

    Please more advise and sorry to make you busy with my questions. Hope you dont mind.

  80. Frank Says:

    Hi,
    first, thank you for this great script. It works perfect with a browser. But it doesn't work from console. The only output is the HTML Markup from sitemap.php. But nothing is crawled and no sitemap is created. Do you have a workaround how to crawl from console/cron?
    Thank You

  81. wkarim Says:

    @Frank
    Good point! It has not been tested from console, however I will consider adding console interface on the next versions

  82. evgenij2007 Says:

    Sitemap Creator the best

  83. Prakash Subba Says:

    It works well on my site but how to added sitemap at google webmaster tool? which file should be added?

  84. wkarim Says:

    @prakash
    It should be at http://example.com/sitemap.php?do=showsitemap&sm=sitemap.xml.gz
    You might use rewrite rules to create a simpler URL.
    Make sure you use the new version at http://gadelkareem.com/2008/05/15/sitemap-creator-02-beta/

  85. DoubleClic Says:

    I get an error with URLs that contain a quote.
    Example :
    http://www.site.com/she’s.htm

    The script returns only the first part :
    http://www.site.com/she

    Thanks for your script.

  86. wkarim Says:

    @DoubleClic
    you might need to urlencode() your URL
    http://www.example.com/she’s.htm should be http://www.example.com/she%27s.htm
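
    A minimal sketch of that encoding step, assuming a hypothetical helper named smc_encode_path (not part of the script). It uses rawurlencode() on each path segment so the "/" separators are left alone:

```php
<?php
// Hypothetical helper: percent-encode every segment of a URL path.
function smc_encode_path($path)
{
    // Split on "/", encode each piece, then join the pieces back together.
    return implode('/', array_map('rawurlencode', explode('/', $path)));
}

$encoded = smc_encode_path("/she's.htm");
```

    Encoding per segment rather than the whole string keeps the slashes intact, which a plain urlencode() call on the full path would not.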

  87. Stupid Says:

    Great script.It works perfect

  88. Sitemap Creator- create sitemap for google,yahoo,msn and ask.com | SAYFARZ Says:

    [...] Homepage Download Share [...]

  89. Shopping cart Says:

    Great, I needed this for my site

  90. Antony Says:

    Thank you for the script. It worked OK, I think. But now I have two questions:
    1- Does the program automatically ping Google, Yahoo and the others?
    2- Where is the XML file created, so we can add it in Google Webmaster Tools?

    Thank you

  91. Michael Says:

    The sitemap looks good on the screen using the URL below, but Google Webmaster Tools shows errors and says wrong file type.
    http://www.reddotdeals.com/sitemap.php?do=showsitemap&sm=20090328.xml.gz

    I actually copied the code this generates and saved it as sitemap.xml, and this works perfectly.

    I would suggest the script should create 'sitemap.xml' either on demand or using a cron job.

    respect for the work on this script

    Michael

  92. Depoastur Says:

    Thanks for it.
    I have download it

  93. coolboy Says:

    Can you tell me where to download the newest version?

 
