Sitemap Creator 0.2 beta
Sitemap Creator crawls (spiders) your website and creates XML sitemaps compatible with the standard sitemaps.org protocol supported by Google, Yahoo!, MSN and MoreOver. The script pings Google, Yahoo!, MSN and MoreOver so that their bots download the sitemap file, then tracks the bots, sends you an email every time one of them scans your sitemap, and gives you a full report of the search engines' responses.
Sitemaps are created from a CSV file, which can easily be edited in any text editor before the sitemap is created. Sitemap Creator has three built-in ranking mechanisms that decide the priority of your pages based on the number and placement of backlinks, on which links are crawled first, or on the URL structure. You can also limit the crawler by memory, run time or number of URLs.
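To illustrate the CSV-to-sitemap step, here is a minimal sketch in Python (not the script's own PHP code). The two-column layout `url,priority`, the function name `csv_to_sitemap` and the example.com URLs are assumptions made for this example; the CSV format Sitemap Creator actually uses may differ.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def csv_to_sitemap(csv_text):
    """Build a <urlset> document from CSV rows of url,priority (assumed layout)."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for row in csv.reader(io.StringIO(csv_text)):
        if not row or not row[0].strip():
            continue  # skip blank lines
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = row[0].strip()
        if len(row) > 1 and row[1].strip():
            # The protocol restricts priority to the range 0.0 to 1.0.
            priority = min(1.0, max(0.0, float(row[1])))
            ET.SubElement(url, "priority").text = f"{priority:.1f}"
    # ElementTree escapes characters such as & in the URLs for us.
    return ET.tostring(urlset, encoding="unicode")

sample = (
    "http://www.example.com/,1.0\n"
    "http://www.example.com/about?a=1&b=2,0.5\n"
)
sitemap_xml = csv_to_sitemap(sample)
print(sitemap_xml)
```

Because the CSV is plain text, a priority can be changed or a row deleted by hand before the XML is generated.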
- beta info
- cURL is not needed anymore, all requests are processed through fsockopen.
- Bug fix to the link-back ranking functions.
- Limit crawler to a number of URLs
- Disable crawling specific directories or links. Regular expressions are supported
- Limit the number of links shown on the start page
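The exclusion and URL-limit options in the list above can be pictured with a short Python sketch. The patterns, the limit of 3 and the names `EXCLUDE_PATTERNS`, `MAX_URLS` and `filter_paths` are hypothetical; they are not Sitemap Creator's configuration keys.

```python
import re

# Hypothetical exclusion rules: a directory, a file type and a query parameter.
EXCLUDE_PATTERNS = [r"^/admin/", r"\.pdf$", r"[?&]sessionid="]
MAX_URLS = 3  # hypothetical cap on the number of URLs kept

def filter_paths(paths, patterns=EXCLUDE_PATTERNS, max_urls=MAX_URLS):
    """Drop paths matching any pattern and stop once max_urls are kept."""
    compiled = [re.compile(p) for p in patterns]
    kept = []
    for path in paths:
        if any(rx.search(path) for rx in compiled):
            continue  # excluded directory or link
        kept.append(path)
        if len(kept) >= max_urls:
            break  # URL limit reached
    return kept

candidates = [
    "/", "/admin/login", "/docs/manual.pdf", "/blog/",
    "/blog/post?sessionid=abc", "/about", "/contact",
]
kept = filter_paths(candidates)
print(kept)
```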
The script was tested on PHP 5; let me know how it works on PHP 4.
An online demo might be available in the next release.
Download (build 20080514) :
sitemap_creator.tar.gz - sitemap_creator.zip


May 16th, 2008 at 6:11 pm
[...] You need a sitemap! Lucky for you GadElKareem created a little script written in PHP called Sitemap Creator. So what does it do? Upon logging into the sitemap admin section and clicking “Crawl [...]
May 17th, 2008 at 12:57 am
What URL are you supposed to submit to Google for the sitemap?
May 17th, 2008 at 1:09 am
@gtnman
Please use "Add reference to robots.txt" to see the default sitemap URL; remember to chmod 666 robots.txt.
It should look something like:
http://www.greatertalent.com/sitemap.php?do=showsitemap&sm=sitemap.xml.gz
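For reference, the sitemaps.org protocol lets robots.txt point crawlers at a sitemap with a `Sitemap:` line. The Python sketch below shows roughly what adding such a reference amounts to; it is not the script's own code, and the function name and the example.com URL are assumptions.

```python
def add_sitemap_reference(robots_text, sitemap_url):
    """Return robots.txt content with a Sitemap: line added if it is missing."""
    line = f"Sitemap: {sitemap_url}"
    existing = [entry.strip() for entry in robots_text.splitlines()]
    if line in existing:
        return robots_text  # already referenced, nothing to do
    if robots_text and not robots_text.endswith("\n"):
        robots_text += "\n"
    return robots_text + line + "\n"

robots = "User-agent: *\nDisallow: /admin/"
url = "http://www.example.com/sitemap.php?do=showsitemap&sm=sitemap.xml.gz"
updated = add_sitemap_reference(robots, url)
print(updated)
```

Writing the result back to disk is what requires robots.txt to be writable by the web server.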
May 19th, 2008 at 6:09 pm
Interesting note: I have tested this script on two VPS servers, both running the latest version of Plesk, on two different web hosts. It seems that the Unix call utime (which touch() uses) is not available in Plesk. Is touch() vital to the script, or will is_writable() do the same thing? (I made this change, and robots.txt was generated perfectly.)
Does the cronjob regenerate robots.txt, or is this something the user must do?
May 19th, 2008 at 6:13 pm
Another note: I tested this on my Ubuntu box, where utime does exist, and it worked fine.
May 21st, 2008 at 1:44 am
@gtnman
- touch() is not related to Plesk; you need permission to modify files in that directory. I cannot find any relation between the touch() function and utime. Make sure you 'chmod 777' the data directory.
- robots.txt is not regenerated every time the sitemap is created; you only need to modify it once.
May 27th, 2008 at 4:21 pm
Hi,
please help me.
I see this error:
No Pages were crawled, Please make sure you have set your site domain correctly and you have valid connection to host
May 27th, 2008 at 4:22 pm
Where is the right setting for the site?
May 28th, 2008 at 11:14 pm
@abyzn
Can you enable the debug mode from the configuration file and give me the results?
June 13th, 2008 at 2:54 pm
I use your Sitemap Creator for a site and everything goes OK: it shows a table with the URLs and no error messages. But when I click "Create Sitemaps", the XML is malformed and incorrect. Any suggestions?
The site is in ISO-8859-1 and the sitemap is in UTF-8…
June 14th, 2008 at 5:13 am
@Ferran
Can you give me a link to your sitemap XML file?
June 17th, 2008 at 9:43 am
http://cordigual.com/__sitemap/sitemap.php?do=showsitemap&sm=sitemap.xml.gz
June 18th, 2008 at 9:19 am
@Ferran
Can you change the constant SMC_GSS to false to disable GSS and try again?
June 18th, 2008 at 11:06 am
@wkarim
I set SMC_GSS to false and the result is the same…
When I look at the CSV or the table in sitemap.php, the links are shown… Could it be the encoding of the files? I tried it on my UTF-8 sites and it works perfectly, but when I try it on this site, which is in ISO-8859-1, the XML is corrupt… I don't understand it; the functions are correct and the config too…
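One possible cause of the symptom described above, offered as an illustration and not as a confirmed diagnosis of this site: ISO-8859-1 bytes copied unchanged into a file declared as UTF-8 produce invalid XML. The Python sketch below (not the script's PHP code) shows the failure and one way to convert; the function name `to_utf8_loc` and the example URL are assumptions.

```python
from xml.sax.saxutils import escape

def to_utf8_loc(raw_bytes, source_encoding="iso-8859-1"):
    """Decode bytes from the site's encoding, XML-escape them, emit UTF-8."""
    text = raw_bytes.decode(source_encoding)
    return f"<loc>{escape(text)}</loc>".encode("utf-8")

# "\u00e0" is the letter a with a grave accent; in ISO-8859-1 it is the single byte 0xE0.
raw = "http://www.example.com/catal\u00e0?a=1&b=2".encode("iso-8859-1")

try:
    raw.decode("utf-8")  # what a UTF-8 XML parser effectively attempts
    raw_is_valid_utf8 = True
except UnicodeDecodeError:
    raw_is_valid_utf8 = False

loc = to_utf8_loc(raw)
print(raw_is_valid_utf8, loc)
```

The sitemaps.org protocol also expects URLs to be percent-encoded, which this sketch leaves out.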