Class SitemapCreator

Description

SitemapCreator creates XML sitemap files that comply with the standard sitemaps.org protocol, which is supported by Google and Bing.

Located in /SitemapCreator.class.php (line 19)


	
			
Class Constant Summary
Variable Summary
Method Summary
static array openURL (string $url, [int $max_redirects = 5], [int $timeout = 15])
static bool validSitemapName ( $filename)
SitemapCreator __construct ([string $site = ''])
void addEngine (string $url)
void addEntry (array $entry)
string addSitemapEXT ( $filename)
string addToRobots ([string $robots_file = ''])
void addXMLURLSet (array $entry)
void calcFrequency (int $key,  &$entry)
void calcPriority (int $key,  &$entry,  $total_entries)
void Crawl ()
void CreateSitemaps ([array $entries = array()])
string csvFile ()
string getDataDir ()
array getEntries ()
bool|string getSitemapDirName ()
bool|string getSitemapPath (string $filename)
string getSitemapsDir ()
bool|string getSitemapURL (string $filename)
bool isDataDirWritable ( $dir)
array ping ([string|int $filename = 'index'])
void putContent (string $file, $XML $XML)
bool readFromCSV ([string $file = ''])
void readSitemap (string|int $filename)
bool removeSitemaps ()
void setDataDir (string $dir)
void setEntries (array $entries)
bool setEntriesPerSitemap (int $number)
bool setFrequency (int $mode)
bool setMinFrequency (string $mode)
bool setMinPriority (float $mode)
bool setPriority (int $mode)
string|bool setSite (string $site)
void setSitemapURL (string $url)
void useGzip (bool $mode)
void writeIndex ()
void writeSitemap ()
bool writeToCSV ([string $file = ''])
Variables
static mixed $useragent = "Sitemaps Creator 1.0 (compatible; sitemapcreatorbot/1.0; +http://sitemapcreator.org/)" (line 22)
mixed $classpath (line 214)
  • access: protected
mixed $class_version = "1.0" (line 21)
  • access: public
mixed $Crawler (line 208)
  • access: public
mixed $crawler_reports (line 221)
  • access: public
string $data_dir = '' (line 161)

Data directory Path.

Valid system path for the directory where the sitemaps directories will be created. The directory should be writable.

array $engines = array
(
'Google' => 'http://www.google.com/webmasters/sitemaps/ping?sitemap=',
'Live Search' => 'http://www.bing.com/webmaster/ping.aspx?siteMap='
)
(line 198)

Ping URLs of the search engines' sitemap APIs

array $entries = array() (line 123)

Array containing the entries of the sitemap.

  • access: protected
int $entries_per_sitemap = 50000 (line 131)

Maximum number of entries per sitemap file

int $frequency_mode = 1 (line 79)

Frequency mode

array $frequency_types = array
(
'always' => 3600, //1 hour
'hourly' => 86400, //1 day
'daily' => 604800, //1 week
'weekly' => 2678400, //1 month
'monthly' => 31536000, //1 year
'yearly' => 63072000, //2 years
'never' => 94608000 //3 years
)
(line 91)

Array containing frequency types as keys and the maximum time in seconds as values.

  • access: protected
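
As an illustration of how a threshold table like this can be read, the sketch below picks the first frequency type whose maximum time covers a page's age. It is not taken from calcFrequency(); the helper name frequencyForAge() and the idea of passing it the seconds since the last modification are assumptions made for this example.

```php
<?php
// Thresholds copied from $frequency_types above (max time in seconds).
$frequencyTypes = [
    'always'  => 3600,
    'hourly'  => 86400,
    'daily'   => 604800,
    'weekly'  => 2678400,
    'monthly' => 31536000,
    'yearly'  => 63072000,
    'never'   => 94608000,
];

/**
 * Hypothetical helper: return the first frequency type whose maximum
 * time is not exceeded by the given age. Falls back to 'never'.
 */
function frequencyForAge(int $ageSeconds, array $types): string
{
    foreach ($types as $name => $maxAge) {
        if ($ageSeconds <= $maxAge) {
            return $name;
        }
    }
    return 'never';
}

// A page modified two hours ago exceeds the 'always' limit (1 hour)
// but fits within the 'hourly' limit (1 day).
echo frequencyForAge(7200, $frequencyTypes), "\n"; // prints "hourly"
```

The lookup relies on the array being ordered from the smallest to the largest threshold, as it is in the listing above.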
string $min_frequency = 'never' (line 85)

Minimum Frequency

int $min_priority = 0 (line 53)

Minimum Priority

int $now (line 106)

Current time().

  • access: protected
int $priority_mode = 1 (line 47)

Priority mode

string $site (line 116)

The URL of the website. It should be fully qualified and normalized.

Set on class creation via __construct() or later via setSite()

Default: 'http://' . $_SERVER['HTTP_HOST'] . '/'

  • access: protected
int $sitemaps_count = 0 (line 138)

Number of sitemap files created

  • access: protected
string $sitemaps_dir (line 169)

Sitemaps directory path, created automatically in prepareSitemapsDir()

string $sitemaps_url (line 182)

Sitemap URL where the sitemap file name will be appended to the end of the URL.

If not set then the link will be generated automatically.

bool $use_gzip = false (line 190)

Whether to save sitemaps in gzip format

mixed $xml_foot (line 147)
  • access: protected
mixed $xml_head (line 146)
  • access: protected
string $xml_url_set = '' (line 145)

XML string containing sitemap <urlset></urlset> elements

  • access: protected
Methods
static method openURL (line 940)

Open a URL and get the response body or an error

  • return: Array containing the error as $results['error'], or the response body as $results['body']
  • see: Ping()
  • section: 4 Sitemap
static array openURL (string $url, [int $max_redirects = 5], [int $timeout = 15])
  • string $url: URL to retrieve
  • int $max_redirects: max allowed redirects (optional)
  • int $timeout: process timeout (optional)
static method validSitemapName (line 570)

Validates sitemap filename

static bool validSitemapName ( $filename)
  • $filename
Constructor __construct (line 230)

Initiates a new Sitemap.

  • section: 1 Settings
  • access: public
SitemapCreator __construct ([string $site = ''])
  • string $site: The URL may contain the protocol (http://www.foo.com or https://www.foo.com), the port (http://www.foo.com:4500/index.php) and/or basic authentication data (http://loginname:password@www.foo.com)
addEngine (line 469)

Add a ping URL to the array $engines

  • section: 1 Settings
  • access: public
void addEngine (string $url)
  • string $url: ping URL of the search engine
addEntry (line 456)

Add a URL entry manually to $entries

* This method throws an exception if $entry is not an array or does not have a 'URL' key

  • section: 1 Settings
  • access: public
void addEntry (array $entry)
  • array $entry: URL set to be added to sitemap
addSitemapEXT (line 710)

Adds the gzip extension to the sitemap filename if $use_gzip is enabled

  • return: filename
  • see: useGzip();
  • section: 4 Sitemap
  • access: protected
string addSitemapEXT ( $filename)
  • $filename
addToRobots (line 1037)

Add index sitemap URL to robots.txt file

* This method throws an exception if the robots file doesn't exist or is not writable

  • return: robots.txt text
  • section: 4 Sitemap
  • access: public
string addToRobots ([string $robots_file = ''])
  • string $robots_file: Robots.txt file path
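
A minimal sketch of the kind of change addToRobots() describes, working on the robots.txt text rather than on a file. The helper name addSitemapLine() and the duplicate check are assumptions for illustration, not the class's actual code; the "Sitemap:" directive itself comes from the sitemaps.org protocol.

```php
<?php
/**
 * Hypothetical helper: append a "Sitemap: <url>" line to robots.txt text
 * unless the same line is already present.
 */
function addSitemapLine(string $robotsText, string $indexUrl): string
{
    $line = 'Sitemap: ' . $indexUrl;

    // Leave the text untouched if the directive already exists.
    foreach (preg_split('/\R/', $robotsText) as $existing) {
        if (strcasecmp(trim($existing), $line) === 0) {
            return $robotsText;
        }
    }

    // Normalize trailing whitespace, then add the directive on its own line.
    $base = rtrim($robotsText);
    return ($base === '' ? '' : $base . "\n") . $line . "\n";
}

$robots = "User-agent: *\nDisallow:\n";
echo addSitemapLine($robots, 'http://www.foo.com/sitemaps/index.xml');
```

Calling the helper a second time with the same URL returns the text unchanged, so repeated runs do not pile up duplicate directives.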
addXMLURLSet (line 867)

Add the XML code of a single URL entry to $xml_url_set

  • section: 4 Sitemap
  • access: protected
void addXMLURLSet (array $entry)
  • array $entry: URL set entry
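
To show what a single URL set looks like in the sitemaps.org format, here is a hedged sketch that turns one entry into a <url> element. Only the 'URL' key is documented on this page; the 'lastmod', 'changefreq' and 'priority' keys, and the helper name urlElement(), are assumptions borrowed from the protocol's tag names, not from the class source.

```php
<?php
/**
 * Hypothetical helper: build one <url> element from an entry array.
 * Values are escaped so that characters such as & stay valid in XML.
 */
function urlElement(array $entry): string
{
    if (!isset($entry['URL'])) {
        throw new InvalidArgumentException("Entry needs a 'URL' key");
    }

    $escape = function ($value): string {
        return htmlspecialchars((string) $value, ENT_XML1 | ENT_QUOTES, 'UTF-8');
    };

    $xml = "<url>\n  <loc>" . $escape($entry['URL']) . "</loc>\n";

    // Optional sitemaps.org tags; the matching array keys are assumed.
    foreach (['lastmod', 'changefreq', 'priority'] as $tag) {
        if (isset($entry[$tag])) {
            $xml .= "  <$tag>" . $escape($entry[$tag]) . "</$tag>\n";
        }
    }

    return $xml . "</url>\n";
}

echo urlElement([
    'URL'        => 'http://www.foo.com/?a=1&b=2',
    'changefreq' => 'daily',
    'priority'   => '0.8',
]);
```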
calcFrequency (line 812)

Calculates Frequency for each entry $frequency_mode

void calcFrequency (int $key,  &$entry)
  • int $key: array offset
  • &$entry: reference to array offset in $entries
calcPriority (line 770)

calculates Priority for each entry $priority_mode

  • section: 4 Sitemap
  • access: protected
void calcPriority (int $key,  &$entry,  $total_entries)
  • int $key: array offset
  • &$entry: reference to array offset in $entries
  • $total_entries: $entries array count
Crawl (line 620)

Start the crawl process

More related options can be found at http://phpcrawl.cuab.de/classreferences/index.html

void Crawl ()
CreateSitemaps (line 728)

Create sitemaps files and index

  • section: 4 Sitemap
  • access: public
void CreateSitemaps ([array $entries = array()])
  • array $entries: array of entries to be added to sitemap
csvFile (line 1138)

get CSV file path

  • return: CSV file path
  • section: 5 CSV
  • access: public
string csvFile ()
getDataDir (line 558)

Get data directory path $data_dir

string getDataDir ()
getEntries (line 547)

Get URLs sets array $entries

  • return: $entries
  • section: 2 Info
  • access: public
array getEntries ()
getSitemapDirName (line 518)

Get sitemap directory name

Get this site's directory name where sitemaps are stored. The directory is created inside the $data_dir directory.

  • return: false if $site has not been set, sitemap dir path on success
  • section: 2 Info
  • access: public
bool|string getSitemapDirName ()
getSitemapPath (line 482)

Get sitemap file path

bool|string getSitemapPath (string $filename)
  • string $filename: 'index' string or number
getSitemapsDir (line 535)

Get sitemap directory path $sitemaps_dir

Get this site's directory path where sitemaps are stored. The directory is created inside the $data_dir directory.

  • return: SitemapCreator::sitemaps_dir sitemap dir path
  • section: 2 Info
  • access: public
string getSitemapsDir ()
getSitemapURL (line 499)

Get sitemap file URL

bool|string getSitemapURL (string $filename)
  • string $filename: 'index' string or number
initCrawler (line 591)

Initiate the crawler $Crawler

Visit http://phpcrawl.cuab.de/classreferences/index.html for the full crawler options, which can be accessed through the $Crawler object.

SMCCrawler initCrawler ()
isDataDirWritable (line 697)

Check if data directory is writable $data_dir

bool isDataDirWritable ( $dir)
  • $dir
ping (line 919)

Ping search engines

  • return: Array containing errors as $results['google']['error'], or the response body as $results['google']['body']
  • see: SitemapCreator::addEngine()
  • section: 4 Sitemap
  • access: public
array ping ([string|int $filename = 'index'])
  • string|int $filename: 'index' string or number
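
The $engines entries end in a query parameter ("sitemap=" or "siteMap="), which suggests the sitemap URL is appended to them. The sketch below builds such ping URLs without sending any request. The helper name buildPingUrls() and the use of urlencode() are assumptions for illustration; the class may compose the URL differently.

```php
<?php
// Ping endpoints copied from the $engines variable documented above.
$engines = [
    'Google'      => 'http://www.google.com/webmasters/sitemaps/ping?sitemap=',
    'Live Search' => 'http://www.bing.com/webmaster/ping.aspx?siteMap=',
];

/**
 * Hypothetical helper: append the URL-encoded sitemap URL to every
 * engine's ping endpoint and return the results keyed by engine name.
 */
function buildPingUrls(array $engines, string $sitemapUrl): array
{
    $urls = [];
    foreach ($engines as $name => $pingUrl) {
        $urls[$name] = $pingUrl . urlencode($sitemapUrl);
    }
    return $urls;
}

foreach (buildPingUrls($engines, 'http://www.foo.com/sitemaps/index.xml') as $name => $url) {
    echo $name, ': ', $url, "\n";
}
```

Encoding the sitemap URL keeps its own "?" and "&" characters, if any, from being read as part of the ping request's query string.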
prepareSitemapsDir (line 675)

Creates sitemaps directory $sitemaps_dir

* This method throws an exception if the data directory is not writable

  • see: setDataDir();
  • section: 4 Sitemap
  • access: protected
void prepareSitemapsDir ()
putContent (line 1000)

Write XML to disk

  • section: 4 Sitemap
  • access: public
void putContent (string $file, $XML $XML)
  • string $file: file path
  • $XML $XML: XML code
readFromCSV (line 1110)

Read from CSV file and add to $entries

  • return: true on success, false otherwise
  • section: 5 CSV
  • access: public
bool readFromCSV ([string $file = ''])
  • string $file: file path
readSitemap (line 1017)

Read sitemap file from disk

* This method throws an exception if the sitemap file doesn't exist

  • section: 4 Sitemap
  • access: public
void readSitemap (string|int $filename)
  • string|int $filename: 'index' string or number
removeSitemaps (line 1062)

Delete the sitemap directory and all of its files

  • return: true on success, false otherwise
  • section: 4 Sitemap
  • access: public
bool removeSitemaps ()
setCrawlerDefaults (line 647)

Load default crawler settings for $Crawler

Loads the default options internally; no external calls are allowed. More related options can be found at http://phpcrawl.cuab.de/classreferences/index.html

void setCrawlerDefaults ()
setDataDir (line 290)

Sets the data directory path of the sitemaps $data_dir

The directory is used to store sitemaps and csv files. If not set then sys_get_temp_dir() will be used.

* This method throws an exception if the path is not valid or the directory is not writable.

void setDataDir (string $dir)
setEntries (line 358)

Set the URLs sets manually $entries

* This method throws an exception if $entries is not an array or the first entry does not have a 'URL' key

  • section: 1 Settings
  • access: public
void setEntries (array $entries)
  • array $entries: array of entries to be added to sitemap
setEntriesPerSitemap (line 325)

Sets the number of URL sets for each sitemap file $entries_per_sitemap

Each sitemap file may contain a maximum of 50,000 URLs. Use this method to change the number of URL sets per sitemap.

  • return: true on success, false otherwise
  • section: 1 Settings
  • access: public
bool setEntriesPerSitemap (int $number)
  • int $number: number greater than 0
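
The effect of this limit can be pictured as splitting the entry list into numbered chunks, one per sitemap file. The following is a rough sketch under that assumption; splitEntries() is a hypothetical helper, and numbering the files from 1 is a choice made for the example, not something this page confirms.

```php
<?php
/**
 * Hypothetical helper: split entries into chunks of at most $perSitemap
 * items, keyed by a 1-based sitemap number.
 */
function splitEntries(array $entries, int $perSitemap = 50000): array
{
    if ($perSitemap < 1) {
        throw new InvalidArgumentException('Number must be greater than 0');
    }

    $files = [];
    foreach (array_chunk($entries, $perSitemap) as $i => $chunk) {
        $files[$i + 1] = $chunk;
    }
    return $files;
}

$entries = [
    ['URL' => 'http://www.foo.com/'],
    ['URL' => 'http://www.foo.com/a'],
    ['URL' => 'http://www.foo.com/b'],
    ['URL' => 'http://www.foo.com/c'],
    ['URL' => 'http://www.foo.com/d'],
];

// Five entries with a limit of two per file yield three sitemap files.
echo count(splitEntries($entries, 2)), "\n"; // prints 3
```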
setFrequency (line 416)

Set Frequency mode $frequency_mode

Choose how the Frequency of every URL should be calculated

bool setFrequency (int $mode)
  • int $mode: number between 0 and 2, or one of the predefined constants:
      SitemapCreator::FREQUENCY_Disable (0): disables frequency calculations
      SitemapCreator::FREQUENCY_LAST_MODIFIED (1): latest modified pages get higher frequency
      SitemapCreator::FREQUENCY_PRIORITY (2): higher priority pages get higher frequency
setMinFrequency (line 432)

Set minimum Frequency value for all URLs $min_frequency

bool setMinFrequency (string $mode)
setMinPriority (line 395)

Set minimum Priority value for all URLs $min_priority

bool setMinPriority (float $mode)
  • float $mode: number between 0 and 1.0
setPriority (line 379)

Set Priority mode $priority_mode

Choose how the Priority of every URL should be calculated

bool setPriority (int $mode)
  • int $mode: number between 0 and 2, or one of the predefined constants:
      SitemapCreator::PRIORITY_Disable (0): disables priority calculations
      SitemapCreator::PRIORITY_CRAWLED_FIRST (1): pages crawled first get higher priority
      SitemapCreator::PRIORITY_URL_STRUCTURE (2): deeper paths get lower priority
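
To make the URL-structure idea concrete, the sketch below lowers the priority by a fixed step for every path segment. The step of 0.1, the helper name priorityForUrl() and the way the minimum is applied are all assumptions for this example; the formula used by calcPriority() is not documented on this page.

```php
<?php
/**
 * Hypothetical helper: start at 1.0 and subtract 0.1 per path segment,
 * never going below the given minimum priority.
 */
function priorityForUrl(string $url, float $minPriority = 0.0): float
{
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        $path = ''; // no path component, or an unparsable URL
    }

    // Count non-empty segments: "/blog/2013/post.html" has a depth of 3.
    $segments = array_filter(explode('/', $path), function ($segment) {
        return $segment !== '';
    });
    $depth = count($segments);

    return round(max($minPriority, 1.0 - 0.1 * $depth), 1);
}

printf("%.1f\n", priorityForUrl('http://www.foo.com/'));
printf("%.1f\n", priorityForUrl('http://www.foo.com/blog/2013/post.html'));
```

Passing a minimum such as 0.5 as the second argument mirrors the role of $min_priority: deep URLs stop losing priority once they reach that floor.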
setSite (line 259)

Sets the URL of the website $site

Normalizes the given URL and returns a fully qualified, normalized URL. The method also generates $sitemaps_url if it has not been set.

* This method throws an exception if the URL is invalid

  • return: SitemapCreator::$site|false Returns the valid normalized URL on success or false on failure
  • section: 1 Settings
  • access: public
string|bool setSite (string $site)
  • string $site: The URL may contain the protocol (http://www.foo.com or https://www.foo.com), the port (http://www.foo.com:4500/index.php) and/or basic authentication data (http://loginname:password@www.foo.com)
setSitemapURL (line 309)

Sets the URL of the sitemap files $sitemaps_url

Sitemap URL where the sitemap file name will be appended to the end of the URL.

void setSitemapURL (string $url)
  • string $url: The sitemap URL
useGzip (line 338)

Use gzip compressed sitemaps files $use_gzip

  • section: 1 Settings
  • access: public
void useGzip (bool $mode)
  • bool $mode: true to enable gzip, false otherwise
writeIndex (line 897)

Write sitemap index file

  • section: 4 Sitemap
  • access: protected
void writeIndex ()
writeSitemap (line 887)

Write sitemap file

  • section: 4 Sitemap
  • access: protected
void writeSitemap ()
writeToCSV (line 1083)

Write to CSV file

  • return: true on success, false otherwise
  • section: 5 CSV
  • access: public
bool writeToCSV ([string $file = ''])
  • string $file: file path
Class Constants
FREQUENCY_Disable = 0 (line 61)

Disables frequency calculations

FREQUENCY_LAST_MODIFIED = 1 (line 67)

Latest modified pages get higher frequency

Default

FREQUENCY_PRIORITY = 2 (line 72)

Higher priority pages get higher frequency

PRIORITY_CRAWLED_FIRST = 1 (line 35)

Crawled first pages get higher priority

Default

PRIORITY_Disable = 0 (line 29)

Disables priority calculations

PRIORITY_URL_STRUCTURE = 2 (line 40)

Deeper paths get lower priority

Documentation generated on Sun, 20 Jan 2013 21:18:50 +0200 by phpDocumentor 1.4.4