December 28, 2014 9:29 am

Using PHP and SimpleXML to parse WordPress feeds

SimpleXML is probably the easiest way to parse an XML document in PHP. We have created a class that only has to be instantiated to fetch a feed’s data, and added a utility method that displays it as an HTML table. We have also added some simple CSS so the table looks better than the browser’s default rendering. Note that this is for demonstration and teaching purposes only. For lengthy XML files it may be better to use XMLReader instead of SimpleXML, because SimpleXML loads the whole document into memory at once, which is not a good idea for a big file.
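To make the memory remark concrete, here is a minimal sketch of the streaming alternative, assuming a small inline sample feed instead of a real download. The function name streamItemTitles and the sample XML are made up for illustration. XMLReader walks the document node by node, and only one item at a time is handed to SimpleXML.

```php
<?php
// Hypothetical sample feed; a real feed would be read from a URL or file.
$sampleFeed = '<rss version="2.0"><channel>'
    . '<item><title>First post</title></item>'
    . '<item><title>Second post</title></item>'
    . '</channel></rss>';

// Hypothetical helper: collects the <title> of every <item> while streaming.
function streamItemTitles($xmlString) {
    $titles = array();
    $reader = new XMLReader();
    // For a large file, $reader->open($pathOrUrl) avoids holding the whole
    // document in a PHP string; XML() is used here only for the small sample.
    $reader->XML($xmlString);
    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'item') {
            // Only this single <item> is parsed by SimpleXML.
            $item = simplexml_load_string($reader->readOuterXml());
            $titles[] = (string) $item->title;
        }
    }
    $reader->close();
    return $titles;
}

print_r(streamItemTitles($sampleFeed));
```

The trade-off is more code for the same result, which is why the rest of this tutorial stays with plain SimpleXML.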


This is a practical tutorial on using SimpleXML to read XML files.

We start with the standard HTML skeleton, include our WordPressFeed class and pass it the feeds we want to read.

<!DOCTYPE html>
<!-- To get the XML feed of a WordPress page you enter: http://www.page.com/feed or http://www.page.com/comments/feed or the relevant blog URL of the feed -->
<html lang="en">
	<head>
		<style>

	table.feed_table  {

		font-family:'Arial', sans-serif;
		width:75%;
		margin:25px auto;
	}
	.feed_table tr:nth-of-type(even) {
		background-color:#eee;
	}

	.feed_table th {
		font-size:1.8em;
		padding:10px;
		color:#fff;
		background-color:#333;
	}

	.feed_table td {
		border:1px solid #aaa;
		text-align: center;
	}



		</style>


		<meta charset="UTF-8">
		<title>WordPress Feed Fetcher</title>
	</head>
	<body>
	<?php
	require_once("WordPressFeed.php");

	// These two lines can be used to display a WP feed in an HTML table
	$phpgang = new WordPressFeed("https://www.phpgang.com/feed/");

	echo $phpgang->showTabular();

	$infosec = new WordPressFeed("http://resources.infosecinstitute.com/feed/");

	echo $infosec->showTabular();

	// the raw array of posts is also available for your own processing
	$phpgang_posts = $phpgang->rawResults;



?>

	</body>
</html>

The table styles are included in the head here, but you can move them to an external stylesheet and change them easily.
The WordPressFeed class itself is defined in WordPressFeed.php, with comments inserted directly in the code. Before looking at it, here is how it is used.

A feed’s data can be retrieved the following way:

$phpgang = new WordPressFeed("https://www.phpgang.com/feed/");

Then the raw results (an array of all posts/articles and their data) can be accessed with the public property:

$phpgang->rawResults;
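Because rawResults is a plain PHP array, ordinary array functions work on it. The sketch below filters posts by category. The sample data and the helper name postsInCategory are hypothetical, but the keys (title, link, author, categories, description) mirror the ones the class fills in.

```php
<?php
// Hypothetical data shaped like $phpgang->rawResults.
$posts = array(
    array(
        'title' => 'Post A',
        'link' => 'http://example.com/a',
        'author' => 'Jane Doe',
        'categories' => array('PHP', 'XML'),
        'description' => 'About XML parsing',
    ),
    array(
        'title' => 'Post B',
        'link' => 'http://example.com/b',
        'author' => 'John Doe',
        'categories' => array('CSS'),
        'description' => 'About styling tables',
    ),
);

// Hypothetical helper: keeps only the posts filed under $category.
function postsInCategory(array $posts, $category) {
    $matches = array_filter($posts, function ($post) use ($category) {
        return in_array($category, $post['categories'], true);
    });
    // array_values() reindexes the result from 0.
    return array_values($matches);
}

foreach (postsInCategory($posts, 'PHP') as $post) {
    echo $post['title'], "\n";
}
```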

The utility function to display the data in a table can be used the following way:

echo $phpgang->showTabular();

The class itself, saved as WordPressFeed.php, is shown below:

<?php
	// 30 seconds may not be enough 
	ini_set('max_execution_time', 90);

/*
 * You can add methods to show the data in different ways, save it to a database, or anything else you wish.
 * You can also modify the class so that it checks whether elements exist, or searches for attributes or missing elements.
 */
class WordPressFeed {
	// $site holds the name of the site ("phpgang", "lada", etc.)
	private $site;
	// $feedURL is the argument that you pass to the constructor
	private $feedURL;
	// $rawResults is an array with all articles/posts.
	public $rawResults;
	// creates $site
	private function getSite() {
		//get only the site's name and assign it to $site
		$site_arr=explode(".", $this->feedURL);
		array_pop($site_arr);
		$this->site=array_pop($site_arr);

	}
	//gets the feed's data upon instantiation
	public function __construct($feedURL) {
		$this->feedURL=$feedURL;
		
		$this->rawResults=$this->getFeed();

		$this->getSite();

	}


	public function showTabular() {
		// shows the feed as a table

	
		if (count($this->rawResults) >0) {
		
		?>
		<h1 style="text-align: center;">Feed for Website: <?php echo $this->site; ?></h1><hr />
		<table class="feed_table">
			<tr>
				<th>Title</th>
				<th>URL</th>
				<th>Author</th>
				<th>Categories</th>
				<th>Description</th>
			</tr>
			<?php
			foreach ($this->rawResults as $articleElement) {
				echo "<tr>";
				echo "<td>{$articleElement['title']}</td>";
				echo "<td>{$articleElement['link']}</td>";
				echo "<td>{$articleElement['author']}</td>";
				// create a string from all array indices
				$categories = implode(", ", $articleElement['categories']);
				// remove html tags such as images and links from the description
				$description = strip_tags($articleElement['description']);
				echo "<td>$categories</td>";
				echo "<td>$description</td>";
				echo "</tr>";

			}
			echo "</table>";

			
		}


	}

	private function getFeed() {
		// get the feed's data
		// if it is an .xml file we need to use simplexml_load_file()
		$feed = file_get_contents($this->feedURL);

		$allArticles = array();
		// return an empty list if the feed could not be downloaded
		if ($feed === false) {
			return $allArticles;
		}

		$xml = simplexml_load_string($feed);
		// return an empty list if the feed is not well-formed XML
		if ($xml === false) {
			return $allArticles;
		}

		// loop through each post
		foreach ($xml->channel->item as $item) {
			$article = array();
			$category = array();
			// add the data about that post to a variable, with categories being an array of all the categories for the post
			$article['title'] = htmlspecialchars((string) $item->title);
			$article['link'] = htmlspecialchars((string) $item->link);
			// get the creator element that is in the dc namespace
			$namespaces = $item->getNamespaces(true);
			$author = '';
			// some feeds may not declare the dc namespace at all
			if (isset($namespaces['dc'])) {
				$dc_namespace = $item->children($namespaces['dc']);
				$author = (string) $dc_namespace->creator;
			}
			$article['author'] = htmlspecialchars($author);
			foreach ($item->category as $single_category) {
				$category[] = htmlspecialchars((string) $single_category);
			}
			$article['categories'] = $category;
			$article['description'] = (string) $item->description;
			// add the article to the array of articles
			$allArticles[] = $article;
		}

		return $allArticles;
	}

}

It is important to cast the XML elements to strings, otherwise they remain SimpleXML objects. Also, an XML element named something like dc:creator is a creator element in the dc namespace, so you have to select that namespace first (here with children()) and then retrieve the creator element from it.

You cannot simply access it with something like:

$item->{"dc:creator"};
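A minimal sketch of the difference, using an inline sample item rather than a live feed. The sample XML and the author name are made up; the namespace URI is the Dublin Core one that WordPress feeds declare for the dc prefix.

```php
<?php
// Hypothetical one-item feed with a namespaced dc:creator element.
$sample = '<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/"><channel>'
    . '<item><title>Hello</title><dc:creator>Jane Doe</dc:creator></item>'
    . '</channel></rss>';

$xml  = simplexml_load_string($sample);
$item = $xml->channel->item[0];

// Direct property access only looks at elements outside any namespace,
// so it does not find dc:creator.
$direct = (string) $item->{'dc:creator'};

// children() with a prefix (second argument true) selects the dc namespace.
$viaNamespace = (string) $item->children('dc', true)->creator;

var_dump($direct);
var_dump($viaNamespace);
```

The class above looks the namespace up with getNamespaces(true) and passes the URI to children(); passing the prefix with true as the second argument is an equivalent shortcut when the prefix is known.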

When we give the class a WordPress feed and call its showTabular() method, we get a page that looks something like the graphic below:

Feed for Website phpgang

Author Ivan Dimov

Ivan is a student of IT, a freelance web designer/developer and a tech writer. He deals with both front-end and back-end stuff. Whenever he is not in front of an Internet-enabled device he is probably reading a book or traveling. You can find more about him at: http://www.dimoff.biz.



