<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Alan's blog &#187; shortcodes</title>
	<atom:link href="http://www.alandix.com/blog/tag/shortcodes/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.alandix.com/blog</link>
	<description>just starting ...</description>
	<lastBuildDate>Wed, 08 Feb 2012 18:53:58 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.1</generator>
		<item>
		<title>fix for WordPress shortcode bug</title>
		<link>http://www.alandix.com/blog/2009/07/26/fix-for-wordpress-shortcode-bug/</link>
		<comments>http://www.alandix.com/blog/2009/07/26/fix-for-wordpress-shortcode-bug/#comments</comments>
		<pubDate>Sun, 26 Jul 2009 15:56:04 +0000</pubDate>
		<dc:creator>alan</dc:creator>
				<category><![CDATA[academic]]></category>
		<category><![CDATA[web development]]></category>
		<category><![CDATA[bugs]]></category>
		<category><![CDATA[debugging]]></category>
		<category><![CDATA[php]]></category>
		<category><![CDATA[shortcodes]]></category>
		<category><![CDATA[wordpress]]></category>

		<guid isPermaLink="false">http://www.alandix.com/blog/?p=196</guid>
		<description><![CDATA[I&#8217;m starting to use shortcodes heavily in WordPress as we are using it internally on the DEPtH project to coordinate our new TouchIT book.  There was a minor bug which meant that HTML tags came out unbalanced (e.g. &#8220;&#60;p&#62;&#60;/div&#62;&#60;/p&#62;&#8221;). I&#8217;ve just been fixing it and posting a patch. Interestingly, the bug was partly due to the [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m starting to use <a href="http://codex.wordpress.org/Shortcode_API" target="_blank">shortcodes</a> heavily in <a href="http://wordpress.org/" target="_blank">WordPress</a><sup><a href="#footnote-1-196" id="footnote-link-1-196" title="See the footnote.">1</a></sup> as we are using it internally on the <a href="http://www.physicality.org/DEPtH/" target="_blank">DEPtH project</a> to coordinate our new <a href="http://www.physicality.org/TouchIT/" target="_blank">TouchIT book</a>.  There was a minor bug which meant that HTML tags came out unbalanced (e.g. &#8220;&lt;p&gt;&lt;/div&gt;&lt;/p&gt;&#8221;).</p>
<p>I&#8217;ve just been fixing it and posting a patch<sup><a href="#footnote-2-196" id="footnote-link-2-196" title="See the footnote.">2</a></sup>. Interestingly, the bug was partly due to the fact that <a href="http://uk3.php.net/manual/en/regexp.reference.back-references.php" target="_blank">back-references</a> in regular expressions count from the beginning of the regular expression, making it impossible to use them if the expression may be &#8216;glued&#8217; into a larger one &#8230; lack of referential transparency!</p>
<p>For anyone having similar problems, full details and patch below (all WP and PHP techie stuff).</p>
<p><span id="more-196"></span></p>
<p>The regex returned by get_shortcode_regex() in <a href="http://svn.automattic.com/wordpress/tags/2.8.1/wp-includes/shortcodes.php" target="_blank">shortcodes.php</a> had a back-reference &#8216;\[\/\2\]&#8217; to match balanced tag-end-tag pairs such as &#8220;[x]text[/x]&#8221;.  The regex assumes it will be used &#8216;bare&#8217; as a regular expression.  However, wpautop in <a href="http://svn.automattic.com/wordpress/tags/2.8.1/wp-includes/formatting.php" target="_blank">formatting.php</a> (which turns newlines in posts into &lt;p&gt;s or &lt;br&gt;s) needs to remove &lt;p&gt;s from standalone shortcodes. To do this it adds extra enclosing brackets to the shortcode regex to be able to refer to the entire matched tags and content.  The extra brackets shift every group number up by one, so the back-reference \2 no longer refers to the tag name; instead it matches the (optional) preceding character, which is usually empty, and thus the closing tag does not match properly.</p>
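<p>A minimal sketch of that shift, using a cut-down copy of the regex with a single hypothetical shortcode name &#8216;x&#8217; (an assumption for illustration, not how WordPress builds the pattern at runtime). Used bare, \2 is the tag name and the pair matches; wrapped in one extra pair of brackets, \2 becomes the preceding-character group and the end tag is left behind.</p>
<pre class="brush: php; title: ;">
&lt;?php
// Cut-down copy of the 2.8.1 regex; 'x' is a hypothetical shortcode name.
$bare = '(.?)\[(x)\b(.*?)(?:(\/))?\](?:(.+?)\[\/\2\])?(.?)';
$text = '[x]text[/x]';

// Used bare: group 2 is the tag name, so \2 finds the matching end tag.
preg_match('/' . $bare . '/s', $text, $m);
echo $m[0], &quot;\n&quot;;   // [x]text[/x]

// Wrapped in extra brackets: every group number moves up by one, so \2
// is now the (empty) preceding character and '[/x]' is never matched.
preg_match('/(' . $bare . ')/s', $text, $w);
echo $w[0], &quot;\n&quot;;   // [x]t  (start tag plus one trailing character only)
</pre>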
<p>In the case of a single tag-end-tag pair, this means that</p>
<pre style="padding-left: 30px;"> [divtag]</pre>
<pre style="padding-left: 30px;"> abc</pre>
<pre style="padding-left: 30px;"> [/divtag]
</pre>
<p>ends up as</p>
<pre style="padding-left: 30px;"> &lt;div&gt;</pre>
<pre style="padding-left: 30px;"> &lt;p&gt;abc&lt;/p&gt;</pre>
<pre style="padding-left: 30px;"> &lt;p&gt;&lt;/div&gt;&lt;/p&gt;
</pre>
<p>This gets more complicated when there are internal tags such as:</p>
<p style="padding-left: 30px;">
<pre style="padding-left: 30px;"> [mytag] some [specialsymbol] text [/mytag]</pre>
<p>In some cases like this, the existing .*? for attribute matching chomps everything while it tries to find a &#8216;/&#8217;.</p>
<p>A simple fix would be to add an optional parameter to get_shortcode_regex($nested=0) and use this to modify the regular expression.  This would give the intended match for the shortcode regex, but in fact makes things worse. Looking at the same source:</p>
<pre style="padding-left: 30px;"> [divtag]</pre>
<pre style="padding-left: 30px;"> abc</pre>
<pre style="padding-left: 30px;"> [/divtag]</pre>
<p>it would generate (assuming [divtag] generates a div):</p>
<pre style="padding-left: 30px;"> &lt;div&gt;&lt;/p&gt;</pre>
<pre style="padding-left: 30px;"> &lt;p&gt;abc&lt;/p&gt;</pre>
<pre style="padding-left: 30px;"> &lt;p&gt;&lt;/div&gt;</pre>
<p>The attached patch addresses this by adding two new functions to shortcodes.php returning regular expressions for matching begin and end tags separately.  wpautop then has two separate preg_replace lines doing the &lt;p&gt; fixes.  I&#8217;ve tried to change as little as possible, although I might return sometime to do things like extending it to allow nested copies of the same tag.</p>
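<p>A rough sketch of the two-pass idea, with &#8216;divtag&#8217; as a hypothetical tag name hard-coded in place of the $shortcode_tags lookup, and the input an assumed wpautop result:</p>
<pre class="brush: php; title: ;">
&lt;?php
$tags  = 'divtag';   // hypothetical tag name, for illustration only
$start = '\[(' . $tags . ')\b([^\]]*?)(?:(\/))?\]';   // start tag only
$end   = '\[\/(' . $tags . ')\]';                     // end tag only

// Assumed wpautop output before the shortcode clean-up.
$pee = &quot;&lt;p&gt;[divtag]&lt;/p&gt;\n&lt;p&gt;abc&lt;/p&gt;\n&lt;p&gt;[/divtag]&lt;/p&gt;&quot;;

// Unwrap standalone start tags, then standalone end tags.
$pee = preg_replace('/&lt;p&gt;\s*?(' . $start . ')\s*&lt;\/p&gt;/s', '$1', $pee);
$pee = preg_replace('/&lt;p&gt;\s*?(' . $end . ')\s*&lt;\/p&gt;/s', '$1', $pee);
echo $pee, &quot;\n&quot;;
// Prints:
// [divtag]
// &lt;p&gt;abc&lt;/p&gt;
// [/divtag]
</pre>
<p>Neither pattern needs a back-reference, so adding the enclosing brackets cannot change what it matches.</p>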
<p>At the same time as doing the above bug fix, I swopped the .*? for attribute matching to [^\]]*?<br />
I guess this is slightly slower, but it doesn&#8217;t risk chomping the entire post as a shortcode attribute :-/</p>
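<p>To see the difference, here is a small sketch with two hypothetical tag names and a paragraph in which the shortcode does not stand alone. Only the attribute part of the start-tag pattern differs between the two runs:</p>
<pre class="brush: php; title: ;">
&lt;?php
// Hypothetical tag names, for illustration only.
$tags = 'mytag|specialsymbol';

// Start-tag patterns that differ only in the attribute part.
$lazy_dot   = '\[(' . $tags . ')\b(.*?)(?:(\/))?\]';
$no_bracket = '\[(' . $tags . ')\b([^\]]*?)(?:(\/))?\]';

// The shortcode is followed by other text, so it should be left alone.
$html = '&lt;p&gt;[mytag] some [specialsymbol]&lt;/p&gt;';

// .*? can run across ']' until the rest of the pattern finally fits.
$hit_lazy = preg_match('/&lt;p&gt;\s*?(' . $lazy_dot . ')\s*&lt;\/p&gt;/s', $html, $m);
echo $hit_lazy, ' ', $m[3], &quot;\n&quot;;   // 1 ] some [specialsymbol

// [^\]]*? has to stop at the first ']', so nothing matches.
$hit_strict = preg_match('/&lt;p&gt;\s*?(' . $no_bracket . ')\s*&lt;\/p&gt;/s', $html, $n);
echo $hit_strict, &quot;\n&quot;;   // 0
</pre>
<p>With the /s modifier the dot also matches newlines, which is how a stray match can grow to cover several paragraphs.</p>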
<p><strong>Download: </strong> <a href="http://www.alandix.com/blog/wp-content/uploads/2009/07/shortcodes_wpautop_patch_2009_07_26.diff">shortcodes patch</a><br />
Instructions for installing patches at <a href="http://markjaquith.com/" target="_blank">Mark Jaquith</a>&#8217;s entry on <a href="http://markjaquith.wordpress.com/2005/11/02/my-wordpress-toolbox/" target="_blank">My WordPress Toolbox</a>.</p>
<p>The patch in full:</p>
<pre class="brush: php; title: ;">

Index: wp-includes/shortcodes.php
===================================================================
--- wp-includes/shortcodes.php    (revision 11744)
+++ wp-includes/shortcodes.php    (working copy)
@@ -175,10 +175,40 @@
 $tagnames = array_keys($shortcode_tags);
 $tagregexp = join( '|', array_map('preg_quote', $tagnames) );

-    return '(.?)\[('.$tagregexp.')\b(.*?)(?:(\/))?\](?:(.+?)\[\/\2\])?(.?)';
+    return '(.?)\[('.$tagregexp.')\b([^\]]*?)(?:(\/))?\](?:(.+?)\[\/\2\])?(.?)';
 }

 /**
+ * Retrieve the shortcode regular expression for searching for start tag only.
+ *
+ * @uses $shortcode_tags
+ *
+ * @return string The shortcode start tag search regular expression
+ */
+function get_shortcode_start_regex() {
+    global $shortcode_tags;
+    $tagnames = array_keys($shortcode_tags);
+    $tagregexp = join( '|', array_map('preg_quote', $tagnames) );
+
+    return '\[('.$tagregexp.')\b([^\]]*?)(?:(\/))?\]';
+}
+
+/**
+ * Retrieve the shortcode regular expression for searching for end tag only.
+ *
+ * @uses $shortcode_tags
+ *
+ * @return string The shortcode end tag search regular expression
+ */
+function get_shortcode_end_regex() {
+    global $shortcode_tags;
+    $tagnames = array_keys($shortcode_tags);
+    $tagregexp = join( '|', array_map('preg_quote', $tagnames) );
+
+    return '\[\/('.$tagregexp.')\]';
+}
+
+/**
 * Regular Expression callable for do_shortcode() for calling shortcode hook.
 * @see get_shortcode_regex for details of the match array contents.
 *
Index: wp-includes/formatting.php
===================================================================
--- wp-includes/formatting.php    (revision 11744)
+++ wp-includes/formatting.php    (working copy)
@@ -170,7 +170,8 @@
 if (strpos($pee, '&lt;pre') !== false)
 $pee = preg_replace_callback('!(&lt;pre[^&gt;]*&gt;)(.*?)&lt;/pre&gt;!is', 'clean_pre', $pee );
 $pee = preg_replace( &quot;|\n&lt;/p&gt;$|&quot;, '&lt;/p&gt;', $pee );
-    $pee = preg_replace('/&lt;p&gt;\s*?(' . get_shortcode_regex() . ')\s*&lt;\/p&gt;/s', '$1', $pee); // don't auto-p wrap shortcodes that stand alone
+    $pee = preg_replace('/&lt;p&gt;\s*?(' . get_shortcode_start_regex() . ')\s*&lt;\/p&gt;/s', '$1', $pee); // don't auto-p wrap shortcodes that stand alone
+    $pee = preg_replace('/&lt;p&gt;\s*?(' . get_shortcode_end_regex() . ')\s*&lt;\/p&gt;/s', '$1', $pee);   // check both start and end tags

 return $pee;
 }
</pre>
<br /><ol class="footnotes"><li id="footnote-1-196">see section &#8220;using dynamic binding&#8221; in <a href="http://www.alandix.com/blog/2009/07/20/whats-wrong-with-dynamic-binding/" target="_blank">What’s wrong with dynamic binding?</a>  [<a href="#footnote-link-1-196">back</a>]</li><li id="footnote-2-196"><a href="http://core.trac.wordpress.org/ticket/10490" target="_blank">TRAC ticket #10490</a>  [<a href="#footnote-link-2-196">back</a>]</li></ol>]]></content:encoded>
			<wfw:commentRss>http://www.alandix.com/blog/2009/07/26/fix-for-wordpress-shortcode-bug/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>

