<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Alan's blog &#187; bugs</title>
	<atom:link href="http://www.alandix.com/blog/tag/bugs/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.alandix.com/blog</link>
	<description>just starting ...</description>
	<lastBuildDate>Wed, 08 Feb 2012 18:53:58 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.1</generator>
		<item>
		<title>tread lightly &#8212; controlling user experience pollution</title>
		<link>http://www.alandix.com/blog/2012/01/14/tread-lightly-controlling-user-experience-pollution/</link>
		<comments>http://www.alandix.com/blog/2012/01/14/tread-lightly-controlling-user-experience-pollution/#comments</comments>
		<pubDate>Sat, 14 Jan 2012 08:40:15 +0000</pubDate>
		<dc:creator>alan</dc:creator>
				<category><![CDATA[academic]]></category>
		<category><![CDATA[HCI and usability]]></category>
		<category><![CDATA[web development]]></category>
		<category><![CDATA[bugs]]></category>
		<category><![CDATA[firefox]]></category>
		<category><![CDATA[HCI]]></category>
		<category><![CDATA[human computer interaction]]></category>
		<category><![CDATA[networks]]></category>
		<category><![CDATA[time]]></category>
		<category><![CDATA[usability]]></category>

		<guid isPermaLink="false">http://www.alandix.com/blog/?p=802</guid>
		<description><![CDATA[When thinking about usability or user experience, it is easy to focus on the application in front of us, but the way it impacts its environment may sometimes be far more critical. However, designing applications that are friendly to their environment (digital and physical) may require deep changes to the low-level operating systems. I&#8217;m writing [...]]]></description>
			<content:encoded><![CDATA[<p>When thinking about usability or user experience, it is easy to focus on the application in front of us, but the way it impacts its environment may sometimes be far more critical.  However, designing applications that are friendly to their environment (digital and physical) may require deep changes to the low-level operating systems.</p>
<p>I&#8217;m writing this post effectively &#8216;offline&#8217; into a word processor for later upload. I sometimes do this as I find it easier to write without the distractions of editing within a web browser, or because I am physically disconnected from the Internet.  However, right now I am connected, and indeed I can see I am connected as an FTP file upload is progressing; it is just that anything else network-related is stalled.</p>
<p>The reason that the FTP upload is &#8216;hogging&#8217; the network is, I believe, due to a quirk in the UNIX scheduling system, which was, paradoxically, originally intended to improve interactivity.</p>
<p>UNIX, which sits underneath Mac OS, is a multiprocessing operating system running many programs at once.  Each process has a priority, called its &#8216;<a href="http://docstore.mik.ua/orelly/unix3/upt/ch26_07.htm" target="_blank">niceness</a>&#8217;, which can be set explicitly, but is also tweaked from moment to moment by the operating system.  One of the rules for &#8216;tweaking&#8217; it is that if a process is IO-bound, that is if it is constantly waiting for input or output, then its niceness is decreased, meaning that it is given higher priority.</p>
<p>The reason for this rule is partly to enhance interactive performance in the old days of command line interfaces; an interactive program would spend lots of time waiting for the user to enter something, and so its priority would increase meaning it would respond quickly as soon as the user entered anything. The other reason is that CPU time was seen as the scarce resource, so that processes that were IO bound were effectively being &#8216;nicer&#8217; to other processes as they let them get a share of the precious CPU.</p>
<p>The FTP program is simply sitting there shunting out data to the network, so is almost permanently blocked waiting for the network as it can read from the disk faster than the network can transmit data.  This means UNIX regards it as &#8216;nice&#8217; and ups its priority.  As soon as the network clears sufficiently, the FTP program is rescheduled and it puts more into the network queue, reads the next chunk from disk until the network is again full to capacity.  Nothing else gets a chance, no web, no email, not even a network trace utility.</p>
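<p>The priority rule described above can be illustrated with a toy model. This is only a sketch of the idea, not the algorithm of any real UNIX or Mac OS scheduler: the function names, the 10 ms time slice and the 0 to 39 priority range are all assumptions made up for the example.</p>

```python
def adjust_priority(priority, used_ms, quantum_ms, lo=0, hi=39):
    """Toy rule (hypothetical): a lower number means higher priority, like niceness."""
    if used_ms < quantum_ms:
        # The process blocked on I/O before its time slice ran out: reward it.
        priority -= 1
    else:
        # The process burned its whole slice: treat it as CPU-bound.
        priority += 1
    return max(lo, min(hi, priority))


def simulate(bursts_ms, quantum_ms=10, start=20):
    """Apply the toy rule once per scheduling round and return the final priority."""
    priority = start
    for used_ms in bursts_ms:
        priority = adjust_priority(priority, used_ms, quantum_ms)
    return priority


# An uploader that blocks almost at once, versus a number cruncher.
io_bound = simulate([1] * 8)
cpu_bound = simulate([10] * 8)
print(io_bound, cpu_bound)  # prints "12 28"
```

<p>Under this toy rule the process that keeps blocking ends up with the better priority, so it is rescheduled first each time the network or disk frees up.</p>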
<p>I&#8217;ve seen the same before with a database server on one of Fiona&#8217;s machines &#8212; all my fault.  In the MySQL manual it suggested that you disable indices before large bulk updates (e.g. ingesting a file of data) and then re-enable them once the update is finished as indexing is more efficient on lots of data than one at a time.  I duly did this and forgot about it until Fiona noticed something was wrong on the server and web traffic had ground to a near halt.  When she opened a console on the server, she found that it seemed quiet, very little CPU load at all, and was puzzled until I realised it was my indexing.  Indexing requires a lot of reading and writing data to and from disk, so MySQL became IO-bound, was given higher priority, as soon as the disk was free it was rescheduled, hit the disk once more &#8230; just as FTP is now hogging the network, MySQL hogged the disk and nothing else could read or write.  Of course MySQL&#8217;s own performance was fine as it internally interleaved queries with indexing, it is just everything else on the system that failed.</p>
<p>These are hard scenarios to design for.  I have written before (&#8220;<a href="http://www.alandix.com/blog/2008/06/21/why-software-need-never-hang/" target="_blank">why software need never hang</a>&#8221;) about the way application designers do not think sufficiently about potential delays due to slow networks, or broken connections.  However, that was about the applications that are suffering.  Here the issue is not that the FTP program is badly designed for its delays, it is still responding very happily, just that it has had a knock-on effect on the rest of the system. It is like cleaning your sink with industrial bleach &#8212; you have a clean house within, but pollute the watercourse without.</p>
<p>These kinds of issues are not related solely to network and disk; any kind of resource is limited and profligacy causes damage in the digital world as much as in the physical environment.</p>
<p>Some years ago I had a Symbian smartphone, but it proved unusable as its battery life rarely exceeded 40 minutes from full charge.  I thought I had a duff battery, but later realised it was because I was leaving applications on the phone &#8216;open&#8217;.  For me I went to the address book, looked up a number, and that was that, I then maybe turned the phone off or switched to something else without &#8216;exiting&#8217; the address book.  I was treating the phone like every previous phone I had used, but this one was different, it had a &#8216;real&#8217; operating system, opening the address book launched the address book application, which then kept on running &#8212; and using power &#8212; until it was explicitly closed, a model that is maybe fine for permanently plugged in computers, but disastrous for a mobile phone.</p>
<p>When early iPhones came out iOS was criticised for being single threaded, that is not having lots of things running in the &#8216;background&#8217;.  However, this undoubtedly helped its battery life.  Now, with newer versions of iOS, it has changed and there are lots of apps running at once, and I have noticed the battery life reducing, is that simply the battery wearing out with age or the effect of all those apps running?</p>
<p>Power is of course not just a problem for smartphones, but for any laptop.  I try to close down applications on my Mac when I am working without power as I know some programs just eat CPU when they are apparently idle (yes, Firefox, it&#8217;s you I&#8217;m talking about).  And from an environmental point of view, lower power consumption when connected would also be good.   My hope was that Apple would take the lessons learnt in the early iOS to change the nature of their mainstream OS, but sadly they succumbed to the pressure to make iOS a &#8216;proper&#8217; OS!</p>
<p>Of course the FTP program could try to be friendly, perhaps when it is not the selected window deliberately throttle its network activity.  But then the 4 hour upload would take 8 hours, instead of 20 minutes left at this point, I&#8217;d be looking forward to another 4 hours and 20 minutes, and I&#8217;d be complaining about that.</p>
<p>The trouble is that there needs to be better communication, more knowledge shared, between application and operating system.  I would like FTP to use all the network capacity that it can, <em>except</em> when I am interacting with some other program.  Either FTP needs to say to the OS &#8220;hey here&#8217;s a packet, send it when there&#8217;s a gap&#8221;<sup><a href="#footnote-1-802" id="footnote-link-1-802" title="See the footnote.">1</a></sup>, or the OS needs some way for applications to determine current network state and make decisions based on that.  Sometimes this sort of information is easily available, more often it is either very hard to get at or not available at all.</p>
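<p>For the application-side option, a well-known technique is a token bucket, sketched here under stated assumptions. The class and the rates are hypothetical, and the hard part discussed above is left out: something still has to tell the program <em>when</em> the user is busy elsewhere so that it can call <code>set_rate</code>.</p>

```python
class TokenBucket:
    """Minimal token-bucket throttle (illustrative sketch, not from any FTP client)."""

    def __init__(self, rate_bytes_per_s, capacity_bytes, now=0.0):
        self.rate = rate_bytes_per_s
        self.capacity = capacity_bytes
        self.tokens = capacity_bytes  # start with a full bucket
        self.last = now

    def set_rate(self, rate_bytes_per_s):
        # Hypothetical hook: lower the rate while the user works in another program.
        self.rate = rate_bytes_per_s

    def try_send(self, nbytes, now):
        """Return True if nbytes may be sent at time `now` (seconds)."""
        # Refill in proportion to elapsed time, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False


bucket = TokenBucket(rate_bytes_per_s=1000, capacity_bytes=2000)
print(bucket.try_send(1500, now=0.0))  # True: the bucket starts full
print(bucket.try_send(1000, now=0.0))  # False: only 500 tokens remain
print(bucket.try_send(1000, now=0.5))  # True: half a second refilled 500 more
```

<p>The clock is passed in explicitly so the behaviour is easy to check; a real uploader would read a monotonic clock and sleep when <code>try_send</code> returns False.</p>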
<p>I recall years ago when internet was still mainly through pay-per-minute dial-up connections.  You could set your PC to automatically dial when the internet was needed.  However, some programs, such as chat, would periodically check with a central server to see if there was activity, this would cause the PC to dial up the ISP.  If you were lucky the PC also had an auto-disconnect after a period of inactivity, if you were not lucky the PC would connect at 2am and by the morning you&#8217;d find yourself with a phone bill more than your week&#8217;s wages.</p>
<p>When we were designing onCue at <a href="http://www.aqtive.net/" target="_blank">aQtive</a>, we wanted to be able to connect to the Internet when it was available, but avoid bankrupting our users.  Clearly somewhere in the TCP/IP stack, the layers of code over the network, at some level deep down it knew whether we were connected.  I recall we found a very helpful function in the Windows API called something like &#8220;isConnected&#8221;<sup><a href="#footnote-2-802" id="footnote-link-2-802" title="See the footnote.">2</a></sup>.  Unfortunately, it worked by attempting to send a network packet and returning true if it succeeded and false if it failed.  Of course sending the test packet caused the PC to auto-dial &#8230;</p>
<p>And now there is just 1 minute and 53 seconds left on the upload, so time to finish this post before I get on to garbage collection.</p>
<br /><ol class="footnotes"><li id="footnote-1-802">This form of &#8220;send when you can&#8221; would also be useful in cellular networks, for example when syncing photos.  [<a href="#footnote-link-1-802">back</a>]</li><li id="footnote-2-802">I had a quick peek, and found that Windows CE has a function called <a href="http://msdn.microsoft.com/en-us/library/ms918360.aspx" target="_blank">InternetGetConnectedState</a>.  I don&#8217;t know if this works better now.  [<a href="#footnote-link-2-802">back</a>]</li></ol>]]></content:encoded>
			<wfw:commentRss>http://www.alandix.com/blog/2012/01/14/tread-lightly-controlling-user-experience-pollution/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Time Machine &#8211; when it goes wrong and how to fix it</title>
		<link>http://www.alandix.com/blog/2010/07/09/time-machine-when-it-goes-wrong-and-how-to-fix-it/</link>
		<comments>http://www.alandix.com/blog/2010/07/09/time-machine-when-it-goes-wrong-and-how-to-fix-it/#comments</comments>
		<pubDate>Fri, 09 Jul 2010 09:05:55 +0000</pubDate>
		<dc:creator>alan</dc:creator>
				<category><![CDATA[academic]]></category>
		<category><![CDATA[HCI and usability]]></category>
		<category><![CDATA[bugs]]></category>
		<category><![CDATA[debugging]]></category>
		<category><![CDATA[fail-fast programming]]></category>
		<category><![CDATA[MacOSX]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[software development]]></category>
		<category><![CDATA[software engineering]]></category>

		<guid isPermaLink="false">http://www.alandix.com/blog/?p=262</guid>
		<description><![CDATA[Unfortunately only fixing Mac OS X backup, not the Tardis &#8230; but, nonetheless, critical. What bit of software do you really need to be reliable?  If anything else goes really wrong you have the backup &#8212; but if the backup fails you really are lost. And Mac OS X Time Machine, while it does have [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.apple.com/macosx/what-is-macosx/time-machine.html" target="_blank"><img class="alignright" title="MacOSX Time Machine" src="http://www.alandix.com/images/time-machine-logo-only.gif" alt="" width="100" height="100" /></a>Unfortunately only fixing Mac OS X backup, not the Tardis <img src='http://www.alandix.com/blog/wp-includes/images/smilies/icon_sad.gif' alt=':-(' class='wp-smiley' />  &#8230; but, nonetheless, critical.</p>
<p>What bit of software do you really need to be reliable?  If anything else goes really wrong you have the backup &#8212; but if the backup fails you really are lost.</p>
<p>And Mac OS X <a href="http://www.apple.com/macosx/what-is-macosx/time-machine.html" target="_blank">Time Machine</a>, while it does have a very pretty  interface, is inclined to get stuck sometimes.</p>
<p>This is my own story of how it goes wrong &#8230; and how to put it right.</p>
<p>&#8230; and throughout I&#8217;ve dropped in a few lessons for anyone implementing critical system software &#8212; maybe the odd Apple engineer is reading</p>
<h3>how to tell when things are wrong</h3>
<p>Occasionally Time Machine seems to be stuck, but isn&#8217;t really.  When you first do a backup, or when you haven&#8217;t backed up to a particular disk for ages (perhaps if you have been away on a trip), it can spend several hours &#8216;preparing&#8217;.  You can tell it is &#8216;preparing&#8217; because when you open the Time Machine preferences there is the little barber&#8217;s pole saying &#8216;preparing&#8217; <img src='http://www.alandix.com/blog/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /> </p>
<p style="padding-left: 30px;"><img class="alignnone" title="Time Machine preparing ..." src="http://www.alandix.com/images/time-machine-preparing.gif" alt="" width="253" height="45" /></p>
<p>This is when it is running over the disk working out what it needs to backup, and always seems to be the lengthiest operation, actually backing up the disk is often quite fast, and yet, for some reason there is no indication of how far through the &#8216;preparing&#8217; process it has got.</p>
<p style="padding-left: 30px;"><span style="color: #000080;"><strong>Lesson 1: </strong></span><span style="color: #993366;">make sure you include progress indicators for anything that can take a while, not just the obvious &#8216;slow&#8217; things.</span></p>
<p>So, when you see &#8216;preparing&#8217;, just be patient!</p>
<p>However, at least half-a-dozen times over the last year, my Time Machine has got completely stuck.  I have seen this happen in three ways:</p>
<p>(i)  it is still saying &#8216;preparing&#8217; after leaving it overnight!</p>
<p>(ii)  it starts to transfer to disk, but then gets stuck part way:</p>
<p style="padding-left: 30px;"><img class="alignnone" title="Time Machine transferring to disk" src="http://www.alandix.com/images/time-machine-transfering.png" alt="" width="252" height="44" /></p>
<p>(iii)  if you look in the Time Machine preferences it says the backup has failed</p>
<p>This last time in fact the first sign was (iii), but it doesn&#8217;t actually tell you (if you don&#8217;t look) until it has failed for ten days, by which time I was travelling.  In the days before Time Machine I always did a manual backup before travelling as I knew that was when things were most likely to go wrong, but now-a-days I have got used to relying on it and forget to check it is working OK &#8230; so if you are paranoid about your data, do peek occasionally at Time Machine to check it is still working!</p>
<p>When I got home I told Time Machine to back up to the Time Capsule here rather than my office disk (why can&#8217;t it remember that I have two backup disks??).  Then (after being very very patient while it was &#8216;preparing&#8217; for four hours), I saw it get stuck in step (ii) at 1.4 GB or 4.2 GB.  Of course progress indicators are never very good for very slow operations; when transferring several GB of data there may be several minutes before the bar even moves a pixel &#8230; but I was very very patient and it definitely did not move!</p>
<p style="padding-left: 30px;"><span style="color: #000080;"><strong>Lesson 2: </strong></span><span style="color: #993366;">for very long processes supplement the progress indicator with some other indicator to show things are still working, in this case perhaps amount transferred in last minute<br />
</span></p>
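<p>A secondary indicator of the kind Lesson 2 suggests could be as small as a sliding-window counter. The sketch below is a hypothetical helper, not anything Time Machine is known to contain; the 60-second window and the names are assumptions.</p>

```python
from collections import deque


class RecentTransfer:
    """Tracks how many bytes were transferred during the last `window_s` seconds."""

    def __init__(self, window_s=60.0):
        self.window_s = window_s
        self.events = deque()  # (timestamp, nbytes) pairs, oldest first
        self.total = 0

    def record(self, nbytes, now):
        """Note that nbytes were written at time `now` (seconds)."""
        self.events.append((now, nbytes))
        self.total += nbytes
        self._expire(now)

    def bytes_in_window(self, now):
        self._expire(now)
        return self.total

    def looks_stalled(self, now):
        # Nothing at all in the last window: worth telling the user.
        return self.bytes_in_window(now) == 0

    def _expire(self, now):
        # Drop events that have fallen out of the window.
        cutoff = now - self.window_s
        while self.events and self.events[0][0] <= cutoff:
            _, nbytes = self.events.popleft()
            self.total -= nbytes


tracker = RecentTransfer(window_s=60.0)
tracker.record(1000, now=0.0)
tracker.record(500, now=30.0)
print(tracker.bytes_in_window(now=59.0))  # 1500
print(tracker.looks_stalled(now=100.0))   # True
```

<p>Shown next to the progress bar, a figure like this makes the difference between &#8216;slow&#8217; and &#8216;stuck&#8217; visible within a minute rather than after a night of waiting.</p>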
<p>At this point I did the normal things, turn Time Machine On/Off,  restart machine a couple of times, etc., but when it persists then you know something is deeply wrong.</p>
<h3>so why does it go wrong?</h3>
<p>In fact Fiona@<a href="http://www.lovefibre.com/" target="_blank">lovefibre</a> has found Time Machine flawless for her desktop machine backing up to exactly the same <a href="http://www.apple.com/timecapsule/" target="_blank">Time Capsule</a>.  I am guessing the problem I have is because I use a laptop so possible reasons:</p>
<ul>
<li>it may go to sleep occasionally, breaking connection to the Time Capsule</li>
<li>maybe the WiFi aerial on a laptop is not as good as the desktop</li>
</ul>
<p>However, if every laptop failed as often surely Apple would have fixed it by now.  So guessing there is an additional factor:</p>
<ul>
<li>my disk has 196 GB of data, much of it in smaller document files (word docs, code files, etc.), not just a few giant movies.</li>
</ul>
<p>The software will be designed to withstand a certain amount of external failure, especially when connecting to disks over WiFi as the Time Capsule is designed to do.  However, I imagine that there are places in the code where there are race conditions, or critical portions where external failure really makes a difference.  If the external connections are reliable and the backup is quite fast the likelihood of hitting one of the nasty spots in the code is low.  However, if you have a lot of data to check and then transfer and the external failures are more frequent, then the likelihood of hitting one increases and things start to go wrong.</p>
<p>I see similar problems with other software, Dreamweaver in particular, which has got better, but still can crash if the Internet connection is poor (see also &#8220;<a href="http://www.alandix.com/blog/2008/06/21/why-software-need-never-hang/" target="_blank">Why software need never hang</a>&#8221;).  What happens is that during testing, the <span style="color: #000000;">test machines  often have minimal data, little software (maybe just the operating system and what is being tested), and  operate in perfect situations.  In such circumstances these hidden flaws never become apparent.<br />
</span></p>
<p style="padding-left: 30px;"><span style="color: #000080;"><strong>Lesson 3: </strong></span><span style="color: #993366;">make sure your test machine is fully loaded with data and applications, and operates in an unreliable environment, so that testing is realistic<br />
</span></p>
<p>However, this is not like Word crashing and losing your most recent edits to one document.  When Time Machine fails it seems to occasionally leave something corrupt in the backup disk so that subsequent attempts to backup also fail.  There is no excuse for this, the techniques for dealing with potential disk-writing failures are well established in both databases and low-level disk management.  For example, one can save a timestamp file at the end of successful operations so that, when  returning to the data, if the timestamp file is not there the software knows something went wrong last time.</p>
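<p>The timestamp-file idea can be sketched in a few lines. The marker file name and both function names are invented for this example, and a real backup tool would need far more care (syncing to disk, handling partial files); the point is only the ordering: remove the marker first, write it last.</p>

```python
import os
import tempfile

MARKER = "backup.complete"  # hypothetical marker file name


def run_backup(dest_dir, files):
    """Copy `files` (name -> bytes) into dest_dir, bracketed by a completion marker."""
    marker = os.path.join(dest_dir, MARKER)
    if os.path.exists(marker):
        os.remove(marker)  # from here on the backup counts as unfinished
    for name, data in files.items():
        with open(os.path.join(dest_dir, name), "wb") as f:
            f.write(data)
    # Only reached if every write above succeeded.
    with open(marker, "w") as f:
        f.write("ok\n")


def last_backup_completed(dest_dir):
    """True only if the previous run got all the way to the end."""
    return os.path.exists(os.path.join(dest_dir, MARKER))


demo_dir = tempfile.mkdtemp()
run_backup(demo_dir, {"notes.txt": b"hello"})
print(last_backup_completed(demo_dir))  # True
```

<p>If a run is interrupted part way, the marker is missing on the next start, so the software knows not to trust what it finds and can fall back to the last good state.</p>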
<p>Maybe Time Machine is trying to be too clever, picking up where it left off when, for example, connection to the disk is broken.  If so it clearly needs some additional mechanism to notice &#8220;I&#8217;ve tried this several times and it keeps going wrong, maybe I need to back off to the last successful state&#8221;.  Perhaps not something to worry about in less critical software, but not difficult to get right when it is really needed &#8230; as in backups!</p>
<p style="padding-left: 30px;"><span style="color: #000080;"><strong>Lesson 4: </strong></span><span style="color: #993366;">build critical software defensively in layers so that errors in one part do not affect the whole; and if saving to disk ensure there is some sort of atomic transaction<br />
</span></p>
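<p>For a single file, one common way to get the &#8216;atomic transaction&#8217; of Lesson 4 is to write to a temporary file and rename it into place. The helper below is a sketch under that assumption (the name <code>atomic_write</code> is made up); it relies on a rename within one filesystem being atomic, which holds on POSIX systems, and says nothing about how Time Machine itself writes data.</p>

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace `path` with `data` (bytes) so readers see the old or the new file, never half."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # push the bytes to disk before the rename
        os.replace(tmp_path, path)
    except BaseException:
        # Something failed: leave the original untouched and clean up.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


demo_dir = tempfile.mkdtemp()
target = os.path.join(demo_dir, "state.bin")
atomic_write(target, b"version 1")
atomic_write(target, b"version 2")
with open(target, "rb") as f:
    print(f.read())  # b'version 2'
```

<p>If the write fails half way, the previous version of the file is still there, which is exactly the property a backup needs.</p>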
<p>The aim during testing should be what I call &#8220;fail-fast programming&#8221; trying to make sure that failures happen during testing not real use!</p>
<p>One thing I found particularly disturbing about my most recent Time Machine hang is that when I looked at the system console it had regular spats of &#8220;unknown SIGSEGV&#8221; several times a minute &#8230; in the kernel!  If you don&#8217;t know UNIX internals the &#8216;kernel&#8217; is the heart of the operating system of the Mac, where all the lowest level work is done and where if something goes wrong <em>everything</em> fails.  SIGSEGV means that some bit of software is trying to access a memory location that doesn&#8217;t exist.  In fact while this is caught it is not so bad, the greater worry is that if it is trying to access non-existent memory, then it may corrupt other memory &#8230; and the kernel has access to everything &#8211; not good.</p>
<p style="padding-left: 30px;"><a href="http://www.alandix.com/images/time-machine-backup-console.png" target="_blank"><img class="alignnone" style="border: 1px solid black;" title="console with kernel SIGSEGV - segment violation!" src="http://www.alandix.com/images/time-machine-backup-console.png" alt="" width="342" height="139" /></a></p>
<p>Please, <em>please</em> Apple if you cannot get Time Machine to work properly, do not let it affect the kernel!</p>
<h3>how to put it right</h3>
<p>One might hope that even if Time Machine cannot notice itself there is something wrong at least there would be an option to say &#8220;restart yourself&#8221;.  One might hope, but there is not.  However, you can do it yourself by digging a little into the backup disk itself.</p>
<p>First problem is to stop the Time Machine backup if it has hung.</p>
<p>In the Time Machine control panel, you can simply slide the OFF-ON button to OFF.  The status <em>should</em> change to &#8216;stopping&#8217; and after a while stop.  Then you can restart the machine and try to fix things.</p>
<p style="padding-left: 30px;"><img class="alignnone" title="Time Machine on-off button" src="http://www.alandix.com/images/time-machine-on-off.png" alt="" width="139" height="49" /></p>
<p>This is the ideal thing to do, but I find that when Time Machine is really hung this rarely works.  I do turn it to OFF, but either it never changes to &#8216;stopping&#8217; and stays &#8216;preparing&#8217;, or it changes to &#8216;stopping&#8217;, but never does.  If this happens the system restart typically doesn&#8217;t restart the system as Time Machine won&#8217;t stop running.  Then, always with much trepidation, I reach for the on/off button on the Mac itself :-/</p>
<p style="padding-left: 30px;"><img class="alignnone" title="MacBook Pro power button" src="http://www.alandix.com/images/macbook-power-button.jpg" alt="" width="119" height="97" /></p>
<p>After doing a hard on/off like this, I usually do another restart from the Apple menu &#8230; not sure if this is necessary, but just to be on the safe side!</p>
<p>Occasionally I skip to the next step before the hard restart.</p>
<p>Then you can start to fix the problem properly.</p>
<p>Find the backup disk.  If it is not obvious in the Finder use the &#8216;Go&#8217; menu and select &#8220;Computer&#8221;; it shows all the locally connected disks (or it may simply appear in the left hand favourites pane in each Finder window).</p>
<p>If you skipped the restart stage (or if you just peek now to see what it is like when it hasn&#8217;t gone wrong), you will see something like &#8220;Backup of Alan Dix’s MacBook Pro&#8221; (obviously for you it will not be &#8220;Alan Dix&#8217;s MacBook Pro&#8221;!).  This is the Time Machine backup.  However, if you have restarted the machine with Time Machine off you will have to find the actual disk that you chose as your backup disk and on it look for a file called something like &#8220;Alan Dix’s MacBook  Pro_0039fc56f8a2.sparsebundle&#8221;.  This is some form of compressed disk image.  In the older versions of Time Machine there was simply a folder with all the backups in it &#8212; I felt much more secure.  Now this is a single opaque file and I worry about what happens if one day it gets corrupted :-/</p>
<p>Having found the &#8216;sparsebundle&#8217; double click it and it will display a little pop-up window that says &#8216;checking volumes&#8217;.  I keep meaning to see if this ever stops, but I am not patient enough and press the button that says to skip this state and then (after a while) it mounts the disk image and the disk &#8220;Backup of Alan Dix’s MacBook Pro&#8221; appears.</p>
<p>Double click &#8220;Backup of Alan Dix’s MacBook Pro&#8221; and look inside and then inside the folder &#8220;Backups.backupdb&#8221; and you find loads of dated folders, which are the actual backups of your system that you can browse if you prefer instead of using the Time Machine interface.  In addition there may be one file ending &#8220;.inProgress&#8221;, which is some sort of internal file created while it is in the middle of doing the backup.</p>
<p style="padding-left: 30px;"><a href="http://www.alandix.com/images/time-machine-volume.png" target="_blank"><img class="alignnone" title="Time Machine backup volume" src="http://www.alandix.com/images/time-machine-volume.png" alt="" width="577" height="130" /></a></p>
<p>Delete the &#8220;.inProgress&#8221; file.</p>
<p>In addition, I usually delete the last of the dated folders (sort by &#8220;Date Modified&#8221; to get the last one).  However, if you don&#8217;t want to lose the last backup you can try just deleting the &#8220;inProgress&#8221; file and only delete the last dated backup if Time Machine still gets stuck.</p>
<p style="padding-left: 30px;"><span style="color: #000080;"><strong>Important: </strong></span><span style="color: #993366;">only delete the latest of the dated backup folders (e.g. &#8220;2010-06-09-225547&#8221; in the screen shot above), NOT the entire &#8220;Alan Dix&#8217;s MacBook Pro&#8221; folder.  If you do that you lose all your backups!<br />
</span></p>
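<p>Before deleting anything by hand, it can be reassuring to list what is there. The following read-only sketch reports any &#8220;.inProgress&#8221; entries and the most recent dated folder; it deletes nothing. The function name is hypothetical, the folder to inspect is whatever you pass in, and the date pattern is an assumption based on names like &#8220;2010-06-09-225547&#8221;.</p>

```python
import re
from pathlib import Path

# Assumed pattern for dated backup folders, e.g. "2010-06-09-225547".
DATED = re.compile(r"^\d{4}-\d{2}-\d{2}-\d{6}$")


def inspect_backups(backup_dir):
    """Return (names ending in .inProgress, name of the latest dated folder or None)."""
    names = [entry.name for entry in Path(backup_dir).iterdir()]
    in_progress = sorted(n for n in names if n.endswith(".inProgress"))
    # Names in this date format sort chronologically when sorted as plain text.
    dated = sorted(n for n in names if DATED.match(n))
    latest = dated[-1] if dated else None
    return in_progress, latest
```

<p>You would call it with the path of the folder of dated backups on the mounted backup volume, and then decide for yourself what, if anything, to remove.</p>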
<p>I recall doing this all with extreme trepidation the first time, but had got to the point when I couldn&#8217;t do backups or access them anyway so had nothing to lose.  Actually it seems pretty OK getting in here and doing this sort of thing, the nice thing about Time Machine is that it uses ordinary folder structures that you can peek around in and see are there all secure.  I am much happier with this than the kind of backup where you only know if it is working the day you try to restore something!  At least half the times I have used such backups over the years I&#8217;ve found the backup is in some way corrupt or incomplete. So actually one up for Time Machine <img src='http://www.alandix.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<p>Now reboot again (for luck).  Turn Time Machine back on in the control panel and wait &#8230; a long time &#8230; it will start &#8216;preparing&#8217; as if for the first backup &#8230; and several hours later hopefully all will be well.</p>
<p>But do remember to set the power save options not to go to sleep in the middle!</p>
<p>In fact the above has always worked for me except for this last time when, for some reason (maybe I missed something on the way?), it hung again and I had to go through the whole process again.  This time I waited until yesterday evening before turning Time Machine back on so that I could leave it to do the long 4 hour &#8216;preparing&#8217; stage without me doing anything else.</p>
<p>And then:</p>
<p style="padding-left: 30px;"><a href="http://www.alandix.com/images/time-machine-control-panel-success.png" target="_blank"><img class="alignnone" title="Time Machine success!" src="http://www.alandix.com/images/time-machine-control-panel-success.png" alt="" width="294" height="195" /></a></p>
<p>Joy!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.alandix.com/blog/2010/07/09/time-machine-when-it-goes-wrong-and-how-to-fix-it/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>microsoft makes things easy</title>
		<link>http://www.alandix.com/blog/2010/04/15/microsoft-makes-things-easy/</link>
		<comments>http://www.alandix.com/blog/2010/04/15/microsoft-makes-things-easy/#comments</comments>
		<pubDate>Thu, 15 Apr 2010 16:40:33 +0000</pubDate>
		<dc:creator>alan</dc:creator>
				<category><![CDATA[academic]]></category>
		<category><![CDATA[HCI and usability]]></category>
		<category><![CDATA[web development]]></category>
		<category><![CDATA[bugs]]></category>
		<category><![CDATA[microsoft]]></category>

		<guid isPermaLink="false">http://www.alandix.com/blog/?p=243</guid>
		<description><![CDATA[I&#8217;m so glad that Microsoft&#8217;s conference management service allows you to prepare reviews offline, in order to make reviewing  easy &#8230;]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m so glad that Microsoft&#8217;s conference management service allows you to prepare reviews offline, in order to make reviewing  easy &#8230;</p>
<p><img class="alignnone" style="border: 1px solid black;" title="Uploaded file was not correctly formatted: An error occurred while parsing EntityName. Line 36, position 48." src="http://www.alandix.com/images/microsoft-cmt-error.gif" alt="" width="768" height="92" /></p>
]]></content:encoded>
			<wfw:commentRss>http://www.alandix.com/blog/2010/04/15/microsoft-makes-things-easy/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>the plague of bugs</title>
		<link>http://www.alandix.com/blog/2010/01/01/the-plague-of-bugs/</link>
		<comments>http://www.alandix.com/blog/2010/01/01/the-plague-of-bugs/#comments</comments>
		<pubDate>Fri, 01 Jan 2010 13:31:36 +0000</pubDate>
		<dc:creator>alan</dc:creator>
				<category><![CDATA[academic]]></category>
		<category><![CDATA[HCI and usability]]></category>
		<category><![CDATA[backward compatibility]]></category>
		<category><![CDATA[bugs]]></category>
		<category><![CDATA[coding]]></category>
		<category><![CDATA[google]]></category>
		<category><![CDATA[microsoft]]></category>
		<category><![CDATA[software development]]></category>
		<category><![CDATA[Word]]></category>

		<guid isPermaLink="false">http://www.alandix.com/blog/?p=221</guid>
		<description><![CDATA[Like some Biblical locust swarm, every attempt to do anything is thwarted by the dead weight of innumerable bugs! This time I was trying &#8230; and failing &#8230; to upload a Word file into Google docs. I uploaded the docx file and it said the file was unreadable, tried saving it as .doc, and when [...]]]></description>
			<content:encoded><![CDATA[<p>Like some Biblical locust swarm, every attempt to do anything is thwarted by the dead weight of innumerable bugs!  This time I was trying &#8230; and failing &#8230; to upload a Word file into Google docs.  I uploaded the docx file and it said the file was unreadable, tried saving it as .doc, and when that failed created an rtf file.  Amazingly from a 1 Meg word file the rtf was 66 Meg, but very very slowly Google docs did upload the file and when it was eventually all uploaded &#8230;</p>
<p><img class="aligncenter" title="Google Docs upload failed!" src="http://www.alandix.com/images/google-docs-upload.png" alt="" width="565" height="72" /></p>
<p>To be fair the same document imports pretty badly into Pages (all the headings disappear).  I think this is because it is originally a 2003 Word file and gets corrupted when the new Word reads it.</p>
<p>Now I have griped before about <a href="http://www.alandix.com/blog/2008/06/16/pain-tears-and-office-2008/" target="_blank">backward compatibility issues for Word</a>, and in general about lack of robustness in many leading products, and to add to my woes, for the last month or so (I guess after a software update) Word has decided not to show its formatting menus on an opened document unless I first hide them, then show them, and then maximise the window. Mostly these things are annoying, sometimes really block work, and always waste time and destroy the flow of work.</p>
<p>However, rather than grousing once again (well I already have a bit), I am trying to make sense of this.  For some time it has been apparent that software is fundamentally breaking down, in that with every new version there is minimal new useful functionality, but more bugs.  This may be simply issues of scale, of the training of programmers, or of the nature of development processes.  Indeed in the talk I gave a bit over a year ago to PPIG, &#8220;<a href="http://www.hcibook.com/alan/papers/PPIG2008-as-we-may-code/" target="_blank">as we may code</a>&#8221;, I noted that coding in the 21st Century seems to be radically different, more about finding tricks and community know-how and less about problem solving.</p>
<p>Whatever the reason, I don&#8217;t think the Biblical plague of bugs is simply due to laziness or indifference on the part of large vendors such as  Microsoft and Adobe, but is symptomatic of a deeper crisis in software development, certainly where there is a significant user interface.</p>
<p>Maybe this is simply an inevitable consequence of scale, but more optimistically I wonder if there are new ways of coding, new paradigms or new architectural models.  Can 2010 be the decade when software is reborn?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.alandix.com/blog/2010/01/01/the-plague-of-bugs/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>fix for WordPress shortcode bug</title>
		<link>http://www.alandix.com/blog/2009/07/26/fix-for-wordpress-shortcode-bug/</link>
		<comments>http://www.alandix.com/blog/2009/07/26/fix-for-wordpress-shortcode-bug/#comments</comments>
		<pubDate>Sun, 26 Jul 2009 15:56:04 +0000</pubDate>
		<dc:creator>alan</dc:creator>
				<category><![CDATA[academic]]></category>
		<category><![CDATA[web development]]></category>
		<category><![CDATA[bugs]]></category>
		<category><![CDATA[debugging]]></category>
		<category><![CDATA[php]]></category>
		<category><![CDATA[shortcodes]]></category>
		<category><![CDATA[wordpress]]></category>

		<guid isPermaLink="false">http://www.alandix.com/blog/?p=196</guid>
		<description><![CDATA[I&#8217;m starting to use shortcodes heavily in WordPress as we are using it internally on the DEPtH project to coordinate our new TouchIT book.  There was a minor bug which meant that HTML tags came out unbalanced (e.g. &#8220;&#60;p&#62;&#60;/div&#62;&#60;/p&#62;&#8221;). I&#8217;ve just been fixing it and posting a patch; interestingly the bug was partly due to the [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m starting to use <a href="http://codex.wordpress.org/Shortcode_API" target="_blank">shortcodes</a> heavily in <a href="http://wordpress.org/" target="_blank">WordPress</a><sup><a href="#footnote-1-196" id="footnote-link-1-196" title="See the footnote.">1</a></sup> as we are using it internally on the <a href="http://www.physicality.org/DEPtH/" target="_blank">DEPtH project</a> to coordinate our new <a href="http://www.physicality.org/TouchIT/" target="_blank">TouchIT book</a>.  There was a minor bug which meant that HTML tags came out unbalanced (e.g. &#8220;&lt;p&gt;&lt;/div&gt;&lt;/p&gt;&#8221;).</p>
<p>I&#8217;ve just been fixing it and posting a patch<sup><a href="#footnote-2-196" id="footnote-link-2-196" title="See the footnote.">2</a></sup>.  Interestingly, the bug was partly due to the fact that numbered <a href="http://uk3.php.net/manual/en/regexp.reference.back-references.php" target="_blank">back-references</a> in regular expressions count groups from the beginning of the whole regular expression, making it impossible to use them if the expression may be &#8216;glued&#8217; into a larger one &#8230; lack of referential transparency!</p>
<p>For anyone having similar problems, full details and patch below (all WP and PHP techie stuff).</p>
<p><span id="more-196"></span></p>
<p>The regex returned by get_shortcode_regex() in <a href="http://svn.automattic.com/wordpress/tags/2.8.1/wp-includes/shortcodes.php" target="_blank">shortcodes.php</a> had a back-reference &#8216;\[\/\2\]&#8217; to match balanced tag-end-tag pairs such as &#8220;[x]text[/x]&#8221;.  The regex assumes it will be used &#8216;bare&#8217; as a regular expression.  However, wpautop in <a href="http://svn.automattic.com/wordpress/tags/2.8.1/wp-includes/formatting.php" target="_blank">formatting.php</a> (which turns newlines in posts into &lt;p&gt;s or &lt;br&gt;s) needs to remove &lt;p&gt;s from standalone shortcodes. To do this it adds extra enclosing brackets to the shortcode regex to be able to refer to the entire matched tags and content.  This shifts the group numbering by one, so the back-reference \2 no longer points at the tag name but at the (optional) preceding character, which is usually blank, and thus the closing tag does not match properly.</p>
<p>In the case of a single tag-end-tag pair, this means that</p>
<pre style="padding-left: 30px;"> [divtag]</pre>
<pre style="padding-left: 30px;"> abc</pre>
<pre style="padding-left: 30px;"> [/divtag]
</pre>
<p>ends up as</p>
<pre style="padding-left: 30px;"> &lt;div&gt;</pre>
<pre style="padding-left: 30px;"> &lt;p&gt;abc&lt;/p&gt;</pre>
<pre style="padding-left: 30px;"> &lt;p&gt;&lt;/div&gt;&lt;/p&gt;
</pre>
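<p>The back-reference shift described above can be reproduced outside WordPress.  What follows is a minimal Python sketch, not the WordPress code: the pattern is a cut-down stand-in for the shortcode regex, the tag names are hypothetical, and it assumes only that the Python re module numbers groups by opening bracket in the same way.  The named-group variant at the end is one possible way to make a pattern safe to embed; it is an illustration, not what the patch below does.</p>

```python
import re

# Cut-down stand-in for the shortcode regex (tag names are hypothetical).
# Group 1 is the optional preceding character, group 2 the tag name,
# and \2 is meant to match the same tag name in the closing tag.
SHORTCODE = r'(.?)\[(divtag|mytag)\](?:(.+?)\[/\2\])?'

# Same idea with a named group, so the reference does not depend on position.
NAMED = r'(.?)\[(?P<tag>divtag|mytag)\](?:(.+?)\[/(?P=tag)\])?'

def matched_text(pattern, text):
    """Return the text matched by pattern, or None if there is no match."""
    m = re.search(pattern, text, re.S)
    return m.group(0) if m else None

text = '[divtag]abc[/divtag]'

bare = matched_text(SHORTCODE, text)                       # used on its own
wrapped = matched_text('(' + SHORTCODE + ')', text)        # glued into a larger group
wrapped_named = matched_text('(' + NAMED + ')', text)

print(bare)           # whole tag pair
print(wrapped)        # only the opening tag: \2 now refers to the '(.?)' group
print(wrapped_named)  # whole tag pair again
```

<p>Once wrapped, the numbered version silently stops matching the closing tag, which is the behaviour that left the stray &lt;p&gt; around [/divtag].</p>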
<p>This gets more complicated when there are internal tags such as:</p>
<p style="padding-left: 30px;">
<pre style="padding-left: 30px;"> [mytag] some [specialsymbol] text [/mytag]</pre>
<p>As in some cases the existing .*? for attribute matching chomps everything while it tries to find a &#8216;/&#8217;.</p>
<p>A simple fix would be to add an optional parameter to get_shortcode_regex($nested=0) and use this to modify the regular expression.  This would give the intended match for the shortcode regex, but in fact makes things worse. Looking at the same source:</p>
<pre style="padding-left: 30px;"> [divtag]</pre>
<pre style="padding-left: 30px;"> abc</pre>
<pre style="padding-left: 30px;"> [/divtag]</pre>
<p>it would generate (assuming [divtag] generates a div):</p>
<pre style="padding-left: 30px;"> &lt;div&gt;&lt;/p&gt;</pre>
<pre style="padding-left: 30px;"> &lt;p&gt;abc&lt;/p&gt;</pre>
<pre style="padding-left: 30px;"> &lt;p&gt;&lt;/div&gt;</pre>
<p>The attached patch addresses this by adding two new functions to shortcodes.php returning regular expressions for matching begin and end tags separately.  wpautop then has two separate preg_replace lines doing the &lt;p&gt; fixes.  I&#8217;ve tried to change as little as possible, although I might sometime return to do things like extend it to allow nested copies of the same tag.</p>
<p>At the same time as doing the above bug fix, I swopped the .*? for attribute matching to [^\]]*?<br />
I guess it is slightly slower, but it doesn&#8217;t risk chomping the entire post as a shortcode attribute :-/</p>
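<p>As a rough illustration of the two-pass idea, here is a Python sketch (the patch itself is PHP and is given in full below).  The tag list and the function name unwrap_standalone_tags are hypothetical; the two patterns only loosely mirror the start and end regexes in the patch.</p>

```python
import re

TAGS = ['divtag', 'mytag']  # hypothetical registered shortcode names
TAG_RE = '|'.join(re.escape(t) for t in TAGS)

# Separate patterns for opening and closing tags; neither needs a back-reference,
# so both can safely be embedded in a larger expression.
START_TAG = r'\[(?:' + TAG_RE + r')\b[^\]]*?/?\]'
END_TAG = r'\[/(?:' + TAG_RE + r')\]'

def unwrap_standalone_tags(html):
    """Remove <p>...</p> wrappers around shortcode tags that stand alone."""
    for tag_re in (START_TAG, END_TAG):
        html = re.sub(r'<p>\s*(' + tag_re + r')\s*</p>', r'\1', html)
    return html

wrapped = '<p>[divtag]</p>\n<p>abc</p>\n<p>[/divtag]</p>'
print(unwrap_standalone_tags(wrapped))
```

<p>Ordinary paragraphs such as the one holding abc are left alone; only the paragraphs consisting solely of a shortcode tag are unwrapped.</p>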
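<p>The difference between the two attribute patterns shows up when something has to follow the tag, as it does inside wpautop.  This is a small Python sketch with made-up input, assuming a pattern that insists on a closing &lt;/p&gt; straight after the tag; it is not the WordPress regex itself.</p>

```python
import re

# Both patterns expect a standalone opening tag wrapped in a paragraph.
LAZY_DOT = r'<p>\[(mytag)\b(.*?)\]</p>'       # attribute part may contain ']'
NEGATED = r'<p>\[(mytag)\b([^\]]*?)\]</p>'    # attribute part stops at the first ']'

def attribute_text(pattern, text):
    """Return what the attribute group captured, or None if nothing matched."""
    m = re.search(pattern, text, re.S)
    return m.group(2) if m else None

inline = '<p>[mytag size=1] some [other] text [/mytag]</p>'
standalone = '<p>[mytag size=1]</p>'

print(repr(attribute_text(LAZY_DOT, inline)))     # swallows up to the last ']'
print(repr(attribute_text(NEGATED, inline)))      # no match: tag is not standalone
print(repr(attribute_text(NEGATED, standalone)))  # just the attributes
```

<p>The lazy .*? is only lazy until the rest of the pattern fails; it then keeps growing across closing brackets, which is how a whole stretch of post can end up treated as an attribute.</p>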
<p><strong>Download: </strong> <a href="http://www.alandix.com/blog/wp-content/uploads/2009/07/shortcodes_wpautop_patch_2009_07_26.diff">shortcodes patch</a><br />
Instructions for installing patches at <a href="http://markjaquith.com/" target="_blank">Mark Jaquith</a>&#8217;s entry on <a href="http://markjaquith.wordpress.com/2005/11/02/my-wordpress-toolbox/" target="_blank">My WordPress Toolbox</a>.</p>
<p>The patch in full:</p>
<pre class="brush: php; title: ;">

Index: wp-includes/shortcodes.php
===================================================================
--- wp-includes/shortcodes.php    (revision 11744)
+++ wp-includes/shortcodes.php    (working copy)
@@ -175,10 +175,40 @@
 $tagnames = array_keys($shortcode_tags);
 $tagregexp = join( '|', array_map('preg_quote', $tagnames) );

-    return '(.?)\[('.$tagregexp.')\b(.*?)(?:(\/))?\](?:(.+?)\[\/\2\])?(.?)';
+    return '(.?)\[('.$tagregexp.')\b([^\]]*?)(?:(\/))?\](?:(.+?)\[\/\2\])?(.?)';
 }

 /**
+ * Retrieve the shortcode regular expression for searching for start tag only.
+ *
+ * @uses $shortcode_tags
+ *
+ * @return string The shortcode start tag search regular expression
+ */
+function get_shortcode_start_regex() {
+    global $shortcode_tags;
+    $tagnames = array_keys($shortcode_tags);
+    $tagregexp = join( '|', array_map('preg_quote', $tagnames) );
+
+    return '\[('.$tagregexp.')\b([^\]]*?)(?:(\/))?\]';
+}
+
+/**
+ * Retrieve the shortcode regular expression for searching for end tag only.
+ *
+ * @uses $shortcode_tags
+ *
+ * @return string The shortcode end tag search regular expression
+ */
+function get_shortcode_end_regex() {
+    global $shortcode_tags;
+    $tagnames = array_keys($shortcode_tags);
+    $tagregexp = join( '|', array_map('preg_quote', $tagnames) );
+
+    return '\[\/('.$tagregexp.')\]';
+}
+
+/**
 * Regular Expression callable for do_shortcode() for calling shortcode hook.
 * @see get_shortcode_regex for details of the match array contents.
 *
Index: wp-includes/formatting.php
===================================================================
--- wp-includes/formatting.php    (revision 11744)
+++ wp-includes/formatting.php    (working copy)
@@ -170,7 +170,8 @@
 if (strpos($pee, '&lt;pre') !== false)
 $pee = preg_replace_callback('!(&lt;pre[^&gt;]*&gt;)(.*?)&lt;/pre&gt;!is', 'clean_pre', $pee );
 $pee = preg_replace( &quot;|\n&lt;/p&gt;$|&quot;, '&lt;/p&gt;', $pee );
-    $pee = preg_replace('/&lt;p&gt;\s*?(' . get_shortcode_regex() . ')\s*&lt;\/p&gt;/s', '$1', $pee); // don't auto-p wrap shortcodes that stand alone
+    $pee = preg_replace('/&lt;p&gt;\s*?(' . get_shortcode_start_regex() . ')\s*&lt;\/p&gt;/s', '$1', $pee); // don't auto-p wrap shortcodes that stand alone
+    $pee = preg_replace('/&lt;p&gt;\s*?(' . get_shortcode_end_regex() . ')\s*&lt;\/p&gt;/s', '$1', $pee);   // check both start and end tags

 return $pee;
 }
</pre>
<br /><ol class="footnotes"><li id="footnote-1-196">see section &#8220;using dynamic binding&#8221; in <a href="http://www.alandix.com/blog/2009/07/20/whats-wrong-with-dynamic-binding/" target="_blank">What’s wrong with dynamic binding?</a>  [<a href="#footnote-link-1-196">back</a>]</li><li id="footnote-2-196"><a href="http://core.trac.wordpress.org/ticket/10490" target="_blank">TRAC ticket #10490</a>  [<a href="#footnote-link-2-196">back</a>]</li></ol>]]></content:encoded>
			<wfw:commentRss>http://www.alandix.com/blog/2009/07/26/fix-for-wordpress-shortcode-bug/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>PPIG2008 and the twenty first century coder</title>
		<link>http://www.alandix.com/blog/2008/09/16/ppig2008-and-the-twenty-first-century-coder/</link>
		<comments>http://www.alandix.com/blog/2008/09/16/ppig2008-and-the-twenty-first-century-coder/#comments</comments>
		<pubDate>Tue, 16 Sep 2008 17:36:22 +0000</pubDate>
		<dc:creator>alan</dc:creator>
				<category><![CDATA[academic]]></category>
		<category><![CDATA[HCI and usability]]></category>
		<category><![CDATA[web development]]></category>
		<category><![CDATA[bugs]]></category>
		<category><![CDATA[debugging]]></category>
		<category><![CDATA[HCI]]></category>
		<category><![CDATA[human computer interaction]]></category>
		<category><![CDATA[mathematics]]></category>
		<category><![CDATA[maths]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[software engineering]]></category>
		<category><![CDATA[tagging]]></category>
		<category><![CDATA[techie]]></category>
		<category><![CDATA[web]]></category>
		<category><![CDATA[web2.0]]></category>

		<guid isPermaLink="false">http://www.alandix.com/blog/?p=95</guid>
		<description><![CDATA[Last week I was giving a keynote at the annual workshop PPIG2008 of the Psychology of Programming Interest Group.   Before I went I was politely pronouncing this pee-pee-eye-gee … however, when I got there I found the accepted pronunciation was pee-pig … hence the logo! My own keynote at PPIG2008 was &#8220;as we may code: [...]]]></description>
			<content:encoded><![CDATA[<p>Last week I was giving a keynote at the annual workshop <a href="http://www.cs.st-andrews.ac.uk/~jr/ppig08/" target="_blank">PPIG2008</a> of the <a href="http://www.ppig.org/" target="_blank">Psychology of Programming Interest Group</a>.   Before I went I was politely pronouncing this pee-pee-eye-gee … however, when I got there I found the accepted pronunciation was pee-pig … hence the logo!</p>
<p><a href="http://www.ppig.org/" target="_blank"><img class="alignright" title="two pigs" src="http://www.cs.st-andrews.ac.uk/~jr/ppig08/img/pair-100.JPG" alt="" width="100" height="125" align="right" /></a></p>
<p>My own keynote at PPIG2008 was &#8220;<a title="as we may code - web pages" href="http://www.hcibook.com/alan/papers/PPIG2008-as-we-may-code/" target="_blank">as we may code: the art (and craft) of computer programming in the 21st century</a>&#8221; and was an exploration of the changes in coding from 1968 when <a title="Wikipedia.org: Donald Knuth" href="http://en.wikipedia.org/wiki/Donald_Knuth" target="_blank">Knuth</a> published the first of his books on &#8220;<a title="Wikipedia: the art of computer programming" href="http://en.wikipedia.org/wiki/The_Art_of_Computer_Programming" target="_blank">the art of computer programming</a>&#8221;.  On the <a title="as we may code - web pages" href="http://www.hcibook.com/alan/papers/PPIG2008-as-we-may-code/" target="_blank">web site for the talk</a> I&#8217;ve made a relatively unstructured list of some of the distinctions I&#8217;ve noticed between 20th and 21st Century coding (C20 vs. C21); and in <a href="http://www.comp.lancs.ac.uk/~dixa/papers/PPIG2008-as-we-may-code/as-we-may-code-v2.pdf" target="_blank">my slides</a> I have started to add some more structure.  In general we have a move from a more mathematical, analytic, problem-solving approach, to something more akin to a search task, finding the right bits to fit together with a greater need for information management and social skills. Both this characterisation and the list are, of course, a gross simplification, but seem to capture some of the change of spirit.  These changes suggest different cognitive issues to be explored and maybe different personality types involved &#8211; as one of the attendees, <a href="http://www.cs.ncl.ac.uk/people/david.greathead" target="_blank">David Greathead</a>, pointed out, rather like the judging vs. perceiving personality distinction in Myers-Briggs<sup><a href="#footnote-1-95" id="footnote-link-1-95" title="See the footnote.">1</a></sup>.</p>
<p>One interesting comment on this was from <a href="http://mcs.open.ac.uk/mp8/" target="_blank">Marian Petre</a>, who has studied many professional programmers.  Her impression, echoed by others, was that the heavy-hitters were the more experienced programmers who had adapted to newer styles of programming, whereas  the younger programmers found it harder to adapt the other way when they hit difficult problems.  Another attendee suggested that perhaps I was focused more on application coding and that system coding and system programmers were still operating in the C20 mode.</p>
<p>The social nature of modern coding came out in several papers about agile methods and pair programming.  As well as being an important phenomenon in its own right, pair programming gives a level of think-aloud  &#8216;for free&#8217;, so maybe this will also cast light on individual coding.</p>
<p><a href="http://webhome.cs.uvic.ca/%7Emstorey/" target="_blank">Margaret-Anne Storey</a> gave a fascinating keynote about the use of comments and annotations in code and again this picks up the social nature of code as she was studying open-source coding where comments are often for other people in the community, maybe explaining actions, or suggesting improvements.  She reviewed a lot of material in the area and I was especially interested in one result that showed that <em>novice</em> programmers with <em>small</em> pieces of code found method comments more useful than class comments.  Given my own frequent complaint that code is inadequately <em>documented</em> at the class or higher level, this appeared to disagree with my own impressions.  However, in discussion it seemed that this was probably accounted for by differences in context: novice vs. expert programmers, small vs large code, internal comments vs. external documentation.  One of the big problems I find is that the way different classes work together to produce effects is particularly poorly documented.  Margaret-Anne described one system her group had worked on<sup><a href="#footnote-2-95" id="footnote-link-2-95" title="See the footnote.">2</a></sup> that allowed you to write a tour of your code opening windows, highlighting sections, etc.</p>
<p>I sadly missed some of the presentations as I had to go to other meetings (the danger of a conference at your home site!), but I did get to some and  was particularly fascinated by the more theoretical/philosophical session including one paper addressing the psychological origins of the notions of objects and another focused on (the dangers of) abstraction.</p>
<p>The latter, presented by <a href="http://www.lukechurch.net/" target="_blank">Luke Church</a>, critiqued  <a href="http://www.cs.cmu.edu/~wing/" target="_blank">Jeannette Wing</a>&#8217;s 2006 CACM paper on <a href="http://portal.acm.org/citation.cfm?id=1227504.1227378" target="_blank">Computational Thinking</a>.  This is evidently a &#8216;big thing&#8217; with loads of funding and hype &#8230; but one that I had entirely missed :-/ Basically the idea is to translate the ways that one thinks about computation to problems other than computers &#8211; nerds rule OK. The tenets of computational thinking seem to overlap a lot with management thinking and also reminded me of the way my own HCI community and also parts of the Design (with capital D) community in different ways are trying to say that we/they are the universal discipline  &#8230; well if we don&#8217;t say it about our own discipline who will &#8230; the physicists have been getting away with it for years <img src='http://www.alandix.com/blog/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /> </p>
<p>Luke&#8217;s (and his co-authors&#8217;) argument is that abstraction can be dangerous (although of course it is also powerful).  It would be interesting perhaps rather than Wing&#8217;s paper to look at this argument alongside  Jeff Kramer&#8217;s 2007 CACM article &#8220;<a href="http://dx.doi.org/10.1145/1232743.1232745" target="_blank">Is abstraction the key to computing?</a>&#8221;, which I recall liking because it says computer scientists ought to know more mathematics <img src='http://www.alandix.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' />  <img src='http://www.alandix.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<p>I also sadly missed some of <a href="http://www.lancs.ac.uk/staff/mackenza/" target="_blank">Adrian Mackenzie</a>&#8217;s closing keynote &#8230; although this time not due to competing meetings but because I had been up since 4:30am reading a PhD thesis and after lunch on a Friday had begun to flag!  However, this was no reflection on Adrian&#8217;s talk and the bits I heard were fascinating, looking at the way bio-tech is using the language of software engineering.  This sparked a debate relating back to the overuse of abstraction, especially in the case of the genome where interactions between parts are strong and so the software component analogy weak.  It also reminded me of yet another relatively recent paper<sup><a href="#footnote-3-95" id="footnote-link-3-95" title="See the footnote.">3</a></sup> on the way computation can be seen in many phenomena and should not be construed solely as a science of computers.</p>
<p>As well as the academic content it was great to be with the PPIG crowd; they are a small but very welcoming and accepting community &#8211; I don&#8217;t recall anything but constructive and friendly debate &#8230; and next year they have PPIG09 in Limerick &#8211; PPIG and Guinness, what could be better!</p>
<br /><ol class="footnotes"><li id="footnote-1-95">David has done some really interesting work on the relationship between personality types and different kinds of programming tasks.  I&#8217;ve seen him present before about debugging and unfortunately had to miss his talk at PPIG on comprehension.  Given his work has shown clearly that there are strong correlations between certain personality attributes and coding, it would be good to see more qualitative work investigating the nature of the differences.   I&#8217;d like to know whether strategies change between personality types: for example, between systematic debugging and more insight-based &#8216;scan and see it&#8217; bug finding.   [<a href="#footnote-link-1-95">back</a>]</li><li id="footnote-2-95">but I can&#8217;t find on their website <img src='http://www.alandix.com/blog/wp-includes/images/smilies/icon_sad.gif' alt=':-(' class='wp-smiley' />   [<a href="#footnote-link-2-95">back</a>]</li><li id="footnote-3-95">Perhaps 2006/2007 in either CACM or Computer Journal, if anyone knows the one I mean please remind me!  [<a href="#footnote-link-3-95">back</a>]</li></ol>]]></content:encoded>
			<wfw:commentRss>http://www.alandix.com/blog/2008/09/16/ppig2008-and-the-twenty-first-century-coder/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>why software need never hang</title>
		<link>http://www.alandix.com/blog/2008/06/21/why-software-need-never-hang/</link>
		<comments>http://www.alandix.com/blog/2008/06/21/why-software-need-never-hang/#comments</comments>
		<pubDate>Sat, 21 Jun 2008 13:54:19 +0000</pubDate>
		<dc:creator>alan</dc:creator>
				<category><![CDATA[academic]]></category>
		<category><![CDATA[HCI and usability]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[bugs]]></category>
		<category><![CDATA[debugging]]></category>
		<category><![CDATA[HCI]]></category>
		<category><![CDATA[human computer interaction]]></category>
		<category><![CDATA[microsoft]]></category>
		<category><![CDATA[software engineering]]></category>
		<category><![CDATA[time]]></category>
		<category><![CDATA[Word]]></category>

		<guid isPermaLink="false">http://www.alandix.com/blog/?p=81</guid>
		<description><![CDATA[Over 20 years ago I wrote &#8220;The Myth of the Infinitely Fast Machine&#8220;, about the way software developers effectively assume that everything on the machine side of human interaction happens instantly. Often interaction is programmed in a turn-taking style: wait for user action process the event display changes back to step 1 This assumption of [...]]]></description>
			<content:encoded><![CDATA[<p>Over 20 years ago I wrote &#8220;<a href="http://www.comp.lancs.ac.uk/~dixa/papers/hci87/" target="_blank">The Myth of the Infinitely Fast Machine</a>&#8221;, about the way software developers effectively assume that everything on the machine side of human interaction happens instantly.  Often interaction is programmed in a turn-taking style:</p>
<ol>
<li>wait for user action</li>
<li>process the event</li>
<li>display changes</li>
<li>back to step 1</li>
</ol>
<p>This assumption of instant (or at least infinitely fast) response at step 2 often ignores network delays, disk IO or heavy computation. This tends to work fine on a high-spec development or test machine, with a fast network and clean install of all system software &#8230; but when the software hits a real machine, a few years old, untidy system, slow network &#8230; things fall to pieces.</p>
<p>So 20 years later (as I described in my <a title="pain, tears and office 2008" href="http://www.alandix.com/blog/2008/06/16/pain-tears-and-office-2008/" target="_self">post last week</a>) I am sitting watching the spinning rainbow ball as Word struggles to save a document (over an hour now, I think I will need to kill it).  To be fair I think the root &#8216;cause&#8217; of the problem &#8230; or at least one problem &#8230; may be the printer, as the Canon printer driver has never worked properly on an Intel Mac (maybe new driver when I upgrade to Leopard?) and perhaps some change in the rest of the system (maybe the Office install) has tipped it over into not working at all.</p>
<p>As far as I can tell Word then decides to ask the printer things in order to set the margins properly when saving the document, and then gets stuck.  I found a post on a Microsoft forum about a different print related problem and the &#8216;helpful&#8217; tech support from MS simply said &#8220;not our fault, re-install everything&#8221;.</p>
<p>So to recap:</p>
<ul>
<li>user asks Word to save &#8211; probably the most critical operation in the system, or the system auto-saves, again to ensure safety against crashes, so really critical</li>
<li>Word decides it needs information from the printer (although it has been displaying the page to the users using some existing information on page properties).</li>
<li>Word asks for info from the printer driver of the currently selected printer</li>
<li>if the printer doesn&#8217;t respond Word hangs and blocks all user interaction</li>
</ul>
<p>However, the printer driver may be third party, may be connecting to a shared printer hanging off a different network, or in the case of a laptop on a network currently disconnected from the computer &#8230; and any resulting delay is not the fault of the developers of Word??!</p>
<p>The annoying thing is that such &#8216;hanging&#8217; delays need never happen.</p>
<p>Basically there are four main causes for delays:</p>
<ol>
<li>ordinary computation takes a long time due to it being too complex for the available hardware</li>
<li>unbounded internal computation &#8211; for example iterative algorithms</li>
<li>waiting for external resources (disk, network, etc.)</li>
<li>bugs that lead to the system going crazy (effectively case 2 by accident!)</li>
</ol>
<p>Type 1 will surface during testing and may require re-design of the interaction, but is simply &#8216;slow&#8217; rather than &#8216;hanging&#8217;.  Typically it leads to things gradually getting slower as the document or data gets larger or more complicated.  This requires standard profiling and optimisation.</p>
<p>Type 4 is hard to deal with &#8211; bugs do happen.  However, the majority of the problems I&#8217;m experiencing in Word at the moment are not a failure of this kind as Word does, most of the time, eventually complete without crashing.</p>
<p>Types 2 and 3, especially the latter, should be detected and then dealt with in the design of the user interface.</p>
<p>Some real-time programming languages have ways of automatically working out how long code will take to run in order to be able to assert &#8220;this will respond within a 10 ms interrupt cycle&#8221;.  However, this is hard, even for relatively simple embedded systems; so not practical for complex operating systems or user interfaces.</p>
<p>However, a simpler version of the above is possible.  Certain system functions invoke external resources such as the disk, or the network.  If any function or method in your own application invokes one of these system functions, then it could potentially hang &#8211; and should be documented to say so or return some sort of &#8216;promise&#8217;: &#8220;I&#8217;ve started to do X, please check back later to see if it is ready&#8221;. Of course the methods that call these themselves need to be documented as potentially hanging &#8230; and so forth.</p>
<p>If the response to any form of user interaction ends up calling a potentially hanging function, then it is in danger of having a delay of type 3 above.  However, so long as this is known, it can be dealt with at the user interface level by spawning a thread to do the work so that some form of progress indicator or at least &#8220;Cancel&#8221; button can be active &#8211; it should <strong>never</strong> &#8216;hang&#8217;.</p>
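<p>A minimal Python sketch of this pattern follows.  Everything in it is a stand-in: slow_save simulates an operation that waits on an external resource, and the polling loop in run_with_ui plays the part of a real UI event loop.  The point is only that the slow work runs on another thread, hands back a &#8216;promise&#8217; (here a Future), and can be cancelled while the interface stays live.</p>

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

def slow_save(cancel, steps=20, step_seconds=0.01):
    """Hypothetical save that waits on an external resource at each step."""
    for _ in range(steps):
        if cancel.is_set():          # check for a Cancel request between steps
            return 'cancelled'
        time.sleep(step_seconds)     # simulates waiting on disk, network or printer
    return 'saved'

def run_with_ui(user_cancels_after=None, poll_seconds=0.005):
    """Run slow_save off the UI thread; this loop stands in for the event loop."""
    cancel = threading.Event()
    ticks = 0
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(slow_save, cancel)   # the 'promise'
        while not future.done():
            ticks += 1                            # UI stays live: redraw, show progress
            if user_cancels_after is not None and ticks >= user_cancels_after:
                cancel.set()                      # the user pressed Cancel
            time.sleep(poll_seconds)
        return future.result()

print(run_with_ui())                       # runs to completion
print(run_with_ui(user_cancels_after=1))   # cancelled almost immediately
```

<p>Cancellation here is cooperative: the worker has to check the flag, so a call that blocks forever inside a driver would still need a timeout of its own.</p>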
<p>This marking of functions as potentially &#8216;hanging&#8217; could be done by programmers themselves, but equally can be automated as a form of static analysis, simply starting with a known set of hanging system functions and recursively  &#8216;colouring&#8217; functions that call them.  This kind of automated checking should  be standard practice in any large software project.</p>
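<p>The recursive &#8216;colouring&#8217; can be sketched in a few lines of Python.  The call graph and function names below are invented for illustration, and a real tool would have to extract the graph from source or binaries; the sketch only shows the propagation step.</p>

```python
from collections import deque

def may_hang(call_graph, hanging_primitives):
    """Return every function that can reach a known hanging primitive.

    call_graph maps each function name to the names it calls directly.
    """
    # Invert the graph so we can walk from callees back to their callers.
    callers = {}
    for caller, callees in call_graph.items():
        for callee in callees:
            callers.setdefault(callee, set()).add(caller)

    coloured = set(hanging_primitives)
    queue = deque(hanging_primitives)
    while queue:
        fn = queue.popleft()
        for caller in callers.get(fn, ()):
            if caller not in coloured:   # each function is coloured once, so cycles terminate
                coloured.add(caller)
                queue.append(caller)
    return coloured

# Hypothetical call graph for a word processor.
call_graph = {
    'on_save_clicked': ['save_document'],
    'save_document': ['layout_pages', 'write_file'],
    'layout_pages': ['query_printer_margins'],
    'query_printer_margins': ['printer_driver_call'],
    'write_file': ['disk_write'],
    'on_bold_clicked': ['toggle_bold'],
    'toggle_bold': [],
}
hanging = {'printer_driver_call', 'disk_write'}

coloured = may_hang(call_graph, hanging)
print(sorted(coloured))
```

<p>Any UI event handler that ends up coloured (on_save_clicked here, but not on_bold_clicked) is one that needs a progress indicator or Cancel button.</p>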
<p>The type 2 hanging is a little more complicated.  The Ada programming language has a &#8216;safe&#8217; subset that only allows loops where  the bounds are fixed at compile time.  This is probably too restrictive for complex software, but certainly any loop with unknown  limits could be flagged.  If as part of a code walk-through or similar practice it is decided that the loop is &#8216;safe&#8217; it can be annotated as such; otherwise, just like the case of system calls, the system can propagate the fact that certain functions may have unbounded computation and the UI can then be adjusted accordingly.</p>
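<p>Flagging loops with unknown limits is also easy to prototype.  The sketch below uses the Python ast module on Python source purely as an illustration; the rule it applies (only a for loop over range with integer literals counts as bounded) is a deliberately crude assumption, far stricter than a real checker would want.</p>

```python
import ast

def _is_literal_range(expr):
    """True for calls like range(10) or range(0, 10, 2) with integer literals only."""
    return (isinstance(expr, ast.Call)
            and isinstance(expr.func, ast.Name)
            and expr.func.id == 'range'
            and len(expr.args) > 0
            and not expr.keywords
            and all(isinstance(a, ast.Constant) and isinstance(a.value, int)
                    for a in expr.args))

def unbounded_loops(source):
    """Return sorted (line, kind) pairs for loops whose bound is not a literal range."""
    flagged = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.While):
            flagged.append((node.lineno, 'while'))
        elif isinstance(node, ast.For) and not _is_literal_range(node.iter):
            flagged.append((node.lineno, 'for'))
    return sorted(flagged)

SAMPLE = '''
for i in range(10):
    pass
for item in items:
    pass
while error > tolerance:
    error = refine()
'''

print(unbounded_loops(SAMPLE))
```

<p>Loops flagged this way would then either be annotated as safe after review or fed into the same propagation as the hanging system calls.</p>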
<p>For small bespoke software development I can be forgiving, but for large vendors like Microsoft, Apple or Adobe, there is no excuse for this form of culpable failure.</p>
<p>&#8230; but I have a bad feeling that in 20 years time I may be writing again &#8230;</p>
<p>[[ News flash - 1.5 hours later Word has finished saving the document! ... 14 pages obviously  hard work. ... but then it has hung again <img src='http://www.alandix.com/blog/wp-includes/images/smilies/icon_sad.gif' alt=':-(' class='wp-smiley' />  ]]</p>
]]></content:encoded>
			<wfw:commentRss>http://www.alandix.com/blog/2008/06/21/why-software-need-never-hang/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>I just wanted to print a file</title>
		<link>http://www.alandix.com/blog/2007/12/27/i-just-wanted-to-print-a-file/</link>
		<comments>http://www.alandix.com/blog/2007/12/27/i-just-wanted-to-print-a-file/#comments</comments>
		<pubDate>Thu, 27 Dec 2007 10:12:57 +0000</pubDate>
		<dc:creator>alan</dc:creator>
				<category><![CDATA[academic]]></category>
		<category><![CDATA[HCI and usability]]></category>
		<category><![CDATA[personal]]></category>
		<category><![CDATA[bugs]]></category>
		<category><![CDATA[human computer interaction]]></category>
		<category><![CDATA[snip!t]]></category>
		<category><![CDATA[software engineering]]></category>
		<category><![CDATA[usability]]></category>

		<guid isPermaLink="false">http://www.alandix.com/blog/2007/12/27/i-just-wanted-to-print-a-file/</guid>
		<description><![CDATA[I just wanted to print a file, now it is an hour and a half later and I still have nothing &#8230; this is where all our time goes gently coaxing our computers in the hope they may do what we ask. Such a simple thing to want to do &#8230; and so much pain [...]]]></description>
			<content:encoded><![CDATA[<p>I just wanted to print a file; now it is an hour and a half later and I still have nothing &#8230; this is where all our time goes: gently coaxing our computers in the hope they may do what we ask.</p>
<p>Such a simple thing to want to do &#8230; and so much pain in the process, and so many simple things that application designers could do to make it better.</p>
<p><span id="more-53"></span><br />
I have a PhD thesis to read; it arrived last night, and palexa, whose thesis it is, is coming this afternoon to talk about it. But when I tried to print it I got an error that communication to the printer had failed &#8211; &#8220;please check it is turned on&#8221;! Of course it is on already, but I checked anyway.</p>
<p>I recall at the weekend Fiona had a similar problem; she rebooted her machine and it all worked again.</p>
<blockquote><p><em>WHY should a simple thing like connecting to a printer require periodic reboots</em>?</p></blockquote>
<p>&#8230; who knows, but we try &#8230;</p>
<p>To reboot my computer involves closing it down &#8230; and some applications, notably Microsoft Word, Photoshop and Dreamweaver, never close without more coaxing &#8230; shutting windows one by one, until there is nothing left, and then &#8230; &#8216;application not responding&#8217; and I have to Force Quit them anyway.</p>
<p>Of course part of the problem, and the reason why the machine crawls to a halt with lots of spinning rainbows and a sluggish mouse, is that I have so many open applications and they all grow over time &#8211; but</p>
<blockquote><p><em>why can&#8217;t an application quit without dragging a gigabyte into RAM?</em></p></blockquote>
<p>&#8230; and of course because I know it will take me so much pain to close down my computer I don&#8217;t do it often &#8230; hmm what was that about positive feedback?<br />
As I close down Firefox windows I find several that I need to do things with, as they were sitting there as reminders &#8211; so twenty minutes go on updating web pages, saving bookmarks or grabbing bits in Snip!t. In fact if Firefox crashes it does recall its last state &#8230; but not when it is closed down &#8216;normally&#8217;. I sometimes &#8216;Force Quit&#8217; it in order to make it save state &#8211; that is like pulling the plug out from your computer to stop it erasing your disk when you shut down.</p>
<blockquote><p><em>In an age of laptops why don&#8217;t all applications save their current state when they close down?</em></p></blockquote>
<p>And of course Firefox crashed anyway when I eventually hit Quit.</p>
<p>And then as the windows clear I find a little Software Update window asking me if I want to install urgent security fixes &#8230; between 50 and 80 meg each. What I can never recall is whether this means &#8220;I have downloaded the files; do you want me to do the actual install?&#8221; or &#8220;if you say YES, the install will take a LONG time as it downloads files&#8221;.</p>
<blockquote><p><em>Why can&#8217;t applications give you at least order-of-magnitude estimates of the time cost before you confirm actions?</em></p></blockquote>
<p>Of course, if I had recalled that the download was still needed I could have put off the update &#8230; but the dialogue didn&#8217;t have any options to say things like &#8216;do this slowly in the background&#8217;, and if it did, would I risk rebooting my computer partway through?</p>
<blockquote><p><em>Why can&#8217;t applications give users control over when things happen instead of assuming it happens now or never?</em></p></blockquote>
<p>Software Update begins the download &#8230; it estimates 5 minutes, so I go off to get some cornflakes. I come back and it is almost finished &#8230; I wait &#8230; the little progress bar gets to the end &#8230; and then starts again with another 5 minutes. My head in my hands I recall &#8220;Of course, the 5 minutes was for the first of the several things it needs to download&#8221;.</p>
<blockquote><p><em>Why, when an application does something in several stages, are the progress bars on an action-by-action basis? The user has to wait for them all to complete. By all means show individual progress for each stage, but also some form of overall estimate &#8230; please.</em></p></blockquote>
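<p>An overall estimate of this kind need not be complicated. The Python below is a minimal sketch, assuming the installer knows the size of every stage before it starts. The function names and the three download sizes are invented for illustration, and the time estimate simply assumes the average rate so far continues.</p>

```python
def overall_progress(stage_sizes, stage_index, done_in_stage):
    """Fraction (0.0 to 1.0) of the whole job that is complete.

    stage_sizes: size of every stage, e.g. megabytes per download.
    stage_index: zero-based index of the stage currently running.
    done_in_stage: amount of the current stage already finished.
    """
    total = sum(stage_sizes)
    if total == 0:
        return 1.0
    finished = sum(stage_sizes[:stage_index]) + done_in_stage
    return min(finished / total, 1.0)


def remaining_seconds(stage_sizes, stage_index, done_in_stage, elapsed_seconds):
    """Rough time left, assuming the average rate so far continues."""
    fraction = overall_progress(stage_sizes, stage_index, done_in_stage)
    if fraction == 0:
        return None  # nothing measured yet, so no honest estimate
    return elapsed_seconds * (1 - fraction) / fraction


# Hypothetical update: three downloads of 50, 80 and 70 megabytes.
sizes = [50, 80, 70]
# The first download has just finished after 5 minutes (300 seconds).
print(overall_progress(sizes, 1, 0))        # 0.25
print(remaining_seconds(sizes, 1, 0, 300))  # 900.0
```

<p>With these made-up numbers, the moment the first bar reaches the end the overall bar would read 25% with roughly 15 minutes to go, rather than resetting to a fresh &#8216;5 minutes&#8217;.</p>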
<p>Some time later  &#8230; an information box appears: &#8220;Some of the checked updates couldn&#8217;t be installed &#8211; a network error has occurred&#8221;</p>
<p>I despair.</p>
<p style="text-align: center"><img src="http://www.alandix.com/images/software-update-failed.png" title="Software Update Failed" alt="Software Update Failed" border="0" height="190" width="437" /></p>
]]></content:encoded>
			<wfw:commentRss>http://www.alandix.com/blog/2007/12/27/i-just-wanted-to-print-a-file/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>surprised by Microsoft</title>
		<link>http://www.alandix.com/blog/2007/05/20/surprised-by-microsoft/</link>
		<comments>http://www.alandix.com/blog/2007/05/20/surprised-by-microsoft/#comments</comments>
		<pubDate>Sun, 20 May 2007 09:58:13 +0000</pubDate>
		<dc:creator>alan</dc:creator>
				<category><![CDATA[academic]]></category>
		<category><![CDATA[bugs]]></category>
		<category><![CDATA[confidentiality]]></category>
		<category><![CDATA[microsoft]]></category>
		<category><![CDATA[security]]></category>
		<category><![CDATA[techie]]></category>
		<category><![CDATA[Word]]></category>

		<guid isPermaLink="false">http://www.alandix.com/blog/2007/05/20/surprised-by-microsoft/</guid>
		<description><![CDATA[I had thought I had got used to the ineptitude of Microsoft programmers, that nothing would surprise me any more. However, their inventiveness in producing obscure yet damaging effects has once again amazed me. Some colleagues and I had a paper reviewed for a conference and when we received back the reviewers&#8217; comments [...]]]></description>
			<content:encoded><![CDATA[<p>I had thought I had got used to the ineptitude of Microsoft programmers, that nothing would surprise me any more. However, their inventiveness in producing obscure yet damaging effects has once again amazed me.</p>
<p><span id="more-24"></span></p>
<p>Some colleagues and I had a paper reviewed for a conference, and when we received back the reviewers&#8217; comments one of them mentioned that the document had track changes on. <sup><a href="#footnote-1-24" id="footnote-link-1-24" title="See the footnote.">1</a></sup><br />
As my colleague had been away, I had been responsible for the final upload, so I thought that it was me who had messed up. I looked at the final version I had produced and then the anonymised version I had uploaded. They both looked free of change marks on screen, and when I checked the highlight changes dialogue box, sure enough, nothing was selected and &#8216;Track changes&#8217; was off.</p>
<div style="text-align: center"><img border="1" alt="track changes dialogue" title="track changes dialogue" src="http://www.alandix.com/images/track-changes.gif" /></div>
<p>I sent this copy to my colleagues (on Windows machines; I was on a Mac) and they could see the changes. Obviously the dialogue means that no <em>new</em> changes will be collected, but older changes are still secretly hidden somewhere, and furthermore even the setting that says they should not be visible seems to be platform dependent!</p>
<p>I am used to floating figure problems between platforms and problems moving between machines attached to printers with different characteristics. I was also aware that comments sometimes fail to show up when you switch platforms. However, this track changes bug was new to me and reminded me of the &#8216;bad old days&#8217; when Word&#8217;s incremental save meant that if you opened a .doc file in a text editor you could see past copies of confidential letters or whatever the document had been edited from.</p>
<p>For us the damage was not too great: anonymity partially compromised through names on the changes, and the occasional remark in a comment like &#8220;are we really making sense here&#8221;, which I guess does not look that good to a reviewer!</p>
<p>However, if you are thinking of mailing anything confidential either make it a PDF or just send text &#8230; and I may take another look at Open Office!</p>
<p>Just imagine that last letter to a difficult client and all the words you deleted before sending it <img src='http://www.alandix.com/blog/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /> </p>
<br /><ol class="footnotes"><li id="footnote-1-24"> The submissions had been in Word &#8230; the lesson &#8230; always always always get submissions in PDF!   [<a href="#footnote-link-1-24">back</a>]</li></ol>]]></content:encoded>
			<wfw:commentRss>http://www.alandix.com/blog/2007/05/20/surprised-by-microsoft/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

