{"id":108,"date":"2005-10-20T22:39:12","date_gmt":"2005-10-20T12:39:12","guid":{"rendered":"http:\/\/www.somethinkodd.com\/oddthinking\/?p=108"},"modified":"2005-12-04T16:23:21","modified_gmt":"2005-12-04T05:23:21","slug":"version-number-challenge","status":"publish","type":"post","link":"https:\/\/www.somethinkodd.com\/oddthinking\/2005\/10\/20\/version-number-challenge\/","title":{"rendered":"Version Number Challenge"},"content":{"rendered":"<p><!-- UnMarkedDown_2_01132526441--><\/p>\n<p>Here&#8217;s a little challenge for the software developers. It arose as I considered the <a href=\"http:\/\/www.somethinkodd.com\/oddthinking\/2005\/10\/03\/raid-is-of-the-lost-art\/\">meta-data in a RAID configuration<\/a>, but it has wider applicability, so I will generalise a little. It is probably a solved problem in Computer Science, but I&#8217;ve not heard of it before, so I will happily reinvent this wheel. I do not have a good solution yet, but it seemed an interesting puzzle.<\/p>\n<h3>Simple Starter Problem.<\/h3>\n<p>Suppose you are developing some software. The requirement is that, when the software starts up, it is given two different versions of an file, and it must use the latest version.<\/p>\n<h3>Simple Starter Solution<\/h3>\n<p>That&#8217;s easy! You simply store a generation number with the object and increase it every time the object is modified. A relatively small number amount of bytes can handle a large number of changes.<\/p>\n<p>According to my calculations, a 64-bit generation number easily cover 10 version changes every second for the entire history of the Universe.<\/p>\n<div class=\"aside\">\n<p>Oohh! I&#8217;ve just had a product idea!  A font and template for Excel just for these sorts of calculations. It makes the spreadsheet look like the cells have actually been written on the back of an envelope!<\/p>\n<\/div>\n<h3>Slightly Harder Version of the Problem<\/h3>\n<p>However, what if the files <em>might<\/em> have been branched in the meantime? So a copy of the file <em>might<\/em> have been edited while the main file was also been edited. <\/p>\n<p>There are three possible scenarios:<\/p>\n<ul>\n<li>Neither file was touched, in which case they are both identical.<\/li>\n<li>One file is derived directly from the other file, in which case the later file should be used.<\/li>\n<li>Both files were edited, in which case a more sophisticated merge needs to take place.<\/li>\n<\/ul>\n<p>For example: When a PDA is reconnected to a PC, and the files are resynchronised, the system needs to work out which files should be kept, which should be overwritten and which should be flagged as being in conflict.<\/p>\n<h3>Overly-Simplistic Solution<\/h3>\n<p>If storage space was no object, this would be easy to solve. You could store a complete version history of the object in the version number &#8211; perhaps using a code for each device.<\/p>\n<p>A version number might look like this <code>PC.PDA.PDA.PDA<\/code>, which means the file was created on the PC, and edited three times on the PDA. When the file is compared against <code>PC.PDA.PDA.PDA.PC.PC<\/code>, the system can see that the latter is more up to date. Meanwhile, <code>PC.PDA.PDA.PC.PC.PC<\/code> would be highlighted as a conflicting version.<\/p>\n<p>You could make this system even more robust with timestamps and other distinguishing marks.<\/p>\n<h3>Hard Problem<\/h3>\n<p>When the system&#8217;s architect sees your plans above, a new design constraint is quickly added.  The architect limits you to a bounded storage budget. <\/p>\n<p>On the other hand, the architect will accept a probabilitic solution &#8211; one that will <em>probably<\/em> detect which one is the latest. If it helps, it may be less accurate (i.e. lower probability of a correct answer) after a massive number of off-line changes. In comparison, when there has only been one or two changes off-line, when it should be highly accurate.<\/p>\n<p>There&#8217;s room to negotiate how much space you need in return for how accurate you can get, but let&#8217;s say, in the first draft, 32 bytes of storage have been put aside for this.<\/p>\n<h3>A Poor Solution<\/h3>\n<p>Here&#8217;s one solution:<\/p>\n<p>You treat the 32 bytes as a pool of 256 flags, all initialised to 0 as the file is created.<\/p>\n<p>Each time the file is modified, a random flag is set to 1.<\/p>\n<p>When comparing two files, if one file contains all the set flags of the other file, and more, then it is later.<\/p>\n<p>This solution is very limited &#8211; it only handles 256 changes. The desired behaviour starts to collapse completely as the number of changes approaches 256. <\/p>\n<h3>Can you do better?<\/h3>\n<p>So my question is, what would be a good algorithm to use here, to handle more than 256 changes, while still retaining the right stochastic profile?<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Here&#8217;s a little challenge for the software developers.<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_s2mail":"","footnotes":""},"categories":[33,34],"tags":[],"class_list":["post-108","post","type-post","status-publish","format-standard","hentry","category-puzzle-solving","category-software-development"],"_links":{"self":[{"href":"https:\/\/www.somethinkodd.com\/oddthinking\/wp-json\/wp\/v2\/posts\/108","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.somethinkodd.com\/oddthinking\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.somethinkodd.com\/oddthinking\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.somethinkodd.com\/oddthinking\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.somethinkodd.com\/oddthinking\/wp-json\/wp\/v2\/comments?post=108"}],"version-history":[{"count":0,"href":"https:\/\/www.somethinkodd.com\/oddthinking\/wp-json\/wp\/v2\/posts\/108\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.somethinkodd.com\/oddthinking\/wp-json\/wp\/v2\/media?parent=108"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.somethinkodd.com\/oddthinking\/wp-json\/wp\/v2\/categories?post=108"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.somethinkodd.com\/oddthinking\/wp-json\/wp\/v2\/tags?post=108"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}