{"id":1076,"date":"2009-11-16T03:20:20","date_gmt":"2009-11-15T17:20:20","guid":{"rendered":"http:\/\/www.somethinkodd.com\/oddthinking\/?p=1076"},"modified":"2009-11-16T15:30:22","modified_gmt":"2009-11-16T05:30:22","slug":"collection-correction","status":"publish","type":"post","link":"https:\/\/www.somethinkodd.com\/oddthinking\/2009\/11\/16\/collection-correction\/","title":{"rendered":"Collection Correction"},"content":{"rendered":"<p>Is this just me?<\/p>\n<hr \/>\n<p>Okay, Julian, what&#8217;s the next programming task. Ah, yes, deleting the old objects. Should be easy:<\/p>\n<pre><code>def delete(oldobjects):\r\n    for oldobject in oldobjects:\r\n        oldobject.delete()<\/code><\/pre>\n<p>Hmm&#8230; You know the object deletion is going to be slow &#8211; it may need to talk to the remote servers holding the objects, and it will get blocked on I\/O. Maybe it would have better performance if I created a pool of threads, to delete multiple objects concurrently? Oh well, make a note to look at that if there is a performance issue. If it is fast enough, no need to optimise. YAGNI.<\/p>\n<p>Okay, Julian, what&#8217;s the next programming task. Ah, yes, collecting the old objects. Should be easy; I just need to find the right point to insert a line of code:<\/p>\n<pre><code>if object.is_old():\r\n    oldobjects.add???(object)<\/code><\/pre>\n<p>Hmm&#8230;. what is the correct verb here? Do I add? insert? append? To answer that, I guess I have to decide what the data structure of oldobjects is. It is a collection of some sort, that supports iteration, but I really don&#8217;t care what sort; it isn&#8217;t that important.<\/p>\n<p>I guess I will make it a list then. But, wait &#8211; the list data structure implies an ordering. My collection is unordered. I would rather not even suggest there is an  ordering, because that will make it seem trickier to add concurrency later.<\/p>\n<p>I guess I will make it a set then. But, wait &#8211; the set data structure implies no duplication. My collection is guaranteed to be free of duplications (and even if it wasn&#8217;t, object.delete() is idempotent, making it harmless). I don&#8217;t want the computer wasting its time enforcing a constraint that I don&#8217;t even have. <\/p>\n<p>A more minor point is that set implementations generally involve using hash functions &#8211; I see no reason to enforce the constraint that my objects should support hash functions; they probably do, but if I shouldn&#8217;t need to think about it. I also shouldn&#8217;t need to  thinking about whether hash function is efficient.<\/p>\n<p>If I want efficient, I guess I could go with an array, but there is the implied ordering again, plus the concept of an index, which is another piece of semantics I don&#8217;t care about.<\/p>\n<p>Bags, a.k.a. multi-sets? I wonder what the constant time of the insert operation is. I wonder what the insert operation is called. I wonder if there even is a built-in library for this, or do I have to go and search the net for an implementation? <\/p>\n<p>Hmmm&#8230; If I was a multi-set implementation in the standard library, what would I be called?<\/p>\n<p><em>WHOA! Stop! You started by saying the collection type was unimportant, and now you are searching for new libraries, and their performance characteristics? What is wrong with you? Make it a damned list and move on!<\/em><\/p>\n<hr \/>\n<p>So, is it just me, or do other developers go through this internal debate just about every time they use a collection?<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In which Julian tries to choose a data-structure with a low constant time, in a low constant time.<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_s2mail":"","footnotes":""},"categories":[21,34],"tags":[69],"class_list":["post-1076","post","type-post","status-publish","format-standard","hentry","category-observation","category-software-development","tag-software"],"_links":{"self":[{"href":"https:\/\/www.somethinkodd.com\/oddthinking\/wp-json\/wp\/v2\/posts\/1076","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.somethinkodd.com\/oddthinking\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.somethinkodd.com\/oddthinking\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.somethinkodd.com\/oddthinking\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.somethinkodd.com\/oddthinking\/wp-json\/wp\/v2\/comments?post=1076"}],"version-history":[{"count":6,"href":"https:\/\/www.somethinkodd.com\/oddthinking\/wp-json\/wp\/v2\/posts\/1076\/revisions"}],"predecessor-version":[{"id":1082,"href":"https:\/\/www.somethinkodd.com\/oddthinking\/wp-json\/wp\/v2\/posts\/1076\/revisions\/1082"}],"wp:attachment":[{"href":"https:\/\/www.somethinkodd.com\/oddthinking\/wp-json\/wp\/v2\/media?parent=1076"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.somethinkodd.com\/oddthinking\/wp-json\/wp\/v2\/categories?post=1076"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.somethinkodd.com\/oddthinking\/wp-json\/wp\/v2\/tags?post=1076"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}