{"id":508,"date":"2008-01-17T22:13:50","date_gmt":"2008-01-17T12:13:50","guid":{"rendered":"http:\/\/www.somethinkodd.com\/oddthinking\/2008\/01\/17\/common-phrase-experiment\/"},"modified":"2008-01-17T22:14:43","modified_gmt":"2008-01-17T12:14:43","slug":"common-phrase-experiment","status":"publish","type":"post","link":"https:\/\/www.somethinkodd.com\/oddthinking\/2008\/01\/17\/common-phrase-experiment\/","title":{"rendered":"Common Phrase Experiment"},"content":{"rendered":"<p>Did you ever notice that some people (all people?) have certain phrases that they use a disproportionately high percentage of the time?<\/p>\n<p>You may find it infuriating or endearing &#8211; probably depending on what you think of the person anyway.<\/p>\n<p>I was wondering what my commonly-repeated phrases might be.<\/p>\n<p>Then I realised I had a large corpus of my writing on this blog, and I may be able to do an experiment to find out.<\/p>\n<div class=\"aside\">To any linguists reading: I know, I know. Writing &ne; conversation. I need a control to compare. Phrase &ne; series of words in a row. I need a bigger sample size. Go with me here; I am just playing.<\/div>\n<p>Here&#8217;s the result:<\/p>\n<p>This blog contains (as of earlier today): 197,514 words in my posts.<\/p>\n<p>Of those, I use 13,109 unique words &#8211; arguably that&#8217;s related to my vocabulary size. I am sure the linguists have some better definitions of vocabulary size.<\/p>\n<p>The most popular words are: the (6% of all words), to, a, I, of, and, it, that, is, in.<\/p>\n<p>That tells us more about English than about me. 
Let&#8217;s try a longer phrase.<\/p>\n<p>My most popular four-word phrases:<\/p>\n<ol>\n<li>I am going to<\/li>\n<li>I don&#8217;t want to<\/li>\n<li>the rest of the<\/li>\n<li>the size of the<\/li>\n<li>I am not sure<\/li>\n<\/ol>\n<p>It seems I promise a lot (&#8220;I am going to&#8221;), I whinge a lot (&#8220;I don&#8217;t want to&#8221;) and I don&#8217;t know much (&#8220;I am not sure&#8221;). Sounds like a good characterisation to me!<\/p>\n<p>This experiment looks like it is working! Let&#8217;s go to phrases of five words.<\/p>\n<p>My most popular five-word phrases:<\/p>\n<ol>\n<li>request timed out request timed<\/li>\n<li>timed out request timed out<\/li>\n<li>out request timed out request<\/li>\n<li>round trip time to ms<\/li>\n<li>time to ms round trip<\/li>\n<li>trip time to ms round<\/li>\n<li>ms round trip time to<\/li>\n<li>to ms round trip time<\/li>\n<li>series of nostalgic reminiscences about<\/li>\n<li>of nostalgic reminiscences about the<\/li>\n<\/ol>\n<p>Unfortunately, after four words, the experiment breaks down.<\/p>\n<p>The first eight lines above represent the <a href=\"http:\/\/www.somethinkodd.com\/oddthinking\/2006\/08\/15\/bloglines-networking-issue\/\">output of the ping application<\/a>, which has lots of repeated phrases. The last two are a repeated refrain I used in the introduction to a <a href=\"http:\/\/www.somethinkodd.com\/oddthinking\/category\/geek\/software-development\/rat1000\/\">series of articles about the Rational 1000<\/a>, in order to link them together.<\/p>\n<p>The ping output and Rational 1000 stock phrase dominate up to 10-word phrases, which is where I stopped the experiment.<\/p>\n<p>So, I guess my New Year&#8217;s Resolution is to try to cut down on my repetition of the words &#8220;Request timed out! Request timed out!&#8221; in order to make my conversation more interesting.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Did you ever notice that some people (all people?) 
have certain phrases that they use a disproportionately high percentage of the time?<\/p>\n<p>You may find it infuriating or endearing &#8211; probably depending on what you think of the person anyway.<\/p>\n<p>I was wondering what my commonly-repeated phrases might be.<\/p>\n<p>Then I realised I had a large corpus of my writing on this blog, and I may be able to do an experiment to find out.<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_s2mail":"","footnotes":""},"categories":[32,31,21,27],"tags":[230,105,370,84],"class_list":["post-508","post","type-post","status-publish","format-standard","hentry","category-about-oddthinking","category-geek","category-observation","category-thoughts-from-the-shower","tag-experiment","tag-linguistics","tag-observation","tag-oddthinking"],"_links":{"self":[{"href":"https:\/\/www.somethinkodd.com\/oddthinking\/wp-json\/wp\/v2\/posts\/508","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.somethinkodd.com\/oddthinking\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.somethinkodd.com\/oddthinking\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.somethinkodd.com\/oddthinking\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.somethinkodd.com\/oddthinking\/wp-json\/wp\/v2\/comments?post=508"}],"version-history":[{"count":0,"href":"https:\/\/www.somethinkodd.com\/oddthinking\/wp-json\/wp\/v2\/posts\/508\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.somethinkodd.com\/oddthinking\/wp-json\/wp\/v2\/media?parent=508"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.somethinkodd.com\/oddthinking\/wp-json\/wp\/v2\/categories?post=508"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.somethinkodd.com\/oddthinking\/wp-json\/wp\/v2\/tags?post=508"}],"curies":[{"name":"wp","href":"https:\/\
/api.w.org\/{rel}","templated":true}]}}