{"id":111,"date":"2005-10-25T18:46:31","date_gmt":"2005-10-25T08:46:31","guid":{"rendered":"http:\/\/www.somethinkodd.com\/oddthinking\/?p=111"},"modified":"2007-12-29T12:48:48","modified_gmt":"2007-12-29T02:48:48","slug":"the-world-of-upper-lower-case-mappings","status":"publish","type":"post","link":"https:\/\/www.somethinkodd.com\/oddthinking\/2005\/10\/25\/the-world-of-upper-lower-case-mappings\/","title":{"rendered":"The World of Upper- &#038; Lower-Case Mappings"},"content":{"rendered":"<p><!-- UnMarkedDown_2_01132526441--><\/p>\n<p>I&#8217;m no linguist &#8211; <a href=\"http:\/\/www.somethinkodd.com\/oddthinking\/2005\/09\/08\/employment-agency\/\">nor do I play one on TV<\/a>, but it can be fun to have a dig around in their world. I&#8217;ve been doing a bit of research.<\/p>\n<p><a href=\"http:\/\/www.ethnologue.com\/\">Ethnologue<\/a> is an interesting database of human languages and worth a wander through.<\/p>\n<p>However, it doesn&#8217;t contain what I was looking for &#8211; a list of human languages (well, more strictly, the scripts) that have both <a href=\"http:\/\/en.wikipedia.org\/wiki\/Majuscule\" title=\"Wikipedia definition of Majuscule\" class=\"wikipedia\">majuscules<\/a> and <a href=\"http:\/\/en.wikipedia.org\/wiki\/Minuscule\" title=\"Wikipedia definition of Minuscule\" class=\"wikipedia\">minuscules<\/a> &#8211; better known as &#8220;upper-case&#8221; and &#8220;lower-case&#8221; to the likes of me. 
Certainly the <a href=\"http:\/\/www.ethnologue.com\/show_family.asp?subid=90067\">Germanic<\/a> scripts do.<\/p>\n<p>What I learnt from the <a href=\"http:\/\/www.unicode.org\/faq\/casemap_charprop.html\">Unicode FAQ<\/a> was that:<\/p>\n<ul>\n<li>\n<p>Most scripts don&#8217;t have cases.<\/p>\n<\/li>\n<li>\n<p>The mappings from upper- to lower-case are not one-to-one.<\/p>\n<ul>\n<li>\n<p>&#8220;For example, both a sigma and a final sigma upper-case to a capital sigma.&#8221; &#8211; <a href=\"http:\/\/www.unicode.org\/faq\/casemap_charprop.html\">Unicode FAQ<\/a><\/p>\n<\/li>\n<li>\n<p>Some mappings are locale-dependent. <code>UPPER(\"i\") != \"I\"<\/code> in Turkish, where the result is the dotted capital <em>&#304;<\/em>!<\/p>\n<\/li>\n<li>\n<p>Some lower-case Unicode characters can&#8217;t map to a single upper-case character. For example, the <em>&#64258;<\/em> ligature (<code>U+FB02<\/code>), when converted to upper-case, should end up taking <em>two<\/em> characters.<\/p>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p>There are a couple of conclusions:<\/p>\n<ul>\n<li>When dealing with non-English scripts, avoid assumptions like <code>UPPER(x) == UPPER(LOWER(x))<\/code>.<\/li>\n<li>Getting a computer to change the case of a character is a non-trivial operation.<\/li>\n<\/ul>\n<p><em>(Why am I going on about typography and linguistics? Bear with me. I&#8217;m building up a framework for an argument. 
I&#8217;ll come back to this later on.)<\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<p>A quick look at the world of upper- and lower-case<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_s2mail":"","footnotes":""},"categories":[31,34],"tags":[203,205,105,204],"class_list":["post-111","post","type-post","status-publish","format-standard","hentry","category-geek","category-software-development","tag-case","tag-internationalisation","tag-linguistics","tag-typography"],"_links":{"self":[{"href":"https:\/\/www.somethinkodd.com\/oddthinking\/wp-json\/wp\/v2\/posts\/111","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.somethinkodd.com\/oddthinking\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.somethinkodd.com\/oddthinking\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.somethinkodd.com\/oddthinking\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.somethinkodd.com\/oddthinking\/wp-json\/wp\/v2\/comments?post=111"}],"version-history":[{"count":0,"href":"https:\/\/www.somethinkodd.com\/oddthinking\/wp-json\/wp\/v2\/posts\/111\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.somethinkodd.com\/oddthinking\/wp-json\/wp\/v2\/media?parent=111"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.somethinkodd.com\/oddthinking\/wp-json\/wp\/v2\/categories?post=111"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.somethinkodd.com\/oddthinking\/wp-json\/wp\/v2\/tags?post=111"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}