{"id":172,"date":"2007-01-22T06:40:30","date_gmt":"2007-01-21T21:40:30","guid":{"rendered":"https:\/\/fugutabetai.com\/blog\/2007\/01\/22\/chasen-on-osx-10-4\/"},"modified":"2007-01-22T06:40:30","modified_gmt":"2007-01-21T21:40:30","slug":"chasen-on-osx-10-4","status":"publish","type":"post","link":"https:\/\/fugutabetai.com\/blog\/2007\/01\/22\/chasen-on-osx-10-4\/","title":{"rendered":"Chasen on OSX 10.4"},"content":{"rendered":"<p>I found myself needing to do some Japanese morphological analysis today, which usually means either ChaSen or Kabocha.  Kabocha is supposed to be the new hotness and runs fast, but a quick search didn&#8217;t turn up any precompiled packages for it on OSX.  ChaSen, on the other hand, is available in <a href=\"http:\/\/chasen.darwinports.com\/\">DarwinPorts<\/a>, but since I went with Fink and just want to get something running, not start some sort of strange package-management land war, I skipped that.  It also turns out that <a href=\"http:\/\/www.apple.com\/jp\/downloads\/macosx\/utilities\/chasen.html\">Apple is hosting a package for ChaSen<\/a>.  It installs with a nice installer into <code>\/usr\/local\/bin\/chasen<\/code>.<\/p>\n<p>It seems to run fine and includes the necessary dictionaries, but I hit a strange problem.  When I tried to process a file in Shift-JIS encoding using the <code>-i s<\/code> flag, I got this error: <code>chasen: \/usr\/local\/lib\/chasen\/dic\/ipadic\/cforms.cha:9-21: no basic form<\/code><\/p>\n<p>That wasn&#8217;t really what I wanted: I wanted parsed output.  Since things seem to work just fine in EUC-JP encoding, you can always use iconv to convert from Shift-JIS to EUC-JP and pipe the resulting output to chasen:<br \/>\n<code>iconv -f SHIFT-JIS -t EUC-JP file.txt | chasen<\/code><\/p>\n<p>That works nicely.  
<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I found myself needing to do some Japanese morphological analysis today, which usually means either ChaSen or Kabocha. Kabocha is supposed to be the new hotness and runs fast, but a quick search didn&#8217;t turn up any precompiled packages for it on OSX. ChaSen, on the other hand, is available in DarwinPorts, but since I went [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[4,10,5],"tags":[],"_links":{"self":[{"href":"https:\/\/fugutabetai.com\/blog\/wp-json\/wp\/v2\/posts\/172"}],"collection":[{"href":"https:\/\/fugutabetai.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/fugutabetai.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/fugutabetai.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/fugutabetai.com\/blog\/wp-json\/wp\/v2\/comments?post=172"}],"version-history":[{"count":0,"href":"https:\/\/fugutabetai.com\/blog\/wp-json\/wp\/v2\/posts\/172\/revisions"}],"wp:attachment":[{"href":"https:\/\/fugutabetai.com\/blog\/wp-json\/wp\/v2\/media?parent=172"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/fugutabetai.com\/blog\/wp-json\/wp\/v2\/categories?post=172"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/fugutabetai.com\/blog\/wp-json\/wp\/v2\/tags?post=172"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}