{"id":189,"date":"2007-03-20T05:14:00","date_gmt":"2007-03-19T20:14:00","guid":{"rendered":"https:\/\/fugutabetai.com\/blog\/2007\/03\/20\/notes-from-tuesday-2007-03-20-natural-language-processing-meeting-in-japan\/"},"modified":"2007-03-20T05:14:00","modified_gmt":"2007-03-19T20:14:00","slug":"notes-from-tuesday-2007-03-20-natural-language-processing-meeting-in-japan","status":"publish","type":"post","link":"https:\/\/fugutabetai.com\/blog\/2007\/03\/20\/notes-from-tuesday-2007-03-20-natural-language-processing-meeting-in-japan\/","title":{"rendered":"Notes from Tuesday 2007-03-20 Natural Language Processing Meeting in Japan"},"content":{"rendered":"<p>I attended the <a href=\"http:\/\/nlp2007.itc.nagoya-u.ac.jp\/program.html\">(Japanese) Natural Language Processing meeting in Ryukoku University<\/a> from the 20th until the 23rd.  I&#8217;ve taken some notes on the sessions that I attended.  <\/p>\n<h2>Session B1: Meaning Analysis<\/h2>\n<p>\u610f\u5473\u5206\u6790<br \/>\nSession chair is Utusrou Takehito \u5b87\u6d25\u5442\u6b66\u4ec1 from Tkuba Daigaku.<\/p>\n<ul>\n<li>B1-1  \t\u69cb\u6587\u89e3\u6790\u3092\u88dc\u52a9\u7684\u306b\u7528\u3044\u308b\u610f\u5473\u89e3\u6790<br \/>\n\t\u25cb\u8239\u8d8a\u5b5d\u592a\u90ce, \u4e2d\u91ce\u5e79\u751f, \u9577\u8c37\u5ddd\u96c4\u4e8c, \u8fbb\u91ce\u5e83\u53f8 (HRI-JP)<\/li>\n<li>B1-2 \t\u7d50\u5408\u4fa1\u30d1\u30bf\u30fc\u30f3\u8f9e\u66f8\u304b\u3089\u306e\u60c5\u7dd2\u3092\u660e\u793a\u3059\u308b\u7528\u8a00\u306e\u77e5\u8b58\u30d9\u30fc\u30b9\u5316<br \/>\n\t\u25cb\u9ed2\u4f4f\u4e9c\u7d00\u5b50, \u5fb3\u4e45\u96c5\u4eba, \u6751\u4e0a\u4ec1\u4e00, \u6c60\u539f\u609f (\u9ce5\u53d6\u5927)<\/li>\n<li>B1-3 \tSYNGRAPH\u30c7\u30fc\u30bf\u69cb\u9020\u306b\u304a\u3051\u308b\u8ff0\u8a9e\u9805\u69cb\u9020\u306e\u67d4\u8edf\u30de\u30c3\u30c1\u30f3\u30b0<br \/>\n\t\u25cb\u5c0f\u8c37\u901a\u9686 (\u4eac\u5927), \u4e2d\u6fa4\u654f\u660e, \u67f4\u7530\u77e5\u79c0 (\u6771\u5927), \u9ed2\u6a4b\u798e\u592b 
(\u4eac\u5927)<\/li>\n<li>C1-5  \t\u79d1\u5b66\u6280\u8853\u6587\u732e\u3092\u5bfe\u8c61\u3068\u3059\u308b\u65e5\u4e2d\u6a5f\u68b0\u7ffb\u8a33\u30b7\u30b9\u30c6\u30e0\u958b\u767a\u30d7\u30ed\u30b8\u30a7\u30af\u30c8<br \/>\n\t\u25cb\u4e95\u4f50\u539f\u5747 (NICT), \u9ed2\u6a4b\u798e\u592b (\u4eac\u5927), \u8fbb\u4e95\u6f64\u4e00 (\u6771\u5927), \u5185\u5143\u6e05\u8cb4 (NICT), \u4e2d\u5ddd\u88d5\u5fd7 (\u6771\u5927), \u68b6\u535a\u884c (\u9759\u5ca1\u5927), \u4e2d\u6751\u5fb9 (JST)<\/li>\n<li>C1-6  \t\u30cf\u30a4\u30d6\u30ea\u30c3\u30c9\u7ffb\u8a33\u306e\u305f\u3081\u306e\u30d5\u30ec\u30fc\u30ba\u30a2\u30e9\u30a4\u30f3\u30e1\u30f3\u30c8<br \/>\n\t\u25cb\u6f6e\u7530\u660e (\u5bcc\u58eb\u901a\u7814)<\/li>\n<li>C1-7 \t\u90e8\u5206\u76ee\u6a19\u306e\u9054\u6210\u5ea6\u306b\u57fa\u3065\u304f\u6a5f\u68b0\u7ffb\u8a33\u81ea\u52d5\u8a55\u4fa1 &#8211; \u90e8\u5206\u76ee\u6a19\u306e\u81ea\u52d5\u751f\u6210 &#8211;<br \/>\n\t\u25cb\u5185\u5143\u6e05\u8cb4, \u5c0f\u8c37\u514b\u5247, \u5f35\u7389\u6f54, \u4e95\u4f50\u539f\u5747 (NICT)<\/li>\n<li>C1-8 \tTranslation quality prediction using multiple automatic evaluation metrics<br \/>\n\t\u25cbPaul Michael, Andrew Finch, \u9685\u7530\u82f1\u4e00\u90ce (NICT\/ATR)<\/li>\n<\/ul>\n<p><!-- readmore --><\/p>\n<h3>B-1: \u69cb\u6587\u89e3\u6790\u3092\u88dc\u52a9\u7684\u306b\u7528\u3044\u308b\u610f\u5473\u89e3\u6790<\/h3>\n<p>\u25cb\u8239\u8d8a\u5b5d\u592a\u90ce, \u4e2d\u91ce\u5e79\u751f, \u9577\u8c37\u5ddd\u96c4\u4e8c, \u8fbb\u91ce\u5e83\u53f8 (HRI-JP)<\/p>\n<p>It looks like the problem they want to tackle is translating natural language into a kind of frame structure that can then be modified into a predicate structure for understanding.  They want to avoid manually created patterns to pick up meaning, and also bottom-up based parsing.  They have a semantic ontology that has a lexicon-meaning mapping, and a set of meaning frames that contain slots that they need to fill.  
They do some sort of bottom-up parsing with their lexicon, and do matching to paths over that parse to see if they can fill slots in their meaning frames.  <\/p>\n<h3>B-2: \u7d50\u5408\u4fa1\u30d1\u30bf\u30fc\u30f3\u8f9e\u66f8\u304b\u3089\u306e\u60c5\u7dd2\u3092\u660e\u793a\u3059\u308b\u7528\u8a00\u306e\u77e5\u8b58\u30d9\u30fc\u30b9\u5316<\/h3>\n<p>\u25cb\u9ed2\u4f4f\u4e9c\u7d00\u5b50, \u5fb3\u4e45\u96c5\u4eba, \u6751\u4e0a\u4ec1\u4e00, \u6c60\u539f\u609f (\u9ce5\u53d6\u5927)<br \/>\nThey have 14,000 valency patterns for Japanese.  There are about 1000 that they created that attach some emotive element to the sentence.  Based on random sampling, over 80% of their patterns are correctly interpreted (with respect to?)  They performed an experiment to evaluate the performance of their dictionary.  They have a corpus with 1642 (sentences?) that are tagged with information.  They ran their program to determine which things like other things (?) and then compared to the annotations.  Looks like they are looking at things like &#8220;happy&#8221;, &#8220;like&#8221;, &#8220;angry&#8221;, &#8220;dislike&#8221;, &#8220;unhappy&#8221;, &#8220;fear&#8221; and a few others.  They had 7 people tag each sentence (?) and for things where more than 4 people agreed (?) they took that as a correct tag.  <\/p>\n<h3>B-3: SYNGRAPH\u30c7\u30fc\u30bf\u69cb\u9020\u306b\u304a\u3051\u308b\u8ff0\u8a9e\u9805\u69cb\u9020\u306e\u67d4\u8edf\u30de\u30c3\u30c1\u30f3\u30b0<\/h3>\n<p>&#8220;Flexible Predicate Argument Structure Matching using the SYNGRAPH data&#8221;<br \/>\n\u25cb\u5c0f\u8c37\u901a\u9686 (\u4eac\u5927), \u4e2d\u6fa4\u654f\u660e, \u67f4\u7530\u77e5\u79c0 (\u6771\u5927), \u9ed2\u6a4b\u798e\u592b (\u4eac\u5927)<\/p>\n<p>Gave an example where meaning analysis explodes combinatorially based on the different meanings of the words in the sentence.  Syngraph looks like some kind of database that contains similar concepts.  
By collapsing many possibilities, the search space is reduced.  Analysis with JUMAN and KNP.  They did some extraction of patterns from a dictionary using what looks like a pattern base (e.g., Sasha&#8217;s definition pattern work.)  They do some matching with a similarity metric between terms (1.0 for the same term, the other scores are based on their ontology?)  <\/p>\n<h3>C1-5  \t\u79d1\u5b66\u6280\u8853\u6587\u732e\u3092\u5bfe\u8c61\u3068\u3059\u308b\u65e5\u4e2d\u6a5f\u68b0\u7ffb\u8a33\u30b7\u30b9\u30c6\u30e0\u958b\u767a\u30d7\u30ed\u30b8\u30a7\u30af\u30c8<\/h3>\n<p>A development project targeting a Chinese-Japanese Machine Translation System for Technical Literature.<br \/>\n\u25cb\u4e95\u4f50\u539f\u5747 (NICT), \u9ed2\u6a4b\u798e\u592b (\u4eac\u5927), \u8fbb\u4e95\u6f64\u4e00 (\u6771\u5927), \u5185\u5143\u6e05\u8cb4 (NICT), \u4e2d\u5ddd\u88d5\u5fd7 (\u6771\u5927), \u68b6\u535a\u884c (\u9759\u5ca1\u5927), \u4e2d\u6751\u5fb9 (JST)<\/p>\n<p>A five-year project starting in 2006, with three affiliated universities plus NICT and JST.  They would like to be able to share research work in Asia, and particularly would like to make information about cutting-edge research available to many countries.  They present a break-down of who is doing what work and research, and a schedule for it.  <\/p>\n<h3>C1-6  \t\u30cf\u30a4\u30d6\u30ea\u30c3\u30c9\u7ffb\u8a33\u306e\u305f\u3081\u306e\u30d5\u30ec\u30fc\u30ba\u30a2\u30e9\u30a4\u30f3\u30e1\u30f3\u30c8<\/h3>\n<p>&#8220;Phrase Alignment for Hybrid Machine Translation&#8221;, \u25cb\u6f6e\u7530\u660e (\u5bcc\u58eb\u901a\u7814)<\/p>\n<p>There are a few different types of hybrid systems, like voting systems, syntax-guided SMT, EBMT using parsed examples, and also fusion type systems.  Phrase-based SMT &#8220;phrases&#8221; are not necessarily linguistic phrases.  
Parses in syntax-based SMT aren&#8217;t as good as traditional parses; errors propagate and can&#8217;t be fixed, and there isn&#8217;t feedback between the parsers and the bilingual training data.  They explain their phrase alignment technique, which loses me for a while.  They did some evaluation with hand-entered heuristics using the NTCIR3 Patent task.  From the examples shown, it looks like it creates some very good phrases and translations.  <\/p>\n<h3>C1-7 \t\u90e8\u5206\u76ee\u6a19\u306e\u9054\u6210\u5ea6\u306b\u57fa\u3065\u304f\u6a5f\u68b0\u7ffb\u8a33\u81ea\u52d5\u8a55\u4fa1 &#8211; \u90e8\u5206\u76ee\u6a19\u306e\u81ea\u52d5\u751f\u6210 &#8211;<\/h3>\n<p>&#8220;Automatic Machine Translation Evaluation Based on the Achievement of Partial Goals &#8211; Automatic Generation of Partial Goals &#8211;&#8221;, \u25cb\u5185\u5143\u6e05\u8cb4, \u5c0f\u8c37\u514b\u5247, \u5f35\u7389\u6f54, \u4e95\u4f50\u539f\u5747 (NICT)<\/p>\n<p>They would like to be able to rank many different systems in terms of which is better than another using automatic evaluation methods.  Looking at scores like BLEU, NIST and fluency and adequacy, they look at multiple system translations and which areas overlap.  So there is some concept of global (BLEU, NIST) evaluation and local evaluation (over just smaller parts.)  They evaluated these local things with Yes\/No questions about specific types of translation rules (maybe using human evaluation?)  It looks like they built simple patterns to answer these questions based on previous work with humans.  Maybe.  They have an equation for mixing in many global evaluation methods with their local evaluation method.  They use some sort of skip trigram method for doing matching.  Tested over the JEDIA (769 sentences) English-Japanese set.  They compared their automatic yes\/no question matching to human question matching.  
<\/p>\n<h3>C1-8 \tTranslation quality prediction using multiple automatic evaluation metrics<\/h3>\n<p>\u25cbPaul Michael, Andrew Finch, \u9685\u7530\u82f1\u4e00\u90ce (NICT\/ATR)<\/p>\n<p>I thought I would be getting a talk in English, but it is in Japanese.  The slides are in English though.  They are trying to predict MT system translation quality based on automatic evaluation metrics.  They use a travel domain corpus doing both binary and multi-class learning using decision trees with eight features (BLEU, NIST, WER, PER, etc.)  They take classes based on the human-graded fluency and accuracy as their target.  Their approach outperforms the majority baseline in all cases.  <\/p>\n<h3>Poster Session<\/h3>\n<p>I didn&#8217;t walk around with my laptop open, so no notes for this.<\/p>\n<h3>B2: \u8a9e\u5f59\u30fb\u8f9e\u66f8 (Vocabulary and Dictionaries)<\/h3>\n<p>Session chair: \u98af\u3005\u91ce\u5b66 (Yahoo)<\/p>\n<ul>\n<li>B2-1  \t\u6f22\u8f14\uff1a\u5916\u56fd\u4eba\u306e\u305f\u3081\u306e\u6f22\u5b57\u691c\u7d22\u30b7\u30b9\u30c6\u30e0<br \/>\n\t\u25cb\u7530\u4e2d\u4e45\u7f8e\u5b50, Julian Godon (\u6771\u5927)<\/li>\n<li>B2-2 \t\u81ea\u52d5\u672a\u77e5\u8a9e\u7372\u5f97\u306b\u3088\u308b\u4eee\u540d\u6f22\u5b57\u5909\u63db\u30b7\u30b9\u30c6\u30e0\u306e\u7cbe\u5ea6\u5411\u4e0a<br \/>\n\t\u25cb\u68ee\u4fe1\u4ecb (\u65e5\u672cIBM), \u5c0f\u7530\u88d5\u6a39<\/li>\n<li>B2-3 \t\u6700\u5c0f\u8a18\u8ff0\u9577\u539f\u7406\u306b\u57fa\u3065\u3044\u305f\u65e5\u672c\u8a9e\u8a71\u3057\u8a00\u8449\u306e\u5358\u8a9e\u5206\u5272<br \/>\n\t\u25cb\u677e\u539f\u52c7\u4ecb (\u6771\u5927), \u79cb\u8449\u53cb\u826f (\u8c4a\u6a4b\u6280\u79d1\u5927), \u8fbb\u4e95\u6f64\u4e00 (\u6771\u5927\/Univ. 
of Manchester\/NaCTeM)<\/li>\n<li>B2-4 \t\u8f9e\u66f8\u898b\u51fa\u3057\u8a9e\u306e5\u6587\u5b57\u6f22\u5b57\u719f\u8a9e\u3092\u5bfe\u8c61\u3068\u3057\u305f\u8a9e\u57fa\u69cb\u6210\u306e\u89e3\u6790<br \/>\n\t\u25cb\u90ed\u6069\u6771, \u68ee\u672c\u8cb4\u4e4b, \u5f8c\u85e4\u667a\u7bc4 (\u795e\u5948\u5ddd\u5927)<\/li>\n<\/ul>\n<h3>B2-1  \t\u6f22\u8f14\uff1a\u5916\u56fd\u4eba\u306e\u305f\u3081\u306e\u6f22\u5b57\u691c\u7d22\u30b7\u30b9\u30c6\u30e0<\/h3>\n<p>&#8220;Kansuke: A Kanji Search System for Foreigners&#8221;, \u25cb\u7530\u4e2d\u4e45\u7f8e\u5b50, Julian Godon (\u6771\u5927)<\/p>\n<p>How should foreigners, particularly beginners, look up complicated Kanji?  Lookup by reading is hard; lookup by radical takes time and you have to know the strokes; lookup by stroke count also takes time.  They did a study on how foreigners draw their characters (stroke number and order wrong!)  They count up how many horizontal strokes, vertical strokes, and other strokes a character has, then search based on that.  That works for simple things, but for complicated things (\u9b31) it is harder.  So, look up little radicals for each character as before, click on the part, then refine as before.  They use EDICT and a Chinese dictionary for lookup on those characters.  Comparing to lookup by stroke number or SKIP code, they have many fewer candidates on average with lower variance.<br \/>\n<a href=\"http:\/\/cantor.ish.ci.i.u-tokyo.ac.jp\/kansuke\/kansuke_j\/index.html\">Kansuke Kanji Lookup Web Interface<\/a>.<\/p>\n<h3>B2-2 \t\u81ea\u52d5\u672a\u77e5\u8a9e\u7372\u5f97\u306b\u3088\u308b\u4eee\u540d\u6f22\u5b57\u5909\u63db\u30b7\u30b9\u30c6\u30e0\u306e\u7cbe\u5ea6\u5411\u4e0a<\/h3>\n<p>&#8220;Automatic acquisition of unknown terms from kana-kanji henkan input&#8221;, \u25cb\u68ee\u4fe1\u4ecb (\u65e5\u672cIBM), \u5c0f\u7530\u88d5\u6a39<\/p>\n<p>Talking about how to choose the proper kanji for kana input, I believe.  They extend this model with some context information.  
<\/p>\n<h3>B2-3 \t\u6700\u5c0f\u8a18\u8ff0\u9577\u539f\u7406\u306b\u57fa\u3065\u3044\u305f\u65e5\u672c\u8a9e\u8a71\u3057\u8a00\u8449\u306e\u5358\u8a9e\u5206\u5272<\/h3>\n<p>\u25cb\u677e\u539f\u52c7\u4ecb (\u6771\u5927), \u79cb\u8449\u53cb\u826f (\u8c4a\u6a4b\u6280\u79d1\u5927), \u8fbb\u4e95\u6f64\u4e00 (\u6771\u5927\/Univ. of Manchester\/NaCTeM)<\/p>\n<h3>B2-4 \t\u8f9e\u66f8\u898b\u51fa\u3057\u8a9e\u306e5\u6587\u5b57\u6f22\u5b57\u719f\u8a9e\u3092\u5bfe\u8c61\u3068\u3057\u305f\u8a9e\u57fa\u69cb\u6210\u306e\u89e3\u6790<\/h3>\n<p>\u25cb\u90ed\u6069\u6771, \u68ee\u672c\u8cb4\u4e4b, \u5f8c\u85e4\u667a\u7bc4 (\u795e\u5948\u5ddd\u5927)<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I attended the (Japanese) Natural Language Processing meeting at Ryukoku University from the 20th until the 23rd. I&#8217;ve taken some notes on the sessions that I attended. Session B1: Meaning Analysis \u610f\u5473\u5206\u6790 Session chair is Utsuro Takehito \u5b87\u6d25\u5442\u6b66\u4ec1 from Tsukuba University. 
B1-1 \u69cb\u6587\u89e3\u6790\u3092\u88dc\u52a9\u7684\u306b\u7528\u3044\u308b\u610f\u5473\u89e3\u6790 \u25cb\u8239\u8d8a\u5b5d\u592a\u90ce, \u4e2d\u91ce\u5e79\u751f, \u9577\u8c37\u5ddd\u96c4\u4e8c, \u8fbb\u91ce\u5e83\u53f8 (HRI-JP) B1-2 \u7d50\u5408\u4fa1\u30d1\u30bf\u30fc\u30f3\u8f9e\u66f8\u304b\u3089\u306e\u60c5\u7dd2\u3092\u660e\u793a\u3059\u308b\u7528\u8a00\u306e\u77e5\u8b58\u30d9\u30fc\u30b9\u5316 \u25cb\u9ed2\u4f4f\u4e9c\u7d00\u5b50, \u5fb3\u4e45\u96c5\u4eba, \u6751\u4e0a\u4ec1\u4e00, \u6c60\u539f\u609f (\u9ce5\u53d6\u5927) [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[10],"tags":[],"_links":{"self":[{"href":"https:\/\/fugutabetai.com\/blog\/wp-json\/wp\/v2\/posts\/189"}],"collection":[{"href":"https:\/\/fugutabetai.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/fugutabetai.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/fugutabetai.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/fugutabetai.com\/blog\/wp-json\/wp\/v2\/comments?post=189"}],"version-history":[{"count":0,"href":"https:\/\/fugutabetai.com\/blog\/wp-json\/wp\/v2\/posts\/189\/revisions"}],"wp:attachment":[{"href":"https:\/\/fugutabetai.com\/blog\/wp-json\/wp\/v2\/media?parent=189"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/fugutabetai.com\/blog\/wp-json\/wp\/v2\/categories?post=189"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/fugutabetai.com\/blog\/wp-json\/wp\/v2\/tags?post=189"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}