{"id":123,"date":"2006-07-19T03:58:00","date_gmt":"2006-07-18T18:58:00","guid":{"rendered":"https:\/\/fugutabetai.com\/blog\/2006\/07\/19\/notes-from-tuesday-2007-07-18-coling-acl-2006-session\/"},"modified":"2006-07-19T03:58:00","modified_gmt":"2006-07-18T18:58:00","slug":"notes-from-tuesday-2007-07-18-coling-acl-2006-session","status":"publish","type":"post","link":"https:\/\/fugutabetai.com\/blog\/2006\/07\/19\/notes-from-tuesday-2007-07-18-coling-acl-2006-session\/","title":{"rendered":"Notes from Tuesday 2007-07-18 COLING\/ACL 2006 session"},"content":{"rendered":"<p>2006-07-18 Invited Keynote Tuesday morning<br \/>\nArgmax Search in Natural Language Processing<br \/>\nDaniel Marcu<\/p>\n<p><!-- readmore --><\/p>\n<p>The goal is to convince us that search is hard, and by not worrying<br \/>\nabout it (just throwing in an argmax without thinking about it much)<br \/>\nwe have a problem.  <\/p>\n<p>For example with ISI&#8217;s syntax based MT system, the big problem they<br \/>\nhave is search errors because the search space is very complex.  When<br \/>\nthey moved to training on long sentences, they started to have many<br \/>\nsearch errors.  A very interesting high-level talk about the problems<br \/>\nstill in search.  Many good examples, and encouraging us as a field to<br \/>\nlook at search more compared to models where we do spend a lot of time<br \/>\nalready.  <\/p>\n<p>&#8212;<\/p>\n<p>Summarization I session<\/p>\n<p>Bayesian Query-Focused Summarization<br \/>\nHal Daume III, Daniel Marcu (ISI, University of Southern California)<\/p>\n<p>Extractive summarization in a mostly-unsupervised environment.<\/p>\n<p>Assume a large document collection, a query, and relevance judgments<br \/>\n(a few representative examples) to the query over the documents.<br \/>\nVery interesting talk, compared to a variety of extraction methods.<br \/>\nBasically it works like a relevance feedback method for IR over<br \/>\nsentences.  <\/p>\n<p>&#8212;<\/p>\n<p>Extractive Summarization using Inter- and Intra- Event Relevance<br \/>\nWenjie Li, Mingli Wu, Qin Lu, Wei Xu, Chunfa Yuan<\/p>\n<p>Their events are verbs and action nouns appearing between two named<br \/>\nentities.  They also look at associated named entities (person,<br \/>\norganization, location, date) as well.  They build a graph of named<br \/>\nentities and event terms.  They use PageRank to run over the graph and<br \/>\ncreate summaries.  <\/p>\n<p>Intra-event relevance is defined as relevance that is direct between<br \/>\ntwo nodes on named entities.  Inter-event relevance is links on the<br \/>\ngraph that are not direct?  I&#8217;m not sure, this isn&#8217;t very clear in the<br \/>\npresentation.  <\/p>\n<p>They use WordNet for some similarity for &#8220;semantic relevance&#8221; and also<br \/>\na topic-specific relevance from the document set.  Basically it looks<br \/>\nlike just a count of how many times the events occur together (the<br \/>\nnumber of named entities that they share.)  They also cluster named<br \/>\nentities based on descriptive text to see how they are related.  <\/p>\n<p>Basically I don&#8217;t think they are doing anything new here; I&#8217;ve seen<br \/>\nall the things they are doing before.  They put a lot of different<br \/>\napproaches together perhaps, but I don&#8217;t see any new big ideas here.  
<\/p>\n<p>They use some DUC data for evaluation with ROUGE scores of different<br \/>\nfeatures of their system, but I don&#8217;t think they differences they<br \/>\npresent are statistically significant from what I remember of DUC<br \/>\nscores.  <\/p>\n<p>There was a long pause before any questions, but then one was asked<br \/>\nabout how PageRank was used.  I don&#8217;t think that is an interesting<br \/>\nquestion, because Drago (at least, there are others) have been using<br \/>\nPageRank for sentence ranking for a few years.  Of course, some sort<br \/>\ndetails can be given on their usage and adoption of it.  <\/p>\n<p>&#8212;<\/p>\n<p>Models for Sentence Compression: A Comparison across Domains, Training<br \/>\nRequirements and Evaluation Measures<br \/>\nJames Clarke, Mirella Lapata<\/p>\n<p>(I met James Clarke last night and had a few drinks with him.  A<br \/>\nreally nice guy!)  <\/p>\n<p>This paper concentrates on word deletion.  They had 3 annotators<br \/>\nremove tokens from a broadcast news transcript.  Analysis of their<br \/>\ncorpus and an existing one (Ziff-Davis by Knight and Marcu I think,<br \/>\ncreated from looking at abstracts and looking for full sentences in<br \/>\nthe news article.)  <\/p>\n<p>They train a decision tree model to learn operations over the parse<br \/>\ntree.  Compared to a word-based model.  <\/p>\n<p>60 unpaid annotators rated compressed sentences on a single 1-5<br \/>\nscale.  <\/p>\n<p>They looked at two automatic evaluation metrics to see if they are<br \/>\ncorrelated with the human scores.  The more complicated metric (they<br \/>\nsay F-measure, but probably should specify as F-measure over what)<br \/>\ncorrelated better than simple string accuracy.  The decision list<br \/>\nmethod didn&#8217;t perform as well as the word-based model.  <\/p>\n<p>I really liked this presentation as well, and will make an effort to<br \/>\nsee his poster.  <\/p>\n<p>Kathy commented that he missed out on some previous work (Hongyan Jing<br \/>\nnamely and Bonnie Dorr on headline generation.)  This paper is more<br \/>\nstatistically oriented, but didn&#8217;t seem to include the document<br \/>\ncontext to determine what is important.  The poster describing work<br \/>\nusing Integer programming to pull in more linguistic rules.<br \/>\nAnother comment on document-level context usage.<br \/>\nA question about trying to retain information that is important to<br \/>\nsomeone (so, query-focused compression.)  A question from Inderjeet<br \/>\nMani (I think) on the quality of the internet-based free annotators.  <\/p>\n<p>&#8212;<\/p>\n<p>A Bottom-up Approach to Sentence Ordering for Multi-document<br \/>\nSummarization<br \/>\nDamushka Bollegala, Naoki Okazaki, Mitsuru Ishizuka<br \/>\n(The University of Tokyo)<\/p>\n<p>They propose an automatic evaluation metric for sentence ordering<br \/>\n(average continuity.)  They do a hierarchical ordering based on<br \/>\npairwise comparison of all sentences (or blocks) to others.  They have<br \/>\na variety of heuristics for sentence ordering that go into the<br \/>\nevaluation functions, and learn the combination of all the features<br \/>\nfrom a set of manual sentence orderings.  Chronological, topical<br \/>\nrelevance using cosine similarity, precedence criteria (compare<br \/>\nsentences in their document that come before the sentence they have<br \/>\nextracted, and look at info in extracted set),  succession criteria is<br \/>\nthe same.  
I didn't really think that this was groundbreaking; it just puts together a few features for sentence ordering. Still, it is a nice piece of engineering to pull together the different sentence-ordering approaches.

---

Went to lunch with Kathy McKeown and Min-Yen Kan.

---

Skipped the first session so I could check email and brush my teeth.

---

Direct Word Sense Matching for Lexical Substitution
Ido Dagan, Oren Glickman, Alfio Gliozzo, Efrat Marmorshtein, Carlo Strapparava

The main point of this talk was that WSD does not have to be defined as picking out the WordNet synset number for each word we are interested in. It can be reduced to a binary categorization task: does the word sense of the target word match the word sense of the source word? Moving to that formalization will make things clearer for actual use in applications (a less complicated task than what people have been solving unnecessarily) and open up the field to new methods from classification.

---

Segment-based Hidden Markov Models for Information Extraction
Zhenmei Gu, Nick Cercone

I'm having a hard time understanding what the main thrust of this talk is, 11 slides in.

---

Information Retrieval I Session

An Iterative Implicit Feedback Approach to Personalized Search
Yuanhua Lv, Le Sun, Junlin Zhang, Jian-Yun Nie, Wan Chen, Wei Zhang

They use click-through information to gather terms for query expansion, which is then used to re-rank results for the user. They use a HITS-like algorithm, with "search results" and "query terms" playing the roles of "authorities" and "hubs". The interesting thing is that the re-ranking and the query expansion are performed at the same time (upon convergence, take the top terms for query expansion and the top results for re-ranking).

It out-performs Google in both English and Chinese on precision at 5, 10, 20, and 30 documents.
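A minimal sketch of how such mutual reinforcement could look; the data layout and update rule are my guesses at the general HITS-style scheme, not the authors' algorithm.

```python
import math

def iterative_feedback(doc_terms, iters=20):
    """HITS-style mutual reinforcement between search results and query
    terms: results act like 'authorities', terms like 'hubs'.
    doc_terms: dict result_id -> set of terms. This layout and update
    rule are assumptions for illustration."""
    terms = {t for ts in doc_terms.values() for t in ts}
    t_score = {t: 1.0 for t in terms}
    d_score = {}
    for _ in range(iters):
        # Each result is reinforced by the scores of the terms it contains.
        d_score = {d: sum(t_score[t] for t in ts) for d, ts in doc_terms.items()}
        norm = math.sqrt(sum(v * v for v in d_score.values()))
        d_score = {d: v / norm for d, v in d_score.items()}
        # Each term is reinforced by the scores of the results containing it.
        t_score = {t: sum(s for d, s in d_score.items() if t in doc_terms[d])
                   for t in terms}
        norm = math.sqrt(sum(v * v for v in t_score.values()))
        t_score = {t: v / norm for t, v in t_score.items()}
    return d_score, t_score

# Toy click-through neighborhood: a few results and their terms.
results = {"r1": {"python", "tutorial"}, "r2": {"python", "snake"},
           "r3": {"tutorial", "course"}}
d, t = iterative_feedback(results)
# On convergence: top terms feed query expansion, d re-ranks the results.
print(sorted(t, key=t.get, reverse=True))
```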
A question was asked about which documents are used to determine the terms picked for query expansion. The accusation is that by using terms from unclicked documents to determine which terms to select, they are not using a good discrimination set.

Atsushi Fujii from NII asked whether they have tried pseudo-relevance feedback, which should have similar performance. There is some question about whether they have used a fair baseline.

---

The Effect of Translation Quality in MT-Based Cross-Language Information Retrieval
Jiang Zhu, Haifeng Wang (Toshiba R&D China)

The idea is to use translated queries from an MT system that has been artificially degraded by reducing the knowledge (rule / dictionary) base size, then try to correlate translation quality with result quality. They are correlated, so better MT means better IR results. Degrading the rule base leads to syntax errors and word sense errors. Degrading the dictionary causes only word sense errors, to which the IR systems are more sensitive.

Comments: much CLIR does not even use MT, since it isn't really required, and often systems throw in all senses of words. One question was whether they tried to determine which categories of words seemed to have the most impact (named entities, nouns, verbs, etc.); they didn't look at that, focusing instead on translation quality and search effectiveness. There was a similar question on syntax rules, but they only randomly selected rules for dropping.

---

A Comparison of Document, Sentence, and Term Event Spaces
Catherine Blake

The conclusion is that document IDF and inverse sentence frequency (and inverse term frequency) are correlated. IDF values appear to be very stable. The language used in abstracts is different from the language of full text (for this data, anyway).

A kind of interesting investigation of IDF applied to different event spaces. There were some questions from (my notes abruptly end here.)
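For reference, the same inverse-frequency formula can be computed over any "event space" just by changing the counting unit; this toy sketch (corpus and log base are arbitrary choices of mine) shows document versus sentence spaces.

```python
import math

def inverse_frequency(term, units):
    """IDF-style weight over any event space: `units` is a list of
    token sets, one per counting unit (document, sentence, ...)."""
    df = sum(term in u for u in units)
    return math.log(len(units) / df) if df else float("inf")

# Toy corpus: two documents, each a list of sentences.
docs = [["the cat sat", "the cat ran"], ["a dog sat"]]
doc_units = [set(" ".join(d).split()) for d in docs]    # document space
sent_units = [set(s.split()) for d in docs for s in d]  # sentence space

print(inverse_frequency("cat", doc_units))   # inverse document frequency
print(inverse_frequency("cat", sent_units))  # inverse sentence frequency
```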