September 14, 2006

IPSJ in Shinjuku Day two

Wednesday was the final day of the IPSJ meeting. My comments on the papers I saw that day are below.

Yoshihisa Shinozawa - "Extended simple recurrent networks by using
bigram" - Keio University

I had a tough time understanding this paper: I don't know much about
word nets using perceptrons. I also had a tough time following his


Takashi Kawakami, Hisashi Suzuki - "A calculation of Word Similarity
using Decision Lists" - Chuo University.

Given two "words" (kana and kanji, kanji and kanji, or probably kana
and kana) the system returns a number from 0 to 1, where 0 means not
similar and 1 means similar. Can we use this for disambiguation? They
are using decision lists. They look at things that have been
categorized and not, e.g. 夏は寒いです ("summer is cold") and
冬は寒いです ("winter is cold"), and check how similar the two are to
each other. They use 12 novels available for free on the internet. In
the paper they present a table of similar terms. Lots of the entries
are numbers (1 and 2 are similar), and there are some antonyms as
well, such as mother and father.

There were lots of questions on this paper as well. In the table he
presents, "月" (month) and "年" (year) also came out as similar. They
are really only similar in certain contexts.


Akiko Aizawa - "On the Effect of Corpus Size in Words Similarity
Calculation" - National Institute of Informatics

The main focus is on how the corpus quality for extracting synonyms
changes the output. There are two ways to do the extraction:
pattern-based (A such as B) and co-occurrence vectors. When the
corpora are made larger, does that help? (Particularly in the case of
the vector-based approaches?) The conclusion is that larger corpora
help, but you need to use a simple filter to avoid bias that emerges
from high frequency words.
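
To make the vector-based idea concrete for myself, here is a minimal sketch in Python. It is not the method from the talk: the window size, the `max_context_freq` cutoff, the function names and the toy corpus are all my own assumptions, chosen only to show how a very frequent context word can inflate similarity and how a simple frequency filter removes that effect.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_vectors(sentences, window=2, max_context_freq=None):
    """Build a context-count vector for every word in the corpus.

    max_context_freq is a hypothetical filter (an assumption, not from the
    paper): context words that occur more often than this in the whole
    corpus are skipped.
    """
    freq = Counter(w for s in sentences for w in s)
    vectors = defaultdict(Counter)
    for s in sentences:
        for i, word in enumerate(s):
            start, end = max(0, i - window), min(len(s), i + window + 1)
            for j in range(start, end):
                if j == i:
                    continue
                context = s[j]
                if max_context_freq is not None and freq[context] > max_context_freq:
                    continue  # drop high-frequency context words
                vectors[word][context] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Toy corpus, invented for illustration only.
corpus = [
    "the cat drinks the milk".split(),
    "the dog drinks the water".split(),
    "the car needs the fuel".split(),
]

raw = cooccurrence_vectors(corpus)
filtered = cooccurrence_vectors(corpus, max_context_freq=2)

print(cosine(raw["cat"], raw["car"]))            # high, only because of "the"
print(cosine(filtered["cat"], filtered["car"]))  # no shared context left
print(cosine(filtered["cat"], filtered["dog"]))  # still share "drinks"
```

In this toy corpus "cat" and "car" look similar (about 0.8) only because both sit next to "the". With the filter on, that similarity drops to zero while "cat" and "dog" stay similar through "drinks".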


Takahiro Ono, Akira Suganuma, Rin-ichiro Taniguchi - "Extraction of
the sentences whose modification relation is misunderstood for a
writing tool" - Kyushu University

They work on automatic text revision. Their focus here is on
flagging sentences that are difficult to understand and might
easily be misinterpreted. They focus on nouns with multiple
modifiers and on the dependency structure between clauses. It was
very interesting for me, but of course I had trouble following some
of the Japanese grammar vocabulary.

They did an experiment with humans.


Yu Akiyama, Masahiro Fukaya, Hajime Ohiwa, Masakazu Tateno -
"Extending Kwic Concordance by Standardization of sentence pattern" -
Keio University / Fuji Xerox

This one was skipped - cancelled at the last minute.


Yoshinobu Kano, Yusuke Miyao, Junichi Tsujii - "Candidate Reduction in
Syntactic Structure Analysis with Pure Incremental Processing" -
University of Tokyo

I didn't really follow this talk. I'm not strong on parsing, and
certainly not strong on parsing when it is discussed in Japanese.


Manuel Medina Gonzalez and Hirosato Nomura - "A Cross-Lingual Grammar
Model and its Application to Japanese-Spanish Machine Translation" -
Kyushu University

First talk in English. Their translation model is based on the
ALT-J/E model. They try to predict certain Spanish features from the
Japanese sentence, for features that are not marked in Japanese, for
example gender and number.


Yohei Seki, Koji Eguchi, Noriko Kando, Masaki Aono - "An Analysis of
Opinion-focused Summarization using Opinion Annotation" - Toyohashi
University of Technology / National Institute of Informatics


Shunpei Tatebayashi, Makoto Haraguchi - "A Coherent Text Summarization
Method based on Semantic Correlations between Sentences" - Hokkaido University

They want to summarize the important parts of long stories,
abstracting out the events and common themes in the stories. I
think. They are looking at summarization that preserves and reflects
the structure of the input: analyze the segments in the text, and
when summarizing extract only segments that are related. They also
compare against some graph-based summarization methods, so it might
be interesting to read in further detail later.


Hu Bai, Ueda Yoshihiro, Oka Mamiko - "Phrase-Representation
Summarization Method for Chinese" - Fuji Xerox

The second (and final) talk in English.
Phrase-based summarization for Chinese, for IR support. The Japanese
version that this is based on does something like sentence
simplification for summarization based on a dependency structure
parse. They use LFG to parse the Chinese, and generate the summary
sentence using a syntactic pattern.

