Guoxing Li
Stanford University
Stanford, CA 94305
guoxing@stanford.edu

Nathan Eidelson
Stanford University
Stanford, CA 94305
nathanje@stanford.edu

Nishith Khandwala
Stanford University
Stanford, CA 94305
nishith@stanford.edu
Abstract
How can machines communicate with humans in a more natural way? To answer this question, a machine first needs to express its knowledge in a human-friendly format, namely natural language. In this paper, we present a system that generates human-friendly sentences from structured knowledge, specifically relationship tuples. The model adopts an unsupervised approach and takes advantage of a powerful bi-directional recurrent neural network language model and GloVe word vectors to generate sentences that convey the same knowledge as the original relationship tuples. Our experiments show that the performance of our system is still suboptimal. However, it should serve as an initial attempt at the task of expressing machine knowledge in natural language.
1 Introduction
Large online databases such as Freebase1 consolidate vast amounts of information, crowd-sourced and vetted by active contributors. The data is exposed to external developers through well-documented APIs. However, it is not intended for consumption by the general public in the same way Wikipedia2 articles are. The Freebase data is stored in complex nested MQL formats that extend several layers deep and link to other objects in the database. Wikipedia articles are easily understood by humans, while Freebase is easily understood by machines. Not only is information ultimately duplicated between these two platforms, but crowd-sourcing efforts are doubled in their maintenance. We hypothesize that the
1 https://www.freebase.com
2 https://www.wikipedia.org/
2 Motivation
3 Related Work
4 Approach

4.1 Relation Extraction and Data Generation
(X, r_k, Z_1)
...
(X, r_k, Z_p)
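As a minimal illustrative sketch of this expansion step (written in Python; the function name and the example data are hypothetical and only meant to show the shape of the output), a one-to-many relation can be flattened into individual (subject, relation, object) tuples:

# Hypothetical sketch: flatten a one-to-many relation r_k for a subject X
# into individual (X, r_k, Z_i) tuples, matching the display above.
def expand_relation(subject, relation, objects):
    """Return one (subject, relation, object) tuple per object value Z_i."""
    return [(subject, relation, obj) for obj in objects]

tuples = expand_relation("Barack Obama", "children",
                         ["Malia Obama", "Natasha Obama"])
# [('Barack Obama', 'children', 'Malia Obama'),
#  ('Barack Obama', 'children', 'Natasha Obama')]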
Sentence Generation
A recurrent neural network (RNN) is a type of network in which connections between units form a directed cycle. In other words, the hidden layer from the previous timestep is combined with the input layer to compute the hidden layer at the current timestep. RNNs have shown great success in a range of natural language processing tasks, especially in language modeling (Mikolov et al., 2010; Mikolov and Zweig, 2012). However, such RNNs only have information about the past when making a decision on $x_t$, which is not sufficient for the word-filling task, since the context that follows (especially the immediately following word) often also determines the word to be filled in. One way to resolve this issue is to use a bi-directional recurrent neural network (BRNN) (Schuster and Paliwal, 1997), which reads the sequence in both directions:
$\overrightarrow{h}_t = \mathrm{sigmoid}\left(\overrightarrow{L}\, x_t + \overrightarrow{V}\, \overrightarrow{h}_{t-1}\right)$ (1)

$\overleftarrow{h}_t = \mathrm{sigmoid}\left(\overleftarrow{L}\, x_t + \overleftarrow{V}\, \overleftarrow{h}_{t+1}\right)$ (2)

$y_t = \mathrm{softmax}\left(\overrightarrow{U}\, \overrightarrow{h}_t + \overleftarrow{U}\, \overleftarrow{h}_t\right)$ (3)
Here, $y_t$ gives the probability of every word in the vocabulary appearing at slot $t$. Figure 3 demonstrates our model.
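For concreteness, the following is a minimal numpy sketch of Equations (1)-(3). The dimensions, initialization, and variable names are illustrative only, not those of our trained model, and the sketch does not show how the input for a missing slot is represented during word filling.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

rng = np.random.default_rng(0)
vocab_size, d_word, d_hidden = 8000, 50, 100         # toy dimensions

# Forward (->) and backward (<-) parameters of Equations (1)-(3).
L_f = rng.normal(0, 0.01, (d_hidden, d_word))
V_f = rng.normal(0, 0.01, (d_hidden, d_hidden))
L_b = rng.normal(0, 0.01, (d_hidden, d_word))
V_b = rng.normal(0, 0.01, (d_hidden, d_hidden))
U_f = rng.normal(0, 0.01, (vocab_size, d_hidden))
U_b = rng.normal(0, 0.01, (vocab_size, d_hidden))

def brnn_predict(x_seq):
    """x_seq: list of word vectors; returns one vocabulary distribution per slot."""
    T = len(x_seq)
    h_f = [np.zeros(d_hidden) for _ in range(T)]
    h_b = [np.zeros(d_hidden) for _ in range(T)]
    for t in range(T):                                # Eq. (1): left-to-right pass
        prev = h_f[t - 1] if t > 0 else np.zeros(d_hidden)
        h_f[t] = sigmoid(L_f @ x_seq[t] + V_f @ prev)
    for t in reversed(range(T)):                      # Eq. (2): right-to-left pass
        nxt = h_b[t + 1] if t < T - 1 else np.zeros(d_hidden)
        h_b[t] = sigmoid(L_b @ x_seq[t] + V_b @ nxt)
    # Eq. (3): combine both directions into a distribution over the vocabulary.
    return [softmax(U_f @ h_f[t] + U_b @ h_b[t]) for t in range(T)]

probs = brnn_predict([rng.normal(size=d_word) for _ in range(5)])
# probs[2] is an 8000-dimensional distribution for the word at slot t = 2.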
The BRNN is a core part of our sentence generation system. It is able to predict a single missing word based on the context around that word, which is an essential building block of the sentence generation process. Note that within our greedy search algorithm, the BRNN model does suffer from accuracy issues, since the context provided is incomplete most of the time. However, we think this is a fair trade-off for the large gain in efficiency.
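The following Python sketch shows the shape of this greedy filling idea; the helper predict_next (which would wrap a BRNN query) and the toy scorer are hypothetical stand-ins for our actual search procedure:

# Hypothetical sketch of greedy word filling: grow a partial sentence around
# the tuple words one word at a time, always committing to the single most
# probable word instead of exploring every completion.
def greedy_fill(seed_words, predict_next, max_extra=6):
    """seed_words: tuple words, e.g. ["Larry", "Page", "board", "Google"].
    predict_next(words, side) is assumed to return (best_word, probability)
    for adding a word on the given side ("left" or "right")."""
    sentence = list(seed_words)
    for _ in range(max_extra):
        left_word, p_left = predict_next(sentence, "left")
        right_word, p_right = predict_next(sentence, "right")
        if p_left >= p_right:
            sentence.insert(0, left_word)
        else:
            sentence.append(right_word)
    return " ".join(sentence)

# Toy stand-in scorer so the sketch runs end to end.
def dummy_predict_next(words, side):
    return ("the", 0.5) if side == "left" else ("company", 0.4)

greedy_fill(["Larry", "Page", "board", "Google"], dummy_predict_next, max_extra=2)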
4.4 Evaluation Metric

http://www.nltk.org/

5 Experiments

In this section, we introduce our experiment settings, show the experimental results of our system, and then analyze them.

5.1 Datasets

https://www.cis.upenn.edu/~treebank/

5.2 Word Vectors

5.3 Training

5.4 Results
Quantitative: Table 1 shows our experimental results. We used two metrics: the top-1 sentence accuracy rate and the top-10 sentence accuracy rate. The top-n sentence accuracy rate is the number of true positives over all test examples, where an example counts as a true positive if at least one correct sentence appears among the top n sentences generated by our system, ranked by average perplexity. From the results, we can see that the system with PPDB integrated performs better than the one without PPDB, which validates our hypothesis. In general, however, our system did not perform as well as we expected. We think there are three main reasons. First, the language model was trained on complete sentences. During sentence generation, however, we have to force the model to predict missing words from half-complete sentences in order to limit the search space. Second, the training corpus is both small (56,522 sentences) and ill-suited for our task. The corpus is generated from Wall Street Journal articles. Though it is descriptive, we think a better option would be a Wikipedia corpus, since it is a knowledge-based corpus and usually contains the right format for sentences about knowledge. Lastly, we believe the model would perform better with a larger vocabulary, due to the error introduced by training on a small set of words, as mentioned in Section 5.2.
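As a small illustrative sketch of the two ideas used here (the sentences and numbers below are made up), candidates are ranked by average per-word perplexity, and an example counts as a true positive if a correct sentence appears among the top n candidates:

import math

def avg_perplexity(word_probs):
    """Average per-word perplexity from the model probability of each word."""
    log_sum = sum(math.log(p) for p in word_probs)
    return math.exp(-log_sum / len(word_probs))

def top_n_accuracy(examples, n):
    """examples: list of (ranked_candidates, correct_sentences) pairs, where
    ranked_candidates are sorted by ascending average perplexity."""
    hits = sum(1 for candidates, correct in examples
               if any(c in correct for c in candidates[:n]))
    return hits / len(examples)

# Made-up example: two candidates, the correct one ranked first.
candidates = ["Larry Page is a board member of Google .",
              "Larry Page big board Google ."]
correct = {"Larry Page is a board member of Google ."}
top_n_accuracy([(candidates, correct)], n=1)   # 1.0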
Qualitative: Table 2 lists the top 10 generated sentences for the relation tuple (Larry Page, board member, Google). Note that the words "Larry", "Page", and "Google" were not included in our 8,000-word model vocabulary during training. Our system is able to recognize Larry Page and Google as entities and fill in the proper words. Since "board" is not a frequent word in the corpus and happens to co-occur with "big" frequently, our model gives a higher rank to the phrase "big board". We believe this problem can be alleviated by a larger training corpus.
6 Future Work
Relation extraction should be used to further natural language generation. Our results indicate that tuples produced by information extraction methods reinforce the fidelity of the generated sentences. In this section, we list a few variations on our methodology that we would have liked to iterate over.
A bigger RNN model, trained for longer over a more exhaustive dataset, would certainly help with tuples containing proper nouns. As of now, our training set does not keep up with current updates and hence lacks the names of people and places often mentioned in our input tuples. We believe that a larger training set would resolve this issue.
We also considered training a model on the PPDB. Currently, our approach performs a linear search to find relevant paraphrases. To boost performance, it would be interesting to investigate how a neural network trained on the PPDB generates paraphrases compared to our existing approach.
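For reference, a linear paraphrase lookup of the kind described above can be sketched as follows (assuming the PPDB entries have already been loaded into a list of (phrase, paraphrase) pairs; the example pairs are illustrative):

def find_paraphrases(relation_phrase, ppdb_pairs):
    """Linear scan over (phrase, paraphrase) pairs for the relation phrase."""
    return [para for phrase, para in ppdb_pairs if phrase == relation_phrase]

pairs = [("board member", "member of the board"), ("children", "child")]
find_paraphrases("board member", pairs)   # ['member of the board']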
Lastly, our current sentence generation model could be modified to further augment the number of input tuples. Given a relation tuple, the order of the three constituents could be permuted. While evaluating our results, we noticed that better sentences would have been generated if the two noun phrases in the tuple had been switched in position. For instance, the tuple (Barack Obama, children, Natasha Obama) will probably not result in a well-formed sentence unless the relation phrase is modified to incorporate the fact that the relation is inverted when the noun phrases are swapped (e.g., "child of" rather than "children").
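A small Python sketch of this permutation idea follows; the inverse-relation table is purely illustrative and only shows the kind of mapping that would be required:

# Illustrative only: swap the noun phrases of a tuple and invert the relation
# phrase so that the swapped tuple still expresses the same fact.
INVERSE_RELATION = {"children": "child of", "board member": "has board member"}

def permute_tuple(np1, relation, np2):
    inverse = INVERSE_RELATION.get(relation)
    if inverse is None:
        return None                    # no safe inversion known for this relation
    return (np2, inverse, np1)

permute_tuple("Barack Obama", "children", "Natasha Obama")
# ('Natasha Obama', 'child of', 'Barack Obama')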
7 Conclusion
In this project, we attempted to generate well-formed, logically sound, and grammatically accurate sentences given relation tuples, which usually take the form (Noun Phrase 1, Relation Phrase, Noun Phrase 2). Our model comprised a relation extraction engine that collected such tuples and augmented their number, a bi-directional RNN that took these tuples as input and generated words to add before and after the relation phrase, a perplexity-based scoring metric to rank the candidate outputs, and a greedy search algorithm to limit the search scope. Upon manual evaluation of our results, our model obtained an accuracy of 0.30 without PPDB and 0.35 with PPDB when considering only the best generated sentence. Evaluating the top 10 sentences for each input tuple, the accuracies were higher: 0.45 without PPDB and 0.65 with PPDB. We believe our work is an initial attempt at generating human-readable sentences from relationship tuples, and there is still considerable room for improvement on top of our work.
Acknowledgements
We would like to thank Professors Bill MacCartney and Chris Potts for being immensely supportive of our project.
References
Mohit Iyyer, Jordan Boyd-Graber, and Hal Daumé III. Generating sentences from semantic vector space representations.

Andrej Karpathy and Li Fei-Fei. 2014. Deep visual-semantic alignments for generating image descriptions. arXiv preprint arXiv:1412.2306.

Irene Langkilde and Kevin Knight. 1998. The practical value of n-grams in generation. In Proceedings of the Ninth International Workshop on Natural Language Generation, pages 248-255. Citeseer.

Tomas Mikolov and Geoffrey Zweig. 2012. Context dependent recurrent neural network language model. In SLT, pages 234-239.

Tomas Mikolov, Martin Karafiát, Lukáš Burget, Jan Černocký, and Sanjeev Khudanpur. 2010. Recurrent neural network based language model. In INTERSPEECH 2010, 11th Annual Conference of the International Speech Communication Association, Makuhari, Chiba, Japan, September 26-30, 2010, pages 1045-1048.

Jeffrey Pennington, Richard Socher, and Christopher D. Manning. 2014. GloVe: Global vectors for word representation. In Proceedings of the Empirical Methods in Natural Language Processing (EMNLP 2014), 12:1532-1543.

Mike Schuster and Kuldip K. Paliwal. 1997. Bidirectional recurrent neural networks. Signal Processing, IEEE Transactions on, 45(11):2673-2681.

Ilya Sutskever, James Martens, and Geoffrey E. Hinton. 2011. Generating text with recurrent neural networks. In Proceedings of the 28th International Conference on Machine Learning (ICML-11), pages 1017-1024.