Database Exploration
Abstract
Literature Survey
Genome (http://genome.ucsc.edu/)
Sky Server (http://cas.sdss.org/)
Proposed Solution
Architecture
Re-Ranking based on KL Divergence
Methodology / Implementation Details
Session summaries
Recommendation seed computation
Generation of query recommendations
Query Processing
Query Relaxation
Query Parsing
Fragment-Based Recommendations
Session Summary:
The session summary vector Si for a user i consists of all the query fragments of the user's past queries.
Let Qi represent the set of queries posed by user i during a session, and let F represent the set of all distinct query fragments recorded in the query logs.
We assume that the vector SQ represents a single query Q ∈ Qi. For a given fragment f ∈ F, we define SQ[f] as a binary variable that represents the presence or absence of f in query Q.
Then Si[f] represents the importance of fragment f in session Si.
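To make the definition concrete, here is a minimal Python sketch. The fragment extractor and the choice to score importance Si[f] as the count of presence indicators SQ[f] over the session are assumptions; the slides only define SQ[f] as binary and call Si[f] an importance weight.

```python
from collections import Counter

def session_summary(session_queries, extract_fragments):
    """Build the session summary vector Si from the queries Qi of one user.

    extract_fragments(Q) should return the fragments of query Q
    (the parser is assumed here, not specified in the slides).
    SQ[f] is the binary presence of fragment f in query Q; we take
    Si[f] to be the sum of those indicators over the session.
    """
    summary = Counter()
    for query in session_queries:                        # Q in Qi
        for fragment in set(extract_fragments(query)):   # SQ[f] = 1
            summary[fragment] += 1
    return summary

# Hypothetical usage, with whitespace tokenization standing in for the parser:
queries = ["SELECT name FROM stars", "SELECT ra, dec FROM stars"]
print(session_summary(queries, lambda q: q.lower().split()))
```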
Fragment-Based Recommendations
Predicted Summary Computation
Generation of Query Recommendations
Query Relaxation
Query Parsing
Our Contribution...
Kullback–Leibler Divergence:
KL divergence is a special case of a broader
class of divergences called f-divergences. It
was originally introduced by Solomon
Kullback and Richard Leibler in 1951 as the
directed divergence between two
distributions. It can be derived from a
Bregman divergence.
Our Contribution...
For discrete probability distributions P and Q, the KL divergence from Q to P is defined as

\[ D_{\mathrm{KL}}(P \,\|\, Q) = \sum_i P(i) \ln \frac{P(i)}{Q(i)}. \]

In words, it is the expectation of the logarithmic difference between the probabilities P and Q, where the expectation is taken using the probabilities P. The KL divergence is only defined if P and Q both sum to 1 and if Q(i) = 0 implies P(i) = 0 for all i (absolute continuity). If the quantity 0 ln 0 appears in the formula, it is interpreted as zero because

\[ \lim_{x \to 0^{+}} x \ln x = 0. \]

For distributions P and Q of a continuous random variable, the KL divergence is defined to be the integral

\[ D_{\mathrm{KL}}(P \,\|\, Q) = \int_{-\infty}^{\infty} p(x) \ln \frac{p(x)}{q(x)} \, dx, \]

where p and q denote the densities of P and Q.
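As a sanity check on the discrete definition, a small Python sketch that enforces the absolute-continuity condition and the 0 ln 0 = 0 convention:

```python
import math

def kl_divergence(p, q):
    """D_KL(P || Q) = sum_i P(i) * ln(P(i) / Q(i)).

    Both p and q are assumed to sum to 1, and q[i] == 0 must imply
    p[i] == 0 (absolute continuity); 0 * ln 0 is taken as 0.
    """
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # 0 ln 0 -> 0
        if qi == 0.0:
            raise ValueError("P is not absolutely continuous w.r.t. Q")
        total += pi * math.log(pi / qi)
    return total

print(kl_divergence([0.5, 0.5], [0.9, 0.1]))  # > 0; zero iff P == Q
```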
Our Contribution...
More generally, if P and Q are probability measures over a set X, and P is absolutely continuous with respect to Q, then the Kullback–Leibler divergence from P to Q is defined as

\[ D_{\mathrm{KL}}(P \,\|\, Q) = \int_X \ln \frac{dP}{dQ} \, dP, \]

where dP/dQ is the Radon–Nikodym derivative of P with respect to Q, provided the expression on the right-hand side exists. Equivalently, this can be written as

\[ D_{\mathrm{KL}}(P \,\|\, Q) = \int_X \frac{dP}{dQ} \ln \frac{dP}{dQ} \, dQ, \]
Our Contribution...
which we recognize as the entropy of P relative to Q. Continuing in this case, if μ is any measure on X for which the densities p = dP/dμ and q = dQ/dμ exist, then the KL divergence from P to Q is given as

\[ D_{\mathrm{KL}}(P \,\|\, Q) = \int_X p \ln \frac{p}{q} \, d\mu. \]
Our Contribution...
Agglomerative Clustering Algorithm
The algorithm forms clusters in a bottom-up manner, as follows:
1. Initially, put each article in its own cluster.
2. Among all current clusters, pick the two clusters with the smallest distance.
3. Replace these two clusters with a new cluster, formed by merging the two original ones.
4. Repeat the above two steps until there is only one remaining cluster in the pool.
Thus, the agglomerative clustering algorithm results in a binary cluster tree with single-article clusters as its leaf nodes and a root node containing all the articles. In the clustering algorithm, we use a distance measure based on log likelihood. For articles A and B, the distance is defined as

\[ d(A, B) = LL(A) + LL(B) - LL(A \cup B). \]
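A minimal Python sketch of the merge loop in steps 1–4, assuming each article is represented as a word-count dictionary; the distance function is taken as a parameter, and a log-likelihood implementation of it follows the definitions below.

```python
from itertools import combinations

def agglomerate(articles, dist):
    """Bottom-up clustering: repeatedly merge the two closest clusters.

    `articles` is a list of word-count dicts; `dist` is a distance
    function over clusters. Returns the merge history as a list of
    (left, right, merged) tuples; the last merger is the root.
    """
    clusters = [dict(a) for a in articles]  # step 1: one cluster per article
    history = []
    while len(clusters) > 1:
        # Step 2: find the pair of clusters with the smallest distance.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda p: dist(clusters[p[0]], clusters[p[1]]))
        # Step 3: replace the pair with their merger (summed word counts).
        merged = dict(clusters[i])
        for w, c in clusters[j].items():
            merged[w] = merged.get(w, 0) + c
        history.append((clusters[i], clusters[j], merged))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return history
```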
Our Contribution...
The log likelihood LL(X) of an article or cluster X is given by a unigram model:

\[ LL(X) = \sum_{w} c_X(w) \log p_X(w). \]

Here, c_X(w) and p_X(w) are the count and probability, respectively, of word w in cluster X, and N_X is the total number of words occurring in cluster X. Notice that this definition is equivalent to the weighted information loss after merging two articles:

\[ d(A, B) = N_A\, D\big(p_A \,\|\, p_{A \cup B}\big) + N_B\, D\big(p_B \,\|\, p_{A \cup B}\big), \]

with D(· ‖ ·) the KL divergence and p_{A∪B} the unigram distribution of the merged cluster,
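A small sketch of this distance under the definitions above, assuming maximum-likelihood unigram probabilities p_X(w) = c_X(w)/N_X; it can be passed as `dist` to the merge loop sketched earlier.

```python
import math

def ll(counts):
    """Unigram log likelihood LL(X) = sum_w c_X(w) * log p_X(w)."""
    n = sum(counts.values())  # N_X, total words in the cluster
    return sum(c * math.log(c / n) for c in counts.values())

def merge_counts(a, b):
    """Pool the word counts of two clusters."""
    merged = dict(a)
    for w, c in b.items():
        merged[w] = merged.get(w, 0) + c
    return merged

def distance(a, b):
    """Information loss on merging: d(A, B) = LL(A) + LL(B) - LL(A ∪ B)."""
    return ll(a) + ll(b) - ll(merge_counts(a, b))
```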
Our Contribution...
where C1 and C2 are two clusters, and A and B are articles from C1 and C2, respectively. Once a cluster tree is created, we must decide where to slice the tree to obtain disjoint partitions for building cluster-specific LMs. This is equivalent to choosing the total number of clusters. There is a tradeoff involved in this choice. Clusters close to the leaves preserve more specifics of the word distributions, while clusters close to the root of the tree yield LMs with more reliable estimates, because of the larger amount of data. We roughly optimized the number of clusters by evaluating the perplexity of the Hub4 development test set. We created sets of 1, 5, 10, 15, and 20 article clusters by slicing the cluster tree at different points. A backoff trigram model was built for each cluster and interpolated with a trigram model derived from all articles for smoothing, to compensate for the different amounts of training data per cluster.
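The smoothing step can be sketched as a simple linear mixture. The weight `lam` is an assumed per-cluster parameter (the slides do not give one), and a real system would apply this inside a backoff trigram model rather than to isolated probabilities.

```python
def smoothed_prob(p_cluster, p_general, lam=0.5):
    """Interpolate a cluster trigram probability with the all-articles
    trigram probability, compensating for sparse per-cluster data."""
    return lam * p_cluster + (1.0 - lam) * p_general
```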
Our Contribution...
The cluster-specific LMs are weighted by the posterior cluster probabilities, which follow from Bayes' rule:

\[ P(\mathrm{LM}_i \mid A) = \frac{P(A \mid \mathrm{LM}_i)\, P(\mathrm{LM}_i)}{\sum_j P(A \mid \mathrm{LM}_j)\, P(\mathrm{LM}_j)}, \]

where P(A | LM_i) is the likelihood of story A under cluster LM i, and P(LM_i) and P(LM_i | A) are the prior and posterior cluster probabilities, respectively. In training, A is the reference transcript for one story from the Hub4 development data. During testing, A is the 1-best hypothesis for the story, as determined using the standard LM.
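A hedged sketch of computing these posterior weights from per-cluster story likelihoods, done in log space for numerical stability; estimating the priors P(LM_i) from relative cluster sizes is an assumption here.

```python
import math

def cluster_posteriors(log_likelihoods, priors):
    """P(LM_i | A) ∝ P(A | LM_i) * P(LM_i), normalized over clusters.

    `log_likelihoods[i]` is log P(A | LM_i) for story A under cluster
    LM i; `priors[i]` is P(LM_i) (assumed > 0). Returns the posteriors.
    """
    scores = [ll + math.log(p) for ll, p in zip(log_likelihoods, priors)]
    m = max(scores)                        # log-sum-exp trick
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]
```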
Our Contribution...
Re-ranking based on clarity score
Reranking algorithms can mainly be categorized into two approaches: pseudo-relevance feedback and graph-based reranking.
The pseudo-relevance-feedback approach takes the top results as relevant samples and then collects some samples that are assumed to be irrelevant.
The graph-based reranking approach usually follows two assumptions. First, the disagreement between the initial ranking list and the refined ranking list should be small. Second, the approach constructs a graph where the vertices are images or videos and the edges reflect their pairwise similarities.
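One common instantiation of these two assumptions is manifold-ranking-style score propagation; the sketch below is illustrative, not necessarily the method the slides have in mind. `alpha` trades smoothness over the similarity graph against fidelity to the initial ranking list.

```python
import numpy as np

def graph_rerank(initial_scores, similarity, alpha=0.85, iters=50):
    """Refine ranking scores on a pairwise-similarity graph.

    `similarity` is an n x n nonnegative matrix over the images/videos
    (the graph's edge weights); `initial_scores` is the initial list.
    Each iteration averages over neighbors (weight alpha) while pulling
    back toward the initial scores (weight 1 - alpha).
    """
    S = np.asarray(similarity, dtype=float)
    # Row-normalize so each vertex averages over its neighbors.
    S = S / np.maximum(S.sum(axis=1, keepdims=True), 1e-12)
    y = np.asarray(initial_scores, dtype=float)
    r = y.copy()
    for _ in range(iters):
        r = alpha * S @ r + (1.0 - alpha) * y
    return r  # re-rank items by descending refined score
```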
Thank You.