PV211: Introduction to Information Retrieval
https://www.fi.muni.cz/~sojka/PV211
IIR 9: Relevance feedback & Query expansion
Handout version
Petr Sojka, Hinrich Schütze et al.
Faculty of Informatics, Masaryk University, Brno
Center for Information and Language Processing, University of Munich
2024-04-24

Overview
1 Motivation
2 Relevance feedback: Basics
3 Relevance feedback: Details
4 Query expansion

Take-away today
Interactive relevance feedback: improve initial retrieval results by telling the IR system which docs are relevant / non-relevant.
Best known relevance feedback method: Rocchio feedback.
Query expansion: improve retrieval results by adding synonyms / related terms to the query.
Sources for related terms: manual thesauri, automatic thesauri, query logs.

Motivation

How can we improve recall in search?
Main topic today: two ways of improving recall: relevance feedback and query expansion.
As an example, consider the query q: [aircraft] ...
... and a document d containing "plane", but not containing "aircraft".
A simple IR system will not return d for q, even if d is the most relevant document for q!
We want to change this: return relevant documents even if there is no term match with the (original) query.

Recall
Loose definition of recall in this lecture: "increasing the number of relevant documents returned to the user".
This may actually decrease recall on some measures, e.g., when expanding "jaguar" with "panthera" ...
... which eliminates some relevant documents, but increases the number of relevant documents returned on the top pages.

Options for improving recall
Local: do a "local", on-demand analysis for a user query. Main local method: relevance feedback (Part 1).
Global: do a global analysis once (e.g., of the collection) to produce a thesaurus; use the thesaurus for query expansion (Part 2).

Google examples for query expansion
One that works well: ~flights -flight
One that doesn't work so well: ~dogs -dog

Relevance feedback: Basics

Relevance feedback: Basic idea
The user issues a (short, simple) query.
The search engine returns a set of documents.
The user marks some docs as relevant, some as non-relevant.
The search engine computes a new representation of the information need. Hope: better than the initial query.
The search engine runs the new query and returns new results.
The new results have (hopefully) better recall.
We can iterate this: several rounds of relevance feedback.
We will use the term "ad hoc retrieval" to refer to regular retrieval without relevance feedback.

Relevance feedback: Examples
We will now look at three different examples of relevance feedback that highlight different aspects of the process.

Example 1 (image search; screenshots shown in the slides):
Relevance feedback: Example 1
Results for initial query
User feedback: Select what is relevant
Results after relevance feedback

Example 2 (vector space example; figures in the slides, source: Fernando Díaz):
Vector space example: query "canine" (1)
Similarity of docs to query "canine"
User feedback: Select relevant documents
Results after relevance feedback

Example 3: A real (non-image) example
Initial query: [new space satellite applications]
Results for initial query (r = rank, "+" = marked relevant by the user):
+ 1  0.539  NASA Hasn't Scrapped Imaging Spectrometer
+ 2  0.533  NASA Scratches Environment Gear From Satellite Plan
  3  0.528  Science Panel Backs NASA Satellite Plan, But Urges Launches of Smaller Probes
  4  0.526  A NASA Satellite Project Accomplishes Incredible Feat: Staying Within Budget
  5  0.525  Scientist Who Exposed Global Warming Proposes Satellites for Climate Research
  6  0.524  Report Provides Support for the Critics Of Using Big Satellites to Study Climate
  7  0.516  Arianespace Receives Satellite Launch Pact From Telesat Canada
+ 8  0.509  Telecommunications Tale of Two Companies
The user then marks the relevant documents with "+".
Expanded query after relevance feedback
Term weights of the expanded query (weight, term):
 2.074  new            15.106  space
30.816  satellite       5.660  application
 5.991  nasa            5.196  eos
 4.196  launch          3.972  aster
 3.516  instrument      3.446  arianespace
 3.004  bundespost      2.806  ss
 2.790  rocket          2.053  scientist
 2.003  broadcast       1.172  earth
 0.836  oil             0.646  measure
Compare to the original query: [new space satellite applications]

Results for expanded query (old ranks in parentheses, "*" = previously marked relevant):
* 1 (2)  0.513  NASA Scratches Environment Gear From Satellite Plan
* 2 (1)  0.500  NASA Hasn't Scrapped Imaging Spectrometer
  3      0.493  When the Pentagon Launches a Secret Satellite, Space Sleuths Do Some Spy Work of Their Own
  4      0.493  NASA Uses 'Warm' Superconductors For Fast Circuit
* 5 (8)  0.492  Telecommunications Tale of Two Companies
  6      0.491  Soviets May Adapt Parts of SS-20 Missile For Commercial Use
  7      0.490  Gaping Gap: Pentagon Lags in Race To Match the Soviets In Rocket Launchers
  8      0.490  Rescue of Satellite By Space Agency To Cost $90 Million

Relevance feedback: Details

Key concept for relevance feedback: Centroid
The centroid is the center of mass of a set of points.
Recall that we represent documents as points in a high-dimensional space.
Thus: we can compute centroids of documents.
Definition: $\vec{\mu}(D) = \frac{1}{|D|} \sum_{d \in D} \vec{v}(d)$, where $D$ is a set of documents and $\vec{v}(d) = \vec{d}$ is the vector we use to represent document $d$.

Centroid: Examples
[Figure: two example point sets (crosses and diamonds) and their centroids.]

Rocchio algorithm
The Rocchio algorithm implements relevance feedback in the vector space model.
Rocchio chooses the query $\vec{q}_{opt}$ that maximizes
$\vec{q}_{opt} = \arg\max_{\vec{q}} \, [\mathrm{sim}(\vec{q}, \vec{\mu}(D_r)) - \mathrm{sim}(\vec{q}, \vec{\mu}(D_{nr}))]$
$D_r$: set of relevant docs; $D_{nr}$: set of non-relevant docs.
Intent: $\vec{q}_{opt}$ is the vector that separates relevant and non-relevant docs maximally.
Making some additional assumptions, we can rewrite $\vec{q}_{opt}$ as:
$\vec{q}_{opt} = \vec{\mu}(D_r) + [\vec{\mu}(D_r) - \vec{\mu}(D_{nr})]$

Rocchio algorithm
The optimal query vector is:
$\vec{q}_{opt} = \vec{\mu}(D_r) + [\vec{\mu}(D_r) - \vec{\mu}(D_{nr})] = \frac{1}{|D_r|} \sum_{\vec{d}_j \in D_r} \vec{d}_j + \left[ \frac{1}{|D_r|} \sum_{\vec{d}_j \in D_r} \vec{d}_j - \frac{1}{|D_{nr}|} \sum_{\vec{d}_j \in D_{nr}} \vec{d}_j \right]$
We move the centroid of the relevant documents by the difference between the two centroids.

Exercise: Compute Rocchio vector
[Figure: relevant documents (circles) and non-relevant documents (Xs) in the plane.]
Compute: $\vec{q}_{opt} = \vec{\mu}(D_r) + [\vec{\mu}(D_r) - \vec{\mu}(D_{nr})]$
A sketch of this computation on toy vectors follows below.
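The following is a minimal NumPy sketch of this computation. The two point sets are made-up toy vectors, not the points from the slide's figure; the formula is the theoretically optimal Rocchio query given above.

    import numpy as np

    # Toy example (hypothetical vectors): three relevant and three
    # non-relevant documents in a 2-dimensional space.
    relevant = np.array([[1.0, 3.0], [2.0, 4.0], [3.0, 3.5]])      # D_r
    nonrelevant = np.array([[4.0, 1.0], [5.0, 0.5], [6.0, 1.5]])   # D_nr

    mu_r = relevant.mean(axis=0)      # centroid mu(D_r)
    mu_nr = nonrelevant.mean(axis=0)  # centroid mu(D_nr)

    # Theoretically optimal Rocchio query: q_opt = mu(D_r) + [mu(D_r) - mu(D_nr)]
    q_opt = mu_r + (mu_r - mu_nr)

    print("mu(D_r)  =", mu_r)
    print("mu(D_nr) =", mu_nr)
    print("q_opt    =", q_opt)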
Rocchio illustrated
[Figure: relevant documents (circles), non-relevant documents (Xs), the centroids $\vec{\mu}_R$ and $\vec{\mu}_{NR}$, the difference vector $\vec{\mu}_R - \vec{\mu}_{NR}$, and the resulting $\vec{q}_{opt}$.]
Circles: relevant documents; Xs: non-relevant documents.
$\vec{\mu}_R$: centroid of the relevant documents. $\vec{\mu}_R$ does not separate relevant from non-relevant documents.
$\vec{\mu}_{NR}$: centroid of the non-relevant documents.
$\vec{\mu}_R - \vec{\mu}_{NR}$: difference vector.
Add the difference vector to $\vec{\mu}_R$ ...
... to get $\vec{q}_{opt}$.
$\vec{q}_{opt}$ separates relevant from non-relevant documents perfectly.

Terminology
So far, we have used the name Rocchio for the theoretically better motivated original version of Rocchio.
The implementation that is actually used in most cases is the SMART implementation; this SMART version of Rocchio is what we will refer to from now on.

Rocchio 1971 algorithm (SMART)
Used in practice:
$\vec{q}_m = \alpha \vec{q}_0 + \beta \vec{\mu}(D_r) - \gamma \vec{\mu}(D_{nr}) = \alpha \vec{q}_0 + \beta \frac{1}{|D_r|} \sum_{\vec{d}_j \in D_r} \vec{d}_j - \gamma \frac{1}{|D_{nr}|} \sum_{\vec{d}_j \in D_{nr}} \vec{d}_j$
$\vec{q}_m$: modified query vector; $\vec{q}_0$: original query vector; $D_r$ and $D_{nr}$: sets of known relevant and non-relevant documents, respectively; $\alpha$, $\beta$, $\gamma$: weights.
The new query moves towards the relevant documents and away from the non-relevant documents.
Tradeoff $\alpha$ vs. $\beta/\gamma$: if we have a lot of judged documents, we want a higher $\beta/\gamma$.
Set negative term weights to 0: a "negative weight" for a term doesn't make sense in the vector space model.
(A code sketch of this update follows after the assumptions discussion below.)

Positive vs. negative relevance feedback
Positive feedback is more valuable than negative feedback.
For example, set $\beta = 0.75$, $\gamma = 0.25$ to give higher weight to positive feedback.
Many systems only allow positive feedback.

Relevance feedback: Assumptions
When can relevance feedback enhance recall?
Assumption A1: The user knows the terms in the collection well enough for an initial query.
Assumption A2: Relevant documents contain similar terms (so I can "hop" from one relevant document to a different one when giving relevance feedback).

Violation of A1
Assumption A1: The user knows the terms in the collection well enough for an initial query.
Violation: mismatch between the searcher's vocabulary and the collection vocabulary.
Example: cosmonaut / astronaut

Violation of A2
Assumption A2: Relevant documents are similar.
Example of a violation: [contradictory government policies]
Several unrelated "prototypes":
Subsidies for tobacco farmers vs. anti-smoking campaigns
Aid for developing countries vs. high tariffs on imports from developing countries
Relevance feedback on tobacco docs will not help with finding docs on developing countries.
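As promised above, here is a minimal NumPy sketch of the SMART Rocchio update. The vocabulary, document vectors, and judged sets are hypothetical; the weights follow the illustrative setting $\alpha = 1$, $\beta = 0.75$, $\gamma = 0.25$ from the positive-vs-negative-feedback slide.

    import numpy as np

    def rocchio_smart(q0, rel_docs, nonrel_docs, alpha=1.0, beta=0.75, gamma=0.25):
        """SMART Rocchio update: q_m = alpha*q0 + beta*mu(D_r) - gamma*mu(D_nr)."""
        q_m = alpha * q0
        if len(rel_docs) > 0:
            q_m = q_m + beta * np.mean(rel_docs, axis=0)
        if len(nonrel_docs) > 0:
            q_m = q_m - gamma * np.mean(nonrel_docs, axis=0)
        # Negative term weights make no sense in the vector space model: clip to 0.
        return np.maximum(q_m, 0.0)

    # Hypothetical 4-term vocabulary [new, space, satellite, application].
    q0 = np.array([1.0, 1.0, 1.0, 1.0])
    rel = np.array([[0.0, 2.0, 3.0, 0.5], [0.5, 1.5, 2.5, 0.0]])   # judged relevant
    nonrel = np.array([[2.0, 0.0, 0.5, 0.0]])                      # judged non-relevant
    print(rocchio_smart(q0, rel, nonrel))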
Relevance feedback: Evaluation
Pick an evaluation measure, e.g., precision in the top 10: P@10.
Compute P@10 for the original query q0.
Compute P@10 for the modified relevance feedback query q1.
In most cases, q1 is spectacularly better than q0!
Is this a fair evaluation?

Relevance feedback: Evaluation
A fair evaluation must be done on the "residual" collection: the docs not yet judged by the user.
Studies have shown that relevance feedback is successful when evaluated this way.
Empirically, one round of relevance feedback is often very useful. Two rounds are marginally useful.

Evaluation: Caveat
A true evaluation of usefulness must compare to other methods taking the same amount of time.
Alternative to relevance feedback: the user revises and resubmits the query.
Users may prefer revision/resubmission to having to judge the relevance of documents.
There is no clear evidence that relevance feedback is the "best use" of the user's time.

Exercise
Do search engines use relevance feedback? Why?

Relevance feedback: Problems
Relevance feedback is expensive.
Relevance feedback creates long modified queries, and long queries are expensive to process.
Users are reluctant to provide explicit feedback.
It is often hard to understand why a particular document was retrieved after applying relevance feedback.
The search engine Excite had full relevance feedback at one point, but abandoned it later.

Pseudo-relevance feedback
Pseudo-relevance feedback automates the "manual" part of true relevance feedback.
Pseudo-relevance feedback algorithm:
Retrieve a ranked list of hits for the user's query.
Assume that the top k documents are relevant.
Do relevance feedback (e.g., Rocchio).
Works very well on average, but can go horribly wrong for some queries because of query drift.
If you do several iterations of pseudo-relevance feedback, then you will get query drift for a large proportion of queries.
(A code sketch of this loop follows after the TREC4 numbers below.)

Pseudo-relevance feedback at TREC4
Cornell SMART system. Results show the number of relevant documents out of the top 100 for 50 queries (so the total number of returned documents is 5000):

method         number of relevant documents
lnc.ltc        3210
lnc.ltc-PsRF   3634
Lnu.ltu        3709
Lnu.ltu-PsRF   4350

The results contrast two length-normalization schemes (L vs. l) and pseudo-relevance feedback (PsRF).
The pseudo-relevance feedback method used added only 20 terms to the query. (Rocchio will add many more.)
This demonstrates that pseudo-relevance feedback is effective on average.
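As promised above, a minimal sketch of one round of pseudo-relevance feedback over a toy corpus. The document-term matrix, the search function, and the weight settings are all hypothetical and serve only to illustrate the loop: retrieve, assume the top k hits are relevant, apply a positive-only Rocchio update.

    import numpy as np

    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

    # Toy document-term matrix (hypothetical 4-term vocabulary).
    DOCS = np.array([
        [0.0, 2.0, 3.0, 0.0],
        [1.0, 0.0, 0.5, 2.0],
        [0.5, 1.5, 2.5, 0.0],
        [2.0, 0.0, 0.0, 1.0],
    ])

    def search(q, k):
        """Rank the toy corpus by cosine similarity to q; return the top-k vectors."""
        ranked = sorted(range(len(DOCS)), key=lambda i: cosine(q, DOCS[i]), reverse=True)
        return DOCS[ranked[:k]]

    def pseudo_relevance_feedback(q0, k=2, alpha=1.0, beta=0.75):
        """One round of pseudo-relevance feedback: assume the top k hits are
        relevant and apply a positive-only Rocchio update."""
        top_k = search(q0, k)                      # pretend the top k are relevant
        q1 = alpha * q0 + beta * top_k.mean(axis=0)
        return np.maximum(q1, 0.0)                 # clip negative term weights

    q0 = np.array([0.0, 1.0, 1.0, 0.0])
    print(pseudo_relevance_feedback(q0))

Iterating this update several times is exactly what tends to cause query drift for many queries.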
Query expansion

Query expansion: Example
[Figure: screenshot of a query expansion example.]

Types of user feedback
The user gives feedback on documents: more common in relevance feedback.
The user gives feedback on words or phrases: more common in query expansion.

Query expansion
Query expansion is another method for increasing recall.
We use "global query expansion" to refer to "global methods for query reformulation".
In global query expansion, the query is modified based on some global resource, i.e., a resource that is not query-dependent.
Main information we use: (near-)synonymy.

"Global" resources used for query expansion
A publication or database that collects (near-)synonyms is called a thesaurus.
Manual thesaurus (maintained by editors, e.g., PubMed)
Automatically derived thesaurus (e.g., based on co-occurrence statistics)
Query equivalence based on query log mining (common on the web, as in the "palm" example)

Thesaurus-based query expansion
For each term t in the query, expand the query with the words the thesaurus lists as semantically related to t.
Example from earlier: hospital → medical
Generally increases recall.
May significantly decrease precision, particularly with ambiguous terms: interest rate → interest rate fascinate
Widely used in specialized search engines for science and engineering.
It is very expensive to create a manual thesaurus and to maintain it over time.

Example of a manual thesaurus: PubMed
[Figure: screenshot of the PubMed (MeSH) thesaurus.]

Automatic thesaurus generation
Attempt to generate a thesaurus automatically by analyzing the distribution of words in documents.
Fundamental notion: similarity between two words.
Definition 1: Two words are similar if they co-occur with similar words.
"car" ≈ "motorcycle" because both occur with "road", "gas", and "license", so they must be similar.
Definition 2: Two words are similar if they occur in a given grammatical relation with the same words.
You can harvest, peel, eat, prepare, etc., both apples and pears, so apples and pears must be similar.
Co-occurrence is more robust; grammatical relations are more accurate.
A small sketch of co-occurrence-based similarity follows below.
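The following is a minimal sketch of Definition 1 (similarity via shared co-occurring words), as referenced above. The five-sentence corpus is made up; a real automatic thesaurus would be built from a large collection and would use an association weighting such as PMI rather than raw counts.

    from collections import Counter
    from itertools import combinations
    import math

    # Tiny hypothetical corpus.
    sentences = [
        "the car needs gas for the road".split(),
        "a motorcycle needs gas and a license".split(),
        "the car license expired on the road".split(),
        "peel and eat the apple".split(),
        "peel and eat the pear".split(),
    ]

    # Count co-occurrences of words within the same sentence.
    cooc = Counter()
    for sent in sentences:
        for w1, w2 in combinations(set(sent), 2):
            cooc[(w1, w2)] += 1
            cooc[(w2, w1)] += 1

    vocab = sorted({w for sent in sentences for w in sent})

    def vector(word):
        """Co-occurrence vector of `word` over the vocabulary."""
        return [cooc[(word, w)] for w in vocab]

    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
        return dot / norm if norm else 0.0

    # Words are similar if they co-occur with similar words (Definition 1).
    print(cosine(vector("car"), vector("motorcycle")))  # higher: shared contexts needs, gas, license
    print(cosine(vector("car"), vector("pear")))        # lower: almost no shared contexts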
Co-occurrence-based thesaurus: Examples

Word          Nearest neighbors
absolutely    absurd, whatsoever, totally, exactly, nothing
bottomed      dip, copper, drops, topped, slide, trimmed
captivating   shimmer, stunningly, superbly, plucky, witty
doghouse      dog, porch, crawling, beside, downstairs
makeup        repellent, lotion, glossy, sunscreen, skin, gel
mediating     reconciliation, negotiate, case, conciliation
keeping       hoping, bring, wiping, could, some, would
lithographs   drawings, Picasso, Dali, sculptures, Gauguin
pathogens     toxins, bacteria, organisms, bacterial, parasite
senses        grasp, psyche, truly, clumsy, naive, innate

WordSpace demo on the web

Soft cosine measure
Use a matrix S that specifies the cosine similarity of basis vectors (i.e., of words) in Salton's vector space model.
Definition 3: The similarity of two words is proportional to their cosine similarity.
"car" ≈ "motorcycle" iff cos("car", "motorcycle") ≈ 1.
When the search engine supports a non-orthogonal vector space model, we can directly compute the soft cosine measure (SCM) between document vectors u and v by computing the matrix product $u^T S v$.
Otherwise, we can expand the text query as follows:
1. Translate the text query to a query vector u.
2. Compute $u' = uS$.
3. Translate u' back to a (now expanded) text query.
Unlike a thesaurus based on word co-occurrences, the matrix S can be derived from word embeddings, the Levenshtein distance, and other measures of word similarity / relatedness.

SCM query expansion: Example
Query expansion using a Gram matrix S that was built from the Google News word embeddings distributed with Word2Vec:
Original query: "I did enact Julius Caesar: I was killed i' the Capitol"
Expanded query: "Give␣unto␣Caesar Brutus␣Cassius choreographers␣Bosco Julius␣Caesar therefore␣unto␣Caesar Marcus␣Antonius Caesarion Gallic␣Wars Marcus␣Crassus Antoninus Catiline Seleucus Gaius␣Julius␣Caesar Theodoric Marcus␣Tullius␣Cicero ... Kenneth Philip Marcus Arthur Carl Fred Edward Jonathan Eric Frank Anthony William Richard Robert enact Capitol killed I didn't honestly myself I I my we the 'd 'm did was"
We can include only highly similar words in the expanded query.
Search engines such as Apache Lucene make it possible to assign weights to words in text queries.
A minimal sketch of the SCM and of the expansion step $u' = uS$ follows below.

Query expansion at search engines
The main source of query expansion at search engines: query logs.
Example 1: After issuing the query [herbs], users frequently search for [herbal remedies]. → "herbal remedies" is a potential expansion of "herbs".
Example 2: Users searching for [flower pix] frequently click on the URL photobucket.com/flower. Users searching for [flower clipart] frequently click on the same URL. → "flower clipart" and "flower pix" are potential expansions of each other.
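As referenced above, a minimal sketch of the soft cosine measure and of the expansion step $u' = uS$. The three-word vocabulary and the similarity matrix S are hand-written assumptions; in practice S would be derived from word embeddings or another word-relatedness measure. The function below uses the normalized form of the SCM, whose numerator is the matrix product $u^T S v$ from the slide.

    import numpy as np

    # Hypothetical 3-term vocabulary.
    vocab = ["car", "motorcycle", "apple"]

    # Hand-written word-similarity (Gram) matrix S; in practice derived from
    # word embeddings, Levenshtein distance, or co-occurrence statistics.
    S = np.array([
        [1.0, 0.8, 0.1],   # car
        [0.8, 1.0, 0.1],   # motorcycle
        [0.1, 0.1, 1.0],   # apple
    ])

    def soft_cosine(u, v, S):
        """Soft cosine measure between u and v under similarity matrix S."""
        num = u @ S @ v
        den = np.sqrt(u @ S @ u) * np.sqrt(v @ S @ v)
        return num / den

    q = np.array([1.0, 0.0, 0.0])   # query containing only "car"
    d = np.array([0.0, 1.0, 0.0])   # document containing only "motorcycle"
    print(soft_cosine(q, d, S))     # > 0 although q and d share no terms

    # Query expansion when the engine only supports orthogonal term vectors:
    # compute u' = uS and keep the highly weighted terms as an expanded query.
    u_expanded = q @ S
    for term, weight in zip(vocab, u_expanded):
        if weight > 0.5:            # include only highly similar words
            print(term, round(weight, 2))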
Take-away today
Interactive relevance feedback: improve initial retrieval results by telling the IR system which docs are relevant / non-relevant.
Best known relevance feedback method: Rocchio feedback.
Query expansion: improve retrieval results by adding synonyms / related terms to the query.
Sources for related terms: manual thesauri, automatic thesauri, query logs.

Resources
Chapter 9 of IIR
Resources at https://www.fi.muni.cz/~sojka/PV211/ and http://cislmu.org, materials in MU IS and the FI MU library
Daniel Tunkelang's articles on query understanding, namely on query relaxation and query expansion
Salton and Buckley 1990 (original relevance feedback paper)
Spink, Jansen, and Ozmultu 2000: Relevance feedback at Excite
Justin Bieber: related searches fail
Word Space
Schütze 1998: Automatic word sense discrimination (describes a simple method for automatic thesaurus generation)
Sidorov et al. 2014: Soft similarity and soft cosine measure: Similarity of features in vector space model
Charlet and Damnati 2017: SimBow at SemEval-2017 Task 3: Soft-Cosine Semantic Similarity between Questions for Community Question Answering (describes two matrices S)