How to create a boolean query with both "should" and "must" clauses? #258

yattias · 2021-06-28T02:55:01Z

Questions

Hi @barseghyanartur. First, thanks for this great package. It has been extremely useful.

I was unable to find an answer to my question in the docs or by examining the source code, so I figured I'd ask here.

Basically, I need to generate a boolean query where one part is in a must clause and the other is in a should clause. Specifically, I would like to generate this query:

  "query": {
    "bool": {
      "must": [
        {
          "multi_match": {
            "fields": [ SOME_FIELDS ],
            "operator": "and",
            "query": "SOME QUERY TERMS"
          }
        }
      ],
      "should": [
        {
          "term": {
            "SPECIFIC_FIELD": "SOME QUERY TERMS"
          }
        }
      ]
    }
  }

The reason for the above is to boost a phrase match.
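As a rough sketch, the same body can be assembled as a plain Python dict, which makes the intended split explicit: the multi_match clause decides which documents match, and the term clause only adds to the score. The function name `build_boosted_query`, the `boost` parameter, and the example field names are assumptions made for illustration; they are not part of the package.

```python
import json


def build_boosted_query(terms, fields, exact_field, boost=2.0):
    """Build a bool query where multi_match is required and term only adds score.

    All names here are hypothetical; adapt them to your own mapping.
    """
    return {
        "query": {
            "bool": {
                # Every term must appear in at least one of the fields.
                "must": [
                    {
                        "multi_match": {
                            "fields": list(fields),
                            "operator": "and",
                            "query": terms,
                        }
                    }
                ],
                # Optional clause: documents that also match it score higher.
                "should": [
                    {"term": {exact_field: {"value": terms, "boost": boost}}}
                ],
            }
        }
    }


body = build_boosted_query("graph networks", ["title", "abstract"], "title.keyword")
print(json.dumps(body, indent=2))
```

Note that a term query is not analyzed, so against an analyzed text field a multi-word value will usually not match; a match_phrase clause in the should list may be closer to the stated goal of boosting a phrase match.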

With that being said, whenever I try mixing the following backends:

filter_backends = [
    PhraseSearchFilterBackend,  # Custom
    MultiMatchSearchFilterBackend,
]

My term query ends up in a must clause even though I specify matching="should" pretty much everywhere.

I even debugged this all the way to base.py, where I confirmed matching="should", yet somehow the final query ends up entirely in the "must" clause.

Any ideas what I'm doing wrong?


yattias · 2021-06-28T02:56:40Z

For reference, here is my configuration:

class PaperDocumentView(DocumentViewSet):
    document = PaperDocument
    permission_classes = [ReadOnly]
    serializer_class = PaperDocumentSerializer
    pagination_class = LimitOffsetPagination
    lookup_field = 'id'
    filter_backends = [
        PhraseSearchFilterBackend,
        MultiMatchSearchFilterBackend,
        CompoundSearchFilterBackend,
        FacetedSearchFilterBackend,
        FilteringFilterBackend,
        PostFilterFilteringFilterBackend,
        DefaultOrderingFilterBackend,
        OrderingFilterBackend,
        HighlightBackend,
    ]

    search_fields = {
        'doi': {'boost': 3, 'fuzziness': 1},
        'title': {'boost': 2, 'fuzziness': 1},
        'raw_authors.full_name': {'boost': 1, 'fuzziness': 1},
        'abstract': {'boost': 1, 'fuzziness': 1},
        'hubs_flat': {'boost': 1, 'fuzziness': 1},
    }

    multi_match_search_fields = {
        'doi': {'boost': 3, 'fuzziness': 1},
        'title': {'boost': 2, 'fuzziness': 1},
        'raw_authors.full_name': {'boost': 1, 'fuzziness': 1},
        'abstract': {'boost': 1, 'fuzziness': 1},
        'hubs_flat': {'boost': 1, 'fuzziness': 1},
    }

    multi_match_options = {
        'operator': 'and'
    }

    post_filter_fields = {
        'hubs': 'hubs.name',
    }

    faceted_search_fields = {
        'hubs': 'hubs.name'
    }

    filter_fields = {
        'publish_date': 'paper_publish_date'
    }

    ordering = ('_score', '-hot_score', '-discussion_count', '-paper_publish_date')

    ordering_fields = {
        'publish_date': 'paper_publish_date',
        'discussion_count': 'discussion_count',
        'score': 'score',
        'hot_score': 'hot_score',
    }

    highlight_fields = {
        'raw_authors.full_name': {
            'field': 'raw_authors',
            'enabled': True,
            'options': {
                'pre_tags': ["<mark>"],
                'post_tags': ["</mark>"],
                'fragment_size': 1000,
                'number_of_fragments': 10,
            },
        },
        'title': {
            'enabled': True,
            'options': {
                'pre_tags': ["<mark>"],
                'post_tags': ["</mark>"],
                'fragment_size': 2000,
                'number_of_fragments': 1,
            },
        },
        'abstract': {
            'enabled': True,
            'options': {
                'pre_tags': ["<mark>"],
                'post_tags': ["</mark>"],
                'fragment_size': 5000,
                'number_of_fragments': 1,
            },
        }
    }

yattias · 2021-06-30T18:06:13Z

Wondering if someone here can help 🙏

Sachin-Kahandal · 2022-07-21T06:31:55Z

Override the base class's get_queryset() and define your own queries there, rather than using the search filter backends:

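from elasticsearch_dsl import Search
from elasticsearch_dsl.query import Bool, Match, MatchPhrase, MultiMatch

def get_queryset(self):
    # Get the search parameter from the request.
    text_raw = self.request.GET.get("search", "")
    # Field names below are placeholders; substitute your own.
    query0 = MultiMatch(query=text_raw, fields=["title", "abstract"])
    query1 = Match(title=text_raw)
    query2 = MatchPhrase(title=text_raw)
    # etc...
    q1 = Bool(should=[query0, query1, query2])  # add further queries as needed
    queryset = Search(
        using=self.client,
        index=self.index,
        doc_type=self.document._doc_type.name,
    ).query(q1)
    return queryset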

You will have finer control over your queries this way.
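A possible reason a single hand-built bool helps, offered as an assumption rather than something confirmed in this thread: Elasticsearch treats a bool query that has should clauses but no must or filter clauses as if minimum_should_match were 1. A should-only bool that is then ANDed with the output of another backend is therefore effectively required. Putting must and should in the same bool keeps the should clause optional. The helper below is hypothetical and only mirrors that documented default in plain Python.

```python
def default_minimum_should_match(bool_query):
    """Mirror Elasticsearch's documented default for a bool query body."""
    # An explicit setting always wins over the default.
    if "minimum_should_match" in bool_query:
        return bool_query["minimum_should_match"]
    has_should = bool(bool_query.get("should"))
    has_required = bool(bool_query.get("must") or bool_query.get("filter"))
    # should-only bool: at least one should clause has to match.
    return 1 if has_should and not has_required else 0


term = {"term": {"title": "graph"}}
multi_match = {"multi_match": {"query": "graph", "fields": ["title", "abstract"]}}

should_only = {"should": [term]}  # what a separate "should" backend would emit
single_bool = {"must": [multi_match], "should": [term]}  # one combined bool

print(default_minimum_should_match(should_only))  # 1
print(default_minimum_should_match(single_bool))  # 0
```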

barseghyanartur · 2022-07-21T07:30:36Z

This question comes up regularly. I'll add it to the FAQ, but TL;DR:

If you need a combination of ANDs and ORs, use SimpleQueryStringSearchFilterBackend. Check for examples here and in the docs.
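For a rough idea of what such a request could look like: in Elasticsearch's simple_query_string syntax, `+` means AND, `|` means OR, double quotes mark a phrase, and parentheses group. The sketch below only builds the request URL. The endpoint path `/api/papers/` is hypothetical, and the parameter name `search_simple_query_string` is assumed to be the backend's default, so check it against the docs for your version.

```python
from urllib.parse import parse_qs, urlencode


def simple_query_string_url(base_url, expression, param="search_simple_query_string"):
    """Return a URL carrying a simple_query_string expression, safely encoded."""
    return f"{base_url}?{urlencode({param: expression})}"


# Match the exact phrase OR all of the individual terms.
expression = '"graph neural networks" | (graph + neural + networks)'
url = simple_query_string_url("/api/papers/", expression)
print(url)

# Decoding the query string gives back the original expression.
decoded = parse_qs(url.split("?", 1)[1])["search_simple_query_string"][0]
```

Documents that satisfy both alternatives may score higher than those that satisfy only one, which is roughly the phrase boost asked about above, though the exact scoring depends on the fields and options configured on the view.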

yattias added the question label Jun 28, 2021
