ES Conditional Queries: Term Queries, Wildcard Queries, Pagination, and Sorting

wylc123 ⋅ 1 year ago ⋅ 2100 views

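In day-to-day development, Elasticsearch (ES) indices are usually accessed through the Java API, and most operations are expressed by building a QueryBuilder object. The sections below show how to build several common kinds of QueryBuilder.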

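1. Maven configuration

These two artifacts cover the transport client used in section 11. The method in section 13 uses RestHighLevelClient, which ships separately as org.elasticsearch.client:elasticsearch-rest-high-level-client; its RequestOptions parameter requires client version 6.4.0 or later.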

<dependency>
   <groupId>org.elasticsearch</groupId>
   <artifactId>elasticsearch</artifactId>
   <version>6.3.2</version>
</dependency>
<dependency>
   <groupId>org.elasticsearch.client</groupId>
   <artifactId>transport</artifactId>
   <version>6.3.2</version>
</dependency>

2. Term (exact-match) query

BoolQueryBuilder queryBuilder = QueryBuilders.boolQuery()
                .must(QueryBuilders.termQuery("name", "小李"));

This queries ES documents whose name equals 小李 and is equivalent to the following request:

{
  "query": {
    "bool": {
      "adjust_pure_negative": true,
      "must": [{
        "term": {
          "name": {
            "boost": 1.0,
            "value": "小李"
          }
        }
      }],
      "boost": 1.0
    }
  }
}

3. Range query

// rangeQuery returns a RangeQueryBuilder, not a BoolQueryBuilder
RangeQueryBuilder queryBuilder = QueryBuilders.rangeQuery("age")
                                .gte(18)
                                .lte(50);

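This queries records whose age is greater than or equal to 18 and less than or equal to 50, and is equivalent to the following request.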

{
  "query": {
    "range": {
      "age": {
        "include_lower": true,
        "include_upper": true,
        "from": 18,
        "boost": 1.0,
        "to": 50
      }
    }
  }
}

4. Wildcard (fuzzy) query

BoolQueryBuilder queryBuilder = QueryBuilders.boolQuery()
                .must(QueryBuilders.wildcardQuery("name", "*小李*"));

This queries documents whose name contains 小李 and is equivalent to the following request:

{
  "query": {
    "bool": {
      "adjust_pure_negative": true,
      "must": [{
        "wildcard": {
          "name": {
            "boost": 1.0,
            "wildcard": "*小李*"
          }
        }
      }],
      "boost": 1.0
    }
  }
}

5. Multi-condition query

BoolQueryBuilder queryBuilder = QueryBuilders.boolQuery()
                .must(QueryBuilders.termQuery("name", "小李"))
                .must(QueryBuilders.rangeQuery("age")
                        .gte(10)
                        .lte(50));

This queries documents whose name is 小李 and whose age is between 10 and 50 (inclusive), and is equivalent to the following request:

{
  "query": {
    "bool": {
      "adjust_pure_negative": true,
      "must": [{
        "term": {
          "name": {
            "boost": 1.0,
            "value": "小李"
          }
        }
      }, {
        "range": {
          "age": {
            "include_lower": true,
            "include_upper": true,
            "from": 10,
            "boost": 1.0,
            "to": 50
          }
        }
      }],
      "boost": 1.0
    }
  }
}

6. Set (terms) query

List<String> list = Arrays.asList("北京", "上海", "杭州");
BoolQueryBuilder queryBuilder = QueryBuilders.boolQuery()
                .must(QueryBuilders.termQuery("name", "李明"))
                .must(QueryBuilders.termsQuery("address", list))
                .must(QueryBuilders.rangeQuery("age")
                        .gte(10)
                        .lte(50));

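This queries documents whose address is 北京, 上海, or 杭州, whose age is between 10 and 50, and whose name is 李明. It is equivalent to the following request: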

{
  "query": {
    "bool": {
      "adjust_pure_negative": true,
      "must": [{
        "term": {
          "name": {
            "boost": 1.0,
            "value": "李明"
          }
        }
      }, {
        "terms": {
          "address": ["北京", "上海", "杭州"],
          "boost": 1.0
        }
      }, {
        "range": {
          "age": {
            "include_lower": true,
            "include_upper": true,
            "from": 10,
            "boost": 1.0,
            "to": 50
          }
        }
      }],
      "boost": 1.0
    }
  }
}

7. Querying with should

BoolQueryBuilder queryBuilder = QueryBuilders.boolQuery()
                .should(QueryBuilders.wildcardQuery("name", "*小李*"))
                .should(QueryBuilders.termQuery("address", "北京"));

This queries records whose name contains 小李 or whose address is 北京; should acts like a logical OR. The request is:

{
  "query": {
    "bool": {
      "adjust_pure_negative": true,
      "should": [{
        "wildcard": {
          "name": {
            "boost": 1.0,
            "wildcard": "*小李*"
          }
        }
      }, {
        "term": {
          "address": {
            "boost": 1.0,
            "value": "北京"
          }
        }
      }],
      "boost": 1.0
    }
  }
}

8. Combining should and must

BoolQueryBuilder queryBuilder = QueryBuilders.boolQuery()
                .must(QueryBuilders.termQuery("sex", "男"))
                .should(QueryBuilders.wildcardQuery("name", "*小李*"))
                .should(QueryBuilders.termQuery("address", "北京"))
                .minimumShouldMatch(1);

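This queries records whose sex is 男 and whose name contains 小李 or whose address is 北京. **minimumShouldMatch(1)** means that at least one should condition has to match. It is equivalent to the following request: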

{
  "query": {
    "bool": {
      "adjust_pure_negative": true,
      "should": [{
        "wildcard": {
          "name": {
            "boost": 1.0,
            "wildcard": "*小李*"
          }
        }
      }, {
        "term": {
          "address": {
            "boost": 1.0,
            "value": "北京"
          }
        }
      }],
      "minimum_should_match": "1",
      "must": [{
        "term": {
          "sex": {
            "boost": 1.0,
            "value": "男"
          }
        }
      }],
      "boost": 1.0
    }
  }
}

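must: conditions that have to be satisfied

should: conditions that are optional

minimumShouldMatch(1): at least one should condition has to be satisfied

The queryBuilder above therefore requires the must condition to hold and at least one of the should conditions to hold.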
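To make the interplay of must, should, and minimumShouldMatch concrete, here is a minimal plain-Java sketch of the matching rule only. It is not Elasticsearch code and ignores scoring; the class BoolMatchSketch, the Person type, and the method names are all invented for illustration. It also models only must clauses, not filter or must_not.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

/** Toy model of bool-query matching (no scoring). Hypothetical names throughout. */
final class BoolMatchSketch {

    /** Hypothetical stand-in for an indexed document. */
    static final class Person {
        final String sex;
        final String name;
        final String address;

        Person(String sex, String name, String address) {
            this.sex = sex;
            this.name = name;
            this.address = address;
        }
    }

    /** True when doc passes every must clause and at least minimumShouldMatch should clauses. */
    static <T> boolean matches(T doc,
                               List<Predicate<T>> must,
                               List<Predicate<T>> should,
                               int minimumShouldMatch) {
        for (Predicate<T> clause : must) {
            if (!clause.test(doc)) {
                return false;
            }
        }
        long satisfied = should.stream().filter(clause -> clause.test(doc)).count();
        return satisfied >= minimumShouldMatch;
    }

    /** Mirrors the section 8 query: must sex = 男; should name contains 小李, address = 北京. */
    static boolean section8Matches(Person p, int minimumShouldMatch) {
        List<Predicate<Person>> must = Arrays.asList(d -> d.sex.equals("男"));
        List<Predicate<Person>> should = Arrays.asList(
                d -> d.name.contains("小李"),
                d -> d.address.equals("北京"));
        return matches(p, must, should, minimumShouldMatch);
    }
}
```

Passing 0 as minimumShouldMatch models what happens when the setting is left out while a must clause is present: the should clauses then only influence scoring, so a document that satisfies none of them still matches. That is why section 8 sets the value to 1 explicitly.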

9. Exists query

BoolQueryBuilder queryBuilder = QueryBuilders.boolQuery()
                .must(QueryBuilders.existsQuery("name"))
                .mustNot(QueryBuilders.existsQuery("tag"));

This queries documents where name has a value and tag has none. The request is:

{
  "query": {
    "bool": {
      "adjust_pure_negative": true,
      "must_not": [{
        "exists": {
          "field": "tag",
          "boost": 1.0
        }
      }],
      "must": [{
        "exists": {
          "field": "name",
          "boost": 1.0
        }
      }],
      "boost": 1.0
    }
  }
}

10. Empty-value query

BoolQueryBuilder queryBuilder = QueryBuilders.boolQuery()
.must(QueryBuilders.termQuery("systemnamerely.keyword", ""));

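This queries documents whose systemnamerely field is an empty string. The .keyword sub-field is used so that the comparison is made against the whole, unanalyzed value. The request is: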

{
  "query": {
    "bool": {
      "adjust_pure_negative": true,
      "must": [{
        "term": {
          "systemnamerely.keyword": {
            "boost": 1.0,
            "value": ""
          }
        }
      }],
      "boost": 1.0
    }
  }
}

11. Paginated query

SearchResponse response = this.transportClient.prepareSearch(index)
                .setTypes(type)
                .setQuery(queryBuilder)
                .setFrom(offset)
                .setSize(rows)
                .setExplain(false)
                .execute()
                .actionGet();

This is ordinary from/size pagination and is equivalent to the following request:

{
  "from" : 0,
  "size" : 10,
  "query": {
    "bool": {
      "adjust_pure_negative": true,
      "must": [{
        "term": {
          "name": {
            "boost": 1.0,
            "value": "小李"
          }
        }
      }],
      "boost": 1.0
    }
  }
}
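The from value is derived from a page number and page size. As a minimal sketch of that arithmetic in plain Java (PageMath and its method names are invented for this example, not part of any ES API), the helper below computes the offset for a 1-based page number and checks it against index.max_result_window, which defaults to 10000 and caps from + size for this kind of paging.

```java
/** Hypothetical helper for from/size paging arithmetic; not part of the ES client. */
final class PageMath {

    /** Default value of the index.max_result_window index setting. */
    static final int DEFAULT_MAX_RESULT_WINDOW = 10_000;

    /** Converts a 1-based page number into the value to pass to setFrom(...). */
    static int offset(int pageNo, int pageSize) {
        if (pageNo < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNo and pageSize must be >= 1");
        }
        // multiplyExact throws on int overflow instead of silently wrapping.
        return Math.multiplyExact(pageNo - 1, pageSize);
    }

    /** A from/size request is rejected once from + size exceeds the result window. */
    static boolean fitsResultWindow(int from, int size, int maxResultWindow) {
        return (long) from + size <= maxResultWindow;
    }
}
```

For example, page 3 with 10 rows per page gives an offset of 20. When fitsResultWindow returns false, scroll-based paging is the usual alternative.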

The scrollId-based query API looks like this:

SearchResponse scrollResp;
String scrollId = ContextParameterHolder.get("scrollId");
if (scrollId != null) {
    // Continue an existing scroll and keep its context alive for another 60 s.
    scrollResp = getTransportClient().prepareSearchScroll(scrollId)
            .setScroll(new TimeValue(60000))
            .execute()
            .actionGet();
} else {
    log.info("Scroll-based paging: scrollId is empty, starting a new scroll");
    scrollResp = this.prepareSearch()
            // QUERY_THEN_FETCH is the default search type.
            .setSearchType(SearchType.QUERY_THEN_FETCH)
            .setScroll(new TimeValue(60000))
            .setQuery(queryBuilder)
            .setSize(page.getPageSize())
            .execute()
            .actionGet();
}
// Each response may carry a new scroll ID, so always store the latest one.
ContextParameterHolder.set("scrollId", scrollResp.getScrollId());

12. Sorting

List<SortBuilder> sortBuilder = new ArrayList<>();
FieldSortBuilder fieldSortBuilder1 = SortBuilders.fieldSort("negativewordscheck").order(SortOrder.DESC);
FieldSortBuilder fieldSortBuilder2 = SortBuilders.fieldSort("systemnamerely.keyword").order(SortOrder.ASC);
sortBuilder.add(fieldSortBuilder1);
sortBuilder.add(fieldSortBuilder2);
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();

searchSourceBuilder.query(queryBuilder).from((pageNo - 1) * pagesize).size(pagesize);
// Sorts apply in the order they are added: primary key first, then the tie-breaker.
for (SortBuilder item : sortBuilder) {
    searchSourceBuilder.sort(item);
}

This sorts on multiple fields and is equivalent to a request like the following:

GET sharingdata/_search
{
  "query": {
    "bool": {
      "filter": [
          {
            "terms": {
              "status": [
                "1",
                "2"
              ],
              "boost": 1
            }
          }
        ]
    }
  },
  "size": 1,
  "from": 0,
  "sort": [
      {
        "negativewordscheck": {
          "order": "desc"
        }
      },
      {
        "systemnamerely.keyword": {
          "order": "asc"
        }
      }
    ]
}

13. A method combining multi-condition query, pagination, and sorting

public List<LinkedHashMap<String, Object>> getPageResultListLinked(QueryBuilder queryBuilder, String esIndex, int pageNo, int pagesize, List<SortBuilder> sortBuilder, String[] includes, String[] excludes) {
    SearchRequest searchRequest = new SearchRequest(esIndex);
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();

    // Query and paging apply whether or not sort fields are supplied.
    searchSourceBuilder.query(queryBuilder).from((pageNo - 1) * pagesize).size(pagesize);
    if (sortBuilder != null) {
        for (SortBuilder item : sortBuilder) {
            searchSourceBuilder.sort(item);
        }
    }
    // Restrict the returned _source fields only when includes are given.
    if (includes != null && includes.length > 0) {
        searchSourceBuilder.fetchSource(includes, excludes);
    }
    searchRequest.source(searchSourceBuilder);
    client = getRestHighLevelClient();

    SearchResponse searchResponse;
    try {
        searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
    } catch (IOException e) {
        // Surface the failure instead of continuing with a null response.
        throw new UncheckedIOException(e); // java.io.UncheckedIOException
    }

    // Collect the source of every hit in the response.
    List<LinkedHashMap<String, Object>> list = new LinkedList<>();
    for (SearchHit hit : searchResponse.getHits()) {
        list.add(getMapValueForLinkedHashMap(hit.getSourceAsMap()));
    }
    return list;
}

Notes:

  1. When .keyword is needed

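Pay close attention to pairing the API version with the Elasticsearch version: neighbouring versions are partly compatible, but the compatibility is quirky. Whether to append .keyword depends on the index mapping. If a field has a keyword sub-field under it in the mapping, exact queries need to target the .keyword path.

The symptom: a term query appears to have no effect.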

@Field(type = FieldType.Keyword)
QueryBuilder queryBuilder = QueryBuilders.termQuery("source", "淘宝");
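Strictly speaking, the query does run; it simply returns no data. First, the semantics of a term query:
A term query looks up the exact term in the inverted index and knows nothing about analyzers. It suits keyword, numeric, and date fields.
term finds documents whose field contains a given term; terms finds documents whose field contains any of several given terms.

Note the wording "documents whose field contains a given term". A term query against an analyzed field is still a containment match on the indexed tokens; the only difference from a full-text query is that the search input itself is not analyzed. For a term query to return data, the index for that field must hold a term identical to the search value; otherwise nothing matches.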

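The root cause turned out to be the semantics of keyword.

Many articles say that marking a field as keyword means it is indexed without analysis, and for a field whose mapping type really is keyword that is true. In this index, however, the source field was still analyzed, most likely because the existing mapping had it as text and the annotation had not been applied to the index. If the analyzed tokens do not include the complete value, the term query above cannot match anything.

Changing the search value from "淘宝" to "淘" or "宝" does return results, which confirms that the field was analyzed.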
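The behaviour described above can be reproduced with a toy inverted index in plain Java. This is only a sketch: TermLookupSketch and its methods are invented names, and the one-token-per-character split is an assumption that imitates how the standard analyzer treats Chinese text, not a call into any real analyzer.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of term lookup against analyzed and unanalyzed fields. Hypothetical names. */
final class TermLookupSketch {

    /** Assumption: an analyzed text field stores one token per character. */
    static Set<String> analyzeAsText(String value) {
        Set<String> tokens = new HashSet<>();
        value.codePoints()
             .forEach(cp -> tokens.add(new String(Character.toChars(cp))));
        return tokens;
    }

    /** A keyword field stores the whole value as a single term. */
    static Set<String> indexAsKeyword(String value) {
        Set<String> terms = new HashSet<>();
        terms.add(value);
        return terms;
    }

    /** A term query looks the search value up as-is, without analyzing it. */
    static boolean termQuery(Set<String> indexedTerms, String searchValue) {
        return indexedTerms.contains(searchValue);
    }
}
```

With the value 淘宝 indexed both ways, termQuery finds 淘 and 宝 but not 淘宝 in the analyzed tokens, and finds only the full 淘宝 in the keyword terms, which matches the behaviour observed with the real index.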

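So what does keyword give you?

When a string field is mapped as text with a keyword sub-field, which is what dynamic mapping produces by default, ES adds an extra field, source.keyword, underneath source. That sub-field is the one indexed without analysis, so exact matching has to target it. The mapping looks roughly like this: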

"mapping": {
    "properties": {
      "id": {
        "type": "long"
      },
      "searchField": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      }
    }
}


So an exact query on such a field should be written as follows:

QueryBuilder queryBuilder = QueryBuilders.termQuery("source.keyword", "淘宝");

The same applies to wildcard queries and sorting:

queryBuilder.must(QueryBuilders.wildcardQuery("tablenamecn.keyword", "*" + query.getTablenamecn() + "*"));
List<SortBuilder> sortBuilder = new ArrayList<>();
FieldSortBuilder fieldSortBuilder1 = SortBuilders.fieldSort("negativewordscheck").order(SortOrder.DESC);
FieldSortBuilder fieldSortBuilder2 = SortBuilders.fieldSort("systemnamerely.keyword").order(SortOrder.ASC);
sortBuilder.add(fieldSortBuilder1);
sortBuilder.add(fieldSortBuilder2);
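
The wildcard pattern above is built by concatenating user input, so a value containing * or ? would change the pattern's meaning, and a null value would search for the literal text "null". One possible guard, sketched in plain Java, is shown below; WildcardPatterns and its methods are invented for this example, and it relies on the backslash being the escape character in wildcard patterns.

```java
/** Hypothetical helper for building "contains" wildcard patterns from user input. */
final class WildcardPatterns {

    /** Escapes the characters with special meaning in a wildcard pattern: \ * ? */
    static String escape(String input) {
        StringBuilder sb = new StringBuilder(input.length() + 8);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '\\' || c == '*' || c == '?') {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    /** Returns *value* with the value escaped, or null when there is nothing to search for. */
    static String contains(String userInput) {
        if (userInput == null || userInput.isEmpty()) {
            return null;
        }
        return "*" + escape(userInput) + "*";
    }
}
```

A caller would add the wildcardQuery clause only when contains(...) returns a non-null pattern, and skip the condition otherwise.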
