Counting by ranges on a double field in Elasticsearch with Spring Boot using aggregation

Question

I'm trying to count records in Elasticsearch that fall within specific ranges.

I have 3 ranges that cover different values of a double field:

low (0-4]
medium (4-7]
high (7-10]

The object looks something like this:

{
    "company": "companyName", // string
    "score": 6.2 // double
}

Say that for company1 I would like to get the counts for all score ranges, returned as an object like the following:

{"high":20, "medium":10, "low":3}

I have found a way to do it using the API:

public interface ItemRepository extends ElasticsearchRepository<Item, String> {

    @Query("{\"bool\":{\"must\":[{\"match\":{\"userId\":100}},{\"range\":{\"score\":{\"gte\":?0,\"lt\":?1}}}]}}")
    long countByScoreRangeAndUserId(int lowerBound, int upperBound);
}

@Service
public class ItemService {

    @Autowired
    private ItemRepository itemRepository;

    public void getScoreCountRanges() {
        long range1Count = itemRepository.countByScoreRangeAndUserId(0, 40);
        long range2Count = itemRepository.countByScoreRangeAndUserId(40, 70);
        long range3Count = itemRepository.countByScoreRangeAndUserId(70, 101); // inclusive lower bound, exclusive upper bound
        System.out.println("Range 0-39 Count: " + range1Count);
        System.out.println("Range 40-69 Count: " + range2Count);
        System.out.println("Range 70-100 Count: " + range3Count);
    }
}

But I would like to do it in a single pass over the data rather than running three separate counts.

Which way is better and faster?

Thanks a lot!


Answer 1

Score: 0

Try using the NativeSearchQueryBuilder:

@Service
public class ItemService {

    @Autowired
    private ElasticsearchOperations elasticsearchOperations;

    public void getScoreCountRanges() {
        SearchQuery searchQuery = new NativeSearchQueryBuilder()
                .withQuery(QueryBuilders.matchQuery("userId", 100))
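                .addAggregation(AggregationBuilders.range("score_ranges")
                        .field("score")
                        // each bucket includes its lower bound and excludes its upper bound
                        .addUnboundedTo("low", 4)       // score < 4
                        .addRange("medium", 4, 7)       // 4 <= score < 7
                        .addUnboundedFrom("high", 7)    // score >= 7
                )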
                .build();

        Aggregations aggregations = elasticsearchOperations.query(searchQuery, SearchResponse::getAggregations);
        Range rangeAggregation = aggregations.get("score_ranges");

        long lowCount = rangeAggregation.getBucketByKey("low").getDocCount();
        long mediumCount = rangeAggregation.getBucketByKey("medium").getDocCount();
        long highCount = rangeAggregation.getBucketByKey("high").getDocCount();

        System.out.println("Low Range Count: " + lowCount);
        System.out.println("Medium Range Count: " + mediumCount);
        System.out.println("High Range Count: " + highCount);
    }
}
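
One detail worth checking against the question: a range aggregation bucket includes its `from` value and excludes its `to` value, while the question describes ranges such as (0-4] that exclude the lower bound and include the upper one. The plain-Java sketch below has no Elasticsearch dependency and only illustrates the two conventions side by side; the class and method names are made up for this illustration.

```java
public class BucketBoundaries {

    // Mimics the buckets configured in the answer above:
    // lower bound included, upper bound excluded.
    static String halfOpenBucket(double score) {
        if (score < 4) return "low";     // (-inf, 4)
        if (score < 7) return "medium";  // [4, 7)
        return "high";                   // [7, +inf)
    }

    // The buckets as written in the question:
    // lower bound excluded, upper bound included.
    static String questionBucket(double score) {
        if (score > 0 && score <= 4) return "low";     // (0, 4]
        if (score > 4 && score <= 7) return "medium";  // (4, 7]
        if (score > 7 && score <= 10) return "high";   // (7, 10]
        return "none";                                 // outside every range
    }

    public static void main(String[] args) {
        // Boundary values are where the two conventions disagree.
        for (double score : new double[] {0.0, 4.0, 6.2, 7.0, 10.0}) {
            System.out.println(score + " -> aggregation: " + halfOpenBucket(score)
                    + ", question: " + questionBucket(score));
        }
    }
}
```

A score of exactly 4.0 or 7.0 lands in a different bucket under each convention. If the boundaries matter, one option is a filters aggregation whose filters are range queries with gt/lte bounds, which lets each bucket state its own inclusivity.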

Answer 2

Score: 0

I'll answer this with working code in case someone finds it useful in the future.

    @Override
    public Scores getScoreCountRanges() {
        Scores scores = new Scores();
        String aggregationName = "score_ranges";
        NativeSearchQuery searchQuery = new NativeSearchQueryBuilder()
                .withQuery(QueryBuilders.matchQuery("userId", "DESIRED-USER-ID"))
                .withAggregations(AggregationBuilders.range(aggregationName)
                        .field("scores")
                        .addRange(LOW, LOW_LOWER_BOUND, MEDIUM_LOWER_BOUND)
                        .addRange(MEDIUM, MEDIUM_LOWER_BOUND, HIGH_LOWER_BOUND)
                        .addRange(HIGH, HIGH_LOWER_BOUND, HIGH_UPPER_BOUND)
                )
                .build();

        SearchHits<?> searchHits = operations.search(searchQuery, ClassOfData.class);
        if (!searchHits.hasAggregations())
            return scores;
        AggregationsContainer<?> aggregationsContainer = searchHits.getAggregations();
        if (aggregationsContainer == null) {
            return scores;
        }
        Aggregations aggregations = (Aggregations) aggregationsContainer.aggregations();
        ParsedRange rangeAggregation = aggregations.get(aggregationName);
        rangeAggregation.getBuckets().forEach(bucket -> fillScores(scores, bucket.getKey().toString(), bucket.getDocCount()));
        return scores;
    }

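This code performs what is called a range aggregation. It returns buckets containing the records that match the matchQuery and fall within the ranges you specified.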

We then use each bucket's document count to learn how many matching records fall in each range.

You can use fillScores as you like or do something else instead; returning a map of keys and values is also a good choice.
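
As a rough sketch of the map option, the snippet below turns (key, document count) pairs into a map shaped like the result requested in the question. It is plain Java with no Elasticsearch dependency: the `Bucket` class is a hypothetical stand-in for an aggregation bucket, and `toCounts` is an illustrative helper, not part of any library.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ScoreCounts {

    // Hypothetical stand-in for an aggregation bucket: a key and a document count.
    static final class Bucket {
        final String key;
        final long docCount;

        Bucket(String key, long docCount) {
            this.key = key;
            this.docCount = docCount;
        }
    }

    // Builds a key -> count map. Every expected key starts at zero, so a key
    // that is missing from the buckets still shows up in the result.
    static Map<String, Long> toCounts(List<String> expectedKeys, List<Bucket> buckets) {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (String key : expectedKeys) {
            counts.put(key, 0L);
        }
        for (Bucket bucket : buckets) {
            counts.put(bucket.key, bucket.docCount);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Bucket> buckets = List.of(
                new Bucket("low", 3), new Bucket("medium", 10), new Bucket("high", 20));
        // prints {high=20, medium=10, low=3}
        System.out.println(toCounts(List.of("high", "medium", "low"), buckets));
    }
}
```

In the method above, the same loop body could replace the fillScores call, using bucket.getKey().toString() and bucket.getDocCount() as the key and value.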

Hope this helps.


huangapple
  • Posted on June 25, 2023 at 20:11:27
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76550323.html