How to make calculations with properties from different edges

Question

I was making a graph for a recommendation system and added vertices for users, categories and products and edges to represent the connections between them. One product may have connections to categories and a rating as a property for them. Users can also have a rating for each category. So, it is something like this:

-- User preferences.
SELECT * FROM cypher('RecommenderSystem', $$
    MATCH (a:Person {name: 'Abigail'}), (A:Category), (C:Category), (H:Category)
    WHERE A.name = 'A' AND C.name = 'C' AND H.name = 'H' 
    CREATE (a)-[:RATING {rating: 3}]->(C),
           (a)-[:RATING {rating: 1}]->(A),
           (a)-[:RATING {rating: 0}]->(H)
$$) AS (a agtype);

-- Products rating.
SELECT * FROM cypher('RecommenderSystem', $$
    MATCH (product:Product {title: 'Product_Name'}), (A:Category), (C:Category), (H:Category)
    WHERE A.name = 'A' AND C.name = 'C' AND H.name = 'H' 
    CREATE (product)-[:RATING {rating: 0}]->(C),
           (product)-[:RATING {rating: 4}]->(A),
           (product)-[:RATING {rating: 0}]->(H)
$$) AS (a agtype);

My recommendation system is based on content filtering, which uses what we know about people and products as the connective tissue for recommendations. This requires a calculation like: [(user_rating_C x product_rating_C) + (user_rating_A x product_rating_A) + (user_rating_H x product_rating_H)] / (num_categories x max_rating). For example, the likelihood of Abigail liking the product from the Cypher query above would be:

[(3 x 0) + (1 x 4) + (0 x 0)] / (3 x 4) = 0.333

On a scale from 0 to 4, this means she will probably dislike the product. The closer the value is to 4, the more likely the user is to buy or consume the product.
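To make the arithmetic explicit, the formula can be sketched in plain Python, outside the database. This is only an illustration of the calculation: the function name `affinity` and the dictionary layout are assumptions for this sketch, and the ratings are the ones created by the two queries above.

```python
def affinity(user_ratings, product_ratings, max_rating=4):
    """Sum of (user rating x product rating) over the categories rated by
    both sides, divided by (number of those categories x max_rating)."""
    categories = user_ratings.keys() & product_ratings.keys()
    if not categories:
        # Nothing in common: treat the affinity as 0 (an assumption of this sketch).
        return 0.0
    total = sum(user_ratings[c] * product_ratings[c] for c in categories)
    return total / (len(categories) * max_rating)


# Ratings taken from the CREATE statements in the question.
abigail = {"C": 3, "A": 1, "H": 0}
product = {"C": 0, "A": 4, "H": 0}

score = affinity(abigail, product)
print(round(score, 3))  # 0.333
```

With the highest possible ratings on both sides, the same function returns 4.0, which is why the result is read on a 0 to 4 scale.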

So how can I retrieve every rating edge that connects a person and a product to the same category and perform this kind of calculation with them?

Answer 1

Score: 2

The following query should work for this situation:

SELECT e1/(ct*4) AS factor FROM cypher('RecommenderSystem', $$
MATCH (u:Person)-[e1:RATING]->(v:Category)<-[e2:RATING]-(w:Product), (c:Category)
WITH e1, e2, COUNT(DISTINCT c) AS ct
RETURN SUM(e1.rating * e2.rating)::float, ct
$$) AS (e1 float, ct agtype);

This outputs:

      factor       
-------------------
0.333333333333333
(1 row)

Explanation

Use the MATCH clause to find the categories that both the person and the product have rated. Summing the products of those ratings gives

> [(user_rating_C x product_rating_C) + (user_rating_A x product_rating_A) + (user_rating_H x product_rating_H)]

Now divide it by

> (num_categories x max_rating)

You get num_categories using COUNT(DISTINCT c), and I assume that you already know max_rating (4 in the query above).

Hope it helps

Edit

I assumed that by num_categories you meant the total number of categories in the system, not only the ones that the person and the product have in common. If num_categories is instead the count of categories associated with both the product and the person, modify your WITH clause to:

WITH e1, e2, COUNT(*) AS ct

The rest of the query stays the same.
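To see how much the two readings of num_categories can differ, here is a small Python sketch outside the database. The ratings are hypothetical (in the question's data every category is rated by both sides, so both readings give 3 there), and all names are illustrative only.

```python
MAX_RATING = 4

# Hypothetical data: three categories exist, but each side rates only two.
all_categories = {"A", "C", "H"}
user = {"C": 3, "A": 1}      # no rating for H
product = {"A": 4, "H": 2}   # no rating for C

# Only categories rated by both the person and the product contribute.
shared = user.keys() & product.keys()
total = sum(user[c] * product[c] for c in shared)

# Reading 1: num_categories = every category in the system.
factor_all = total / (len(all_categories) * MAX_RATING)

# Reading 2: num_categories = categories the person and product share.
factor_shared = total / (len(shared) * MAX_RATING)

print(sorted(shared), total)   # ['A'] 4
print(round(factor_all, 3))    # 0.333
print(factor_shared)           # 1.0
```

The second reading rewards a product that matches on few categories, so which denominator is right depends on what the score is meant to express.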

Answer 2

Score: 2

If I understand correctly, you want to calculate the rating of each product for a user based on the given formula: [(user_rating_C x product_rating_C) + (user_rating_A x product_rating_A) + (user_rating_H x product_rating_H)] / (num_categories x max_rating).

According to your model, max_rating is set to 4 (range from 0 to 4).

To perform this calculation, you can use the following query:

SELECT * FROM cypher('RecommenderSystem', $$
    MATCH (a: Person {name: 'Abigail'})-[r1: RATING]->(c: Category)<-[r2: RATING]-(p:Product)
    WITH a.name AS person, p.title AS product, 
         SUM(r1.rating * r2.rating)/(count(c) * 4)::float AS rate
    RETURN person AS a, product AS p, rate AS r
$$) AS (a agtype, p agtype, r float);

I added another product (rating 0 with category C, rating 1 with category A and rating 3 with category H) and this query gave me these results:
![Query results. Person: Abigail, product: Product_Name, rating: 0.33 and Person: Abigail, product: Other_Product, rating: 0.083](https://i.stack.imgur.com/NeZUK.png)
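The two rates can be checked by hand with a short Python sketch that mirrors the grouping in the query (one rate per person/product pair). It runs outside the database; the function name `rate` and the dictionary layout are assumptions for this sketch, while the ratings are the ones described in this thread.

```python
MAX_RATING = 4

abigail = {"C": 3, "A": 1, "H": 0}
products = {
    "Product_Name": {"C": 0, "A": 4, "H": 0},
    "Other_Product": {"C": 0, "A": 1, "H": 3},
}


def rate(user_ratings, product_ratings, max_rating=MAX_RATING):
    """SUM(r1.rating * r2.rating) / (count(c) * max_rating) for one product."""
    shared = user_ratings.keys() & product_ratings.keys()
    total = sum(user_ratings[c] * product_ratings[c] for c in shared)
    return total / (len(shared) * max_rating)


rates = {title: rate(abigail, ratings) for title, ratings in products.items()}

# Highest rate first, as a recommender would present the results.
ranked = sorted(rates.items(), key=lambda item: item[1], reverse=True)
for title, value in ranked:
    print(f"{title}: {value:.3f}")
# Product_Name: 0.333
# Other_Product: 0.083
```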

Answer 3

Score: 1

Something like this may work for you:

WITH
  'Abigail' AS perName,
  [{c: 'A', p: 'prod_1'}, {c: 'C', p: 'prod_9'}, {c: 'H', p: 'prod_4'}] AS x
MATCH (per:Person)-[perRating:RATING]->(cat:Category)<-[prodRating:RATING]-(prod:Product)
WHERE per.name = perName AND ANY(i IN x WHERE cat.name = i.c AND prod.name = i.p)
WITH *, SUM(perRating.rating*prodRating.rating) AS total, MAX(prodRating.rating) AS maxProdRating
RETURN per, total/(SIZE(x) * maxProdRating) AS affinity

perName is the person's name, x is a list of the desired category/product name pairs, and affinity will be the calculated result.

NOTE: Even if not all desired pairs in x are found in the data, this query uses the size of x in the denominator. Adjust the query if this is not wanted.

[UPDATE]

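Unfortunately, the [ANY](https://neo4j.com/docs/cypher-manual/current/functions/predicate/#functions-any) predicate function is not part of openCypher, so it is not supported by Apache AGE.

Even more unfortunately, although [list comprehension](https://neo4j.com/docs/cypher-manual/current/syntax/lists/#cypher-list-comprehension) is part of openCypher, AGE does not yet support it either.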

But, on an openCypher system that does support list comprehension, we could replace this:

ANY(i IN x WHERE cat.name = i.c AND prod.name = i.p)

with something like this (we don't care about the generated list's contents, so we just use an arbitrary 1 for each element):

SIZE([i IN x WHERE cat.name = i.c AND prod.name = i.p | 1]) > 0
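The equivalence behind that substitution is easy to check in Python, whose `any()` and list comprehensions behave analogously. This is only an analogy to show the logic, not Cypher: the helper names and the candidate pairs are made up for the sketch, and `wanted` plays the role of `x`.

```python
# Plays the role of the list x in the query above.
wanted = [
    {"c": "A", "p": "prod_1"},
    {"c": "C", "p": "prod_9"},
    {"c": "H", "p": "prod_4"},
]


def matches_with_any(cat_name, prod_name, pairs):
    """Analog of ANY(i IN x WHERE cat.name = i.c AND prod.name = i.p)."""
    return any(i["c"] == cat_name and i["p"] == prod_name for i in pairs)


def matches_with_size(cat_name, prod_name, pairs):
    """Analog of SIZE([i IN x WHERE ... | 1]) > 0: build a list with one
    arbitrary element per matching pair and test whether it is non-empty."""
    return len([1 for i in pairs if i["c"] == cat_name and i["p"] == prod_name]) > 0


# Hypothetical (category, product) combinations to test both predicates.
candidates = [("A", "prod_1"), ("A", "prod_9"), ("H", "prod_4"), ("Z", "prod_1")]
results = [
    (matches_with_any(c, p, wanted), matches_with_size(c, p, wanted))
    for c, p in candidates
]
print(results)
# [(True, True), (False, False), (True, True), (False, False)]
```

Both forms agree on every input; the list version just does a little extra work by materializing the matches before counting them.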

huangapple
  • Posted on April 20, 2023 at 06:38:31
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76059290.html