DataLoader batch loading bug (MyBatis, JavaEE)

Question

I run GraphQL in production and am not allowed to share the actual code. I'm using graphql-java-servlet, with MyBatis as the ORM.

<dependency>
    <groupId>com.graphql-java-kickstart</groupId>
    <artifactId>graphql-java-servlet</artifactId>
    <version>9.2.0</version>
</dependency>

After enabling batch loading I found that the GraphQL DataLoader mixes up the positions of the batch-requested entities. This is easy to check: look at futureCacheMap and you will find entries where the key (id) and the value (entity) have different ids.

While debugging, I could not find how GraphQL matches entities to their keys after batch loading (batches of 1000 pcs). I guessed that some ordering was required and implemented one, but it didn't resolve the issue.

For example:
I have a Parent class that holds the ids of its children.

class Parent {
    private Long id;
    private List<Long> childsIds;
}

I have a DataLoader for Child:

private BatchLoader<Long, Child> buildBatchLoader() {
    return list -> CompletableFuture.supplyAsync(() -> childService.findByIds(list));
}

private DataLoader<Long, Child> buildDataLoader(BatchLoader<Long, Child> batchLoader) {
    DataLoaderOptions options = DataLoaderOptions.newOptions();
    options.setMaxBatchSize(1000);
    return new DataLoader<Long, Child>(batchLoader, options);
}

I have a ChildsFetcher in which I call dataLoader.loadMany():

public class ChildsFetcher implements DataFetcher<CompletableFuture<List<Child>>> {

    private static final String PK_FIELD_NAME = "childsIds";

    @Override
    public CompletableFuture<List<Child>> get(DataFetchingEnvironment environment) {
        GraphQLContext context = environment.getContext();
        DataLoaderRegistry dataLoaderRegistry = context.getDataLoaderRegistry().orElseThrow(
                () -> new DalException("there was no dataLoaderRegistry in context", Response.Status.INTERNAL_SERVER_ERROR)
        );
        List<Long> childsIds = getParentFieldValue(environment, PK_FIELD_NAME, List.class);

        DataLoader<Long, Child> childDataLoader = dataLoaderRegistry.getDataLoader("childDataLoader");
        return childDataLoader.loadMany(childsIds);
    }
}

For example, I have 2 parents with 3 children each.

"parents": [
    {
        "id": 1,
        "childsIds": [1, 3, 5]
    },
    {
        "id": 2,
        "childsIds": [2, 4, 6]
    }
]

As a result, the fetcher makes 2 requests:

  1. childDataLoader.loadMany([1, 3, 5])
  2. childDataLoader.loadMany([2, 4, 6])

In the DataLoader there is only one request (as expected), but look at the order of the ids, which I have no control over:

childService.findByIds([1, 3, 5, 2, 4, 6])
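
The key order in that single call follows from how the keys are queued. As a rough illustration in plain Java (this is not the real DataLoader; the class KeyQueueSketch and its methods are made up for this sketch), each loadMany call appends its keys to a shared queue in call order, and dispatching hands the whole queue to the batch function at once:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the key queue of a DataLoader; not library code.
public class KeyQueueSketch {
    // LinkedHashSet keeps insertion order and drops repeated keys.
    private final Set<Long> pendingKeys = new LinkedHashSet<>();

    // Mimics loadMany: keys are queued in the order the fetchers ask for them.
    public void loadMany(List<Long> keys) {
        pendingKeys.addAll(keys);
    }

    // Mimics dispatch: the batch function receives every queued key in one list.
    public List<Long> dispatch() {
        List<Long> batch = new ArrayList<>(pendingKeys);
        pendingKeys.clear();
        return batch;
    }

    public static void main(String[] args) {
        KeyQueueSketch loader = new KeyQueueSketch();
        loader.loadMany(Arrays.asList(1L, 3L, 5L)); // children of parent 1
        loader.loadMany(Arrays.asList(2L, 4L, 6L)); // children of parent 2
        System.out.println(loader.dispatch()); // prints [1, 3, 5, 2, 4, 6]
    }
}
```

So the batch list is simply the concatenation of the per-parent requests, not a sorted list of ids.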

And in the output I receive:

"data": {
    "parents": [
        {
            "id": 1,
            "childs": [
                {
                    "id": 1
                },
                {
                    "id": 2
                },
                {
                    "id": 3
                }
            ]
        },
        {
            "id": 2,
            "childs": [
                {
                    "id": 4
                },
                {
                    "id": 5
                },
                {
                    "id": 6
                }
            ]
        }
    ]
}
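
The output above is what you would expect if values are paired with keys purely by position. The sketch below is plain Java, not the DataLoader source, and it assumes that findByIds returns the rows sorted by id (the question does not show the query). The name PositionalPairingSketch is made up for the illustration:

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: pairs requested keys with returned ids by position.
public class PositionalPairingSketch {
    // Returns key -> id of the entity that ends up stored under that key.
    public static Map<Long, Long> pairByPosition(List<Long> keys, List<Long> returnedIds) {
        Map<Long, Long> cache = new LinkedHashMap<>();
        for (int i = 0; i < keys.size(); i++) {
            cache.put(keys.get(i), returnedIds.get(i));
        }
        return cache;
    }

    public static void main(String[] args) {
        // Keys in the order they were requested.
        List<Long> keys = Arrays.asList(1L, 3L, 5L, 2L, 4L, 6L);
        // Assumption: the rows come back sorted by id.
        List<Long> returned = Arrays.asList(1L, 2L, 3L, 4L, 5L, 6L);
        System.out.println(pairByPosition(keys, returned));
        // prints {1=1, 3=2, 5=3, 2=4, 4=5, 6=6}
    }
}
```

Under that assumption, keys 1, 3, 5 of parent 1 resolve to children 1, 2, 3, and keys 2, 4, 6 of parent 2 resolve to children 4, 5, 6, which matches both the response and the mismatched entries seen in futureCacheMap.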

Answer 1

Score: 2

The order of the values you return has to be the same as the order of the keys in the request.
If the ORM (or the SQL query behind it) sorts the response, just re-sort it back in the DataLoader after you get the response from the ORM, for example:

private BatchLoader<Long, Child> buildBatchLoader() {
    // "list" holds the requested keys; the result must come back in the same order.
    return list -> CompletableFuture.supplyAsync(() ->
            childService.findByIds(list).stream()
                    // Sort each entity by the position of its id in the key list.
                    .sorted(Comparator.comparingInt(entity ->
                            list.indexOf(entity.getId())))
                    .collect(Collectors.toList()));
}
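
The sort above calls indexOf for every comparison and still returns a shorter list when some id has no row. A possible variant, sketched here under assumptions, indexes the rows by id first and then walks the key list, leaving null where nothing was found. The class KeyOrderAlignment, the helper alignToKeys and the minimal Child class are hypothetical and only make the sketch self-contained:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class KeyOrderAlignment {
    // Minimal stand-in for the question's Child entity (hypothetical shape).
    public static class Child {
        private final Long id;
        public Child(Long id) { this.id = id; }
        public Long getId() { return id; }
    }

    // Returns one element per key, in key order; null where no row was found.
    public static List<Child> alignToKeys(List<Long> keys, List<Child> rows) {
        Map<Long, Child> byId = new HashMap<>();
        for (Child row : rows) {
            byId.put(row.getId(), row);
        }
        return keys.stream().map(byId::get).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Long> keys = Arrays.asList(1L, 3L, 5L, 2L, 4L, 6L, 7L);
        // Rows as a database might return them: sorted by id, id 7 missing.
        List<Child> rows = Arrays.asList(new Child(1L), new Child(2L), new Child(3L),
                new Child(4L), new Child(5L), new Child(6L));
        List<Long> alignedIds = alignToKeys(keys, rows).stream()
                .map(child -> child == null ? null : child.getId())
                .collect(Collectors.toList());
        System.out.println(alignedIds); // prints [1, 3, 5, 2, 4, 6, null]
    }
}
```

Inside the batch loader this would be used as alignToKeys(list, childService.findByIds(list)). Newer java-dataloader releases also offer a MappedBatchLoader that returns a Map keyed by id instead of a positional list; whether the version pulled in by your setup includes it is something to check.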

huangapple
  • Posted on August 28, 2020 at 16:50:34
  • When reposting, please keep the link to this article: https://go.coder-hub.com/63630507.html