Jackson serialization: String field containing regex expression de-escaping to get actual literal


Question


I have a DTO with a String field that holds a regular expression. To be a valid Java string literal, it contains doubled backslashes (\\), because the first backslash escapes the second, which is easy to understand. For example:

    private final String regex = "myapp\\.\\w{2,3}\\/confirmation(.*)";

The actual regex to use is myapp\.\w{2,3}\/confirmation(.*).
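
To make the difference between the source literal and the runtime value concrete, here is a minimal sketch using only the JDK. The class and method names (RegexLiteralDemo, backslashCount, matches) and the sample URLs are made up for illustration; only the regex literal comes from the question.

```java
import java.util.regex.Pattern;

public class RegexLiteralDemo {
    // Same literal as in the DTO: each "\\" in the source is ONE backslash at runtime.
    public static final String REGEX = "myapp\\.\\w{2,3}\\/confirmation(.*)";

    // Counts the backslash characters actually stored in the string.
    public static long backslashCount(String s) {
        return s.chars().filter(c -> c == '\\').count();
    }

    // Checks whether the whole input matches the regex held in REGEX.
    public static boolean matches(String input) {
        return Pattern.matches(REGEX, input);
    }

    public static void main(String[] args) {
        System.out.println(REGEX);                 // the runtime value, single backslashes
        System.out.println(backslashCount(REGEX)); // backslashes really present in memory
        System.out.println(matches("myapp.com/confirmation?id=1")); // hypothetical sample input
    }
}
```

The point is that the doubled backslashes exist only in the .java source file. The String object in memory already holds the actual regex, so anything that serializes the object starts from the single-backslash form.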

I send this DTO in a Kafka message, and the serialization is done by Jackson:

    ProducerRecord<String, String> record = new ProducerRecord<>(
        kafkaTopicProperties.getTopic(),
        String.valueOf(myDto.getOrderId()),
        objectMapper.writeValueAsString(myDto)
    );

Understandably, Jackson cannot distinguish a normal string from a regex string, so it sends the string with its backslashes escaped. Omitting the escaping in JSON is also invalid (at least, when I edit a .json file and delete the escaping \, IntelliJ shows a parsing error), so valid JSON needs the escaping too. So far, so normal.

But then the Kafka consumer will receive an escaped regex string and will have to de-escape it (removing the extra \). Here is the problem: a syntactic change results in a semantic difference.

Actually, because Kafka places no restrictions on what is sent, we are free to de-escape before sending, since the payload is just plain text.

But can Jackson do this magic for me?

Answer 1

Score: 0

Thanks to @Bohemian and @NyamiouTheGaleanthrope.

Indeed, as you said, I think I have found the problem: the consumer's value deserializer should be org.springframework.kafka.support.serializer.JsonDeserializer, and the producer's value serializer should be org.springframework.kafka.support.serializer.JsonSerializer. With that, all is good: I can see in the log that the regex has no extra escape characters. I was using String(De)Serializer before.
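
For intuition on why no manual de-escaping is needed once both sides treat the payload as JSON, here is a rough JDK-only sketch. It is not Jackson's implementation: toJsonString and fromJsonString are hypothetical, hand-written stand-ins that handle only the backslash and double-quote escapes, just enough to show the round trip for this regex.

```java
public class JsonEscapeRoundTrip {
    // Simplified stand-in for a JSON writer: escapes only backslash and double quote.
    public static String toJsonString(String value) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : value.toCharArray()) {
            if (c == '\\' || c == '"') {
                sb.append('\\'); // prefix the character with an escaping backslash
            }
            sb.append(c);
        }
        return sb.append('"').toString();
    }

    // Simplified stand-in for a JSON parser: reverses the escaping above.
    public static String fromJsonString(String json) {
        StringBuilder sb = new StringBuilder();
        // Start at 1 and stop before the last index to skip the surrounding quotes.
        for (int i = 1; i < json.length() - 1; i++) {
            char c = json.charAt(i);
            if (c == '\\') {
                c = json.charAt(++i); // drop the escape, keep the escaped character
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String regex = "myapp\\.\\w{2,3}\\/confirmation(.*)";
        String wire = toJsonString(regex);        // what travels in the message
        System.out.println(wire);                 // doubled backslashes on the wire
        System.out.println(fromJsonString(wire)); // original regex after parsing
    }
}
```

The doubled backslashes exist only in the JSON text on the wire. A consumer that parses the payload as JSON gets the original regex back, while one that reads it as a plain string sees the escaped form, which fits what I observed with String(De)Serializer.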

Configs for both sides put together:

application.yml:

    spring:
      kafka:
        producer:
          key-serializer: org.apache.kafka.common.serialization.StringSerializer
          value-serializer: org.springframework.kafka.support.serializer.JsonSerializer
        consumer: # picked up if you autowire the consumer
          key-deserializer: org.apache.kafka.common.serialization.StringDeserializer
          value-deserializer: org.springframework.kafka.support.serializer.JsonDeserializer
          properties.spring.json.trusted.packages: '*'

Producer:

    ProducerRecord<String, OrderMessageDto> record = new ProducerRecord<>(
        kafkaTopicProperties.getTopic(),
        String.valueOf(orderDto.getId()),
        orderDto
    );
    kafkaTemplateOrder.send(record).get(kafkaTopicProperties.getTimeout(), TimeUnit.MILLISECONDS);

Consumer (I only have a consumer in tests, so I have to configure it by hand):

    @Autowired
    private EmbeddedKafkaBroker kafkaBroker;
    ...
    private ConsumerRecords<String, OrderPaymentUrlDto> consumeRecords(String topic) {
        Map<String, Object> consumerProps = KafkaTestUtils.consumerProps(topic, "true", kafkaBroker);
        // Read from the start of the topic so records sent before subscribing are seen.
        consumerProps.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        consumerProps.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        consumerProps.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, JsonDeserializer.class);
        JsonDeserializer<OrderPaymentUrlDto> valueDeserializer = new JsonDeserializer<>();
        valueDeserializer.addTrustedPackages("*"); // necessary to accept the DTO as a trusted type
        ConsumerFactory<String, OrderPaymentUrlDto> factory = new DefaultKafkaConsumerFactory<>(
            consumerProps,
            new StringDeserializer(),
            valueDeserializer
        );
        Consumer<String, OrderPaymentUrlDto> consumer = factory.createConsumer();
        kafkaBroker.consumeFromAnEmbeddedTopic(consumer, topic);
        return KafkaTestUtils.getRecords(consumer, 2000);
    }

huangapple
  • Posted on June 5, 2020, at 21:56:09
  • When reposting, please keep the link to this article: https://go.coder-hub.com/62217027.html