Jackson serialization: String field containing regex expression de-escaping to get actual literal


Question


I have a DTO with a String field that holds a regular expression. To be a valid Java string literal, it contains doubled backslashes (\\), because the first backslash escapes the second, which is easy to understand. For example:

    private final String regex = "myapp\\.\\w{2,3}\\/confirmation(.*)";

The actual regex to use is myapp\.\w{2,3}\/confirmation(.*).
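The gap between the source literal and the runtime value can be checked with the JDK alone. The sketch below is only an illustration: the class name `RegexLiteralDemo`, the helper methods, and the sample input are made up here and are not part of the original DTO.

```java
import java.util.regex.Pattern;

public class RegexLiteralDemo {
    // Same literal as in the DTO: each "\\" in the source is ONE backslash at runtime.
    static final String REGEX = "myapp\\.\\w{2,3}\\/confirmation(.*)";

    // Counts the backslash characters actually present in the runtime string.
    static long countBackslashes(String s) {
        return s.chars().filter(c -> c == '\\').count();
    }

    // Hypothetical helper: true if the whole input matches REGEX.
    static boolean matches(String input) {
        return Pattern.matches(REGEX, input);
    }

    public static void main(String[] args) {
        System.out.println(REGEX);                    // runtime content, single backslashes
        System.out.println(countBackslashes(REGEX));  // backslashes in the runtime value
        System.out.println(matches("myapp.com/confirmation?id=1"));
    }
}
```

The doubled backslashes exist only in the Java source text. The String object already holds the actual regex, so nothing has to be de-escaped on the Java side.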

I will send this DTO in a Kafka message, with the serialization done by Jackson:

ProducerRecord<String, String> record = new ProducerRecord<>(
    kafkaTopicProperties.getTopic(),
    String.valueOf(myDto.getOrderId()),
    objectMapper.writeValueAsString(myDto)
);

Understandably, Jackson cannot tell an ordinary string from a regex string, so it will send the string with its backslashes escaped. Omitting the escaping is not valid JSON either (at least, when I edit a .json file and delete the escaping \, IntelliJ reports a parse error), so valid JSON needs the escaping too. So far, so normal.

But then the Kafka consumer will receive an escaped regex string and will have to de-escape it (removing the extra \). Here comes the problem: a syntactic change results in a semantic difference.

In fact, because Kafka places no restrictions on what is sent, we are free to de-escape before sending, since the payload would just be plain text.

But can Jackson do this magic for me?

Answer 1

Score: 0

Thanks to @Bohemian and @NyamiouTheGaleanthrope.

Indeed, as you said, I think I have found the problem: the consumer's deserializer should be org.springframework.kafka.support.serializer.JsonDeserializer, and the producer's serializer should be org.springframework.kafka.support.serializer.JsonSerializer. With that, all is good: I can see in the log that the regex has no extra escape characters. I was using String(De)Serializer before.
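A plausible reading of why this helps: a consumer that reads the value as a plain string sees the raw JSON text, where every backslash is doubled, while a JSON deserializer parses that text back into the original value. The sketch below imitates the JSON string-escaping rule for backslashes and double quotes by hand, using only the JDK. It is not Jackson's or Spring Kafka's implementation, and `JsonEscapeDemo`, `escape`, and `unescape` are hypothetical names used for illustration.

```java
public class JsonEscapeDemo {
    // Minimal JSON-style string escaping: handles only backslash and double quote.
    static String escape(String raw) {
        return raw.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Inverse of escape(): drops the escaping backslash and keeps the next character.
    static String unescape(String escaped) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < escaped.length(); i++) {
            char c = escaped.charAt(i);
            if (c == '\\' && i + 1 < escaped.length()) {
                out.append(escaped.charAt(++i)); // the escaped character itself
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String regex = "myapp\\.\\w{2,3}\\/confirmation(.*)";
        String onTheWire = escape(regex);        // what a plain-string consumer would see
        System.out.println(onTheWire);
        System.out.println(unescape(onTheWire)); // what a JSON-aware consumer ends up with
    }
}
```

Under this reading, the extra backslashes were never added to the regex itself. They belong to the JSON text and disappear as soon as that text is parsed as JSON instead of being read as a raw string.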

Configs for both sides put together:

application.yml:

spring:
  kafka:
    producer:
      key-serializer: org.apache.kafka.common.serialization.StringSerializer
      value-serializer: org.springframework.kafka.support.serializer.JsonSerializer
    consumer: # would be picked if you autowired the consumer
      key-deserializer: org.apache.kafka.common.serialization.StringDeserializer
      value-deserializer: org.springframework.kafka.support.serializer.JsonDeserializer
      properties.spring.json.trusted.packages: '*'

Producer:

ProducerRecord<String, OrderMessageDto> record = new ProducerRecord<>(
    kafkaTopicProperties.getTopic(),
    String.valueOf(orderDto.getId()),
    orderDto
);
kafkaTemplateOrder.send(record).get(kafkaTopicProperties.getTimeout(), TimeUnit.MILLISECONDS);

Consumer (I only have a consumer in the test, so I have to configure it by hand):

    @Autowired
    private EmbeddedKafkaBroker kafkaBroker;

    ...


    private ConsumerRecords<String, OrderPaymentUrlDto> consumeRecords(String topic) {
        Map<String, Object> consumerProps = KafkaTestUtils.consumerProps(topic, "true", kafkaBroker);
        consumerProps.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        consumerProps.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        consumerProps.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, JsonDeserializer.class);
        JsonDeserializer<OrderPaymentUrlDto> valueDeserializer = new JsonDeserializer<>();
        valueDeserializer.addTrustedPackages("*"); // needed so the DTO's package is trusted
        ConsumerFactory<String, OrderPaymentUrlDto> factory = new DefaultKafkaConsumerFactory<>(
                consumerProps,
                new StringDeserializer(),
                valueDeserializer
        );
        // Typed consumer instead of the raw Consumer type
        Consumer<String, OrderPaymentUrlDto> consumer = factory.createConsumer();
        kafkaBroker.consumeFromAnEmbeddedTopic(consumer, topic);
        return KafkaTestUtils.getRecords(consumer, 2000);
    }

huangapple
  • Posted on June 5, 2020, 21:56:09
  • When reposting, please keep the link to this article: https://go.coder-hub.com/62217027.html