Having a thread-safe cache for XSD validation

Question

I am working on an application for XSD validation, and I want my schemas to be cached.
The application uses multiple threads, so I am wondering what the thread-safe approach to loading the XSD files is.
At the moment, a new net.sf.saxon.s9api.Processor is created for every XSD. Its SchemaManager is then used to validate many XML files.

// true requests a licensed (Saxon-PE/EE) configuration; schema validation needs Saxon-EE
Processor processor = new Processor(true);

SchemaManager sm = processor.getSchemaManager();
sm.load(new StreamSource(new File(xsdFilename)));

Is this really necessary? Can I instantiate a single Processor and use it for all the XSDs? If so, would it be safe to obtain the SchemaManager from it in a multi-threaded context?

Additionally, is it correct to store SchemaManager instances in the Map that represents the application cache? Or should SchemaValidator objects be stored instead?

Answer 1

Score: 1

A Saxon Processor and its SchemaManager can hold multiple schemas (or rather, one schema that is the union of all the schema components from multiple schema documents), and they are thread-safe, so this should work fine as long as all the schemas are compatible. By that I mean you can't have two different schema components with the same name, e.g. as a result of loading different no-namespace schemas, or as a result of using xs:redefine.

If you want to keep your schemas separate, however, you will need a different Processor and SchemaManager for each one.

The SchemaValidator object isn't thread-safe: you should create a new SchemaValidator for each validation task. Creating this object is cheap.

It's also worth noting that there are corner cases where validating against a "composite" schema may change the validation outcome even if the several parts of the schema are disjoint: for example, when an element wildcard has processContents="strict" or processContents="lax".
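
The advice above (one shared schema holder per XSD, a fresh validator per task) can be sketched as a small cache. This is only an illustration under stated assumptions: the class name SchemaCache and its two function parameters are hypothetical, and the type parameters stand in for Saxon's SchemaManager and SchemaValidator so that the sketch compiles without Saxon on the classpath.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/**
 * Hypothetical cache (not part of Saxon). S stands for the shared,
 * thread-safe schema holder (e.g. a SchemaManager with its own Processor);
 * V stands for the per-task validator (e.g. a SchemaValidator).
 */
final class SchemaCache<S, V> {
    private final Map<String, S> schemas = new ConcurrentHashMap<>();
    private final Function<String, S> loader;       // loads one XSD; runs once per path
    private final Function<S, V> validatorFactory;  // cheap; runs once per validation task

    SchemaCache(Function<String, S> loader, Function<S, V> validatorFactory) {
        this.loader = loader;
        this.validatorFactory = validatorFactory;
    }

    /** Returns a fresh validator backed by the cached schema for this path. */
    V newValidator(String xsdPath) {
        // computeIfAbsent is atomic per key, so concurrent callers share a single load.
        S schema = schemas.computeIfAbsent(xsdPath, loader);
        // The validator itself is never cached or shared: it is not thread-safe.
        return validatorFactory.apply(schema);
    }
}
```

With Saxon-EE available, the loader would presumably create a new Processor(true), call getSchemaManager().load(new StreamSource(new File(path))), and wrap the checked SaxonApiException in an unchecked one; the validator factory could then be SchemaManager::newSchemaValidator. Caching the SchemaManager rather than the SchemaValidator matches the answer: the former is safe to share, the latter is cheap to recreate.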

huangapple
  • Published on September 12, 2020, 00:27:06
  • When reposting, please keep the link to this article: https://go.coder-hub.com/63850921.html