Fuzzy Matching a list of names in Java

Question

I am trying to use the fuzzy-matcher library to match a list of names from our database.
The data is serialized into Java objects, and I want to find out how to map it to the Document object defined in the fuzzy-matcher library.

https://github.com/intuit/fuzzy-matcher

Our User.java class has these attributes:

  • userId
  • firstName
  • lastName
  • address
  • etc.

We have over 1,000 users in our database and would like to run them through fuzzy-matcher to help detect duplicates.

Any code snippet that helps us better understand the library would be appreciated.

Answer 1

Score: 0

Here each User object maps to a Document, and each of its attributes maps to an Element object in fuzzy-matcher.

If you have a Collection of User objects in Java, you can convert it to Document/Element objects like this:

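import com.intuit.fuzzymatcher.domain.Document;
import com.intuit.fuzzymatcher.domain.Element;
import com.intuit.fuzzymatcher.domain.ElementType;
import java.util.List;
import java.util.stream.Collectors;

List<User> users = // fetch data from db

// One Document per user, keyed by userId; each attribute becomes a typed Element.
List<Document> documents = users.stream().map(user ->
    new Document.Builder(String.valueOf(user.getUserId()))
            .addElement(new Element.Builder<String>().setValue(user.getFirstName() + " " + user.getLastName()).setType(ElementType.NAME).createElement())
            .addElement(new Element.Builder<String>().setValue(user.getAddress()).setType(ElementType.ADDRESS).createElement())
            .createDocument()
).collect(Collectors.toList());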
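
One detail worth guarding against: plain string concatenation turns a missing first or last name into the literal text "null", which would then take part in the fuzzy comparison. A minimal sketch of a null-safe helper follows; the class name NameValues and the method fullName are hypothetical and not part of fuzzy-matcher.

```java
import java.util.Objects;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class NameValues {

    /** Joins the non-blank name parts with a single space; returns "" if none are present. */
    static String fullName(String firstName, String lastName) {
        return Stream.of(firstName, lastName)
                .filter(Objects::nonNull)          // skip missing parts instead of printing "null"
                .map(String::trim)                 // drop stray leading/trailing whitespace
                .filter(part -> !part.isEmpty())   // skip parts that were only whitespace
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        System.out.println(fullName("Ada", "Lovelace"));
        System.out.println(fullName("Ada", null));
    }
}
```

The result could be passed to setValue(...) in place of the concatenation. How an empty name should be handled (skipping the element or skipping the user) is a decision left to the caller.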

Once you have a List of Documents, you can pass it to one of the methods in MatchService to trigger a fuzzy match:

import com.intuit.fuzzymatcher.component.MatchService;
import com.intuit.fuzzymatcher.domain.Match;
MatchService matchService = new MatchService();
Map<String, List<Match<Document>>> result = matchService.applyMatchByDocId(documents);

Note that this method returns a result grouped by document key, which here is the userId: each map entry holds the matches found for that user.
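
Because the result is keyed per user, a pair of duplicates can show up twice (once under each userId), and chains such as 1~2 and 2~3 are spread over several entries. The sketch below merges such entries into duplicate groups. It uses only the JDK and assumes each Match has already been reduced to the key of the document it matched (for example via something like match.getMatchedWith().getKey(); check the library's Javadoc for the exact accessors). The class name DuplicateClusters is hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class DuplicateClusters {

    /** Groups user IDs that are connected, directly or through a chain, by at least one match. */
    static List<Set<String>> cluster(Map<String, List<String>> matchedIdsByUserId) {
        // Build an undirected graph so that a match A -> B also links B -> A.
        Map<String, Set<String>> graph = new LinkedHashMap<>();
        matchedIdsByUserId.forEach((id, matchedIds) -> {
            for (String other : matchedIds) {
                graph.computeIfAbsent(id, k -> new HashSet<>()).add(other);
                graph.computeIfAbsent(other, k -> new HashSet<>()).add(id);
            }
        });

        // Breadth-first search from every unvisited ID collects one group per connected component.
        List<Set<String>> clusters = new ArrayList<>();
        Set<String> visited = new HashSet<>();
        for (String start : graph.keySet()) {
            if (!visited.add(start)) {
                continue; // already assigned to an earlier group
            }
            Set<String> group = new TreeSet<>();
            Deque<String> queue = new ArrayDeque<>();
            queue.add(start);
            while (!queue.isEmpty()) {
                String current = queue.poll();
                group.add(current);
                for (String next : graph.get(current)) {
                    if (visited.add(next)) {
                        queue.add(next);
                    }
                }
            }
            clusters.add(group);
        }
        return clusters;
    }

    public static void main(String[] args) {
        Map<String, List<String>> matches = new LinkedHashMap<>();
        matches.put("1", Arrays.asList("2"));
        matches.put("2", Arrays.asList("1", "3"));
        matches.put("4", Arrays.asList("5"));
        System.out.println(cluster(matches)); // prints [[1, 2, 3], [4, 5]]
    }
}
```

Users with no matches never enter the graph, so they are left out of the output. Treating chained matches as one group is a design choice: it can merge records that only resemble each other indirectly, so each group is best treated as a candidate set for review.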

huangapple
  • Posted on August 4, 2020 at 03:39:38
  • When reposting, please keep the link to this article: https://go.coder-hub.com/63235916.html