Fuzzy Matching a list of names in Java
Question
I am trying to use the fuzzy-matcher library to match a list of names from our database.
The data is already deserialized into Java objects, and I want to find out how to map it to the Document object defined in the fuzzy-matcher library:
https://github.com/intuit/fuzzy-matcher
Our User.java class has these attributes:
- userId
- firstName
- lastName
- address
- etc.
We have over 1000 users in our database and would like to run them through fuzzy-matcher to help detect duplicates.
Any code snippet that helps us better understand the library would be appreciated.
Answer 1
Score: 0
Here the User object maps to a Document, and each of its attributes needs to map to an Element object in fuzzy-matcher.
If you have a Collection of User objects in Java, you can convert it to Document/Element objects like this:
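import com.intuit.fuzzymatcher.domain.Document;
import com.intuit.fuzzymatcher.domain.Element;
import com.intuit.fuzzymatcher.domain.ElementType;
import java.util.List;
import java.util.stream.Collectors;

List<User> users = ...; // fetch data from db
List<Document> documents = users.stream().map(user -> {
    // The document key is a String, so the userId is converted explicitly
    return new Document.Builder(String.valueOf(user.getUserId()))
        .addElement(new Element.Builder<String>().setValue(user.getFirstName() + " " + user.getLastName()).setType(ElementType.NAME).createElement())
        .addElement(new Element.Builder<String>().setValue(user.getAddress()).setType(ElementType.ADDRESS).createElement())
        .createDocument();
}).collect(Collectors.toList());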
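One thing to watch in the mapping above: plain string concatenation in Java turns a null firstName or lastName into the literal text "null", which would then take part in the name comparison. A minimal sketch of a null-safe alternative follows. UserNames and fullName are hypothetical helper names, not part of fuzzy-matcher, and it is only an assumption that your User fields can be null or padded with whitespace.

```java
import java.util.Objects;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of fuzzy-matcher.
final class UserNames {

    private UserNames() {}

    /** Joins the non-blank name parts with a single space; returns "" if both are missing. */
    static String fullName(String firstName, String lastName) {
        return Stream.of(firstName, lastName)
                .filter(Objects::nonNull)          // drop missing parts
                .map(String::trim)                 // remove surrounding whitespace
                .filter(part -> !part.isEmpty())   // drop parts that were only whitespace
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        System.out.println(fullName("Steven", "Wilson")); // prints "Steven Wilson"
        System.out.println(fullName(null, " Wilson "));   // prints "Wilson"
    }
}
```

With such a helper, the value passed to setValue for the name element could be UserNames.fullName(user.getFirstName(), user.getLastName()) instead of the raw concatenation.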
Once you have a List of Documents, you can pass it to one of the methods in MatchService to trigger a fuzzy match:
MatchService matchService = new MatchService();
Map<String, List<Match<Document>>> result = matchService.applyMatchByDocId(documents);
Note that the method above returns results grouped by document key, which in this case is the userId.
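A result grouped by userId may report the same pair twice, once under each user, which is noisy when the goal is a list of duplicates to review. The sketch below collapses such a grouping into unique unordered pairs. It works on a plain map from each userId to the userIds it matched and does not touch the library at all. Building that map from the Match objects (for example by reading the key of each matched document) is left out, and DuplicatePairs and uniquePairs are hypothetical names introduced only for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of fuzzy-matcher.
final class DuplicatePairs {

    private DuplicatePairs() {}

    /** Collapses id -> matched ids into sorted "low|high" pairs, so A-B and B-A are reported once. */
    static Set<String> uniquePairs(Map<String, List<String>> matchesById) {
        Set<String> pairs = new TreeSet<>();
        matchesById.forEach((id, matchedIds) -> {
            for (String other : matchedIds) {
                if (id.equals(other)) {
                    continue; // ignore a record matched with itself
                }
                // Order the two ids lexicographically so both directions give the same string
                boolean idFirst = id.compareTo(other) < 0;
                pairs.add((idFirst ? id : other) + "|" + (idFirst ? other : id));
            }
        });
        return pairs;
    }

    public static void main(String[] args) {
        Map<String, List<String>> matchesById = new HashMap<>();
        matchesById.put("1", Arrays.asList("3"));        // user 1 matched user 3
        matchesById.put("3", Arrays.asList("1"));        // same pair, seen from user 3
        matchesById.put("2", Collections.emptyList());   // user 2 matched nobody
        System.out.println(uniquePairs(matchesById));    // prints [1|3]
    }
}
```

The "|" separator assumes userIds never contain that character; a small pair class would be a safer choice if they might.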