Issue parsing JSON file with LangChain
Question
I need some help.
I have the following JSON content in a file and would like to use langchain.js and GPT to parse it, store it, and answer questions about it.
For example:
"find me jobs with 2 year experience" ==> should return a list
"I have knowledge in javascript find me jobs" ==> should return the jobs object
I use the LangChain JSON loader and can see that the file is parsed, but it reports 13 docs. There should be only 3 docs in the file. Is the JSON structure incorrect?
Here is a snippet of my parsing code:
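import { DirectoryLoader } from "langchain/document_loaders/fs/directory";
import { JSONLoader } from "langchain/document_loaders/fs/json";

// Load every .json file under docPath with the JSON loader
const loader = new DirectoryLoader(docPath, {
  ".json": (path) => new JSONLoader(path),
});
const docs = await loader.load();
console.log(docs);
console.log(docs.length);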
Here is my input data:
[
  {
    "jobid":"job1",
    "title":"software engineer",
    "skills":"java,javascript",
    "description":"this job requires a associate degrees in CS and 2 years experience"
  },
  {
    "jobid":"job2",
    "skills":"math, accounting, spreadsheet",
    "description":"this job requires a degrees in accounting and 2 years experience"
  },
  {
    "jobid":"job3",
    "title":"programmer",
    "skills":"java,javascript,cloud computing",
    "description":"this job requires a ,master degrees in CS and 3 years experience"
  }
]
Output:
[
  Document {
    pageContent: 'job1',
    metadata: {
      source: 'langchain-document-loaders-in-node-js/documents/jobs.json',
      line: 1
    }
  },
  Document {
    pageContent: 'software engineer',
    metadata: {
      source: 'langchain-document-loaders-in-node-js/documents/jobs.json',
      line: 2
    }
  },
  Document {
    pageContent: 'java,javascript',
    metadata: {
      source: 'langchain-document-loaders-in-node-js/documents/jobs.json',
      line: 3
    }
  },
  Document {
    pageContent: 'this job requires a associate degrees in CS and 2 years experience',
    metadata: {
      source: 'langchain-document-loaders-in-node-js/documents/jobs.json',
      line: 4
    }
  },
  Document {
    pageContent: 'job2',
    metadata: {
      source: 'langchain-document-loaders-in-node-js/documents/jobs.json',
      line: 5
    }
  },
  ...
]
Answer 1
Score: 1
Your JSON contains an array of three objects. Two of them have four properties, and one has three. All of the property values are text strings. It looks like your parser pulls each property value into its own Document.
You need to find a way to tell your parser that each object in the array is one Document.
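One way to get one document per job is to skip the per-string splitting and build the documents yourself from the parsed array. The following is only a minimal sketch under stated assumptions: `jobsToDocuments` is a hypothetical helper (not part of LangChain), and it returns plain objects with `pageContent` and `metadata` fields rather than real LangChain `Document` instances.

```javascript
// Sketch: build one document per top-level object in a JSON array.
// `jobsToDocuments` and the plain-object document shape are illustrative
// assumptions, not part of the LangChain API.
function jobsToDocuments(jsonText, source) {
  const jobs = JSON.parse(jsonText);
  if (!Array.isArray(jobs)) {
    throw new TypeError("expected a JSON array of job objects");
  }
  return jobs.map((job, index) => ({
    // Keep all fields of a job in one text block so that skills and
    // description stay attached to the job they belong to.
    pageContent: Object.entries(job)
      .map(([key, value]) => `${key}: ${value}`)
      .join("\n"),
    metadata: { source, index, jobid: job.jobid },
  }));
}

// Sample input mirroring the three jobs from the question.
const sample = JSON.stringify([
  { jobid: "job1", title: "software engineer", skills: "java,javascript",
    description: "this job requires a associate degrees in CS and 2 years experience" },
  { jobid: "job2", skills: "math, accounting, spreadsheet",
    description: "this job requires a degrees in accounting and 2 years experience" },
  { jobid: "job3", title: "programmer", skills: "java,javascript,cloud computing",
    description: "this job requires a ,master degrees in CS and 3 years experience" },
]);

const docs = jobsToDocuments(sample, "jobs.json");
console.log(docs.length); // 3
```

In a real project the file contents would come from reading `jobs.json`, and each returned object could presumably be wrapped in LangChain's `Document` class before being embedded and stored. Keeping a whole job in one `pageContent` means a query about experience or skills retrieves the full job rather than an isolated string.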