Search for keys of a document in MongoDB and Golang

Question

I'm using Go with the official MongoDB driver, and I want to save documents with the following structure:

type BlacklistRecord struct {
    ID         string         `bson:"_id" json:"id"`
    Type       string         `bson:"type" json:"type"`
    Value      string         `bson:"value" json:"value"`
    Source     map[string]int `bson:"source" json:"source"`
    LastUpdate string         `bson:"lastUpdate" json:"lastUpdate"`
}

This is a sample of what is saved in the database:

{
    _id: '1b836f704c884d28',
    type: 'url',
    value: 'smtp.clarinda.bluehornet.com',
    source: {
        'https://hostfiles.frogeye.fr/firstparty-trackers-hosts.txt': 1
    },
    lastUpdate: '2022-05-18 13:30:44.425104695 +0000 UTC m=+624.684836025'
}

What I want to do is search for documents in which at least one source contains a given substring (case-insensitive).
The source value is itself a map: each key is a source URL, and each value is the number of repeats in that source URL.
I have tried a lot but couldn't get far.
I know I can use:

key := bson.M{
    "$regex": primitive.Regex{
        Pattern: ".*" + value + ".*",
        Options: "i",
    },
}

This only works on the value stored under a key. How can I search the keys themselves?
For example, if someone gives me "hosTfiLes", I should return the records whose source field contains a key matching that expression (case-insensitive).
Thank you for your help.

Answer 1

Score: 1

I'm not sure whether this works directly with find and $regex. It's better to try it in the mongo shell first and then implement it in Go. Sample data:

/* 1 */
{
    "_id" : "1b836f704c884d28",
    "type" : "url",
    "value" : "smtp.clarinda.bluehornet.com",
    "source" : {
        "https://hostfiles.frogeye.fr/firstparty-trackers-hosts.txt" : 1.0
    },
    "lastUpdate" : "2022-05-18 13:30:44.425104695 +0000 UTC m=+624.684836025"
}

/* 2 */
{
    "_id" : "1b836f704c884d29",
    "type" : "url",
    "value" : "smtp.clarinda.bluehornet.org",
    "source" : {
        "https://hostfiles.frogeye.fr/firstparty-trackers-hosts.csv" : 1.0
    },
    "lastUpdate" : "2022-05-18 13:30:44.425104695 +0000 UTC m=+624.684836025"
}

/* 3 */
{
    "_id" : "1b836f704c884d30",
    "type" : "url",
    "value" : "smtp.clarinda.bluehornet.org",
    "source" : {
        "https://hostfiles.frogeye.fr/firstparty-trackers-hosts.csv" : 1.0,
        "https://hostfiles.frogeye.fr/firstparty-trackers-hosts.html" : 2.0
    },
    "lastUpdate" : "2022-05-18 13:30:44.425104695 +0000 UTC m=+624.684836025"
}

For instance, if we search for sources ending in .csv, record 2 matches on its only source and record 3 matches on one of its two sources. The following aggregation returns the expected result.

db.getCollection('blacklist').aggregate([ 
    { 
        $addFields: { 
            doc: { $objectToArray: "$source" } 
        } 
    }, 
    { 
        $match: {
            "doc.k": {$regex: /\.csv$/},
        } 
    },
    {
        $project: {"doc":0},
    }
])

Now to implement the same in Go, the code snippet:

pipeline := mongo.Pipeline{
	{{
		Key: "$addFields",
		Value: bson.M{
			"doc": bson.M{"$objectToArray": "$source"},
		},
	}},
	{{
		Key: "$match",
		Value: bson.M{
			"doc.k": bson.M{
				// Raw string so the backslash reaches MongoDB; "\." matches a literal dot.
				"$regex": `\.csv$`,
			},
		},
	}},
	{{
		Key:   "$project",
		Value: bson.M{"doc": 0},
	}},
}

cursor, err := collection.Aggregate(ctx, pipeline)
if err != nil {
	log.Fatal(err)
}

var result []BlacklistRecord
if err = cursor.All(ctx, &result); err != nil {
	log.Fatal(err)
}

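With the final $project stage in place, doc is removed before the results are decoded, so BlacklistRecord can stay as it is. If you drop that stage to keep the matched keys, add a field for doc to the struct; you can still exclude it from the JSON output.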

type BlacklistRecord struct {
	ID         string         `bson:"_id" json:"id"`
	Type       string         `bson:"type" json:"type"`
	Value      string         `bson:"value" json:"value"`
	Source     map[string]int `bson:"source" json:"source"`
	LastUpdate string         `bson:"lastUpdate" json:"lastUpdate"`
	Doc        []KV           `bson:"doc" json:"-"` // json:"-" keeps this field out of the JSON output
}

type KV struct {
	Key string `bson:"k"`
	// The value field ("v") is omitted here because only the keys are needed.
}
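
The pipeline above hard-codes its pattern, while the question starts from a user-supplied substring such as "hosTfiLes". Below is a minimal sketch of preparing that input, assuming the goal is a literal, case-insensitive substring match. keyPattern and matchesKey are hypothetical helper names, not part of the driver, and the sketch uses only the standard library. regexp.QuoteMeta targets Go's own regex syntax; the assumption here is that its backslash escapes are read as literals by MongoDB's regex engine as well.

```go
package main

import (
	"fmt"
	"regexp"
)

// keyPattern escapes regex metacharacters in value so that it is matched
// literally. (Hypothetical helper, not part of the MongoDB driver.)
func keyPattern(value string) string {
	return regexp.QuoteMeta(value)
}

// matchesKey checks the escaped pattern locally against one source key.
// Go's "(?i)" flag stands in for the "i" option used with $regex.
func matchesKey(key, value string) bool {
	re := regexp.MustCompile("(?i)" + keyPattern(value))
	return re.MatchString(key)
}

func main() {
	key := "https://hostfiles.frogeye.fr/firstparty-trackers-hosts.txt"
	fmt.Println(keyPattern("hosTfiLes.frogeye"))
	fmt.Println(matchesKey(key, "hosTfiLes"))
}
```

The returned pattern could then be used as the Pattern of the primitive.Regex from the question, with Options: "i", as the condition on "doc.k" in the $match stage. The leading and trailing ".*" are not needed, since an unanchored regex already matches anywhere in the string.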

Code snippet on Go Playground. If you run it locally, update the credentials and host:port to match your server.
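
To check what the aggregation is expected to return without a running server, the same matching rule can be mirrored in memory. This is only a sketch of the semantics (at least one key of source contains the substring, ignoring case), not a replacement for the server-side query. The struct is a trimmed copy of BlacklistRecord, and hasSourceKey and filterBySourceKey are hypothetical names introduced for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// BlacklistRecord is a trimmed copy of the struct from the question,
// keeping only the fields this sketch needs.
type BlacklistRecord struct {
	ID     string
	Source map[string]int
}

// hasSourceKey reports whether any key of source contains substr, ignoring case.
func hasSourceKey(source map[string]int, substr string) bool {
	needle := strings.ToLower(substr)
	for key := range source {
		if strings.Contains(strings.ToLower(key), needle) {
			return true
		}
	}
	return false
}

// filterBySourceKey keeps the records that have at least one matching key,
// preserving their input order.
func filterBySourceKey(records []BlacklistRecord, substr string) []BlacklistRecord {
	var out []BlacklistRecord
	for _, r := range records {
		if hasSourceKey(r.Source, substr) {
			out = append(out, r)
		}
	}
	return out
}

func main() {
	// The three sample records from the answer, reduced to ID and Source.
	records := []BlacklistRecord{
		{ID: "1b836f704c884d28", Source: map[string]int{
			"https://hostfiles.frogeye.fr/firstparty-trackers-hosts.txt": 1,
		}},
		{ID: "1b836f704c884d29", Source: map[string]int{
			"https://hostfiles.frogeye.fr/firstparty-trackers-hosts.csv": 1,
		}},
		{ID: "1b836f704c884d30", Source: map[string]int{
			"https://hostfiles.frogeye.fr/firstparty-trackers-hosts.csv":  1,
			"https://hostfiles.frogeye.fr/firstparty-trackers-hosts.html": 2,
		}},
	}
	for _, r := range filterBySourceKey(records, ".CSV") {
		fmt.Println(r.ID)
	}
}
```

A filter like this is only practical for small result sets or for unit tests; for a real collection the $objectToArray pipeline keeps the work on the server.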

References:

  1. Usage of aggregate for this: https://www.mongodb.com/community/forums/t/how-do-i-specify-a-document-keys-value-as-regex-expression-to-find-a-document-in-mongodb/4934/2
  2. Using $project for filtering: https://www.codegrepper.com/code-examples/whatever/mongodb+aggregate+remove+field

huangapple
  • Posted on May 20, 2022 at 17:36:31
  • When reposting, please keep the link to this article: https://go.coder-hub.com/72316712.html