DocumentDB查询正在使用错误的索引。

huangapple go评论46阅读模式
英文:

DocumentDB query is using the wrong index

问题

I'm using MongoDB version 4.0.0 with DocumentDB.

I have a collection called transactions. That collection has these two indices:

{
  "creation_date": -1
}
{
  "project_id": 1,
  "mid.id": 1
}

I'm running the following query:

db.transactions.explain('executionStats').find({
  "project_id": "1",
  "mid.id": {
    $in: [/* array with 47 ids */]
  }
}).sort({creation_date: -1}).skip(0).limit(50)

The execution stats for that code are as follows:

{
  "queryPlanner": {
    "plannerVersion": 1,
    "namespace": "myDB.transactions",
    "winningPlan": {
      "stage": "SUBSCAN",
      "inputStage": {
        "stage": "LIMIT_SKIP",
        "inputStage": {
          "stage": "IXSCAN",
          "indexName": "creation_date_idx",
          "direction": "forward"
        }
      }
    }
  },
  "executionStats": {
    "executionSuccess": true,
    "executionTimeMillis": "445731.524",
    "planningTimeMillis": "1.487",
    "executionStages": {
      "stage": "SUBSCAN",
      "nReturned": "44",
      "executionTimeMillisEstimate": "445729.856",
      "inputStage": {
        "stage": "LIMIT_SKIP",
        "nReturned": "44",
        "executionTimeMillisEstimate": "445729.787",
        "inputStage": {
          "stage": "IXSCAN",
          "nReturned": "44",
          "executionTimeMillisEstimate": "445729.765",
          "indexName": "creation_date_idx",
          "direction": "forward"
        }
      }
    }
  },
  "serverInfo": {
    "host": "127.0.0.1",
    "port": 27017,
    "version": "4.0.0"
  },
  "ok": 1,
  "operationTime": {
    "$timestamp": "7231419170506670081"
  }
}

It chose the creation_date index and took 445 seconds to perform the query!

If I try to hint the query to force using the project_id and mid.id index, here is the winning plan:

{
  "queryPlanner": {
    "plannerVersion": 1,
    "namespace": "myDB.transactions",
    "winningPlan": {
      "stage": "SUBSCAN",
      "inputStage": {
        "stage": "LIMIT_SKIP",
        "inputStage": {
          "stage": "SORT",
          "sortPattern": {
            "creation_date": -1
          },
          "inputStage": {
            "stage": "IXSCAN",
            "indexName": "project_id_mid.id_idx",
            "direction": "forward"
          }
        }
      }
    }
  },
  "executionStats": {
    "executionSuccess": true,
    "executionTimeMillis": "10.966",
    "planningTimeMillis": "1.615",
    "executionStages": {
      "stage": "SUBSCAN",
      "nReturned": "44",
      "executionTimeMillisEstimate": "9.310",
      "inputStage": {
        "stage": "LIMIT_SKIP",
        "nReturned": "44",
        "executionTimeMillisEstimate": "9.292",
        "inputStage": {
          "stage": "SORT",
          "nReturned": "44",
          "executionTimeMillisEstimate": "9.287",
          "sortPattern": {
            "creation_date": -1
          },
          "inputStage": {
            "stage": "IXSCAN",
            "nReturned": "44",
            "executionTimeMillisEstimate": "9.208",
            "indexName": "project_id_mid.id_idx",
            "direction": "forward"
          }
        }
      }
    }
  },
  "serverInfo": {
    "host": "127.0.0.1",
    "port": 27017,
    "version": "4.0.0"
  },
  "ok": 1,
  "operationTime": {
    "$timestamp": "7231419990845423617"
  }
}

It takes 10 milliseconds to perform the query. So... why is DocumentDB choosing the creation_date index by default? Is something wrong with my query?

Some other things I tried:

  • If I run the query with MongoDB Compass pointing to DocumentDB, the query works fine without me having to hint anything.
  • If I reduce the size of the mid.id array to 14 items, it works
英文:

I'm using MongoDB version 4.0.0 with DocumentDB.

I have a collection called transactions. That collection has this two indices:

{
  "creation_date": -1
}
{
  "project_id": 1,
  "mid.id": 1  
}

I'm running the following query:

db.transactions.explain('executionStats').find({
  "project_id": "1",
  "mid.id": {
    $in: [/* array with 47 ids */]
  }
}).sort({creation_date: -1}).skip(0).limit(50)

The execution stats for that code are the following:

{
  "queryPlanner": {
    "plannerVersion": 1,
    "namespace": "myDB.transactions",
    "winningPlan": {
      "stage": "SUBSCAN",
      "inputStage": {
        "stage": "LIMIT_SKIP",
        "inputStage": {
          "stage": "IXSCAN",
          "indexName": "creation_date_idx",
          "direction": "forward"
        }
      }
    }
  },
  "executionStats": {
    "executionSuccess": true,
    "executionTimeMillis": "445731.524",
    "planningTimeMillis": "1.487",
    "executionStages": {
      "stage": "SUBSCAN",
      "nReturned": "44",
      "executionTimeMillisEstimate": "445729.856",
      "inputStage": {
        "stage": "LIMIT_SKIP",
        "nReturned": "44",
        "executionTimeMillisEstimate": "445729.787",
        "inputStage": {
          "stage": "IXSCAN",
          "nReturned": "44",
          "executionTimeMillisEstimate": "445729.765",
          "indexName": "creation_date_idx",
          "direction": "forward"
        }
      }
    }
  },
  "serverInfo": {
    "host": "127.0.0.1",
    "port": 27017,
    "version": "4.0.0"
  },
  "ok": 1,
  "operationTime": {
    "$timestamp": "7231419170506670081"
  }
}

It chose the creation_date index and took 445 seconds to perform the query!

If I try, instead, to hint the query to force using the project_id and mid.id index, here is the winning plan:

{
  "queryPlanner": {
    "plannerVersion": 1,
    "namespace": "myDB.transactions",
    "winningPlan": {
      "stage": "SUBSCAN",
      "inputStage": {
        "stage": "LIMIT_SKIP",
        "inputStage": {
          "stage": "SORT",
          "sortPattern": {
            "creation_date": -1
          },
          "inputStage": {
            "stage": "IXSCAN",
            "indexName": "project_id_mid.id_idx",
            "direction": "forward"
          }
        }
      }
    }
  },
  "executionStats": {
    "executionSuccess": true,
    "executionTimeMillis": "10.966",
    "planningTimeMillis": "1.615",
    "executionStages": {
      "stage": "SUBSCAN",
      "nReturned": "44",
      "executionTimeMillisEstimate": "9.310",
      "inputStage": {
        "stage": "LIMIT_SKIP",
        "nReturned": "44",
        "executionTimeMillisEstimate": "9.292",
        "inputStage": {
          "stage": "SORT",
          "nReturned": "44",
          "executionTimeMillisEstimate": "9.287",
          "sortPattern": {
            "creation_date": -1
          },
          "inputStage": {
            "stage": "IXSCAN",
            "nReturned": "44",
            "executionTimeMillisEstimate": "9.208",
            "indexName": "project_id_mid.id_idx",
            "direction": "forward"
          }
        }
      }
    }
  },
  "serverInfo": {
    "host": "127.0.0.1",
    "port": 27017,
    "version": "4.0.0"
  },
  "ok": 1,
  "operationTime": {
    "$timestamp": "7231419990845423617"
  }
}

It takes 10 milliseconds to perform the query. So... why is DocumentDB choosing the creation_date index by default? Is something wrong with my query?

Some other things I tried:

  • If I run the query with MongoDB Compass pointing to DocumentDB, the query works fine without me having to hint anything.
  • If I reduce the size of the mid.id array to 14 items, it works

答案1

得分: 2

你迫使查询规划器在选择最小化检查文档数量的索引和避免内存排序的索引之间做出选择。在测试计划时,哪个性能更好取决于索引开头附近有最多结果的索引以及避免内存排序的索引会因减少内存使用而获得奖励。

如果您创建一个既可以执行这两项任务的索引,就像这样:

{
"project_id": 1,
"mid.id": 1,
"creation_date": -1
}

那么它应该在每次比赛中获胜,毫不费力。

英文:

You are forcing the query planner to choose between an index that minimized the number of documents examined, and an index that avoids an in-memory sort. Which performs better when testing the plans will depend on things like which index has the most results near the beginning of the index, and the one that avoids the in-memory sort gets a bonus for reduced memory usage.

If you create an index that will do both, like:

{
"project_id": 1,
"mid.id": 1,
"creation_date": -1
}

It should win, hands down, every time.

huangapple
  • 本文由 发表于 2023年5月10日 13:51:07
  • 转载请务必保留本文链接:https://go.coder-hub.com/76215262.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定