The best choice of database for large datasets

Question

I'm about to start a new project that will need a rather large database.

Each record has up to about 10 fields, and the data has the following characteristics:

  • Only inserts and reads will be performed
  • 8,603,876,000 records will be stored at the start (insert time and performance are not important for this initial load)
  • 500 new records will be added to the database every 10 minutes
  • Read query performance is very important (the queries look for relations between fields)

I would be glad if you shared your experience and ideas.
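
To put the figures above in perspective, here is a rough back-of-the-envelope sketch. The record count and insert rate come from the question; the average record size (`ASSUMED_BYTES_PER_RECORD`) is purely an assumption for illustration and should be replaced with a measured value.

```python
# Back-of-the-envelope sizing based on the figures stated in the question.
INITIAL_RECORDS = 8_603_876_000   # stated initial load
NEW_RECORDS_PER_BATCH = 500       # stated: 500 new records ...
BATCH_INTERVAL_MINUTES = 10       # ... every 10 minutes

# Assumption (NOT from the question): average stored size of one record.
ASSUMED_BYTES_PER_RECORD = 100


def records_per_day() -> int:
    """Number of records inserted per day at the stated rate."""
    batches_per_day = 24 * 60 // BATCH_INTERVAL_MINUTES
    return batches_per_day * NEW_RECORDS_PER_BATCH


def records_per_year() -> int:
    """Number of records inserted per 365-day year at the stated rate."""
    return records_per_day() * 365


def estimated_size_gb(records: int,
                      bytes_per_record: int = ASSUMED_BYTES_PER_RECORD) -> float:
    """Raw data size in decimal gigabytes, ignoring indexes and overhead."""
    return records * bytes_per_record / 1e9


if __name__ == "__main__":
    print("inserts per day: ", records_per_day())
    print("inserts per year:", records_per_year())
    print("initial raw size (GB, assumed record size):",
          estimated_size_gb(INITIAL_RECORDS))
    print("yearly growth relative to initial load:",
          records_per_year() / INITIAL_RECORDS)
```

At the stated rate the dataset grows by only a fraction of a percent per year, so the workload is dominated by reads over an almost static dataset.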

Answer 1

Score: 1

I think you need a database that can handle a large data volume and serve fast read queries. Some options:

MongoDB: A non-relational database that stores data in JSON-like documents. It is often recommended for large volumes of text and document-style data. It supports replication and sharding for high availability and scalability.

Microsoft Azure: A cloud platform rather than a single database. It offers a wide range of database software and management options, both relational and non-relational, and lets you build your own databases or manage existing ones with its tools and features.

Redshift: A cloud-based data warehouse that is optimized for analytics and fast query performance. It can handle petabytes of structured and semi-structured data using columnar storage and massively parallel processing.

BigQuery: A serverless data warehouse that runs on Google Cloud Platform. It is designed to process terabytes of data in seconds and petabytes in minutes using SQL queries. It also supports streaming data ingestion and machine learning integration.

MySQL: A free, open-source relational database that is popular for its ease of use. It can handle large amounts of data and many requests per second, and it supports replication and partitioning for high availability and scalability.
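
To illustrate why the columnar storage mentioned for Redshift helps analytical reads, here is a minimal sketch in plain Python. It is a simplified illustration only, not how Redshift or any other product is implemented; the sample records and field names are made up.

```python
from typing import Any, Dict, List

# Hypothetical sample records (field names are assumptions for illustration).
rows: List[Dict[str, Any]] = [
    {"id": 1, "region": "eu", "amount": 10.0},
    {"id": 2, "region": "us", "amount": 25.5},
    {"id": 3, "region": "eu", "amount": 4.5},
]


def to_columns(records: List[Dict[str, Any]]) -> Dict[str, List[Any]]:
    """Convert a row-oriented layout into one list per field (column-oriented)."""
    columns: Dict[str, List[Any]] = {}
    for record in records:
        for field, value in record.items():
            columns.setdefault(field, []).append(value)
    return columns


def total_amount_row_store(records: List[Dict[str, Any]]) -> float:
    """Row layout: every whole record is visited to read a single field."""
    return sum(record["amount"] for record in records)


def total_amount_column_store(columns: Dict[str, List[Any]]) -> float:
    """Column layout: only the 'amount' column is read; other fields are skipped."""
    return sum(columns["amount"])


if __name__ == "__main__":
    columns = to_columns(rows)
    print(total_amount_row_store(rows))
    print(total_amount_column_store(columns))
```

Both functions return the same total, but the column layout only has to read the one field the query needs, which matters when a table has billions of rows and a query uses a few of its roughly 10 fields.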

I hope this helps.

Answer 2

Score: 0

For this problem, I think the following points are worth considering.

  • Consider a NoSQL database if your data is non-relational. NoSQL databases such as MongoDB can be faster than traditional relational databases for certain types of queries.
  • Use a load balancer to distribute the workload across multiple servers. This can improve performance and increase reliability by helping ensure that no single server is overloaded.
  • If you need more performance, consider partitioning the database. Partitioning splits the database into smaller, more manageable sections, so a query can often touch only the sections that hold the relevant data.
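
The partitioning and load-balancing ideas above can be sketched in a few lines of Python. This is a toy in-memory illustration under stated assumptions: the partition count, the key field, and the server names are hypothetical, and a real database would handle partitioning itself.

```python
import itertools
import zlib
from typing import Any, Dict, Iterator, List

Record = Dict[str, Any]


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition index using a stable hash (CRC32)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def build_partitions(records: List[Record], key_field: str,
                     num_partitions: int) -> List[List[Record]]:
    """Split records into num_partitions smaller lists by hashing key_field."""
    partitions: List[List[Record]] = [[] for _ in range(num_partitions)]
    for record in records:
        index = partition_for(str(record[key_field]), num_partitions)
        partitions[index].append(record)
    return partitions


def lookup(partitions: List[List[Record]], key_field: str,
           key: Any) -> List[Record]:
    """Search only the one partition that can contain the key."""
    target = partitions[partition_for(str(key), len(partitions))]
    return [r for r in target if str(r[key_field]) == str(key)]


def round_robin(servers: List[str]) -> Iterator[str]:
    """Simplest load-balancing policy: hand out servers in rotation."""
    return itertools.cycle(servers)


if __name__ == "__main__":
    data = [{"id": i, "value": i * i} for i in range(100)]
    parts = build_partitions(data, "id", num_partitions=4)
    print([len(p) for p in parts])
    print(lookup(parts, "id", 42))
    balancer = round_robin(["replica-a", "replica-b"])  # hypothetical names
    print([next(balancer) for _ in range(4)])
```

A lookup by key inspects one partition instead of the whole dataset, and the round-robin iterator spreads successive read requests evenly across the replicas.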

huangapple
  • Posted on May 7, 2023, 18:10:12
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76193266.html