OLAP vs OLTP – 基本差异

huangapple go评论73阅读模式
英文:

OLAP vs OLTP - Fundamental difference

问题

OLTP与OLAP之间的主要区别是什么?我已经阅读并理解了OLAP用于分析工作负载,OLTP用于事务工作负载,这是其目的。为什么不能将OLTP数据库用于OLAP呢?我看到OLAP主要是列存储,而OLTP主要是行存储,主要是因为写入/更新与读取用例,这在以下链接中有解释:https://dataschool.com/data-modeling-101/row-vs-column-oriented-databases/。对其他解释非常感激。

我向多位专家咨询过这个问题,我得到的答案主要是OLTP用于事务系统,而OLAP用于分析。3NF到非规范化数据建模等等,但我好奇了解OLTP和OLAP中表格之间的一些基本区别。

英文:

What is a primary difference between a OLTP vs OLAP? I have read and understood OLAP is for analytics workloads and OLTP for transactions workload which is the purpose. Why cant a OLTP DB be used for OLAP as well? I see that OLAP is mostly column store vs OLTP being row store primarily because of Writes/Updates vs Read use cases which is explained in

https://dataschool.com/data-modeling-101/row-vs-column-oriented-databases/

Any other explanations are really appreciated.

I enquired this with multiple folks and i got the answers which i feel are predominantly the outcomes such as OLTP is for transactional systems and OLAP is for analytics. 3NF to denormalized data modelling etc but curious to understand the few fundamental differences between a table in OLTP and a table in OLAP.

答案1

得分: 3

OLTP,正如您所提到的,用于事务处理目的。这意味着您在OLTP数据库中以高并发率写入小对象。这些数据库需要更高的一致性水平和高可用性。通常在用户路径中使用OLTP数据库。

现在,我们不能将OLTP数据库用于OLAP目的吗?是的,绝对可以使用,但这将不是最佳选择。分析查询是长时间运行的查询。一个大型分析查询可能需要几分钟才能完成,并且可能需要读取许多行OLTP数据库。

在进行分析查询时,OLTP数据库将对被分析查询访问的行添加读锁,并在查询完成之前不会释放锁,这都是基于两阶段锁定范式的。因此,这些行将无法被任何事务查询写入,使事务查询等待或重试。

这是一个问题。我们正在使用OLTP数据库进行高速并发处理,但现在由于大型分析查询,数据库行被锁定。这降低了OLTP的可用性。

对业务意味着什么?如果OLTP数据库行被长时间锁定,实际客户将无法对其进行更改。您可以将此映射到任何业务用例。内部分析查询的运行会影响最终客户。

这种困境可以通过在专用OLAP数据库中添加而不是在同一OLTP数据库中运行分析查询来避免。

另一种看待的方式是,OLAP数据库不需要高可用性,因为我们将在其中运行内部分析查询,没有直接访问OLAP数据库的客户。因此,我们可以将这些数据库分开以避免与直接客户查询路径发生冲突。

英文:

OLTP as you mentioned is used for transactional purposes. It means you are writing small objects at high concurrency in OLTP databases. These databases require higher consistency levels and require high availability. OLTP databases are generally used in the user path.

Now, can't we use OLTP databases for OLAP purposes? Yes absolutely you can use but that would be suboptimal. The analytics queries are long-running queries. A large analytic query might require minutes to run and might require reading many many rows of the OLTP database.

While the analytics query is going on, the OLTP database will add read locks to the rows that are accessed by the analytic query and will not relinquish the lock till the query is complete. This is all based on 2 phase locking paradigm. Thus, these rows will not be writeable by any transactional query making the transactional queries wait or retried.

This is a problem. We are using the OLTP databases for high-speed concurrency but now the database rows are locked due to a large analytics query. This reduces the availability of the OLTP.

What does it mean to business? If OLTP database rows are locked for a long time, the actual customer will not be able to make changes to them. You can map this to any business use case. The end customer is impacted due to internal analytics queries running.

This quandary is avoided by adding a dedicated OLAP database instead of running the analytics query in the same OLTP database.

Another way to look at, the OLAP databases don't require to be highly available as we will be running internal analytics queries to it and no customer directly accessing the OLAP DB. So we can keep these databases separate to avoid any conflict with the direct customer query path.

huangapple
  • 本文由 发表于 2023年2月6日 10:44:53
  • 转载请务必保留本文链接:https://go.coder-hub.com/75356927.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定