One large model or two smaller models

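Question

Should I put the funding rounds in a separate fundingRound object, or keep everything in the CompanyFinancials model? And why?

I would recommend putting the funding rounds in a separate fundingRound object. This has several benefits:

  1. Clear data structure: storing each kind of data separately keeps the schema clear and maintainable, and makes it easier to add, update, or query a specific funding round.

  2. Database performance: if all funding-round data lives in the CompanyFinancials model, the table can become very wide, which may hurt database performance. A separate fundingRound object mitigates this, because each object holds only the data for one round.

  3. Extensibility: if more rounds are needed later, or the round data changes in some other way, a separate fundingRound object is easier to extend without touching the structure of the CompanyFinancials model.

  4. Readability and maintainability: a separate fundingRound object makes the code easier to read and maintain. Anyone looking at the code can see what each object is for, without different kinds of data being mixed together.

In short, putting the funding rounds in a separate fundingRound object improves the organization, performance, and maintainability of the code, so it is the better choice.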

I'm developing a database that holds financial information.

I have two models:

  1. Company (~30 fields)
  2. CompanyFinancials (~120 fields)

CompanyFinancials has a one-to-one field to Company that also serves as its primary key, which maintains the relationship.

I need to add 10 investment rounds to CompanyFinancials. Each round will have 12 fields.

So, the question is, do I build it like this:

class CompanyFinancials(models.Model):
    company = models.OneToOneField(Company, on_delete=models.CASCADE, primary_key=True)
    date_added = models.DateTimeField(auto_now=True)
    financial_start_year = models.CharField(max_length=2000, default='NONE', blank=True)
    ...
    ...
    ...

class fundingRound(models.Model):
    company_obj = models.OneToOneField(Company, on_delete=models.CASCADE, primary_key=True)
    funding_stage = models.CharField(max_length=2000, default='NONE', blank=True)
    target_size_of_raise = models.CharField(max_length=2000, default='NONE', blank=True)
    invest_amount = models.CharField(max_length=2000, default='NONE', blank=True)
    investment_vehicle = models.CharField(max_length=2000, default='NONE', blank=True)
    discount_percent = models.CharField(max_length=2000, default='NONE', blank=True)
    pre_money_valuation = models.CharField(max_length=2000, default='NONE', blank=True)
    dilution = models.CharField(max_length=2000, default='NONE', blank=True)
    post_money_valuation = models.CharField(max_length=2000, default='NONE', blank=True)
    equity_percent = models.CharField(max_length=2000, default='NONE', blank=True)

or like this:

class CompanyFinancials(models.Model):
    company = models.OneToOneField(Company, on_delete=models.CASCADE, primary_key=True)
    date_added = models.DateTimeField(auto_now=True)
    financial_start_year = models.CharField(max_length=2000, default='NONE', blank=True)
    ...
    ...
    ...
    round1_funding_stage = models.CharField(max_length=2000, default='NONE', blank=True)
    round1_target_size_of_raise = models.CharField(max_length=2000, default='NONE', blank=True)
    round1_invest_amount = models.CharField(max_length=2000, default='NONE', blank=True)
    round1_investment_vehicle = models.CharField(max_length=2000, default='NONE', blank=True)
    round1_discount_percent = models.CharField(max_length=2000, default='NONE', blank=True)
    round1_pre_money_valuation = models.CharField(max_length=2000, default='NONE', blank=True)
    round1_dilution = models.CharField(max_length=2000, default='NONE', blank=True)
    round1_post_money_valuation = models.CharField(max_length=2000, default='NONE', blank=True)
    round1_equity_percent = models.CharField(max_length=2000, default='NONE', blank=True)
    round2_funding_stage = models.CharField(max_length=2000, default='NONE', blank=True)
    round2_target_size_of_raise = models.CharField(max_length=2000, default='NONE', blank=True)
    round2_invest_amount = models.CharField(max_length=2000, default='NONE', blank=True)
    round2_investment_vehicle = models.CharField(max_length=2000, default='NONE', blank=True)
    round2_discount_percent = models.CharField(max_length=2000, default='NONE', blank=True)
    round2_pre_money_valuation = models.CharField(max_length=2000, default='NONE', blank=True)
    round2_dilution = models.CharField(max_length=2000, default='NONE', blank=True)
    round2_post_money_valuation = models.CharField(max_length=2000, default='NONE', blank=True)
    round2_equity_percent = models.CharField(max_length=2000, default='NONE', blank=True)
    ...
    ...
    ...
    round10_equity_percent = models.CharField(max_length=2000, default='NONE', blank=True)

I can see that the DB model will be very large if I manually put all 10 rounds into it, but is there really a disadvantage? It would be easier to work with an object that contains all the data.

Should I add a separate object for fundingRound, or keep it all in the CompanyFinancials model? And why?

Answer 1

Score: 1

> Should I add a separate object for FundingRound, or keep it all in the CompanyFinancials model? And why?

Yes, add a separate model. As one of my professors said: in modeling there are usually only three constants: zero, one, and infinity. Nine fields per round times ten rounds gives 90 fields, which results in a database that is cumbersome to query. If you want to check whether any round got more than $10M in investment, that requires a query with ten conditions joined by nine ORs.
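
To make the difference concrete, here is a minimal sketch that runs the "any round above $10M" check against both layouts. It uses the standard-library sqlite3 module rather than the Django ORM so that it runs on its own; the table names, the sample companies, and the amounts are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Wide layout: one invest_amount column per round (names follow the
# round<N>_invest_amount pattern from the question).
round_cols = [f"round{i}_invest_amount" for i in range(1, 11)]
conn.execute(
    "CREATE TABLE financials_wide (company TEXT PRIMARY KEY, "
    + ", ".join(f"{col} INTEGER" for col in round_cols)
    + ")"
)
conn.execute(
    "INSERT INTO financials_wide (company, round1_invest_amount, round2_invest_amount) "
    "VALUES ('Acme', 2000000, 15000000), ('Globex', 500000, NULL)"
)

# The check needs one comparison per round column, joined by ORs.
wide_where = " OR ".join(f"{col} > 10000000" for col in round_cols)
wide_hits = conn.execute(
    f"SELECT company FROM financials_wide WHERE {wide_where}"
).fetchall()

# Row layout: one row per funding round (hypothetical table).
conn.execute(
    "CREATE TABLE funding_round (company TEXT, round_number INTEGER, invest_amount INTEGER)"
)
conn.executemany(
    "INSERT INTO funding_round VALUES (?, ?, ?)",
    [("Acme", 1, 2_000_000), ("Acme", 2, 15_000_000), ("Globex", 1, 500_000)],
)

# The same check is a single comparison, however many rounds exist.
row_hits = conn.execute(
    "SELECT DISTINCT company FROM funding_round WHERE invest_amount > 10000000"
).fetchall()

print(wide_where.count(" OR "), wide_hits, row_hits)
```

Both queries find the same company, but only the wide one has to change when a round column is added.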

Even if all of that worked, nothing guarantees that ten rounds will be enough. Eventually some Company may turn up with eleven rounds or more, and then there is no room left.

It will also introduce a lot of NULLs (or, with the defaults in the question, 'NONE' placeholders) and a lot of duplicated code to validate all these fields, and eventually make the model harder to work with.

Usually modeling goes the other way around: linearizing, that is, splitting things into rows, not columns. Storing the rounds as columns makes summing up funding rounds, counting the funding rounds, and guaranteeing data integrity a lot harder.
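
With one row per round, those three tasks reduce to ordinary SQL features. The sketch below again uses the standard-library sqlite3 module with a hypothetical funding_round table and invented sample data; it is meant only to illustrate the idea, not to prescribe a schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE funding_round ("
    " company TEXT NOT NULL,"
    " round_number INTEGER NOT NULL,"
    " invest_amount INTEGER NOT NULL,"
    " UNIQUE (company, round_number))"  # at most one row per company and round
)
conn.executemany(
    "INSERT INTO funding_round VALUES (?, ?, ?)",
    [("Acme", 1, 2_000_000), ("Acme", 2, 15_000_000), ("Globex", 1, 500_000)],
)

# An eleventh (or any later) round is just one more row; the schema is unchanged.
conn.execute("INSERT INTO funding_round VALUES ('Acme', 11, 1000000)")

# Counting and summing rounds per company is a plain GROUP BY.
summary = conn.execute(
    "SELECT company, COUNT(*), SUM(invest_amount) "
    "FROM funding_round GROUP BY company ORDER BY company"
).fetchall()
print(summary)

# Integrity: the UNIQUE constraint rejects a second "round 1" for Acme.
try:
    conn.execute("INSERT INTO funding_round VALUES ('Acme', 1, 999)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)
```

In the wide layout, the same count and sum would each need an expression that names all ten round columns.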

In fact, the column-per-round modeling violates the first normal form of databases [wiki], which aims to linearize data. Another thing that should probably be fixed is using dedicated field types to store data, for example a DecimalField [Django-doc] for percentages.

huangapple
  • Posted on July 14, 2023, 05:51:59
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76683455.html