Attributing values to a correlation matrix in Python without iterating

Question

I have a table of loans with the features country and sector.

# LOAN   SECTOR    COUNTRY
Loan 1   food      germany
Loan 2   telecom   italy
Loan 3   auto      japan
Loan 4   food      japan
Loan 5   telecom   germany
Loan 6   auto      italy

I need to define a correlation matrix using the following rules:

  • if two loans share both sector and country, their correlation is 35%
  • if they share only the country, their correlation is 25%
  • if they share only the sector, their correlation is 25%
  • otherwise, their correlation is 5%

Expected result (lower triangle shown; the matrix is symmetric):

# LOAN   Loan 1   Loan 2   Loan 3   Loan 4   Loan 5   Loan 6
Loan 1   1        -        -        -        -        -
Loan 2   0.05     1        -        -        -        -
Loan 3   0.05     0.05     1        -        -        -
Loan 4   0.25     0.05     0.25     1        -        -
Loan 5   0.25     0.25     0.05     0.05     1        -
Loan 6   0.05     0.25     0.25     0.05     0.05     1
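
Read this way, the rules amount to counting how many of the two attributes (sector, country) a pair of loans shares and mapping that count to a correlation. A minimal sketch of that reading follows; the names `CORR_BY_SHARED` and `pair_corr` are hypothetical and not part of the original question.

```python
# Correlation keyed by the number of shared attributes (sector, country).
# The values come from the rules in the question; the names are illustrative.
CORR_BY_SHARED = {0: 0.05, 1: 0.25, 2: 0.35}


def pair_corr(loan_a, loan_b):
    """Correlation between two distinct loans given as (sector, country) tuples."""
    # Each comparison is a bool; adding them gives 0, 1 or 2 shared attributes.
    shared = (loan_a[0] == loan_b[0]) + (loan_a[1] == loan_b[1])
    return CORR_BY_SHARED[shared]


print(pair_corr(("food", "germany"), ("food", "japan")))     # same sector only: 0.25
print(pair_corr(("auto", "italy"), ("auto", "italy")))       # same sector and country: 0.35
print(pair_corr(("food", "germany"), ("telecom", "italy")))  # nothing shared: 0.05
```

The diagonal (a loan paired with itself) is not covered by this function and is set to 1 separately.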

I created it by iteration. Is there a smarter way to do it?
Thanks

import numpy as np

# df is the loan table above, with columns loan, sector and country
n = df.shape[0]
corr = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        if i == j:
            # a loan is perfectly correlated with itself
            corr[i, j] = 1
        elif (df.sector[i] == df.sector[j]) and (df.country[i] == df.country[j]):
            corr[i, j] = 0.35
        elif (df.sector[i] == df.sector[j]) or (df.country[i] == df.country[j]):
            corr[i, j] = 0.25
        else:
            corr[i, j] = 0.05

Answer 1

Score: 1

import numpy as np
import pandas as pd

s = """# LOAN	SECTOR	COUNTRY
Loan 1	food	germany
Loan 2	telecom	italy
Loan 3	auto	japan
Loan 4	food	japan
Loan 5	telecom	germany
Loan 6	auto	italy
Loan 7	auto	italy"""

rows = s.split('\n')
data = [r.split('\t') for r in rows[1:]]
header = rows[0].split('\t')
df = pd.DataFrame(data, columns=header)

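n = len(df)
cols = df['# LOAN']
# cmat1[i, j] = 1 when loans i and j share a sector; cmat2 does the same for country
cmat1 = pd.DataFrame(np.full((n, n), 0), columns=cols, index=cols)
cmat2 = pd.DataFrame(np.full((n, n), 0), columns=cols, index=cols)
for sec in df.SECTOR.unique():
    idx = df.SECTOR == sec  # boolean mask of the loans in this sector
    cmat1.iloc[idx, idx] = 1
for cou in df.COUNTRY.unique():
    idx = df.COUNTRY == cou  # boolean mask of the loans in this country
    cmat2.iloc[idx, idx] = 1

# The sum counts shared attributes per pair (0, 1 or 2); map each count to its correlation
cmat = cmat1 + cmat2
cmat = cmat.replace(0, 0.05)
cmat = cmat.replace(1, 0.25)
cmat = cmat.replace(2, 0.35)
# Set the diagonal to 1 in place (this relies on cmat.values being a writable view)
np.fill_diagonal(cmat.values, 1)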

print(cmat)

prints

# LOAN  Loan 1  Loan 2  Loan 3  Loan 4  Loan 5  Loan 6  Loan 7
# LOAN                                                        
Loan 1    1.00    0.05    0.05    0.25    0.25    0.05    0.05
Loan 2    0.05    1.00    0.05    0.05    0.25    0.25    0.25
Loan 3    0.05    0.05    1.00    0.25    0.05    0.25    0.25
Loan 4    0.25    0.05    0.25    1.00    0.05    0.05    0.05
Loan 5    0.25    0.25    0.05    0.05    1.00    0.05    0.05
Loan 6    0.05    0.25    0.25    0.05    0.05    1.00    0.35
Loan 7    0.05    0.25    0.25    0.05    0.05    0.35    1.00
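
The loops above run over unique sectors and countries rather than over pairs of loans. As a possible alternative, the sketch below removes the Python-level loops entirely by comparing the columns against themselves with NumPy broadcasting. It assumes lowercase column names `loan`, `sector` and `country` (as in the question's code) and rebuilds the seven-loan table inline; `levels`, `shared` and `corr_df` are illustrative names.

```python
import numpy as np
import pandas as pd

# Same seven loans as in the answer above; column names are an assumption.
df = pd.DataFrame({
    "loan": [f"Loan {i}" for i in range(1, 8)],
    "sector": ["food", "telecom", "auto", "food", "telecom", "auto", "auto"],
    "country": ["germany", "italy", "japan", "japan", "germany", "italy", "italy"],
})

sector = df["sector"].to_numpy()
country = df["country"].to_numpy()

# Pairwise equality via broadcasting: entry (i, j) is True when loans i and j match.
same_sector = sector[:, None] == sector[None, :]
same_country = country[:, None] == country[None, :]

# Number of shared attributes per pair: 0, 1 or 2.
shared = same_sector.astype(int) + same_country.astype(int)

# Use the count as an index into the correlation levels, then set the diagonal to 1.
levels = np.array([0.05, 0.25, 0.35])
corr = levels[shared]
np.fill_diagonal(corr, 1.0)

corr_df = pd.DataFrame(corr, index=df["loan"], columns=df["loan"])
print(corr_df)
```

The intermediate boolean arrays are n-by-n, the same size as the result, so this trades a little memory for avoiding explicit loops. If same-sector and same-country pairs ever needed different correlations, `np.select` on the two masks would be one way to express that.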

huangapple
  • Posted on June 6, 2023, 06:37:51
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76410411.html