Attributing values to a correlation matrix in Python without iterating
Question
import numpy as np

# df is assumed to be the loans table below, with 'sector' and 'country' columns
n = len(df)

# Define the correlation matrix
corr_matrix = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        if i == j:
            corr_matrix[i, j] = 1
        elif df['sector'][i] == df['sector'][j] and df['country'][i] == df['country'][j]:
            corr_matrix[i, j] = 0.35
        elif df['country'][i] == df['country'][j]:
            corr_matrix[i, j] = 0.25
        elif df['sector'][i] == df['sector'][j]:
            corr_matrix[i, j] = 0.25
        else:
            corr_matrix[i, j] = 0.05
The code above generates the correlation matrix with a nested loop.
I have a table of loans with the features country and sector.
# LOAN | SECTOR | COUNTRY |
---|---|---|
Loan 1 | food | germany |
Loan 2 | telecom | italy |
Loan 3 | auto | japan |
Loan 4 | food | japan |
Loan 5 | telecom | germany |
Loan 6 | auto | italy |
I need to define a correlation matrix using the following rules:
- if 2 loans have the same sector and country they are correlated by 35%
- if 2 loans have the same country they are correlated by 25%
- if 2 loans have the same sector they are correlated by 25%
- otherwise they are correlated by 5%
# LOAN | Loan 1 | Loan 2 | Loan 3 | Loan 4 | Loan 5 | Loan 6 |
---|---|---|---|---|---|---|
Loan 1 | 1 | - | - | - | - | - |
Loan 2 | 0.05 | 1 | - | - | - | - |
Loan 3 | 0.05 | 0.05 | 1 | - | - | - |
Loan 4 | 0.25 | 0.05 | 0.25 | 1 | - | - |
Loan 5 | 0.25 | 0.25 | 0.05 | 0.05 | 1 | - |
Loan 6 | 0.05 | 0.25 | 0.25 | 0.05 | 0.05 | 1 |
I created it by iteration. Is there a smarter way to do it?
Thanks
n = df.shape[0]
corr = np.zeros((n, n))  # size the matrix from the data instead of hard-coding it
# note: the values below differ from the 0.35 / 0.25 / 0.05 stated in the rules above
for i in range(n):
    for j in range(n):
        if df.loan[i] == df.loan[j]:
            corr[i, j] = 1
        elif (df.sector[i] == df.sector[j]) and (df.country[i] == df.country[j]):
            corr[i, j] = 0.6
        elif (df.sector[i] == df.sector[j]) or (df.country[i] == df.country[j]):
            corr[i, j] = 0.4
        else:
            corr[i, j] = 0.2
Answer 1
Score: 1
import numpy as np
import pandas as pd

# Tab-separated sample data; \t escapes keep the separators explicit
s = """# LOAN\tSECTOR\tCOUNTRY
Loan 1\tfood\tgermany
Loan 2\ttelecom\titaly
Loan 3\tauto\tjapan
Loan 4\tfood\tjapan
Loan 5\ttelecom\tgermany
Loan 6\tauto\titaly
Loan 7\tauto\titaly"""

rows = s.split('\n')
header = rows[0].split('\t')
data = [r.split('\t') for r in rows[1:]]
df = pd.DataFrame(data, columns=header)

n = len(df)
cols = df['# LOAN']
cmat1 = pd.DataFrame(np.zeros((n, n)), columns=cols, index=cols)  # 1 where sectors match
cmat2 = pd.DataFrame(np.zeros((n, n)), columns=cols, index=cols)  # 1 where countries match

for sec in df.SECTOR.unique():
    idx = (df.SECTOR == sec).to_numpy()  # plain boolean array for positional indexing
    cmat1.iloc[idx, idx] = 1

for cou in df.COUNTRY.unique():
    idx = (df.COUNTRY == cou).to_numpy()
    cmat2.iloc[idx, idx] = 1

# The sum counts shared attributes per pair: 0, 1 or 2
cmat = cmat1 + cmat2
cmat = cmat.replace(0, 0.05)
cmat = cmat.replace(1, 0.25)
cmat = cmat.replace(2, 0.35)

# Set the diagonal to 1 without writing into cmat.values in place
cmat = cmat.mask(np.eye(n, dtype=bool), 1.0)
print(cmat)
prints
# LOAN  Loan 1  Loan 2  Loan 3  Loan 4  Loan 5  Loan 6  Loan 7
# LOAN
Loan 1    1.00    0.05    0.05    0.25    0.25    0.05    0.05
Loan 2    0.05    1.00    0.05    0.05    0.25    0.25    0.25
Loan 3    0.05    0.05    1.00    0.25    0.05    0.25    0.25
Loan 4    0.25    0.05    0.25    1.00    0.05    0.05    0.05
Loan 5    0.25    0.25    0.05    0.05    1.00    0.05    0.05
Loan 6    0.05    0.25    0.25    0.05    0.05    1.00    0.35
Loan 7    0.05    0.25    0.25    0.05    0.05    0.35    1.00
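If you want to drop the Python-level loops entirely, the same counting idea can probably be expressed with NumPy broadcasting. What follows is only a sketch under some assumptions: the column names `loan`, `sector` and `country` follow the question's loop code, and `build_corr` is a hypothetical helper name, not something from pandas or NumPy.

```python
import numpy as np
import pandas as pd

# Sample data mirroring the question's table (column names are assumptions).
df = pd.DataFrame({
    "loan": [f"Loan {i}" for i in range(1, 7)],
    "sector": ["food", "telecom", "auto", "food", "telecom", "auto"],
    "country": ["germany", "italy", "japan", "japan", "germany", "italy"],
})


def build_corr(df, same_both=0.35, same_one=0.25, neither=0.05):
    """Hypothetical helper: pairwise correlation matrix from shared attributes."""
    sector = df["sector"].to_numpy()
    country = df["country"].to_numpy()

    # Compare every loan with every other loan: (n, 1) against (1, n) gives (n, n).
    same_sector = sector[:, None] == sector[None, :]
    same_country = country[:, None] == country[None, :]

    # Number of shared attributes per pair: 0, 1 or 2.
    shared = same_sector.astype(int) + same_country.astype(int)

    # Use the count as an index into a small lookup table of correlations.
    corr = np.array([neither, same_one, same_both])[shared]
    np.fill_diagonal(corr, 1.0)  # corr is a fresh array, so writing in place is safe

    return pd.DataFrame(corr, index=df["loan"], columns=df["loan"])


corr = build_corr(df)
print(corr.round(2))
```

This builds two n x n boolean matrices, so memory grows quadratically with the number of loans. That is fine for a few thousand rows but worth keeping in mind for larger tables.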