Map Latitude and Longitude for each data point
Question
I have a lot of timeseries data for 30 stations and I need to map the latitude and longitude to each data point so I can plot it on a map. The latitude and longitude data is in another .csv that needs to be mapped over based on the station ID. I know I can write a very long for loop, but I was hoping to find a more efficient way to look it up in the other .csv and fill it in for each data point.
Here is a sample of the data with just 3 stations and a few hours of readings:
date precip_mm temp_C ID
11/2/2006 11:00 47.7 3.8 301
11/2/2006 12:00 47.7 4.4 301
11/2/2006 13:00 48.9 4.4 301
11/2/2006 14:00 50.2 3.8 301
11/2/2006 15:00 50.2 2.9 301
11/2/2006 16:00 50.2 2.7 301
11/2/2006 17:00 50.2 2.8 301
11/2/2006 18:00 50.2 2.6 301
11/2/2006 19:00 50.2 2.5 301
11/2/2006 11:00 50.2 2.2 321
11/2/2006 12:00 50.2 1.7 321
11/2/2006 13:00 50.2 1.5 321
11/2/2006 14:00 50.2 1.2 321
11/2/2006 15:00 50.2 1.1 321
11/2/2006 16:00 50.2 1.3 321
11/2/2006 17:00 50.2 1.3 321
11/2/2006 18:00 50.2 1.3 321
11/2/2006 19:00 50.2 1.2 321
11/2/2006 11:00 50.2 1.4 391
11/2/2006 12:00 50.2 1.2 391
11/2/2006 13:00 50.2 1.1 391
11/2/2006 14:00 50.2 1.1 391
11/2/2006 15:00 50.2 1.6 391
11/2/2006 16:00 50.2 2.1 391
11/2/2006 17:00 50.2 2.7 391
11/2/2006 18:00 51.4 2.1 391
11/2/2006 19:00 51.4 0.9 391
Here is the metadata:
Name ID Latitude Longitude
Adin Mtn "301 " 41.23583 -120.79192
Bear Creek "321 " 41.83384 -115.45278
Cedar Pass "391 " 41.58233 -120.3025
I have tried to map with the following code. All of the forums I have seen are matching one item to one item with melt or merge, but I need to fill in the lat and long for many years of data. When I run this line of code, the column is filled with NaN and does not give me an error. Help please!
df['latitude'] = df['ID'].map(metadata.set_index('ID')['Latitude'])
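For reference, a quick way to see why a lookup like this returns only NaN is to compare the dtypes of the two ID columns. This is just a diagnostic sketch, assuming both files have already been read with pd.read_csv; the file names below are placeholders:

import pandas as pd

# Placeholder file names; substitute the actual paths to the two .csv files.
df = pd.read_csv('timeseries.csv')
metadata = pd.read_csv('metadata.csv')

# If these differ (e.g. int64 for the timeseries IDs vs. object for the
# metadata IDs), .map() finds no matching keys and fills every row with NaN.
print(df['ID'].dtype)
print(metadata['ID'].dtype)

# Looking at a few raw values can also reveal extra whitespace around the
# metadata IDs (the quoted values like "301 " above hint at this).
print(metadata['ID'].head().tolist())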
Answer 1
Score: 0
Most likely you are getting NaNs because ID is stored as a string in your metadata but as an integer in your timeseries data. You need to convert that column to a proper numeric column for the merge to match properly.
# Convert the metadata IDs to integers so they match the timeseries IDs,
# then merge and assign the result back (merge does not modify df in place).
meta['ID'] = meta['ID'].astype('int')
df = df.merge(meta, on='ID')
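A minimal, self-contained sketch of that fix, using a few of the sample rows from the question as stand-ins for the two .csv files; the str.strip() is an extra step (not strictly part of the answer above) to drop the trailing space visible in the quoted metadata IDs:

import pandas as pd

# Stand-ins for the two .csv files, built from a few of the sample rows above.
df = pd.DataFrame({
    'date': ['11/2/2006 11:00', '11/2/2006 11:00', '11/2/2006 11:00'],
    'precip_mm': [47.7, 50.2, 50.2],
    'temp_C': [3.8, 2.2, 1.4],
    'ID': [301, 321, 391],           # integers in the timeseries data
})
meta = pd.DataFrame({
    'Name': ['Adin Mtn', 'Bear Creek', 'Cedar Pass'],
    'ID': ['301 ', '321 ', '391 '],  # strings (with trailing spaces) in the metadata
    'Latitude': [41.23583, 41.83384, 41.58233],
    'Longitude': [-120.79192, -115.45278, -120.3025],
})

# Strip the whitespace and convert to int so both ID columns have the same
# type, then merge; every row picks up its station's coordinates.
meta['ID'] = meta['ID'].str.strip().astype(int)
df = df.merge(meta[['ID', 'Latitude', 'Longitude']], on='ID', how='left')
print(df)

# The .map() lookup from the question also works once the dtypes agree:
# df['latitude'] = df['ID'].map(meta.set_index('ID')['Latitude'])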