How can I get Streamlit to display years in data frames without a comma?

Question

I am building a Streamlit app as a final project for school. It shows two raw data frames and two graphs. However, when I display the data frames in the app, the Year columns come out with commas, e.g. 1,993 instead of 1993.

So far, I've tried saving the cleaned data with the Year columns set as int and also as object, and neither worked. I've also tried saving the cleaned data as a .csv instead of a .xlsx and loading that into my Streamlit code, in case something in the Excel format was causing the commas; this didn't work either. I expected the data frames to show years in YYYY format, but I got Y,YYY instead. In the end, I used matplotlib for the graphs since it doesn't add unnecessary commas.

This is what my Streamlit code looks like:

    import pandas as pd
    import matplotlib.pyplot as plt
    import streamlit as st

    st.title('Global Biodiversity Decline')
    st.write(' ')
    st.write(' ')
    st.write(' ')

    live = pd.read_excel('living-planet-spread.xlsx')
    live = live.drop(axis=1, columns='Unnamed: 0')
    live['Year'] = live['Year'].astype('object')
    live2 = pd.pivot_table(live, index='Year', columns='Region', values='Average Index', fill_value=0)

    st.subheader('Decline of Average Index by Year')
    if st.checkbox('Show Raw Biodiversity Data'):
        st.subheader('Raw Data')
        st.write(live2)
        st.caption("Data Source: World Wildlife Fund (WWF) and Zoological Society of London")

    chart = pd.DataFrame(live2, columns=['Africa', 'Asia and Pacific', 'Europe and Central Asia', 'Latin America and the Carribean', 'North America', 'World'])
    fig, ax = plt.subplots(figsize=(12, 6))
    ax.plot(chart)
    ax.set(xlabel='Year', ylabel='Index (%)')
    ax.legend(['Africa', 'Asia', 'Europe', 'South America', 'North America'])
    st.pyplot(fig)
    st.caption('Above is a graph plotting the average index of biodiversity per region. Note that all regions are on a steady decline, particularly Latin America which has a sharper decline than all other regions. One possible cause of this could be deforestation related to farming. See the below graph.')
    st.write(' ')
    st.write(' ')
    st.write(' ')

    # I had to set the index as 'Year' in order for the x-axis of this graph to show up as the Years instead of a numbered index
    land = pd.read_excel('fao_land_data_spread.xlsx')
    land = land.set_index('Year')

    st.subheader('Regional Increase in Land Use for Farming by Year')
    if st.checkbox('Show Raw Land Area Data'):
        st.subheader('Raw Data')
        st.write(land)
        st.caption('Data Source: UNData')

    chart2 = pd.DataFrame(land, columns=['Africa', 'Asia', 'Europe', 'South America', 'North America'])
    chart3 = pd.DataFrame(land, columns=['World'])
    fig, ax = plt.subplots(figsize=(12, 6))
    ax.plot(chart2)
    ax.set(xlabel='Year', ylabel='Area (1000 Ha)e+06')
    ax.legend(['Africa', 'Asia', 'Europe', 'South America', 'North America'])
    st.pyplot(fig)
    st.caption('Above is a graph plotting the area of farmland used per region...')
    st.write(' ')
    st.write(' ')
    st.write(' ')

    st.subheader('Global Increase in Land Use for Farming by Year')
    fig, ax = plt.subplots(figsize=(12, 6))
    ax.plot(chart3)
    ax.set(xlabel='Year', ylabel='Area (1000 Ha)e+06')
    st.pyplot(fig)
    st.caption('I put the Global area of farmland in its own graph...')

And this is a sample of what each data frame looks like:

          Africa         Asia           Europe         North America  South America  World
    Year
    1961  927526.222222  911930.555556  825966.444444  586216.444444  502466.333333  4.146173e+06
    1962  927657.000000  913559.333333  826292.888889  585067.666667  503954.444444  4.149369e+06
    1963  928080.888889  914962.222222  825754.111111  584786.000000  505403.444444  4.152637e+06
    1964  928313.333333  916675.333333  825170.777778  584079.000000  506533.333333  4.155457e+06
    1965  928717.111111  918125.555556  825569.555556  583276.444444  507664.888889  4.159057e+06

        Region  Year  Average Index  Upper Index  Lower Index
    44  Africa  2014  32.492869      68.628636    15.238575
    45  Africa  2015  31.293573      66.256152    14.669147
    46  Africa  2016  32.054221      68.026893    14.968882
    47  Africa  2017  34.445875      73.433580    15.991854
    48  Africa  2018  34.445875      73.433580    15.991854
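
One possible reason the dtype change made no difference: casting an integer column to `object` only changes the pandas dtype label, while each value stays a Python `int`. If the commas are added by whatever renders the table (an assumption, not something established in this thread), that layer would still receive numbers. A minimal check with made-up values standing in for the Year column:

```python
import pandas as pd

# Hypothetical stand-in for the Year column read from the spreadsheet.
years = pd.Series([2014, 2015, 2016], name="Year")

# Same cast as in the app code above.
as_object = years.astype("object")

# The dtype label changes to object ...
print(as_object.dtype)

# ... but the elements themselves are still integers, not text.
element_types = {type(value).__name__ for value in as_object}
print(element_types)
```

If the set of element types contains only `int`, the column is still numeric as far as any downstream formatter is concerned.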

Answer 1

Score: 0


From your description and code snippet, it seems the commas are caused by the Year column being read in from Excel as a numeric type. The comma appears to be introduced when the numeric type is converted to an object type, which seems to be the default behavior of the pandas Excel reader.

You can try reading Year as a string and then converting it back to a numeric or integer type, like this:

    import pandas as pd

    # Read the "Year" column as text
    live = pd.read_excel('living-planet-spread.xlsx', dtype={'Year': str})
    # Convert the "Year" column to a numeric type
    live['Year'] = pd.to_numeric(live['Year'])
    # Convert the "Year" column to an integer
    live['Year'] = live['Year'].astype(int)
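
If the commas turn out to be added when the table is rendered rather than when the file is read, another option is to hand Streamlit text instead of numbers for the year. The sketch below is an illustration under that assumption and is not part of the answer above: `years_as_text` is a hypothetical helper, and the sample values are copied from the land-use table in the question. It converts a `Year` index or column to strings, so nothing numeric is left for a thousands separator to apply to.

```python
import pandas as pd


def years_as_text(df: pd.DataFrame, name: str = "Year") -> pd.DataFrame:
    """Return a copy of df whose Year index or column holds strings.

    Hypothetical helper for display only; the input frame is not modified.
    """
    out = df.copy()
    if out.index.name == name:
        # Year is the index (as after set_index or pivot_table).
        out.index = out.index.astype(int).astype(str)
    elif name in out.columns:
        # Year is an ordinary column.
        out[name] = out[name].astype(int).astype(str)
    return out


# Two rows taken from the question's land-use sample.
land = pd.DataFrame(
    {
        "Africa": [927526.222222, 927657.000000],
        "Asia": [911930.555556, 913559.333333],
    },
    index=pd.Index([1961, 1962], name="Year"),
)

display_land = years_as_text(land)
print(display_land.index.tolist())
```

In the app, the converted copy would be passed to `st.write` in place of `land` or `live2`, while the original numeric frame is kept for the matplotlib plots so the x-axis stays numeric.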

huangapple
  • Posted on March 23, 2023 at 10:53:51
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75818899.html