I'm trying to use this create_df() function in Streamlit to take a list of user-provided URLs called "recipes", loop through each URL, and return a df I've labeled "res" towards the end of the function. I've tried several approaches with the Streamlit syntax, but I cannot get it to work; I keep getting this error message:

recipe_scrapers._exceptions.WebsiteNotImplementedError: recipe-scrapers exception: Website (h) not supported.

Have a look at my entire repo here. The script works fine locally once all requirements are installed, but when I run the same script with the Streamlit syntax added I get the above error. Once you run streamlit run in your terminal and look at the UI I've created, it should be quite clear what I'm aiming for: providing the user with a CSV of all ingredients in the recipe URLs they provided, as a convenient grocery shopping list.

Any help would be greatly appreciated!


    import numpy as np
    import pandas as pd
    from recipe_scrapers import scrape_me

    # replace_measurement_symbols() and justify() are helpers defined elsewhere in the script


    def create_df(recipes):
        """
        Description:
            Creates one df with all recipes and their ingredients
        Arguments:
            * recipes: list of recipe URLs provided by user

        Note that ingredients with qualitative amounts e.g., "scheutje melk", "snufje zout" have been omitted from the ingredient list
        """
        df_list = []
        for recipe in recipes:
            scraper = scrape_me(recipe)
            recipe_details = replace_measurement_symbols(scraper.ingredients())
            recipe_name = recipe.split("", 1)[1]
            recipe_name = recipe_name.rsplit('-', 1)[0]
            print("Processing data for " + recipe_name + " recipe.")
            for ingredient in recipe_details:
                try:
                    df_temp = pd.DataFrame(columns=['Ingredients', 'Measurement'])
                    df_temp[str(recipe_name)] = recipe_name
                    ing_1 = ingredient.split("2 * ", 1)[1]
                    ing_1 = ing_1.split(" ", 2)
                    item = ing_1[2]
                    measurement = ing_1[1]
                    quantity = float(ing_1[0]) * 2
                    df_temp.loc[len(df_temp)] = [item, measurement, quantity]
                    df_list.append(df_temp)
                except (ValueError, IndexError):
                    # Skip ingredients that do not match the expected "quantity measurement item" pattern
                    pass
        df = pd.concat(df_list)

        print("Renaming duplicate ingredients e.g., Kruimige aardappelen, Voorgekookte halve kriel met schil -> Aardappelen")
        # Trailing commas make the single-item values real tuples; ('Rode ui') alone is just a string
        ingredient_dict = {
            'Aardappelen': ('Dunne frieten', 'Half kruimige aardappelen', 'Voorgekookte halve kriel met schil',
                            'Kruimige aardappelen', 'Roodschillige aardappelen', 'Opperdoezer Ronde aardappelen'),
            'Ui': ('Rode ui',),
            'Kipfilet': ('Kipfilet met tuinkruiden en knoflook',),
            'Kipworst': ('Gekruide kipworst',),
            'Kipgehakt': ('Gemengd gekruid gehakt', 'Kipgehakt met Mexicaanse kruiden', 'Half-om-halfgehakt met Italiaanse kruiden',
                          'Kipgehakt met tuinkruiden'),
            'Kipshoarma': ('Kalkoenshoarma',)
        }
        reverse_label_ing = {x: k for k, v in ingredient_dict.items() for x in v}
        df["Ingredients"] = df["Ingredients"].replace(reverse_label_ing)

        print("Assigning ingredient categories")
        category_dict = {
            'brood': ('Biologisch wit rozenbroodje', 'Bladerdeeg', 'Briochebroodje', 'Wit platbrood'),
            'granen': ('Basmatirijst', 'Bulgur', 'Casarecce', 'Cashewstukjes',
                       'Gesneden snijbonen', 'Jasmijnrijst', 'Linzen', 'Maïs in blik',
                       'Parelcouscous', 'Penne', 'Rigatoni', 'Rode kidneybonen',
                       'Spaghetti', 'Witte tortilla'),
            'groenten': ('Aardappelen', 'Aubergine', 'Bosui', 'Broccoli',
                         'Champignons', 'Citroen', 'Gele wortel', 'Gesneden rodekool',
                         'Groene paprika', 'Groentemix van paprika, prei, gele wortel en courgette',
                         'IJsbergsla', 'Kumato tomaat', 'Limoen', 'Little gem',
                         'Paprika', 'Portobello', 'Prei', 'Pruimtomaat',
                         'Radicchio en ijsbergsla', 'Rode cherrytomaten', 'Rode paprika', 'Rode peper',
                         'Rode puntpaprika', 'Rode ui', 'Rucola', 'Rucola en veldsla', 'Rucolamelange',
                         'Semi-gedroogde tomatenmix', 'Sjalot', 'Sperziebonen', 'Spinazie', 'Tomaat',
                         'Turkse groene peper', 'Veldsla', 'Vers basilicum', 'Verse bieslook',
                         'Verse bladpeterselie', 'Verse koriander', 'Verse krulpeterselie', 'Wortel', 'Zoete aardappel'),
            'kruiden': ('Aïoli', 'Bloem', 'Bruine suiker', 'Cranberrychutney', 'Extra vierge olijfolie',
                        'Extra vierge olijfolie met truffelaroma', 'Fles olijfolie', 'Gedroogde laos',
                        'Gedroogde oregano', 'Gemalen kaneel', 'Gemalen komijnzaad', 'Gemalen korianderzaad',
                        'Gemalen kurkuma', 'Gerookt paprikapoeder', 'Groene currykruiden', 'Groentebouillon',
                        'Groentebouillonblokje', 'Honing', 'Italiaanse kruiden', 'Kippenbouillonblokje', 'Knoflookteen',
                        'Kokosmelk', 'Koreaanse kruidenmix', 'Mayonaise', 'Mexicaanse kruiden', 'Midden-Oosterse kruidenmix',
                        'Mosterd', 'Nootmuskaat', 'Olijfolie', 'Panko paneermeel', 'Paprikapoeder', 'Passata',
                        'Pikante uienchutney', 'Runderbouillonblokje', 'Sambal', 'Sesamzaad', 'Siciliaanse kruidenmix',
                        'Sojasaus', 'Suiker', 'Sumak', 'Surinaamse kruiden', 'Tomatenblokjes', 'Tomatenblokjes met ui',
                        'Truffeltapenade', 'Ui', 'Verse gember', 'Visbouillon', 'Witte balsamicoazijn', 'Wittewijnazijn',
                        'Zonnebloemolie', 'Zwarte balsamicoazijn'),
            'vlees': ('Gekruide runderburger', 'Half-om-half gehaktballetjes met Spaanse kruiden', 'Kipfilethaasjes', 'Kipfiletstukjes',
                      'Kipgehaktballetjes met Italiaanse kruiden', 'Kippendijreepjes', 'Kipshoarma', 'Kipworst', 'Spekblokjes',
                      'Vegetarische döner kebab', 'Vegetarische kaasschnitzel', 'Vegetarische schnitzel'),
            'zuivel': ('Ei', 'Geraspte belegen kaas', 'Geraspte cheddar', 'Geraspte grana padano', 'Geraspte oude kaas',
                       'Geraspte pecorino', 'Karnemelk', 'Kruidenroomkaas', 'Labne', 'Melk', 'Mozzarella',
                       'Parmigiano reggiano', 'Roomboter', 'Slagroom', 'Volle yoghurt')
        }
        reverse_label_cat = {x: k for k, v in category_dict.items() for x in v}
        df["Category"] = df["Ingredients"].map(reverse_label_cat)
        # Move the Category column to the front
        col = "Category"
        first_col = df.pop(col)
        df.insert(0, col, first_col)
        df = df.sort_values(['Category', 'Ingredients'], ascending=[True, True])

        print("Merging ingredients by row across all recipe columns using justify()")
        gp_cols = ['Ingredients', 'Measurement']
        oth_cols = df.columns.difference(gp_cols)
        arr = np.vstack(
            df.groupby(gp_cols, sort=False, dropna=False)
              .apply(lambda gp: justify(gp.to_numpy(), invalid_val=np.nan, axis=0, side='up'))
        )
        # Reconstruct DataFrame
        # Remove entirely NaN rows based on the non-grouping columns
        res = (pd.DataFrame(arr, columns=df.columns)
               .dropna(how='all', subset=oth_cols, axis=0))
        res = res.fillna(0)
        res['Total'] = res.drop(['Ingredients', 'Measurement'], axis=1).sum(axis=1)
        res = res[res['Total'] != 0]  # To drop rows that are being duplicated with 0 for some reason; will check later
        print("Processing complete!")
        return res


Score: 1

Your function create_df needs a list as an argument, but st.text_input always returns a string.
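
This also explains the "(h)" in the error message: looping over a string visits one character at a time, so the first value handed to scrape_me is the "h" of "https". A minimal sketch of that behaviour, using a made-up example.com URL as a stand-in for whatever the text box returns:

```python
# Hypothetical value, standing in for what st.text_input returns.
recs = "https://example.com/recipes/pasta-123"

# Looping over a string yields single characters, not URLs.
first_as_string = [recipe for recipe in recs][0]

# Wrapping the string in a list yields the whole URL, as create_df expects.
first_as_list = [recipe for recipe in [recs]][0]

print(repr(first_as_string))
print(repr(first_as_list))
```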

In your script, replace df_download = create_df(recs) with df_download = create_df([recs]). However, if you need to handle multiple URLs, you should use str.split like this:

    def create_df(recipes):
        recipes = recipes.split(",")  # <--- add this line to create a list from the user input
        ### rest of the code ###

    if download:
        df_download = create_df(recs)
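
If users might type spaces after the commas or leave a trailing comma, a slightly more defensive split may help, since a plain split(",") would pass " https://..." or an empty string on to the scraper. A sketch under that assumption; parse_urls is a hypothetical helper name, not something from the repo, and the URLs are placeholders:

```python
def parse_urls(raw: str) -> list:
    """Split comma-separated user input into a clean list of URLs."""
    # Strip whitespace around each piece and drop pieces that end up empty.
    return [part.strip() for part in raw.split(",") if part.strip()]


# Placeholder input with stray spaces and a trailing comma.
urls = parse_urls(" https://example.com/a-1, https://example.com/b-2 ,")
print(urls)
```

Calling create_df(parse_urls(recs)) would then hand the function a list no matter how many URLs were entered, so the split inside create_df would no longer be needed.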

