Week 5.2: Developing "Recipes" and "Ingredients" Datasets

Friday, March 17, 2023






1. Approach
I looked online for "recipes" and "ingredients" datasets that reflect what is actually available in Indonesia, but did not find a suitable one. Hence, I needed to create my own "recipes" and "ingredients" datasets for the app.

At first, the idea was to create the datasets manually. However, I found that web scraping would be a more efficient way to collect the data.


2. Web Scraping with Beautiful Soup
I followed this tutorial to start web scraping with Beautiful Soup:
https://www.html.am/html-codes/color/html-black-code.cfm

The process was pretty straightforward: I scraped the targeted data based on the class names of the HTML elements that contain it. Because I scraped from different websites, I needed to adjust the parameters of my code to match the class names used by each specific website.
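
To illustrate the idea of adjusting the class names per website, here is a minimal sketch. It is not my actual script (that one is linked below); the site keys, the class names, and the sample HTML are all made-up placeholders, and the helper `scrape_recipe` is a hypothetical name used only for this example. It assumes the `beautifulsoup4` package is installed.

```python
from bs4 import BeautifulSoup

# Hypothetical per-site settings. The class names are placeholders and
# would need to be replaced with the ones each real website uses.
SITE_CONFIGS = {
    "site_a": {"title_class": "recipe-title", "ingredient_class": "ingredient"},
    "site_b": {"title_class": "post-heading", "ingredient_class": "bahan-item"},
}


def scrape_recipe(html, config):
    """Extract a recipe title and ingredient list using the given class names."""
    soup = BeautifulSoup(html, "html.parser")
    title_tag = soup.find(class_=config["title_class"])
    ingredient_tags = soup.find_all(class_=config["ingredient_class"])
    return {
        # find() returns None when nothing matches, so guard against that.
        "title": title_tag.get_text(strip=True) if title_tag else None,
        "ingredients": [tag.get_text(strip=True) for tag in ingredient_tags],
    }


# Inline sample page so the sketch runs without any network access.
SAMPLE_HTML = """
<h1 class="recipe-title">Nasi Goreng</h1>
<ul>
  <li class="ingredient">rice</li>
  <li class="ingredient">shallots</li>
</ul>
"""

recipe = scrape_recipe(SAMPLE_HTML, SITE_CONFIGS["site_a"])
print(recipe)
```

Keeping the class names in a dictionary like this means that switching to another website only requires a new config entry, not a change to the scraping function. For a live page, the HTML string would come from an HTTP request instead of the inline sample.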

I then converted the scraped data to a JSON file so that it could be used in my app.
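
A rough sketch of the JSON conversion step, using only Python's built-in `json` module. The record fields, the file name `recipes_example.json`, and the helper names `save_as_json` and `load_json` are assumptions made for illustration; the real structure is in the dataset files linked below.

```python
import json
from pathlib import Path


def save_as_json(records, path):
    """Write a list of dictionaries to a JSON file."""
    # ensure_ascii=False keeps any non-ASCII characters readable in the file.
    text = json.dumps(records, indent=2, ensure_ascii=False)
    Path(path).write_text(text, encoding="utf-8")


def load_json(path):
    """Read the JSON file back, e.g. to check it before using it in the app."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


# Placeholder record standing in for scraped data.
recipes = [
    {"title": "Nasi Goreng", "ingredients": ["rice", "shallots"]},
]

save_as_json(recipes, "recipes_example.json")
print(load_json("recipes_example.json"))
```

Because the output is a plain text file, entries that could not be scraped can simply be added to it by hand afterwards.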

Link to my source code (Beautiful Soup web scraping):
https://github.com/hilaryoung/HNYGraduationProject/blob/main/CustomDatasets/HNY-webScraping-Recipes.py



Through a combination of web scraping and manually entering data into the JSON files, I managed to gather 30 spices and 60 recipes.

Link to "ingredients" dataset:
https://github.com/hilaryoung/HNYGraduationProject/blob/main/CustomDatasets/ingredients.json
Link to "recipes" dataset:
https://github.com/hilaryoung/HNYGraduationProject/blob/main/CustomDatasets/recipes.json




© HNY Process Log. Design by HNY.