Hi @techhat, following up on our discussion in #5, I've given the format some more thought and experimented with different structures. I think I've arrived at one that might be worth considering. Here I'll share some ideas using a concrete example, a recipe taken from this website.
An example YAML file that encodes this recipe, with some modifications from the original ORF:
# Name of the dish
dish_name: MUSTARD GREEN PORK RICE
# Name of the dish's originator
creator: Cynthia Lim
# How many persons this recipe can feed
serves:
  min: 6
  max: 8 # If there is no range in the number of persons, just fill in the max field
# Information about the recipe, such as its uses, occasions where it is served, and any historical notes
info:
  - It is my all-time favourite comfort food
# The most specific date the recipe was created, if known, else leave blank
creation_date:
# Category of the dish's cuisine, and associated cultural and geographical information
category:
  race: CHINESE
  ethnic_group:
  geography: SINGAPORE
# Information source where this recipe was obtained from, be it a website or a cookbook
source_info:
  type: website
  # Either website or book name is provided, one is filled, the other is left empty
  url: http://mysingaporefood.com/recipe/mustard-green-pork-rice/
  book_name:
# Ingredients used in the recipe
ingredients:
  RICE:
    amount: 2
    unit: CUP
    notes:
      - Uncooked raw rice
  MUSTARD GREENS:
    amount: 2
    unit: BUNCH
    notes:
      - Washed and cut into bite-size
  GARLIC:
    amount: 8
    unit: CLOVE
    notes:
      - Mince the garlic
  GINGER:
    amount: 0.25
    unit: PIECE
    notes:
      - Sliced thinly
  DRIED SHRIMPS:
    amount: 30
    unit: GRAM
    notes:
      - Soaked in hot water, then drained
  PORK BELLY:
    amount: 1
    unit: SLAB
    notes:
      - Marinated with 2 tablespoon of soya sauce and 1 teaspoon sesame oil
      - Sliced into bite-size
  DRIED MUSHROOMS:
    amount: 30
    unit: GRAM
    notes:
      - Soaked in hot water to soften, then drained
  SOYA SAUCE LIGHT:
    amount: 3
    unit: TABLESPOON
    notes:
      - 2 tbsp used for seasoning pork belly, 1 tbsp used to drizzle on top of cooked rice
  SESAME OIL:
    amount: 2
    unit: TEASPOON
    notes:
      - 1 tsp used for sauteing, 1 tsp used to drizzle on top of cooked rice
# Steps to cook the dish
steps:
  1: Wash the rice twice and drain well.
  2: Heat up wok. Add sesame oil.
  3: Saute ginger till fragrant.
  4: Add garlic and saute.
  5: Add dried shrimps and saute.
  6: Add mushrooms and saute till fragrant.
  7: Add pork belly and stir-fry till pork belly is half cooked.
  8: Add in rice.
  9: Add in soya sauce and stir well.
  10: Add in mustard greens. Stir well.
  11: Scoop mixture into rice cooker and cover with sufficient water.
  12: Cook as per instructions on rice cooker.
  13: Ready and serve.
Modifications and their rationale:
Make the ingredients in the recipe as accessible at the top level as possible. To this end, I no longer mark each ingredient name with a dash; the names are now keys nested directly under the ingredients key. Each name holds that ingredient's details, such as amount and notes, nested beneath it. This makes it possible to get a view of the ingredient names immediately after parsing the YAML file for the flavor formula, just by calling the keys() method on the ingredients dictionary indexed out of the main data structure. Concretely, the code is simply:
In [8]: import yaml

In [9]: with open("./mustard_green_pork_rice.yaml", "r") as f:
   ...:     fn = yaml.safe_load(f)
   ...:

In [10]: fn["ingredients"].keys()
Out[10]: dict_keys(['DRIED MUSHROOMS', 'DRIED SHRIMPS', 'MUSTARD GREENS', 'SOYA SAUCE LIGHT', 'PORK BELLY', 'RICE', 'SESAME OIL', 'GINGER', 'GARLIC'])
This greatly simplifies accessing the list of ingredients, an issue I raised earlier in #5.
Eliminate unnecessary sequence generation; focus on dictionaries. The original format always parsed into a tangle of lists of dictionaries in Python, because of its extensive use of dashes. That is not an elegant representation. The guiding principle of this modification is to prioritize dictionaries over lists: represent the layers as dictionary keys nested within dictionary keys wherever possible, and use a list only at the bottom-most level of a chain of keys.
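A minimal sketch of the difference in access patterns, using hand-written Python literals in place of parsed YAML. The list-of-dictionaries shape in `as_list` is only an assumption about what a dash-based layout parses to, not a confirmed description of ORF, and all variable names are hypothetical.

```python
# Assumed parsed form of a dash-based layout: a list of one-key dicts.
as_list = [
    {"RICE": {"amount": 2, "unit": "CUP"}},
    {"GARLIC": {"amount": 8, "unit": "CLOVE"}},
]

# Parsed form of the nested-mapping layout proposed in this post.
as_dict = {
    "RICE": {"amount": 2, "unit": "CUP"},
    "GARLIC": {"amount": 8, "unit": "CLOVE"},
}

# List of dicts: every lookup scans the list and unwraps each entry.
names_from_list = [name for entry in as_list for name in entry]
garlic_from_list = next(entry["GARLIC"] for entry in as_list if "GARLIC" in entry)

# Dict of dicts: names and details are a single indexing step away.
names_from_dict = list(as_dict)
garlic_from_dict = as_dict["GARLIC"]

print(names_from_list, names_from_dict)
print(garlic_from_list == garlic_from_dict)
```

Both shapes carry the same information; the difference is only in how many steps it takes to reach it.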
Add new fields for recipe origin. Some research projects may study categorical or cultural aspects of recipes, so I thought that including fields for such information would be useful from the perspective of provenance.
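As a rough illustration of how such fields might be used, the sketch below groups parsed recipes by the geography field. The `recipes` list, the placeholder second entry, and the `group_by_geography` helper are all hypothetical; the only assumption about parsing is that a blank YAML field comes back as `None`.

```python
from collections import defaultdict

# Hypothetical parsed recipes; only the fields used here are shown.
recipes = [
    {"dish_name": "MUSTARD GREEN PORK RICE",
     "category": {"race": "CHINESE", "ethnic_group": None, "geography": "SINGAPORE"}},
    {"dish_name": "EXAMPLE DISH",  # made-up placeholder entry
     "category": {"race": None, "ethnic_group": None, "geography": None}},
]

def group_by_geography(recipes):
    """Map each geography value to the dish names recorded under it."""
    groups = defaultdict(list)
    for recipe in recipes:
        # Blank fields are None; bucket those under "UNKNOWN".
        place = (recipe.get("category") or {}).get("geography") or "UNKNOWN"
        groups[place].append(recipe["dish_name"])
    return dict(groups)

print(group_by_geography(recipes))
```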
Looking forward to your thoughts on this. Thank you.
Hey @kohaugustine. The problem I have here is in switching things from regular lists to, well, anything else (except for OrderedDict; but does YAML support that?). I wrote a number of articles some years ago about writing recipes, which I believe discuss this, but the short version is: ingredients should be listed in the order in which they are to be used.
I realize that most non-professional cooks don't care, as is heavily evidenced by, well, almost every recipe site ever. But professionals care. Somebody working in a production kitchen/bakery will read a recipe all the way first, and then start collecting ingredients in the order in which they are listed, often adding them to the bowl/pot/etc in said order, as the instructions dictate.
These users will need ingredients to reliably be listed in exactly the same order as they are entered. Even if YAML supports ordered dictionaries, there is no guarantee that users will make use of them. I believe that forcing a list format will help here.
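One possible way to reconcile the two positions, sketched under assumptions: keep the ingredients as a sequence so the written order is part of the data, and derive a name-keyed lookup only when it is needed. The `name` key and this layout are hypothetical and not part of ORF; the literal below stands in for what a list-based YAML file would parse to.

```python
# Hypothetical list-based layout: each ingredient is a mapping with a "name" key.
# A sequence keeps the order in which the entries were written.
ingredients = [
    {"name": "SESAME OIL", "amount": 2, "unit": "TEASPOON"},
    {"name": "GINGER", "amount": 0.25, "unit": "PIECE"},
    {"name": "GARLIC", "amount": 8, "unit": "CLOVE"},
]

# Names come out in the order they are to be used.
names = [item["name"] for item in ingredients]

# A lookup table built on demand gives dictionary-style access.
by_name = {item["name"]: item for item in ingredients}

print(names)
print(by_name["GARLIC"]["amount"])
```

The order of use stays authoritative in the file, while code that wants keyed access pays only one extra line.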
@kohaugustine please see #10 for links to the articles. These should explain a lot of the reasoning why I've tried to do things the way I have. I'll see if I can compile them into ORF documentation later.