Repository structure (file layouts, cookiecutter templating engines, distribution options) #1

agriyakhetarpal · 2023-07-28T06:09:03Z

Starting this as a placeholder issue for tracking the tasks that remain to be completed and those that are already done. I will split these into separate issues and PRs.

Cookiecutters

as suggested by @Saransh-cpp

Examples

Suggestions from @brosaplanella:

"A bit different, but Julia has DrWatson.jl which has many cool features, maybe we can get some ideas".
A figures folder (with a .gitkeep)
A data folder (later we could have some pipelines to process it, like the Data Science example above).

Possible layout

The folder structure can look like this

├── .github/workflows
├── src
├── data
├── docs
│   ├── examples  # notebooks that can be rendered with nbsphinx
│   ├── _static
│   ├── sphinxext
│   ├── source
│   └── conf.py
├── tests  # (optional)
├── parameters
│   └── my_parameters.py  # contains the get_parameter_values() function
├── examples/  # alternatively, example scripts or notebooks that are not to be rendered with the Sphinx builder
├── pyproject.toml
├── README.md
├── .pre-commit-config.yaml
├── .readthedocs.yaml
├── .env  # contains DATA_PATH
└── noxfile.py  # (or tox.ini, if users want to use tox)

The required documentation should

Explain how to clone this template to start a new project
Explain how to rename the project in pyproject.toml and docs/conf.py
Explain how to structure a typical project with source files and utility classes and methods (in src/), unit tests with
Parameter entry points in pyproject.toml, with

[project.entry-points.pybamm_parameter_sets]
MyParameters = "package_name.parameters.my_parameters:get_parameter_values"

which can then be accessed as pybamm.ParameterValues("MyParameters") in the source code.

Point to the PyBaMM documentation, where the developments in this repository will be reflected on a separate page or section of the user guide

Tracked in #6.

Configuration options

Build-backends

hatch
flit
poetry
setuptools (later)

Documentation

Theme
Sphinx extensions

Project structure

Project metadata in pyproject.toml
and so on

Available licenses (#2)

MIT
Apache-2.0
BSD-3-Clause
and so on (should be permissive licenses suitable for collaborative research practices and open science)

Addendum 27/02/2024: another thing we would want would be entry points for models in the PyBaMM model structure rather than just parameter sets, please see pybamm-team/PyBaMM#3839 (comment)


agriyakhetarpal · 2023-07-28T10:50:07Z

So I am not sure how this will go yet, though I learned that cruft retains full compatibility with templates based on cookiecutter, but copier has some differences (it uses a YAML file instead of JSON for the project specification).

However, scientific-python/cookie supports all three of them, so I think our use case as a stripped-down, barebones version of it can also support all three—unless we don't need to support all three and just cookiecutter and cruft will be enough

Saransh-cpp · 2023-07-28T14:33:15Z

Thanks for summarising this! Don't worry about supporting a lot of things at the start. We can start with a simple structure, one backend (hatch) support, and just cookiecutter support.

valentinsulzer · 2023-07-28T15:30:20Z

Rather than having a data folder, we should encourage separation of code and data, with the data path specified via the .env file. People are welcome to keep their code and data in the same place, but the data should ideally not be uploaded with the code to GitHub, except for some examples.

If the data path is set as DATA_PATH="path/to/data" in .env, then the following code will load it

import os
from dotenv import load_dotenv

load_dotenv()  # reads .env and adds its variables to os.environ

path_to_data = os.environ["DATA_PATH"]

We could add that as one of the default utility functions in the src folder, e.g. in util.py

import os
from dotenv import load_dotenv

load_dotenv()  # runs once, when util.py is first imported

def environ():
    return os.environ
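A slightly more defensive variant of this helper could return a pathlib.Path and fail with a readable message when DATA_PATH is missing. This is only a sketch: the name get_data_path and its parameters are hypothetical, and it assumes load_dotenv() (or the shell) has already populated the environment.

```python
import os
from pathlib import Path


def get_data_path(environ=None, variable="DATA_PATH"):
    """Return the configured data directory as a Path.

    `environ` defaults to os.environ; passing a plain dict makes the
    function easy to exercise without touching the real environment.
    """
    environ = os.environ if environ is None else environ
    value = environ.get(variable)
    if not value:
        raise RuntimeError(
            f"{variable} is not set; define it in .env or in the environment"
        )
    # expanduser() lets users write paths such as ~/battery-data in .env
    return Path(value).expanduser()
```

Calling get_data_path() after load_dotenv() would then give project code a single place to look up the data directory, rather than reading os.environ directly in several modules.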

agriyakhetarpal · 2023-07-28T16:26:40Z

What sorts of data would DATA_PATH contain ideally? We can further streamline the process of using it with some extra utility functions based on that too

agriyakhetarpal · 2023-07-28T16:26:45Z

Also, I renamed the project from pybamm-cookie-cutter to pybamm-cookiecutter because I saw that most templates with the cookiecutter topic on GitHub were named as such, i.e., without the space between "cookie" and "cutter"

valentinsulzer · 2023-08-01T13:57:32Z

What sorts of data would DATA_PATH contain ideally? We can further streamline the process of using it with some extra utility functions based on that too

Probably csv or parquet

agriyakhetarpal · 2023-08-01T19:37:10Z

I think csv and parquet files would be nice, we would have to use pandas as a dependency in that case (or just get it from the optional dependencies after pybamm-team/PyBaMM#3144 is merged)

A utility function for them could be something like

from pybamm_cookiecutter.util import DataLoader
import pybamm

battery_data = DataLoader.load_data("file1.csv")

In other words, as a wrapper over a combination of load_dotenv and pandas.read_csv() with some customisation here and there
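To make the idea a bit more concrete, here is a rough sketch of such a wrapper. Everything in it is an assumption for illustration: the class is instantiated (unlike the class-level call above), pandas is imported lazily so it can stay an optional dependency, and DATA_PATH is expected to be in the environment already (for example after load_dotenv()).

```python
import os
from pathlib import Path


class DataLoader:
    """Hypothetical helper that resolves file names against DATA_PATH."""

    def __init__(self, data_path=None):
        # Fall back to the DATA_PATH environment variable when no path is given.
        if data_path is None:
            data_path = os.environ["DATA_PATH"]
        self.data_path = Path(data_path)

    def resolve(self, filename):
        """Return the full path of `filename` inside the data directory."""
        return self.data_path / filename

    def load_data(self, filename, **kwargs):
        """Read a csv or parquet file into a pandas DataFrame."""
        path = self.resolve(filename)
        suffix = path.suffix.lower()
        if suffix not in {".csv", ".parquet"}:
            raise ValueError(f"Unsupported file type: {suffix!r}")
        import pandas as pd  # lazy import keeps pandas optional

        if suffix == ".csv":
            return pd.read_csv(path, **kwargs)
        return pd.read_parquet(path, **kwargs)
```

Usage would look like DataLoader().load_data("file1.csv"), with any extra keyword arguments passed straight through to pandas.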

agriyakhetarpal · 2023-08-01T19:42:01Z

See also: https://learn.scientific-python.org/development/patterns/data-files/. We could adopt pooch within PyBaMM too, especially for the SuiteSparse and SUNDIALS downloadables in scripts/install_KLU_Sundials.py

valentinsulzer · 2024-02-22T17:41:36Z

Adding something else to this roadmap, it would be nice if we could add new models via entry points as well. This requires a few changes in PyBaMM though

agriyakhetarpal · 2024-02-22T17:46:15Z

Adding a model via an entry point sounds like a nice idea, but it could be excessive if it isn't done correctly. Do you have a proof of concept? I'm not entirely sure how it would go.

valentinsulzer · 2024-02-22T17:56:04Z

The author would create a model class similar to pybamm's BasicDFN where it's entirely self-contained, then anyone else would be able to call the model with something like pybamm.lithium_ion.Model("author/model-name").

This would solve several existing pain points with adding new models:

PyBaMM’s submodel structure is too complex, huge barrier to entry
You have to add things to the PyBaMM repo, uncertain ownership and IP
PyBaMM team has to "endorse" every model or gatekeep

With entry points, adding a new model is separate from PyBaMM and authors get to retain ownership but we don't have to endorse the models

agriyakhetarpal · 2024-02-22T18:23:15Z

Ah, sounds great. A bootstrapped model should be possible to implement and, IIUC, should work similarly to how we do parameter sets. I would like to note, though, that parameter sets are returned as Python dictionaries, so they are easier to handle; here we might have to establish a class that can either parse the AST for a model or just import a JSON-serialised model, and pass it to pybamm.lithium_ion.Model("author/model-name"). This might be better to do in the PyBaMM source code, as you mentioned.

This issue has been referenced in the GSoC 2024 ideas page for potential readers and contributors, so if and when we flesh out these ideas a bit more, I suggest we should edit and add everything to the top of the thread as well.

santacodes · 2024-05-31T17:15:11Z

Rather than having a data folder, we should encourage separation of code and data, with the data path specified via the .env file. People are welcome to keep their code and data in the same place, but the data should ideally not be uploaded with the code to GitHub, except for some examples.

I guess now, with the pooch PR merged, we could also add the default pooch path for storing data files here, which is under .cache on POSIX and under %appdata% on Windows machines. That way, we could use pybamm.DataLoader to load data files inside PyBaMM-based projects.

agriyakhetarpal · 2024-08-21T09:46:05Z

Adding support for a data/ folder in the generated project, better guidelines on setting up documentation via MyST-NB (i.e., better ways of presenting results/code from research papers), etc., all sound like good feature sets for a v1 release someday, given that we are releasing v0.1 this week.

agriyakhetarpal pinned this issue Jul 28, 2023

agriyakhetarpal added the labels documentation (Improvements or additions to documentation), help wanted (Extra attention is needed), and infrastructure (Issues that relate to the infrastructure of the repository or the template provided through it) Jul 28, 2023

agriyakhetarpal changed the title from "Repository template (file layouts, cookiecutter templating engines, distribution options)" to "Repository structure (file layouts, cookiecutter templating engines, distribution options)" Jul 28, 2023

agriyakhetarpal mentioned this issue Aug 1, 2023: Initial draft for a cookiecutter template (licenses and folder structure) #2 (merged)

agriyakhetarpal mentioned this issue Aug 4, 2023: Documentation about pybamm-cookiecutter #5 (open, 5 tasks)

agriyakhetarpal mentioned this issue May 14, 2024: Look into https://github.com/nelsontky/gh-pages-url-shortener / shorten URLs via data.pybamm.org subdomain pybamm-team/pybamm-data#1 (open)

santacodes mentioned this issue Oct 4, 2024: Unified entry point for models and parameter sets pybamm-team/PyBaMM#4490 (open, 8 tasks)
