Modified search to take in multiple strings #4650

Open
wants to merge 13 commits into
base: develop

Conversation

medha-14
Contributor

@medha-14 medha-14 commented Dec 9, 2024

Description

Fixes #4629

Type of change

Please add a line in the relevant section of CHANGELOG.md to document the change (include PR #) - note reverse order of PR #s. If necessary, also add to the list of breaking changes.

  • New feature (non-breaking change which adds functionality)
  • Optimization (back-end change that speeds up the code)
  • Bug fix (non-breaking change which fixes an issue)

Key checklist:

  • No style issues: $ pre-commit run (or $ nox -s pre-commit) (see CONTRIBUTING.md for how to set this up to run automatically when committing locally, in just two lines of code)
  • All tests pass: $ python -m pytest (or $ nox -s tests)
  • The documentation builds: $ python -m pytest --doctest-plus src (or $ nox -s doctests)

You can run integration tests, unit tests, and doctests together at once, using $ nox -s quick.

Further checks:

  • Code is commented, particularly in hard-to-understand areas
  • Tests added that prove fix is effective or that feature works

@medha-14 medha-14 requested a review from a team as a code owner December 9, 2024 10:14
@medha-14
Contributor Author

medha-14 commented Dec 9, 2024

I have modified the search method to accept multiple strings. When no exact match is found, I pass the concatenation of the input strings to the get_close_matches method. Will this approach suffice for what we are trying to achieve here?
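For readers following the thread, here is a minimal sketch of the approach described in this comment. It assumes a plain list of keys rather than the actual class in src/pybamm/util.py; the function name `search_joined` and the `cutoff` default are hypothetical, while `difflib.get_close_matches` is the standard-library function.

```python
import difflib


def search_joined(keys, search_keys, cutoff=0.5):
    """Illustrative only; not the code in src/pybamm/util.py."""
    if isinstance(search_keys, str):
        search_keys = [search_keys]
    terms = [s.strip().lower() for s in search_keys if s.strip()]
    if not terms:
        raise ValueError("At least one non-empty search string is required")

    # A key counts as an exact match when it contains every search term.
    exact = [k for k in keys if all(t in k.lower() for t in terms)]
    if exact:
        return exact, []

    # Fallback: fuzzy-match the concatenated terms against the lowercased keys,
    # then map the hits back to their original spelling.
    by_lower = {k.lower(): k for k in keys}
    query = " ".join(terms)
    close = difflib.get_close_matches(query, list(by_lower), n=3, cutoff=cutoff)
    return [], [by_lower[c] for c in close]
```

One consequence of joining the inputs is that a single unrelated term lowers the similarity score of the whole query, so the fallback cannot tell which term failed to match.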


codecov bot commented Dec 9, 2024

Codecov Report

Attention: Patch coverage is 88.46154% with 3 lines in your changes missing coverage. Please review.

Project coverage is 99.20%. Comparing base (72c23ea) to head (3f96d73).
Report is 22 commits behind head on develop.

Files with missing lines Patch % Lines
src/pybamm/util.py 88.46% 3 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #4650      +/-   ##
===========================================
- Coverage    99.21%   99.20%   -0.02%     
===========================================
  Files          302      302              
  Lines        22858    22889      +31     
===========================================
+ Hits         22679    22707      +28     
- Misses         179      182       +3     


Contributor

@kratman kratman left a comment

I did not test the code changes locally, but they were just renamings.

Can you add a test with multiple keys in the search? It looks like you only fixed the test for the output formatting.

@medha-14
Contributor Author

I have implemented the suggested changes and added tests for searching multiple strings as well.

Member

@brosaplanella brosaplanella left a comment

Looks good to me, just needs an entry to the CHANGELOG before merging

Member

@agriyakhetarpal agriyakhetarpal left a comment

Thanks, @medha-14! Happy to approve/merge after these suggestions.

@agriyakhetarpal
Member

Actually, in the case of a partial match, would it be better to indicate that it is such? For example,

model.variables.search(["NotAVariable", "concentration"])

now returns

No results for search using '['NotAVariable', 'concentration']'.
Best matches are ['Electrolyte concentration']

but we could say something like

Partial match for key 'concentration' in search using keys '['NotAVariable', 'concentration']'.
Best matches are ['Electrolyte concentration']

because we do have a match here for "concentration", but not for "NotAVariable".

@medha-14
Contributor Author

medha-14 commented Dec 11, 2024

In cases where some keys have an exact match while others only have a best match, what should the expected result be? Should we only prioritize the exact matches in such cases or should we also have the best matches printed separately?

@agriyakhetarpal
Member

> In cases where some keys have an exact match while others only have a best match, what should the expected result be? Should we only prioritize the exact matches in such cases or should we also have the best matches printed separately?

Do you mean the case where we have an exact match for $m$ keys and a partial/best match for $n - m$ keys? Yes, we should print both the exact matches and the best matches according to the keys. Could you share an example?

If I understood your question correctly, then an input as follows:

model.variables.search(["Electrolyte concentration", "Electrolite concentration"])

should return something, in my opinion, like:

Results matched against 'Electrolyte concentration' in search:
Electrolyte concentration

Partial match for 'Electrolite concentration' in search:
Best matches are ['Electrolyte concentration', 'Electrode potential']

We can figure out the best way to display the output later. There is also a case to be made that, once the search accepts multiple strings, results should be returned only for the strings that do produce a match. However, we are not really implementing a search engine, so I feel it is acceptable to return results for every input string (as if we were looping over them in the search). @brosaplanella, what do you think?
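The "loop over each input string" behaviour suggested here could be sketched roughly as follows. This is an assumption-laden illustration, not the PR's code: `search_each_key`, the status labels and the `cutoff` value are all hypothetical, and keys are taken from a plain list.

```python
import difflib


def search_each_key(keys, search_keys, cutoff=0.5):
    """Hypothetical sketch: classify every search key on its own."""
    by_lower = {k.lower(): k for k in keys}
    report = {}
    for search_key in search_keys:
        needle = search_key.strip().lower()

        # Exact match: the search key appears inside a dictionary key.
        exact = [k for k in keys if needle in k.lower()]
        if exact:
            report[search_key] = ("exact", exact)
            continue

        # Otherwise fall back to fuzzy matching for this key only.
        close = difflib.get_close_matches(needle, list(by_lower), n=3, cutoff=cutoff)
        best = [by_lower[c] for c in close]
        report[search_key] = ("partial", best) if best else ("none", [])
    return report
```

Returning a structure per search key keeps the matching separate from the message formatting, so the wording of the output can be decided later.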

@medha-14
Contributor Author

For now, I have modified the method to first search for exact matches, i.e. keys that contain all of the search_keys. If no such matches are found, it gives search results for each term individually.

model.variables.search(["electrolyte", "concentration"])

Since both terms are present together in a single key, the result will be:

 Results for 'Electrolyte concentration': ['Electrolyte concentration']

For the cases where there are no such matches it will iterate over each string individually and give results as such:

model.variables.search(["RandomKey", "electrolyte concentration", "electrolite"])

will give results as:

No matches found for 'RandomKey'.
Exact matches for 'electrolyte concentration': ['Electrolyte concentration [Molar]', 'Electrolyte concentration [mol.m-3]']
No exact matches found for 'electrolite'. Best matches are: ['Electrolyte potential [V]']
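The two-stage behaviour described in this comment might look roughly like the sketch below. It is not the implementation in src/pybamm/util.py: the function name `search_two_stage`, the `cutoff` value and the use of a plain list of keys are assumptions, and the message strings only imitate the output quoted above.

```python
import difflib


def search_two_stage(keys, search_keys, cutoff=0.5):
    """Hypothetical sketch: combined search first, then one search per term."""
    if isinstance(search_keys, str):
        search_keys = [search_keys]
    cleaned = [s.strip() for s in search_keys if s.strip()]
    if not cleaned:
        raise ValueError("At least one non-empty search string is required")
    terms = [s.lower() for s in cleaned]

    # Stage 1: keys that contain every term at once.
    combined = [k for k in keys if all(t in k.lower() for t in terms)]
    if combined:
        joined = " ".join(terms)
        return [f"Results for '{joined}': {combined}"]

    # Stage 2: no key holds all terms, so handle each term individually.
    by_lower = {k.lower(): k for k in keys}
    lines = []
    for original, term in zip(cleaned, terms):
        exact = [k for k in keys if term in k.lower()]
        if exact:
            lines.append(f"Exact matches for '{original}': {exact}")
            continue
        close = difflib.get_close_matches(term, list(by_lower), n=3, cutoff=cutoff)
        if close:
            best = [by_lower[c] for c in close]
            lines.append(
                f"No exact matches found for '{original}'. Best matches are: {best}"
            )
        else:
            lines.append(f"No matches found for '{original}'.")
    return lines
```

The function returns the message lines instead of printing them, which makes the behaviour straightforward to assert on in a test with multiple keys.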


Successfully merging this pull request may close these issues.

Allow dictionary search to take multiple substrings
4 participants