What do researchers search for when looking for code repositories? #1
It would be great if the Software Discovery Dashboard included options to search for reference implementations of published papers by looking up the authors' names, DOIs, or paper titles. License and language would also be interesting search criteria.
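A minimal sketch of what such a lookup could look like, assuming a small in-memory index that links each repository to a paper. The `Repo` record, the `search` function, and the two entries in `INDEX` are hypothetical placeholders for illustration only; they are not part of the dashboard or of any existing service.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Repo:
    """Hypothetical index entry linking a code repository to a paper."""
    url: str
    paper_title: str
    paper_doi: str
    authors: List[str]
    license: str
    language: str


def search(
    repos: List[Repo],
    *,
    doi: Optional[str] = None,
    author: Optional[str] = None,
    title: Optional[str] = None,
    license: Optional[str] = None,
    language: Optional[str] = None,
) -> List[Repo]:
    """Return the repos that match every criterion that was supplied."""

    def matches(r: Repo) -> bool:
        # DOI, license, and language are compared exactly (case-insensitive).
        if doi and r.paper_doi.lower() != doi.lower():
            return False
        if license and r.license.lower() != license.lower():
            return False
        if language and r.language.lower() != language.lower():
            return False
        # Author and title are matched as case-insensitive substrings.
        if author and not any(author.lower() in a.lower() for a in r.authors):
            return False
        if title and title.lower() not in r.paper_title.lower():
            return False
        return True

    return [r for r in repos if matches(r)]


# Made-up placeholder entries, not real papers or repositories.
INDEX = [
    Repo("https://example.org/repo-a", "A Method for X", "10.0000/example.1",
         ["A. Author", "B. Coauthor"], "MIT", "R"),
    Repo("https://example.org/repo-b", "Another Method for Y", "10.0000/example.2",
         ["C. Writer"], "GPL-3.0", "Python"),
]

hits = search(INDEX, author="coauthor", language="r")
print([r.url for r in hits])
```

Criteria that are left out are simply ignored, so the same function covers a DOI-only lookup as well as a combined author, license, and language filter.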
@acabunoc asked me to repeat what I said in #sciencelab that https://dataverse.harvard.edu has a fair amount of R code. Stata too.
What are valuable search criteria when attempting to discover code repositories?
What kind of information would researchers find necessary (or just helpful) in search results?
I usually run "apt-cache search" first to find software at my fingertips. If it is not there, I Google it. In the neuroscience/neuroimaging domain there are also NIF (http://www.neuinfo.org/) and NITRC (http://nitrc.org), which collate and host various related software projects. Google at times leads me there ;) As for software implementing some publication or method, we have plans (but not enough resources yet) to add centralized reporting to duecredit (https://github.com/duecredit/duecredit/), so that later you would be able to find software implementing a referenced publication.
@arfon is thinking about the related area of software citation: https://twitter.com/arfon/status/628504262121816064
Re-usable packages from CRAN, PyPI, etc. are one thing. The actual scripts researchers write and use in analysis are another. People are now archiving analytical code in R, Matlab, and other languages in various data repositories such as the KNB and FigShare as part of their archived data packages. Here's an example of such a package with R code, which has very minimal metadata about the software. For this type of code in the KNB (and DataONE), it would be useful to be able to search for software used in analyses based on a classification of the type of analysis that was done, who created it, which papers it was used in, etc. Some (idiosyncratic) example queries researchers might want would include:
For us (computational biologists) at least, most of the time it's method-driven. We want to answer such and such and heard that method X was good, or that method Y overcomes difficulties that method X does not. The starting point is then literature-based, and we just hope that the code is available somewhere online. I imagine a useful dashboard for computational biologists might contain topics broken down by methods and then by implementation. E.g. --
You might also want to take a look at this idea from the Scholar Ninja project, http://juretriglav.si/discovery-of-scientific-software/, which recommends scientific software while browsing GitHub by extracting software citations from papers.
I have three routes to finding relevant software:
Actually, very rarely will I search for scientific code, because unless it is some sort of general utility or plumbing, I care first about whether the underlying method is good, then about whether it is implemented well. There are many sites which attempt to categorise or provide search of scientific software, but mostly they are much harder to use than Google.
@blahah, we are also mainly driven by method; you succinctly summarized our approach in your post. Curious, what is your main 'branch' of research? We are mainly genetics and systems biology. I'm wondering if workflows differ much between disciplines. Do the physical sciences have organizational approaches the biological sciences don't?
@schae234 computational biology / genomics here, so we overlap considerably I would think.
Some initial thoughts:
Also, I had more general thoughts about this area in the following two articles:
Thanks everyone for the input, it helps give us more context and an idea of how to approach the problem. Please watch for new issues as we learn more and could use more informed input. @blahah what kind of overhead is there with the existing research software search services that makes them hard to use? Could you give an example?
Comparative dashboard suggesting similar tools. Information surrounding licensing, open/proprietary/free, update activity, GitHub repo, programming language, API, gallery of examples/use cases, data footprint, minimum spec, ratings, frameworks that also incorporate this tool, automation possibilities.
From README
Here are a couple of questions to start the discussion about what would make a Software Discovery Dashboard most useful for researchers: