Add a sample test repository #595
Comments
I have to confess it's not really that good yet. However, my attention during the ELIXIR BioHackathon has already been requested for other tasks... hope to have an update soon! |
no rush! |
Back in business! Below, I'll add things I noticed while creating the demo repo. As I won't be done today, more might follow...
|
Thanks, let me open this in a new issue. Many people have been editing the docs, and keeping everything consistent can be challenging. |
Having too many contributors sounds like a lovely problem to have :). I've got two more potential issues... I'll post them in separate comments below. |
I think there might be an issue with extracting a logo when there is no slash (
README.md
# Image
Images used to illustrate the software component.
![logo1.png](logo1.png)
# Logo
Main logo used to represent the target software component.
![logo2.png](logo_directory/logo2.png)
SOMEF Output
"logo": [
{
"result": {
"type": "Url",
"value": "https://raw.githubusercontent.com/tpronk/somef-demo-repo/main/logo_directory/logo2.png"
},
"confidence": 1,
"technique": "regular_expression",
"source": "https://raw.githubusercontent.com/tpronk/somef-demo-repo/main/README.md"
}
],
"image": [
{
"result": {
"type": "Url",
"value": "https://raw.githubusercontent.com/tpronk/somef-demo-repo/main/logo1.png"
},
"confidence": 1,
"technique": "regular_expression",
"source": "https://raw.githubusercontent.com/tpronk/somef-demo-repo/main/README.md"
}
] |
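For readers comparing the README snippet with the URLs in the output above, here is a minimal sketch of how relative image paths can be resolved against a raw-content base URL. It is not SOMEF's actual implementation; the regular expression and the `resolve_image_urls` helper are assumptions made for illustration. It shows that a path without a slash (`logo1.png`) and a path with a directory (`logo_directory/logo2.png`) can both be resolved the same way.

```python
import re
from urllib.parse import urljoin

# Hypothetical pattern (not taken from SOMEF): matches Markdown images
# such as ![alt](path) and captures the path in group 1.
IMAGE_PATTERN = re.compile(r"!\[[^\]]*\]\(([^)\s]+)\)")


def resolve_image_urls(readme_text, base_url):
    """Return an absolute URL for every Markdown image found in readme_text."""
    return [urljoin(base_url, path) for path in IMAGE_PATTERN.findall(readme_text)]


# The README excerpt quoted in the comment above.
readme = """# Image
Images used to illustrate the software component.
![logo1.png](logo1.png)
# Logo
Main logo used to represent the target software component.
![logo2.png](logo_directory/logo2.png)
"""

# Base URL as it appears in the SOMEF output above; the trailing slash matters for urljoin.
base = "https://raw.githubusercontent.com/tpronk/somef-demo-repo/main/"
urls = resolve_image_urls(readme, base)
for url in urls:
    print(url)
```

Because the pattern does not require a slash in the path, both images are picked up, which may be a useful baseline when checking where the extraction diverges.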
At the Hackathon, we've been extracting metadata from around 65 repos, but in none of the SOMEF output can I find the field |
I found a case where values extracted for the
"invocation": [
{
"result": {
"type": "Text_excerpt",
"value": "\n```{r, echo=FALSE, results='asis', message = FALSE}\nmy_apc %>% select(institution, euro) %>% \n group_by(institution) %>% \n ezsummary::ezsummary(n = TRUE, digits= 0, median = TRUE,\n extra = c(\n sum = \"sum(., na.rm = TRUE)\",\n min = \"min(., na.rm = TRUE)\",\n max = \"max(., na.rm = TRUE)\"\n )) %>%\n mutate_all(format, big.mark=',') %>%\n ezsummary::ezmarkup('...[. (.)]..[. - .]') %>%\n#> get rid of blanks\n mutate(`mean (sd)` = gsub(\"\\\\( \", \"(\", .$`mean (sd)`)) %>% \n select(institution, n, sum, `mean (sd)`, median, `min - max`) %>%\n arrange(desc(n)) %>%\n knitr::kable(col.names = c(\"Institution\", \"Articles\", \"Spending total (in \u20ac)\", \"Mean (SD)\", \"Median\", \"Minimum - Maximum\"), align = c(\"l\",\"r\", \"r\", \"r\", \"r\", \"r\"))\n``` \n",
"original_header": "Fully Open Access Journals"
},
"confidence": 0.906763643352601,
"technique": "supervised_classification",
"source": "https://raw.githubusercontent.com/MPDL/unibiAPC/master/README.md"
},
{
"result": {
"type": "Text_excerpt",
"value": "```{r, echo = FALSE, warning = TRUE}\n\nknitr::opts_knit$set(base.url = \"/\")\nknitr::opts_chunk$set(\n comment = \"#>\",\n collapse = TRUE,\n warning = FALSE,\n message = FALSE,\n echo = FALSE,\n fig.width = 9,\n fig.height = 6\n)\noptions(scipen = 999, digits = 0, tibble.width = Inf, tibble.print_max = Inf)\n\nknitr::knit_hooks$set(inline = function(x) {\n prettyNum(x, big.mark = \",\")\n})\n```\n```{r}\nrequire(dplyr)\nrequire(ggplot2)\nrequire(ezsummary)\nrequire(pander)\n```\n```{r, echo=FALSE, cache = FALSE}\nmy_apc <- readr::read_csv(\"data/apc_de.csv\")\n```\n \n"
},
"confidence": 0.9211067534061969,
"technique": "supervised_classification",
"source": "https://raw.githubusercontent.com/MPDL/unibiAPC/master/README.md"
}
] |
Thanks for these issues. |
If you find any more, please open them! I usually open them as I test on diverse repos, but sometimes it is tricky to get to these edge cases. |
Bueno & gracias. I'll keep 'em coming then :) |
Wrapping things up, I compared the fields mentioned in the README.md of SOMEF to the fields in constants.py. These are the discrepancies I found: entries that appear in one but not the other, ignoring cases where they probably just have a different name.
|
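A comparison like the one described above can be sketched with plain set differences. The field names below are placeholders, not actual SOMEF fields, and `compare_fields` is a hypothetical helper; in practice the two lists would be collected from README.md and constants.py by hand or with a small parser.

```python
def compare_fields(readme_fields, constants_fields):
    """Report names that appear in only one of the two sources."""
    readme_set = set(readme_fields)
    constants_set = set(constants_fields)
    return {
        "only_in_readme": sorted(readme_set - constants_set),
        "only_in_constants": sorted(constants_set - readme_set),
    }


# Placeholder names for illustration only; these are not real SOMEF fields.
readme_fields = ["field_a", "field_b", "field_c"]
constants_fields = ["field_b", "field_c", "field_d"]

report = compare_fields(readme_fields, constants_fields)
print(report)
```

Entries that merely have a different name in each source would still show up in both lists, so they need to be filtered out manually.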
All right then. SOMEF 0.9.4 can extract a total of 48 fields from this version of somef-demo-repo, which can make it a nice integration test I guess |
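As a rough sketch of what such an integration check might look like, the snippet below counts the non-empty top-level fields in a SOMEF-style JSON result. It assumes that each top-level key corresponds to one extracted field and that bookkeeping keys can be skipped through the `ignore` parameter; both are assumptions, and `count_extracted_fields` is a hypothetical helper rather than part of SOMEF. The sample input is a tiny stand-in, not real output.

```python
import json


def count_extracted_fields(somef_output, ignore=()):
    """Count top-level keys with at least one result, skipping keys listed in ignore."""
    return sum(
        1
        for key, value in somef_output.items()
        if key not in ignore and value
    )


# Tiny stand-in for a SOMEF result; a real test would load the JSON file
# produced for somef-demo-repo instead.
sample = json.loads(
    '{"logo": [{"confidence": 1}], "image": [{"confidence": 1}], "invocation": []}'
)

count = count_extracted_fields(sample)
print(count)
```

Against real output for the demo repository, the expected count would be compared with the 48 fields reported above, provided the counting convention matches.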
Definitely. Thanks!! |
This repository, https://github.com/tpronk/somef-demo-repo, should be added to the documentation.