Create readme author list ordered as in dataset view page #2123

Draft: wants to merge 22 commits into base: develop

@kencho51 (Contributor) commented Dec 10, 2024

Pull request for issue: #2116

This pull request does the following:

  1. Extracted the $author_sql query from the function getReleaseDetails() in protected/components/StoredDatasetMainSection.php into gigadb-website/gigadb/app/models/Author.php as listByDatasetId()
  2. Made the function available to the readme-generator tool
  3. Made the function available to the main codebase
  4. Created a unit test for the function getReleaseDetails()
  5. Created a functional test to confirm that the generated readme lists the authors in the same order as the dataset view page
  6. Created an acceptance test to confirm that the function getReleaseDetails() still outputs the author list as before

How to test?

Follow the steps in gigadb/app/tools/readme-generator/README.md to spin up gigadb-website and the readme-generator services.

In dev

% cd gigadb-website
% ./up.sh
% ./tests/unit_functional_runner
% ./tests/acceptance_runner
% cd gigadb/app/tools/readme-generator
% bats tests/createReadme.bats
% docker-compose run --rm tool ./vendor/codeception/codeception/codecept run unit tests/unit/models/AuthorTest.php
% docker-compose run --rm tool ./vendor/bin/codecept run tests/functional/ReadmeCest.php

In staging

  1. Follow the steps in docs/SETUP_PROVISIONING.md to spin up staging environments
  2. Follow the steps in the section "Using readme generator tool on Bastion server" of gigadb/app/tools/readme-generator/README.md to run the readme generator tool, or, in brief, as follows:
[ec2-user@ip-10-99-0-70 ~]$ psql -h rds-server-staging-ken.cjizsjwbxkxv.ap-northeast-2.rds.amazonaws.com -U gigadb -c "select id from dataset where identifier = '100925';"
Password for user gigadb: 
  id  
------
 2335
(1 row)

[ec2-user@ip-10-99-0-70 ~]$ psql -h rds-server-staging-ken.cjizsjwbxkxv.ap-northeast-2.rds.amazonaws.com -U gigadb -c "select id, name, location, size from file where dataset_id = 2335 and name = 'readme_100925.txt';"
Password for user gigadb: 
   id   |       name        |                                                      location                                                      | size 
--------+-------------------+--------------------------------------------------------------------------------------------------------------------+------
 464507 | readme_100925.txt | https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/staging/pub/10.5524/100001_101000/100925/readme_100925.txt | 9962
(1 row)

[ec2-user@ip-10-99-0-70 ~]$
[ec2-user@ip-10-99-0-70 ~]$ psql -h rds-server-staging-ken.cjizsjwbxkxv.ap-northeast-2.rds.amazonaws.com -U gigadb -c 'select * from file_attributes where file_id = 464507'
Password for user gigadb: 
   id   | file_id | attribute_id |              value               | unit_id 
--------+---------+--------------+----------------------------------+---------
 413584 |  464507 |          605 | 4b5700b732f4d0239ac3682009424265 | 
(1 row)

[ec2-user@ip-10-99-0-70 ~]$ docker run --rm -v /home/ec2-user/readmeFiles/:/app/readmeFiles registry.gitlab.com/gigascience/forks/kencho-gigadb-website/production_tool:staging /app/yii readme/create --doi 100925 --outdir /app/readmeFiles --bucketPath wasabi:gigadb-datasets/staging/pub/10.5524
[ec2-user@ip-10-99-0-70 ~]$ ls readmeFiles/
readme_100925.txt
[ec2-user@ip-10-99-0-70 ~]$ cat readmeFiles/readme_100925.txt 
[ec2-user@ip-10-99-0-70 ~]$ psql -h rds-server-staging-ken.cjizsjwbxkxv.ap-northeast-2.rds.amazonaws.com -U gigadb -c 'select * from file_attributes where file_id = 464507'
Password for user gigadb: 
   id   | file_id | attribute_id |              value               | unit_id 
--------+---------+--------------+----------------------------------+---------
 413584 |  464507 |          605 | 484527e1a4e187bfae3f2bc35860ae60 | 
(1 row)

[ec2-user@ip-10-99-0-70 ~]$ /usr/local/bin/createReadme --doi 100925
[ec2-user@ip-10-99-0-70 ~]$ ls /var/log/gigadb/
readme.log
[ec2-user@ip-10-99-0-70 ~]$ cat /var/log/gigadb/readme.log
2024/12/13 06:44:13 INFO  : Created readme file for DOI 100925 in /home/ec2-user/readme_100925.txt
2024/12/13 06:44:14 NOTICE: readme_100925.txt: Skipped copy as --dry-run is set (size 9.731Ki)
2024/12/13 06:44:14 NOTICE: 
Transferred:        9.731 KiB / 9.731 KiB, 100%, 0 B/s, ETA -
Transferred:            1 / 1, 100%
Elapsed time:         0.6s

2024/12/13 06:44:14 INFO  : Executed: rclone copy --s3-no-check-bucket /home/ec2-user/readme_100925.txt wasabi:gigadb-datasets/staging/pub/10.5524/100001_101000/100925/ --config /home/ec2-user/.config/rclone/rclone.conf --dry-run --log-file /var/log/gigadb/readme.log --log-level INFO --stats-log-level DEBUG >> /var/log/gigadb/readme.log
2024/12/13 06:44:14 INFO  : Successfully copied file to Wasabi for DOI: 100925

How have functionalities been implemented?

The order of authors differed between the readme-generator and the dataset view page because the two used different methods to build the author list from the database. The readme-generator used Yii's default query methods, which take no account of relationships with other tables, while the dataset view page used a custom SQL command that does take those relationships into account. That relationship-based ordering appears to be important, as it was designed by the curators.

To remove the inconsistency, the custom SQL command has been extracted so that it is available to both the readme-generator and the main codebase. As a result, both now use the same method to generate the author list.
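
To illustrate why the two approaches can disagree, here is a minimal, language-neutral sketch in Python with an in-memory SQLite database. It is not the PHP code in this PR. The table layout (an author table plus a dataset_author join table with a rank column), the function names and the demo rows are all assumptions made for the example.

```python
import sqlite3


def make_demo_db():
    """Build a throwaway database. The schema and rows are hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE author (id INTEGER PRIMARY KEY, surname TEXT, first_name TEXT);
        CREATE TABLE dataset_author (dataset_id INTEGER, author_id INTEGER, rank INTEGER);
        INSERT INTO author VALUES (1, 'Gamma', 'C'), (2, 'Alpha', 'A'), (3, 'Beta', 'B');
        -- The curated order for dataset 1 is Alpha, Beta, Gamma.
        INSERT INTO dataset_author VALUES (1, 2, 1), (1, 3, 2), (1, 1, 3);
        """
    )
    return conn


def list_by_dataset_id(conn, dataset_id):
    """Shared query: order authors by the rank stored in the join table."""
    rows = conn.execute(
        "SELECT a.surname, a.first_name FROM author a "
        "JOIN dataset_author da ON da.author_id = a.id "
        "WHERE da.dataset_id = ? ORDER BY da.rank ASC",
        (dataset_id,),
    ).fetchall()
    return [f"{surname} {first_name}" for surname, first_name in rows]


def list_ignoring_rank(conn, dataset_id):
    """Naive query: same authors, ordered by author id instead of rank."""
    rows = conn.execute(
        "SELECT a.surname, a.first_name FROM author a "
        "JOIN dataset_author da ON da.author_id = a.id "
        "WHERE da.dataset_id = ? ORDER BY a.id ASC",
        (dataset_id,),
    ).fetchall()
    return [f"{surname} {first_name}" for surname, first_name in rows]


if __name__ == "__main__":
    conn = make_demo_db()
    print(list_by_dataset_id(conn, 1))
    print(list_ignoring_rank(conn, 1))
```

Under these assumptions, having every caller go through one shared query function means the readme and the dataset view page cannot drift apart in author order.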

Any issues with implementation?

None.

Any changes to automated tests?

Describe any automated tests that have been developed for the new
functionalities

Any changes to documentation?

Replaced centos with ec2-user in gigadb/app/tools/readme-generator/README.md.

Any technical debt repayment?

None.

Any improvements to CI/CD pipeline?

Describe any improvements to the Gitlab pipeline


[Citation]
Oleksyk TK; Guiblet W; Pombert JF; Valentin R; Martinez-Cruzado JC (2012): Genomic data of the Puerto Rican Parrot (<em>Amazona vittata</em>) from a locally funded project.
GigaScience Database. https://dx.doi.org/10.5524/100039
Member

Whilst you're working on the readme, can you remove this spurious carriage return? The "GigaScience Database" line should continue at the end of the title on the line above.
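
A minimal sketch of the requested layout, in Python for illustration only (the readme generator itself is PHP). The function name format_citation and its parameters are hypothetical and do not come from the codebase.

```python
def format_citation(authors, year, title, doi):
    """Return the citation as one line, with no break before 'GigaScience Database.'

    Hypothetical helper: the name and signature are assumptions for this sketch.
    """
    author_list = "; ".join(authors)
    # A single space, not a newline, separates the title from the publisher.
    return f"{author_list} ({year}): {title} GigaScience Database. https://dx.doi.org/{doi}"


if __name__ == "__main__":
    print(
        format_citation(
            ["Oleksyk TK", "Guiblet W", "Pombert JF", "Valentin R", "Martinez-Cruzado JC"],
            2012,
            "Genomic data of the Puerto Rican Parrot (<em>Amazona vittata</em>) "
            "from a locally funded project.",
            "10.5524/100039",
        )
    )
```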
