Merge remote-tracking branch 'IQSS/develop' into IQSS/10814-Improve_d…
…ataset_version_differencing
qqmyers committed Nov 6, 2024
2 parents 5f8fe02 + b28812b commit f5714c4
Showing 93 changed files with 2,927 additions and 1,386 deletions.
4 changes: 2 additions & 2 deletions .env
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
APP_IMAGE=gdcc/dataverse:unstable
POSTGRES_VERSION=16
POSTGRES_VERSION=17
DATAVERSE_DB_USER=dataverse
SOLR_VERSION=9.3.0
SKIP_DEPLOY=0
SKIP_DEPLOY=0
2 changes: 1 addition & 1 deletion .github/PULL_REQUEST_TEMPLATE.md
@@ -2,7 +2,7 @@

**Which issue(s) this PR closes**:

Closes #
- Closes #

**Special notes for your reviewer**:

2 changes: 1 addition & 1 deletion .github/workflows/guides_build_sphinx.yml
@@ -11,6 +11,6 @@ jobs:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
- uses: OdumInstitute/sphinx-action@master
- uses: uncch-rdmc/sphinx-action@master
with:
docs-folder: "doc/sphinx-guides/"
10 changes: 10 additions & 0 deletions doc/release-notes/10379-MetricsBugsFixes.md
@@ -0,0 +1,10 @@

### Metrics API Bug fixes

Two bugs in the Metrics API have been fixed:

- The /datasets and /datasets/byMonth endpoints could report incorrect values when called with the dataLocation parameter (which selects metrics for local, remote (harvested), or all datasets), because the metrics cache did not store separate values for these cases.

- Metrics endpoints whose calculation relied on finding the latest published dataset version returned incorrect values when the minor version number was greater than 9.

When deploying the new release, the [/api/admin/clearMetricsCache](https://guides.dataverse.org/en/latest/api/native-api.html#metrics) API should be called to remove old cached values that may be incorrect.
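
A minimal sketch of that post-deployment step, assuming the admin API is reachable on `localhost:8080` (adjust `SERVER_URL` for your installation). The helper name `clear_metrics_cache` is illustrative only.

```shell
# Assumption: the admin API is reachable at localhost:8080; override SERVER_URL if not.
SERVER_URL="${SERVER_URL:-http://localhost:8080}"

# Hypothetical helper: drops all cached metrics so they are recomputed on the next request.
clear_metrics_cache() {
  curl -sS -X DELETE "$SERVER_URL/api/admin/clearMetricsCache"
}
```

Calling `clear_metrics_cache` once after the new release is deployed should be enough; the cache is repopulated as metrics endpoints are queried.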
7 changes: 7 additions & 0 deletions doc/release-notes/10697-improve-permission-indexing.md
@@ -0,0 +1,7 @@
### Reindexing after a role assignment is less memory intensive

Adding/removing a user from a role on a collection, particularly the root collection, could lead to a significant increase in memory use resulting in Dataverse itself failing with an out-of-memory condition. Such changes now consume much less memory.

If you have experienced out-of-memory failures in Dataverse in the past that could have been caused by this problem, you may wish to run a [reindex in place](https://guides.dataverse.org/en/latest/admin/solr-search-index.html#reindex-in-place) to update any out-of-date information.

For more information, see #10697 and #10698.
@@ -0,0 +1,3 @@
MDC Citation retrieval with the PID settings has been fixed.
DOI parsing in Dataverse is case insensitive, improving interaction with services that may change the case.
Warnings related to managed/excluded PID lists for PID providers have been reduced.
3 changes: 3 additions & 0 deletions doc/release-notes/10742-newest-oldest-sort-order-backwards.md
@@ -0,0 +1,3 @@
## Minor UI bug fix: file order on the Dataset Files page when sorting by date

The 'Newest' and 'Oldest' sort options, which were reversed, now order files correctly.
2 changes: 2 additions & 0 deletions doc/release-notes/10772-fix-importDDI-otherId.md
@@ -0,0 +1,2 @@
Bug Fix:
This PR fixes the `edu.harvard.iq.dataverse.util.json.JsonParseException: incorrect multiple for field otherId` error that occurred when harvested DDI data contained multiple otherId values.
@@ -0,0 +1,8 @@
Search API (/api/search) responses for Datafiles include image_url for the thumbnail if all of the following are true:
1. The DataFile is not Harvested
2. A Thumbnail is available for the Datafile
3. If the Datafile is Restricted then the caller must have Download File Permission for the Datafile
4. The Datafile is NOT actively embargoed
5. The Datafile's retention period has NOT expired

See also #10875 and #10886.
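
As a rough illustration of where `image_url` would show up, the sketch below searches for files only. It assumes a server on `localhost:8080`; `search_files` is a hypothetical helper name, and for restricted files an `X-Dataverse-key` header for a user with download permission would also be needed.

```shell
# Assumption: Dataverse is reachable at localhost:8080; override SERVER_URL if not.
SERVER_URL="${SERVER_URL:-http://localhost:8080}"

# Hypothetical helper: search for files matching a query term.
# File items in the JSON response that meet the conditions above should carry image_url.
search_files() {
  curl -sS -G "$SERVER_URL/api/search" --data-urlencode "q=$1" --data-urlencode "type=file"
}
```

For example, `search_files trees` returns the matching file items as JSON.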
7 changes: 7 additions & 0 deletions doc/release-notes/10889_bump_PG17_FlyWay10.md
@@ -0,0 +1,7 @@
This release bumps both the Postgres JDBC driver and Flyway versions. This should better support Postgres version 17, and as of version 10 Flyway no longer requires a paid subscription to support older versions of Postgres.

While we don't encourage the use of older Postgres versions, this flexibility may benefit some of our long-standing installations in their upgrade paths. Postgres 13 remains the version used with automated testing.

As part of this update, the containerized development environment now uses Postgres 17 instead of 16. Developers must delete their data (`rm -rf docker-dev-volumes`) and start with an empty database. They can rerun the quickstart in the dev guide.

The Docker compose file used for [evaluations or demos](https://dataverse-guide--10912.org.readthedocs.build/en/10912/container/running/demo.html) has been upgraded from Postgres 13 to 17.
1 change: 1 addition & 0 deletions doc/release-notes/10901deaccessioned file edit fix.md
@@ -0,0 +1 @@
Fixed an error that occurred when updating the files of a dataset whose only previous version had been deaccessioned.
@@ -0,0 +1 @@
Adds a new endpoint (`PUT /api/dataverses/<identifier>`) for updating an existing Dataverse collection using a JSON file with the same structure as the one used by the collection creation API.
3 changes: 3 additions & 0 deletions doc/release-notes/10914-users-token-api-credentials.md
@@ -0,0 +1,3 @@
Extended the users/token GET endpoint to support any auth mechanism for retrieving the token information.

Previously, this endpoint only accepted an API token to retrieve its information. Now, it accepts any authentication mechanism and returns the associated API token information.
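
The two calls might look as follows. This is a sketch only: the bearer-token variant assumes that bearer authentication is enabled on the installation, and both helper names are hypothetical.

```shell
# Assumption: the demo server is used; override SERVER_URL for your installation.
SERVER_URL="${SERVER_URL:-https://demo.dataverse.org}"

# Previous behavior: look up token information using the API token itself.
token_info_with_api_token() {
  curl -sS -H "X-Dataverse-key:$1" "$SERVER_URL/api/users/token"
}

# Assumed alternative: authenticate with a bearer token instead
# (only works if bearer authentication is enabled on the server).
token_info_with_bearer() {
  curl -sS -H "Authorization: Bearer $1" "$SERVER_URL/api/users/token"
}
```

Either call should return information about the API token associated with the authenticated user.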
1 change: 1 addition & 0 deletions doc/release-notes/10919-minor-DataCiteXML-bugfix.md
@@ -0,0 +1 @@
A minor bug fix was made to avoid sending a useless ", null" in the DataCiteXML sent to DataCite and in the DataCite export when a dataset has a metadata entry for "Software Name" and no entry for "Software Version". The fix is applied to a dataset's metadata when the dataset is published. Existing published datasets with this problem can be fixed by [pushing updated metadata to DataCite for affected datasets](https://guides.dataverse.org/en/6.4/admin/dataverses-datasets.html#update-metadata-for-a-published-dataset-at-the-pid-provider) and [re-exporting the dataset metadata](https://guides.dataverse.org/en/6.4/admin/metadataexport.html#batch-exports-through-the-api), or by following steps 9 and 10 in the v6.4 release notes to update and re-export all datasets.
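
For a single affected dataset, the two steps could be sketched as below. The endpoint paths are taken to be those described on the linked guide pages and should be checked there; the helper names, and the use of a database id for the first call and a persistent identifier for the second, are assumptions.

```shell
# Assumption: Dataverse is reachable at localhost:8080; API_TOKEN belongs to a superuser.
SERVER_URL="${SERVER_URL:-http://localhost:8080}"

# Hypothetical helper: push updated metadata for one dataset (database id) to the PID provider.
push_pid_metadata() {
  curl -sS -H "X-Dataverse-key:$API_TOKEN" -X POST "$SERVER_URL/api/datasets/$1/modifyRegistrationMetadata"
}

# Hypothetical helper: re-export all metadata formats for one dataset (persistent identifier).
reexport_dataset() {
  curl -sS "$SERVER_URL/api/admin/metadata/:persistentId/reExportDataset?persistentId=$1"
}
```

Running both helpers for each affected dataset is the targeted alternative to updating and re-exporting every dataset.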
5 changes: 5 additions & 0 deletions doc/release-notes/10939-i18n-docker.md
@@ -0,0 +1,5 @@
## Multiple Languages in Docker

Configuration and documentation have been added to explain how to set up multiple languages (e.g. English and French) in the tutorial for setting up Dataverse in Docker.

See also #10939.
@@ -0,0 +1,11 @@
## Unpublished file bug fix

A bug was fixed in retrieving the major version of a dataset when all of its major versions have been deaccessioned. Previously, files in such datasets could be shown as "Unpublished" in the search list even though they were published.
The fix affects indexing, so these datasets must be re-indexed once Dataverse is updated. This can be done manually by calling the index API for each affected dataset.

Example:
```shell
curl "http://localhost:8080/api/admin/index/dataset?persistentId=doi:10.7910/DVN/6X4ZZL"
```

See also #10947 and #10974.
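
When several datasets are affected, the same index call can be repeated in a loop. The sketch below assumes a server on `localhost:8080`; `reindex_datasets` is a hypothetical helper and the persistent identifiers passed to it are whatever your installation's affected datasets use.

```shell
# Assumption: the admin API is reachable at localhost:8080; override SERVER_URL if not.
SERVER_URL="${SERVER_URL:-http://localhost:8080}"

# Hypothetical helper: reindex every dataset whose persistent ID is passed as an argument.
reindex_datasets() {
  for pid in "$@"; do
    curl -sS "$SERVER_URL/api/admin/index/dataset?persistentId=$pid"
    echo  # keep each JSON response on its own line
  done
}
```

A call such as `reindex_datasets doi:10.7910/DVN/6X4ZZL` reindexes one dataset; more identifiers can be appended as further arguments.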
2 changes: 2 additions & 0 deletions doc/release-notes/10969-order-subfields-version-difference.md
@@ -0,0 +1,2 @@
Bug Fix:
To make it easier to compare the draft and published versions of a dataset, subfields are now sorted (#10969).
@@ -0,0 +1,7 @@
## Facet filter labels are now translated in the search results block

On the main page, it's possible to filter results using search facets. If internationalization (i18n) has been activated in the Dataverse installation, allowing pages to be displayed in several languages, the facets are translated in the filter column. However, they weren't translated in the search results and remained in the default language, English.

This version of Dataverse fixes this and includes internationalization in the facets visible in the search results section.

For more information, see issue [#9408](https://github.com/IQSS/dataverse/issues/9408) and pull request [#10158](https://github.com/IQSS/dataverse/pull/10158).
2 changes: 1 addition & 1 deletion doc/sphinx-guides/source/admin/metadatacustomization.rst
@@ -579,7 +579,7 @@ In general, the external vocabulary support mechanism may be a better choice for
The specifics of the user interface for entering/selecting a vocabulary term and how that term is then displayed are managed by third-party Javascripts. The initial Javascripts that have been created provide auto-completion, displaying a list of choices that match what the user has typed so far, but other interfaces, such as displaying a tree of options for a hierarchical vocabulary, are possible.
Similarly, existing scripts do relatively simple things for displaying a term - showing the term's name in the appropriate language and providing a link to an external URL with more information, but more sophisticated displays are possible.

Scripts supporting use of vocabularies from services supporting the SKOMOS protocol (see https://skosmos.org), retrieving ORCIDs (from https://orcid.org), services based on Ontoportal product (see https://ontoportal.org/), and using ROR (https://ror.org/) are available https://github.com/gdcc/dataverse-external-vocab-support. (Custom scripts can also be used and community members are encouraged to share new scripts through the dataverse-external-vocab-support repository.)
Scripts supporting use of vocabularies from services supporting the SKOSMOS protocol (see https://skosmos.org), retrieving ORCIDs (from https://orcid.org), services based on the Ontoportal product (see https://ontoportal.org/), and using ROR (https://ror.org/) are available at https://github.com/gdcc/dataverse-external-vocab-support. (Custom scripts can also be used and community members are encouraged to share new scripts through the dataverse-external-vocab-support repository.)

Configuration involves specifying which fields are to be mapped, to which Solr field they should be indexed, whether free-text entries are allowed, which vocabulary(ies) should be used, what languages those vocabulary(ies) are available in, and several service protocol and service instance specific parameters, including the ability to send HTTP headers on calls to the service.
These are all defined in the :ref:`:CVocConf <:CVocConf>` setting as a JSON array. Details about the required elements as well as example JSON arrays are available at https://github.com/gdcc/dataverse-external-vocab-support, along with an example metadata block that can be used for testing.
52 changes: 52 additions & 0 deletions doc/sphinx-guides/source/api/native-api.rst
@@ -74,6 +74,58 @@ The request JSON supports an optional ``metadataBlocks`` object, with the follow

To obtain an example of how these objects are included in the JSON file, download :download:`dataverse-complete-optional-params.json <../_static/api/dataverse-complete-optional-params.json>` file and modify it to suit your needs.

.. _update-dataverse-api:

Update a Dataverse Collection
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Updates an existing Dataverse collection using a JSON file with the same structure as the one used to create a collection (see :ref:`create-dataverse-api`).

The steps for updating a Dataverse collection are:

- Prepare a JSON file containing the fields for the properties you want to update. You do not need to include all the properties, only the ones you want to update.
- Execute a curl command or equivalent.

As an example, you can download :download:`dataverse-complete.json <../_static/api/dataverse-complete.json>` file and modify it to suit your needs. The controlled vocabulary for ``dataverseType`` is the following:

- ``DEPARTMENT``
- ``JOURNALS``
- ``LABORATORY``
- ``ORGANIZATIONS_INSTITUTIONS``
- ``RESEARCHERS``
- ``RESEARCH_GROUP``
- ``RESEARCH_PROJECTS``
- ``TEACHING_COURSES``
- ``UNCATEGORIZED``

The curl command below assumes you are using the name "dataverse-complete.json" and that this file is in your current working directory.

Next you need to figure out the alias or database id of the Dataverse collection you want to update.

.. code-block:: bash

  export API_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
  export SERVER_URL=https://demo.dataverse.org
  export DV_ALIAS=dvAlias

  curl -H "X-Dataverse-key:$API_TOKEN" -X PUT "$SERVER_URL/api/dataverses/$DV_ALIAS" --upload-file dataverse-complete.json

The fully expanded example above (without environment variables) looks like this:

.. code-block:: bash

  curl -H "X-Dataverse-key:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" -X PUT "https://demo.dataverse.org/api/dataverses/dvAlias" --upload-file dataverse-complete.json

You should expect an HTTP 200 response and JSON beginning with "status":"OK" followed by a representation of the updated Dataverse collection.
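
Because only the properties being changed need to be included, a much smaller file can be enough. The sketch below is illustrative: it assumes the field names match those in ``dataverse-complete.json`` (``description`` and ``dataverseType``), and ``update.json`` is a hypothetical file name.

```shell
# Assumption: field names follow dataverse-complete.json; update.json is a hypothetical file name.
cat > update.json <<'EOF'
{
  "description": "Updated description for this collection.",
  "dataverseType": "RESEARCH_GROUP"
}
EOF

# The file is then sent in the same way as the full example, for instance:
# curl -H "X-Dataverse-key:$API_TOKEN" -X PUT "$SERVER_URL/api/dataverses/$DV_ALIAS" --upload-file update.json
```

Keeping the request file limited to the changed properties makes it easier to review what an update will touch.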

As in :ref:`create-dataverse-api`, the request JSON supports an optional ``metadataBlocks`` object, with the following supported sub-objects:

- ``metadataBlockNames``: The names of the metadata blocks you want to add to the Dataverse collection.
- ``inputLevels``: The names of the fields in each metadata block for which you want to add a custom configuration regarding their inclusion or requirement when creating and editing datasets in the Dataverse collection. Note that if the corresponding metadata block names are not specified in the ``metadataBlockNames`` field, they will be added automatically to the Dataverse collection.
- ``facetIds``: The names of the fields to use as facets for browsing datasets and collections in the Dataverse collection. Note that the order of the facets is defined by their order in the provided JSON array.

To obtain an example of how these objects are included in the JSON file, download :download:`dataverse-complete-optional-params.json <../_static/api/dataverse-complete-optional-params.json>` file and modify it to suit your needs.

.. _view-dataverse:

View a Dataverse Collection
50 changes: 50 additions & 0 deletions doc/sphinx-guides/source/container/dev-usage.rst
@@ -140,6 +140,56 @@ Alternatives:
- If you used Docker Compose for running, you may use ``docker compose -f docker-compose-dev.yml logs <service name>``.
Options are the same.

Accessing Harvesting Log Files
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

\1. Open a terminal and access the Dataverse container.

Run the following command to access the Dataverse container (assuming your container is named dataverse-1):

.. code-block::

  docker exec -it dataverse-1 bash

This command opens an interactive shell within the dataverse-1 container.

\2. Navigate to the log files directory.

Once inside the container, navigate to the directory where Dataverse logs are stored:

.. code-block::

  cd /opt/payara/appserver/glassfish/domains/domain1/logs

This directory contains various log files, including those relevant to harvesting.

\3. Create a directory for copying files.

Create a directory where you'll copy the files you want to access on your local machine:

.. code-block::

  mkdir /dv/filesToCopy

This will create a new folder named filesToCopy inside /dv.

\4. Copy the files to the new directory.

Copy all files from the current directory to the newly created filesToCopy directory:

.. code-block::

  cp * /dv/filesToCopy

This command copies all files in the logs directory to /dv/filesToCopy.

\5. Access the files on your local machine.

On your local machine, the copied files should appear in the following directory:

.. code-block::

  docker-dev-volumes/app/data/filesToCopy

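
As a possible shortcut for the steps above, ``docker cp`` can copy the logs directory straight from the container to the host. This is a sketch rather than part of the documented procedure: it assumes the container is named ``dataverse-1``, and both the helper name ``copy_dataverse_logs`` and the target directory ``./harvest-logs`` are hypothetical.

```shell
# Assumptions: the container is named dataverse-1; ./harvest-logs is a hypothetical target directory.
copy_dataverse_logs() {
  container="${1:-dataverse-1}"
  target="${2:-./harvest-logs}"
  # docker cp copies the whole logs directory from the container to the host.
  docker cp "$container:/opt/payara/appserver/glassfish/domains/domain1/logs" "$target"
}
```

Running ``copy_dataverse_logs`` with no arguments uses the defaults; a different container name or target directory can be passed as the first and second argument.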
Redeploying
-----------
17 changes: 17 additions & 0 deletions doc/sphinx-guides/source/container/running/demo.rst
@@ -137,6 +137,23 @@ In the example below of configuring :ref:`:FooterCopyright` we use the default u

Once you make this change, it should be visible in the copyright notice in the bottom left of every page.

Multiple Languages
++++++++++++++++++

Generally speaking, you'll want to follow :ref:`i18n` in the Installation Guide to set up multiple languages such as English and French.

To set up the toggle between English and French, we'll use a slight variation on the command in the instructions above, adding the unblock key we created above:

``curl "http://localhost:8080/api/admin/settings/:Languages?unblock-key=unblockme" -X PUT -d '[{"locale":"en","title":"English"},{"locale":"fr","title":"Français"}]'``

Similarly, when loading the "languages.zip" file, we'll add the unblock key:

``curl "http://localhost:8080/api/admin/datasetfield/loadpropertyfiles?unblock-key=unblockme" -X POST --upload-file /tmp/languages/languages.zip -H "Content-Type: application/zip"``

Stop and start the Dataverse container in order for the language toggle to work.

Note that ``dataverse.lang.directory=/dv/lang`` has already been configured for you in the ``compose.yml`` file. The step where you loaded "languages.zip" should have populated the ``/dv/lang`` directory with files ending in ".properties".
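
To confirm that the ``:Languages`` setting was stored before restarting the container, the setting can be read back. A minimal sketch, assuming the same ``localhost:8080`` address and ``unblockme`` key used in the commands above; ``show_languages_setting`` is a hypothetical helper name.

```shell
# Assumptions: Dataverse on localhost:8080 and the demo unblock key "unblockme" used above.
show_languages_setting() {
  curl -sS "http://localhost:8080/api/admin/settings/:Languages?unblock-key=unblockme"
}
```

The response should echo the JSON array of locales that was sent with the ``PUT`` request.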

Next Steps
----------

8 changes: 4 additions & 4 deletions doc/sphinx-guides/source/developers/version-control.rst
@@ -291,16 +291,16 @@ By default, when a pull request is made from a fork, "Allow edits from maintaine

This is a nice feature of GitHub because it means that the core dev team for the Dataverse Project can make small (or even large) changes to a pull request from a contributor to help the pull request along on its way to QA and being merged.

GitHub documents how to make changes to a fork at https://help.github.com/articles/committing-changes-to-a-pull-request-branch-created-from-a-fork/ but as of this writing the steps involve making a new clone of the repo. This works but you might find it more convenient to add a "remote" to your existing clone. The example below uses the fork at https://github.com/OdumInstitute/dataverse and the branch ``4709-postgresql_96`` but the technique can be applied to any fork and branch:
GitHub documents how to make changes to a fork at https://help.github.com/articles/committing-changes-to-a-pull-request-branch-created-from-a-fork/ but as of this writing the steps involve making a new clone of the repo. This works but you might find it more convenient to add a "remote" to your existing clone. The example below uses the fork at https://github.com/uncch-rdmc/dataverse and the branch ``4709-postgresql_96`` but the technique can be applied to any fork and branch:

.. code-block:: bash

  git remote add OdumInstitute git@github.com:OdumInstitute/dataverse.git
  git fetch OdumInstitute
  git remote add uncch-rdmc git@github.com:uncch-rdmc/dataverse.git
  git fetch uncch-rdmc
  git checkout 4709-postgresql_96
  vim path/to/file.txt
  git commit
  git push OdumInstitute 4709-postgresql_96
  git push uncch-rdmc 4709-postgresql_96

.. _develop-into-develop:

2 changes: 1 addition & 1 deletion doc/sphinx-guides/source/installation/config.rst
@@ -1783,7 +1783,7 @@ Now that you have a "languages.zip" file, you can load it into your Dataverse in

``curl http://localhost:8080/api/admin/datasetfield/loadpropertyfiles -X POST --upload-file /tmp/languages/languages.zip -H "Content-Type: application/zip"``

Click on the languages using the drop down in the header to try them out.
Stop and start Payara and then click on the languages using the drop down in the header to try them out.

.. _help-translate:

2 changes: 2 additions & 0 deletions doc/sphinx-guides/source/qa/testing-infrastructure.md
@@ -31,6 +31,8 @@ To build and test a PR, we use a job called `IQSS_Dataverse_Internal` on <https:

1. If for some reason it didn't deploy, check the server.log file. It may just be a caching issue so try un-deploying, deleting cache, restarting, and re-deploying on the server (`su - dataverse` then `/usr/local/payara6/bin/asadmin list-applications; /usr/local/payara6/bin/asadmin undeploy dataverse-6.1; /usr/local/payara6/bin/asadmin deploy /tmp/dataverse-6.1.war`)

1. When a Jenkins job fails after a release, it might be due to the version number in the `pom.xml` file not being updated in the pull request (PR). To verify this, open the relevant GitHub issue, navigate to the PR branch, and go to `dataverse > modules > dataverse-parent > pom.xml`. Look for the version number, typically shown as `<revision>6.3</revision>`, and ensure it matches the current Dataverse build version. If it doesn't match, ask the developer to update the branch with the latest from the "develop" branch.

1. If that didn't work, you may have run into a Flyway DB script collision error but that should be indicated by the server.log. See {doc}`/developers/sql-upgrade-scripts` in the Developer Guide. In the case of a collision, ask the developer to rename the script.

1. Assuming the above steps worked, and they should 99% of the time, test away! Note: be sure to `tail -F server.log` in a terminal window while you are doing any testing. This way you can spot problems that may not appear in the UI and have easier access to any stack traces for easier reporting.
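
The `pom.xml` version check described in the list above can also be done from a local checkout of the PR branch. A minimal sketch, assuming the parent POM lives at `modules/dataverse-parent/pom.xml` as described and that its first `<revision>` element holds the version; `pom_revision` is a hypothetical helper name.

```shell
# Hypothetical helper: print the first <revision> value found in the given pom.xml.
pom_revision() {
  sed -n 's:.*<revision>\(.*\)</revision>.*:\1:p' "$1" | head -n 1
}

# Example use from the repository root (path is an assumption based on the step above):
# pom_revision modules/dataverse-parent/pom.xml
```

If the printed value does not match the current Dataverse build version, the branch likely needs the latest changes from "develop".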
2 changes: 1 addition & 1 deletion doc/sphinx-guides/source/style/text.rst
@@ -9,4 +9,4 @@ Here we describe the guidelines that help us provide helpful, clear and consiste
Metadata Text Guidelines
========================

These guidelines are maintained in `a Google Doc <https://docs.google.com/document/d/1uRk_dAZlaCS91YFbqE6L9Jwhwum7mOadkJ59XUx40Sg>`__ as we expect to make frequent changes to them. We welcome comments in the Google Doc.
These guidelines are maintained in `a Google Doc <https://docs.google.com/document/d/1tY5t3gjrIgAGoRxVMWQSCh46fnbSmnFDLQ7aLkNLhJ8/>`__ as we expect to make frequent changes to them. We welcome comments in the Google Doc.