updating to reflect new approach to publishing
blue442 committed Mar 18, 2024
1 parent fbdf3ff commit c03fd76
Showing 1 changed file with 74 additions and 24 deletions.
98 changes: 74 additions & 24 deletions examples/publishing-guides/dataset_publishing.ipynb
Expand Up @@ -156,7 +156,7 @@
"id": "eA1nPvoZe68H"
},
"source": [
"This section describes and defines the variables for all of the possible arguments you could pass to `f.publish()`, for illustrative purposes."
"This section describes and defines the variables for all of the elements needed to construct a FoundryDataset object and publish it."
]
},
{
Expand Down Expand Up @@ -263,6 +263,34 @@
"The DataCite Metadata Schema is a list of core metadata properties chosen for an accurate and consistent identification of a resource for citation and retrieval purposes. More information about this schema and the larger DataCite project can be found at https://datacite.org/"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"example_iris_datacite = {'identifier': {'identifier': '10.xx/xx', 'identifierType': 'DOI'},\n",
" 'rightsList': [{'rights': 'CC-BY 4.0'}],\n",
" 'creators': [{'creatorName': 'Brown, C', 'familyName': 'Brown', 'givenName': 'Charles'},\n",
" {'creatorName': 'Van Pelt, L', 'familyName': 'Van Pelt', 'givenName': 'Lucia'}],\n",
" 'subjects': [{'subject': 'blockheads'},\n",
" {'subject': 'foundry'},\n",
" {'subject': 'test_data'}],\n",
" 'publicationYear': 2024,\n",
" 'publisher': 'Materials Data Facility',\n",
" 'dates': [{'date': '2024-08-03', 'dateType': 'Accepted'}],\n",
" 'titles': [{'title': \"You're a Good man, Charlie Brown\"}],\n",
" 'resourceType': {'resourceTypeGeneral': 'Dataset', \n",
" 'resourceType': 'Dataset'}}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Creating a FoundryDataset object"
]
},
{
"cell_type": "code",
"execution_count": 3,
Expand All @@ -271,6 +299,12 @@
},
"outputs": [],
"source": [
"\"\"\"\n",
"This is the deprecated way of adding all of the datacite information via kwargs to the foundry.publish() method.\n",
"Keeping it around for now as we might want to create a datacite_generator function that can create a datacite json\n",
"object from kwargs, so folks don't have to mess with json formatting.\n",
"\"\"\"\n",
"\n",
"from datetime import datetime\n",
"timestamp = datetime.now().timestamp()\n",
"\n",
Expand Down Expand Up @@ -301,43 +335,40 @@
"publication_year = 2023"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now that the metadata and DataCite information are contained in the JSON objects we created above, we can create an instance of a FoundryDataset object. This serves as a container to hold and organize all of the data as well as the metadata for the dataset. We need just one more piece of information: a `dataset_name` to associate with the dataset."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"{'identifier': {'identifier': '10.xx/xx', 'identifierType': 'DOI'},\n",
" 'rightsList': [{'rights': 'CC-BY 4.0'}],\n",
" 'creators': [{'creatorName': 'Brown, C', 'familyName': 'Brown', 'givenName': 'Charles'},\n",
" {'creatorName': 'Van Pelt, L', 'familyName': 'Van Pelt', 'givenName': 'Lucia'}],\n",
" 'subjects': [{'subject': 'blockheads'},\n",
" {'subject': 'foundry'},\n",
" {'subject': 'test_data'}],\n",
" 'publicationYear': 2024,\n",
" 'publisher': 'Materials Data Facility',\n",
" 'dates': [{'date': '2024-08-03', 'dateType': 'Accepted'}],\n",
" 'titles': [{'title': \"You're a Good man, Charlie Brown\"}],\n",
" 'resourceType': {'resourceTypeGeneral': 'Dataset', \n",
" 'resourceType': 'Dataset'}}"
"dataset_name = 'charlies_iris_dataset'"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "3YNK1e5UfTaN"
},
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"We won't use all of these variables in our call to `f.publish()`, because many of the default values for the parameters (such as \"MDF\" for `publisher`) work well for our use case. \n",
"\n",
"However, the **metadata**, **data path** (HTTPS) or **data source** (Globus Connect Client), **title**, and **authors** are all required."
"from foundry import FoundryDataset"
]
},
{
"cell_type": "markdown",
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"Note that instead of `https_data_path`, you'll want to specify `globus_data_source` if you are uploading data using Globus Connect Client instead of HTTPS (see _Uploading via Globus Connect Client_ at the end of this notebook)."
"iris_dataset = FoundryDataset(dataset_name, \n",
" example_iris_metadata, \n",
" example_iris_datacite)"
]
},
{
Expand All @@ -349,6 +380,24 @@
"## Publishing to Foundry"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "3YNK1e5UfTaN"
},
"source": [
"We won't use all of these variables in our call to `f.publish()`, because many of the default values for the parameters (such as \"MDF\" for `publisher`) work well for our use case. \n",
"\n",
"However, the **metadata**, **data path** (HTTPS) or **data source** (Globus Connect Client), **title**, and **authors** are all required."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that instead of `https_data_path`, you'll want to specify `globus_data_source` if you are uploading data using Globus Connect Client instead of HTTPS (see _Uploading via Globus Connect Client_ at the end of this notebook)."
]
},
{
"cell_type": "code",
"execution_count": null,
Expand All @@ -358,7 +407,8 @@
"outputs": [],
"source": [
"# publish to Foundry! returns a result object we can inspect\n",
"res = f.publish_dataset(example_iris_metadata, title, authors, https_data_path=data_path, short_name=short_name)"
"res = f.publish_dataset(iris_dataset, \n",
" https_data_path=data_path)"
]
},
{
Expand Down
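
The docstring added to the deprecated kwargs cell mentions a possible `datacite_generator` function that would build the datacite JSON object from kwargs. A minimal sketch of what such a helper might look like follows. It is not part of this commit or of the foundry package: the function name is taken from the docstring, but its signature, defaults, and the `(family name, given name)` tuple convention for creators are assumptions made for illustration. The output keys mirror those in `example_iris_datacite`.

```python
def datacite_generator(title, creators, publication_year,
                       publisher="Materials Data Facility",
                       subjects=(), rights=None, doi=None,
                       resource_type="Dataset"):
    """Hypothetical helper: build a DataCite-style dict from keyword arguments.

    `creators` is assumed to be a list of (family_name, given_name) tuples
    with non-empty given names.
    """
    datacite = {
        "titles": [{"title": title}],
        "creators": [
            {
                # Same "Family, Initial" form as the example above
                "creatorName": f"{family}, {given[0]}",
                "familyName": family,
                "givenName": given,
            }
            for family, given in creators
        ],
        "publicationYear": publication_year,
        "publisher": publisher,
        "resourceType": {"resourceTypeGeneral": "Dataset",
                         "resourceType": resource_type},
    }
    # Optional fields are added only when a value is supplied
    if subjects:
        datacite["subjects"] = [{"subject": s} for s in subjects]
    if rights:
        datacite["rightsList"] = [{"rights": rights}]
    if doi:
        datacite["identifier"] = {"identifier": doi, "identifierType": "DOI"}
    return datacite


generated_datacite = datacite_generator(
    title="You're a Good man, Charlie Brown",
    creators=[("Brown", "Charles"), ("Van Pelt", "Lucia")],
    publication_year=2024,
    subjects=["blockheads", "foundry", "test_data"],
    rights="CC-BY 4.0",
    doi="10.xx/xx",
)
```

With these inputs the result matches `example_iris_datacite` except for the `dates` entry, which this sketch leaves out.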
