diff --git a/CHANGELOG.md b/CHANGELOG.md
index e192c789e..7c05a1f39 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,5 +1,74 @@
 # Changelog
 
+## [5.2.0](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/compare/v5.1.0...v5.2.0) (2022-11-01)
+
+
+### Datasets
+
+* Add geom columns for thelook_ecommerce dataset ([#307](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/307)) ([f39a177](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/f39a177dafd64a2dfed941bfb078ef24cece4b26))
+* Add Municipal Calendar to San Francisco Dataset ([#480](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/480)) ([a21c2ef](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/a21c2efb507d08c794f65572ea16dcadc5124350))
+* Add PM25_FRM_DAILY_SUMMARY Pipeline To Epa_Historical_Air_Quality Dataset ([#518](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/518)) ([4f66c05](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/4f66c05c28550b755f68afb05e54006a503644d7))
+* Add Storms Database to Noaa Dataset ([#498](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/498)) ([8d02866](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/8d02866dd3d89a72054f60a0473674c2b6d98e82))
+* Adding a tutorial for the Iowa Liquor dataset ([#419](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/419)) ([b619b71](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/b619b711fb5ab9fd474a12f26876f1b44981078d))
+* Adding New Pipelines To San Francisco Dataset. ([#487](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/487)) ([58cda71](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/58cda718e0404e614a38d3134753491d98d22900))
+* Extract the tabular metadata for Cloud Datasets program ([#452](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/452)) ([1a3d59e](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/1a3d59e4e9b62e0fde53c57be0c0034fbdeb7f92))
+* Launch AFDB v4 dataset ([#522](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/522)) ([c6664a7](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/c6664a78ed428bfa5df2e0c30bd95fb5065f2751))
+* Migrate the dataset Covid19 Italy from Xenon ([#488](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/488)) ([1ca6bd6](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/1ca6bd64b2bf5f55d2940bb1deb37cae0a4f2970))
+* Migrate the World Bank datasets x 3 from Xenon ([#506](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/506)) ([65295d0](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/65295d046dbd3db11a61368bfb8dedb7c8f04ee5))
+* Migrate the Xenon World Bank WDI dataset ([#482](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/482)) ([35457a9](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/35457a9e33a929cf2949645693d17dc82340215e))
+* onboard chembl-30 dataset ([#467](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/467)) ([ef9c57b](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/ef9c57bb8b87f5228915bad6a55256812e576935))
+* Onboard COVID-19 Genome Sequence dataset ([#460](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/460)) ([0b7828f](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/0b7828fb5643499ab882e56f82e64d9670dcdbba))
+* Onboard dataset Open Buildings ([#453](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/453)) ([739b6cf](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/739b6cf3a13c06efd588e6cb818a6b21feec1b8e))
+* Onboard EBI CHemBL Previous Data dataset ([#470](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/470)) ([63b4012](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/63b4012f6cc2a7b9beaaf907f2a7644d9eaa23f9))
+* Onboard FDIC dataset ([#495](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/495)) ([e20e157](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/e20e15798930e08a2b291a9ecc310de413f3e53d))
+* Onboard Fec dataset ([#485](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/485)) ([2da413e](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/2da413e779a90fc1c16c60772d380fb332c04ae1))
+* Onboard Human Variant Annotation dataset ([#438](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/438)) ([ebfe4de](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/ebfe4def2a2f01ae4adcbcc28bbbb0a0af2b40b1))
+* Onboard IDC v10 dataset ([#433](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/433)) ([c2ffc77](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/c2ffc77d0a71fa8a8b78223f83c9d3cd3f2ca2b7))
+* onboard irs 990 ein dataset ([#481](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/481)) ([65544a2](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/65544a2baac5c706e351f9b9c73d8f821ef63880))
+* Onboard MERFISH Mouse Brain Receptor Map dataset ([#457](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/457)) ([4333fca](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/4333fca7aea2343ff8ed53c7436e477544a27e39))
+* Onboard Multilingual Spoken Words Corpus - MLCommons Association dataset ([#461](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/461)) ([22cc27c](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/22cc27c01f6b2bc2d456733813ed047bf44b1294))
+* Onboard New Fec dataset ([#486](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/486)) ([6ee1fa3](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/6ee1fa3babd54a5914943b279efc29721835c598))
+* Onboard New FEC dataset ([#513](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/513)) ([e770220](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/e77022031aefdb5caed962be90491abe1a54739a))
+* Onboard NHTSA Traffic Fatalities dataset ([#454](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/454)) ([eb409c4](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/eb409c4aa5eef92b09ed0c2ba85e5bea6e31ba3b))
+* Onboard NOAA Passive Bioacoustic dataset ([#471](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/471)) ([2ecd9ea](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/2ecd9ea99c051418464407947f0eab5fa2fb9586))
+* Onboard Uniref50 dataset ([#443](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/443)) ([dbf2300](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/dbf23007d1033613a3fc0e4026f00cc15eac5e98))
+* Onboard Uniref50 dataset ([#473](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/473)) ([b44d572](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/b44d5724fa2778c339f8437a03cb77271568e83f))
+* YAML custom tag for interpolating GCR image URLs ([#372](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/372)) ([ef901e5](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/ef901e5a511d7a38167af298a06a342528bae7b3))
+
+
+### Bug Fixes
+
+* Added "is_public" to cloud_datasets.tabular_datasets table ([#501](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/501)) ([802cff6](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/802cff6d564bd42defa8a7b795e904a993032fc3))
+* Added Airport Fee To Schema Files And Pipeline.Yaml In New York Taxi Trips Dataset ([#476](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/476)) ([d94105a](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/d94105a83cd89064f60538275d72e4e1de310ff6))
+* Adds BRL currency in Google Political Ads ([#469](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/469)) ([edd3654](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/edd3654c8501901fb3918feeea0484620c077ca2))
+* AlphaFold dataset - add accession_ids.csv to the bucket ([#451](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/451)) ([cacd9f1](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/cacd9f1fb7b6c9bec982f2eac17ece1505593694))
+* Change Destination Dataset in Noaa Pipelines ([#479](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/479)) ([c7c047c](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/c7c047c73ffb090ec95263c0171e2fde21aec634))
+* City Health Dashboard Schema Changes ([#515](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/515)) ([1bdb0dd](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/1bdb0ddeddc043e0d411475011e357c2816208eb))
+* deleting pod error ([#511](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/511)) ([77fe479](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/77fe4797b40cbdf9947c9250ae9b361709dfb358))
+* Fixing the forecasting issue in the notebook. ([#472](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/472)) ([de7f1fa](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/de7f1fada921423388c1593e0225b7ec9ebd84e0))
+* For COVID-19 Italy, resolve bucket variable in pipeline.yaml ([#509](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/509)) ([1f913ac](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/1f913acaf689f7820f16f4a90ac7b8cd61a31a57))
+* For FDA Food Enforcement, Resolve invalid source DateTime data. ([#508](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/508)) ([f4b5a52](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/f4b5a5245871d9bb445843b9f25e16c5b9eef8ae))
+* Increase number of years to back date to 2009 in New York Taxi Trips Dataset ([#445](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/445)) ([a9c5998](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/a9c599821c82dfe06e1e1d0637613a3a6e89ba9b))
+* Modified Resources for Kpod Operator ([#521](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/521)) ([e715154](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/e715154090ec47669028b81b919f5cd21c2f3159))
+* Remove GKE cluster operator for dataset Census Opportunity Atlas ([#458](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/458)) ([9ecfbc4](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/9ecfbc4630724bc4b01d675243d0ab9dba838dc8))
+* Removed create cluster process ([#517](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/517)) ([d36e6d4](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/d36e6d46c58bc1fa79517133a7f0acadc17ca511))
+* Resolve cluster name mismatch in pipeline.yaml ([#439](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/439)) ([3e8d20d](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/3e8d20dd90f29a8daeaa7d2b34af8360fe383759))
+* Resolve cluster name mismatch in pipeline.yaml ([#440](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/440)) ([d2658f6](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/d2658f6b8d22726a021885d911033e2195aa9e1d))
+* Resolve DateTime Issues In FEC Dataset ([#514](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/514)) ([014465b](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/014465bdc3c95568550d15e1913766e1af4ee586))
+* Resolve failure in production for the dataset Open Buildings ([#468](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/468)) ([9a22d5f](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/9a22d5f0f48ee458d7dcad3bc128e9e2bf88d08a))
+* Resolve Failures In New york Pipeline And Merge To One Image ([#516](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/516)) ([7d21778](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/7d21778b3968795f4f545c9cada8d0762b4f1a58))
+* Resolve Issue With Name Node Corruption In New york Dataset ([#459](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/459)) ([59e3aed](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/59e3aeda633698c97588195bc4daf87523e00be8))
+* Resolve null column for csv output and changed copyright year ([#466](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/466)) ([00e636e](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/00e636e440167d4932d982437367f3c50c6c0911))
+* Resolve production issue for Iowa Liquor Sales dataset ([#520](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/520)) ([cf2b460](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/cf2b460bbf7ad20cf2bc64564b4bbf5d07c114e4))
+* Resolve reference to hard coded bucket. ([#477](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/477)) ([039ff61](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/039ff616b77fd4558e9daa139526be83ca94e6ff))
+* Resolve San Francisco Pipeline Yaml Variable Assignment Issue ([#489](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/489)) ([2d34cf9](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/2d34cf900e0906d400c9acdd46c6d17b48a6f487))
+* Resolve source file location and format issue in the New York Taxi Trips dataset ([#441](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/441)) ([13a829f](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/13a829f34a1f1e9242285e7d311ee558f78390da))
+* Resolve Typo Issue In EPA Historical Air Quality Pipeline.yaml ([#519](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/519)) ([c54836d](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/c54836ddccb1645b92a6d064017a7d9f7cb88718))
+* Resolve variables. ([#464](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/464)) ([3c34e7e](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/3c34e7ee96b1723d2d4ba6329a1362cf75598e23))
+* Resolved reference to destination bucket causing failure in production. ([#507](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/507)) ([69128bc](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/69128bc34c74acbd694b16d931dd9a516ced9b01))
+* set gnomAD pipeline to run daily ([#510](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/510)) ([5f50601](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/5f50601536f084270734e7ac5c044fc1a8a65902))
+* Update project parameters for COVID-19 Genome Sequence dataset ([#462](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/issues/462)) ([78d55d9](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/commit/78d55d9bd8c39d872a19c1234c2d7b0ee7b926a6))
+
 ## [5.1.0](https://github.com/GoogleCloudPlatform/public-datasets-pipelines/compare/v5.0.0...v5.1.0) (2022-07-30)
 
 
diff --git a/version.txt b/version.txt
index 831446cbd..91ff57278 100644
--- a/version.txt
+++ b/version.txt
@@ -1 +1 @@
-5.1.0
+5.2.0