From b560ad240b5655fb31c36f106e880094d6047776 Mon Sep 17 00:00:00 2001 From: Vincent Arel-Bundock Date: Mon, 13 Nov 2023 16:46:21 -0500 Subject: [PATCH] docsify --- .Rbuildignore | 1 + docs/NEWS.md | 203 ------------ docs/README.md | 94 ------ docs/articles/contributions.md | 94 ------ docs/articles/contributions.qmd | 62 ---- docs/articles/countrycode.md | 376 --------------------- docs/articles/countrycode.qmd | 211 ------------ docs/articles/countryname.md | 26 -- docs/articles/countryname.qmd | 13 - docs/articles/custom.md | 68 ---- docs/articles/custom.qmd | 53 --- docs/index.html | 69 ---- docs/man/cldr_examples.md | 28 -- docs/man/codelist.md | 376 --------------------- docs/man/codelist_panel.md | 17 - docs/man/countrycode-package.md | 30 -- docs/man/countrycode.md | 237 -------------- docs/man/countryname.md | 90 ----- docs/man/countryname_dict.md | 14 - docs/man/get_dictionary.md | 47 --- docs/man/guess_field.md | 67 ---- docs/reference.md | 564 -------------------------------- vignettes/contributions.qmd | 2 + vignettes/custom.qmd | 2 + 24 files changed, 5 insertions(+), 2739 deletions(-) delete mode 100644 docs/NEWS.md delete mode 100644 docs/README.md delete mode 100644 docs/articles/contributions.md delete mode 100644 docs/articles/contributions.qmd delete mode 100644 docs/articles/countrycode.md delete mode 100644 docs/articles/countrycode.qmd delete mode 100644 docs/articles/countryname.md delete mode 100644 docs/articles/countryname.qmd delete mode 100644 docs/articles/custom.md delete mode 100644 docs/articles/custom.qmd delete mode 100644 docs/index.html delete mode 100644 docs/man/cldr_examples.md delete mode 100644 docs/man/codelist.md delete mode 100644 docs/man/codelist_panel.md delete mode 100644 docs/man/countrycode-package.md delete mode 100644 docs/man/countrycode.md delete mode 100644 docs/man/countryname.md delete mode 100644 docs/man/countryname_dict.md delete mode 100644 docs/man/get_dictionary.md delete mode 100644 
docs/man/guess_field.md delete mode 100644 docs/reference.md diff --git a/.Rbuildignore b/.Rbuildignore index d49b65e..16b7214 100644 --- a/.Rbuildignore +++ b/.Rbuildignore @@ -22,3 +22,4 @@ revdep* README.Rmd ^altdoc$ ^docs$ +^vignettes/$ diff --git a/docs/NEWS.md b/docs/NEWS.md deleted file mode 100644 index cf28cea..0000000 --- a/docs/NEWS.md +++ /dev/null @@ -1,203 +0,0 @@ -# countrycode 1.5.0.9000 - -- Important speed-up for detection of country names using regular expressions (Thanks to Etienne Bacher). -- `countryname` gets the `nomatch` argument. -- `countryname` returns NA when the code does not support a given country. (Issue [#336](https://github.com/vincentarelbundock/countrycode/issues/336)) -- Improved regex for Italy - -## countrycode 1.5.0 - -- `get_dictionary()` function to download custom dictionaries (cross-walks): - - US States, Swiss Cantons, Global Burden of Disease, ExioBase, GTAP. -- New codes: Polity V "p5c", "p5n" -- New code "unhcr\_region". Thanks to [@galalH](https://github.com/galalH) for code contribution [#329](https://github.com/vincentarelbundock/countrycode/issues/329) -- Many regex improvements -- Several minor bug fixes - -## countrycode 1.4.0 - -- Detect French country names using regular expressions: `origin = "country.name.fr"` (Thanks to Samuel Meichtry) -- Detect Italian country names using regular expressions: `origin = "country.name.it"` (Thanks to Samuel Meichtry) - -## countrycode 1.3.1 - -- New code: unhcr - -## countrycode 1.3.0 - -- destination argument accepts a vector of strings and tries one after the other -- countryname(warn=TRUE) by default -- better class checks -- countryname defaults to `country.name.en` for missing country names (nomatch=NULL) -- Vietnam: better regex and support for vdem -- Namibia fixes: eurostat, genc2c, wb\_api2c, ecb -- Various regex improvements -- Congo French disambiguation - -## countrycode 1.2.0 - -- New 'countryname' function converts country names from any language (thanks to 
[@davidsjoberg](https://github.com/davidsjoberg)) -- New `guess_field` function guesses which code a vector uses -- Bug in dict build inserted NA in region variable (Thanks to M. Pascariu) -- Added region23 with the old, more granular regions -- Added unicode.symbol, which converts to emoji flags -- Added ISO 4217 currency name, alpha, and numeric codes -- Added UN region codes and names -- Added IANA ccTLD codes -- Improved various regexes - -## countrycode 1.1.3 - -- Added Demographic and Health Surveys (thanks to [@mcooper](https://github.com/mcooper)) - -## countrycode 1.1.2 - -- Updated World Bank regions with manual additions - -## countrycode 1.1.1 - -- Bug: Typo prevented users for using "p4n" as origin code -- Fixed bad icao.region codes (Thanks to [@espinielli](https://github.com/espinielli)) -- Added country name "United Arab Republic" and its regex (Thanks to Gina Reynolds) -- Added SOM to wb code (Thanks to Fabian Besche) -- Added Vietnam to codelist\_panel - -## countrycode 1.1.0 - -- Gleditsch and Ward codes (Thanks to Altaf Ali) -- V-Dem 8 country codes (panel and cross-section) -- Fixed Netherlands Antilles test (ANT code retired by ISO) -- codelist\_panel now excludes years where a country didn't exist -- Scraping function for UN M49 codes. (Thanks to [@cjyetman](https://github.com/cjyetman) and [@emilBeBri](https://github.com/emilBeBri)) -- `nomatch = NULL` now works as expected when sourcvar is a factor ([#192](https://github.com/vincentarelbundock/countrycode/issues/192) thanks to [@jhuovari](https://github.com/jhuovari) for reporting) - -## countrycode 1.0.0 - -- Huge thanks to [@cjyetman](https://github.com/cjyetman) for his incredible work on this major release! 
-- Country-Year (panel) conversion dictionary -- Dictionary built from original sources -- Liechtenstein should not be in eu28 -- Russia eurocontrol region fix -- CLRD country names - -## countrycode 0.19.1 - -- Move to Semantic Versioning 2.0.0 - [http://semver.org/#semantic-versioning-specification-semver](http://semver.org/#semantic-versioning-specification-semver) -- Fixed North Korea regex and added tests -- Fixed Sudan iso3n code -- Removed lookbehind from Ireland regex for javascript compatibility (request by plotly) -- Added nomatch argument - -## countrycode 0.19 - -New features - -- "custom\_dict" argument allows user-supplied dictionary data.frames -- "custom\_match" argument allows a user-supplied named vector of custom - origin->destination matches that will supercede any matching values in the - default result (issue [#107](https://github.com/vincentarelbundock/countrycode/issues/107)) (Thanks to [@cjyetman](https://github.com/cjyetman)) -- German, French, Spanish, Russian, Chinese, and Arabic country names as destination codes -- German regular expression to convert from German names to codes. (Thanks to [@sumtxt](https://github.com/sumtxt)) -- Aviation codes (Thanks to Enrico Spinielli) -- ar5 and eu28 (Thanks to Niklas Roming) -- eurostat (Thanks to [@cjyetman](https://github.com/cjyetman)) -- 2 and 3 character codes for the World Bank API: wb\_api2c and wb\_api3c (Thanks to [@cjyetman](https://github.com/cjyetman)) -- alpha and numeric codes for Polity IV: p4\_scode and p4\_ccode (Thanks to [@cjyetman](https://github.com/cjyetman)) -- World Values Survey numeric code (Thanks to [@cjyetman](https://github.com/cjyetman)) - -Regex fixes and improvements: - -- Improved regex for Ireland and United States of America (Thanks to [@cjyetman](https://github.com/cjyetman)) -- D.R. 
Congo (found in WVS) matches Democratic Republic of the Congo (Thanks to [@cjyetman](https://github.com/cjyetman)) -- Southern Africa -- Federated States of Micronesia -- Republic of China == Taiwan (Thanks to Nils Enevoldsen) -- Martinique (Thanks to Martyn Plummer) -- Tahiti country name string converts to French Polynesia - -Misc: - -- Major speed-up in regex conversion by using factors (Thanks to [@cjyetman](https://github.com/cjyetman)) -- when more than one match is found for a given string, is returned rather - than arbitrarily choosing the last match found (Thanks to [@cjyetman](https://github.com/cjyetman)) -- updated tests to new testthat convention (Thanks to [@cjyetman](https://github.com/cjyetman)) -- English country names are now official UN versions -- Better docs, examples, and README -- Taiwan FAO code is 214 (Thanks to Matthieu Stigler) - -## countrycode 0.18 - -- Nils Enevoldsen did wonderful work refactoring most of the regex in the dictionary. -- Nils also added a bunch of tests. Thanks! 
-- Added Tokelau - -## countrycode 0.17 - -- Added International Olympic Committee codes (Thanks to Devon Meunier) -- Bug: fips04 -> fips104 (Thanks to Florian Hollenbach) -- Complete FIPS104 codes (Thanks to Andy Halterman) -- Generic code name validity check (Thanks to Stefan Zeugner) -- Fixed IMF codes (Thanks to Stefan Zeugner) -- Regex fix to work better with Database of Political Insitutions (Thanks to Christopher Gandrud) -- Avoids confusion with Eq Guinea (Thanks to Christopher Gandrud) - -## countrycode 0.16 - -- Bug: NA cowc -> ABW (Thanks to Jon Mellon) - -## countrycode 0.15 - -- Regex fixes - - Guinea - - West Bank - - Kitts / Christopher - - Georgia / India - - Mali - - Sudan nigeria - - Belgium - - Korea Somalia - - Oman - -## countrycode 0.14 - -- sint maarten typo - -## countrycode 0.13 - -- add sint maartin \& curacao (thanks johnb30) - -## countrycode 0.12 - -- Missing wb codes filled-in using iso3c -- Added South Sudan -- Thanks to Rod Alence! - -## countrycode 0.11 - -- Vietnam cown -- Regexes: - - Dominica / Dominican Republic - - New Zealand / Aland - -## countrycode 0.10 - -- De-duplicate Sudan -- Niger vs. Nigeria regex - -## countrycode 0.9 - -- Fixed regexes: Mali, Korea, Oman, Dominica - -## countrycode 0.8 - -- Added World Bank (wb) country codes. Very similar, but slightly different from iso3c. 
- -## countrycode 0.7 - -- Removed useless functions countrycode.nomatch and countryframe -- Fixed 2 Congo-related problems -- Added option for countrycode() to report codes for which no match was found -- Moved documentation to roxygen2 -- Fixed Trinidad Tobago regex -- Added UN and FAO country codes diff --git a/docs/README.md b/docs/README.md deleted file mode 100644 index 22ced53..0000000 --- a/docs/README.md +++ /dev/null @@ -1,94 +0,0 @@ -# countrycode - - - - - -[![DOI](http://joss.theoj.org/papers/10.21105/joss.00848/status.svg)](https://doi.org/10.21105/joss.00848) -[![AppVeyor build -status](https://ci.appveyor.com/api/projects/status/github/vincentarelbundock/countrycode?branch=master&svg=true)](https://ci.appveyor.com/project/vincentarelbundock/countrycode) -[![R build -status](https://github.com/vincentarelbundock/countrycode/workflows/R-CMD-check/badge.svg)](https://github.com/vincentarelbundock/countrycode/actions) -![CRAN -downloads](http://cranlogs.r-pkg.org/badges/grand-total/countrycode) - - - -`countrycode` standardizes country names, converts them into ~40 -different coding schemes, and assigns region descriptors. Scroll down -for more details or visit the [countrycode CRAN -page](http://cran.r-project.org/web/packages/countrycode/index.html) - -If you use `countrycode` in your research, we would be very grateful if -you could cite our paper: - -> Arel-Bundock, Vincent, Nils Enevoldsen, and CJ Yetman, (2018). -> countrycode: An R package to convert country names and country codes. -> Journal of Open Source Software, 3(28), 848, -> [https://doi.org/10.21105/joss.00848](https://doi.org/10.21105/joss.00848) - -## Why `countrycode`? - -### The Problem - -Different data sources use different coding schemes to represent -countries (e.g. CoW or ISO). 
This poses two main problems: (1) some of -these coding schemes are less than intuitive, and (2) merging these data -requires converting from one coding scheme to another, or from long -country names to a coding scheme. - -### The Solution - -The `countrycode` function can convert to and from 40+ different country -coding schemes, and to 600+ variants of country names in different -languages and formats. It uses regular expressions to convert long -country names (e.g. Sri Lanka) into any of those coding schemes or -country names. It can create new variables with various regional -groupings. - -## Installation - -From the R console, type: - -```r -install.packages("countrycode") -``` - -To install the latest development version, you can use the `remotes` -package: - -```r -library(remotes) -install_github('vincentarelbundock/countrycode') -``` - -## Supported codes - -To get an up-to-date list of supported country codes, install the -package and type `?codelist`. These include: - -- 600+ variants of country names in different languages and formats. -- AR5 -- Continent and region identifiers. 
-- Correlates of War (numeric and character) -- European Central Bank -- [EUROCONTROL](https://www.eurocontrol.int) - The European Organisation - for the Safety of Air Navigation -- Eurostat -- Federal Information Processing Standard (FIPS) -- Food and Agriculture Organization of the United Nations -- Global Administrative Unit Layers (GAUL) -- Geopolitical Entities, Names and Codes (GENC) -- Gleditsch \& Ward (numeric and character) -- International Civil Aviation Organization -- International Monetary Fund -- International Olympic Committee -- ISO (2/3-character and numeric) -- Polity IV -- United Nations -- United Nations Procurement Division -- Varieties of Democracy -- World Bank -- World Values Survey -- Unicode symbols (flags) - diff --git a/docs/articles/contributions.md b/docs/articles/contributions.md deleted file mode 100644 index 6d4c937..0000000 --- a/docs/articles/contributions.md +++ /dev/null @@ -1,94 +0,0 @@ - -# Contributions - -## Adding a new code - -New country codes are created by two files: - -1. `dictionary/get_*.R` is an `R` script which can scrape the code from - an original online source (e.g., `get_world_bank.R`). This scripts - only side effect is that it writes a CSV file to the `dictionary` - folder. -2. `dictionary/data_*.csv` is a CSV file with 1 column called - `country`, which includes the English country name, and 1 or more - columns named after the codes you want to add (e.g., `iso3c`, - `un.name.en`, `continent`). - -After creating those two files, you should: - -- Run `dictionary/build.R` -- If the code is a valid origin code (i.e., no two countries share the - same code), add it to the `valid_origin` vector in `R/countrycode.R` -- Add the new code name to the documentation in `R/codelist.R` -- Build the documentation using the devtools package: - `devtools::document()` -- Add a bullet point to `NEWS.md` file. 
- -If you need help with any of these steps, or if you just want to submit -a CSV file, feel free to open an issue on Github or write an email to -Vincent. I’ll be happy to help you out! - -## Custom dictionaries - -The `countrycode` repository holds several custom dictionaries: -https://github.com/vincentarelbundock/countrycode/tree/master/data/custom_dictionaries - -To add your own custom dictionary, please make sure that: - -1. You save a comma-separated CSV file that looks something like - data/custom_dictionaries/data_us_states.csv -2. The custom dictionary has a unique purpose (not overlapping with - existing custom dictionaries) -3. It uses UTF-8 encoding and conforms to RFC 4180 CSV standard - (e.g. comma-delimited, etc.). - - `R` commands to produce such a file are shown below. -4. /blank fields are blank, not the string ‘NA’ (not RFC 4180, but - important here because of Namibia) -5. It has concise, sensible, valid (in the R data frame sense) column - header names - -Using base write.csv: - -``` r -write.csv(custom_dict, 'custom_dict.csv', quote = TRUE, na = '', - row.names = FALSE, qmethod = 'double', fileEncoding = 'UTF-8') -``` - -Using `readr`: - -``` r -readr::write_csv(custom_dict, 'custom_dict.csv', na = '') -``` - -## Custom dictionary attributes - -When using custom dictionaries, it is often useful to give “meta” -information to `countrycode` so that it knows how to use certain codes. -To do this, we can set attributes of the dictionary. In this example, we -download a dictionary of US state codes. Then, we identify a column of -regular expressions using the `origin_regex` attribute, and we identify -the valid origin codes using the `origin_valid` attribute. 
- -``` r -state_dict <- "https://raw.githubusercontent.com/vincentarelbundock/countrycode/main/data/custom_dictionaries/data_us_states.csv" -state_dict <- read.csv(state_dict) - -attr(state_dict, "origin_regex") <- "state.regex" -attr(state_dict, "origin_valid") <- c("state.regex", "abbreviation") - -countrycode("Alabama", "state.regex", "abbreviation", custom_dict = state_dict) -``` - - Error in countrycode("Alabama", "state.regex", "abbreviation", custom_dict = state_dict): could not find function "countrycode" - -``` r -countrycode("AL", "abbreviation", "state", custom_dict = state_dict) -``` - - Error in countrycode("AL", "abbreviation", "state", custom_dict = state_dict): could not find function "countrycode" - -``` r -countrycode("Alabama", "state", "abbreviation", custom_dict = state_dict) -``` - - Error in countrycode("Alabama", "state", "abbreviation", custom_dict = state_dict): could not find function "countrycode" diff --git a/docs/articles/contributions.qmd b/docs/articles/contributions.qmd deleted file mode 100644 index 464af20..0000000 --- a/docs/articles/contributions.qmd +++ /dev/null @@ -1,62 +0,0 @@ -# Contributions - -## Adding a new code - -New country codes are created by two files: - -1. `dictionary/get_*.R` is an `R` script which can scrape the code from an original online source (e.g., `get_world_bank.R`). This scripts only side effect is that it writes a CSV file to the `dictionary` folder. -2. `dictionary/data_*.csv` is a CSV file with 1 column called `country`, which includes the English country name, and 1 or more columns named after the codes you want to add (e.g., `iso3c`, `un.name.en`, `continent`). 
- -After creating those two files, you should: - -* Run `dictionary/build.R` -* If the code is a valid origin code (i.e., no two countries share the same code), add it to the `valid_origin` vector in `R/countrycode.R` -* Add the new code name to the documentation in `R/codelist.R` -* Build the documentation using the devtools package: `devtools::document()` -* Add a bullet point to `NEWS.md` file. - -If you need help with any of these steps, or if you just want to submit a CSV file, feel free to open an issue on Github or write an email to Vincent. I'll be happy to help you out! - -## Custom dictionaries - -The `countrycode` repository holds several custom dictionaries: https://github.com/vincentarelbundock/countrycode/tree/master/data/custom_dictionaries - -To add your own custom dictionary, please make sure that: - -1. You save a comma-separated CSV file that looks something like data/custom_dictionaries/data_us_states.csv -2. The custom dictionary has a unique purpose (not overlapping with existing custom dictionaries) -3. It uses UTF-8 encoding and conforms to RFC 4180 CSV standard (e.g. comma-delimited, etc.). - - `R` commands to produce such a file are shown below. -4. /blank fields are blank, not the string 'NA' (not RFC 4180, but important here because of Namibia) -5. It has concise, sensible, valid (in the R data frame sense) column header names - -Using base write.csv: - -```r -write.csv(custom_dict, 'custom_dict.csv', quote = TRUE, na = '', - row.names = FALSE, qmethod = 'double', fileEncoding = 'UTF-8') -``` - -Using `readr`: - -```r -readr::write_csv(custom_dict, 'custom_dict.csv', na = '') -``` - -## Custom dictionary attributes - -When using custom dictionaries, it is often useful to give "meta" information to `countrycode` so that it knows how to use certain codes. To do this, we can set attributes of the dictionary. In this example, we download a dictionary of US state codes. 
Then, we identify a column of regular expressions using the `origin_regex` attribute, and we identify the valid origin codes using the `origin_valid` attribute. - -```{r, error = TRUE, message = FALSE} -state_dict <- "https://raw.githubusercontent.com/vincentarelbundock/countrycode/main/data/custom_dictionaries/data_us_states.csv" -state_dict <- read.csv(state_dict) - -attr(state_dict, "origin_regex") <- "state.regex" -attr(state_dict, "origin_valid") <- c("state.regex", "abbreviation") - -countrycode("Alabama", "state.regex", "abbreviation", custom_dict = state_dict) - -countrycode("AL", "abbreviation", "state", custom_dict = state_dict) - -countrycode("Alabama", "state", "abbreviation", custom_dict = state_dict) -``` diff --git a/docs/articles/countrycode.md b/docs/articles/countrycode.md deleted file mode 100644 index 33e3cd0..0000000 --- a/docs/articles/countrycode.md +++ /dev/null @@ -1,376 +0,0 @@ - -# Country codes - -## Convert a single name or code - -Load library: - -``` r -library(countrycode) -``` - -Convert single country codes: - -``` r -## ISO to Correlates of War -countrycode('DZA', origin = 'iso3c', destination = 'cown') -``` - - [1] 615 - -``` r -## English to ISO -countrycode('Albania', origin = 'country.name', destination = 'iso3c') -``` - - [1] "ALB" - -``` r -## German or Italian to Arabic -countrycode(c('Algerien', 'Albanien'), origin = 'country.name.de', destination = 'un.name.ar') -``` - - [1] "الجزائر" "ألبانيا" - -``` r -countrycode(c('Moldavia', 'Stati Uniti'), origin = 'country.name.it', destination = 'un.name.ar') -``` - - [1] "جمهورية مولدوفا" "الولايات المتحدة الأمريكية" - -## Convert a vector of country codes - -``` r -cowcodes <- c("ALG", "ALB", "UKG", "CAN", "USA") -countrycode(cowcodes, origin = "cowc", destination = "iso3c") -``` - - [1] "DZA" "ALB" "GBR" "CAN" "USA" - -Generate vectors and 2 data frames without a common id (i.e. 
can’t merge -the 2 df): - -``` r -isocodes <- c(12,8,826,124,840) -var1 <- sample(1:500,5) -var2 <- sample(1:500,5) -df1 <- data.frame(cowcodes,var1) -df2 <- data.frame(isocodes,var2) -``` - -Inspect the data: - -``` r -df1 -``` - - cowcodes var1 - 1 ALG 203 - 2 ALB 158 - 3 UKG 157 - 4 CAN 405 - 5 USA 110 - -``` r -df2 -``` - - isocodes var2 - 1 12 472 - 2 8 483 - 3 826 126 - 4 124 268 - 5 840 212 - -Create a common variable with the iso3c code in each data frame, merge -the data, and create a country identifier: - -``` r -df1$iso3c <- countrycode(df1$cowcodes, origin = "cowc", destination = "iso3c") -df2$iso3c <- countrycode(df2$isocodes, origin = "iso3n", destination = "iso3c") -df3 <- merge(df1,df2,id="iso3c") -df3$country <- countrycode(df3$iso3c, origin = "iso3c", destination = "country.name") -df3 -``` - - iso3c cowcodes var1 isocodes var2 country - 1 ALB ALB 158 8 483 Albania - 2 CAN CAN 405 124 268 Canada - 3 DZA ALG 203 12 472 Algeria - 4 GBR UKG 157 826 126 United Kingdom - 5 USA USA 110 840 212 United States - -## Flags - -`countrycode` can convert country names and codes to unicode flags. For -example, we can use the `gt` package to draw a table with countries and -their corresponding flags: - -``` r -library(gt) -library(countrycode) - -Countries <- c('Canada', 'Germany', 'Thailand', 'Algeria', 'Eritrea') -Flags <- countrycode(Countries, 'country.name', 'unicode.symbol') -dat <- data.frame(Countries, Flags) -gt(dat) -``` - -![gt_flags](https://github.com/vincentarelbundock/countrycode/assets/987057/c0c29aa3-0aab-4274-af80-1ae3efc89203.png) - -Note that embedding unicode characters in `R` graphics is possible, but -it can be tricky. If your output looks like `\U0001f1e6\U0001f1f6`, then -you could try feeding it to this function: `utf8::utf8_print()`. That -should cover a lot of cases without dipping into the complexity of -graphics devices. As a rule of thumb, if your output looks like `□□□□` -(boxes), things tend to get more complicated. 
In that case, you’ll have -to think about different output devices, file viewers, and/or file -formats (e.g., ‘SVG’ or ‘HTML’). - -Since inserting unicode symbols into `R` graphics is not a -`countrycode`-specific issue, we won’t be able to offer any more support -than this. Good luck! - -## Country names in 600+ different languages and formats - -The Unicode organisation hosts the CLDR project, which publishes many -variants of country names. For each language/culture locale, there is a -full set of names, plus possible ‘alt-short’ or ‘alt-variant’ variations -of specific country names. - -``` r -countrycode('United States of America', origin = 'country.name', destination = 'cldr.name.en') -``` - - [1] "United States" - -``` r -countrycode('United States of America', origin = 'country.name', destination = 'cldr.short.en') -``` - - [1] "US" - -To see a full list of country name variants available, inspect this -data.frame: - -``` r -head(countrycode::cldr_examples) -``` - - Code Example - 1 cldr.name.agq TF - 2 cldr.name.ak TF - 3 cldr.name.am የፈረንሳይ ደቡባዊ ግዛቶች - 4 cldr.name.ar الأقاليم الجنوبية الفرنسية - 5 cldr.name.ar_ly الأقاليم الجنوبية الفرنسية - 6 cldr.name.ar_sa الأقاليم الجنوبية الفرنسية - -## Custom dictionaries and cross-walks: `get_dictionary()` and `custom_dict` - -The `custom_dict` argument accepts data frame which can be used as -custom dictionaries to create “crosswalks” between arbitrary entities -(non-countries). You can create your own dictionaries (see examples -below) or use one of the dictionaries already hosted on the -`countrycode` Github repository. 
The current list of available -dictionaries can be seen by calling: - -``` r -get_dictionary() -``` - - Available dictionaries: ch_cantons, exiobase3, global_burden_of_disease, gtap10, us_states - -You can download a dictionary and see available fields with: - -``` r -cd <- get_dictionary("us_states") -head(cd) -``` - - state.name state.abb state.regex - 1 Alabama AL .*alabama.* - 2 Alaska AK .*alaska.* - 3 Arizona AZ .*arizona.* - 4 Arkansas AR .*arkansas.* - 5 California CA .*california.* - 6 Colorado CO .*colorado.* - -Now we can use the dictionary for conversions: - -``` r -st <- c("Arkansas", "Quebec", "Tennessee") -countrycode(st, "state.regex", "state.abb", custom_dict = cd) -``` - - Warning: Some values were not matched unambiguously: Quebec - - [1] "AR" NA "TN" - -``` r -countrycode(c("MN", "MA", "MO"), "state.abb", "state.name", custom_dict = cd) -``` - - [1] "Minnesota" "Massachusetts" "Missouri" - -Here’s an example with the GTAP dataset: - -``` r -cd <- get_dictionary("gtap10") -countrycode("Christmas Island", "country.name.en.regex", "gtap.cha", custom_dict = cd) -``` - - [1] "AUS" - -### `custom_dict`: the `ISOcodes` package - -`countrycode` already supports ISO4217 (currencies) and ISO3166 (country -codes). The `ISOcodes` package supplies other codes, including ISO15924 -(language writing systems), ISO639 (languages), and ISO8859 (computer -character encodings). Users can convert those codes using -`countrycode`’s `custom_dict` argument. - -For example, the `ISOcodes::ISO_639_2` dataframe includes 4 columns: -`Alpha_3_B`, `Alpha_3_T`, `Alpha_2`, and `Name`. We can convert language -names like this: - -``` r -countrycode('abk', 'Alpha_3_B', 'Name', custom_dict = ISOcodes::ISO_639_2) -``` - - [1] "Abkhazian" - -The `ISOcodes::ISO_8859` dataset is a 3-dimensional array where the -second dimension represents the character encoding. 
We take the subset -of `ISO_8859_1` codes and convert the dict to a dataframe for use in -`countrycode`’s `custom_dict` argument: - -``` r -library(ISOcodes) -dict <- ISOcodes::ISO_8859[, 'ISO_8859_1', ] -dict <- data.frame(dict) -``` - -The resulting dataframe has 3 columns: `Code`, `Name`, `Character`. We -convert the code `0x00fd` like this: - -``` r -countrycode("0x00fd", "Code", "Name", custom_dict = dict) -``` - - [1] "LATIN SMALL LETTER Y WITH ACUTE" - -``` r -countrycode("0x00fd", "Code", "Character", custom_dict = dict) -``` - - [1] "ý" - -## `destination`: Fallback codes - -Some destination codes not cover all the relevant countries. For -example, “SRB” is included in the `iso3c` code but *not* in the `cowc` -code. Some users may want to use `cowc` but to fill in missing entries -with `iso3c` codes. We can do this by feeding a vector of code names to -the `destination` argument. `countrycode` will then try one after the -other. - -For example, - -``` r -x <- c("Algeria", "Serbia") - -countrycode(x, "country.name", "cowc") -``` - - Warning: Some values were not matched unambiguously: Serbia - - [1] "ALG" NA - -``` r -countrycode(x, "country.name", "iso3c") -``` - - [1] "DZA" "SRB" - -``` r -countrycode(x, "country.name", c("cowc", "iso3c")) -``` - - Warning: Some values were not matched unambiguously: Serbia - - [1] "ALG" "SRB" - -## `nomatch`: Fill in missing codes manually - -Use the `nomatch` argument to specify the value that `countrycode` -inserts where no match was found: - -``` r -countrycode(c('DZA', 'USA', '???'), origin = 'iso3c', destination = 'country.name', nomatch = 'BAD CODE') -``` - - [1] "Algeria" "United States" "BAD CODE" - -``` r -countrycode(c('Canada', 'Fake country'), origin = 'country.name', destination = 'iso3c', nomatch = 'BAD') -``` - - [1] "CAN" "BAD" - -## `custom_match`: Override default values - -`countrycode` accepts a user supplied named vector of custom matches via -the `custom_match` argument. 
Any match pairs in the `custom_match` -vector will supercede the default results of the command. This allows -the user to convert to an available country code and make minor -post-edits all at once. The names of the named vector are used as the -origin code, and the values of the named vector are used as the -destination code. - -For example, Eurostat uses a modified version of iso2c, with Greece (EL -instead of GR) and the UK (UK instead of GB) being the only differences. -Getting a proper result converting to Eurostat is easy to achieve using -the `iso2c` destination and the new `custom_match` argument. (Note: -since version 0.19, `countrycode` also includes a `eurostat` -origin/destination code, so while this is a good example, doing so for -Eurostat is not necessary) - -Example: convert from country name to Eurostat code - -``` r -library(countrycode) -country_names <- c('Greece', 'United Kingdom', 'Germany', 'France') -custom_match <- c(Greece = 'EL', `United Kingdom` = 'UK') -countrycode(country_names, - origin = 'country.name', - destination = 'iso2c', - custom_match = custom_match) -``` - - [1] "EL" "UK" "DE" "FR" - -Example: convert from Eurostat code to country name - -``` r -library(eurostat) -library(countrycode) -df <- eurostat::get_eurostat("nama_10_lp_ulc") -custom_match <- c(EL = 'Greece', UK = 'United Kingdom') -countrycode(df$geo, origin = 'iso2c', destination = 'country.name', custom_match = custom_match) |> - head() -``` - - Warning: Some values were not matched unambiguously: EA, EA12, EA19, EA20, EU15, EU27_2020, EU28, XK - - [1] "Austria" "Belgium" "Bulgaria" "Switzerland" "Cyprus" - [6] "Czechia" - -## `warn`: Silence warnings - -Use `warn = TRUE` to print out a list of source elements for which no -match was found. When the source vector are long country names that need -to be matched using regular expressions, there is always a risk that -multiple regex will match a given string. 
When this is the case, -`countrycode` assigns a value arbitrarily, but the `warn` argument -allows the user to print a list of all strings that were matched many -times. diff --git a/docs/articles/countrycode.qmd b/docs/articles/countrycode.qmd deleted file mode 100644 index d7de0e5..0000000 --- a/docs/articles/countrycode.qmd +++ /dev/null @@ -1,211 +0,0 @@ -# Country codes - -## Convert a single name or code - -Load library: - -```{r} -library(countrycode) -``` - -Convert single country codes: - -```{r} -## ISO to Correlates of War -countrycode('DZA', origin = 'iso3c', destination = 'cown') - -## English to ISO -countrycode('Albania', origin = 'country.name', destination = 'iso3c') - -## German or Italian to Arabic -countrycode(c('Algerien', 'Albanien'), origin = 'country.name.de', destination = 'un.name.ar') - -countrycode(c('Moldavia', 'Stati Uniti'), origin = 'country.name.it', destination = 'un.name.ar') -``` - -## Convert a vector of country codes - -```{r} -cowcodes <- c("ALG", "ALB", "UKG", "CAN", "USA") -countrycode(cowcodes, origin = "cowc", destination = "iso3c") -``` - -Generate vectors and 2 data frames without a common id (i.e. can't merge the 2 df): - -```{r} -isocodes <- c(12,8,826,124,840) -var1 <- sample(1:500,5) -var2 <- sample(1:500,5) -df1 <- data.frame(cowcodes,var1) -df2 <- data.frame(isocodes,var2) -``` - -Inspect the data: - -```{r} -df1 - -df2 -``` - -Create a common variable with the iso3c code in each data frame, merge the data, and create a country identifier: - -```{r} -df1$iso3c <- countrycode(df1$cowcodes, origin = "cowc", destination = "iso3c") -df2$iso3c <- countrycode(df2$isocodes, origin = "iso3n", destination = "iso3c") -df3 <- merge(df1,df2,id="iso3c") -df3$country <- countrycode(df3$iso3c, origin = "iso3c", destination = "country.name") -df3 -``` - -## Flags - -`countrycode` can convert country names and codes to unicode flags. 
For example, we can use the `gt` package to draw a table with countries and their corresponding flags: - -```r -library(gt) -library(countrycode) - -Countries <- c('Canada', 'Germany', 'Thailand', 'Algeria', 'Eritrea') -Flags <- countrycode(Countries, 'country.name', 'unicode.symbol') -dat <- data.frame(Countries, Flags) -gt(dat) -``` - -![gt_flags](https://github.com/vincentarelbundock/countrycode/assets/987057/c0c29aa3-0aab-4274-af80-1ae3efc89203) - -Note that embedding unicode characters in `R` graphics is possible, but it can be tricky. If your output looks like `\U0001f1e6\U0001f1f6`, then you could try feeding it to this function: `utf8::utf8_print()`. That should cover a lot of cases without dipping into the complexity of graphics devices. As a rule of thumb, if your output looks like `□□□□` (boxes), things tend to get more complicated. In that case, you'll have to think about different output devices, file viewers, and/or file formats (e.g., 'SVG' or 'HTML'). - -Since inserting unicode symbols into `R` graphics is not a `countrycode`-specific issue, we won't be able to offer any more support than this. Good luck! - -## Country names in 600+ different languages and formats - -The Unicode organisation hosts the CLDR project, which publishes many variants of country names. For each language/culture locale, there is a full set of names, plus possible 'alt-short' or 'alt-variant' variations of specific country names. 
- -```{r} -countrycode('United States of America', origin = 'country.name', destination = 'cldr.name.en') - -countrycode('United States of America', origin = 'country.name', destination = 'cldr.short.en') -``` - -To see a full list of country name variants available, inspect this data.frame: - -```{r} -head(countrycode::cldr_examples) -``` - -## Custom dictionaries and cross-walks: `get_dictionary()` and `custom_dict` - -The `custom_dict` argument accepts data frame which can be used as custom dictionaries to create "crosswalks" between arbitrary entities (non-countries). You can create your own dictionaries (see examples below) or use one of the dictionaries already hosted on the `countrycode` Github repository. The current list of available dictionaries can be seen by calling: - -```{r} -get_dictionary() -``` - -You can download a dictionary and see available fields with: - -```{r, message = FALSE} -cd <- get_dictionary("us_states") -head(cd) -``` - -Now we can use the dictionary for conversions: - -```{r} -st <- c("Arkansas", "Quebec", "Tennessee") -countrycode(st, "state.regex", "state.abb", custom_dict = cd) - -countrycode(c("MN", "MA", "MO"), "state.abb", "state.name", custom_dict = cd) -``` - -Here's an example with the GTAP dataset: - -```{r, message = FALSE} -cd <- get_dictionary("gtap10") -countrycode("Christmas Island", "country.name.en.regex", "gtap.cha", custom_dict = cd) -``` - -### `custom_dict`: the `ISOcodes` package - -`countrycode` already supports ISO4217 (currencies) and ISO3166 (country codes). The `ISOcodes` package supplies other codes, including ISO15924 (language writing systems), ISO639 (languages), and ISO8859 (computer character encodings). Users can convert those codes using `countrycode`'s `custom_dict` argument. - -For example, the `ISOcodes::ISO_639_2` dataframe includes 4 columns: `Alpha_3_B`, `Alpha_3_T`, `Alpha_2`, and `Name`. 
We can convert language names like this: - -```{r} -countrycode('abk', 'Alpha_3_B', 'Name', custom_dict = ISOcodes::ISO_639_2) -``` - -The `ISOcodes::ISO_8859` dataset is a 3-dimensional array where the second dimension represents the character encoding. We take the subset of `ISO_8859_1` codes and convert the dict to a dataframe for use in `countrycode`'s `custom_dict` argument: - -```{r} -library(ISOcodes) -dict <- ISOcodes::ISO_8859[, 'ISO_8859_1', ] -dict <- data.frame(dict) -``` - -The resulting dataframe has 3 columns: `Code`, `Name`, `Character`. We convert the code `0x00fd` like this: - -```{r} -countrycode("0x00fd", "Code", "Name", custom_dict = dict) - -countrycode("0x00fd", "Code", "Character", custom_dict = dict) -``` - -## `destination`: Fallback codes - -Some destination codes not cover all the relevant countries. For example, "SRB" is included in the `iso3c` code but *not* in the `cowc` code. Some users may want to use `cowc` but to fill in missing entries with `iso3c` codes. We can do this by feeding a vector of code names to the `destination` argument. `countrycode` will then try one after the other. - -For example, - -```{r} -x <- c("Algeria", "Serbia") - -countrycode(x, "country.name", "cowc") - -countrycode(x, "country.name", "iso3c") - -countrycode(x, "country.name", c("cowc", "iso3c")) -``` - -## `nomatch`: Fill in missing codes manually - -Use the `nomatch` argument to specify the value that `countrycode` inserts where no match was found: - -```{r} -countrycode(c('DZA', 'USA', '???'), origin = 'iso3c', destination = 'country.name', nomatch = 'BAD CODE') - -countrycode(c('Canada', 'Fake country'), origin = 'country.name', destination = 'iso3c', nomatch = 'BAD') -``` - -## `custom_match`: Override default values - -`countrycode` accepts a user supplied named vector of custom matches via the `custom_match` argument. Any match pairs in the `custom_match` vector will supercede the default results of the command. 
This allows the user to convert to an available country code and make minor post-edits all at once. The names of the named vector are used as the origin code, and the values of the named vector are used as the destination code. - -For example, Eurostat uses a modified version of iso2c, with Greece (EL instead of GR) and the UK (UK instead of GB) being the only differences. Getting a proper result converting to Eurostat is easy to achieve using the `iso2c` destination and the new `custom_match` argument. (Note: since version 0.19, `countrycode` also includes a `eurostat` origin/destination code, so while this is a good example, doing so for Eurostat is not necessary) - -Example: convert from country name to Eurostat code - -```{r} -library(countrycode) -country_names <- c('Greece', 'United Kingdom', 'Germany', 'France') -custom_match <- c(Greece = 'EL', `United Kingdom` = 'UK') -countrycode(country_names, - origin = 'country.name', - destination = 'iso2c', - custom_match = custom_match) -``` - -Example: convert from Eurostat code to country name - -```{r} -library(eurostat) -library(countrycode) -df <- eurostat::get_eurostat("nama_10_lp_ulc") -custom_match <- c(EL = 'Greece', UK = 'United Kingdom') -countrycode(df$geo, origin = 'iso2c', destination = 'country.name', custom_match = custom_match) |> - head() -``` - -## `warn`: Silence warnings - -Use `warn = TRUE` to print out a list of source elements for which no match was found. When the source vector are long country names that need to be matched using regular expressions, there is always a risk that multiple regex will match a given string. When this is the case, `countrycode` assigns a value arbitrarily, but the `warn` argument allows the user to print a list of all strings that were matched many times. 
diff --git a/docs/articles/countryname.md b/docs/articles/countryname.md deleted file mode 100644 index 98b7c1f..0000000 --- a/docs/articles/countryname.md +++ /dev/null @@ -1,26 +0,0 @@ - -# Country names - -The function `countryname` tries to convert country names from any -language. For example: - -``` r -library(countrycode) -x <- c('ジンバブエ', 'Afeganistãu', 'Barbadas', 'Sverige', 'UK', - 'il-Georgia tan-Nofsinhar u l-Gżejjer Sandwich tan-Nofsinhar') - -countryname(x) -``` - - [1] "Zimbabwe" - [2] "Afghanistan" - [3] "Barbados" - [4] "Sweden" - [5] "United Kingdom" - [6] "South Georgia & South Sandwich Islands" - -``` r -countryname(x, 'iso3c') -``` - - [1] "ZWE" "AFG" "BRB" "SWE" "GBR" "SGS" diff --git a/docs/articles/countryname.qmd b/docs/articles/countryname.qmd deleted file mode 100644 index 01da10e..0000000 --- a/docs/articles/countryname.qmd +++ /dev/null @@ -1,13 +0,0 @@ -# Country names - -The function `countryname` tries to convert country names from any language. For example: - -```{r} -library(countrycode) -x <- c('ジンバブエ', 'Afeganistãu', 'Barbadas', 'Sverige', 'UK', - 'il-Georgia tan-Nofsinhar u l-Gżejjer Sandwich tan-Nofsinhar') - -countryname(x) - -countryname(x, 'iso3c') -``` \ No newline at end of file diff --git a/docs/articles/custom.md b/docs/articles/custom.md deleted file mode 100644 index 3d4f8c5..0000000 --- a/docs/articles/custom.md +++ /dev/null @@ -1,68 +0,0 @@ - -# Custom conversion functions - -It is easy to to create alternative functions with different default -arguments and/or dictionaries. For example, we can create: - -- `name_to_iso3c` function that sets new defaults for the `origin` and - `destination` arguments, and automatically converts country names to - iso3c -- `statecode` function to convert US state codes using a custom - dictionary by default, that we download from the internet. 
- -``` r -################################# -# new function: name_to_iso3c # -################################# - -# Custom defaults -name_to_iso3c <- function(sourcevar, - origin = "country.name", - destination = "iso3c", - ...) { - countrycode(sourcevar, origin = origin, destination = destination, ...) -} - -name_to_iso3c(c("Algeria", "Canada")) -``` - - Error in countrycode(sourcevar, origin = origin, destination = destination, : could not find function "countrycode" - -``` r -############################# -# new function: statecode # -############################# - -# Download dictionary -state_dict <- "https://raw.githubusercontent.com/vincentarelbundock/countrycode/main/data/custom_dictionaries/data_us_states.csv" -state_dict <- read.csv(state_dict) - -# Identify regular expression origin codes -attr(state_dict, "origin_regex") <- "state.regex" - -# Define a custom conversion function -statecode <- function(sourcevar, - origin = "state.regex", - destination = "abbreviation", - custom_dict = state_dict, - ...) { - countrycode(sourcevar, - origin = origin, - destination = destination, - custom_dict = custom_dict, - ...) -} - -# Voilà! -x <- c("Alabama", "New Mexico") -statecode(x, "state.regex", "abbreviation") -``` - - Error in countrycode(sourcevar, origin = origin, destination = destination, : could not find function "countrycode" - -``` r -x <- c("AL", "NM", "VT") -statecode(x, "abbreviation", "state") -``` - - Error in countrycode(sourcevar, origin = origin, destination = destination, : could not find function "countrycode" diff --git a/docs/articles/custom.qmd b/docs/articles/custom.qmd deleted file mode 100644 index dbdd47f..0000000 --- a/docs/articles/custom.qmd +++ /dev/null @@ -1,53 +0,0 @@ -# Custom conversion functions - -It is easy to to create alternative functions with different default arguments and/or dictionaries. 
For example, we can create: - -* `name_to_iso3c` function that sets new defaults for the `origin` and `destination` arguments, and automatically converts country names to iso3c -* `statecode` function to convert US state codes using a custom dictionary by default, that we download from the internet. - -```{r, error = TRUE, message = FALSE} -################################# -# new function: name_to_iso3c # -################################# - -# Custom defaults -name_to_iso3c <- function(sourcevar, - origin = "country.name", - destination = "iso3c", - ...) { - countrycode(sourcevar, origin = origin, destination = destination, ...) -} - -name_to_iso3c(c("Algeria", "Canada")) - -############################# -# new function: statecode # -############################# - -# Download dictionary -state_dict <- "https://raw.githubusercontent.com/vincentarelbundock/countrycode/main/data/custom_dictionaries/data_us_states.csv" -state_dict <- read.csv(state_dict) - -# Identify regular expression origin codes -attr(state_dict, "origin_regex") <- "state.regex" - -# Define a custom conversion function -statecode <- function(sourcevar, - origin = "state.regex", - destination = "abbreviation", - custom_dict = state_dict, - ...) { - countrycode(sourcevar, - origin = origin, - destination = destination, - custom_dict = custom_dict, - ...) -} - -# Voilà! -x <- c("Alabama", "New Mexico") -statecode(x, "state.regex", "abbreviation") - -x <- c("AL", "NM", "VT") -statecode(x, "abbreviation", "state") -``` \ No newline at end of file diff --git a/docs/index.html b/docs/index.html deleted file mode 100644 index edab1b3..0000000 --- a/docs/index.html +++ /dev/null @@ -1,69 +0,0 @@ - - - - - - countrycode - - - - - -
- - - - diff --git a/docs/man/cldr_examples.md b/docs/man/cldr_examples.md deleted file mode 100644 index 90d77e7..0000000 --- a/docs/man/cldr_examples.md +++ /dev/null @@ -1,28 +0,0 @@ - -# cldr_examples - -List of CLDR country name codes and associated examples - -## Description - -
    -
  • - -Code: CLDR code - -
  • -
  • - -Example: French Southern Territories in different languages - -
  • -
- -## Usage - -
cldr_examples
-
- -## Format - -data frame diff --git a/docs/man/codelist.md b/docs/man/codelist.md deleted file mode 100644 index 770d369..0000000 --- a/docs/man/codelist.md +++ /dev/null @@ -1,376 +0,0 @@ - -# codelist - -Country Code Translation Data Frame (Cross-Sectional) - -## Description - -A data frame used internally by the countrycode() function. -countrycode can use any valid code as destination, but only -some codes can be used as origin. - -## Format - -A data frame with codes as columns. - -## Details - -

-Origin and Destination -

-
    -
  • - -cctld: IANA country code top-level domain - -
  • -
  • - -country.name: country name (English) - -
  • -
  • - -country.name.de: country name (German) - -
  • -
  • - -country.name.fr: country name (French) - -
  • -
  • - -country.name.it: country name (Italian) - -
  • -
  • - -cowc: Correlates of War character - -
  • -
  • - -cown: Correlates of War numeric - -
  • -
  • - -dhs: Demographic and Health Surveys Program - -
  • -
  • - -ecb: European Central Bank - -
  • -
  • - -eurostat: Eurostat - -
  • -
  • - -fao: Food and Agriculture Organization of the United -Nations numerical code - -
  • -
  • - -fips: FIPS 10-4 (Federal Information Processing Standard) - -
  • -
  • - -gaul: Global Administrative Unit Layers - -
  • -
  • - -genc2c: GENC 2-letter code - -
  • -
  • - -genc3c: GENC 3-letter code - -
  • -
  • - -genc3n: GENC numeric code - -
  • -
  • - -gwc: Gleditsch & Ward character - -
  • -
  • - -gwn: Gleditsch & Ward numeric - -
  • -
  • - -imf: International Monetary Fund - -
  • -
  • - -ioc: International Olympic Committee - -
  • -
  • - -iso2c: ISO-2 character - -
  • -
  • - -iso3c: ISO-3 character - -
  • -
  • - -iso3n: ISO-3 numeric - -
  • -
  • - -p5n: Polity V numeric country code - -
  • -
  • - -p5c: Polity V character country code - -
  • -
  • - -p4n: Polity IV numeric country code - -
  • -
  • - -p4c: Polity IV character country code - -
  • -
  • - -un: United Nations M49 numeric codes - -
  • -
  • - -unicode.symbol: Region subtag (often displayed as emoji -flag) - -
  • -
  • - -unhcr: United Nations High Commissioner for Refugees - -
  • -
  • - -unpd: United Nations Procurement Division - -
  • -
  • - -vdem: Varieties of Democracy (V-Dem version 8, April 2018) - -
  • -
  • - -wb: World Bank (very similar but not identical to iso3c) - -
  • -
  • - -wvs: World Values Survey numeric code - -
  • -
-

-Destination only -

-
    -
  • - -⁠cldr.\*⁠: 600+ country name -variants from the UNICODE CLDR project (e.g., "cldr.short.en"). Inspect -the cldr_examples data.frame for a full list of available -country names and examples. - -
  • -
  • - -ar5: IPCC’s regional mapping used both in the Fifth -Assessment Report (AR5) and for the Reference Concentration Pathways -(RCP) - -
  • -
  • - -continent: Continent as defined in the World Bank -Development Indicators - -
  • -
  • - -cow.name: Correlates of War country name - -
  • -
  • - -currency: ISO 4217 currency name - -
  • -
  • - -eurocontrol_pru: European Organisation for the Safety of -Air Navigation - -
  • -
  • - -eurocontrol_statfor: European Organisation for the Safety -of Air Navigation - -
  • -
  • - -eu28: Member states of the European Union (as of December -2015), without special territories - -
  • -
  • - -icao.region: International Civil Aviation Organization -region - -
  • -
  • - -iso.name.en: ISO English short name - -
  • -
  • - -iso.name.fr: ISO French short name - -
  • -
  • - -iso4217c: ISO 4217 currency alphabetic code - -
  • -
  • - -iso4217n: ISO 4217 currency numeric code - -
  • -
  • - -p4.name: Polity IV country name - -
  • -
  • - -region: 7 Regions as defined in the World Bank Development -Indicators - -
  • -
  • - -region23: 23 Regions as used to be in the World Bank -Development Indicators (legacy) - -
  • -
  • - -un.name.ar: United Nations Arabic country name - -
  • -
  • - -un.name.en: United Nations English country name - -
  • -
  • - -un.name.es: United Nations Spanish country name - -
  • -
  • - -un.name.fr: United Nations French country name - -
  • -
  • - -un.name.ru: United Nations Russian country name - -
  • -
  • - -un.name.zh: United Nations Chinese country name - -
  • -
  • - -un.region.name: United Nations region name - -
  • -
  • - -un.region.code: United Nations region code - -
  • -
  • - -un.regionintermediate.name: United Nations intermediate -region name - -
  • -
  • - -un.regionintermediate.code: United Nations intermediate -region code - -
  • -
  • - -un.regionsub.name: United Nations sub-region name - -
  • -
  • - -un.regionsub.code: United Nations sub-region code - -
  • -
  • - -unhcr.region: United Nations High Commissioner for Refugees -region name - -
  • -
  • - -wvs.name: World Values Survey numeric code country name - -
  • -
- -## Note - -The Correlates of War (cow) and Polity 4 (p4) project produce codes in -country year format. Some countries go through political transitions -that justify changing codes over time. When building a purely -cross-sectional conversion dictionary, this forces us to make arbitrary -choices with respect to some entities (e.g., Western Germany, Vietnam, -Serbia). countrycode includes a reconciled dataset in panel -format, codelist_panel. Instead of converting code, we -recommend that users dealing with panel data "left-merge" their data -into this panel dictionary. diff --git a/docs/man/codelist_panel.md b/docs/man/codelist_panel.md deleted file mode 100644 index 29834a6..0000000 --- a/docs/man/codelist_panel.md +++ /dev/null @@ -1,17 +0,0 @@ - -# codelist_panel - -Country Code Translation Data Frame (Country-Year Panel) - -## Description - -A panel of country-year observations with various codes - -## Usage - -
codelist_panel
-
- -## Format - -data frame with codes as columns diff --git a/docs/man/countrycode-package.md b/docs/man/countrycode-package.md deleted file mode 100644 index f5cb83f..0000000 --- a/docs/man/countrycode-package.md +++ /dev/null @@ -1,30 +0,0 @@ - -# countrycode-package - -Convert Country Codes or Country Names - -## Description - -Convert country codes or country names - -## Details - -The countrycode function can convert to and from several -different country coding schemes. It uses regular expressions to convert -country names (e.g. Sri Lanka) into any of those coding schemes, or into -standardized country names in several languages. It can create variables -with the name of the continent and/or several regional groupings to -which each country belongs. - -Type ?codelist to get a list of available origin and destination codes. - -## Author(s) - -Vincent Arel-Bundock -vincent.arel-bundock@umontreal.ca - -## References - -
\url{http://arelbundock.com}
-\url{https://github.com/vincentarelbundock/countrycode}
-
diff --git a/docs/man/countrycode.md b/docs/man/countrycode.md deleted file mode 100644 index 9d9aecc..0000000 --- a/docs/man/countrycode.md +++ /dev/null @@ -1,237 +0,0 @@ - -# countrycode - -Convert Country Codes - -## Description - -Converts long country names into one of many different coding schemes. -Translates from one scheme to another. Converts country name or coding -scheme to the official short English country name. Creates a new -variable with the name of the continent or region to which each country -belongs. - -## Usage - -
countrycode(
-  sourcevar,
-  origin,
-  destination,
-  warn = TRUE,
-  nomatch = NA,
-  custom_dict = NULL,
-  custom_match = NULL,
-  origin_regex = NULL
-)
-
- -## Arguments - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
-sourcevar - -Vector which contains the codes or country names to be converted -(character or factor) -
-origin - -A string which identifies the coding scheme of origin (e.g., -“iso3c”). See codelist for a list of available -codes. -
-destination - -A string or vector of strings which identify the coding scheme of -destination (e.g., “iso3c” or c(“cowc”, -“iso3c”)). See codelist for a list of available -codes. When users supply a vector of destination codes, they are used -sequentially to fill in missing values not covered by the previous -destination code in the vector. -
-warn - -Prints unique elements from sourcevar for which no match was found -
-nomatch - -When countrycode fails to find a match for the code of origin, it -fills-in the destination vector with nomatch. The default -behavior is to fill non-matching codes with NA. If -nomatch = NULL, countrycode tries to use the origin vector -to fill-in missing values in the destination vector. -nomatch must be either NULL, of length 1, or -of the same length as sourcevar. -
-custom_dict - - -A data frame which supplies a new dictionary to replace the built-in -country code dictionary. Each column contains a different code and must -include no duplicates. The data frame format should resemble -codelist. Users can pre-assign attributes to this custom -dictionary to affect behavior (see Examples section): - -
    -
  • - -"origin.regex" attribute: a character vector with the names of columns -containing regular expressions. - -
  • -
  • - -"origin.valid" attribute: a character vector with the names of columns -that are accepted as valid origin codes. - -
  • -
-
-custom_match - -A named vector which supplies custom origin and destination matches that -will supercede any matching default result. The name of each element -will be used as the origin code, and the value of each element will be -used as the destination code. -
-origin_regex - -NULL or Logical: When using a custom dictionary, if TRUE then the origin -codes will be matched as regex, if FALSE they will be matched exactly. -When NULL, countrycode will behave as TRUE if the origin -name is in the custom_dictionary’s -origin_regex attribute, and FALSE otherwise. See examples -section below. -
- -## Note - -For a complete description of available country codes and languages, -please see the documentation for the codelist conversion -dictionary. - -Panel data (i.e., country-year) can pose particular problems when -converting codes. For instance, some countries like Vietnam or Serbia go -through political transitions that justify changing codes over time. -This can pose problems when using codes from organizations like CoW or -Polity IV, which produce codes in country-year format. Instead of -converting codes using countrycode(), we recommend that -users use the codelist_panel data.frame as a base into -which they can merge their other data. This data.frame includes most -relevant code, and is already "reconciled" to ensure that each political -unit is only represented by one row in any given year. From there, it is -just a matter of using merge() to combine different -datasets which use different codes. - -## Examples - -``` r -library(countrycode) - -library(countrycode) - -# ISO to Correlates of War -countrycode(c('USA', 'DZA'), origin = 'iso3c', destination = 'cown') -``` - - [1] 2 615 - -``` r -# English to ISO -countrycode('Albania', origin = 'country.name', destination = 'iso3c') -``` - - [1] "ALB" - -``` r -# German to French -countrycode('Albanien', origin = 'country.name.de', destination = 'iso.name.fr') -``` - - [1] "Albanie (l')" - -``` r -# Using custom_match to supercede default codes -countrycode(c('United States', 'Algeria'), 'country.name', 'iso3c') -``` - - [1] "USA" "DZA" - -``` r -countrycode(c('United States', 'Algeria'), 'country.name', 'iso3c', - custom_match = c('Algeria' = 'ALG')) -``` - - [1] "USA" "ALG" - -``` r -x <- c("canada", "antarctica") -countryname(x) -``` - - [1] "Canada" "Antarctica" - -``` r -countryname(x, destination = "cowc", warn = FALSE) -``` - - [1] "CAN" NA - -``` r -countryname(x, destination = "cowc", warn = FALSE, nomatch = x) -``` - - [1] "CAN" "antarctica" - -``` r - # Download the dictionary of US states 
from Github - - state_dict <- "https://raw.githubusercontent.com/vincentarelbundock/countrycode/main/data/custom_dictionaries/data_us_states.csv" - state_dict <- read.csv(state_dict) - - # The "state.regex" column includes regular expressions, so we set an attribute. - attr(state_dict, "origin_regex") <- "state.regex" - countrycode(c('AL', 'AK'), 'abbreviation', 'state', - custom_dict = state_dict) -``` - - [1] "Alabama" "Alaska" - -``` r - countrycode(c('Alabama', 'North Dakota'), 'state.regex', 'state', - custom_dict = state_dict) -``` - - [1] "Alabama" "North Dakota" diff --git a/docs/man/countryname.md b/docs/man/countryname.md deleted file mode 100644 index 28cdccc..0000000 --- a/docs/man/countryname.md +++ /dev/null @@ -1,90 +0,0 @@ - -# countryname - -Convert country names in any language to another name or code - -## Description - -Converts long country names in any language to one of many different -country code schemes or country names. countryname does 2 -passes on the data. First, it tries to detect variations of country -names in many languages extracted from the Unicode Common Locale Data -Repository. Second, it applies countrycode’s English -regexes to try to match the remaining cases. Because it does two passes, -countryname can sometimes produce ambiguous results, e.g., -Saint Martin vs. Saint Martin (French Part). Users who need a "safer" -option can use: countrycode(x, “country.name”, -“country.name”) Note that the function works with non-ASCII -characters. Please see the Github page for examples. - -## Usage - -
countryname(
-  sourcevar,
-  destination = "country.name.en",
-  nomatch = NA,
-  warn = TRUE
-)
-
- -## Arguments - - - - - - - - - - - - - - - - - - -
-sourcevar - -Vector which contains the codes or country names to be converted -(character or factor) -
-destination - -Coding scheme of destination (string such as "iso3c" enclosed in quotes -""): type ?codelist for a list of available codes. -
-nomatch - -When countrycode fails to find a match for the code of origin, it -fills-in the destination vector with nomatch. The default -behavior is to fill non-matching codes with NA. If -nomatch = NULL, countrycode tries to use the origin vector -to fill-in missing values in the destination vector. -nomatch must be either NULL, of length 1, or -of the same length as sourcevar. -
-warn - -Prints unique elements from sourcevar for which no match was found -
- -## Examples - -``` r -library(countrycode) - -x <- c('Afaganisitani', 'Barbadas', 'Sverige', 'UK') -countryname(x) -``` - - [1] "Afghanistan" "Barbados" "Sweden" "United Kingdom" - -``` r -countryname(x, destination = 'iso3c') -``` - - [1] "AFG" "BRB" "SWE" "GBR" diff --git a/docs/man/countryname_dict.md b/docs/man/countryname_dict.md deleted file mode 100644 index 511d210..0000000 --- a/docs/man/countryname_dict.md +++ /dev/null @@ -1,14 +0,0 @@ - -# countryname_dict - -A dataframe of alternative country names in many languages. Used -internally by the countryname function. - -## Description - -A dataframe of alternative country names in many languages. Used -internally by the countryname function. - -## Format - -dataframe diff --git a/docs/man/get_dictionary.md b/docs/man/get_dictionary.md deleted file mode 100644 index 368b426..0000000 --- a/docs/man/get_dictionary.md +++ /dev/null @@ -1,47 +0,0 @@ - -# get_dictionary - -Get Custom Dictionaries - -## Description - -Download a custom dictionary to use in the custom_dict -argument of countrycode() - -## Usage - -
get_dictionary(dictionary = NULL)
-
- -## Arguments - - - - - - -
-dictionary - -A character string that specifies the dictionary to be retrieved. It -must be one of "global_burden_of_disease", "ch_cantons", "us_states", -"exiobase3", "gtap10". If NULL, the function will print the list of -available dictionaries. Default is NULL. -
- -## Value - -If a valid dictionary is specified, the function will return that -dictionary as a data.frame. If an invalid dictionary or no dictionary is -specified, the function will stop and throw an error message. - -## Examples - -``` r -library(countrycode) - -cd <- get_dictionary("us_states") -countrycode::countrycode(c("MO", "MN"), origin = "state.abb", "state.name", custom_dict = cd) -``` - - [1] "Missouri" "Minnesota" diff --git a/docs/man/guess_field.md b/docs/man/guess_field.md deleted file mode 100644 index 5d58c1b..0000000 --- a/docs/man/guess_field.md +++ /dev/null @@ -1,67 +0,0 @@ - -# guess_field - -Guess the code/name of a vector - -## Description - -Users sometimes do not know what kind of code or field their data -contain. This function tries to guess by comparing the similarity -between a user-supplied vector and all the codes included in the -countrycode dictionary. - -## Usage - -
guess_field(codes, min_similarity = 80)
-
- -## Arguments - - - - - - - - - - -
-codes - -a vector of country codes or country names -
-min_similarity - -the function returns all field names where over than -min_similarity% of codes are shared between the supplied -vector and the countrycode dictionary. -
- -## Examples - -``` r -library(countrycode) - -# Guess ISO codes -guess_field(c('DZA', 'CAN', 'DEU')) -``` - - code percent_of_unique_matched - genc3c genc3c 100 - iso3c iso3c 100 - wb wb 100 - wb_api3c wb_api3c 100 - -``` r -# Guess country names -guess_field(c('Guinea','Iran','Russia','North Korea',rep('Ivory Coast',50),'Scotland')) -``` - - code percent_of_unique_matched - cow.name cow.name 83.33333 - vdem.name vdem.name 83.33333 - cldr.variant.ceb cldr.variant.ceb 83.33333 - cldr.variant.en cldr.variant.en 83.33333 - cldr.variant.en_001 cldr.variant.en_001 83.33333 - cldr.variant.en_au cldr.variant.en_au 83.33333 diff --git a/docs/reference.md b/docs/reference.md deleted file mode 100644 index cf78241..0000000 --- a/docs/reference.md +++ /dev/null @@ -1,564 +0,0 @@ -# Reference - -## Cldr examples - -### Description - -- Code: CLDR code - -- Example: French Southern Territories in different languages - -### Usage - - cldr_examples - -### Format - -data frame - - ---- -## Codelist panel - -### Description - -A panel of country-year observations with various codes - -### Usage - - codelist_panel - -### Format - -data frame with codes as columns - - ---- -## Codelist - -### Description - -A data frame used internally by the `countrycode()` function. -`countrycode` can use any valid code as destination, but only some codes -can be used as origin. - -### Format - -A data frame with codes as columns. 
- -### Details - -#### Origin and Destination - -- `cctld`: IANA country code top-level domain - -- `country.name`: country name (English) - -- `country.name.de`: country name (German) - -- `country.name.fr`: country name (French) - -- `country.name.it`: country name (Italian) - -- `cowc`: Correlates of War character - -- `cown`: Correlates of War numeric - -- `dhs`: Demographic and Health Surveys Program - -- `ecb`: European Central Bank - -- `eurostat`: Eurostat - -- `fao`: Food and Agriculture Organization of the United Nations - numerical code - -- `fips`: FIPS 10-4 (Federal Information Processing Standard) - -- `gaul`: Global Administrative Unit Layers - -- `genc2c`: GENC 2-letter code - -- `genc3c`: GENC 3-letter code - -- `genc3n`: GENC numeric code - -- `gwc`: Gleditsch & Ward character - -- `gwn`: Gleditsch & Ward numeric - -- `imf`: International Monetary Fund - -- `ioc`: International Olympic Committee - -- `iso2c`: ISO-2 character - -- `iso3c`: ISO-3 character - -- `iso3n`: ISO-3 numeric - -- `p5n`: Polity V numeric country code - -- `p5c`: Polity V character country code - -- `p4n`: Polity IV numeric country code - -- `p4c`: Polity IV character country code - -- `un`: United Nations M49 numeric codes - -- `unicode.symbol`: Region subtag (often displayed as emoji flag) - -- `unhcr`: United Nations High Commissioner for Refugees - -- `unpd`: United Nations Procurement Division - -- `vdem`: Varieties of Democracy (V-Dem version 8, April 2018) - -- `wb`: World Bank (very similar but not identical to iso3c) - -- `wvs`: World Values Survey numeric code - -#### Destination only - -- `⁠cldr.*⁠`: 600+ country name variants from the UNICODE CLDR project - (e.g., "cldr.short.en"). Inspect the `cldr_examples` data.frame for - a full list of available country names and examples. 
- -- `ar5`: IPCC's regional mapping used both in the Fifth Assessment - Report (AR5) and for the Reference Concentration Pathways (RCP) - -- `continent`: Continent as defined in the World Bank Development - Indicators - -- `cow.name`: Correlates of War country name - -- `currency`: ISO 4217 currency name - -- `eurocontrol_pru`: European Organisation for the Safety of Air - Navigation - -- `eurocontrol_statfor`: European Organisation for the Safety of Air - Navigation - -- `eu28`: Member states of the European Union (as of December 2015), - without special territories - -- `icao.region`: International Civil Aviation Organization region - -- `iso.name.en`: ISO English short name - -- `iso.name.fr`: ISO French short name - -- `iso4217c`: ISO 4217 currency alphabetic code - -- `iso4217n`: ISO 4217 currency numeric code - -- `p4.name`: Polity IV country name - -- `region`: 7 Regions as defined in the World Bank Development - Indicators - -- `region23`: 23 Regions as used to be in the World Bank Development - Indicators (legacy) - -- `un.name.ar`: United Nations Arabic country name - -- `un.name.en`: United Nations English country name - -- `un.name.es`: United Nations Spanish country name - -- `un.name.fr`: United Nations French country name - -- `un.name.ru`: United Nations Russian country name - -- `un.name.zh`: United Nations Chinese country name - -- `un.region.name`: United Nations region name - -- `un.region.code`: United Nations region code - -- `un.regionintermediate.name`: United Nations intermediate region - name - -- `un.regionintermediate.code`: United Nations intermediate region - code - -- `un.regionsub.name`: United Nations sub-region name - -- `un.regionsub.code`: United Nations sub-region code - -- `unhcr.region`: United Nations High Commissioner for Refugees region - name - -- `wvs.name`: World Values Survey numeric code country name - -### Note - -The Correlates of War (cow) and Polity 4 (p4) project produce codes in -country year format. 
Some countries go through political transitions -that justify changing codes over time. When building a purely -cross-sectional conversion dictionary, this forces us to make arbitrary -choices with respect to some entities (e.g., Western Germany, Vietnam, -Serbia). `countrycode` includes a reconciled dataset in panel format, -`codelist_panel`. Instead of converting code, we recommend that users -dealing with panel data "left-merge" their data into this panel -dictionary. - - ---- -## Countrycode-package - -### Description - -Convert country codes or country names - -### Details - -The `countrycode` function can convert to and from several different -country coding schemes. It uses regular expressions to convert country -names (e.g. Sri Lanka) into any of those coding schemes, or into -standardized country names in several languages. It can create variables -with the name of the continent and/or several regional groupings to -which each country belongs. - -Type ?codelist to get a list of available origin and destination codes. - -### Author(s) - -Vincent Arel-Bundock - -### References - - \url{http://arelbundock.com} - \url{https://github.com/vincentarelbundock/countrycode} - - ---- -## Countrycode - -### Description - -Converts long country names into one of many different coding schemes. -Translates from one scheme to another. Converts country name or coding -scheme to the official short English country name. Creates a new -variable with the name of the continent or region to which each country -belongs. - -### Usage - - countrycode( - sourcevar, - origin, - destination, - warn = TRUE, - nomatch = NA, - custom_dict = NULL, - custom_match = NULL, - origin_regex = NULL - ) - -### Arguments - - ---- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
sourcevar

Vector which contains the codes or country names to be converted -(character or factor)

origin

A string which identifies the coding scheme of origin (e.g., -"iso3c"). See codelist for a list of available -codes.

destination

A string or vector of strings which identify the coding scheme of -destination (e.g., "iso3c" or -c("cowc", "iso3c")). See codelist for a list -of available codes. When users supply a vector of destination codes, -they are used sequentially to fill in missing values not covered by the -previous destination code in the vector.

warn

Logical: if TRUE, prints the unique elements of sourcevar for which no match was -found

nomatch

When countrycode fails to find a match for the code of origin, it -fills in the destination vector with nomatch. The default -behavior is to fill non-matching codes with NA. If -nomatch = NULL, countrycode tries to use the origin vector -to fill in missing values in the destination vector. -nomatch must be either NULL, of length 1, or -of the same length as sourcevar.

custom_dict

A data frame which supplies a new dictionary to replace the -built-in country code dictionary. Each column contains a different code -and must include no duplicates. The data frame format should resemble -codelist. Users can pre-assign attributes to this custom -dictionary to affect behavior (see Examples section):

-
    -
  • "origin_regex" attribute: a character vector with the names of -columns containing regular expressions.

  • -
  • "origin_valid" attribute: a character vector with the names of -columns that are accepted as valid origin codes.

  • -
custom_match

A named vector which supplies custom origin and destination -matches that will supersede any matching default result. The name of -each element will be used as the origin code, and the value of each -element will be used as the destination code.

origin_regex

NULL or logical: when using a custom dictionary, if TRUE the -origin codes are matched as regular expressions; if FALSE they are matched -exactly. When NULL, countrycode behaves as if TRUE when the -origin name is listed in the custom dictionary's -origin_regex attribute, and as if FALSE otherwise. See the Examples -section below.
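
The `destination` and `nomatch` arguments described above can be combined in a short sketch. It assumes the `countrycode` package is installed; the input `"Atlantis"` is a made-up value chosen only because it should not match any country name.

```r
library(countrycode)

# "Atlantis" is a hypothetical non-country used to trigger the no-match path.
x <- c("Canada", "Antarctica", "Atlantis")

# Destination vector: "cowc" is tried first, and any value it leaves
# missing is then looked up in "iso3c".
sequential <- countrycode(x, origin = "country.name",
                          destination = c("cowc", "iso3c"), warn = FALSE)

# nomatch = NULL: values with no match are filled from the input vector
# instead of being set to NA.
filled <- countrycode(x, origin = "country.name", destination = "iso3c",
                      nomatch = NULL, warn = FALSE)

print(sequential)
print(filled)
```

Setting `warn = FALSE` only silences the message about unmatched values; it does not change the returned vector.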

- -### Note - -For a complete description of available country codes and languages, -please see the documentation for the `codelist` conversion dictionary. - -Panel data (i.e., country-year) can pose particular problems when -converting codes. For instance, some countries like Vietnam or Serbia go -through political transitions that justify changing codes over time. -This can pose problems when using codes from organizations like CoW or -Polity IV, which produce codes in country-year format. Instead of -converting codes using `countrycode()`, we recommend that users use the -`codelist_panel` data.frame as a base into which they can merge their -other data. This data.frame includes most relevant code, and is already -"reconciled" to ensure that each political unit is only represented by -one row in any given year. From there, it is just a matter of using -`merge()` to combine different datasets which use different codes. - -### Examples - -```r -library(countrycode) - -# ISO to Correlates of War -countrycode(c('USA', 'DZA'), origin = 'iso3c', destination = 'cown') - -# English to ISO -countrycode('Albania', origin = 'country.name', destination = 'iso3c') - -# German to French -countrycode('Albanien', origin = 'country.name.de', destination = 'iso.name.fr') - -# Using custom_match to supercede default codes -countrycode(c('United States', 'Algeria'), 'country.name', 'iso3c') -countrycode(c('United States', 'Algeria'), 'country.name', 'iso3c', -custom_match = c('Algeria' = 'ALG')) - -x <- c("canada", "antarctica") -countryname(x) -countryname(x, destination = "cowc", warn = FALSE) -countryname(x, destination = "cowc", warn = FALSE, nomatch = x) - -## Not run: - # Download the dictionary of US states from Github - - state_dict <- "https://raw.githubusercontent.com/vincentarelbundock/countrycode/main/data/custom_dictionaries/data_us_states.csv" - state_dict <- read.csv(state_dict) - - # The "state.regex" column includes regular expressions, so we set an attribute. 
- attr(state_dict, "origin_regex") <- "state.regex" - countrycode(c('AL', 'AK'), 'abbreviation', 'state', - custom_dict = state_dict) - countrycode(c('Alabama', 'North Dakota'), 'state.regex', 'state', - custom_dict = state_dict) - -## End(Not run) -``` - - ---- -## Countryname dict - -### Description - -A dataframe of alternative country names in many languages. Used -internally by the `countryname` function. - -### Format - -dataframe - - ---- -## Countryname - -### Description - -Converts long country names in any language to one of many different -country code schemes or country names. `countryname` does 2 passes on -the data. First, it tries to detect variations of country names in many -languages extracted from the Unicode Common Locale Data Repository. -Second, it applies `countrycode`'s English regexes to try to match the -remaining cases. Because it does two passes, `countryname` can sometimes -produce ambiguous results, e.g., Saint Martin vs. Saint Martin (French -Part). Users who need a "safer" option can use: -`countrycode(x, "country.name", "country.name")` Note that the function -works with non-ASCII characters. Please see the Github page for -examples. - -### Usage - - countryname( - sourcevar, - destination = "country.name.en", - nomatch = NA, - warn = TRUE - ) - -### Arguments - - - - - - - - - - - - - - - - - - - - -
sourcevar

Vector which contains the codes or country names to be converted -(character or factor)

destination

Coding scheme of destination (string such as "iso3c" enclosed in -quotes ""): type ?codelist for a list of available -codes.

nomatch

When countrycode fails to find a match for the code of origin, it -fills in the destination vector with nomatch. The default -behavior is to fill non-matching codes with NA. If -nomatch = NULL, countrycode tries to use the origin vector -to fill in missing values in the destination vector. -nomatch must be either NULL, of length 1, or -of the same length as sourcevar.

warn

Logical: if TRUE, prints the unique elements of sourcevar for which no match was -found

- -### Examples - -```r -## Not run: -x <- c('Afaganisitani', 'Barbadas', 'Sverige', 'UK') -countryname(x) -countryname(x, destination = 'iso3c') - -## End(Not run) -``` - - ---- -## Get dictionary - -### Description - -Download a custom dictionary to use in the `custom_dict` argument of -`countrycode()` - -### Usage - - get_dictionary(dictionary = NULL) - -### Arguments - - - - - - - - -
dictionary

A character string that specifies the dictionary to be retrieved. -It must be one of "global_burden_of_disease", "ch_cantons", "us_states", -"exiobase3", "gtap10". If NULL, the function will print the list of -available dictionaries. Default is NULL.

- -### Value - -If a valid dictionary is specified, the function will return that -dictionary as a data.frame. If an invalid dictionary or no dictionary is -specified, the function will stop and throw an error message. - -### Examples - -```r -## Not run: -cd <- get_dictionary("us_states") -countrycode::countrycode(c("MO", "MN"), origin = "state.abb", "state.name", custom_dict = cd) - -## End(Not run) -``` - - ---- -## Guess field - -### Description - -Users sometimes do not know what kind of code or field their data -contain. This function tries to guess by comparing the similarity -between a user-supplied vector and all the codes included in the -`countrycode` dictionary. - -### Usage - - guess_field(codes, min_similarity = 80) - -### Arguments - - - - - - - - - - - - -
codes

a vector of country codes or country names

min_similarity

the function returns all field names where more than -min_similarity% of codes are shared between the supplied -vector and the countrycode dictionary.
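
The threshold can be read as a share of unique values. The base-R sketch below assumes that similarity is the percentage of unique supplied values found in a dictionary column. The `toy_dict` data frame and the `percent_matched()` helper are hypothetical illustrations, not the package's implementation.

```r
# Hypothetical toy dictionary: one column per coding scheme.
toy_dict <- data.frame(
  iso3c = c("DZA", "CAN", "DEU"),
  name  = c("Algeria", "Canada", "Germany"),
  stringsAsFactors = FALSE
)

# Percentage of unique supplied values that appear in one dictionary column.
percent_matched <- function(codes, column) {
  codes <- unique(codes)
  100 * mean(codes %in% column)
}

# Repeated values count once; "XXX" is a deliberately invalid code.
codes <- c("DZA", "CAN", rep("DEU", 50), "XXX")

# One similarity score per dictionary column.
similarity <- vapply(toy_dict, percent_matched, numeric(1), codes = codes)

# Keep the columns whose score exceeds the threshold.
min_similarity <- 70
names(similarity)[similarity > min_similarity]
```

Because duplicates are dropped before matching, the 50 repeats of `"DEU"` do not inflate the score: three of the four unique values match the `iso3c` column.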

- -### Examples - -```r -# Guess ISO codes -guess_field(c('DZA', 'CAN', 'DEU')) - -# Guess country names -guess_field(c('Guinea','Iran','Russia','North Korea',rep('Ivory Coast',50),'Scotland')) -``` - - ---- diff --git a/vignettes/contributions.qmd b/vignettes/contributions.qmd index 464af20..626e10f 100644 --- a/vignettes/contributions.qmd +++ b/vignettes/contributions.qmd @@ -48,6 +48,8 @@ readr::write_csv(custom_dict, 'custom_dict.csv', na = '') When using custom dictionaries, it is often useful to give "meta" information to `countrycode` so that it knows how to use certain codes. To do this, we can set attributes of the dictionary. In this example, we download a dictionary of US state codes. Then, we identify a column of regular expressions using the `origin_regex` attribute, and we identify the valid origin codes using the `origin_valid` attribute. ```{r, error = TRUE, message = FALSE} +library(countrycode) + state_dict <- "https://raw.githubusercontent.com/vincentarelbundock/countrycode/main/data/custom_dictionaries/data_us_states.csv" state_dict <- read.csv(state_dict) diff --git a/vignettes/custom.qmd b/vignettes/custom.qmd index dbdd47f..6ae9080 100644 --- a/vignettes/custom.qmd +++ b/vignettes/custom.qmd @@ -6,6 +6,8 @@ It is easy to to create alternative functions with different default arguments a * `statecode` function to convert US state codes using a custom dictionary by default, that we download from the internet. ```{r, error = TRUE, message = FALSE} +library(countrycode) + ################################# # new function: name_to_iso3c # #################################