This is a new release.
It incorporates recent changes in the Document AI API, notably the structural shift from several endpoints to one endpoint with multiple processors. Google now appears to consider the service mature, so a major release of daiR appears appropriate.
Main package modifications:
- Added several processor-related functions: `list_processor_types()`, `create_processor()`, `enable_processor()`, `disable_processor()`, and `delete_processor()`.
- Added `get_text()` and `get_tables()` as parsimonious replacements for `text_from_dai_response()`, `text_from_dai_file()`, `tables_from_dai_response()`, and `tables_from_dai_file()`.
- Added `get_entities()` and `draw_entities()` to make use of Document AI's new form parser processor.
- Removed `dai_tab_sync()` and `dai_tab_async()` following Google's discontinuation of the v1beta2 endpoint on 31 January 2024.
- Modified the parameters of the `draw_*()` functions for better consistency with other functions.
- Renamed the `.R` files and regrouped the functions.
Test environments:
- Local Pop OS Linux, R 4.3.2
- Windows Server 2022 (on GitHub Actions), R 4.3.2
- Ubuntu 22.04 (on GitHub Actions), R 4.3.2
- Mac OS 12.7.3 (on GitHub Actions), R 4.3.2
There were no ERRORs, WARNINGs, or NOTEs.
I am not aware of any downstream dependencies.
############################################
This is a resubmission.
- The file LICENSE has been changed to match the CRAN template.
############################################
This is a new submission.
The package was archived in March after issues were not addressed in time. The issues have since been addressed and the package has been developed further as follows:
- Added functions to deal with the introduction of processor parameters in the Google Document AI API.
- Added a function to generate hOCR files from Document AI output.
- Modified several functions for increased versatility, for example:
  - Functions that previously took only JSON files as input now also take HTTP response objects.
  - The `draw_` family of functions now allows the user to customize the colour and line width of bounding boxes.
  - `build_token_df()` and `build_block_df()` now include confidence scores in the output, allowing for filtering.
Test environments:
- Local Pop OS Linux, R 4.3.1
- Windows Server 2022 (on GitHub Actions), R 4.3.1
- Ubuntu 22.04 (on GitHub Actions), R 4.3.1
- Mac OS 12 (on GitHub Actions), R 4.3.1
There were no ERRORs or WARNINGs.
There was 1 NOTE:
- New submission
I am not aware of any downstream dependencies.
###############################################
This is a resubmission. In this version I have addressed Gregor Seyer's comments. I have:
- put 'daiR' in single quotes throughout, except in the top line of the DESCRIPTION file, as `devtools::check(cran=TRUE)` throws an error if I do.
- put 'Document AI' in single quotes throughout in the DESCRIPTION file.
- added a web reference for the API in the DESCRIPTION file.
- added `\value` to all .Rd files that didn't have it and reviewed all `\value` entries to make sure they communicate the structure/class and meaning of the output, including in the places where no value is returned.
- removed all instances I could find of functions writing to the user's homespace. I checked all the examples, tests, and vignettes, as well as readme.md, and changed to tempdir() throughout.
- removed the function that wrote to the global environment. I should mention that the function, which creates an `.auth` object on load to store access tokens, was borrowed from a set of large R packages currently on CRAN, notably 'bigrquery' and 'googledrive'. This led me to believe that CRAN makes exceptions for credential-storing functions. My new authentication solution works, but in case it breaks, it would be useful to know whether CRAN does indeed allow this particular operation. (I'm assuming the maintainers of the other packages use it for good reason.)
I also made some additional changes. I have:
- removed two functions (`dai_has_token()` and `dai_deauth()`) that are redundant under the new authentication solution.
- removed one function (`create_folder()`) that I found on closer inspection to be unnecessary.
- rewritten several function descriptions (in the .Rd files) for improved clarity and consistency.
- revised news.md and the vignettes to reflect the above changes.
- changed the new version number to 0.9.0 in view of the scale of the combined changes.
Test environments:
- local Win 10 Enterprise install, R 4.1.0
- windows 10.0.17763 (on GitHub Actions), R 4.1.0
- ubuntu 20.04 (on GitHub Actions), R 4.1.0
- mac OS 10.15 (on GitHub Actions), R 4.1.0
- windows (on WinBuilder), R Devel
- fedora 24 (on rhub), R Devel
There were no ERRORs or WARNINGs.
There was 1 NOTE on rhub and WinBuilder:
- New submission
I am not aware of any downstream dependencies.
################################################
This is a first submission.
Test environments:
- local Win 10 Enterprise install, R 4.1.0
- windows 10.0.17763 (on GitHub Actions), R 4.1.0
- ubuntu 20.04 (on GitHub Actions), R 4.1.0
- mac OS 10.15 (on GitHub Actions), R 4.1.0
- windows (on WinBuilder), R Devel
- fedora 24 (on rhub), R Devel
There were no ERRORs or WARNINGs.
There was 1 NOTE on rhub and WinBuilder:
- Possibly mis-spelled words in DESCRIPTION: JSON (14:39) daiR (13:15, 14:77)
These are proper names.
I am not aware of any downstream dependencies.