Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sync docs/llm_main with master #13045

Merged
merged 50 commits into from
Oct 5, 2023
Merged

Sync docs/llm_main with master #13045

merged 50 commits into from
Oct 5, 2023

Conversation

rmitsch
Copy link
Contributor

@rmitsch rmitsch commented Oct 5, 2023

Description

Sync docs/llm_main with master.

Types of change

Chore/docs.

Checklist

  • I confirm that I have the right to submit this contribution under the project's MIT license.
  • I ran the tests, and all new and existing tests passed.
  • My changes don't require a change to the documentation, or if they do, I've added all required information.

svlandeg and others added 30 commits July 4, 2023 15:05
Merge `master` into `develop`
…2vec` pipe to the callback (#12785)

* `Language.replace_listeners`: Pass the replaced listener and the `tok2vec` pipe to the callback

* Update developer docs

* `isort` fixes

* Add error message to assertion

* Add clarification to dev docs

* Replace assertion with exception

* Doc fixes
* Setting up weasel branch (#12456)

* remove project-specific functionality

* remove project-specific tests

* remove project-specific schemas

* remove project-specific information in about

* remove project-specific functions in util.py

* remove project-specific error strings

* remove project-specific CLI commands

* black formatting

* restore some functions that are used beyond projects

* remove project imports

* remove imports

* remove remote_storage tests

* remove one more project unit test

* update for PR 12394

* remove get_hash and get_checksum

* remove upload_ and download_file methods

* remove ensure_pathy

* revert clumsy fingers

* reinstate E970

* feat: use weasel as spacy project command (#12473)

* feat: use weasel as spacy project command

* build: use constrained requirement for weasel

* feat: add weasel to the library requirements

* build: update weasel to new version

* build: use specific weasel tag

* build: use weasel-0.1.0rc1 from PyPI

* fix: remove weasel from requirements.txt

* fix: requirements.txt and setup.cfg need to reflect each other

* feat: remove legacy spacy project code

* bump version

* further merge fixes

* isort

---------

Co-authored-by: Basile Dura <[email protected]>
…Greek (#12829)

* Update universe.json

* Update universe.json

add some missing commas in the greCy's description.

* Update punctuation.py

Add mathematical left and right angle brackets as punctuation for ancient Greek for better tokenization.
* Update numpy build constraints for numpy 1.25

Starting in numpy 1.25 (see
https://github.com/numpy/numpy/releases/tag/v1.25.0), the numpy C API is
backwards-compatible by default.

For python 3.9+, we should be able to drop the specific numpy build
requirements and use `numpy>=1.25`, which is currently
backwards-compatible to `numpy>=1.19`.

In the future, the python <3.9 requirements could be dropped and the
lower numpy pin could correspond to the oldest supported version for the
current lower python pin.

* Turn off fail-fast

* Revert "Turn off fail-fast"

This reverts commit 4306f51.

* Update for python 3.6

* Fix typo
* Support registered vectors

* Format

* Auto-fill [nlp] on load from config and from bytes/disk

* Only auto-fill [nlp]

* Undo all changes to Language.from_disk

* Expand BaseVectors

These methods are needed in various places for training and vector
similarity.

* isort

* More linting

* Only fill [nlp.vectors]

* Update spacy/vocab.pyx

Co-authored-by: Sofie Van Landeghem <[email protected]>

* Revert changes to test related to auto-filling [nlp]

* Add vectors registry

* Rephrase error about vocab methods for vectors

* Switch to dummy implementation for BaseVectors.to_ops

* Add initial draft of docs

* Remove example from BaseVectors docs

* Apply suggestions from code review

Co-authored-by: Sofie Van Landeghem <[email protected]>

* Update website/docs/api/basevectors.mdx

Co-authored-by: Sofie Van Landeghem <[email protected]>

* Fix type and lint bpemb example

* Update website/docs/api/basevectors.mdx

---------

Co-authored-by: Sofie Van Landeghem <[email protected]>
* feat: add example stubs

* fix: add required annotations

* fix: mypy issues

* fix: use Py36-compatible Portocol

* Minor reformatting

* adding further type specifications and removing internal methods

* black formatting

* widen type to iterable

* add private methods that are being used by the built-in convertors

* revert changes to corpus.py

* fixes

* fixes

* fix typing of PlainTextCorpus

---------

Co-authored-by: Basile Dura <[email protected]>
Co-authored-by: Adriane Boyd <[email protected]>
…-v1.3-revert

Revert "Extend to spacy-transformers v1.3.x (#12877)"
* Extend to weasel v0.3

* Clean up unused imports in test_cli
* initial

* initial documentation run

* fix typo

* Remove mentions of Torchscript and quantization

Both are disabled in the initial release of `spacy-curated-transformers`.

* Fix `piece_encoder` entries

* Remove `spacy-transformers`-specific warning

* Fix duplicate entries in tables

* Doc fixes

Co-authored-by: Sofie Van Landeghem <[email protected]>

* Remove type aliases

* Fix copy-paste typo

* Change `debug pieces` version tag to `3.7`

* Set curated transformers API version to  `3.7`

* Fix transformer listener naming

* Add docs for `init fill-config-transformer`

* Update CLI command invocation syntax

* Update intro section of the pipeline component docs

* Fix source URL

* Add a note to the architectures section about the `init fill-config-transformer` CLI command

* Apply suggestions from code review

Co-authored-by: Sofie Van Landeghem <[email protected]>

* Update CLI command name, args

* Remove hyphen from the `curated-transformers.mdx` filename

* Fix links

* Remove placeholder text

* Add text to the model/tokenizer loader sections

* Fill in the `DocTransformerOutput` section

* Formatting fixes

* Add curated transformer page to API docs sidebar

* More formatting fixes

* Remove TODO comment

* Remove outdated info about default config

* Apply suggestions from code review

Co-authored-by: Sofie Van Landeghem <[email protected]>

* Add link to HF model hub

* `prettier`

---------

Co-authored-by: Madeesh Kannan <[email protected]>
Co-authored-by: Sofie Van Landeghem <[email protected]>
`weasel` (using `cloudpathlib`) does not currently support remote paths
for python 3.12.
* add span key option for CLI evaluation

* Rephrase CLI help to refer to Doc.spans instead of spancat

* Rephrase docs to refer to Doc.spans instead of spancat

---------

Co-authored-by: Adriane Boyd <[email protected]>
* Drop support for python 3.6

* Update docs
* adding rolegal model to the spaCy universe

* Fix formatting

* Use raw URL

* update image url and example

* fix pip and update url to raw

* okay, let's add thumb instead of image 🐙

* Update website/meta/universe.json

---------

Co-authored-by: Adriane Boyd <[email protected]>
* Load the cli module lazily for spacy.info

This avoids that the `spacy` module cannot be imported when the
users chooses not to install `typer`/`requests`.

* Add test

---------

Co-authored-by: Adriane Boyd <[email protected]>
adrianeboyd and others added 20 commits September 28, 2023 16:01
…master-v3.7-1

Update develop from master for v3.7
`tqdm` can cause deadlocks in the test suite if enabled.
Redesigned cython profiling and other minor updates for python 3.12
* Docs for v3.7.0

* Minor fixes

* Extend Weasel notes

* Minor edits

* Update version in README
Revert "Load the cli module lazily for spacy.info (#12962)"
Sync `master` with `docs/llm_main`
@rmitsch rmitsch added the docs Documentation and website label Oct 5, 2023
@rmitsch rmitsch merged commit 030d63a into docs/llm_main Oct 5, 2023
28 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs Documentation and website
Projects
None yet
Development

Successfully merging this pull request may close these issues.