Commit

Merge pull request #218 from bzz/tokenizer-flex-cgo
New, optional flex-based tokenizer
bzz authored Apr 17, 2019
2 parents ab3c26b + 7e136ba commit b6daf5c
Showing 9 changed files with 2,707 additions and 15 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -8,3 +8,4 @@ Makefile.main
build/
vendor/
java/lib/
.vscode/
7 changes: 7 additions & 0 deletions internal/tokenizer/common.go
@@ -0,0 +1,7 @@
// Package tokenizer implements file tokenization used by the enry content
// classifier. This package is an implementation detail of enry and should not
// be imported by other packages.
package tokenizer

// ByteLimit defines the maximum prefix of an input text that will be tokenized.
const ByteLimit = 100000
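
As an illustration of what "maximum prefix" means for `ByteLimit`, here is a minimal sketch of capping an input before tokenizing it. The `truncate` helper is hypothetical and is not part of this diff; how enry's tokenizers actually apply the limit is not shown in this excerpt.

```go
package main

import "fmt"

// ByteLimit mirrors the constant added in internal/tokenizer/common.go.
const ByteLimit = 100000

// truncate is a hypothetical helper (not from the diff): it returns at most
// the first ByteLimit bytes of content, without copying the underlying data.
func truncate(content []byte) []byte {
	if len(content) > ByteLimit {
		return content[:ByteLimit]
	}
	return content
}

func main() {
	small := []byte("package main")
	large := make([]byte, ByteLimit+500)

	// Inputs under the limit pass through unchanged; longer ones are cut.
	fmt.Println(len(truncate(small)), len(truncate(large)))
}
```

Note that a byte-based cut like this can split a multi-byte UTF-8 sequence at the boundary, so anything consuming the prefix should tolerate a truncated final rune.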