path: root/v2/tokenizer.go
Age | Commit message | Author
2022-09-16 | Make the public facing API be implemented in terms of io.Reader rather than | Bill Neubauer
2022-09-16 | Rewrite the tokenization process to work on streams rather than requiring the | Bill Neubauer
2022-09-16 | Removing the Index field from the token structures. | Bill Neubauer
2022-09-16 | Adds Copyright detection to the report generated by the classifier. | Bill Neubauer
2022-03-16 | Fixes handling of newline characters so that Normalize preserves the newline | Bill Neubauer
2022-03-16 | Automated g4 rollback of changelist 415285962. | Google Open Source
2022-03-16 | Fixes handling of newline characters so that Normalize preserves the newline | Bill Neubauer
2022-03-16 | API implementation for the Normalize method. | Bill Neubauer
2022-03-16 | Adding old style MIT license C-Ares. | Google Open Source
2021-03-24 | remove reduntant check | Bharat Biradar
2021-03-22 | Fix typo | Bharat Biradar
2020-11-13 | Change the public API to use []byte instead of string. | Bill Neubauer
2020-11-13 | Scope use of phrase induction to fix some bugs. | Bill Neubauer
2020-11-13 | Fix handling of extremely long lines by inserting EOL tokens at sentence | Bill Neubauer
2020-11-13 | This CL fixes a bug in text handling due to iterating over a slice of runes | Bill Neubauer
2020-11-13 | Add new testing scenarios and testing functionality. | Bill Neubauer
2020-11-13 | Needed to make a change to number tokenization to resolve an issue | Bill Neubauer
2020-11-13 | The tokenizer for the new version of the licenseclassifier. | Bill Neubauer