path: root/icing/tokenization
Age  Commit message  Author
2023-11-30  Update Icing from upstream.  Jiayu Hu
Descriptions:
[Icing][version 3] Bump kVersion to 3
Make lite index magic dependent on `IcingSearchEngineOptions::build_property_existence_metadata_hits`
Add a flag in IcingSearchEngineOptions to control whether to build property existence metadata hits
Support `hasProperty(property_path)` in the advanced query language
Add PropertyExistenceIndexingHandler to index property existence metadata hit
[JoinIndex Improvement][11/x] Add IcingSearchEngine initialization unit test for switching join index
[JoinIndex Improvement][10/x] Change/Add IcingSearchEngine unit tests
[JoinIndex Improvement][9/x] Integrate QualifiedIdJoinIndexImplV2 with IcingSearchEngine
[JoinIndex Improvement][8/x] Integrate QualifiedIdJoinIndexImplV2 with JoinProcessor
[JoinIndex Improvement][8/x] Integrate QualifiedIdJoinIndexImplV2 with QualifiedIdJoinIndexingHandler
[JoinIndex Improvement][7/x] Create QualifiedIdJoinIndex interface
[JoinIndex Improvement][6.1/x] Unit test (Optimize)
[JoinIndex Improvement][6.0/x] Unit test (General, Put, GetIterator)
[JoinIndex Improvement][5.3/x] Implement Optimize
Remove accents from Greek letters in normalizer
Make arm emulator tests build-only.
[JoinIndex Improvement][5.2/x] Implement GetIterator
[JoinIndex Improvement][5.1/x] Implement Put
[JoinIndex Improvement][5.0/x] Branch QualifiedIdJoinIndex to QualifiedIdJoinIndexImplV2
[JoinIndex Improvement][4/x] Implement PostingListJoinDataAccessor
[JoinIndex Improvement][3/x] Implement PostingListJoinDataSerializer and DocumentIdToJoinInfo data type
[JoinIndex Improvement][2/x] Create NamespaceFingerprintIdentifier
[JoinIndex Improvement][1/x] Implement namespace_id_old_to_new in Compaction
Update test to also handle ICU 74 segmentation rules.
[Icing][Expand QueryStats][3/x] Add new fields into QueryStats (1)
[Icing][Expand QueryStats][2/x] Refactor QueryStatsProto
[Icing][Expand QueryStats][1/x] Publish DocHitInfoIterator CallStats
Add additional property filter tests
Deprecate hit_intersect_section_ids_mask in DocHitInfoIterator
Change default requires_full_emulation to False for portable_cc_test (third_party/icing/testing)
Cleanup Set requires_full_emulation to True for selective tests
Fix monkey test failures
Complete monkey test logic to change schema during monkey test runtime
Refactor monkey test to prepare for schema update
Fix the schema bug found by monkey test with seed 2551429844
Move set query stats to the very top of InternalSearch()
Apply section restriction only on leaf nodes
[6/n] Fix callsites in Icing that forgot to check libtextclassifier3::Status (Advanced query parser)
[5/n] Fix callsites in Icing that forgot to check libtextclassifier3::Status (PersistentHashMap)
[4/n] Fix callsites in Icing that forgot to check libtextclassifier3::Status (PostingListIntegerIndexSerializer)
[3/n] Fix callsites in Icing that forgot to check libtextclassifier3::Status (PostingListHitSerializer)
[2/n] Fix callsites in Icing that forgot to check libtextclassifier3::Status (Posting list storage)
[1/n] Fix callsites in Icing that forgot to check libtextclassifier3::Status (Non-functional changes)
Decouple section restriction data from iterators
Fix the crash when a schema type gets more indexable properties than allowed
Add a checker to verify the property data type matches the schema.
Change global std::string in i18n-utils to constexpr std::string_view.
Adjust LiteIndex sort at indexing check conditions.
Bug: 305098009 Bug: 307508735 Bug: 291130542 Bug: 275121148 Bug: 303239901 Bug: 301116242 Bug: 299321977 Bug: 300135897 Bug: 297549761 Bug: 309826655 Bug: 296349369 Bug: 302192690 Bug: 302609704 Bug: 301566713
NO_IFTTT="False Alarm: The path is only valid in G3. kVersion is changed to 3, and schema is compatible with version 1."
Change-Id: I8c4c3cd9b93e5240bd774f0a3d6d812f7a9ec198
2023-05-11  Update Icing from upstream.  Tim Barron
Descriptions:
Modify the definition of propertyDefined:
Remove default args in SchemaStore::SetSchema and fix calls
Add allow_circular_schema_definitions flag
Onboard version detection to Icing
Add version util to help read/write version info
Add support for the overlay schema.
Allow cycles in schema-property-iterator
Add joinable properties into schema definition cycle restrictions.
Loosen circular references restriction for Schema Definitions.
Implement BackupSchemaProducer to generate a backup schema
Minor fix: remove a redundant log
Allow schema types to inherit from more than one parent
allow nested document properties to accept documents of subtype
Support polymorphism for Icing projection in Search and Get API
Add max_joined_child_per_parent into ResultSpec and change behavior
Support Icing schema type polymorphism for the search filter API
Verify that every child type's property set has included all compatible properties from parent types
Add individual type index latency
Build the iterator node for the propertyDefined() custom function
Advance all hits with same doc id from and merge sections once for the same bucket iter
Introduce DocHitInfoIteratorPropertyInSchema for property existence check
Add SchemaUtil::BuildTransitiveInheritanceGraph to build an inheritance map from schema
Introduce a lookup method for a property defined in a schema
Rollback of: Allow LanguageSegmenter::Iterators to declare AccessType.
Adds join info to QueryStatsProto
Bug:280698419 Bug:280698125 Bug:280698121 Bug:280697513 Bug:276349029 Bug:272145329 Bug:270102295 Bug:269295094 Bug:268680462 Bug:265304217 Bug:259744228 Bug:259743562 Bug:256022027
Change-Id: I54cd1d22121c314f8c238d2d49f0809165dc0ca3
2023-03-16  Update Icing from upstream.  Tim Barron
Descriptions:
Set overall index latency
Change GetUsageScores return type to optional.
Upstream fix to thread-safety annotation for cached_break_iterator_
Bug: 259744228
Change-Id: Ia7a5032fd7655db773e311173f5735e4451b30c1
2023-03-14  Update Icing from upstream.  Tim Barron
Descriptions:
Cache an instance of UBreakIterator to reduce unnecessary creations.
Cap number of individual IntegerIndexStorages that IntegerIndex creates.
Change error in trimRightMostNode from Unimplemented to InvalidArgument.
Add detection for new language features of List Filters Query Language.
Add option to control threshold to rebuild index during optimize by flag
Add option to control use of namespace id to build urimapper by flag.
Enforce schema validation for joinable config.
Adopt bucket splitting for IntegerIndexStorage.
Implement bucket splitting function.
Add Icing initialization unit tests for QualifiedIdTypeJoinableIndex.
Add Icing schema change unit tests for QualifiedIdTypeJoinableIndex.
Add Icing optimization unit tests for QualifiedIdTypeJoinableIndex.
Integrate QualifiedIdTypeJoinableIndex into IcingSearchEngine.
Implement QualifiedIdJoinablePropertyIndexingHandler.
Change QualifiedIdTypeJoinableIndex to store raw qualified id string.
Pass info about unnormalized query terms through lexer/parser/visitor.
Bug: 208654892 Bug: 263890397 Bug: 259743562 Bug: 272145329 Bug: 227356108
Change-Id: I438a390ddda5673cf2b5781af502f2b7cfeaee74
2023-03-06  Update Icing from upstream.  Tim Barron
Descriptions:
Refactor IndexProcessor
Rename Joinable Cache as Joinable Index
Implement Optimize and Clear for QualifiedIdTypeJoinableCache
Add JoinablePropertyMetadata reverse lookup
Allow code creating LanguageSegmenter::Iterators to declare AccessType
Further codifies the escape behavior in the parser test
Bug: 263890397 Bug: 268680462 Bug: 270102295
Change-Id: I3233733b40e985e11c4a6d75c1528cd6a72c1173
2023-03-01  Update Icing from upstream.  Terry Wang
Descriptions:
Add PropertyUtil for all property name/path related operations
[JoinableCache][2.0/x] Create SchemaPropertyIterator
[JoinableCache][2.1/x] Handle nested indexable flag
[JoinableCache][2.2/x] Add schema cycle dependency detection for SchemaPropertyIterator
[JoinableCache][3.0/x] Refactor SectionManager
[JoinableCache][3.1/x] Add unit tests for SectionManager::Builder and SchemaTypeManager
[NumericSearch][Storage][12/x] Implement Edit and GetIterator for IntegerIndex
[NumericSearch][Storage][13.0/x] Rename numeric-index_test as integer-index_test
[NumericSearch][Storage][13.1/x] Add IntegerIndexTest
Support the "len", "sum" and "avg" functions in advanced scoring.
Support the "this.childrenScores()" function to allow expressing children scores of joins in advanced scoring.
Create an integration test for Join with advanced scoring
Rename the word "children" to "args" for function related ScoreExpression
Improve IndexBlock by PRead/PWrite instead of repeating mmap/msync/unmap
Refactor QueryVisitor to prepare for support for function calls.
Add support for function calls.
Fix breakage in score-and-rank_benchmark.
[NumericSearch][Storage][adhoc][ez] Fix comment for IntegerIndex
[NumericSearch][Storage][14/x] Create first IntegerIndexStorage benchmark
Rename Icing schema related terminology to prepare for polymorphism support
[JoinableCache][4.0/x] Move common methods from SectionManager to PropertyUtil
[JoinableCache][4.1/x] Retire GetSectionContent
[JoinableCache][4.2/x] Polish SectionManagerTest
Modify QueryVisitor to do:
[NumericSearch][Storage][15/x] Implement TransferIndex for IntegerIndexStorage
[NumericSearch][Storage][16/x] Implement Optimize and last added document id for IntegerIndex
[NumericSearch][rollout][1/x] Include indexable int64 into SchemaDelta and backward compatibility
Add backwards compatibility test for Icing schema storage migration.
Implement trim the right-most node from the doc-hit-info-iterator.
Add TrimmedNode structure into doc-hit-info-iterator.
[JoinableCache][5/x] Implement JoinableProperty and JoinablePropertyManager
[JoinableCache][6/x] Add JoinablePropertyManager into SchemaTypeManager
[JoinableCache][7/x] Implement ExtractJoinableProperties
[JoinableCache][8/x] Create class QualifiedIdTypeJoinableCache
[JoinableCache][9/x] Implement factory method for QualifiedIdTypeJoinableCache
[JoinableCache][10/x] Implement Get and Put for QualifiedIdTypeJoinableCache
[JoinableCache][11/x] Add unit tests for QualifiedIdTypeJoinableCache
Modify DocHitInfoIteratorSectionRestrict to allow multi-property restricts
Fix the definition of LiteIndex::WantsMerge.
[NumericSearch][rollout][2.0/x] Rollout persistent IntegerIndex
[NumericSearch][rollout][2.1/x] Add more tests for integer index restoration and optimization
[JoinableCache][adhoc][ez] Remove qualified id type joinable cache size info from document storage info
Integrate trim right node into suggestion processor.
Bug: 208654892 Bug: 228240987 Bug: 249829533 Bug: 256081830 Bug: 259744228 Bug: 261474063 Bug: 263890397 Bug: 266103594 Bug: 268738297 Bug: 269295094
Change-Id: I5f1b3f3ed0b5d6933dc8c2ab3279904f7706b23e
2022-12-12  Sync from upstream.  Tim Barron
Descriptions:
Add ScoringSpec into JoinSpec. Rename joined_document to child_document.
Create JoinedScoredDocumentHit class and refactor ScoredDocumentHitsRanker.
Implement initial Join workflow
Implement the Lexer for Icing Advanced Query Language
Create struct Options for PersistentHashMap
Premapping FileBackedVector
Create class PersistentHashMapKeyMapper
Add integer sections into TokenizedDocument and rename string sections
Create NumericIndex interface and DocHitInfoIteratorNumeric
Implement DummyNumericIndex and unit test
Change PostingListAccessor::Finalize to rvalue member function
Define the Abstract Syntax Tree for Icing's list_filter parser.
Refactor query processing and score
Refactor IcingSearchEngine for AppSearch Dynamite Module 0p APIs
Implement the Lexer for Icing Advanced Scoring Language
Add a common interface for IcingSearchEngine and dynamite client
Implement a subset of the query grammar.
Refactor index processor
Add integer index into IcingSearchEngine and IndexProcessor
Implement the parser for Icing Advanced Scoring Language
Implement IntegerIndexData and PostingListUsedIntegerIndexDataSerializer
Add PostingListAccessor abstract class for common components and methods
Implement PostingListIntegerIndexDataAccessor
Create PostingListIntegerIndexDataAccessorTest
Fix Icing Segmentation tests for word connectors that changed in ICU 72.
Modify the Advanced Query grammar to allow functions to accept expressions.
Implement QueryVisitor.
Enable the Advanced Query Parser to handle member functions
Refactor the Scorer class to support the Advanced Scoring Language
Integrate advanced query parser with the query processor.
Implement support for JoinSpec in Icing.
Implement the Advanced Scoring Language for basic functions and operators
Bug: 208654892 Bug: 249829533 Bug: 256022027 Bug: 261474063 Bug: 240333360 Bug: 193919210
Change-Id: I5f5bdc6249282ecc4b014b4fbdf8e2d1f8b20c19
2022-11-11  Sync from upstream.  Armaan Danewalia
Descriptions:
Include equals-proto and convert CodeToString methods to inline.
Add schema and document generators used by monkey test
Create in-memory icing for monkey testing
[NumericSearch][Storage][refactor_posting_list][1/x] Create PostingListUsedHitSerializer
[NumericSearch][Storage][refactor_posting_list][2/x] Create PostingListUsedHitSerializerTest
[NumericSearch][Storage][refactor_posting_list][3/x] Refactor all posting list related classes to use PostingListUsedSerializer
Adds a JoinSpecProto, and a new ranking strategy
Create Monkey Test Runner that randomly performs Icing API calls and check the results with the in-memory Icing
Support monkey testing the DeleteByQuery and Search APIs of Icing search engine
Directly include proto.h files from portable_proto_library() instead of wrapper pb.h files.
Directly include proto.h files from portable_proto_library() instead of wrapper pb.h files.
Directly include proto.h files from portable_proto_library() instead of wrapper pb.h files.
Removes nested_query from JoinSpec, as nested_search_spec includes it.
Directly include proto.h files from portable_proto_library() instead of wrapper pb.h files.
Swap the order when we build the doc-hit-info-iterator-and.
Support monkey testing section restrictions in the DeleteByQuery and Search APIs
Allow the in-memory icing to return the number of deleted documents, and check it with the delete stats of DeleteByNamespace, DeleteBySchemaType, and DeleteByQuery in the monkey test
[NumericSearch][Storage][refactor_posting_list][4/x] Move posting list common files into another directory
Add file-skipping for exports to Jetpack using @exportToAOSP:skipFile() tag
Change invalid type int32 to int32_t in icing-search-enging-jni.cc.
Minor fix, move generate term_iterator inside of the ranking_strategy checking
Address index out of bounds issue in third_party/icing/result/snippet-retriever.cc. This issue is causing test failures (see example failure: http://sponge2/cfcda71a-1312-455a-9a70-821c74c708e6).
Fix 1 DependencyCleaner findings:
Implement URL tokenization for Icing-lib [2/2]:
[NumericSearch][General][1/x] Create numeric.proto and add IntegerIndexingConfig
[NumericSearch][General][2/x] Refactor GetStringSectionContent and GetStringPropertyContent
[NumericSearch][General][3/x] Add DataType into SectionMetadata and change AssignSections
Move unit test constant definition into schema builder
[NumericSearch][General][4/x] Create templated Section and SectionGroup
[NumericSearch][General][5/x] Create BasicHit
Change URL tokenizer's url_parse dependency to use third_party/url_parse:url_parse_stripped_down
Exclude memory-mapped-file-leak_test.cc from export to AOSP.
Rollback of changelist 487633403. Reason: This cl breaks YouTube Music builds due to duplicate symbols.
[ez] Fix ScoredDocumentHit related comparators
Bug: 193244409 Bug: 246984163 Bug: 249829533 Bug: 256022027 Bug: 256679292 Bug: 246964044
Change-Id: I55c2ed417a31321c22de377e97ffe0096478d28c
2022-10-11  Remove url-tokenizer from upstream-master.  Terry Wang
url-tokenizer is not ready for Jetpack yet, this is added by accident.
Bug: 246964044
Change-Id: I854084d8880e410f6bf7740cb9a0bf8a77b973dc
2022-10-11  Update Icing from upstream.  Terry Wang
Descriptions:
Implement URL tokenization for Icing-lib
Adds RFC822_HOST_ADDRESS
Support Suggestion API could be ordered by term's frequency.
Bug: 246964044 Bug: 230553264
Change-Id: Id7e7b1e080bb66ccf03452e75b72c5cceed2f7db
2022-09-30  Update Icing from upstream.  Terry Wang
Descriptions:
======================================================================
Integrate ANTLR-based advanced query prototype with query processor.
======================================================================
[PersistentHashMap][6.1/x] Wrap the return value of KeyMapper::ComputeChecksum by StatusOr
======================================================================
[PersistentHashMap][6.0/x] Replace GetValuesToKeys with iterator for KeyMapper
======================================================================
[PersistentHashMap][5.1/x] Allow client to specify initial num buckets
======================================================================
[PersistentHashMap][5/x] Implement rehashing
======================================================================
Add SchemaType filter and Document Id filter in Search Suggestion API.
======================================================================
Follow up to cl/463377778
======================================================================
Add Document Id filters in Search Suggestion API.
======================================================================
Cleanup LSC: Replace inclusion of *_proto_portable.pb.h files coming from portable_proto_library() with *(!_proto_portable).pb.h coming from cc_proto_library().
======================================================================
Cleanup LSC: Replace inclusion of *_proto_portable.pb.h files coming from portable_proto_library() with *(!_proto_portable).pb.h coming from cc_proto_library().
======================================================================
Cloned from CL 464902284 by 'g4 patch'.
======================================================================
Cleanup LSC: Replace inclusion of *_proto_portable.pb.h files coming from portable_proto_library() with *(!_proto_portable).pb.h coming from cc_proto_library().
======================================================================
Cleanup Remove unused visibility specs (last referenced in codebase over 132 days ago).
======================================================================
Cleanup LSC: Replace inclusion of *_proto_portable.pb.h files coming from portable_proto_library() with *(!_proto_portable).pb.h coming from cc_proto_library().
======================================================================
Cleanup LSC: Replace inclusion of *_proto_portable.pb.h files coming from portable_proto_library() with *(!_proto_portable).pb.h coming from cc_proto_library().
======================================================================
Remove dsaadati@ from third_party/icing OWNERS
======================================================================
Cleanup Move package level default_copts attribute to copts.
======================================================================
Refactor QueryProcessor and QueryProcessTest in preparation for adding ANTLR prototype to parse queries with search_type EXPERIMENTAL_ICING_ADVANCED_QUERY.
======================================================================
Bug: 208654892
Bug: 230553264
Bug: 237324702
Bug: 193919210
Change-Id: I2f0a612747ccb754502489a9b168406532cffaee
2022-08-11  Sync from upstream.  Tim Barron
Descriptions:
======================================================================
Implement new version of ResultState and ResultStateManager to 1) enforce a page byte size limit and 2) improve handling of pagination when we encounter deleted documents.
======================================================================
Fix bugs in IcingDynamicTrie::Delete.
======================================================================
Implement IcingDynamicTrie::IsBranchingTerm.
======================================================================
Change Icing default logging level to INFO
======================================================================
Refactor KeyMapper class to be an interface.
======================================================================
Improve NamespaceChecker logic to improve Suggest latency.
======================================================================
Change icing native log tag to "AppSearchIcing"
======================================================================
Implement Index Compaction rather than rebuilding index during Compaction.
======================================================================
Implement reverse iterator for IcingDynamicTrie
======================================================================
Avoid adding unnecessary branch points during index compaction
======================================================================
Invalidate expired result states when adding to/retrieving from ResultStateManager.
======================================================================
Add new methods (MutableView, MutableArrayView, Append, Allocate) to FileBackedVector
======================================================================
Create and implement PersistentHashMap class.
======================================================================
Implement RFC822 Tokenizer
======================================================================
Remove uses of StringPrintf in ICING_LOG statements
======================================================================
Properly set query latency when an error is encountered or results are empty.
======================================================================
Bug: 146903474
Bug: 152934343
Bug: 193919210
Bug: 193453081
Bug: 231368517
Bug: 235395538
Bug: 236412165
Change-Id: I8aa278cebb12b25b39deb0ef584c0f198952659d
2022-05-20  Sync from upstream.  Tim Barron
Descriptions:
======================================================================
Fix bug in schema store where a failure during RegenerateDerivedFiles would lead to a dangling pointer.
======================================================================
Add RAII class that will create and destroy file directories.
======================================================================
Convert MainIndexDebugInfoProto and LiteIndexDebugInfoProto to string
======================================================================
Make SchemaStore move assignable.
======================================================================
Rollback of "convert the string lexicon debug information to a protocol buffer"
======================================================================
Fix NPE caused by a remap failure.
======================================================================
Unify the name "priority" and "severity" in Icing logging
======================================================================
Avoiding string formatting in Icing logging when we should not log
======================================================================
Switch to use an enum with BASIC/DETAILED to control the verbosity of getDebugInfo
======================================================================
Remove the behavior in the Language Segmenter to filter out non-ascii+non-alphanumeric characters.
======================================================================
Fix the SetSchema bug when we override a schema with nested incompatible types
======================================================================
Wrap __android_log_write with __android_is_loggable
======================================================================
Enable removing expired page tokens to free cache space
======================================================================
Bug: 146903474
Bug: 193453081
Bug: 222349894
Bug: 229770338
Bug: 229778472
Bug: 230879098
Bug: 231416401
Bug: 231237897
Bug: 232273174
Change-Id: I22f050de16f56dce39e12a7033947519d598c840
2022-05-03  Sync from upstream.  Jiayu Hu
Descriptions:
======================================================================
Export Icing logging control to JNI
======================================================================
Prepare Icing logging class for JNI export
======================================================================
Export getDebugInfo to JNI
======================================================================
Expose the return_deleted_document_info parameter for deleteByQuery JNI
======================================================================
Enable runtime log control for Icing Library
======================================================================
Fix 1 ClangTidyBuild finding:
======================================================================
Update comments to run benchmarks.
======================================================================
Making icing's own logging class
======================================================================
Convert the string lexicon debug information to a protocol buffer
======================================================================
Fix issue with printing fingerprinted key in our error logs.
======================================================================
Support dump function for IcingSearchEngine
======================================================================
Bug: 146903474
Bug: 229778472
Bug: 209071710
Bug: 222349894
Bug: 225914361
Change-Id: I9750149d1ed0b59f345b8828ff312a62773667fe
2022-04-12  Sync from upstream.  Tim Barron
Descriptions:
======================================================================
Add some additional logging that will help diagnose b/218413237
======================================================================
Mark VerbatimTokenizer::ResetToTokenStartingAfter as 'override'.
======================================================================
Support dump function for SchemaStore
======================================================================
Bug: 218413237
Change-Id: I9efd1dd388cd510df15989c84a8577d4ba56ab3c
2022-04-12  Sync from upstream.  Tim Barron
======================================================================
Refactor DocumentStore::Initialize to improve readability of document store recovery.
======================================================================
Remove non-NDK API usages of ICU4C in libicing.
======================================================================
Move IcuDataFileHelper to the testing directory since it is a test-only util.
======================================================================
Support dump function for DocumentStore
======================================================================
Switch to use PRead rather than MMap in the proto log.
======================================================================
Support dump function for main/lite index and lexicon
======================================================================
Fix LiteIndex::AppendHits
======================================================================
Enable and fix DocumentStoreTest.LoadScoreCacheAndInitializeSuccessfully
======================================================================
Fix MainIndex::GetStorageInfo.
======================================================================
Fix icing-search-engine_fuzz_test by making IcuLanguageSegmenterIterator::Advance non-recursive.
======================================================================
Allow to return additional information for deleted documents in DeleteByQuery
======================================================================
Using enum class in Token::Type for better type safety.
======================================================================
Bug: 158089703
Bug: 185845269
Bug: 209071710
Bug: 211785521
Bug: 218413237
Bug: 223549255
Change-Id: Id2786047ab279734bdd2aee883e82607b6a0e403
2021-12-28  Sync from upstream.  Dan Saadati
Descriptions:
================
Normalize Tokens by Token type when retrieving snippets
================
Rename max_window_bytes to max_window_utf32_length, Delete the max_tokens_per_doc field in IcingSearchEngineOptions.
================
Handle suggestion namespace ownership.
================
Fix OkStatus() is not a valid argument to StatusOr in Main_index.RetrieveMoreHits.
================
Allow advancing when current indices are negative in CharacterIterator
================
Adds support for verbatim tokenization and indexing in IcingLib
================
Renames TokenizerIterator Reset functions
================
Add term_match_type to SuggestionSpec proto
================
Unify the C++ proto enum style
================
Allow zero property weights in IcingLib
================
Bug: 204333391
Bug: 152934343
Bug: 205209589
Bug: 206147728
Bug: 209993976
Change-Id: Id94a377fd37c5eb7ebc3d7547cf8ff0ad4152620
2021-10-21  Sync from upstream.  Tim Barron
Descriptions:
================
Replace refs to c lib headers w/ c++ stdlib equivalents.
================
Update IDF component of BM25F Calculator in IcingLib
================
Expose QuerySuggestions API.
================
Change the tokenizer used in QuerySuggest.
================
Add SectionWeights API to Icing.
================
Apply SectionWeights to BM25F Scoring.
================
Replaces uses of u_strTo/FromUTF32 w/ u_strTo/FromUTF8.
Bug: 152934343
Bug: 202308641
Bug: 203700301
Change-Id: Ic884a84e5ff4c9c04b2cd6dd1fce90765aa4446e
2021-09-08  Sync from upstream.  My Name
Descriptions:
================
Remove no-longer-used write paths for file-backed-proto-log.
================
Modify segmentation rules to consider any segment that begins with a non-Ascii alphanumeric character as valid
=================
Implement CalculateNormalizedMatchLength for IcuNormalizer.
================
Add additional benchmark cases that were useful in developing submatching and CalculateNormalizedMatchLength for IcuNormalizer
=================
Switch NormalizationMap from static const std::unordered_map<char16_t, char16_t>& to static const std::unordered_map<char16_t, char16_t> *const.
==================
Bug: 147509515
Bug: 149610413
Bug: 195720764
Bug: 196257995
Change-Id: Iabdb34a983b5d47daca808888a46c241767d93bf
2021-08-13  Merge androidx-platform-dev/external/icing upstream-master into upstream-master  Alexander Dorokhine
Change-Id: Id39bea14b3a3d378d722dc552e4f3bd4249f3f94
2020-11-18  Update Icing from upstream.  Alexander Dorokhine
Change-Id: Ic022a44e876a6060a47e0db991e63b2b73807769
2020-11-05  Update icing from upstream.  Alexander Dorokhine
Change-Id: Ia63a77142ec717c0d9a81ec0a5c1267381858200
2020-10-28  Pull upstream changes.  Terry Wang
Change-Id: I73ea5f80ccf16a02519f6f7ccfc993e9b0f39f86
2020-10-01  Pull upstream changes.  Terry Wang
Change-Id: I794757716961569b5c02171cfc82785efb2cf106
2020-09-24  Pull upstream changes.  Terry Wang
Change-Id: I44831fdadcdb67f2e19570a35cb4c76faf8397f9
2020-06-25  Pull upstream changes.  Cassie Wang
Change-Id: I8a1e76e3e42188364ac40c0c51efb1e49292c015
2020-06-05  Copy over changes made to Google3 codebase in Icing.  Tim Barron
Change-Id: Ia36edb0a1b085e249dabfc220a5b72418063604f
2020-05-04  Pull upstream changes.  Cassie Wang
Change-Id: I28e569082d59404e23e554a9aa7d8751328009e8
2020-01-16  Pull upstream changes.  Cassie Wang
Upstream synced @290094995
Bug: 146383629
Test: manual, ran 'm -j libicing' with a local Android.bp
Change-Id: I63eb12e93f3de5d6607ba7006fd3ea532dc079e4
2020-01-09  Pull upstream changes and copy libtextclassifier classes.  Cassie Wang
Upstream synced @288789500.
Copied text_classifier dependencies (hash, status, logging) into icing/.
Bug: 146383629
Test: manual - ran 'm -j libicing' with a working Android.bp locally
Change-Id: I187a98af95b362745a09d605ed8334a6ff6971bb
2019-12-20  Port over Icing c++ code from upstream  Cassie Wang
Change-Id: Ia3981fed7e0e70589efc027d4123f306cdfbe990