61 Commits

Author SHA1 Message Date
Jean Chalard
b28d1cc487 Fix a behavior change in dicttool
The behavior change was introduced by I8b3458ad. Concretely,
empty bigram lists would end up as empty lists instead of null.

Change-Id: Ibcdf7e6aabc6aba3f5db0477335882394e050ce5
2014-10-03 18:04:10 +09:00
Jean Chalard
fb051c3957 Merge changes I3c1f5ac1,I269c9aa8
* changes:
  Switch code point table
  Test for code point table (dicttool test)
2014-10-03 07:55:59 +00:00
Akifumi Yoshimoto
7e5614520a Merge "Include a code point table in the binary dictionary."
2014-10-02 08:55:18 +00:00
Akifumi Yoshimoto
25c884ba65 Test for code point table (dicttool test)
Bug:17097992
Change-Id: I269c9aa86378f32083f8688f4ce91862d47dd181
2014-10-02 12:28:04 +09:00
Akifumi Yoshimoto
9168ab60cf Include a code point table in the binary dictionary.
Bug:17097992
Change-Id: I677a5eb3a704e4386f6573360e44ca335d81d2df
2014-10-02 12:27:49 +09:00
Keisuke Kuroyanagi
c6a6f6a990 Introduce NgramProperty in Java side.
Bug: 14425059
Change-Id: I8b3458ad22730b3dccbe0caea2c5930f5276dc82
2014-10-01 11:21:08 +09:00
Keisuke Kuroyanagi
88fa47a27d Support migration/dump of Beginning-of-Sentence entries.
Bug: 14119293
Change-Id: Ie975138f819794d5c34a7a547be5a6117050e084
2014-06-24 12:37:07 +09:00
Tadashi G. Takaoka
a91561aa58 Use Java 7 diamond operator
Change-Id: If16ef50ae73147594615d0f49d6a22621eaf1aef
2014-05-24 01:05:42 +09:00
Keisuke Kuroyanagi
93cda5bb39 Move code only used for dicttool and tests under tests.
Bug: 13035567
Change-Id: I13c6df013ef2b67c9bf67455d9c32d283bf9ea2e
2014-03-27 15:30:32 +09:00
Keisuke Kuroyanagi
516f86815d Separate WeightedString from FusionDictionary.
Bug: 8187060

Change-Id: I40c1dafca3eb52244c64fdb4c1db30a56385d678
2014-03-06 18:53:06 +09:00
Keisuke Kuroyanagi
e784148ae6 Separate utility methods from BinaryDictionary.
Bug: 8187060
Change-Id: Ice2984e332b7bd3bb17174aefc80b5635b72fc50
2014-03-05 18:19:34 +09:00
Jean Chalard
890b44e537 Correctly read the header of APK-embedded dicts
Bug: 13164518
Change-Id: I8768ad887af8b89ad9f29637f606c3c68629c7ca
2014-02-24 22:54:01 +09:00
Keisuke Kuroyanagi
95d16561e0 Remove unused code.
Bug: 12810574
Change-Id: I9c7fff60ae0e94d52f3bd19c3e88de5a53b917d7
2014-02-15 17:39:24 +09:00
Keisuke Kuroyanagi
0fc93fe445 Implement PatriciaTriePolicy::getNextWordAndNextToken().
Bug: 12810574
Change-Id: Id1d44f90de9455d9cbe7b6e0a161cae91d6d422c
2014-02-15 17:39:20 +09:00
Keisuke Kuroyanagi
85fe06e759 Merge "Remove unused argument from readDictionaryBinary."
2014-02-14 10:37:56 +00:00
Keisuke Kuroyanagi
8e3a1d0f89 Remove unused argument from readDictionaryBinary.
Bug: 12810574
Change-Id: Ice415ebd8d11162facca3fe8927ef8a616b11424
2014-02-14 19:02:15 +09:00
Keisuke Kuroyanagi
8fa7a09f1e Merge "Implement PatriciaTriePolicy::getWordProperty()."
2014-02-14 09:08:09 +00:00
Keisuke Kuroyanagi
c63d183473 Implement PatriciaTriePolicy::getWordProperty().
Bug: 12810574
Change-Id: I7bcccfd3641ebbcf2b8d857d33bb4734c42af5eb
2014-02-14 17:56:45 +09:00
Tadashi G. Takaoka
da973e75dc Make InputLogicTest more robust
Change-Id: I134f14971126cbeed05b472c08747f2b88ad30e6
2014-02-13 19:38:51 +09:00
Keisuke Kuroyanagi
8ffc631826 Make PtNode have ProbabilityInfo instead of raw value.
Bug: 11281877
Bug: 12810574
Change-Id: Id1cda0afc74c4e30633c735729143491b2274a7b
2014-02-10 15:05:08 +09:00
Keisuke Kuroyanagi
ab6a93773b Use native logic to read Ver4 dict.
Bug: 11281877
Bug: 12810574
Change-Id: Ief371d3ef61818e4e031de4659aee3c9584c7379
2014-02-06 21:55:37 +09:00
Keisuke Kuroyanagi
b986f78ba8 Separate header class from FormatSpec.
Bug: 12810574
Change-Id: Iacf1cd05a268bf690ab864b5e32a18a4b0ccc693
2014-02-04 21:36:04 +09:00
Keisuke Kuroyanagi
5cb7509314 Fix BinaryDictDecoderEncoderTests.
Bug: 12809791
Change-Id: I04313df78692b01e153a34c932a37f079a924105
2014-01-31 19:44:17 +09:00
Keisuke Kuroyanagi
26bd46095a Reading dictionary containing timestamps in Java Side.
Just skipping historical information fields.

Bug: 11281877
Change-Id: I43d2adaa576b7da11ed3ca54990265dbb6f53b08
2014-01-29 20:19:24 +09:00
Keisuke Kuroyanagi
c2fd53ee0e Remove ver4 dict updater.
Change-Id: I468994c98d091be621b9fb3fbe6405c67fc6a465
2013-12-17 18:17:51 +09:00
Keisuke Kuroyanagi
42334bb493 Quit checking bigram order in BinaryDictDecoderEncoderTests.
Change-Id: I1b8eb6ab2ea797d2590495b1f57f9ec9560ea817
2013-12-17 17:38:24 +09:00
Keisuke Kuroyanagi
4fdcefe504 Move DictUpdater to the tests directory.
Bug: 11245133
Change-Id: I0907a091ac3ae960eaf3b27da78dbb48a24b2ea1
2013-12-17 14:31:25 +09:00
Jean Chalard
b868375763 Fix failing tests
- Version 3 is not supported
- Now passing the right string to open v4 dicts. Fix the tests for this.

Change-Id: I7829330c3568a715b96396ba4e4e69c6e17775ab
2013-12-16 14:32:19 +09:00
Jean Chalard
7b55cd3e2b Remove flags from Java side.
This simplifies the code quite a bit.
- GERMAN_UMLAUTS are now handled through a key-value attribute.
  The dictionary generator does not need to know about it any more.
- FRENCH_LIGATURES are deprecated as we handle them with shortcuts now.
- CONTAINS_BIGRAMS is deprecated. Bigram processing is always applied
  regardless of this flag.

Bug: 11281748
Change-Id: If567e52e245a9342adc7f3104a0f7d8d782df8c1
2013-12-13 18:15:05 +09:00
Ken Wakasa
2fa3693c26 Reset to 9bd6dac4708ad94fd0257c53e977df62b152e20c
The bulk merge from -bayo to klp-dev should not have been merged to master.

Change-Id: I527a03a76f5247e4939a672f27c314dc11cbb854
2013-12-13 17:13:32 +09:00
Yuichiro Hanada
9514ed5c2a Add the new format of bigram entries.
In the new format, each bigram entry has flags (1 byte), a terminal id (3 bytes),
a time-stamp (4 bytes), a counter (1 byte) and a level (1 byte).

Bug: 10920255
Bug: 10920165
Change-Id: I0f7fc125a6178e6d25a07e8462afc41a7f57e3e1
2013-10-11 14:50:41 +09:00
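
The bigram entry layout described in 9514ed5c2a (flags, terminal id, time-stamp, counter, level; 10 bytes in total) can be sketched roughly as follows. This is only an illustration, not the code from that commit: the class name BigramEntrySketch, its methods, the field order (taken from the order in the commit message) and the big-endian byte order are all assumptions.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of the 10-byte bigram entry described in the commit
// message. Names, field order and big-endian encoding are assumptions.
final class BigramEntrySketch {
    // flags (1) + terminal id (3) + time-stamp (4) + counter (1) + level (1)
    static final int ENTRY_SIZE = 1 + 3 + 4 + 1 + 1;

    final int flags;
    final int terminalId;
    final int timestamp;
    final int counter;
    final int level;

    BigramEntrySketch(int flags, int terminalId, int timestamp, int counter, int level) {
        this.flags = flags;
        this.terminalId = terminalId;
        this.timestamp = timestamp;
        this.counter = counter;
        this.level = level;
    }

    // Packs the five fields into a 10-byte array.
    static byte[] encode(int flags, int terminalId, int timestamp, int counter, int level) {
        ByteBuffer buffer = ByteBuffer.allocate(ENTRY_SIZE); // big-endian by default
        buffer.put((byte) flags);
        // The terminal id only has 3 bytes, so write it one byte at a time.
        buffer.put((byte) (terminalId >> 16));
        buffer.put((byte) (terminalId >> 8));
        buffer.put((byte) terminalId);
        buffer.putInt(timestamp);
        buffer.put((byte) counter);
        buffer.put((byte) level);
        return buffer.array();
    }

    // Reads the five fields back, treating single bytes as unsigned values.
    static BigramEntrySketch decode(byte[] bytes) {
        ByteBuffer buffer = ByteBuffer.wrap(bytes);
        int flags = buffer.get() & 0xFF;
        int terminalId = ((buffer.get() & 0xFF) << 16)
                | ((buffer.get() & 0xFF) << 8)
                | (buffer.get() & 0xFF);
        int timestamp = buffer.getInt();
        int counter = buffer.get() & 0xFF;
        int level = buffer.get() & 0xFF;
        return new BigramEntrySketch(flags, terminalId, timestamp, counter, level);
    }

    public static void main(String[] args) {
        byte[] bytes = encode(0x80, 0x012345, 1381470641, 3, 1);
        BigramEntrySketch entry = decode(bytes);
        System.out.println("size=" + bytes.length + " terminalId=" + entry.terminalId);
    }
}
```

Under these assumptions a 3-byte terminal id can address at most 2^24 terminals, and the counter and level each fit in a single unsigned byte.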
Yuichiro Hanada
e4e0add9fb Add Ver4DictUpdater.
Change-Id: I986ab26faf535fc4bc98443053f534eced9d048f
2013-10-04 17:33:29 +09:00
Yuichiro Hanada
75d60e821c Refactor BinaryDictIOUtilsTests.
Change-Id: I2208378b33038771b460abb33f9a690872e998e2
2013-10-04 14:19:13 +09:00
Yuichiro Hanada
d6e307a4b7 Add DictUpdater.
Change-Id: Ic586e46e5a9f59de53d53e59886d635345940974
2013-10-03 20:16:34 +09:00
Yuichiro Hanada
3aa8977cb2 Remove some unused variables.
Change-Id: Iaf1556fec194d17cb4318f2bdcc837f8d79449ef
2013-10-02 18:26:03 +09:00
Jean Chalard
fa946d4a0f Fix a test and crash with a better error message when reading
When there are too many bigrams, we stop reading the file,
so the file pointer is in an inconsistent place. This means we
have no idea what's going to happen next. It's better to crash
right away.

Change-Id: Id3b7b78cbe4fda3493b3c9c46758763e1ab5f6a3
2013-10-02 11:48:47 +09:00
Yuichiro Hanada
1625aeafd2 Fix runReadUnigramsAndBigramsTests.
Change-Id: Idd9176c9943dfacac5a06957f1a07187b642b207
2013-09-24 12:31:45 +09:00
Yuichiro Hanada
14087ba52c Add Ver4DictDecoder.
Bug: 9618601
Change-Id: I43c5840505c6a847aaf4893a400392ccd45903c0
2013-09-19 16:11:23 +09:00
Keisuke Kuroyanagi
78b55a31cb Fix handling multi-bytes characters and add a test.
Bug: 6669677

Change-Id: Id2154db47adea2929559a4187a726f9dfa83363e
2013-09-17 15:11:24 +09:00
Yuichiro Hanada
a141d8ef7d Add Ver4DictEncoder.
Bug: 9618601
Change-Id: I161d2845906f07c1251deb8005fdffe49c5b7940
2013-09-13 17:33:51 +09:00
Yuichiro Hanada
0e40cd0c40 Add getDictDecoder.
Bug: 9618601
Change-Id: I173100ac704c03f7d5d0d53477e83cab5d1110d4
2013-09-12 20:14:09 +09:00
Yuichiro Hanada
95bc256f41 Add a flag to readDictionaryBinary in DictDecoder.
Change-Id: I356adb72047ebc43c924fbff1ff45e7460508a31
2013-09-11 18:20:56 +09:00
Yuichiro Hanada
752a33640c [Refactor] Add DictDecoder.readUnigramsAndBigramsBinary.
Change-Id: I259db91d837c67cbcb3b6dc504b21dca23a6a5be
2013-08-26 17:24:38 +09:00
Yuichiro Hanada
bb5b84a826 [Refactor] Add DictDecoder.getTerminalPosition.
Change-Id: I9d04f64a58f5481cbb64cf1c09b5c485dd4176b4
2013-08-26 16:14:59 +09:00
Yuichiro Hanada
576f625ee1 Rename CharGroup to PtNode.
Bug: 10233675
Change-Id: I7b0eb07d195cd386cd0d9e97cd59bf48fcf24107
2013-08-26 15:58:30 +09:00
Yuichiro Hanada
e9a10ff0f0 Add DictDecoder.readDictionaryBinary.
Bug: 10434720
Change-Id: I14690a6e0f922ed1bab3a4b6c9a457ae84d4c1a4
2013-08-23 20:29:25 +09:00
Yuichiro Hanada
373c492a02 Add a unit test for CharEncoding.
Change-Id: Ifb1cc01fa5bc2d6d69671f1acb9b9675a4081d32
2013-08-22 23:05:09 +09:00
Yuichiro Hanada
aa4168ee09 Fix writePlacedNode.
Change-Id: I1d6b086f1d9f0dbd8d74f964e29ae62c533af978
2013-08-22 23:02:08 +09:00
Yuichiro Hanada
c922c8a504 Add DictEncoder.
Change-Id: I41049b9118b58838e5dedf8e5618d939ca70c5ef
2013-08-22 11:53:41 +09:00
Yuichiro Hanada
558e34c7bd Make readPtNode be called with the address from the beginning of the file.
Change-Id: I8939fdfb4f79e55bcd7393633784effb30df3f8f
2013-08-21 20:02:18 +09:00