mirror of
https://gitlab.futo.org/keyboard/latinime.git
synced 2024-09-28 14:54:30 +01:00
d3c51db948
20 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
Adrian Velicu
|
8dd31a28ae |
Update dictionaries (possibly_offensive flag)
Correctly encoding possibly offensive words with their correct frequency and the possibly_offensive flag set. Continuing to encode with zero frequency only distracters or words that should never come up. https://paste.googleplex.com/5167060875214848 Bug: 11031090 Change-Id: Ia394b1827f292ff8d4791cc2f3e6e50b5aff4cbe |
||
Adrian Velicu
|
5664f64dff |
Update dictionaries
>>> java/res/raw/main_ru.dict Header : codePointTable : null <=> оаиенрстлвкмпудыгяйзбьчхюшжцфщёКэСМАБГПВДЛТРНХФ-ОИШЭУъЗЧЕЯЖЦЮЙЩЁЫѓ date : 1412325424 <=> 1412592602 version : 52 <=> 53 Body : No differences Change-Id: I5db813c4e671797c71de8609aa0e4d26404b425e |
||
Adrian Velicu
|
487a6a6949 |
Update dictionaries
>>> dictionaries/de_wordlist.combined.gz Header : date : 1393228134 <=> 1412325412 version : 44 <=> 52 Body : Probability changed: kommen 0 -> 149 Added: Käsebrötchen 50 Added: Lädst 50 Added: Müllbeutel 50 Added: Theresienwiese 50 Added: Verdammtes 50 Added: Wurstbrötchen 50 Added: abgebe 50 Added: angucke 50 Added: async 20 Added: backends 20 Added: brate 50 Added: erschreckendes 50 Added: erwische 50 Added: fahrt 80 Added: fragst 100 Added: gepostet 50 Added: gewundert 80 Added: gucke 50 Added: hattet 50 Added: hinkriege 50 Added: hustet 50 Added: hättet 60 Added: irgendwer 60 Added: koche 50 Added: kriege 70 Added: lehrst 50 Added: motivierenden 50 Added: müsstest 50 Added: müsstet 50 Added: organisiere 50 Added: peilen 50 Added: probiere 50 Added: rede 50 Added: reserviere 50 Added: sag 120 Added: schickes 80 Added: schickst 90 Added: sitze 50 Added: standet 50 Added: stolpere 50 Added: stressig 50 Added: telefoniere 80 Added: wolltest 100 Added: wolltet 100 Added: würdet 100 Added: ziele 50 Added: ähnlich 50 Added: älteren 50 Added: übelriechend 80 Added: überholen 50 Added: überlege 50 Added: überlegen 50 Added: überlegt 50 Added: übermorgen 50 Added: übernachte 50 Added: überquert 50 Added: überstanden 50 Added: übrig 50 Added: übrigens 50 >>> dictionaries/en_GB_wordlist.combined.gz Header : date : 1402373154 <=> 1412325408 version : 47 <=> 52 Body : Deleted: Pinterest 25 Added: Edamame 25 Added: Pinterest 25 Added: amd 0 >>> dictionaries/en_US_wordlist.combined.gz Header : date : 1402373154 <=> 1412325184 version : 47 <=> 52 Body : Deleted: Pinterest 25 Added: Edamame 25 Added: Pinterest 25 Added: amd 0 >>> dictionaries/en_wordlist.combined.gz Header : date : 1402373178 <=> 1412325419 version : 47 <=> 52 Body : Deleted: Pinterest 25 Added: Edamame 25 Added: Pinterest 25 Added: amd 0 >>> dictionaries/es_wordlist.combined.gz Header : date : 1404131686 <=> 1412325412 version : 49 <=> 52 Body : Added: cállese 30 Added: mándame 30 Added: recupérate 35 >>> 
dictionaries/ro_wordlist.combined.gz Header : description : Româna <=> Română date : 1408019089 <=> 1412325511 version : 50 <=> 52 Body : !!!!!! Truncated. !!!!!!! >>> dictionaries/ru_wordlist.combined.gz Header : date : 1406597821 <=> 1412325424 version : 50 <=> 52 Body : Deleted: Агг 52 Deleted: ЗАГС 77 Deleted: КОНКАКАФ 19 Deleted: Монк 69 Probability changed: НКАО 13 -> 0 Probability changed: НКВД 46 -> 0 Probability changed: НКО 14 -> 0 Probability changed: НКР 22 -> 0 Deleted: НОМОС-БАНК 58 Deleted: ПДД 77 Probability changed: РНК 33 -> 0 Deleted: СМС 78 Probability changed: СНК 35 -> 0 Deleted: ТОО 14 Probability changed: ТЦ 85 -> 5 Probability changed: УНКВД 11 -> 0 Deleted: ФИО 65 Deleted: Эбля 49 Probability changed: асексуальность 59 -> 0 Probability changed: бисексуал 72 -> 0 Probability changed: бисексуалов 85 -> 0 Probability changed: бисексуальной 67 -> 0 Probability changed: бисексуальности 75 -> 0 Deleted: бумажке 94 Deleted: бумажку 104 Deleted: важней 86 Deleted: вероника 58 Deleted: вероники 54 Deleted: вероникой 29 Deleted: веронику 29 Deleted: влезет 94 Deleted: влезть 87 Deleted: врожденная 75 Deleted: врожденного 78 Deleted: врожденное 71 Deleted: врожденной 85 Deleted: врожденную 66 Deleted: врожденные 82 Deleted: врожденный 82 Deleted: врожденным 79 Deleted: врожденными 76 Deleted: врожденных 86 Probability changed: врождённая 68 -> 75 Probability changed: врождённое 69 -> 71 Probability changed: врождённой 80 -> 85 Probability changed: врождённые 78 -> 82 Probability changed: врождённый 77 -> 82 Probability changed: врождённым 74 -> 79 Probability changed: врождённых 80 -> 86 Probability changed: все-таки 113 -> 30 Deleted: вылезли 88 Deleted: г-же 65 Deleted: г-н 88 Deleted: г-на 88 Probability changed: га 135 -> 0 Probability changed: гг 160 -> 0 Probability changed: гетеросексуалов 73 -> 0 Probability changed: гетеросексуального 67 -> 0 Probability changed: гетеросексуальной 71 -> 0 Probability changed: гетеросексуальности 65 -> 0 
Probability changed: гетеросексуальность 67 -> 0 Probability changed: гетеросексуальную 65 -> 0 Probability changed: гетеросексуальные 76 -> 0 Probability changed: гетеросексуальных 77 -> 0 Probability changed: гомосексуал 74 -> 0 Probability changed: гомосексуала 67 -> 0 Probability changed: гомосексуалам 75 -> 0 Probability changed: гомосексуалами 70 -> 0 Probability changed: гомосексуализм 91 -> 0 Probability changed: гомосексуализма 91 -> 0 Probability changed: гомосексуализме 74 -> 0 Probability changed: гомосексуализму 68 -> 0 Probability changed: гомосексуалист 80 -> 0 Probability changed: гомосексуалиста 72 -> 0 Probability changed: гомосексуалистам 69 -> 0 Probability changed: гомосексуалистами 69 -> 0 Probability changed: гомосексуалистов 94 -> 0 Probability changed: гомосексуалистом 78 -> 0 Probability changed: гомосексуалисты 77 -> 0 Probability changed: гомосексуалов 93 -> 0 Probability changed: гомосексуалом 65 -> 0 Probability changed: гомосексуалы 82 -> 0 Probability changed: гомосексуальная 70 -> 0 Probability changed: гомосексуального 78 -> 0 Probability changed: гомосексуальное 71 -> 0 Probability changed: гомосексуальной 93 -> 0 Probability changed: гомосексуальности 103 -> 0 Probability changed: гомосексуальность 100 -> 0 Probability changed: гомосексуальностью 73 -> 0 Probability changed: гомосексуальную 75 -> 0 Probability changed: гомосексуальные 92 -> 0 Probability changed: гомосексуальный 75 -> 0 Probability changed: гомосексуальным 74 -> 0 Probability changed: гомосексуальными 70 -> 0 Probability changed: гомосексуальных 91 -> 0 Probability changed: д-р 93 -> 0 Deleted: дада 72 Deleted: даша 55 Deleted: даши 47 Deleted: дашу 29 Probability changed: де 154 -> 30 Probability changed: др 156 -> 0 Deleted: зажги 92 Deleted: зажгу 89 Deleted: зажигай 95 Deleted: зажигаю 88 Probability changed: зоосексуальность 65 -> 0 Probability changed: иРНК 68 -> 0 Probability changed: кДНК 62 -> 0 Probability changed: кв 133 -> 0 Deleted: кио 49 Deleted: 
лег 91 Deleted: лезу 88 Deleted: лезь 91 Probability changed: ля 103 -> 30 Probability changed: мРНК 102 -> 0 Deleted: машка 29 Probability changed: микроРНК 65 -> 0 Deleted: мону 29 Probability changed: мтДНК 79 -> 0 Probability changed: мяРНК 65 -> 0 Deleted: нажрался 97 Deleted: налил 97 Deleted: налили 86 Probability changed: негетеросексуальной 73 -> 0 Probability changed: негетеросексуальный 73 -> 0 Deleted: орут 98 Deleted: отт 64 Deleted: паша 83 Deleted: паше 66 Deleted: пашей 69 Deleted: пашой 73 Deleted: подоконник 88 Deleted: подскажет 87 Deleted: подскажете 89 Deleted: подскажите 112 Deleted: покажите 95 Deleted: полезли 91 Probability changed: пр 129 -> 0 Probability changed: пре-мРНК 78 -> 0 Deleted: пресекся 73 Probability changed: рРНК 91 -> 0 Deleted: раздражённо 91 Deleted: сажусь 99 Deleted: саше 54 Probability changed: секс 106 -> 0 Probability changed: секс-символ 74 -> 0 Probability changed: секс-символов 65 -> 0 Probability changed: секс-символом 74 -> 0 Probability changed: секс-туризм 62 -> 0 Probability changed: секса 105 -> 0 Probability changed: сексе 93 -> 0 Deleted: секси 88 Probability changed: сексизм 63 -> 0 Probability changed: сексизма 72 -> 0 Probability changed: сексолог 75 -> 0 Probability changed: сексологии 80 -> 0 Probability changed: сексом 102 -> 0 Probability changed: сексу 80 -> 0 Probability changed: сексуальная 95 -> 0 Probability changed: сексуально 88 -> 0 Probability changed: сексуального 107 -> 0 Probability changed: сексуальное 98 -> 0 Probability changed: сексуальной 111 -> 0 Probability changed: сексуальном 84 -> 0 Probability changed: сексуальному 79 -> 0 Probability changed: сексуальности 99 -> 0 Probability changed: сексуальность 90 -> 0 Probability changed: сексуальностью 70 -> 0 Probability changed: сексуальную 95 -> 0 Probability changed: сексуальные 105 -> 0 Probability changed: сексуальный 91 -> 0 Probability changed: сексуальным 95 -> 0 Probability changed: сексуальными 84 -> 0 Probability changed: 
сексуальных 113 -> 0 Deleted: сете 78 Deleted: слезой 87 Deleted: соображаю 90 Probability changed: тРНК 86 -> 0 Deleted: тав 69 Probability changed: транссексуал 67 -> 0 Probability changed: транссексуалки 64 -> 0 Probability changed: транссексуалов 82 -> 0 Probability changed: транссексуалы 71 -> 0 Probability changed: транссексуальности 77 -> 0 Probability changed: транссексуальность 65 -> 0 Deleted: укажите 83 Probability changed: ул 137 -> 0 Deleted: устар 93 Deleted: эдак 99 Added: Вероника 58 Added: Вероники 54 Added: Вероникой 29 Added: Веронику 29 Added: Даша 55 Added: Даши 47 Added: Дашу 29 Added: Маш 57 Added: Машка 29 Added: Паша 83 Added: Паше 66 Added: Пашей 69 Added: Пашой 73 Added: Саше 54 Added: впросак 0 Added: врождённую 66 Added: втечение 0 Added: втечении 0 Added: лёг 97 Added: машу 80 Added: чтоли 0 Added: чтоль 0 Added: ща 0 Added: щас 0 >>> java/res/raw/main_de.dict Header : date : 1393228134 <=> 1412325412 version : 44 <=> 52 Body : Probability changed: kommen 0 -> 149 Added: Käsebrötchen 50 Added: Lädst 50 Added: Müllbeutel 50 Added: Theresienwiese 50 Added: Verdammtes 50 Added: Wurstbrötchen 50 Added: abgebe 50 Added: angucke 50 Added: async 20 Added: backends 20 Added: brate 50 Added: erschreckendes 50 Added: erwische 50 Added: fahrt 80 Added: fragst 100 Added: gepostet 50 Added: gewundert 80 Added: gucke 50 Added: hattet 50 Added: hinkriege 50 Added: hustet 50 Added: hättet 60 Added: irgendwer 60 Added: koche 50 Added: kriege 70 Added: lehrst 50 Added: motivierenden 50 Added: müsstest 50 Added: müsstet 50 Added: organisiere 50 Added: peilen 50 Added: probiere 50 Added: rede 50 Added: reserviere 50 Added: sag 120 Added: schickes 80 Added: schickst 90 Added: sitze 50 Added: standet 50 Added: stolpere 50 Added: stressig 50 Added: telefoniere 80 Added: wolltest 100 Added: wolltet 100 Added: würdet 100 Added: ziele 50 Added: ähnlich 50 Added: älteren 50 Added: übelriechend 80 Added: überholen 50 Added: überlege 50 Added: überlegen 50 Added: 
überlegt 50 Added: übermorgen 50 Added: übernachte 50 Added: überquert 50 Added: überstanden 50 Added: übrig 50 Added: übrigens 50 >>> java/res/raw/main_en.dict Header : date : 1402373178 <=> 1412325419 version : 47 <=> 52 Body : Deleted: Pinterest 25 Added: Edamame 25 Added: Pinterest 25 Added: amd 0 >>> java/res/raw/main_es.dict Header : date : 1404131686 <=> 1412325412 version : 49 <=> 52 Body : Added: cállese 30 Added: mándame 30 Added: recupérate 35 >>> java/res/raw/main_ru.dict Header : date : 1406597821 <=> 1412325424 version : 50 <=> 52 Body : Deleted: Агг 52 Deleted: ЗАГС 77 Deleted: КОНКАКАФ 19 Deleted: Монк 69 Probability changed: НКАО 13 -> 0 Probability changed: НКВД 46 -> 0 Probability changed: НКО 14 -> 0 Probability changed: НКР 22 -> 0 Deleted: НОМОС-БАНК 58 Deleted: ПДД 77 Probability changed: РНК 33 -> 0 Deleted: СМС 78 Probability changed: СНК 35 -> 0 Deleted: ТОО 14 Probability changed: ТЦ 85 -> 5 Probability changed: УНКВД 11 -> 0 Deleted: ФИО 65 Deleted: Эбля 49 Probability changed: асексуальность 59 -> 0 Probability changed: бисексуал 72 -> 0 Probability changed: бисексуалов 85 -> 0 Probability changed: бисексуальной 67 -> 0 Probability changed: бисексуальности 75 -> 0 Deleted: бумажке 94 Deleted: бумажку 104 Deleted: важней 86 Deleted: вероника 58 Deleted: вероники 54 Deleted: вероникой 29 Deleted: веронику 29 Deleted: влезет 94 Deleted: влезть 87 Deleted: врожденная 75 Deleted: врожденного 78 Deleted: врожденное 71 Deleted: врожденной 85 Deleted: врожденную 66 Deleted: врожденные 82 Deleted: врожденный 82 Deleted: врожденным 79 Deleted: врожденными 76 Deleted: врожденных 86 Probability changed: врождённая 68 -> 75 Probability changed: врождённое 69 -> 71 Probability changed: врождённой 80 -> 85 Probability changed: врождённые 78 -> 82 Probability changed: врождённый 77 -> 82 Probability changed: врождённым 74 -> 79 Probability changed: врождённых 80 -> 86 Probability changed: все-таки 113 -> 30 Deleted: вылезли 88 Deleted: г-же 65 Deleted: 
г-н 88 Deleted: г-на 88 Probability changed: га 135 -> 0 Probability changed: гг 160 -> 0 Probability changed: гетеросексуалов 73 -> 0 Probability changed: гетеросексуального 67 -> 0 Probability changed: гетеросексуальной 71 -> 0 Probability changed: гетеросексуальности 65 -> 0 Probability changed: гетеросексуальность 67 -> 0 Probability changed: гетеросексуальную 65 -> 0 Probability changed: гетеросексуальные 76 -> 0 Probability changed: гетеросексуальных 77 -> 0 Probability changed: гомосексуал 74 -> 0 Probability changed: гомосексуала 67 -> 0 Probability changed: гомосексуалам 75 -> 0 Probability changed: гомосексуалами 70 -> 0 Probability changed: гомосексуализм 91 -> 0 Probability changed: гомосексуализма 91 -> 0 Probability changed: гомосексуализме 74 -> 0 Probability changed: гомосексуализму 68 -> 0 Probability changed: гомосексуалист 80 -> 0 Probability changed: гомосексуалиста 72 -> 0 Probability changed: гомосексуалистам 69 -> 0 Probability changed: гомосексуалистами 69 -> 0 Probability changed: гомосексуалистов 94 -> 0 Probability changed: гомосексуалистом 78 -> 0 Probability changed: гомосексуалисты 77 -> 0 Probability changed: гомосексуалов 93 -> 0 Probability changed: гомосексуалом 65 -> 0 Probability changed: гомосексуалы 82 -> 0 Probability changed: гомосексуальная 70 -> 0 Probability changed: гомосексуального 78 -> 0 Probability changed: гомосексуальное 71 -> 0 Probability changed: гомосексуальной 93 -> 0 Probability changed: гомосексуальности 103 -> 0 Probability changed: гомосексуальность 100 -> 0 Probability changed: гомосексуальностью 73 -> 0 Probability changed: гомосексуальную 75 -> 0 Probability changed: гомосексуальные 92 -> 0 Probability changed: гомосексуальный 75 -> 0 Probability changed: гомосексуальным 74 -> 0 Probability changed: гомосексуальными 70 -> 0 Probability changed: гомосексуальных 91 -> 0 Probability changed: д-р 93 -> 0 Deleted: дада 72 Deleted: даша 55 Deleted: даши 47 Deleted: дашу 29 Probability changed: де 154 -> 30 
Probability changed: др 156 -> 0 Deleted: зажги 92 Deleted: зажгу 89 Deleted: зажигай 95 Deleted: зажигаю 88 Probability changed: зоосексуальность 65 -> 0 Probability changed: иРНК 68 -> 0 Probability changed: кДНК 62 -> 0 Probability changed: кв 133 -> 0 Deleted: кио 49 Deleted: лег 91 Deleted: лезу 88 Deleted: лезь 91 Probability changed: ля 103 -> 30 Probability changed: мРНК 102 -> 0 Deleted: машка 29 Probability changed: микроРНК 65 -> 0 Deleted: мону 29 Probability changed: мтДНК 79 -> 0 Probability changed: мяРНК 65 -> 0 Deleted: нажрался 97 Deleted: налил 97 Deleted: налили 86 Probability changed: негетеросексуальной 73 -> 0 Probability changed: негетеросексуальный 73 -> 0 Deleted: орут 98 Deleted: отт 64 Deleted: паша 83 Deleted: паше 66 Deleted: пашей 69 Deleted: пашой 73 Deleted: подоконник 88 Deleted: подскажет 87 Deleted: подскажете 89 Deleted: подскажите 112 Deleted: покажите 95 Deleted: полезли 91 Probability changed: пр 129 -> 0 Probability changed: пре-мРНК 78 -> 0 Deleted: пресекся 73 Probability changed: рРНК 91 -> 0 Deleted: раздражённо 91 Deleted: сажусь 99 Deleted: саше 54 Probability changed: секс 106 -> 0 Probability changed: секс-символ 74 -> 0 Probability changed: секс-символов 65 -> 0 Probability changed: секс-символом 74 -> 0 Probability changed: секс-туризм 62 -> 0 Probability changed: секса 105 -> 0 Probability changed: сексе 93 -> 0 Deleted: секси 88 Probability changed: сексизм 63 -> 0 Probability changed: сексизма 72 -> 0 Probability changed: сексолог 75 -> 0 Probability changed: сексологии 80 -> 0 Probability changed: сексом 102 -> 0 Probability changed: сексу 80 -> 0 Probability changed: сексуальная 95 -> 0 Probability changed: сексуально 88 -> 0 Probability changed: сексуального 107 -> 0 Probability changed: сексуальное 98 -> 0 Probability changed: сексуальной 111 -> 0 Probability changed: сексуальном 84 -> 0 Probability changed: сексуальному 79 -> 0 Probability changed: сексуальности 99 -> 0 Probability changed: сексуальность 90 
-> 0 Probability changed: сексуальностью 70 -> 0 Probability changed: сексуальную 95 -> 0 Probability changed: сексуальные 105 -> 0 Probability changed: сексуальный 91 -> 0 Probability changed: сексуальным 95 -> 0 Probability changed: сексуальными 84 -> 0 Probability changed: сексуальных 113 -> 0 Deleted: сете 78 Deleted: слезой 87 Deleted: соображаю 90 Probability changed: тРНК 86 -> 0 Deleted: тав 69 Probability changed: транссексуал 67 -> 0 Probability changed: транссексуалки 64 -> 0 Probability changed: транссексуалов 82 -> 0 Probability changed: транссексуалы 71 -> 0 Probability changed: транссексуальности 77 -> 0 Probability changed: транссексуальность 65 -> 0 Deleted: укажите 83 Probability changed: ул 137 -> 0 Deleted: устар 93 Deleted: эдак 99 Added: Вероника 58 Added: Вероники 54 Added: Вероникой 29 Added: Веронику 29 Added: Даша 55 Added: Даши 47 Added: Дашу 29 Added: Маш 57 Added: Машка 29 Added: Паша 83 Added: Паше 66 Added: Пашей 69 Added: Пашой 73 Added: Саше 54 Added: впросак 0 Added: врождённую 66 Added: втечение 0 Added: втечении 0 Added: лёг 97 Added: машу 80 Added: чтоли 0 Added: чтоль 0 Added: ща 0 Added: щас 0 Change-Id: I0c6bf1a1ecc9edf03523bfb080774738aa40d163 |
||
Jean Chalard
|
ae41058659 |
Improve the Russian dictionary.
Deleted: 38 words Probability adjusted: 11 words Added: 1299 words [Category diff] +1 15 -1 0 +2 0 -2 0 +3 0 -3 0 +4 0 -4 0 +5 0 -5 3 +6 1 -6 0 +7 0 -7 13 [Weighted category diff] +1 15 -1 0 +2 0 -2 0 +3 0 -3 0 +4 0 -4 0 +5 0 -5 3 +6 1 -6 0 +7 0 -7 13 Change-Id: I1a6513954d60b30738cb849578ce535c5e05eb1a |
||
Jean Chalard
|
004cec01a9 |
Update all dicts to version 44.
Bug: 13164302 Change-Id: I8dc1a839c7dcfaa08a53e26cb6600e9f871447ce |
||
Jean Chalard
|
66c96e8813 |
Update dictionaries
en* : add common app and Google product names en_GB : also add "filters" ru : add some missing words Bug: 11043181 Bug: 12276653 Bug: 12953122 Change-Id: I6b62e681a07b7f0149a10ba4e05954e60d6212d4 |
||
Jean Chalard
|
5937c03f15 |
Update dictionaries
Bug: 10354668 Bug: 10188528 >>> dictionaries/fr_wordlist.combined.gz Header : date : 1374634549 <=> 1376888819 version : 36 <=> 37 Body : Deleted: color 78 Deleted: men 85 Deleted: o 115 Added: nationaux 120 >>> dictionaries/iw_wordlist.combined.gz Added. New dictionary. >>> dictionaries/pt_BR_wordlist.combined.gz Header : date : 1374634563 <=> 1376884524 version : 36 <=> 37 Body : Deleted: la 152 >>> dictionaries/pt_PT_wordlist.combined.gz Header : date : 1357790930 <=> 1376884536 version : 30 <=> 37 Body : Deleted: la 152 >>> dictionaries/ru_wordlist.combined.gz Header : date : 1372393835 <=> 1376897704 version : 35 <=> 37 Body : Freq changed: говно 68 -> 0 >>> java/res/raw/main_fr.dict Header : date : 1374634549 <=> 1376888819 version : 36 <=> 37 Body : Deleted: color 78 Deleted: men 85 Deleted: o 115 Added: nationaux 120 >>> java/res/raw/main_pt_br.dict Header : date : 1374634563 <=> 1376884524 version : 36 <=> 37 Body : Deleted: la 152 >>> java/res/raw/main_ru.dict Header : date : 1372393835 <=> 1376897704 version : 35 <=> 37 Body : Freq changed: говно 68 -> 0 Change-Id: I87a85571c61068ff46a32d291aa43becbb75598a |
||
Jean Chalard
|
ffe7dbbe7a |
Update dictionaries
>>> dictionaries/cs_wordlist.combined.gz Header : date : 1355802831 <=> 1372393817 version : 29 <=> 35 Body : Added: LTE 25 >>> dictionaries/de_wordlist.combined.gz Header : date : 1355802835 <=> 1372393817 version : 29 <=> 35 Body : Added: LTE 25 >>> dictionaries/en_GB_wordlist.combined.gz Header : date : 1366272052 <=> 1372393817 version : 31 <=> 35 Body : Deleted: Sea 126 Added: LTE 25 >>> dictionaries/en_US_wordlist.combined.gz Header : date : 1366272093 <=> 1372393817 version : 31 <=> 35 Body : Added: LTE 25 >>> dictionaries/en_wordlist.combined.gz Header : date : 1366272977 <=> 1372393837 version : 31 <=> 35 Body : Deleted: Sea 126 Added: LTE 25 >>> dictionaries/es_wordlist.combined.gz Header : date : 1355802832 <=> 1372393817 version : 29 <=> 35 Body : Added: LTE 25 >>> dictionaries/fr_wordlist.combined.gz Header : date : 1366272255 <=> 1372393818 version : 31 <=> 35 Body : Deleted: R'n'B 95 Deleted: count 60 Deleted: d'Inti 34 Added: beurk 25 >>> dictionaries/hr_wordlist.combined.gz Header : date : 1355802836 <=> 1372393818 version : 29 <=> 35 Body : Added: LTE 25 >>> dictionaries/it_wordlist.combined.gz Header : date : 1355802836 <=> 1372393818 version : 29 <=> 35 Body : Added: LTE 25 >>> dictionaries/lt_wordlist.combined.gz Header : date : 1355802843 <=> 1372393818 version : 29 <=> 35 Body : Added: LTE 25 >>> dictionaries/lv_wordlist.combined.gz Header : date : 1355802843 <=> 1372393818 version : 29 <=> 35 Body : Added: LTE 25 >>> dictionaries/nb_wordlist.combined.gz Header : date : 1366003450 <=> 1372393818 version : 31 <=> 35 Body : Added: LTE 25 >>> dictionaries/nl_wordlist.combined.gz Header : date : 1355802844 <=> 1372393818 version : 29 <=> 35 Body : Added: LTE 25 >>> dictionaries/ru_wordlist.combined.gz Header : date : 1370244430 <=> 1372393835 version : 34 <=> 35 Body : Freq changed: связывание 93 -> 0 >>> dictionaries/sl_wordlist.combined.gz Header : date : 1355802835 <=> 1372393835 version : 29 <=> 35 Body : Added: LTE 25 >>> 
dictionaries/sr_wordlist.combined.gz Header : date : 1355802853 <=> 1372393835 version : 29 <=> 35 Body : Added: LTE 25 >>> dictionaries/sv_wordlist.combined.gz Header : date : 1366003804 <=> 1372393836 version : 31 <=> 35 Body : Added: LTE 25 >>> dictionaries/tr_wordlist.combined.gz Header : date : 1355802858 <=> 1372393837 version : 29 <=> 35 Body : Added: LTE 25 >>> java/res/raw/main_de.dict Header : date : 1355802835 <=> 1372393817 version : 29 <=> 35 Body : Added: LTE 25 >>> java/res/raw/main_en.dict Header : date : 1366272977 <=> 1372393837 version : 31 <=> 35 Body : Deleted: Sea 126 Added: LTE 25 >>> java/res/raw/main_es.dict Header : date : 1355802832 <=> 1372393817 version : 29 <=> 35 Body : Added: LTE 25 >>> java/res/raw/main_fr.dict Header : date : 1366272255 <=> 1372393818 version : 31 <=> 35 Body : Deleted: R'n'B 95 Deleted: count 60 Deleted: d'Inti 34 Added: beurk 25 >>> java/res/raw/main_it.dict Header : date : 1355802836 <=> 1372393818 version : 29 <=> 35 Body : Added: LTE 25 >>> java/res/raw/main_ru.dict Header : date : 1370244430 <=> 1372393835 version : 34 <=> 35 Body : Freq changed: связывание 93 -> 0 Bug: 9301610 Bug: 9607966 Change-Id: I1117ed85d97fbb0ee50f11bc31776f1970b56f12 |
||
Jean Chalard
|
e73802f335 |
Update dictionaries
>>> dictionaries/ru_wordlist.combined.gz Header : date : 1366974711 <=> 1370244430 MULTIPLE_WORDS_DEMOTION_RATE : 0 <=> 50 version : 32 <=> 34 Body : Deleted: МДА 2 Freq changed: а 0 -> 60 Freq changed: в 0 -> 60 Deleted: возбужденные 0 Freq changed: гей 92 -> 0 Freq changed: жид 80 -> 0 Freq changed: зареган 0 -> 50 Freq changed: и 0 -> 60 Freq changed: к 0 -> 60 Deleted: клевом 0 Freq changed: куи 29 -> 0 Freq changed: лох 69 -> 0 Freq changed: о 0 -> 60 Freq changed: ребут 0 -> 50 Freq changed: с 0 -> 60 Freq changed: у 0 -> 60 Freq changed: хуй 77 -> 0 Freq changed: хукера 38 -> 0 Freq changed: широко 0 -> 144 Deleted: щеткой 70 Freq changed: щёткой 69 -> 70 Freq changed: я 0 -> 60 Added: жены 134 Added: звони 100 Added: клёвом 50 Added: мда 0 >>> java/res/raw/main_ru.dict Header : date : 1366974711 <=> 1370244430 version : 32 <=> 34 MULTIPLE_WORDS_DEMOTION_RATE : 0 <=> 50 Body : (same changes) Change-Id: Ie10bdd1f33cac43c5be35e99faef7cfdfe877d2b |
||
Jean Chalard
|
d57a7748c1 |
Update dictionaries
>>> dictionaries/ru_wordlist.combined.gz Header : date : 1366957492 <=> 1366974711 Body : Added: ложись 100 Added: под 100 Added: посмотрю 100 Added: угу 100 Added: ух 100 >>> java/res/raw/main_ru.dict Header : date : 1366957492 <=> 1366974711 Body : Added: ложись 100 Added: под 100 Added: посмотрю 100 Added: угу 100 Added: ух 100 Change-Id: Ida39ea2cf25cd291554f3b2f3ce31f57dca24113 |
||
Jean Chalard
|
7ec72b80ed |
Update dictionaries
Full diff too long: truncated. Summary diff >>> dictionaries/ru_wordlist.combined.gz Header : date : 1366277083 <=> 1366957492 version : 31 <=> 32 Contents :
- Reinstate 2- and 3-letter words that were demoted to avoid bad space insertion (343 entries)
- Add missing words as per b/6341908 and b/5674314 (98 entries)
This has zero effect on the regression tests. Bug: 6341908 Bug: 5674314 Change-Id: Ifce268a7eab5edd264d963489187e975017f8b72 |
||
Jean Chalard
|
9cf468646f |
Update dictionaries
>>> dictionaries/en_GB_wordlist.combined.gz Header : date : 1366021966 <=> 1366272052 Body : Added: yt 0 >>> dictionaries/en_US_wordlist.combined.gz Header : date : 1366021978 <=> 1366272093 Body : Added: yt 0 >>> dictionaries/en_wordlist.combined.gz Header : date : 1366021987 <=> 1366272977 Body : Added: yt 0 >>> dictionaries/fr_wordlist.combined.gz Header : date : 1366003217 <=> 1366272255 Body : Freq changed: cash 80 -> 20 >>> dictionaries/ru_wordlist.combined.gz Header : date : 1366003693 <=> 1366277083 Body : Deleted: толщ 76 >>> java/res/raw/main_en.dict Header : date : 1366021987 <=> 1366272977 Body : Added: yt 0 >>> java/res/raw/main_fr.dict Header : date : 1366003217 <=> 1366272255 Body : Freq changed: cash 80 -> 20 >>> java/res/raw/main_ru.dict Header : date : 1366003693 <=> 1366277083 Body : Deleted: толщ 76 Bug: 8635822 Change-Id: I44dc73bd010b125c994387894847a008276d69f7 |
||
Jean Chalard
|
da175bdcb1 |
Update dictionaries
>>> dictionaries/en_GB_wordlist.combined.gz Header : date : 1355802832 <=> 1366003032 version : 29 <=> 31 Body : Deleted: HTTP 95 Deleted: WWW 72 Added: mm 135 >>> dictionaries/en_US_wordlist.combined.gz Header : date : 1355112451 <=> 1366003070 version : 28 <=> 31 Body : Deleted: HTTP 95 Deleted: WWW 71 Added: mm 135 >>> dictionaries/en_wordlist.combined.gz Header : date : 1355802851 <=> 1366003861 version : 29 <=> 31 Body : Deleted: HTTP 95 Deleted: WWW 71 Added: mm 135 >>> dictionaries/fr_wordlist.combined.gz Header : date : 1357617878 <=> 1366003217 version : 29 <=> 31 Body : Not a word: re false -> true Shortcut added: re le 15 >>> dictionaries/nb_wordlist.combined.gz Header : date : 1355802836 <=> 1366003450 version : 29 <=> 31 Body : Freq changed: iPhone 91 -> 30 Added: app 30 >>> dictionaries/ru_wordlist.combined.gz Header : date : 1358763720 <=> 1366003693 version : 30 <=> 31 Body : Freq changed: за 140 -> 181 Freq changed: не 140 -> 191 Freq changed: про 131 -> 151 Freq changed: эры 125 -> 140 >>> dictionaries/sv_wordlist.combined.gz Header : date : 1355802856 <=> 1366003804 version : 29 <=> 31 Body : Added: vi 180 >>> java/res/raw/main_en.dict Header : date : 1355802851 <=> 1366003861 version : 29 <=> 31 Body : Deleted: HTTP 95 Deleted: WWW 71 Added: mm 135 >>> java/res/raw/main_fr.dict Header : date : 1357617878 <=> 1366003217 version : 29 <=> 31 Body : Not a word: re false -> true Shortcut added: re le 15 >>> java/res/raw/main_ru.dict Header : date : 1358763720 <=> 1366003693 version : 30 <=> 31 Body : Freq changed: за 140 -> 181 Freq changed: не 140 -> 191 Freq changed: про 131 -> 151 Freq changed: эры 125 -> 140 Bug: 8560415 Bug: 7556679 Change-Id: If1c628edcb1cc5efd67e1715acf94f19c0eb4643 |
||
Jean Chalard
|
be94d212e8 |
Update the Russian dictionary
The point is to get as close as possible to having the golden Russian tests pass. >>> dictionaries/ru_wordlist.combined.gz Header : date : 1355818916 <=> 1358763720 version : 29 <=> 30 Body : Deleted: НКТ 14 Freq changed: без 0 -> 140 Freq changed: бонус 94 -> 130 Freq changed: за 0 -> 140 Freq changed: на 0 -> 180 Freq changed: не 0 -> 140 Freq changed: парка 133 -> 110 Freq changed: про 0 -> 131 Freq changed: ручьи 93 -> 80 Freq changed: ура 86 -> 100 Freq changed: юрты 86 -> 60 Added: вечерком 100 Added: задачки 100 Added: сорри 100 Added: узнай 100 Added: учти 100 >>> java/res/raw/main_ru.dict All the same above changes Change-Id: I8685c34d9ab1dcbf8ae8e23d2e26380059684c95 |
||
Jean Chalard
|
cd89c5d6ed |
Update dictionaries
>>> dictionaries/ru_wordlist.combined.gz Header : date : 1355802857 <=> 1355818916 Body : Freq changed: БД 18 -> 0 Freq changed: ГБ 14 -> 0 Freq changed: ЕС 44 -> 0 Freq changed: ЖД 3 -> 0 Freq changed: ЖЖ 8 -> 0 Freq changed: ЖК 3 -> 0 Freq changed: ИИ 21 -> 0 Freq changed: КБ 37 -> 0 Freq changed: МБ 19 -> 0 Freq changed: МО 26 -> 0 Freq changed: ОС 40 -> 0 Freq changed: РФ 65 -> 0 Freq changed: СБ 21 -> 0 Freq changed: СК 23 -> 0 Freq changed: ТВ 37 -> 0 Freq changed: УК 36 -> 0 Freq changed: ЦБ 11 -> 0 Freq changed: ЦК 59 -> 0 Deleted: бэ 0 Freq changed: дБ 92 -> 0 Deleted: йо 0 Freq changed: мм 149 -> 0 Freq changed: рН 104 -> 0 Deleted: ша 0 >>> java/res/raw/main_ru.dict Header : date : 1355802857 <=> 1355818916 Body : Freq changed: БД 18 -> 0 Freq changed: ГБ 14 -> 0 Freq changed: ЕС 44 -> 0 Freq changed: ЖД 3 -> 0 Freq changed: ЖЖ 8 -> 0 Freq changed: ЖК 3 -> 0 Freq changed: ИИ 21 -> 0 Freq changed: КБ 37 -> 0 Freq changed: МБ 19 -> 0 Freq changed: МО 26 -> 0 Freq changed: ОС 40 -> 0 Freq changed: РФ 65 -> 0 Freq changed: СБ 21 -> 0 Freq changed: СК 23 -> 0 Freq changed: ТВ 37 -> 0 Freq changed: УК 36 -> 0 Freq changed: ЦБ 11 -> 0 Freq changed: ЦК 59 -> 0 Deleted: бэ 0 Freq changed: дБ 92 -> 0 Deleted: йо 0 Freq changed: мм 149 -> 0 Freq changed: рН 104 -> 0 Deleted: ша 0 Change-Id: I03f0f4e8d03e0f77f5879e6dd5c424673466afca |
||
Jean Chalard
|
21dbe3701c |
Update dictionaries
cs, da, de, el, es, fi, fr, hr, it, lt, lv, nb, nl, pl, pt_BR, pt_PT, sl, sr, sv, tr : rescale frequencies to match spec. This has no large effect in practice, except that the dictionary becomes stronger relative to the spatial model (especially for lower-count corpora, like lt, lv, sr).
en* : Small changes (essentially rounding going the other way).
ru : the above rescaling, and remove the following words: Дре, ОСТа, Планше, легкими, легком, легкому, легкости, легкую, нелегкие, нелегкий, нелегким, нелегкое, нелегкой, нелегкую, полулегком and add нелёгкие, нелёгкое, нелёгкую; other accented forms were already in the dictionary. Change-Id: I40386c2ebd4d2be38874e822bde89db7cb512ae6 |
||
Jean Chalard
|
bd793ed50d |
Update dictionaries
>>> dictionaries/en_GB_wordlist.combined.gz Header : date : 1353500789 <=> 1354870724 Body : Added: Dad 75 Added: Daddy 60 Added: Grandma 60 Added: Grandpa 55 Added: Mama 59 Added: Mom 77 Added: Papa 55 >>> dictionaries/en_US_wordlist.combined.gz Header : date : 1351675958 <=> 1354870736 version : 26 <=> 27 Body : Deleted: Rod's 46 Added: Dad 75 Added: Daddy 60 Added: Grandma 60 Added: Grandpa 55 Added: Mama 59 Added: Mom 77 Added: Papa 55 >>> dictionaries/en_wordlist.combined.gz Header : date : 1353500998 <=> 1354870744 Body : Deleted: Rod's 46 Added: Dad 75 Added: Daddy 60 Added: Grandma 60 Added: Grandpa 55 Added: Mama 59 Added: Mom 77 Added: Papa 55 >>> dictionaries/fr_wordlist.combined.gz Header : date : 1353500832 <=> 1354872988 Body : Deleted: noël 71 Deleted: po 73 Deleted: ti 73 Added: Noël 71 Added: lose 1 Added: y'a 130 >>> dictionaries/ru_wordlist.combined.gz Header : date : 1353567943 <=> 1354870130 Body : Demote all CAPS words by 80 Freq changed: модно 51 -> 20 >>> java/res/raw/main_en.dict Header : date : 1353500998 <=> 1354870744 Body : Deleted: Rod's 46 Added: Dad 75 Added: Daddy 60 Added: Grandma 60 Added: Grandpa 55 Added: Mama 59 Added: Mom 77 Added: Papa 55 >>> java/res/raw/main_fr.dict Header : date : 1353500832 <=> 1354872988 Body : Deleted: noël 71 Deleted: po 73 Deleted: ti 73 Added: Noël 71 Added: lose 1 Added: y'a 130 >>> java/res/raw/main_ru.dict Header : date : 1353567943 <=> 1354870130 Body : Demote all CAPS words by 80 Freq changed: модно 51 -> 20 Change-Id: I6f2d1c359d716535923b22c33d7fa4c3b0a330e4 |
||
Jean Chalard
|
b40a1ce50b |
Update RU dictionary header.
>>> dictionaries/ru_wordlist.combined.gz >>> java/res/raw/main_ru.dict Header : date : 1353500945 <=> 1353567943 MULTIPLE_WORDS_DEMOTION_RATE : null <=> 0 Body : No differences Bug: 7540132 Change-Id: I837831b1e214da64962cf1bb68c840a3d4e6bf76 |
||
Jean Chalard
|
d5f53710c5 |
Update dictionaries and fix mistakes
- Combined de dict : Remove digraph shortcuts that were in by mistake.
- Combined en dict : Set freq of "baton" "batons" "mace" "puff" "puffs" and "tasers" to zero. They are offensive in en_GB.
- Combined en_GB dict : Change freq of "il" to 0 and flag it "not a word". Still in the dict as a whitelist entry for "I'll"; for some reason it had freq 99. Add "milk:122" and "practice:143".
- Combined fr dict : Add missing words : "Nostradamus:40" "défendais:30" "gmail:50" "générale:140" "hm:0" "hmm:0" "y'en:130" "l'apocalypse:31" "m'épuise:30" "recontacter:80" "t'annonce:30". Set freq of non-word shortcuts for digraphs to 1 instead of 0, allowing to gesture them.
- Combined ru dict : Remove a lot of two-character non-words.
- Binary de dict : Remove the obsolete "options" header, and add the "dictionary" header.
- Binary en dict : Flag "hoe" "hoes" "il" "shel" as non-words. Also drop freq of "il" and "shel" to 0. Add the "locale" header that was missing.
- Binary es dict : Add the "dictionary" header.
- Binary fr dict : Add the same words as above. Non-word shortcuts were already set to 1.
- Binary it dict : Add a "dictionary" header. Also change freq of "Šarapova" from 50 to 37; not sure why it was 50.
- Binary pt_BR dict : Add a "dictionary" header.
- Binary ru dict : Add a "dictionary" header and remove the same words as above.
For all dictionaries : bump the version to 27. Change-Id: I94fe7f8f42b31fdad223085c00a94115e14d2276 |
||
Jean Chalard
|
d0cf96493c |
Use all Lexiteria sources and update existing directories.
New dictionaries :
- Danish
- Greek
- Finnish
- Lithuanian
- Latvian
- Dutch
- Polish
- Russian
- Slovene
- Serbian
- Swedish
- Turkish
Also, compress those files to reduce the footprint in the repository. Also, update and improve English and French dictionaries, and add the ligatures shortcut into the French dictionary. Finally, move the Russian binary dictionary here now that it can at last be open sourced. Bug: 5587752 Bug: 6775251 Bug: 6995793 Bug: 7149666 Change-Id: Iec9831d4dce425a2b5b0657571e4448436610525 |
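
The summary diffs in the commit messages above follow a regular textual shape: `Header : date : A <=> B`, then `Body :` entries such as `Added: word 25`, `Deleted: word 126`, `Freq changed: word 93 -> 0` or `Probability changed: word 0 -> 149`. The following is a minimal sketch for reading them back into structured data. It is not part of the repository or of any dictionary tool; the function names (`parse_body`, `header_dates`, `demoted_to_zero`) are hypothetical, and treating the `date` header as Unix epoch seconds is an assumption based on the values falling in the 2012 to 2014 range of these commits. Entries with other shapes (`Shortcut added`, `Not a word`, `Demote all CAPS words by 80`) are deliberately ignored.

```python
import re
from datetime import datetime, timezone

# Assumed entry shapes, taken from the commit messages in this log:
#   "Added: LTE 25", "Deleted: Sea 126",
#   "Freq changed: cash 80 -> 20", "Probability changed: kommen 0 -> 149"
ENTRY = re.compile(
    r"(Added|Deleted|Freq changed|Probability changed): "
    r"(\S+) (\d+)(?: -> (\d+))?"
)
# Assumed header shape: "date : 1366003217 <=> 1366272255"
HEADER_DATE = re.compile(r"date : (\d+) <=> (\d+)")


def parse_body(text):
    """Return (kind, word, old_freq, new_freq) tuples found in text."""
    entries = []
    for kind, word, first, second in ENTRY.findall(text):
        if kind == "Added":
            entries.append(("added", word, None, int(first)))
        elif kind == "Deleted":
            entries.append(("deleted", word, int(first), None))
        elif second:  # a change entry needs both old and new values
            entries.append(("changed", word, int(first), int(second)))
    return entries


def header_dates(text):
    """Return (old, new) UTC datetimes, assuming epoch-second dates."""
    def to_utc(value):
        return datetime.fromtimestamp(int(value), tz=timezone.utc)
    return [(to_utc(a), to_utc(b)) for a, b in HEADER_DATE.findall(text)]


def demoted_to_zero(entries):
    """Words whose frequency was changed from a positive value to 0."""
    return [
        word
        for kind, word, old, new in entries
        if kind == "changed" and old > 0 and new == 0
    ]


if __name__ == "__main__":
    sample = (
        "Header : date : 1370244430 <=> 1372393835 version : 34 <=> 35 "
        "Body : Deleted: Sea 126 Added: LTE 25 "
        "Freq changed: связывание 93 -> 0"
    )
    for entry in parse_body(sample):
        print(entry)
    print(demoted_to_zero(parse_body(sample)))
    print(header_dates(sample))
```

Matching words with `\S+` keeps hyphenated and apostrophized entries such as `пре-мРНК` or `R'n'B` intact. Multi-token entries would need a different pattern.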