Aleksandras Kostarevas
dcce3ea5ae
Skip token mixes if total sum is 0
This case seems to actually be quite common during fuzzing
2024-05-16 18:37:43 -05:00
Aleksandras Kostarevas
be5ed15220
Fix some linter warnings
2024-05-16 17:18:08 -05:00
Aleksandras Kostarevas
b59aa89363
Use more C++ style memory management
2024-05-16 14:33:02 -05:00
Aleksandras Kostarevas
e19de589f1
Skip invalid token mixes
2024-05-16 12:18:13 -05:00
Aleksandras Kostarevas
09a6a30d8b
Use jstring2string for more strings
2024-05-16 12:07:16 -05:00
Aleksandras Kostarevas
99d5fda170
Fix incorrect LM results with number row
2024-05-07 11:31:15 -05:00
Aleksandras Kostarevas
0e1a338f0d
Re-enable is_bugged check for now
2024-04-30 14:14:41 -04:00
Aleksandras Kostarevas
57cb64f8bd
Remove some logging
2024-04-30 13:58:46 -04:00
Aleksandras Kostarevas
85de4c86d4
Disable is_bugged check
2024-04-29 22:19:49 -04:00
Aleksandras Kostarevas
0b1ad01f1a
LM rescoring WIP
2024-04-28 21:55:32 -04:00
Aleksandras Kostarevas
46daec4972
Split by n_batch for llama_decode
2024-04-22 14:37:14 -04:00
Aleksandras Kostarevas
8ae3263822
Implement initial swipe typing
2024-04-18 10:29:10 -05:00
Aleksandras Kostarevas
9308bcbfb0
Reduce logging
2024-04-11 00:43:56 -05:00
Aleksandras Kostarevas
cbd75f9799
Fix some race conditions and properly free language model
2024-04-09 23:06:31 -05:00
Aleksandras Kostarevas
434a751d63
Fix Modified UTF-8 errors when returning strings
2024-03-21 16:49:45 -05:00
Aleksandras Kostarevas
38055fae65
Add other workaround
2024-03-13 16:11:52 -05:00
Aleksandras Kostarevas
350b8e8fcf
Add bad word filtering and blacklisting
2024-03-13 13:31:51 -05:00
Aleksandras Kostarevas
9fed68c03a
Fix segfault when no results / only 1 result
2024-03-07 14:56:21 +02:00
Aleksandras Kostarevas
c57a3d83af
Add personal dictionary glossary for voice input and keyboard
2024-03-05 15:24:30 +02:00
Aleksandras Kostarevas
6453c15a21
Merge branch 'lm-2-finetuning-whisperggml' into 'model-metadata'
Add autocorrect threshold to model-metadata branch
See merge request alex/latinime!6
2024-02-03 15:18:27 +00:00
Aleksandras Kostarevas
c7113297fb
Add radio selection for threshold
2024-02-01 21:55:56 +02:00
Aleksandras Kostarevas
a111164bb8
Improve algorithm in a few ways:
* If the first letter is capitalized, only capitalized first tokens are sampled. If the whole text is uppercase, only fully uppercase tokens are sampled for the whole word
* If a word is an exact match, it gets boosted relative to others
* Probability threshold for autocorrect is now 18.0
* Add a "clueless" threshold: if it's less than 1.3, just show the user's typed word in the middle instead
2024-01-30 20:30:44 +02:00
Aleksandras Kostarevas
0021b6aa04
Model metadata and manager component
2024-01-24 01:03:16 +02:00
Aleksandras Kostarevas
5e0722c984
Fix issue with apostrophe token being banned
2024-01-22 08:20:55 +02:00
Aleksandras Kostarevas
55d5959f54
Skip non-alphabetic characters during mixing
2024-01-09 18:25:14 +02:00
Aleksandras Kostarevas
ebb70b9c12
Fix build, disable gesture input pending model update
2023-12-19 20:28:58 +02:00
Aleksandras Kostarevas
4e9e86d871
Implement multimodal position encoder
2023-12-19 20:02:20 +02:00
Aleksandras Kostarevas
7075c22179
Add key embedding mixing
2023-12-04 20:09:51 +00:00
Aleksandras Kostarevas
4f15ff4a73
Add experimental swipe typing
2023-11-28 17:01:58 +00:00
Aleksandras Kostarevas
14fcb55565
Save LoRA-merged model after training
2023-11-14 20:40:00 +02:00
Aleksandras Kostarevas
0e0876f06c
Revise training
2023-11-13 16:42:01 +02:00
Aleksandras Kostarevas
ee8a81f12c
Initial fine-tuning
2023-11-07 16:48:48 +02:00
Aleksandras Kostarevas
5778cd15a0
Update ggml and llama.cpp
2023-11-06 13:41:25 +02:00
Aleksandras Kostarevas
7c4531e32d
Fix crashes related to too large context
2023-10-16 18:24:00 +03:00
Aleksandras Kostarevas
92480fd460
Adjust space probability and mustNotAutocorrect
2023-10-13 18:44:38 +03:00
Aleksandras Kostarevas
c34a411989
Fix infinite prediction loop
2023-10-13 18:34:49 +03:00
Aleksandras Kostarevas
b8539ce88a
Initial batched inference using llama_batch
2023-10-10 22:34:04 +03:00
Aleksandras Kostarevas
16fdb3629d
Add LanguageModel class
2023-09-28 19:42:29 +03:00