|
7f2a212c6a
|
catching bad alloc
|
2019-02-28 15:38:43 +01:00 |
|
|
829c70d320
|
long file support
|
2019-02-26 13:46:57 +01:00 |
|
|
d39c0400c9
|
occurrence refactoring
|
2019-01-22 14:07:28 +01:00 |
|
|
73b3d22d97
|
removing throw declarations
|
2019-01-18 13:30:51 +01:00 |
|
|
210929751d
|
const to the getter for total occurrences count
|
2019-01-16 13:15:30 +01:00 |
|
|
ec621fb310
|
working full search
|
2019-01-09 18:31:52 +01:00 |
|
|
5a7cbbe9e9
|
full search stub - tests needed
|
2019-01-09 15:30:56 +01:00 |
|
|
53b100b2e4
|
lowercasing bad utf
|
2018-12-13 17:43:01 +01:00 |
|
|
2eda92fe7a
|
interval contains
|
2018-12-12 21:45:07 +01:00 |
|
|
4258caf522
|
correction
|
2018-08-29 11:08:01 +02:00 |
|
|
bd4ff81e32
|
ensuring UTF-8 strings
|
2017-10-15 18:54:15 +02:00 |
|
|
61631c52a3
|
lexicon search
|
2017-10-10 15:39:47 +02:00 |
|
|
5e809efcce
|
corrected tokenizer
|
2017-05-05 12:58:32 +02:00 |
|
|
96a5bc3108
|
original sentence in tokenized sentence
|
2017-04-28 13:48:32 +02:00 |
|
|
4faae4e91a
|
slight change
|
2017-04-27 13:52:03 +02:00 |
|
|
dceb0d9f47
|
date recognition
|
2017-04-27 10:37:29 +02:00 |
|
|
bd73749388
|
new tokenizer
|
2017-04-26 17:02:18 +02:00 |
|
|
a0673df75a
|
cpplint corrections
|
2017-04-22 23:47:48 +02:00 |
|
|
970dda5dc2
|
option of white space tokenization while searching
|
2017-04-22 23:45:51 +02:00 |
|
|
31e4f091ad
|
multiple results
|
2017-04-21 14:51:58 +02:00 |
|
|
c3826919ba
|
changes in CMakeLists.txt
|
2017-03-03 11:28:54 +01:00 |
|
|
cf7b1592f7
|
updated todo
|
2016-11-01 22:23:30 +01:00 |
|
|
7e005bfca7
|
changed significance factor to 2
|
2016-10-22 18:02:04 +02:00 |
|
|
8bc739ff20
|
added boundary on simple search results
|
2016-01-25 22:42:42 +01:00 |
|
|
b3d7c993aa
|
tokenize only option - no word map
|
2016-01-01 20:45:07 +01:00 |
|
|
bbf3853d2a
|
added lowercasing when tokenizing by space
|
2015-12-29 21:44:46 +01:00 |
|
|
0a8d2fdd39
|
tokenize by whitespace option
|
2015-12-27 20:54:40 +01:00 |
|
|
873d7c300c
|
added parameterless constructor for concordia
|
2015-10-19 15:38:10 +02:00 |
|
|
1adabf4833
|
add index path as required argument to concordia constructor
|
2015-10-16 22:14:11 +02:00 |
|
|
f585ff9e01
|
corpus figures creator
|
2015-10-06 13:34:03 +02:00 |
|
|
96c74c47ac
|
corpus analyzer
|
2015-10-04 16:24:58 +02:00 |
|
|
2601dc83bf
|
test corpus for corpus analyzer
|
2015-10-03 16:19:10 +02:00 |
|
|
4e17e28f7f
|
working corpus analyzer
|
2015-10-03 16:18:49 +02:00 |
|
|
fa3138df29
|
count occurrences feature
|
2015-10-01 13:36:54 +02:00 |
|
|
fd32ff7e12
|
todo
|
2015-09-07 08:15:46 +02:00 |
|
|
cdeb57ccfa
|
todo
|
2015-08-26 20:14:43 +02:00 |
|
|
bd62420cd5
|
updated tutorial
|
2015-08-24 14:30:20 +02:00 |
|
|
0a3fd8a04e
|
added an extremely important improvement to the concordia search algorithm - gapped overlays cut-off
|
2015-08-24 13:10:06 +02:00 |
|
|
209e374226
|
repaired concordia test
|
2015-08-19 20:53:40 +02:00 |
|
|
68fecaddf8
|
adding all tokenized examples
|
2015-08-19 20:49:26 +02:00 |
|
|
a765443a01
|
simple search returns matched pattern fragments
|
2015-08-07 12:54:57 +02:00 |
|
|
28704c2f43
|
separated tokenization and adding to index
|
2015-08-01 17:03:39 +02:00 |
|
|
5a57406875
|
finished original word positions
|
2015-06-27 12:40:24 +02:00 |
|
|
a8c5fa0c75
|
original word positions
|
2015-06-27 10:09:49 +02:00 |
|
|
dba70b4e24
|
done word positions
|
2015-06-26 22:50:53 +02:00 |
|
|
724bf0d080
|
new responsibilities of tokenized sentence
|
2015-06-26 15:38:24 +02:00 |
|
|
9b1735516c
|
working sentence tokenizer
|
2015-06-25 20:49:22 +02:00 |
|
|
8432dd321f
|
tokenizer in progress
|
2015-06-25 10:12:51 +02:00 |
|
|
0baf3e4ef2
|
character intervals in progress
|
2015-06-22 13:52:56 +02:00 |
|
|
4c0f2fd08d
|
modified todo
|
2015-06-12 12:25:02 +02:00 |
|