David Huynh
2fe8f98e4e
Added repeat and repeatCount options for text transform operation. This lets us fix those & repeated encoding problems easily.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@179 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-04 00:00:46 +00:00
David Huynh
b4d2cef526
Added an option for what to do when a text transform errors out. Made a custom expression preview dialog for the text transform command in order to support that option.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@178 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-03 22:12:48 +00:00
David Huynh
c1498448e4
Implemented global and per-project expression histories and hooked them up to the expression preview dialog.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@176 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-03 21:21:38 +00:00
David Huynh
b75f1faea8
Changed tabs to spaces. No functionality change.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@174 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-03 04:19:58 +00:00
Stefano Mazzocchi
2691ee50d7
adding OS-specific data paths
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@173 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-03 02:53:07 +00:00
David Huynh
ad7671508f
Added "cancel processes" command, not hooked up yet.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@171 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-03 00:30:39 +00:00
David Huynh
59c5314e42
Fixed bug in list facet: list facets on columns with numeric data weren't working before.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@169 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-02 23:07:33 +00:00
David Huynh
3ecfb4e4d9
Implemented facet-based edit operation for real.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@167 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-02 20:33:11 +00:00
David Huynh
512cd16381
Implemented recon by keys, guids, and ids.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@165 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-02 18:19:20 +00:00
David Huynh
99ae6109d8
Started work on key-based recon.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@164 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-02 03:31:58 +00:00
David Huynh
e57aae888b
Hooked up the recon service at data.labs.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@163 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-02 00:33:32 +00:00
David Huynh
f16727c20c
Refactored recon code on the server side to prepare for supporting other modes of recon.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@162 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-01 23:33:23 +00:00
Stefano Mazzocchi
621655372f
- save encoding and confidence in the project metadata
- use the saved encoding for decoding
- don't error when fingerprinting null
git-svn-id: http://google-refine.googlecode.com/svn/trunk@160 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-01 04:56:16 +00:00
David Huynh
bc9bc54d30
Implemented a meta parser that looks for a language prefix and picks the right parser.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@159 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-01 01:50:56 +00:00
David Huynh
acfa19a683
Moved GEL stuff (gridworks expression language) into gel package.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@158 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-01 01:30:31 +00:00
David Huynh
7c38fbb945
Created an ast package for gridworks expression language abstract syntax tree nodes. Moved parsing exception class out to its own file.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@156 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-01 01:14:21 +00:00
David Huynh
983be19e14
Made EvalError serializable because errors can be cell values and need to be saved.
...
Turned is* functions into controls, since they have to be able to test errors, and only controls can do that, not functions.
Polished display of errors in cells and in expression preview dialog.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@155 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-01 00:21:13 +00:00
Stefano Mazzocchi
0c6590fe2c
- added an encoding guesser
- fixed a bunch of encoding issues
- added a function to reinterpret cell content in another encoding
- added a 'phonetic' function to the expression language that supports metaphone and soundex
- updated the COS library to the latest released version
- added the IBM ICU4j library (that contains the encoding guesser)
- added examples with same content but different encodings
git-svn-id: http://google-refine.googlecode.com/svn/trunk@154 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-28 21:51:33 +00:00
Stefano Mazzocchi
d9e67ac806
- diff can now act between two dates (still to be fully tested)
- added string fingerprinting function (useful for clustering)
- fixed unicode() function which wasn't returning correct values
- added a toString method to EvalError to know what error that was
- fixed a NPE in TextTransformationOperation
git-svn-id: http://google-refine.googlecode.com/svn/trunk@153 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-28 00:55:09 +00:00
Stefano Mazzocchi
f1923758e7
- add a bunch of new functions
- very lax date parser
- lots of new advanced string functions
- new version of commons-lang
git-svn-id: http://google-refine.googlecode.com/svn/trunk@152 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-27 08:56:04 +00:00
David Huynh
25fd5794cd
Added choices blank and error to list facets.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@151 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-27 06:59:55 +00:00
David Huynh
49e7241d1d
Re-organized functions into a few sub-packages.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@150 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-27 05:59:09 +00:00
David Huynh
c914aa6c16
Introduced EvalError objects as possible values returned by expressions.
...
Extracted function and control name mappings to ControlFunctionRegistry.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@148 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-27 05:48:33 +00:00
David Huynh
f5ff9044cf
Track and display recon stats in column headers.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@146 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-26 23:33:16 +00:00
David Huynh
30dce3b3d5
Made range facet more robust against bad expressions.
...
Centralized code that updates components of the UI. Show "Working..." indicator if anything takes more than 500ms.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@142 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-26 21:56:41 +00:00
David Huynh
1e4b9f4e80
Fixed bug in text search facet where if the query is null or empty string it'd filter to nothing.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@141 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-26 18:44:36 +00:00
David Huynh
bb83dcda1c
Added support for specifying number of initial rows to skip when creating a new project.
...
Fixed the height of the histogram images in range facets to eliminate jitters.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@135 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-24 18:52:54 +00:00
David Huynh
254853b51d
Added reverse and sort functions.
...
Support a limit on how many rows to load into a new project.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@134 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-23 23:22:02 +00:00
David Huynh
4bdb2320b7
Styled help tab of expression preview dialog. Added variables section.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@131 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-23 19:06:58 +00:00
David Huynh
dce42400d4
Fixed bug introduced while trying to delay constructing the candidates arrays in Recon objects.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@130 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-23 17:47:34 +00:00
David Huynh
ec1604e815
Added support for starring rows.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@129 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-23 07:45:12 +00:00
David Huynh
8992531d02
Documented functions and controls in expression language.
...
Better error checking in operator calls.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@127 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-23 01:40:06 +00:00
David Huynh
607fca04cb
Added a few more math functions.
...
Fixed expression preview dialog to use tabs.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@126 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-23 00:33:39 +00:00
David Huynh
94fbd97bc4
Added a few more expression functions.
...
Bind row index when filtering rows, so we can create facets based on row indices.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@125 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-22 23:51:44 +00:00
David Huynh
0f505c72c5
Delay constructing the candidates array in recon objects to save memory.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@124 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-22 23:27:16 +00:00
David Huynh
c45e0edc10
Lower recon batch size back to 10.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@123 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-22 23:06:34 +00:00
David Huynh
4ed7b45e41
Don't use schema restriction for protograph link suggest because it's not a "soft" restriction (so if the user wants a property that doesn't belong to the type, there is no way to get it).
...
More expression functions and controls.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@122 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-22 23:04:46 +00:00
David Huynh
5e9be8c258
Support reusing newly created topics for cells with the same content.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@121 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-22 22:15:48 +00:00
David Huynh
e4b01cb36c
Make similar cell judgments an abstract operation.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@120 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-22 20:25:45 +00:00
David Huynh
c98a8ad552
Pulled the operations package up one level.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@119 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-22 18:42:25 +00:00
David Huynh
c50de52883
Improved the "extract operations" dialog to let user select which operations to extract. Also show history entries that cannot be abstracted.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@118 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-22 06:31:09 +00:00
David Huynh
1227c9dff4
Centralized mapping between operation names and their reconstructors.
...
Implemented comparison operators.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@117 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-22 04:23:25 +00:00
David Huynh
934c0f81c3
Forgot to check in image files.
...
Added commands for judging similar cells.
Started to fix/unify terminologies for recon operations.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@116 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-22 01:28:13 +00:00
David Huynh
b3167a1a9f
Added option for automatically approving best recon candidates that match the expected type and score at least some minimum score.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@115 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-21 08:11:07 +00:00
David Huynh
b4935f576c
Tweaked recon type guessing heuristic: remove "role" and "annotation" types, and rank types based on result orders rather than relevance scores.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@114 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-21 07:32:16 +00:00
David Huynh
b730dfd8f9
Added commands for searching for specific topics to match cells with.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@113 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-20 00:47:08 +00:00
David Huynh
ea2c904704
Use the schema index to suggest properties in the schema alignment dialog.
...
Fixed minor bug in triple loader transposer that wrote a bad triple for each literal cell value.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@112 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-19 22:56:29 +00:00
David Huynh
846e540ff6
Keep track of type names of reconciled columns so we can display them later in the schema alignment dialog.
...
Automatically create properties linking to all columns when starting with an empty protograph.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@110 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-19 18:32:48 +00:00
David Huynh
6c7557eeff
Minor bug fixes.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@108 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-19 17:47:04 +00:00
David Huynh
bc412b99ea
Fixed bug in triple loader transposer: properties didn't get asserted before.
...
Made triple loader transposer index its output variables.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@106 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-19 07:05:09 +00:00