David Huynh
496823e564
Added start() and end() methods to RowVisitor and RecordVisitor so visitors can do things before and after all visitations.
...
Added sorting package. It's not hooked up, yet.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@834 7d457c2a-affb-35e4-300a-418c747d4874
2010-05-20 22:10:34 +00:00
David Huynh
28ca652dea
More row/record model refactoring work. Everything should still be working almost as before, except contextual rows are not shown in row-based mode.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@823 7d457c2a-affb-35e4-300a-418c747d4874
2010-05-20 00:13:19 +00:00
David Huynh
1e737e3238
Factored row dependency code from Row class and Project class out as Record and RecordModel classes.
...
Simplified RdfTripleImporter.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@820 7d457c2a-affb-35e4-300a-418c747d4874
2010-05-19 04:22:45 +00:00
David Huynh
6450921c02
Fixed issue 4: Match All bug with ZIP code.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@767 7d457c2a-affb-35e4-300a-418c747d4874
2010-05-14 17:03:33 +00:00
Stefano Mazzocchi
ea459aed07
Applied a bunch of patches from Tom Morris (Issues 25, 26, and 27)
...
- make java6 dependency explicit in eclipse project files
- avoid using NotImplementedException, especially the sun.* one
- avoid using internal sun signal handling and rely on standard java.* APIs
(I tested this one and it seems to be working fine)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@756 7d457c2a-affb-35e4-300a-418c747d4874
2010-05-13 21:02:19 +00:00
David Huynh
8412aa72dd
Fixed Issue 17: Conflated triples - all rows are producing triple with "s" :" $Name_0".
...
Also exposed "id" field for recon objects.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@720 7d457c2a-affb-35e4-300a-418c747d4874
2010-05-12 07:45:22 +00:00
Stefano Mazzocchi
c32899aea6
clearing PMD warnings
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@600 7d457c2a-affb-35e4-300a-418c747d4874
2010-05-05 01:42:08 +00:00
Stefano Mazzocchi
e6d36710ff
findbug cleanups
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@599 7d457c2a-affb-35e4-300a-418c747d4874
2010-05-05 01:35:51 +00:00
Stefano Mazzocchi
92ecc0c0f5
detab + dedos for java files (no functional changes)
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@594 7d457c2a-affb-35e4-300a-418c747d4874
2010-05-04 23:24:48 +00:00
David Huynh
bab1e8905b
Jacked up jetty form upload size limit.
...
Added a few more array bound checks.
Reduced number of recon candidate and recon objects created by extend data operations.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@577 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-30 19:41:53 +00:00
David Huynh
3f40195ea1
Implemented but disabled the denormalize operation.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@571 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-29 22:07:07 +00:00
David Huynh
17c9b65889
Made (blank) and (error) choices in list facets editable, too.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@569 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-29 20:18:06 +00:00
David Huynh
cf01dcd965
In column addition and text transform operations, for expressions that evaluate to cells or wrapped cells, use the whole cells as the result cells. This effectively copies their recon objects as well.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@568 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-29 19:44:39 +00:00
David Huynh
89e1d8b5ac
Got history entries' IDs into Recon objects so we can track from a Recon object to all others created by the same operation.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@562 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-28 17:18:44 +00:00
David Huynh
15c188ad7a
Added more metadata into recon objects.
...
Tried to minimize number of unique recon objects created when calling Recon.dup().
git-svn-id: http://google-refine.googlecode.com/svn/trunk@560 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-27 23:17:18 +00:00
David Huynh
53d7bd3287
Another star to flag copy and paste bug.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@549 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-26 01:18:57 +00:00
David Huynh
fed3c87fa6
Added row flagging support. Fixed bug in row star change: starring or unstarring one row wasn't undo-able previously.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@547 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-26 01:08:56 +00:00
David Huynh
f9a829758e
Pool recons and recon candidates. This yields smaller project files, change files, and AJAX responses for get-rows. It should make re-loading existing projects faster.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@521 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-23 19:39:12 +00:00
David Huynh
5ba67b7b26
Implemented column split command. It seems to be working in "by lengths" mode.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@510 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-20 23:08:14 +00:00
David Huynh
1d938bc4d0
Better MQL batching during extending data operations.
...
Tried to use JSON streaming in changes as well.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@479 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-15 00:09:14 +00:00
David Huynh
24a7ea91b6
Fixed bugs
...
- MassEditOperation was barfing when engineConfig was missing
- When parsing JSON in streaming mode, get long instead of int and double instead of float so that we won't get overflow exception.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@476 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-14 20:34:29 +00:00
David Huynh
8b95248c75
Fixed bug where reconciling by ID, GUID, or key would generate a buggy numeric range facet, since all the scores were artificially the same.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@454 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-12 22:19:44 +00:00
David Huynh
759824e1b4
Bug fix: editing one facet choice while some other choices are selected resulted in no change.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@429 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-08 22:16:12 +00:00
Stefano Mazzocchi
7526c4e582
cleanups (no functional changes)
...
this makes pmd and javac on linux happier
git-svn-id: http://google-refine.googlecode.com/svn/trunk@427 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-08 20:46:02 +00:00
David Huynh
9d9329ca96
Implemented row remove command.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@391 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-06 07:47:44 +00:00
David Huynh
1fd85c62bf
Implemented column rename command.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@390 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-06 07:15:34 +00:00
David Huynh
f402db10af
Implemented inter-project joins.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@387 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-06 05:35:48 +00:00
Stefano Mazzocchi
2efbf0031f
- removed the 'thirdparty' directory (now the 'gridworks' script will download and install needed tools if they are not present in the system already)
...
- added 'findbugs' command that uses the findbugs static analyzer to look for problems in the code
- fixed a bunch of issues that findbugs found (a few methods would go a little faster, and a few NPEs will be avoided... nothing major but good to have)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@382 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-05 07:15:16 +00:00
David Huynh
084a6114d7
Track freebase types of columns added with data from Freebase, so that we can later add more data based on those columns. Fixed minor bug in serialization of data extension records.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@303 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-16 06:18:00 +00:00
David Huynh
c6e7986206
Extend data operation is working.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@301 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-16 00:24:20 +00:00
David Huynh
e35c4c3b94
Minor bug.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@294 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-14 05:15:23 +00:00
David Huynh
af3cb76056
Added support for including dependent rows in row visiting. Facets still don't count them, though.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@282 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-12 01:06:23 +00:00
David Huynh
80e6111a92
Added options for omitting error and blank choices in list facets, and use them in the various recon facets.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@227 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-07 22:54:02 +00:00
David Huynh
694f09fb0a
Major refactoring: everything is now saved to disk using our own formats, mostly json-based, some inside zip files.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@226 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-07 22:37:26 +00:00
David Huynh
e0d72c81e9
Renamed "facet-based edit" operation and command to "mass edit", because it's not just facet-based.
...
Added option "apply to other cells with same original content" to single cell edit popup, so it can be used like a find&replace operation.
Renamed "do-text-transform" operation and command to just "text-transform".
git-svn-id: http://google-refine.googlecode.com/svn/trunk@223 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-07 00:25:00 +00:00
David Huynh
db824bffeb
Fixed bug in saving recon changes.
...
Fixed bug in discard recon judgment operation.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@218 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-06 08:03:29 +00:00
David Huynh
78b1eb7e73
Major refactoring:
...
- Made all Change classes save to and load from .zip files.
- Changed Column.headerLabel to Column.name.
- Save project's raw data to "raw-data" file for now. We'll make it save to a zip file next.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@217 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-06 07:43:45 +00:00
David Huynh
1d6db8fa6e
Made recon process cause the client page to create facets when the recon process is done.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@203 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-05 01:13:59 +00:00
David Huynh
9d8b746121
Switched Cell.value from Object to Serializable.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@201 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-04 19:59:31 +00:00
David Huynh
72d06fe65c
Added support for canceling running and pending processes.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@183 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-04 01:14:48 +00:00
David Huynh
eaef7b2394
Also let user decide what to do on expression evaluation error when creating a new column.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@182 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-04 00:32:54 +00:00
David Huynh
5a0a8bea4f
Added custom dialog box for create column operation (with a field for the new column name).
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@180 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-04 00:12:39 +00:00
David Huynh
2fe8f98e4e
Added repeat and repeatCount options for text transform operation. This lets us fix those & repeated encoding problems easily.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@179 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-04 00:00:46 +00:00
David Huynh
b4d2cef526
Added an option for what to do when a text transform errors out. Made a custom expression preview dialog for the text transform command in order to support that option.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@178 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-03 22:12:48 +00:00
David Huynh
b75f1faea8
Changed tabs to spaces. No functionality change.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@174 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-03 04:19:58 +00:00
Stefano Mazzocchi
2691ee50d7
adding OS-specific data paths
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@173 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-03 02:53:07 +00:00
David Huynh
3ecfb4e4d9
Implemented facet-based edit operation for real.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@167 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-02 20:33:11 +00:00
David Huynh
f16727c20c
Refactored recon code on the server side to prepare for supporting other modes of recon.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@162 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-01 23:33:23 +00:00
David Huynh
bc9bc54d30
Implemented a meta parser that looks for a language prefix and picks the right parser.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@159 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-01 01:50:56 +00:00
David Huynh
acfa19a683
Moved GEL stuff (gridworks expression language) into gel package.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@158 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-01 01:30:31 +00:00