Commit Graph

61 Commits

Author SHA1 Message Date
David Huynh
496823e564 Added start() and end() methods to RowVisitor and RecordVisitor so visitors can do things before and after all visitations.
Added sorting package. It's not hooked up, yet.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@834 7d457c2a-affb-35e4-300a-418c747d4874
2010-05-20 22:10:34 +00:00
David Huynh
28ca652dea More row/record model refactoring work. Everything should still be working almost as before, except contextual rows are not shown in row-based mode.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@823 7d457c2a-affb-35e4-300a-418c747d4874
2010-05-20 00:13:19 +00:00
David Huynh
1e737e3238 Factored row dependency code from Row class and Project class out as Record and RecordModel classes.
Simplified RdfTripleImporter.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@820 7d457c2a-affb-35e4-300a-418c747d4874
2010-05-19 04:22:45 +00:00
David Huynh
6450921c02 Fixed issue 4: Match All bug with ZIP code.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@767 7d457c2a-affb-35e4-300a-418c747d4874
2010-05-14 17:03:33 +00:00
Stefano Mazzocchi
ea459aed07 Applied a bunch of patches from Tom Morris (Issue 25, 26 and 27)
- make java6 dependency explicit in eclipse project files
- avoid using NotImplementedException, especially the sun.* one
- avoid using internal sun signal handling and rely on standard java.* APIs
 (I tested this one and it seems to be working fine)


git-svn-id: http://google-refine.googlecode.com/svn/trunk@756 7d457c2a-affb-35e4-300a-418c747d4874
2010-05-13 21:02:19 +00:00
David Huynh
8412aa72dd Fixed Issue 17: Conflated triples - all rows are producing triple with "s" :" $Name_0".
Also exposed "id" field for recon objects.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@720 7d457c2a-affb-35e4-300a-418c747d4874
2010-05-12 07:45:22 +00:00
Stefano Mazzocchi
c32899aea6 clearing PMD warnings
git-svn-id: http://google-refine.googlecode.com/svn/trunk@600 7d457c2a-affb-35e4-300a-418c747d4874
2010-05-05 01:42:08 +00:00
Stefano Mazzocchi
e6d36710ff findbugs cleanups
git-svn-id: http://google-refine.googlecode.com/svn/trunk@599 7d457c2a-affb-35e4-300a-418c747d4874
2010-05-05 01:35:51 +00:00
Stefano Mazzocchi
92ecc0c0f5 detab + dedos for java files (no functional changes)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@594 7d457c2a-affb-35e4-300a-418c747d4874
2010-05-04 23:24:48 +00:00
David Huynh
bab1e8905b Jacked up jetty form upload size limit.
Added a few more array bound checks.
Reduced number of recon candidate and recon objects created by extend data operations.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@577 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-30 19:41:53 +00:00
David Huynh
3f40195ea1 Implemented but disabled the denormalize operation.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@571 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-29 22:07:07 +00:00
David Huynh
17c9b65889 Made (blank) and (error) choices in list facets editable, too.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@569 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-29 20:18:06 +00:00
David Huynh
cf01dcd965 In column addition and text transform operations, for expressions that evaluate to cells or wrapped cells, use the whole cells as the result cells. This effectively copies their recon objects as well.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@568 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-29 19:44:39 +00:00
David Huynh
89e1d8b5ac Got history entries' IDs into Recon objects so we can track from a Recon object to all others created by the same operation.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@562 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-28 17:18:44 +00:00
David Huynh
15c188ad7a Added more metadata into recon objects.
Tried to minimize number of unique recon objects created when calling Recon.dup().

git-svn-id: http://google-refine.googlecode.com/svn/trunk@560 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-27 23:17:18 +00:00
David Huynh
53d7bd3287 Another star to flag copy and paste bug.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@549 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-26 01:18:57 +00:00
David Huynh
fed3c87fa6 Added row flagging support. Fixed bug in row star change: starring or unstarring one row wasn't undo-able previously.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@547 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-26 01:08:56 +00:00
David Huynh
f9a829758e Pool recons and recon candidates. This yields smaller project files, change files, and AJAX responses for get-rows. It should make re-loading existing projects faster.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@521 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-23 19:39:12 +00:00
David Huynh
5ba67b7b26 Implemented column split command. It seems to be working in "by lengths" mode.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@510 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-20 23:08:14 +00:00
David Huynh
1d938bc4d0 Better MQL batching during extending data operations.
Tried to use JSON streaming in changes as well.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@479 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-15 00:09:14 +00:00
David Huynh
24a7ea91b6 Fixed bugs
- MassEditOperation was barfing when engineConfig was missing
- When parsing JSON in streaming mode, get long instead of int and double instead of float so that we won't get overflow exception.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@476 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-14 20:34:29 +00:00
David Huynh
8b95248c75 Fixed bug where reconciling by ID, GUID, or key would generate a buggy numeric range facet, since all the scores were artificially the same.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@454 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-12 22:19:44 +00:00
David Huynh
759824e1b4 Bug fix: editing one facet choice while some other choices are selected resulted in no change.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@429 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-08 22:16:12 +00:00
Stefano Mazzocchi
7526c4e582 cleanups (no functional changes)
this makes pmd and javac on linux happier


git-svn-id: http://google-refine.googlecode.com/svn/trunk@427 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-08 20:46:02 +00:00
David Huynh
9d9329ca96 Implemented row remove command.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@391 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-06 07:47:44 +00:00
David Huynh
1fd85c62bf Implemented column rename command.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@390 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-06 07:15:34 +00:00
David Huynh
f402db10af Implemented inter-project joins.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@387 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-06 05:35:48 +00:00
Stefano Mazzocchi
2efbf0031f - removed the 'thirdparty' directory (now the 'gridworks' script will download and install needed tools if they are not present in the system already)
- added 'findbugs' command that uses the findbugs static analyzer to look for problems in the code
- fixed a bunch of issues that findbugs found (a few methods would go a little faster, and a few NPE will be avoided... nothing major but good to have)


git-svn-id: http://google-refine.googlecode.com/svn/trunk@382 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-05 07:15:16 +00:00
David Huynh
084a6114d7 Track freebase types of columns added with data from Freebase, so that we can later add more data based on those columns. Fixed minor bug in serialization of data extension records.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@303 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-16 06:18:00 +00:00
David Huynh
c6e7986206 Extend data operation is working.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@301 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-16 00:24:20 +00:00
David Huynh
e35c4c3b94 Minor bug.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@294 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-14 05:15:23 +00:00
David Huynh
af3cb76056 Added support for including dependent rows in row visiting. Facets still don't count them, though.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@282 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-12 01:06:23 +00:00
David Huynh
80e6111a92 Added options for omitting error and blank choices in list facets, and use them in the various recon facets.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@227 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-07 22:54:02 +00:00
David Huynh
694f09fb0a Major refactoring: everything is now saved to disk using our own formats, mostly json-based, some inside zip files.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@226 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-07 22:37:26 +00:00
David Huynh
e0d72c81e9 Renamed "facet-based edit" operation and command to "mass edit", because it's not just facet-based.
Added option "apply to other cells with same original content" to single cell edit popup, so it can be used like a find&replace operation.
Renamed "do-text-transform" operation and command to just "text-transform".

git-svn-id: http://google-refine.googlecode.com/svn/trunk@223 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-07 00:25:00 +00:00
David Huynh
db824bffeb Fixed bug in saving recon changes.
Fixed bug in discard recon judgment operation.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@218 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-06 08:03:29 +00:00
David Huynh
78b1eb7e73 Major refactoring:
- Made all Change classes save to and load from .zip files.
- Changed Column.headerLabel to Column.name.
- Save project's raw data to "raw-data" file for now. We'll make it save to a zip file next.


git-svn-id: http://google-refine.googlecode.com/svn/trunk@217 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-06 07:43:45 +00:00
David Huynh
1d6db8fa6e Made recon process cause the client page to create facets when the recon process is done.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@203 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-05 01:13:59 +00:00
David Huynh
9d8b746121 Switched Cell.value from Object to Serializable.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@201 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-04 19:59:31 +00:00
David Huynh
72d06fe65c Added support for canceling running and pending processes.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@183 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-04 01:14:48 +00:00
David Huynh
eaef7b2394 Also let user decide what to do on expression evaluation error when creating a new column.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@182 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-04 00:32:54 +00:00
David Huynh
5a0a8bea4f Added custom dialog box for create column operation (with a field for the new column name).
git-svn-id: http://google-refine.googlecode.com/svn/trunk@180 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-04 00:12:39 +00:00
David Huynh
2fe8f98e4e Added repeat and repeatCount options for text transform operation. This lets us fix those &amp;amp; repeated encoding problems easily.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@179 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-04 00:00:46 +00:00
David Huynh
b4d2cef526 Added an option for what to do when a text transform errors out. Made a custom expression preview dialog for the text transform command in order to support that option.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@178 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-03 22:12:48 +00:00
David Huynh
b75f1faea8 Changed tabs to spaces. No functionality change.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@174 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-03 04:19:58 +00:00
Stefano Mazzocchi
2691ee50d7 adding OS-specific data paths
git-svn-id: http://google-refine.googlecode.com/svn/trunk@173 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-03 02:53:07 +00:00
David Huynh
3ecfb4e4d9 Implemented facet-based edit operation for real.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@167 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-02 20:33:11 +00:00
David Huynh
f16727c20c Refactored recon code on the server side to prepare for supporting other modes of recon.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@162 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-01 23:33:23 +00:00
David Huynh
bc9bc54d30 Implemented a meta parser that looks for a language prefix and picks the right parser.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@159 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-01 01:50:56 +00:00
David Huynh
acfa19a683 Moved GEL stuff (gridworks expression language) into gel package.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@158 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-01 01:30:31 +00:00