Commit Graph

82 Commits

Author SHA1 Message Date
David Huynh
8950e87e02 When re-loading existing projects from disk, cache recon objects by their IDs to lower memory consumption.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@437 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-09 19:32:50 +00:00
Stefano Mazzocchi
d3d40d608a bunch of PMD-induced fixes
(now the PMD report is clean)


git-svn-id: http://google-refine.googlecode.com/svn/trunk@430 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-09 00:14:11 +00:00
Stefano Mazzocchi
7526c4e582 cleanups (no functional changes)
this makes pmd and javac on linux happier


git-svn-id: http://google-refine.googlecode.com/svn/trunk@427 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-08 20:46:02 +00:00
David Huynh
9d9329ca96 Implemented row remove command.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@391 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-06 07:47:44 +00:00
David Huynh
1fd85c62bf Implemented column rename command.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@390 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-06 07:15:34 +00:00
David Huynh
f402db10af Implemented inter-project joins.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@387 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-06 05:35:48 +00:00
Stefano Mazzocchi
2efbf0031f - removed the 'thirdparty' directory (now the 'gridworks' script will download and install needed tools if they are not present in the system already)
- added 'findbugs' command that uses the findbugs static analyzer to look for problems in the code
- fixed a bunch of issues that findbugs found (a few methods will go a little faster, and a few NPEs will be avoided... nothing major but good to have)


git-svn-id: http://google-refine.googlecode.com/svn/trunk@382 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-05 07:15:16 +00:00
Stefano Mazzocchi
4eda7ae2c0 avoid an array out of bounds exception in case there are no columns in the dataset
(I know, it should not happen but when it does let's not barf)


git-svn-id: http://google-refine.googlecode.com/svn/trunk@375 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-02 20:21:41 +00:00
Stefano Mazzocchi
521acda025 - pass the svn revision as format version (for more detailed verification)
- add an 'autoreload' setting that makes Gridworks autoreload itself if a class gets changed
(this is useful to make development cycles faster when working on the Java code with an autocompiling IDE like Eclipse or IDEA)


git-svn-id: http://google-refine.googlecode.com/svn/trunk@372 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-02 00:52:38 +00:00
David Huynh
1d0e6abaf8 Got some work done on the plane:
- better detection of record XML elements in XML importer
- XML importer creates column groups and data table view renders them


git-svn-id: http://google-refine.googlecode.com/svn/trunk@356 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-27 05:23:09 +00:00
David Huynh
4df1c4107a Fixed a bug introduced recently: recon candidates were not serializing their topic types for the data view, so in the data view we can't send back a candidate's types when the user wants to match the candidate to some cells. I need to figure out a better way to optimize this.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@350 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-24 03:58:52 +00:00
David Huynh
85d1671d6e Fixed minor bug: recon wasn't saving out its candidates if its judgment is Matched. So when a project is saved and reloaded, it loses all of the recon candidates except for the matches.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@344 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-23 18:02:08 +00:00
David Huynh
2846d66261 Detect max cell index on load, just in case the max cell index we've stored previously was out of whack.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@341 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-23 03:19:17 +00:00
David Huynh
f8d30e9e8e Don't send back recon candidate types for rendering cells.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@340 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-23 01:17:45 +00:00
David Huynh
a43b2a72c1 Made various GEL functions and the forEach control work with java.util.List and java.util.Collection in addition to just Object[].
Added field columnNames to row object.
Added 1-bounded numeric log facet.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@328 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-19 23:04:17 +00:00
David Huynh
cd062cf028 Minor bug: recon candidate's "id" field should return id, not name.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@312 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-17 19:54:27 +00:00
David Huynh
084a6114d7 Track freebase types of columns added with data from Freebase, so that we can later add more data based on those columns. Fixed minor bug in serialization of data extension records.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@303 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-16 06:18:00 +00:00
David Huynh
c6e7986206 Extend data operation is working.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@301 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-16 00:24:20 +00:00
David Huynh
025eccce4b Implemented "record" field for each row.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@283 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-12 06:33:03 +00:00
David Huynh
af3cb76056 Added support for including dependent rows in row visiting. Facets still don't count them, though.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@282 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-12 01:06:23 +00:00
David Huynh
e760750b57 Fixed minor bug that prevented column details from getting passed on to recon service.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@280 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-11 21:55:32 +00:00
David Huynh
b1fca11342 Made recon use cells from context rows.
Fixed bug in menu left-right positioning.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@271 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-10 08:32:20 +00:00
David Huynh
6bf5418f9d Cell changes should also flush column precomputes.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@267 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-10 07:42:57 +00:00
David Huynh
e008332399 - make recon changes flush column precomputes
- fixed bug where recon features are not saved to file properly
- support selecting non-numeric, blank, and error choices in numeric range facets

git-svn-id: http://google-refine.googlecode.com/svn/trunk@265 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-10 06:02:36 +00:00
David Huynh
5d3a57eeeb Implemented project import and export commands (from/to .tar files).
git-svn-id: http://google-refine.googlecode.com/svn/trunk@234 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-08 02:34:25 +00:00
David Huynh
a1ec0ea8df When saving projects, save only modified ones.
Save projects and workspace periodically.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@232 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-08 00:37:06 +00:00
David Huynh
3388c3e09f Still some old Serializable stuff to remove.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@228 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-07 23:02:57 +00:00
David Huynh
694f09fb0a Major refactoring: everything is now saved to disk using our own formats, mostly json-based, some inside zip files.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@226 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-07 22:37:26 +00:00
David Huynh
e06d8fe130 Better checking for null value in Cell.load.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@224 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-07 00:35:44 +00:00
David Huynh
db824bffeb Fixed bug in saving recon changes.
Fixed bug in discard recon judgment operation.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@218 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-06 08:03:29 +00:00
David Huynh
78b1eb7e73 Major refactoring:
- Made all Change classes save to and load from .zip files.
- Changed Column.headerLabel to Column.name.
- Save project's raw data to "raw-data" file for now. We'll make it save to a zip file next.


git-svn-id: http://google-refine.googlecode.com/svn/trunk@217 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-06 07:43:45 +00:00
David Huynh
b3ac945c33 Implemented single-cell editing.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@210 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-05 08:11:48 +00:00
David Huynh
40cdf5092b Better display of Calendar objects in data table view and in expression preview dialog.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@208 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-05 02:25:27 +00:00
David Huynh
9d8b746121 Switched Cell.value from Object to Serializable.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@201 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-04 19:59:31 +00:00
David Huynh
c1498448e4 Implemented global and per-project expression histories and hooked them up to the expression preview dialog.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@176 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-03 21:21:38 +00:00
David Huynh
b75f1faea8 Changed tabs to spaces. No functionality change.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@174 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-03 04:19:58 +00:00
Stefano Mazzocchi
2691ee50d7 adding OS-specific data paths
git-svn-id: http://google-refine.googlecode.com/svn/trunk@173 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-03 02:53:07 +00:00
David Huynh
512cd16381 Implemented recon by keys, guids, and ids.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@165 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-02 18:19:20 +00:00
David Huynh
99ae6109d8 Started work on key-based recon.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@164 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-02 03:31:58 +00:00
David Huynh
e57aae888b Hooked up the recon service at data.labs.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@163 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-02 00:33:32 +00:00
David Huynh
f16727c20c Refactored recon code on the server side to prepare for supporting other modes of recon.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@162 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-01 23:33:23 +00:00
David Huynh
983be19e14 Made EvalError serializable because errors can be cell values and need to be saved.
Turned is* functions into controls, since they have to be able to test errors, and only controls can do that, not functions.
Polished display of errors in cells and in expression preview dialog.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@155 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-01 00:21:13 +00:00
David Huynh
c914aa6c16 Introduced EvalError objects as possible values returned by expressions.
Extracted function and control name mappings to ControlFunctionRegistry.


git-svn-id: http://google-refine.googlecode.com/svn/trunk@148 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-27 05:48:33 +00:00
David Huynh
f5ff9044cf Track and display recon stats in column headers.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@146 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-26 23:33:16 +00:00
David Huynh
dce42400d4 Fixed bug introduced while trying to delay constructing the candidates arrays in Recon objects.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@130 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-23 17:47:34 +00:00
David Huynh
ec1604e815 Added support for starring rows.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@129 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-23 07:45:12 +00:00
David Huynh
94fbd97bc4 Added a few more expression functions.
Bind row index when filtering rows, so we can create facets based on row indices.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@125 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-22 23:51:44 +00:00
David Huynh
0f505c72c5 Delay constructing the candidates array in recon objects to save memory.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@124 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-22 23:27:16 +00:00
David Huynh
5e9be8c258 Support reusing newly created topics for cells with the same content.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@121 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-22 22:15:48 +00:00
David Huynh
e4b01cb36c Make similar cell judgments an abstract operation.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@120 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-22 20:25:45 +00:00