96 Commits

Author SHA1 Message Date
David Huynh
254853b51d Added reverse and sort functions.
Support a limit on how many rows to load into a new project.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@134 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-23 23:22:02 +00:00
David Huynh
4bdb2320b7 Styled help tab of expression preview dialog. Added variables section.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@131 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-23 19:06:58 +00:00
David Huynh
dce42400d4 Fixed bug introduced while trying to delay constructing the candidates arrays in Recon objects.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@130 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-23 17:47:34 +00:00
David Huynh
ec1604e815 Added support for starring rows.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@129 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-23 07:45:12 +00:00
David Huynh
8992531d02 Documented functions and controls in expression language.
Better error checking in operator calls.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@127 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-23 01:40:06 +00:00
David Huynh
607fca04cb Added a few more math functions.
Fixed expression preview dialog to use tabs.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@126 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-23 00:33:39 +00:00
David Huynh
94fbd97bc4 Added a few more expression functions.
Bind row index when filtering rows, so we can create facets based on row indices.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@125 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-22 23:51:44 +00:00
David Huynh
0f505c72c5 Delay constructing the candidates array in recon objects to save memory.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@124 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-22 23:27:16 +00:00
David Huynh
c45e0edc10 Lower recon batch size back to 10.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@123 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-22 23:06:34 +00:00
David Huynh
4ed7b45e41 Don't use schema restriction for protograph link suggest because it's not a "soft" restriction (so if the user wants a property that doesn't belong to the type, there is no way to get it).
More expression functions and controls.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@122 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-22 23:04:46 +00:00
David Huynh
5e9be8c258 Support reusing newly created topics for cells with the same content.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@121 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-22 22:15:48 +00:00
David Huynh
e4b01cb36c Make similar cell judgments an abstract operation.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@120 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-22 20:25:45 +00:00
David Huynh
c98a8ad552 Pulled the operations package up one level.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@119 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-22 18:42:25 +00:00
David Huynh
c50de52883 Improved the "extract operations" dialog to let user select which operations to extract. Also show history entries that cannot be abstracted.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@118 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-22 06:31:09 +00:00
David Huynh
1227c9dff4 Centralized mapping between operation names and their reconstructors.
Implemented comparison operators.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@117 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-22 04:23:25 +00:00
David Huynh
934c0f81c3 Forgot to check in image files.
Added commands for judging similar cells.
Started to fix/unify terminologies for recon operations.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@116 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-22 01:28:13 +00:00
David Huynh
b3167a1a9f Added option for automatically approving best recon candidates that match the expected type and score at least some minimum score.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@115 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-21 08:11:07 +00:00
David Huynh
b4935f576c Tweaked recon type guessing heuristic: remove "role" and "annotation" types, and rank types based on result orders rather than relevance scores.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@114 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-21 07:32:16 +00:00
David Huynh
b730dfd8f9 Added commands for searching for specific topics to match cells with.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@113 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-20 00:47:08 +00:00
David Huynh
ea2c904704 Use the schema index to suggest properties in the schema alignment dialog.
Fixed minor bug in triple loader transposer that wrote a bad triple for each literal cell value.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@112 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-19 22:56:29 +00:00
David Huynh
846e540ff6 Keep track of type names of reconciled columns so we can display them later in the schema alignment dialog.
Automatically create properties linking to all columns when starting with an empty protograph.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@110 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-19 18:32:48 +00:00
David Huynh
6c7557eeff Minor bug fixes.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@108 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-19 17:47:04 +00:00
David Huynh
bc412b99ea Fixed bug in triple loader transposer: properties didn't get asserted before.
Made triple loader transposer index its output variables.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@106 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-19 07:05:09 +00:00
David Huynh
5264c829ae A bit more careful error handling during recon.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@105 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-19 01:17:35 +00:00
David Huynh
28a86dfe0f Automatically guess types to reconcile a column, using Stefano's trick in his "cupid" acre app.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@104 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-19 00:28:34 +00:00
David Huynh
4b2e48614b Actual work in operations must be delayed until their changes are applied.
Column addition change must track the new cell index that it allocates when it is first applied.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@103 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-18 23:27:40 +00:00
David Huynh
8831703a2c Implemented "apply operations" feature.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@102 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-18 05:00:56 +00:00
David Huynh
604dd53ebd Engine configs were not deserialized properly when abstract operations are retrieved.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@101 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-17 01:40:41 +00:00
David Huynh
8c41af9c12 Allow operations to be extracted in abstract forms.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@99 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-17 01:30:09 +00:00
David Huynh
32157ce76b Changed operations to record column names instead of cell indices.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@98 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-17 00:26:38 +00:00
David Huynh
e6a98f23bd Implemented triple loader preview.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@97 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-16 23:32:12 +00:00
David Huynh
aa530395d2 Use tabs in the schema alignment dialog to get more space.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@96 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-16 20:45:04 +00:00
David Huynh
5de0c36f86 Protograph preview now works.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@95 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-16 20:15:19 +00:00
David Huynh
8189ba74fd Schema alignment dialog now saves protograph and re-renders it properly.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@92 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-12 22:11:49 +00:00
David Huynh
425140261f We're starting to be able to save protographs.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@91 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-12 20:29:56 +00:00
David Huynh
634d666949 More work on the schema alignment node dialog.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@90 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-12 02:23:26 +00:00
David Huynh
f5942773ec Still more work on the protograph, toward being able to build and save a protograph, but it's not working yet.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@89 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-11 06:44:48 +00:00
David Huynh
d227db0cc6 Eliminate hash maps from recon objects--they are expensive.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@88 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-10 19:25:21 +00:00
David Huynh
5cd147ea3c Compute record indices and render them instead of row indices.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@87 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-10 18:54:53 +00:00
David Huynh
242e23c085 The schema alignment dialog is starting to work. The protograph gets rendered and is interactive. No saving yet.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@83 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-10 01:15:25 +00:00
David Huynh
97e2e0eddc Implemented "judge one cell" command for making recon judgment per cell.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@80 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-09 20:04:43 +00:00
David Huynh
f8a1daba62 Handle formula cells in Excel files.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@77 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-09 01:13:11 +00:00
David Huynh
b7cf18b86a Save a change right after it gets applied rather than when it gets created. This is because when a change gets applied, it might grab onto the old data in order to be able to revert later, and we need to save that old data together with the change.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@74 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-08 23:57:37 +00:00
David Huynh
cd376c7532 Added support for Excel 2007 XML file format.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@73 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-08 23:44:33 +00:00
Stefano Mazzocchi
1f5b27653e POI deprecated the use of short, good thing
git-svn-id: http://google-refine.googlecode.com/svn/trunk@69 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-08 19:54:09 +00:00
Stefano Mazzocchi
1343162a75 major rewrite of the foundation:
 - de-mavenization (uses the same code that Acre uses to drive jetty directly)
 - removed all dependencies on external javascript code (jquery and suggest) by making a local copy (this makes gridworks totally self-serving, meaning that you can use it even if you don't have any internet connectivity)
 - fixed a NPE when the servlet is shut down before any project is loaded
 - found a way to spawn a browser directly from the java code (untested in windows)
 - added two ant tasks to generate windows and macosx stand-alone binaries (unused just yet)

To run, just type "./gridworks run" at the command line


git-svn-id: http://google-refine.googlecode.com/svn/trunk@65 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-07 23:15:50 +00:00
David Huynh
8f186a5f10 Added a help panel to the expression preview dialog. It gets populated by function and control names for now; more info will come later.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@64 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-07 09:02:22 +00:00
David Huynh
d3f97fea93 While importing data, use null for cells with empty text.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@63 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-07 07:16:39 +00:00
Stefano Mazzocchi
a61f35079a make eclipse happier by removing @Override annotations when really it's an interface method implementation
(no functional changes)


git-svn-id: http://google-refine.googlecode.com/svn/trunk@62 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-07 06:47:52 +00:00
David Huynh
a025b272bd String.isEmpty() is no longer there (?!).
git-svn-id: http://google-refine.googlecode.com/svn/trunk@61 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-07 06:16:46 +00:00