Commit Graph

1106 Commits

Author SHA1 Message Date
Tom Morris
7dcd0c073d Revert bad commit r1600
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1601 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-18 05:19:05 +00:00
Tom Morris
79c00bab36 Incomplete - task 157: Integrate Google Spreadsheet import/export plugin
http://code.google.com/p/google-refine/issues/detail?id=157

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1600 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-18 04:59:39 +00:00
David Huynh
e7184ec9ab Deleted old empty protograph dirs. Use a default assign version even if running from trunk; this is so that we have at least some clue about an imported project file.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1598 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-18 04:18:09 +00:00
David Huynh
c8dcc10ab8 Be sure to use UTF-8 when saving data.txt, pool.txt, and change files.
Fix issue 163: Refine doesn't retain the characters for flat or sharp.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1588 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-17 23:43:02 +00:00
David Huynh
a62638e88d For each recon group, try at least 3 times if the service keeps failing. Log errors more for debugging purposes.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1578 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-16 00:19:31 +00:00
Stefano Mazzocchi
f50880905e fixed warnings
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1577 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-16 00:01:42 +00:00
Tom Morris
47dd5f8da6 Make sure the stream/writer is flushed in case the exporter forgets to do it
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1569 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-15 17:10:37 +00:00
Tom Morris
bbebb4d2dc Add @Overrides so we get warned about API changes
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1565 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-15 13:26:25 +00:00
David Huynh
7e9df21b70 Exporters need to implement either WriterExporter or StreamExporter.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1558 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-15 06:18:20 +00:00
David Huynh
73042712ed Made csv/tsv importer not trim whitespace even if "guess cells' types" is checked (for cells that are strings).
Updated csv tests to expect un-trimmed cells.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1557 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-15 05:30:15 +00:00
David Huynh
9e35ea3775 Better error message for numeric range facet if there's no numeric value.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1551 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-15 01:00:51 +00:00
Tom Morris
083abd4329 Refactor exporter interface along same lines as importer
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1547 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-14 21:33:50 +00:00
David Huynh
4ccdbc8716 Fixed bug in which a newly created and unedited project would never get saved because it had the same modified time and last save time.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1530 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-14 01:43:26 +00:00
David Huynh
dc49047092 We have previously changed the standard-reconcile acre app to return mids, but we still need to make sure its metadata says that its identifier space is mid, not id. And we need Refine to test for the mid identifier space as well.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1479 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-13 18:33:27 +00:00
David Huynh
a16df8f2d6 For unrecoverable projects, rename them with a suffix so the next time we won't try to recover them again.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1472 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-13 07:05:34 +00:00
David Huynh
91ffe71d17 Lowering recon batch size from 7 to 3 to avoid timeout problem. This is a temporary fix only for
Issue 156: Reconcile is not picking up alias hints or even type hints correctly

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1470 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-13 05:03:49 +00:00
David Huynh
208152b55c Added .vt template for reporting errors with stacktraces.
Fixed Issue 155: Blank browser shown when non-GZIP format is detected during import

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1469 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-13 04:51:01 +00:00
David Huynh
7cd5a47fbf We haven't been using non-split row parser, so we need to fix the trimming problem in the tsv/csv importer instead.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1467 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-12 23:24:16 +00:00
David Huynh
2d276fa1e6 Non split row parser shouldn't trim lines because whitespaces are significant
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1465 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-12 22:45:30 +00:00
David Huynh
69c338c728 Text filter was throwing an exception if the column went away (which happened when the column got split).
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1464 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-11 23:15:13 +00:00
David Huynh
336a773069 Only try to create the workspace dir if it doesn't exist.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1463 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-11 23:04:06 +00:00
Tom Morris
c42c78dc0a Log errors if things don't go as expected
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1462 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-11 22:28:22 +00:00
Iain Sproat
142591a090 Added a mention of the new JsonImporter to CHANGES.txt
Corrected the logger name in JsonImporter.java

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1455 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-08 07:58:59 +00:00
David Huynh
ad0d227ab3 Remove remaining Freebase related functionalities.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1453 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-08 02:04:47 +00:00
David Huynh
6ddd945a80 The Freebase functionalities have been extracted out in the last commit. We're removing them from the core module now. This is not a complete checkin. SVN is having some trouble with some directories.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1452 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-08 01:54:00 +00:00
Tom Morris
5040b06d9f Make exceptions more specific for load errors. Still no error returned to user though (just hangs)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1450 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-07 14:20:28 +00:00
Tom Morris
ea28784e8b Don't save null project if load failed
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1449 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-07 14:19:42 +00:00
Stefano Mazzocchi
215165ed97 spell out tweezer parameters
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1444 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-07 08:20:46 +00:00
David Huynh
9ea477c80d Allowed a single operation class to be registered under several names, so that we can rename operations (to better names) while maintaining backward compatibility.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1443 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-07 05:42:01 +00:00
David Huynh
1de5e7c00e Renamed package gel to grel.
Replaced gel with grel in other places in the code base while maintaining backward compatibility.
Changed layout in expression preview dialog to accommodate long GREL name.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1442 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-07 05:19:35 +00:00
David Huynh
90d1111ebc Added "project" argument to OverlayModel methods, as suggested by Fadi Maali.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1439 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-06 20:47:11 +00:00
David Huynh
3ba8e63249 Register Json importer.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1426 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-04 18:53:41 +00:00
Iain Sproat
d977f42f51 Changed behaviour of the XmlImporter to make it more permissive, and allow arrays within mixed elements to be used as candidates for importing to Refine.
This change has also allowed the JsonImporter to pass all its unit tests without any further modification.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1425 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-04 18:33:59 +00:00
Iain Sproat
ec9898ba92 Some tidying up of the XmlImporter which reduces the number of generic TreeParser tokens to a minimum - and should allow elements such as comments and CDATA to be ignored/skipped.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1422 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-04 15:02:09 +00:00
Iain Sproat
d3f223c196 The JsonImporter now passes all current unit tests.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1421 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-04 10:02:50 +00:00
Stefano Mazzocchi
2b9b38368f use the new FreeQ 'refine' queue instead of the old 'gridworks' one
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1410 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-30 00:26:02 +00:00
Stefano Mazzocchi
b62e63306a - make the correct version + revision available also to the java side (thru web.xml)
- add @Override metadata to the commands that were missing it
- make the version information appear even when using trunk (Fixes Issue 136)


git-svn-id: http://google-refine.googlecode.com/svn/trunk@1406 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-29 01:50:57 +00:00
David Huynh
935355cb50 Comments in XML file caused the record detection code to fail. So we added ignorable element type that we can skip over.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1392 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 19:16:43 +00:00
Iain Sproat
bd3ded0828 Correcting JsonImporter to use the correct parser.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1388 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 14:19:19 +00:00
Iain Sproat
855df20481 XmlImportUtilities no longer relies on XMLStreamConstants, and is now independent of any specific type of tree data (Xml or otherwise).
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1378 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 10:46:33 +00:00
Iain Sproat
b21961be89 Another small step towards making XmlImportUtilities generic for all tree structured data, and less XML centric. Some calls to XMLStreamConstant in XmlImportUtilities are now working with a generic TreeParserToken, with methods to convert between TreeParserToken and XMLStreamConstant/JsonToken in the respective parsers.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1377 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 10:04:56 +00:00
David Huynh
740caedf46 Updated to version 2.0
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1376 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 06:03:07 +00:00
David Huynh
e587614c22 Fixed Issue 126: Large integers formatted in scientific notation in formulas
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1373 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 04:21:44 +00:00
Tom Morris
bc6f05f41b Issue 140 - Fix Open Workspace command for non-Mac platforms (requires Java 6)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1372 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 03:54:54 +00:00
David Huynh
194fb5e706 Fixed Issue 122: Exporting to Excel on attached project raises server exception
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1370 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 03:44:30 +00:00
David Huynh
f2ce1b7161 Fixed Issue 121: Importing attached file strips backslashes
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1369 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 03:35:42 +00:00
Stefano Mazzocchi
c976091624 new hooks to the Freebase Refinery
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1368 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 02:19:50 +00:00
David Huynh
823fe989a4 Fixed Issue 110: Import of single column text file with Postal Codes shows only 1 row with lots of � chars (?).
(by enforcing a confidence threshold on the encoding guessing)

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1367 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 00:26:53 +00:00
Stefano Mazzocchi
14d046bb7a silence velocity's logs
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1366 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 00:01:53 +00:00
Iain Sproat
c3c23a87b0 The renaming of TreeImporter to TreeImportUtilities didn't seem to get committed last time. Trying again.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1362 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-27 22:57:26 +00:00
Iain Sproat
d285999da8 New JsonImporter, JsonParser and JsonImporterTests (copy of XmlImporterTests with syntax of the example data altered for Json).
Renaming of TreeImporter to TreeImportUtilities (as per the current convention with the XmlImporter and XmlImportUtilities).

NB the new JsonParser class does not work, and 5 of the new unit tests for JsonImporter currently fail.  To be fixed in due course.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1361 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-27 22:53:17 +00:00
Stefano Mazzocchi
86f810a324 hardening the timeline facet
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1353 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-27 21:59:17 +00:00
Iain Sproat
e5ddfa6fdc All methods in XmlImportUtilities now use the TreeParser interface.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1323 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-27 17:59:53 +00:00
Iain Sproat
d71c563831 XmlImportUtilities.detectPathFromTag and XmlImportUtilities.detectRecordElement methods now use a generic TreeParser interface. A lightweight wrapper XmlParser wraps XMLStreamReader to provide parsing for xml data.
This is another small step towards a generic importer for tree structured data.  My plan is to refactor more of XmlImportUtilities' methods to use the TreeParser interface so that XmlStreamReader is no longer called directly from XmlImportUtilities.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1322 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-27 17:40:51 +00:00
Iain Sproat
1bda46d40f Methods which are generic to any tree structured data and don't rely on an XmlParser have been moved to a new TreeImporter class. This is a small step towards supporting importers for other tree structured data.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1321 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-27 16:09:44 +00:00
Stefano Mazzocchi
6273332cef the sandbox->freebase loading conduit is now named "refinery"
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1313 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-25 01:57:56 +00:00
David Huynh
a112ffa9ab Caught a stray rename miss. Added more generic support for renaming old Java classes so that extensions could remain backward-compatible, too
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1297 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-22 23:59:57 +00:00
David Huynh
1367ce301e More renaming, except for: client-side code, build scripts, anything to do with data loading and QA, workspace path. Refine can still run, and undo/redo on existing projects is working.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1290 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-22 18:36:33 +00:00
David Huynh
e6bc603a11 Renamed Java classes whose names contain 'Gridworks'. Refine is still able to start. But don't check out the code just yet.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1289 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-22 17:46:39 +00:00
David Huynh
edb23eb263 Changed Java packages com.google.gridworks.* to com.google.refine.* and modified other code just enough to start grefine up without error. Much remains to be done. Do not check out the code just yet.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1288 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-22 17:04:10 +00:00
David Huynh
362a277c58 Added main menu command to open system file explorer at the workspace directory.
Made project manager more careful at disposing projects, in case any of them is null.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1272 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-17 06:52:10 +00:00
David Huynh
2609c4049d Fixed issue 114: "Refactor project manager api to allow importers to create project metadata" by incorporating tfmorris' patch.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1271 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-17 01:00:23 +00:00
David Huynh
8d1f2d44b9 Patched the json lib to allow up to 100 levels of nesting.
Fixed ImportProjectCommand to redirect from the error page back to /index rather than /index.html.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1270 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-17 00:21:54 +00:00
Stefano Mazzocchi
eee4514643 fixing Issue-125
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1269 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-16 23:05:53 +00:00
Stefano Mazzocchi
df0a30e22d this is really a debug log
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1262 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-10 21:52:02 +00:00
David Huynh
9acd3dbe05 Fixed issue 127 - Add column from Freebase raises exception. Made sure DataExtensionChange saves properly.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1261 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-10 04:53:37 +00:00
Stefano Mazzocchi
e973fd3e89 d'oh, wrong object counter (thanks again to knut.forkalsrud for spotting my mistakes)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1250 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-31 23:25:16 +00:00
Stefano Mazzocchi
e5c6dda178 Fixed Issue-116
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1243 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-31 19:33:05 +00:00
Stefano Mazzocchi
7df259008b more whitespace
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1242 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-31 19:32:59 +00:00
Stefano Mazzocchi
cf66d00854 only whitespace (no functional changes)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1240 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-31 19:32:48 +00:00
Stefano Mazzocchi
860d6c4ee2 a little more solid (it's possible to have both Dates and Calendars in there)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1239 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-31 19:12:11 +00:00
Stefano Mazzocchi
3648883e0c ISSUE-99 thanks to knut.forkalsrud for providing the patch!
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1238 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-31 18:56:35 +00:00
Stefano Mazzocchi
5d788c9260 added timeline facet (like the numeric binning facet but working on dates instead of numbers and with date-specific binning logic)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1234 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-31 17:57:54 +00:00
David Huynh
bd7453adba Made sure to strip off charset from content-type when importing from URLs before looking up for the right importer.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1229 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-25 17:35:16 +00:00
David Huynh
367796488e Fixed xml importer: subgroups should now line up properly by rows.
Added command to reorder columns using drag and drop.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1227 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-25 06:17:08 +00:00
David Huynh
276fae8938 Save templating exporter's template.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1221 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-24 06:36:49 +00:00
David Huynh
baa4e0db8c Added command to browse to the data load page on the Gridworks QA dashboard.
Save the data load job name and fill it in the next time the Load into Freebase dialog is opened.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1220 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-24 06:06:07 +00:00
David Huynh
e4af19f8a6 Namespaced operations' names by their modules' names.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1215 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-24 04:02:36 +00:00
David Huynh
1f69fba43c Added command Add Column by Fetching URLs.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1203 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-22 23:55:07 +00:00
David Huynh
9041ebf7b9 Bumped version to 1.5.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1195 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-22 06:42:21 +00:00
David Huynh
c94abd0427 Commands are now registered in association with their modules, so as to avoid name collisions.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1193 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-22 06:16:13 +00:00
David Huynh
95e2e30c8a Added events to OverlayModel interface, so overlay models can react to saving events and to disposing event from the project.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1191 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-22 05:06:36 +00:00
David Huynh
4ea765b689 Factored out registries of importers and exporters.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1183 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-22 01:46:32 +00:00
David Huynh
99b8c4dc7a Setting rabj=true when uploading to freeq.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1170 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-19 21:13:00 +00:00
Stefano Mazzocchi
fcc54e2ab3 removing what turned out to be dead code
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1162 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-16 16:09:52 +00:00
Stefano Mazzocchi
bb7d3c388c ISSUE-115 datePart('month') should return January as 1 not 0
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1161 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-16 16:09:39 +00:00
David Huynh
a90a9c724e Forgot to register blank down operation in operation registry previously.
Added uniques GEL function for eliminating duplicates in an array.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1158 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-16 03:00:43 +00:00
David Huynh
fa816007a7 Fixed copy-and-paste string mistake in BlankDownOperation.
Fixed minor bug in Row.isValueBlank that returns true for non-string values.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1157 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-16 02:16:41 +00:00
David Huynh
e61655506a Added new command to import QA results, so any reconciliation action that yields conflicting or uncertain opinions among reviewers can be examined inside Gridworks.
Added new customized facets for checking QA results. 

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1156 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-13 16:26:33 +00:00
David Huynh
8f071ede31 Added command Transpose Cells in Rows into Columns (Issue 82).
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1147 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-08 06:30:30 +00:00
David Huynh
d1a66e2e63 Added JSON support in GEL.
Added GEL functions: escape, parseJson, hasField.
Fixed bug in preference store: expression history was still not loaded properly.
Integers are now rendered without decimals in the expression preview dialogs.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1145 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-07 22:57:48 +00:00
David Huynh
e70f16025b Fixed bug introduced recently by changing the preference key of the expression history from "expressions" to "scripting.expressions".
Added code in FileProjectManager to try to recover projects that are in the workspace dir but are not recorded in the workspace json file.


git-svn-id: http://google-refine.googlecode.com/svn/trunk@1144 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-07 20:25:31 +00:00
David Huynh
0500d7aa10 Added commands Move Column to Beginning, Move Column to End, Move Column Left, Move Column Right.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1142 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-07 01:24:48 +00:00
David Huynh
f0eae04c0c Forgot to add 2 files in the last commit
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1141 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-06 23:50:22 +00:00
David Huynh
a8ee9b9e08 Added Fill Down and Blank Down commands.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1140 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-06 20:33:28 +00:00
David Huynh
3bda9d035d Added support for creating a project by pointing to a data file URL.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1139 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-06 06:15:05 +00:00
David Huynh
f411dc9104 - Issue 112: Refactor Importer API (patch from tfmorris)
- Added support for storing custom metadata in ProjectMetadata.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1138 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-06 05:04:25 +00:00
David Huynh
00c6865d95 - Select All and Unselect All buttons in History Extract dialog
- Schema skeleton: support for multiple cells per cell-as nodes, and for conditional links


git-svn-id: http://google-refine.googlecode.com/svn/trunk@1137 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-05 20:27:39 +00:00
David Huynh
5cb3f924f6 Added support in protograph for specifying several column names per cell-as nodes.
Started to add support for conditional links in protograph. The UI is not hooked up yet.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1136 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-05 08:29:34 +00:00
David Huynh
b8ad56c6db Made sure in the schema skeleton dialog, in the dialog box for a node, in the "cell-as-topic" section, the type is always recorded.
In the triple loader transposed node factory, use the column's recon config to generate new topics' type.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1135 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-05 00:53:08 +00:00
David Huynh
dcc3ac8534 Renamed packages com.metaweb.* to com.google.*.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1130 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-03 23:01:18 +00:00
Stefano Mazzocchi
8c56b437fa more fixes
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1129 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-03 20:19:48 +00:00
David Huynh
762a9f13eb Text facet's choice count limit is now configurable through preference page. Preference page needs polishing.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1127 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-02 01:49:10 +00:00
David Huynh
965ef20790 Made sure commands that create new columns check for duplicate column names.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1126 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-01 04:44:21 +00:00
David Huynh
4ad31ffcde Excel importer now supports "header lines" parameter.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1125 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-01 04:22:45 +00:00
David Huynh
7bb6674e5b Fixed recently introduced bug: expressions were not logged because preference stores were not initialized properly.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1124 7d457c2a-affb-35e4-300a-418c747d4874
2010-07-31 07:01:44 +00:00
David Huynh
f069780bfa Added support for bundling .js files to shave off some loading time.
For GetRowsCommand, tried to use jsonp but that didn't seem to improve performance much.
Gzip http responses of various text-based mime types.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1122 7d457c2a-affb-35e4-300a-418c747d4874
2010-07-31 06:51:11 +00:00
David Huynh
d71d84194f Register new operation Transpose Cells in Columns into Rows.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1112 7d457c2a-affb-35e4-300a-418c747d4874
2010-07-30 02:28:33 +00:00
David Huynh
ee14955605 Added new command Transpose Cells in Columns into Rows.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1111 7d457c2a-affb-35e4-300a-418c747d4874
2010-07-30 02:25:58 +00:00
David Huynh
a192674118 - added smartSplit GEL function that can handle quoted values
- added max width to operation extract dialog
- made GEL get and slice functions handle HasFieldsList
- fixed versioned standard-reconcile URLs (they need userid.user.dev)

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1110 7d457c2a-affb-35e4-300a-418c747d4874
2010-07-26 22:50:00 +00:00
David Huynh
2ff0184c65 - switched to accessing versioned standard-reconcile app
- standardized preference keys to using dot separated format
- added support to override freeq url from workspace preferences
- added GEL controls: forEachIndex, forRange, filter
- enforced max-width on preview table columns in expression preview dialog
- added preservedAllTokens option to split GEL function

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1109 7d457c2a-affb-35e4-300a-418c747d4874
2010-07-26 21:12:40 +00:00
David Huynh
4522b98f32 Store and use job ID to retrieve MDO ID and send that in subsequent loads.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1100 7d457c2a-affb-35e4-300a-418c747d4874
2010-07-16 17:32:06 +00:00
David Huynh
4373e7276f Pass target Freebase type IDs in recon objects to freeq.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1099 7d457c2a-affb-35e4-300a-418c747d4874
2010-07-15 21:45:17 +00:00
David Huynh
b854f99ef5 Removed extra closing brace.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1096 7d457c2a-affb-35e4-300a-418c747d4874
2010-07-11 05:54:06 +00:00
David Huynh
43dadf40da Added ignore:true to any triple that shouldn't be loaded.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1095 7d457c2a-affb-35e4-300a-418c747d4874
2010-07-10 05:07:24 +00:00
David Huynh
513283d4d1 Support creation of cache directories, so the rdf importer can store its lucene indexes.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1090 7d457c2a-affb-35e4-300a-418c747d4874
2010-07-08 23:22:29 +00:00
David Huynh
f5fc44e24e Refactoring to expose extension points that the rdf-exporter extension will plug into.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1074 7d457c2a-affb-35e4-300a-418c747d4874
2010-07-06 00:14:07 +00:00
David Huynh
ab82562016 Tripleloader protograph transposer now generates more context information for QA.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1073 7d457c2a-affb-35e4-300a-418c747d4874
2010-07-03 01:39:14 +00:00
David Huynh
217fb7b25c Fixed Issue 66: Records not excluded with inverted text facet.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1064 7d457c2a-affb-35e4-300a-418c747d4874
2010-07-01 20:26:54 +00:00
Stefano Mazzocchi
a682d6b36f fixing eclipse warnings
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1063 7d457c2a-affb-35e4-300a-418c747d4874
2010-07-01 08:03:42 +00:00
Stefano Mazzocchi
9fbff0640b make sure that splitting values maintains empty cells if the separator is repeated
(this is useful in case the cells contains a rigid structure across multiple columns)

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1062 7d457c2a-affb-35e4-300a-418c747d4874
2010-07-01 07:47:57 +00:00
Stefano Mazzocchi
2302d017d8 remove eclipse warnings
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1061 7d457c2a-affb-35e4-300a-418c747d4874
2010-07-01 07:47:52 +00:00
David Huynh
18b720b913 Fixed CSV and TSV export bug.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1059 7d457c2a-affb-35e4-300a-418c747d4874
2010-07-01 02:32:03 +00:00
David Huynh
2e3984d54a When transposing data to triple loader output, pass row indices and cell indices deep down so later we can generate more context information for recon.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1051 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-29 17:30:16 +00:00
David Huynh
0e4781cb58 Forgot a console.log() call.
Allow reconciling against no particular type.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1043 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-27 04:20:35 +00:00
David Huynh
76c8cd77eb "search for match" links in data table cells now use recon service's entity suggest options.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1041 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-27 03:31:56 +00:00
David Huynh
ecfb893e98 More work on the recon UI. Standard services can now be added.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1038 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-26 01:10:23 +00:00
David Huynh
1342ceacea Careful not to load all projects in an autosave cycle.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1037 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-25 22:43:30 +00:00
David Huynh
058e86b4c8 First pass in trying to generalize standard reconciliation service UI. A lot of pieces are still Freebase-centric.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1032 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-24 20:13:51 +00:00
Iain Sproat
f0ed50e468 issue 69 fixed. ControlFunctionRegistry now correctly registers Chomp expression as "chomp" key.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1024 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-23 17:53:29 +00:00
David Huynh
a9f77d0f51 Minor bug.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1020 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-22 21:29:15 +00:00
Iain Sproat
0d7b3b0e9c ProjectManager is now partially unit tested.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1015 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-21 19:57:31 +00:00
Iain Sproat
dcf6919900 Functionality which didn't need to be moved to FileProjectManager as it wasn't file system specific has been moved back to ProjectManager. importProject function is now named loadProjectMetadata to avoid confusion.
Some additional source code documentation added to ProjectManager, and methods rearranged in more readable fashion.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1011 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-20 16:32:49 +00:00
Iain Sproat
7ced0cb31e New feature for importing text files (CSV and TSV). By selecting the checkbox in index.html it allows the effects of quotation marks around data values to be ignored.
Unit test added for this.

This has required a further branch to opencsv - patch sent to opencsv project and can be tracked at https://sourceforge.net/tracker/?func=detail&aid=3018599&group_id=148905&atid=773543

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1010 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-20 14:47:45 +00:00
Iain Sproat
0af7e5fcf5 More functionality which didn't need to be moved to FileProjectManager, as it wasn't file system specific, has been moved back to ProjectManager.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@992 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-18 20:47:15 +00:00
Iain Sproat
c72b4571a5 Functionality which didn't need to be moved to FileProjectManager as it wasn't file system specific has been moved back to ProjectManager.
Some additional source code documentation added.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@991 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-18 19:16:30 +00:00
Iain Sproat
846cf1d57e Fixed bug in CsvExporter, all unit tests for CsvExporter and TsvExporter now working.
History now has the beginnings of a unit test.

Additional source documentation on public methods in ProjectManager and History.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@989 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-17 15:37:28 +00:00
David Huynh
e7d0fc5ed6 Implemented a generic preference store for both the whole workspace and each project.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@988 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-17 02:15:26 +00:00
Iain Sproat
18e319bb76 Moved call to FileHistoryEntryManager from ProjectManager to FileProjectManager.
Added interface HistoryEntryManager, which seems to have been forgotten from last commit.
FileHistoryEntry is now named FileHistoryEntryManager.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@983 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-16 14:17:17 +00:00
Iain Sproat
f92fc2d056 Internal refactor for IO - HistoryEntry is now a concrete class, so can be instantiated (reverting Operations classes back to r972 which were changed as a result of HistoryEntry being abstract).
HistoryEntry now deals with backend (filesystem etc.) through classes which implement HistoryEntryManager.  This HistoryEntryManager is held by ProjectManager, which allows for FileProjectManager to create FileHistoryEntryManager as appropriate.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@982 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-16 12:35:37 +00:00
Iain Sproat
280daad2f6 Refactored ImportProjectCommand and ExportProjectCommand. These are no longer dependent on the File System, and all file system related work is done in FileProjectManager.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@981 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-16 07:44:46 +00:00
Iain Sproat
f47cb75525 Fixed ImportProjectCommand so it no longer contains references to project.html, a file previously removed from the project.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@980 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-16 07:06:52 +00:00
Iain Sproat
17f1dc2e6f The file system coupled method getProjectDirectory is now removed from ProjectManager.
Methods of HistoryEntry which directly work with the File System have been moved to FileHistoryEntry in the io directory, and HistoryEntry made abstract.

As the abstract HistoryEntry cannot be instantiated directly, the ProjectManager is now responsible for creating new HistoryEntry.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@973 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-15 22:11:35 +00:00
Iain Sproat
b07075bed5 FileProjectManager and portions of Project and ProjectMetadata classes which deal with io are moved to an io directory.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@972 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-15 20:55:38 +00:00
Iain Sproat
c94957b6a0 CreateProjectCommand no longer contains references to project.html, a file previously removed from the project.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@971 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-15 20:28:18 +00:00
Iain Sproat
dc7060d390 portion of ProjectManager which interacts with File System has been moved to FileProjectManager, which extends ProjectManager. ProjectManager is now abstract.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@970 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-15 19:34:40 +00:00
Iain Sproat
a671551289 Two more XmlImport tests now work. Some documentation stubs were added to XmlImporterUtilities.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@967 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-15 16:22:53 +00:00
David Huynh
f7fe44dccc Converted project.html to project.vt and added a client side resource manager, where extensions can register scripts and styles to be included in .vt files
git-svn-id: http://google-refine.googlecode.com/svn/trunk@965 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-15 00:35:23 +00:00
David Huynh
b0389d8c6a Jython integration has been moved out to an extension.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@964 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-14 22:11:34 +00:00
Stefano Mazzocchi
af48cb799e moving Gridworks to use the Butterfly webapp framework (this will allow us to make gw more extensible without excessive complexity... as a bonus we gain server side javascript support which might end up being useful)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@940 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-05 00:50:18 +00:00
Stefano Mazzocchi
0648e8725e adding regexp group capturing GEL function
git-svn-id: http://google-refine.googlecode.com/svn/trunk@932 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-01 08:54:17 +00:00
Stefano Mazzocchi
5e0acf28d0 forgot to add the ngram class itself
git-svn-id: http://google-refine.googlecode.com/svn/trunk@931 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-01 08:04:40 +00:00
Stefano Mazzocchi
b3173211e3 adding an ngram function to GEL
git-svn-id: http://google-refine.googlecode.com/svn/trunk@930 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-01 08:02:28 +00:00
Stefano Mazzocchi
3b7f132430 fixing jython initialization logic
git-svn-id: http://google-refine.googlecode.com/svn/trunk@924 7d457c2a-affb-35e4-300a-418c747d4874
2010-05-31 09:00:42 +00:00
Stefano Mazzocchi
e3fc7ab603 bringing the refactor branch up to speed with trunk
(everything works like in trunk for now, although some tests still fail)

git-svn-id: http://google-refine.googlecode.com/svn/branches/split-refactor@915 7d457c2a-affb-35e4-300a-418c747d4874
2010-05-30 18:18:59 +00:00
Stefano Mazzocchi
aa4de48f95 some renaming, moving tests into main
git-svn-id: http://google-refine.googlecode.com/svn/branches/split-refactor@906 7d457c2a-affb-35e4-300a-418c747d4874
2010-05-30 16:55:53 +00:00