Iain Sproat
ec9898ba92
Some tidying up of the XmlImporter which reduces the number of generic TreeParser tokens to a minimum - and should allow elements such as comments and CDATA to be ignored/skipped.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1422 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-04 15:02:09 +00:00
Iain Sproat
d3f223c196
The JsonImporter now passes all current unit tests.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1421 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-04 10:02:50 +00:00
Stefano Mazzocchi
2b9b38368f
use the new FreeQ 'refine' queue instead of the old 'gridworks' one
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1410 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-30 00:26:02 +00:00
Stefano Mazzocchi
b62e63306a
- make the correct version + revision available also to the java side (thru web.xml)
- add @Override metadata to the commands that were missing it
- make the version information appear even when using trunk (Fixes Issue 136)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1406 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-29 01:50:57 +00:00
David Huynh
935355cb50
Comments in XML file caused the record detection code to fail. So we added ignorable element type that we can skip over.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1392 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 19:16:43 +00:00
Iain Sproat
bd3ded0828
Correcting JsonImporter to use the correct parser.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1388 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 14:19:19 +00:00
Iain Sproat
855df20481
XmlImportUtilities no longer relies on XMLStreamConstants, and is now independent of any specific type of tree data (Xml or otherwise).
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1378 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 10:46:33 +00:00
Iain Sproat
b21961be89
Another small step towards making XmlImportUtilities generic for all tree structured data, and less XML centric. Some calls to XMLStreamConstants in XmlImportUtilities are now working with a generic TreeParserToken, with methods to convert between TreeParserToken and XMLStreamConstants/JsonToken in the respective parsers.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1377 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 10:04:56 +00:00
David Huynh
740caedf46
Updated to version 2.0
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1376 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 06:03:07 +00:00
David Huynh
e587614c22
Fixed Issue 126: Large integers formatted in scientific notation in formulas
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1373 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 04:21:44 +00:00
Tom Morris
bc6f05f41b
Issue 140 - Fix Open Workspace command for non-Mac platforms (requires Java 6)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1372 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 03:54:54 +00:00
David Huynh
194fb5e706
Fixed Issue 122: Exporting to Excel on attached project raises server exception
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1370 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 03:44:30 +00:00
David Huynh
f2ce1b7161
Fixed Issue 121: Importing attached file strips backslashes
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1369 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 03:35:42 +00:00
Stefano Mazzocchi
c976091624
new hooks to the Freebase Refinery
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1368 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 02:19:50 +00:00
David Huynh
823fe989a4
Fixed Issue 110: Import of single column text file with Postal Codes shows only 1 row with lots of � chars (?).
(by enforcing a confidence threshold on the encoding guessing)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1367 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 00:26:53 +00:00
Iain Sproat
c3c23a87b0
The renaming of TreeImporter to TreeImportUtilities didn't seem to get committed last time. Trying again.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1362 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-27 22:57:26 +00:00
Iain Sproat
d285999da8
New JsonImporter, JsonParser and JsonImporterTests (copy of XmlImporterTests with syntax of the example data altered for Json).
Renaming of TreeImporter to TreeImportUtilities (as per the current convention with the XmlImporter and XmlImportUtilities).
NB the new JsonParser class does not work, and 5 of the new unit tests for JsonImporter currently fail. To be fixed in due course.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1361 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-27 22:53:17 +00:00
Stefano Mazzocchi
86f810a324
hardening the timeline facet
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1353 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-27 21:59:17 +00:00
Iain Sproat
e5ddfa6fdc
All methods in XmlImportUtilities now use the TreeParser interface.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1323 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-27 17:59:53 +00:00
Iain Sproat
d71c563831
XmlImportUtilities.detectPathFromTag and XmlImportUtilities.detectRecordElement methods now use a generic TreeParser interface. A lightweight wrapper XmlParser wraps XMLStreamReader to provide parsing for xml data.
This is another small step towards a generic importer for tree structured data. My plan is to refactor more of XmlImportUtilities' methods to use the TreeParser interface so that XMLStreamReader is no longer called directly from XmlImportUtilities.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1322 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-27 17:40:51 +00:00
Iain Sproat
1bda46d40f
Methods which are generic to any tree structured data and don't rely on an XmlParser have been moved to a new TreeImporter class. This is a small step towards supporting importers for other tree structured data.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1321 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-27 16:09:44 +00:00
Stefano Mazzocchi
6273332cef
the sandbox->freebase loading conduit is now named "refinery"
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1313 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-25 01:57:56 +00:00
David Huynh
a112ffa9ab
Caught a stray rename miss. Added more generic support for renaming old Java classes so that extensions could remain backward-compatible, too
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1297 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-22 23:59:57 +00:00
David Huynh
1367ce301e
More renaming, except for: client-side code, build scripts, anything to do with data loading and QA, workspace path. Refine can still run, and undo/redo on existing projects is working.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1290 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-22 18:36:33 +00:00
David Huynh
e6bc603a11
Renamed Java classes whose names contain 'Gridworks'. Refine is still able to start. But don't check out the code just yet.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1289 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-22 17:46:39 +00:00
David Huynh
edb23eb263
Changed Java packages com.google.gridworks.* to com.google.refine.* and modified other code just enough to start grefine up without error. Much remains to be done. Do not check out the code just yet.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1288 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-22 17:04:10 +00:00
David Huynh
362a277c58
Added main menu command to open system file explorer at the workspace directory.
Made project manager more careful at disposing projects, in case any of them is null.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1272 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-17 06:52:10 +00:00
David Huynh
2609c4049d
Fixed issue 114: "Refactor project manager api to allow importers to create project metadata" by incorporating tfmorris' patch.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1271 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-17 01:00:23 +00:00
David Huynh
8d1f2d44b9
Patched the json lib to allow up to 100 levels of nesting.
Fixed ImportProjectCommand to redirect from the error page back to /index rather than /index.html.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1270 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-17 00:21:54 +00:00
Stefano Mazzocchi
eee4514643
fixing Issue-125
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1269 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-16 23:05:53 +00:00
Stefano Mazzocchi
df0a30e22d
this is really a debug log
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1262 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-10 21:52:02 +00:00
David Huynh
9acd3dbe05
Fixed issue 127 - Add column from Freebase raises exception. Made sure DataExtensionChange saves properly.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1261 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-10 04:53:37 +00:00
Stefano Mazzocchi
e973fd3e89
d'oh, wrong object counter (thanks again to knut.forkalsrud for spotting my mistakes)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1250 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-31 23:25:16 +00:00
Stefano Mazzocchi
e5c6dda178
Fixed Issue-116
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1243 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-31 19:33:05 +00:00
Stefano Mazzocchi
7df259008b
more whitespace
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1242 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-31 19:32:59 +00:00
Stefano Mazzocchi
cf66d00854
only whitespace (no functional changes)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1240 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-31 19:32:48 +00:00
Stefano Mazzocchi
860d6c4ee2
a little more solid (it's possible to have both Dates and Calendars in there)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1239 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-31 19:12:11 +00:00
Stefano Mazzocchi
3648883e0c
ISSUE-99 thanks to knut.forkalsrud for providing the patch!
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1238 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-31 18:56:35 +00:00
Stefano Mazzocchi
5d788c9260
added timeline facet (like the numeric binning facet but working on dates instead of numbers and with date-specific binning logic)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1234 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-31 17:57:54 +00:00
David Huynh
bd7453adba
Made sure to strip off the charset from the content-type when importing from URLs before looking up the right importer.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1229 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-25 17:35:16 +00:00
David Huynh
367796488e
Fixed xml importer: subgroups should now line up properly by rows.
Added command to reorder columns using drag and drop.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1227 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-25 06:17:08 +00:00
David Huynh
276fae8938
Save templating exporter's template.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1221 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-24 06:36:49 +00:00
David Huynh
baa4e0db8c
Added command to browse to the data load page on the Gridworks QA dashboard.
Save the data load job name and fill it in the next time the Load into Freebase dialog is opened.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1220 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-24 06:06:07 +00:00
David Huynh
e4af19f8a6
Namespaced operations' names by their modules' names.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1215 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-24 04:02:36 +00:00
David Huynh
1f69fba43c
Added command Add Column by Fetching URLs.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1203 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-22 23:55:07 +00:00
David Huynh
9041ebf7b9
Bumped version to 1.5.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1195 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-22 06:42:21 +00:00
David Huynh
c94abd0427
Commands are now registered in association with their modules, so as to avoid name collisions.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1193 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-22 06:16:13 +00:00
David Huynh
95e2e30c8a
Added events to the OverlayModel interface, so overlay models can react to save events and to dispose events from the project.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1191 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-22 05:06:36 +00:00
David Huynh
4ea765b689
Factored out registries of importers and exporters.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1183 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-22 01:46:32 +00:00
David Huynh
99b8c4dc7a
Setting rabj=true when uploading to freeq.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1170 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-19 21:13:00 +00:00
Stefano Mazzocchi
fcc54e2ab3
removing what turned out to be dead code
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1162 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-16 16:09:52 +00:00
Stefano Mazzocchi
bb7d3c388c
ISSUE-115 datePart('month') should return January as 1 not 0
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1161 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-16 16:09:39 +00:00
David Huynh
a90a9c724e
Forgot to register blank down operation in operation registry previously.
Added uniques GEL function for eliminating duplicates in an array.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1158 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-16 03:00:43 +00:00
David Huynh
fa816007a7
Fixed copy-and-paste string mistake in BlankDownOperation.
Fixed minor bug in Row.isValueBlank that made it return true for non-string values.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1157 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-16 02:16:41 +00:00
David Huynh
e61655506a
Added new command to import QA results, so any reconciliation action that yields conflicting or uncertain opinions among reviewers can be examined inside Gridworks.
Added new customized facets for checking QA results.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1156 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-13 16:26:33 +00:00
David Huynh
8f071ede31
Added command Transpose Cells in Rows into Columns (Issue 82).
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1147 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-08 06:30:30 +00:00
David Huynh
d1a66e2e63
Added JSON support in GEL.
Added GEL functions: escape, parseJson, hasField.
Fixed bug in preference store: expression history was still not loaded properly.
Integers are now rendered without decimals in the expression preview dialogs.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1145 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-07 22:57:48 +00:00
David Huynh
e70f16025b
Fixed bug introduced recently by changing the preference key of the expression history from "expressions" to "scripting.expressions".
Added code in FileProjectManager for trying to recover projects that are in the workspace dir but are not recorded in the workspace json file.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1144 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-07 20:25:31 +00:00
David Huynh
0500d7aa10
Added commands Move Column to Beginning, Move Column to End, Move Column Left, Move Column Right.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1142 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-07 01:24:48 +00:00
David Huynh
f0eae04c0c
Forgot to add 2 files in the last commit
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1141 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-06 23:50:22 +00:00
David Huynh
a8ee9b9e08
Added Fill Down and Blank Down commands.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1140 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-06 20:33:28 +00:00
David Huynh
3bda9d035d
Added support for creating a project by pointing to a data file URL.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1139 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-06 06:15:05 +00:00
David Huynh
f411dc9104
- Issue 112: Refactor Importer API (patch from tfmorris)
- Added support for storing custom metadata in ProjectMetadata.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1138 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-06 05:04:25 +00:00
David Huynh
00c6865d95
- Select All and Unselect All buttons in History Extract dialog
- Schema skeleton: support for multiple cells per cell-as nodes, and for conditional links
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1137 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-05 20:27:39 +00:00
David Huynh
5cb3f924f6
Added support in protograph for specifying several column names per cell-as nodes.
Started to add support for conditional links in protograph. The UI is not hooked up yet.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1136 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-05 08:29:34 +00:00
David Huynh
b8ad56c6db
Made sure that the type is always recorded in the "cell-as-topic" section of a node's dialog box in the schema skeleton dialog.
In the triple loader transposed node factory, use the column's recon config to generate new topics' type.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1135 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-05 00:53:08 +00:00
David Huynh
dcc3ac8534
Renamed packages com.metaweb.* to com.google.*.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1130 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-03 23:01:18 +00:00