David Huynh
7cd5a47fbf
We haven't been using non-split row parser, so we need to fix the trimming problem in the tsv/csv importer instead.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1467 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-12 23:24:16 +00:00
David Huynh
2d276fa1e6
Non split row parser shouldn't trim lines because whitespaces are significant
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1465 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-12 22:45:30 +00:00
Iain Sproat
142591a090
Added a mention of the new JsonImporter to CHANGES.txt
...
Corrected the logger name in JsonImporter.java
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1455 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-08 07:58:59 +00:00
David Huynh
3ba8e63249
Register Json importer.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1426 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-04 18:53:41 +00:00
Iain Sproat
d977f42f51
Changed behaviour of the XmlImporter to make it more permissive, and allow arrays within mixed elements to be used as candidates for importing to Refine.
...
This change has also allowed the JsonImporter to pass all its unit tests without any further modification.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1425 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-04 18:33:59 +00:00
Iain Sproat
ec9898ba92
Some tidying up of the XmlImporter which reduces the number of generic TreeParser tokens to a minimum - and should allow elements such as comments and CDATA to be ignored/skipped.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1422 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-04 15:02:09 +00:00
Iain Sproat
d3f223c196
The JsonImporter now passes all current unit tests.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1421 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-04 10:02:50 +00:00
David Huynh
935355cb50
Comments in XML file caused the record detection code to fail. So we added ignorable element type that we can skip over.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1392 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 19:16:43 +00:00
Iain Sproat
bd3ded0828
Correcting JsonImporter to use the correct parser.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1388 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 14:19:19 +00:00
Iain Sproat
855df20481
XmlImportUtilities no longer relies on XMLStreamConstants, and is now independent of any specific type of tree data (Xml or otherwise).
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1378 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 10:46:33 +00:00
Iain Sproat
b21961be89
Another small step towards making XmlImportUtilities generic for all tree structured data, and less XML centric. Some calls to XMLStreamConstant in XmlImportUtilities are now working with a generic TreeParserToken, with methods to converter between TreeParserToken and XMLStreamConstant/JsonToken in the respective parsers.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1377 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 10:04:56 +00:00
David Huynh
f2ce1b7161
Fixed Issue 121: Importing attached file strips backslashes
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1369 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 03:35:42 +00:00
Iain Sproat
c3c23a87b0
The renaming of TreeImporter to TreeImportUtilities didn't seem to get committed last time. Trying again.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1362 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-27 22:57:26 +00:00
Iain Sproat
d285999da8
New JsonImporter, JsonParser and JsonImporterTests (copy of XmlImporterTests with syntax of the example data altered for Json).
...
Renaming of TreeImporter to TreeImportUtilities (as per the current convention with the XmlImporter and XmlImportUtilities).
NB the new JsonParser class does not work, and 5 of the new unit tests for JsonImporter currently fail. To be fixed in due course.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1361 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-27 22:53:17 +00:00
Iain Sproat
e5ddfa6fdc
All methods in XmlImportUtilities now use the TreeParser interface.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1323 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-27 17:59:53 +00:00
Iain Sproat
d71c563831
XmlImportUtilities.detectPathFromTag and XmlImportUtilities.detectRecordElement methods now use a generic TreeParser interface. A lightweight wrapper XmlParser wraps XMLStreamReader to provide parsing for xml data.
...
This is another small step towards a generic importer for tree structured data. My plan is to refactor more of XmlImportUtilities' methods to use the TreeParser interface so that XmlStreamReader is no longer called directly from XmlImportUtilities.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1322 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-27 17:40:51 +00:00
Iain Sproat
1bda46d40f
Methods which are generic to any tree structured data and don't rely on an XmlParser have been moved to a new TreeImporter class. This is a small step towards supporting importers for other tree structured data.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1321 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-27 16:09:44 +00:00
David Huynh
1367ce301e
More renaming, except for: client-side code, build scripts, anything to do with data loading and QA, workspace path. Refine can still run, and undo/redo on existing projects is working.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1290 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-22 18:36:33 +00:00
David Huynh
edb23eb263
Changed Java packages com.google.gridworks.* to com.google.refine.* and modified other code just enough to start grefine up without error. Much remains to be done. Do not check out the code just yet.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1288 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-22 17:04:10 +00:00