258 Commits

Author SHA1 Message Date
David Huynh
cecfa244e0 Changed to UTF-8 encoding
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2039 7d457c2a-affb-35e4-300a-418c747d4874
2011-04-06 21:09:21 +00:00
Stefano Mazzocchi
610de0d33a adding Metaphone3 algorithm
Many thanks to Lawrence Philips for donating the code to us under the BSD license.


git-svn-id: http://google-refine.googlecode.com/svn/trunk@2029 7d457c2a-affb-35e4-300a-418c747d4874
2011-03-01 00:17:48 +00:00
Stefano Mazzocchi
87e7f9a7a4 remove unused variable
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2028 7d457c2a-affb-35e4-300a-418c747d4874
2011-02-25 22:51:58 +00:00
Tom Morris
c5312a2e6a Issue 338 - patch from Thad Guidry to provide function which calls JSoup ownText() method
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2025 7d457c2a-affb-35e4-300a-418c747d4874
2011-02-23 19:40:35 +00:00
Tom Morris
5b9362e956 Issue 334 - Make sure URLs are encoded before using them.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2007 7d457c2a-affb-35e4-300a-418c747d4874
2011-02-15 23:15:09 +00:00
Tom Morris
06e2487189 Issue 276 - patch from pxb1... to fix character encoding issue with CreateProject command slightly modified to preserve request encoding if it has one
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2000 7d457c2a-affb-35e4-300a-418c747d4874
2011-02-04 03:15:12 +00:00
David Huynh
d7b482be06 Attempt at fixing issue 185. Will need someone else to verify.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1989 7d457c2a-affb-35e4-300a-418c747d4874
2011-01-20 22:49:36 +00:00
David Huynh
44652a3ee2 Make copy of Calendar object before modifying it. Also handle Date type.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1982 7d457c2a-affb-35e4-300a-418c747d4874
2011-01-10 23:06:28 +00:00
David Huynh
90794d5039 Started working on new import UI. Not much to see yet, but if you append ?new=1 to the index page URL then you see the new form. It can only upload a file at the moment.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1971 7d457c2a-affb-35e4-300a-418c747d4874
2011-01-02 23:09:08 +00:00
David Huynh
6fb2b05739 Fixed issue 294: "Exporting date type column to TSV/CSV shows java debugging information instead of value" with help from Gabriel Sjoberg.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1967 7d457c2a-affb-35e4-300a-418c747d4874
2010-12-28 15:54:24 +00:00
David Huynh
53442c5ef2 Handle the case where an excel cell has a formula but the cached result of that formula is an error.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1962 7d457c2a-affb-35e4-300a-418c747d4874
2010-12-25 21:41:21 +00:00
David Huynh
687e9064df A shorter fix for toString() to handle Date than the last commit.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1961 7d457c2a-affb-35e4-300a-418c747d4874
2010-12-25 21:36:51 +00:00
David Huynh
0ff40eabbd toString() should handle Date, too, rather than just Calendar.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1960 7d457c2a-affb-35e4-300a-418c747d4874
2010-12-25 21:33:59 +00:00
Tom Morris
209f157656 RESOLVED - task 202: Sort text with accents
http://code.google.com/p/google-refine/issues/detail?id=202

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1951 7d457c2a-affb-35e4-300a-418c747d4874
2010-12-12 18:16:29 +00:00
Iain Sproat
f55f11cd0d Adding classes that make it possible to parse HTML in GREL. Uses a small subset of methods from the JSoup library, licensed under the MIT license.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1948 7d457c2a-affb-35e4-300a-418c747d4874
2010-12-06 23:15:24 +00:00
Tom Morris
9aaa1c9919 Replace tabs with spaces. No functional change.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1947 7d457c2a-affb-35e4-300a-418c747d4874
2010-12-05 20:50:03 +00:00
Tom Morris
a560cb56df Replace tabs with spaces. No functional changes.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1942 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-29 06:27:06 +00:00
Tom Morris
3a8f9306bd Add some toString() methods to help with debugging
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1941 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-29 06:24:50 +00:00
Tom Morris
af20157532 Fix indentation so indent levels match logical block levels. No code changes.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1940 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-28 17:46:57 +00:00
Tom Morris
748b5699b9 Issue 61 - Turn on text coalescing and XML entity reference replacement
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1939 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-27 22:07:15 +00:00
Tom Morris
e19148c375 Make sure we at least log an error if the import fails
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1938 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-27 22:05:45 +00:00
Tom Morris
824f445530 Unused import
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1937 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-27 17:54:16 +00:00
Tom Morris
b9fa100d31 Don't try to save a null encoding
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1936 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-27 17:54:01 +00:00
Tom Morris
850c43d6f3 Issue 107 - set encoding on response
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1935 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-27 02:46:10 +00:00
Tom Morris
3d6458a0e5 Replace tabs with spaces
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1934 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-27 01:38:32 +00:00
Tom Morris
bc8637f638 Issue 257 - Don't return a String where a Date is required (using generics in Criterion API would prevent this kind of problem, but that's incompatible with the use of the Eval_Error class)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1933 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-27 01:18:36 +00:00
Tom Morris
c7b0f4d024 Issue 184 - use default locale date formatting if no format string is specified
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1932 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-26 23:47:09 +00:00
Tom Morris
080ec5332e Issue 237 - Make sure project's character encoding is always set. Lower minimum confidence threshold for guesser.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1931 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-26 22:23:31 +00:00
David Huynh
1e2af79851 Let's handle .tar files as well rather than requiring .tar.gz.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1919 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-21 03:00:43 +00:00
David Huynh
c496f1e941 Helped toward fixing issue 228: ButterflyServlet already tracks the ServletConfig, so there's no need for RefineServlet to do that, too.
Importing archive files has another big problem at the moment: even if the many files in a single archive share several columns, columns with the same names still get created over and over again as each file gets imported. This is because each individual importer was written with the assumption that it imports into an empty project with no columns.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1918 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-21 02:58:15 +00:00
Iain Sproat
09fa36198c Additions to GREL:
* Factorial function allowing variable steps
* GreatestCommonDenominator function
* LeastCommonMultiple function
* Multinomial function
* Quotient function

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1910 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-20 18:04:11 +00:00
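
Note on the GREL additions listed in the commit above: the following is a minimal Python sketch of what such functions typically compute. It is not the Refine implementation (which is Java); the names lcm, multinomial and stepped_factorial are hypothetical, and reading "variable steps" as "multiply every step-th factor" is an assumption.

```python
from math import factorial, gcd


def lcm(a: int, b: int) -> int:
    """Least common multiple; taken to be 0 when either argument is 0."""
    if a == 0 or b == 0:
        return 0
    return abs(a * b) // gcd(a, b)


def multinomial(*counts: int) -> int:
    """(k1 + k2 + ... + kn)! / (k1! * k2! * ... * kn!)."""
    result = factorial(sum(counts))
    for count in counts:
        # Each division is exact, so integer division loses nothing.
        result //= factorial(count)
    return result


def stepped_factorial(n: int, step: int = 1) -> int:
    """n * (n - step) * (n - 2*step) * ... while the factor stays above 1.

    Assumed reading of "factorial allowing variable steps";
    step=2 gives the double factorial.
    """
    if step < 1:
        raise ValueError("step must be a positive integer")
    result = 1
    while n > 1:
        result *= n
        n -= step
    return result
```

With step left at 1, stepped_factorial behaves like the ordinary factorial.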
Iain Sproat
43d0de2d8a Fixed registered name of GREL combination function
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1909 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-20 12:15:31 +00:00
Iain Sproat
f1643565b8 Additions to GREL:
* modulo operator, %
* cos, sin and tan functions
* acos, asin, atan and atan2 functions
* cosh, sinh and tanh functions
* fact and combin functions
* degrees and radians functions
* odd and even functions

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1908 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-20 12:11:37 +00:00
Iain Sproat
1ec7cb9f7b PI constant added to GREL
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1904 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-18 23:53:07 +00:00
Tom Morris
675714d03d Add toString() methods to help with debugging
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1894 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-18 08:19:05 +00:00
Iain Sproat
dd333d5b43 Abs function now available in GREL
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1890 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-17 09:38:51 +00:00
Iain Sproat
74e9288229 Additional error dialog for Issue 188
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1858 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-11 14:25:46 +00:00
Iain Sproat
2f564589f5 Adding a Fixed Width data importer (Issue 85) and associated tests.
Although this importer is 'wired up', it requires a property "fixed-column-widths" which is not (yet) implemented in the UI. But the ImporterRegister.guessImporter method will probably select the CsvTsvImporter before the FixedWidthImporter anyway. I suggest that an improvement to the project creation UI and/or the guessImporter method will be required.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1857 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-11 13:15:41 +00:00
David Huynh
703d2dbd19 IsTest should catch errors and wrap them.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1833 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-08 21:19:25 +00:00
David Huynh
5d915be096 Numeric comparisons == and != should be special-cased, too.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1780 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-05 19:17:57 +00:00
David Huynh
fe08a43e0c FunctionCall and ControlCall should catch exceptions and wrap them as EvalError's.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1777 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-05 04:26:10 +00:00
David Huynh
faaca5beea Fixed the GREL round function.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1749 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-01 20:28:42 +00:00
David Huynh
1f12bfb409 Fixed bug in HasFieldsListImpl where list members weren't tested for being null.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1735 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-01 18:13:53 +00:00
David Huynh
1eebe2e4a3 Fixed transpose-rows-into-columns command, which previously duplicated columns that precede the column being transposed.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1734 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-01 17:59:58 +00:00
David Huynh
764558c48a In numeric bin index, count infinity values as errors
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1700 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-26 22:49:59 +00:00
David Huynh
8d422e2e54 Fixed Calendar vs. Date bug in time range facet
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1699 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-26 21:06:16 +00:00
David Huynh
8ccf9d1bf8 The judgment facet created after a recon operation is done should also show (blank) and (unreconciled) choices
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1696 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-26 04:57:02 +00:00
David Huynh
e601ad8d40 bug: autoMatch flag wasn't actually used before
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1627 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-25 01:41:07 +00:00
David Huynh
2d9e7c87f6 Increased recon batch size to 10 again. Various style tweaks. Polished up freebase extension's dialogs to be a bit more helpful
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1625 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-25 01:07:25 +00:00
David Huynh
345c1c62ac Added new recon commands:
- clear recon data for all matching rows
- clear recon data for one cell
- clear recon data for similar cells
- copy recon judgments across columns

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1618 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-22 07:25:27 +00:00
David Huynh
5a17acfd70 Prepended license text to java source
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1613 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-20 20:45:52 +00:00
David Huynh
9b8206da29 Fixed new bug for query-based reconciliation introduced by factoring out the freebase extension
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1611 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-20 04:52:35 +00:00
Tom Morris
7dcd0c073d Revert bad commit r1600
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1601 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-18 05:19:05 +00:00
Tom Morris
79c00bab36 Incomplete - task 157: Integrate Google Spreadsheet import/export plugin
http://code.google.com/p/google-refine/issues/detail?id=157

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1600 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-18 04:59:39 +00:00
David Huynh
e7184ec9ab Deleted old empty protograph dirs. Use a default assign version even if running from trunk; this is so that we have at least some clue about an imported project file.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1598 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-18 04:18:09 +00:00
David Huynh
c8dcc10ab8 Be sure to use UTF-8 when saving data.txt, pool.txt, and change files.
Fix issue 163: Refine doesn't retain the characters for flat or sharp.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1588 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-17 23:43:02 +00:00
David Huynh
a62638e88d For each recon group, try at least 3 times if the service keeps failing. Log errors more for debugging purposes.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1578 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-16 00:19:31 +00:00
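
Note on the retry behaviour described in the commit above: a minimal Python sketch of "try up to 3 times and log each failure". The function name call_with_retries and the logger name are hypothetical, and the actual Java change may be structured differently (for example, it may wait between attempts).

```python
import logging

logger = logging.getLogger("recon_sketch")  # hypothetical logger name


def call_with_retries(operation, attempts=3):
    """Call operation() until it succeeds, at most `attempts` times.

    Every failure is logged; the last error is re-raised if all attempts fail.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception as error:  # broad on purpose: any failure counts
            last_error = error
            logger.warning("attempt %d of %d failed: %s", attempt, attempts, error)
    raise last_error
```

The caller passes the service call as a zero-argument callable, so the same helper works for any batch.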
Stefano Mazzocchi
f50880905e fixed warnings
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1577 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-16 00:01:42 +00:00
Tom Morris
47dd5f8da6 Make sure the stream/writer is flushed in case the exporter forgets to do it
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1569 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-15 17:10:37 +00:00
Tom Morris
bbebb4d2dc Add @Override annotations so we get warned about API changes
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1565 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-15 13:26:25 +00:00
David Huynh
7e9df21b70 Exporters need to implement either WriterExporter or StreamExporter.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1558 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-15 06:18:20 +00:00
David Huynh
73042712ed Made csv/tsv importer not trim whitespace even if "guess cells' types" is checked (for cells that are strings).
Updated csv tests to expect un-trimmed cells.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1557 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-15 05:30:15 +00:00
David Huynh
9e35ea3775 Better error message for numeric range facet if there's no numeric value.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1551 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-15 01:00:51 +00:00
Tom Morris
083abd4329 Refactor exporter interface along same lines as importer
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1547 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-14 21:33:50 +00:00
David Huynh
4ccdbc8716 Fixed bug in which a newly created and unedited project would never get saved because it had the same modified time and last save time.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1530 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-14 01:43:26 +00:00
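
Note on the save bug described in the commit above: a small Python sketch of how a strict "modified later than last save" check can skip a brand-new project. Both functions are hypothetical, and representing "never saved" as None is only one possible remedy; the commit message does not say how the fix was made.

```python
from datetime import datetime


def needs_save_buggy(last_modified, last_saved):
    # A new project starts with both timestamps equal, so this is never true
    # until the project is edited.
    return last_modified > last_saved


def needs_save(last_modified, last_saved):
    # Assumed remedy: treat "never saved" (None) as always needing a save.
    return last_saved is None or last_modified > last_saved


created = datetime(2010, 10, 14, 1, 0, 0)
print(needs_save_buggy(created, created))  # False: the new project is skipped
print(needs_save(created, None))           # True: the new project gets saved
```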
David Huynh
dc49047092 We have previously changed the standard-reconcile acre app to return mids, but we still need to make sure its metadata says that its identifier space is mid, not id. And we need Refine to test for the mid identifier space as well.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1479 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-13 18:33:27 +00:00
David Huynh
a16df8f2d6 For unrecoverable projects, rename them with a suffix so the next time we won't try to recover them again.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1472 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-13 07:05:34 +00:00
David Huynh
91ffe71d17 Lowering recon batch size from 7 to 3 to avoid timeout problem. This is a temporary fix only for
Issue 156: Reconcile is not picking up alias hints or even type hints correctly

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1470 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-13 05:03:49 +00:00
David Huynh
208152b55c Added .vt template for reporting errors with stacktraces.
Fixed Issue 155: Blank browser shown when non-GZIP format is detected during import

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1469 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-13 04:51:01 +00:00
David Huynh
7cd5a47fbf We haven't been using non-split row parser, so we need to fix the trimming problem in the tsv/csv importer instead.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1467 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-12 23:24:16 +00:00
David Huynh
2d276fa1e6 Non-split row parser shouldn't trim lines because whitespace is significant
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1465 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-12 22:45:30 +00:00
David Huynh
69c338c728 Text filter was throwing an exception if the column went away (which happened when the column got split).
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1464 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-11 23:15:13 +00:00
David Huynh
336a773069 Only try to create the workspace dir if it doesn't exist.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1463 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-11 23:04:06 +00:00
Tom Morris
c42c78dc0a Log errors if things don't go as expected
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1462 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-11 22:28:22 +00:00
Iain Sproat
142591a090 Added a mention of the new JsonImporter to CHANGES.txt
Corrected the logger name in JsonImporter.java

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1455 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-08 07:58:59 +00:00
David Huynh
ad0d227ab3 Remove remaining Freebase related functionalities.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1453 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-08 02:04:47 +00:00
David Huynh
6ddd945a80 The Freebase functionalities have been extracted out in the last commit. We're removing them from the core module now. This is not a complete checkin. SVN is having some trouble with some directories.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1452 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-08 01:54:00 +00:00
Tom Morris
5040b06d9f Make exceptions more specific for load errors. Still no error returned to user though (just hangs)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1450 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-07 14:20:28 +00:00
Tom Morris
ea28784e8b Don't save null project if load failed
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1449 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-07 14:19:42 +00:00
Stefano Mazzocchi
215165ed97 spell out tweezer parameters
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1444 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-07 08:20:46 +00:00
David Huynh
9ea477c80d Allowed a single operation class to be registered under several names, so that we can rename operations (to better names) while maintaining backward compatibility.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1443 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-07 05:42:01 +00:00
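
Note on the multi-name registration described in the commit above: a minimal Python sketch of a registry in which one class can be looked up under a new name and an old one. OperationRegistry, the example class and both names are hypothetical; Refine's own registry is Java and may differ.

```python
class OperationRegistry:
    """Maps operation names to classes; several names may share one class."""

    def __init__(self):
        self._by_name = {}

    def register(self, operation_class, *names):
        if not names:
            raise ValueError("at least one name is required")
        for name in names:
            self._by_name[name] = operation_class

    def lookup(self, name):
        return self._by_name[name]


class TextTransformOperation:  # hypothetical operation class
    pass


registry = OperationRegistry()
# New name first, old name kept so previously saved histories still resolve.
registry.register(TextTransformOperation, "text-transform", "old-text-transform")
```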
David Huynh
1de5e7c00e Renamed package gel to grel.
Replaced gel with grel in other places in the code base while maintaining backward compatibility.
Changed layout in expression preview dialog to accommodate long GREL name.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1442 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-07 05:19:35 +00:00
David Huynh
90d1111ebc Added "project" argument to OverlayModel methods, as suggested by Fadi Maali.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1439 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-06 20:47:11 +00:00
David Huynh
3ba8e63249 Register Json importer.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1426 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-04 18:53:41 +00:00
Iain Sproat
d977f42f51 Changed behaviour of the XmlImporter to make it more permissive, and allow arrays within mixed elements to be used as candidates for importing to Refine.
This change has also allowed the JsonImporter to pass all its unit tests without any further modification.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1425 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-04 18:33:59 +00:00
Iain Sproat
ec9898ba92 Some tidying up of the XmlImporter which reduces the number of generic TreeParser tokens to a minimum - and should allow elements such as comments and CDATA to be ignored/skipped.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1422 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-04 15:02:09 +00:00
Iain Sproat
d3f223c196 The JsonImporter now passes all current unit tests.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1421 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-04 10:02:50 +00:00
Stefano Mazzocchi
2b9b38368f use the new FreeQ 'refine' queue instead of the old 'gridworks' one
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1410 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-30 00:26:02 +00:00
Stefano Mazzocchi
b62e63306a - make the correct version + revision available also to the java side (thru web.xml)
- add @Override metadata to the commands that were missing it
- make the version information appear even when using trunk (Fixes Issue 136)


git-svn-id: http://google-refine.googlecode.com/svn/trunk@1406 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-29 01:50:57 +00:00
David Huynh
935355cb50 Comments in an XML file caused the record detection code to fail, so we added an ignorable element type that we can skip over.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1392 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 19:16:43 +00:00
Iain Sproat
bd3ded0828 Correcting JsonImporter to use the correct parser.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1388 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 14:19:19 +00:00
Iain Sproat
855df20481 XmlImportUtilities no longer relies on XMLStreamConstants, and is now independent of any specific type of tree data (Xml or otherwise).
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1378 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 10:46:33 +00:00
Iain Sproat
b21961be89 Another small step towards making XmlImportUtilities generic for all tree structured data, and less XML centric. Some calls to XMLStreamConstant in XmlImportUtilities are now working with a generic TreeParserToken, with methods to convert between TreeParserToken and XMLStreamConstant/JsonToken in the respective parsers.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1377 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 10:04:56 +00:00
David Huynh
740caedf46 Updated to version 2.0
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1376 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 06:03:07 +00:00
David Huynh
e587614c22 Fixed Issue 126: Large integers formatted in scientific notation in formulas
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1373 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 04:21:44 +00:00
Tom Morris
bc6f05f41b Issue 140 - Fix Open Workspace command for non-Mac platforms (requires Java 6)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1372 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 03:54:54 +00:00
David Huynh
194fb5e706 Fixed Issue 122: Exporting to Excel on attached project raises server exception
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1370 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 03:44:30 +00:00
David Huynh
f2ce1b7161 Fixed Issue 121: Importing attached file strips backslashes
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1369 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 03:35:42 +00:00
Stefano Mazzocchi
c976091624 new hooks to the Freebase Refinery
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1368 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 02:19:50 +00:00
David Huynh
823fe989a4 Fixed Issue 110: Import of single column text file with Postal Codes shows only 1 row with lots of � chars (?).
(by enforcing a confidence threshold on the encoding guessing)

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1367 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 00:26:53 +00:00
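
Note on the confidence threshold mentioned in the commit above: a minimal Python sketch of accepting a guessed encoding only when the guesser is confident enough. The function pick_encoding, the 0 to 100 confidence scale, the threshold of 50 and the UTF-8 fallback are all assumptions for illustration, not values taken from Refine.

```python
def pick_encoding(guesses, min_confidence=50, fallback="UTF-8"):
    """Return the most confident guess, or the fallback if none is confident enough.

    guesses: iterable of (encoding_name, confidence) pairs, confidence in 0..100.
    """
    best = max(guesses, key=lambda guess: guess[1], default=None)
    if best is None or best[1] < min_confidence:
        return fallback
    return best[0]
```

A low-confidence guess on a short single-column file is then ignored instead of mis-decoding the whole file.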