Stefano Mazzocchi
610de0d33a
adding Metaphone3 algorithm
...
Many thanks to Lawrence Philips for donating the code to us under the BSD license.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2029 7d457c2a-affb-35e4-300a-418c747d4874
2011-03-01 00:17:48 +00:00
Stefano Mazzocchi
87e7f9a7a4
remove unused variable
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2028 7d457c2a-affb-35e4-300a-418c747d4874
2011-02-25 22:51:58 +00:00
Tom Morris
c5312a2e6a
Issue 338 - patch from Thad Guidry to provide a function that calls the JSoup ownText() method
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2025 7d457c2a-affb-35e4-300a-418c747d4874
2011-02-23 19:40:35 +00:00
Tom Morris
5b9362e956
Issue 334 - Make sure URLs are encoded before using them.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2007 7d457c2a-affb-35e4-300a-418c747d4874
2011-02-15 23:15:09 +00:00
Tom Morris
06e2487189
Issue 276 - patch from pxb1... to fix character encoding issue with CreateProject command, slightly modified to preserve the request's encoding if it has one
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2000 7d457c2a-affb-35e4-300a-418c747d4874
2011-02-04 03:15:12 +00:00
David Huynh
d7b482be06
Attempt at fixing issue 185. Will need someone else to verify.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1989 7d457c2a-affb-35e4-300a-418c747d4874
2011-01-20 22:49:36 +00:00
David Huynh
44652a3ee2
Make copy of Calendar object before modifying it. Also handle Date type.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1982 7d457c2a-affb-35e4-300a-418c747d4874
2011-01-10 23:06:28 +00:00
David Huynh
90794d5039
Started working on new import UI. Not much to see yet, but if you append ?new=1 to the index page URL then you see the new form. It can only upload a file at the moment.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1971 7d457c2a-affb-35e4-300a-418c747d4874
2011-01-02 23:09:08 +00:00
David Huynh
6fb2b05739
Fixed issue 294: "Exporting date type column to TSV/CSV shows java debugging information instead of value" with help from Gabriel Sjoberg.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1967 7d457c2a-affb-35e4-300a-418c747d4874
2010-12-28 15:54:24 +00:00
David Huynh
53442c5ef2
Handle the case where an excel cell has a formula but the cached result of that formula is an error.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1962 7d457c2a-affb-35e4-300a-418c747d4874
2010-12-25 21:41:21 +00:00
David Huynh
687e9064df
A shorter fix for toString() to handle Date than the last commit.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1961 7d457c2a-affb-35e4-300a-418c747d4874
2010-12-25 21:36:51 +00:00
David Huynh
0ff40eabbd
toString() should handle Date, too, rather than just Calendar.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1960 7d457c2a-affb-35e4-300a-418c747d4874
2010-12-25 21:33:59 +00:00
Tom Morris
209f157656
RESOLVED - task 202: Sort text with accents
...
http://code.google.com/p/google-refine/issues/detail?id=202
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1951 7d457c2a-affb-35e4-300a-418c747d4874
2010-12-12 18:16:29 +00:00
Iain Sproat
f55f11cd0d
Adding classes that make it possible to parse HTML in GREL. Uses a small subset of methods from the JSoup library, which is licensed under the MIT license.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1948 7d457c2a-affb-35e4-300a-418c747d4874
2010-12-06 23:15:24 +00:00
Tom Morris
9aaa1c9919
Replace tabs with spaces. No functional change.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1947 7d457c2a-affb-35e4-300a-418c747d4874
2010-12-05 20:50:03 +00:00
Tom Morris
a560cb56df
Replace tabs with spaces. No functional changes.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1942 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-29 06:27:06 +00:00
Tom Morris
3a8f9306bd
Add some toString() methods to help with debugging
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1941 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-29 06:24:50 +00:00
Tom Morris
af20157532
Fix indentation so indent levels match logical block levels. No code changes.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1940 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-28 17:46:57 +00:00
Tom Morris
748b5699b9
Issue 61 - Turn on text coalescing and XML entity reference replacement
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1939 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-27 22:07:15 +00:00
Tom Morris
e19148c375
Make sure we at least log an error if the import fails
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1938 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-27 22:05:45 +00:00
Tom Morris
824f445530
Unused import
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1937 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-27 17:54:16 +00:00
Tom Morris
b9fa100d31
Don't try to save a null encoding
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1936 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-27 17:54:01 +00:00
Tom Morris
850c43d6f3
Issue 107 - set encoding on response
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1935 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-27 02:46:10 +00:00
Tom Morris
3d6458a0e5
Replace tabs with spaces
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1934 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-27 01:38:32 +00:00
Tom Morris
bc8637f638
Issue 257 - Don't return a String where a Date is required (using generics in Criterion API would prevent this kind of problem, but that's incompatible with the use of the Eval_Error class)
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1933 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-27 01:18:36 +00:00
Tom Morris
c7b0f4d024
Issue 184 - use default locale date formatting if no format string is specified
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1932 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-26 23:47:09 +00:00
Tom Morris
080ec5332e
Issue 237 - Make sure project's character encoding is always set. Lower minimum confidence threshold for guesser.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1931 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-26 22:23:31 +00:00
David Huynh
1e2af79851
Let's handle .tar files as well rather than requiring .tar.gz.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1919 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-21 03:00:43 +00:00
David Huynh
c496f1e941
Helped toward fixing issue 228: ButterflyServlet already tracks the ServletConfig, so there's no need for RefineServlet to do that, too.
...
Importing archive files has another big problem at the moment: even if the many files in a single archive share several columns, columns with the same names still get created over and over again as each file is imported. This is because each individual importer was written with the assumption that it imports into an empty project with no columns.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1918 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-21 02:58:15 +00:00
Iain Sproat
09fa36198c
Additions to GREL:
...
* Factorial function allowing variable steps
* GreatestCommonDenominator function
* LeastCommonMultiple function
* Multinomial function
* Quotient function
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1910 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-20 18:04:11 +00:00
Iain Sproat
43d0de2d8a
Fixed registered name of GREL combination function
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1909 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-20 12:15:31 +00:00
Iain Sproat
f1643565b8
Additions to GREL:
...
* modulo operator, %
* cos, sin and tan functions
* acos, asin, atan and atan2 functions
* cosh, sinh and tanh functions
* fact and combin functions
* degrees and radians functions
* odd and even functions
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1908 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-20 12:11:37 +00:00
Iain Sproat
1ec7cb9f7b
PI constant added to GREL
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1904 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-18 23:53:07 +00:00
Tom Morris
675714d03d
Add toString() methods to help with debugging
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1894 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-18 08:19:05 +00:00
Iain Sproat
dd333d5b43
Abs function now available in GREL
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1890 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-17 09:38:51 +00:00
Iain Sproat
74e9288229
Additional error dialog for Issue 188
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1858 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-11 14:25:46 +00:00
Iain Sproat
2f564589f5
Adding a Fixed Width data importer (Issue 85) and associated tests.
...
Although this importer is 'wired up', it requires a property "fixed-column-widths" that is not (yet) implemented in the UI. In any case, the ImporterRegister.guessImporter method will probably select the CsvTsvImporter before the FixedWidthImporter. An improvement to the project creation UI and/or the guessImporter method will probably be required.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1857 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-11 13:15:41 +00:00
David Huynh
703d2dbd19
IsTest should catch errors and wrap them.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1833 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-08 21:19:25 +00:00
David Huynh
5d915be096
Numeric comparisons == and != should be special-cased, too.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1780 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-05 19:17:57 +00:00
David Huynh
fe08a43e0c
FunctionCall and ControlCall should catch exceptions and wrap them as EvalErrors.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1777 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-05 04:26:10 +00:00
David Huynh
faaca5beea
Fixed the GREL round function.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1749 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-01 20:28:42 +00:00
David Huynh
1f12bfb409
Fixed bug in HasFieldsListImpl where list members weren't tested for being null.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1735 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-01 18:13:53 +00:00
David Huynh
1eebe2e4a3
Fixed transpose-rows-into-columns command, which previously duplicated columns that precede the column being transposed.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1734 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-01 17:59:58 +00:00
David Huynh
764558c48a
In numeric bin index, count infinity values as errors
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1700 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-26 22:49:59 +00:00
David Huynh
8d422e2e54
Fixed Calendar vs. Date bug in time range facet
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1699 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-26 21:06:16 +00:00
David Huynh
8ccf9d1bf8
The judgment facet created after a recon operation is done should also show (blank) and (unreconciled) choices
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1696 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-26 04:57:02 +00:00
David Huynh
e601ad8d40
bug: autoMatch flag wasn't actually used before
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1627 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-25 01:41:07 +00:00
David Huynh
2d9e7c87f6
Increased recon batch size to 10 again. Various style tweaks. Polished up the Freebase extension's dialogs to be a bit more helpful.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1625 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-25 01:07:25 +00:00
David Huynh
345c1c62ac
Added new recon commands:
...
- clear recon data for all matching rows
- clear recon data for one cell
- clear recon data for similar cells
- copy recon judgments across columns
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1618 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-22 07:25:27 +00:00
David Huynh
5a17acfd70
Prepended license text to Java source files
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1613 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-20 20:45:52 +00:00