Commit Graph

2289 Commits

Author SHA1 Message Date
Tom Morris
06d5b108fa Set project encoding to UTF-8 to match source files
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2040 7d457c2a-affb-35e4-300a-418c747d4874
2011-04-08 14:39:55 +00:00
David Huynh
cecfa244e0 Changed to UTF-8 encoding
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2039 7d457c2a-affb-35e4-300a-418c747d4874
2011-04-06 21:09:21 +00:00
Stefano Mazzocchi
610de0d33a adding Metaphone3 algorithm
Many thanks to Lawrence Philips for donating the code to us under the BSD license.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@2029 7d457c2a-affb-35e4-300a-418c747d4874
2011-03-01 00:17:48 +00:00
Stefano Mazzocchi
87e7f9a7a4 remove unused variable
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2028 7d457c2a-affb-35e4-300a-418c747d4874
2011-02-25 22:51:58 +00:00
Stefano Mazzocchi
c65627524f fixed bug in mql_key quoting with - and _ at the beginning and end of the string
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2027 7d457c2a-affb-35e4-300a-418c747d4874
2011-02-25 22:51:37 +00:00
Tom Morris
3a9ea77b5c Use actual key parsing methods to make sure we can get a key before claiming we'll be able to open a URL.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2026 7d457c2a-affb-35e4-300a-418c747d4874
2011-02-23 20:28:56 +00:00
Tom Morris
c5312a2e6a Issue 338 - patch from Thad Guidry to provide function which calls JSoup ownText() method
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2025 7d457c2a-affb-35e4-300a-418c747d4874
2011-02-23 19:40:35 +00:00
David Huynh
a4572b66c8 Fixed yet another problem caused by r1989.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2022 7d457c2a-affb-35e4-300a-418c747d4874
2011-02-23 01:19:06 +00:00
David Huynh
669b708d60 Fixed r185: same reconciliation candidate for two cells seems to be overridden
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2010 7d457c2a-affb-35e4-300a-418c747d4874
2011-02-18 05:38:41 +00:00
Tom Morris
5b9362e956 Issue 334 - Make sure URLs are encoded before using them.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2007 7d457c2a-affb-35e4-300a-418c747d4874
2011-02-15 23:15:09 +00:00
Tom Morris
e72d590a31 Issue 334 - tighten up URL pattern matching for Google Spreadsheets & Fusion Tables
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2006 7d457c2a-affb-35e4-300a-418c747d4874
2011-02-14 22:23:48 +00:00
Tom Morris
9384d22d85 Add GData extension to the source path for debugging
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2005 7d457c2a-affb-35e4-300a-418c747d4874
2011-02-14 22:23:05 +00:00
Tom Morris
dcc6ac9bea Issue 325 - use system defined HTTP proxies by default
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2003 7d457c2a-affb-35e4-300a-418c747d4874
2011-02-04 20:18:48 +00:00
Tom Morris
bbc2b3d363 Test provided by Gabriel Sjoberg. Thank you!
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2002 7d457c2a-affb-35e4-300a-418c747d4874
2011-02-04 20:09:01 +00:00
Tom Morris
3e08aca4ec Issue 304 - Apply patch to fix test. Thanks to GabrielS
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2001 7d457c2a-affb-35e4-300a-418c747d4874
2011-02-04 04:55:05 +00:00
Tom Morris
06e2487189 Issue 276 - patch from pxb1... to fix character encoding issue with CreateProject command slightly modified to preserve request encoding if it has one
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2000 7d457c2a-affb-35e4-300a-418c747d4874
2011-02-04 03:15:12 +00:00
Tom Morris
5519f61335 Issue 311 - give input fields unique names
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1999 7d457c2a-affb-35e4-300a-418c747d4874
2011-02-04 02:31:14 +00:00
Tom Morris
de25ddfe41 Issue 328 - extend solution for key-based recon to guid & id recon
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1998 7d457c2a-affb-35e4-300a-418c747d4874
2011-02-04 02:08:38 +00:00
Tom Morris
cccfbf9ad8 Issue 328 - don't retry unsuccessful MQL key based reconciliation
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1997 7d457c2a-affb-35e4-300a-418c747d4874
2011-02-04 00:56:31 +00:00
Tom Morris
1df3348b52 Include Freebase extension in launch config
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1995 7d457c2a-affb-35e4-300a-418c747d4874
2011-02-03 16:34:11 +00:00
Tom Morris
094a479d50 Compile using Java 1.6 since we have a dependency on it
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1994 7d457c2a-affb-35e4-300a-418c747d4874
2011-02-03 16:01:28 +00:00
Tom Morris
6a01c345ad Quote paths so they work when there are spaces in them
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1993 7d457c2a-affb-35e4-300a-418c747d4874
2011-02-03 02:57:10 +00:00
David Huynh
d7b482be06 Attempt at fixing issue 185. Will need someone else to verify.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1989 7d457c2a-affb-35e4-300a-418c747d4874
2011-01-20 22:49:36 +00:00
David Huynh
44652a3ee2 Make copy of Calendar object before modifying it. Also handle Date type.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1982 7d457c2a-affb-35e4-300a-418c747d4874
2011-01-10 23:06:28 +00:00
Tom Morris
4d84733b8e Fixed - task 197: Handle date wraparound for year boundary
http://code.google.com/p/google-refine/issues/detail?id=197

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1979 7d457c2a-affb-35e4-300a-418c747d4874
2011-01-09 06:09:30 +00:00
David Huynh
90794d5039 Started working on new import UI. Not much to see yet, but if you append ?new=1 to the index page URL then you see the new form. It can only upload a file at the moment.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1971 7d457c2a-affb-35e4-300a-418c747d4874
2011-01-02 23:09:08 +00:00
David Huynh
a81dcc50cc Don't assert type /type/object as the result of any /type/object/* property.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1969 7d457c2a-affb-35e4-300a-418c747d4874
2010-12-28 22:32:21 +00:00
David Huynh
ccc6587cdd Fixed minor bug introduced by recent check-in for asserting types in triple loader payload.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1968 7d457c2a-affb-35e4-300a-418c747d4874
2010-12-28 19:48:08 +00:00
David Huynh
6fb2b05739 Fixed issue 294: "Exporting date type column to TSV/CSV shows Java debugging information instead of value" with help from Gabriel Sjoberg.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1967 7d457c2a-affb-35e4-300a-418c747d4874
2010-12-28 15:54:24 +00:00
David Huynh
ca8f64ddc4 When generating triple loader payload, assert included, schema, and expected types for existing as well as new topics.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1963 7d457c2a-affb-35e4-300a-418c747d4874
2010-12-27 22:57:04 +00:00
David Huynh
53442c5ef2 Handle the case where an Excel cell has a formula but the cached result of that formula is an error.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1962 7d457c2a-affb-35e4-300a-418c747d4874
2010-12-25 21:41:21 +00:00
David Huynh
687e9064df A shorter fix for toString() to handle Date than the last commit.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1961 7d457c2a-affb-35e4-300a-418c747d4874
2010-12-25 21:36:51 +00:00
David Huynh
0ff40eabbd toString() should handle Date, too, rather than just Calendar.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1960 7d457c2a-affb-35e4-300a-418c747d4874
2010-12-25 21:33:59 +00:00
Stefano Mazzocchi
f85e9198d4 ignoring classes that are generated by jython
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1959 7d457c2a-affb-35e4-300a-418c747d4874
2010-12-25 01:49:42 +00:00
Stefano Mazzocchi
2ab131b87a ISSUE-262 make running checks more solid
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1958 7d457c2a-affb-35e4-300a-418c747d4874
2010-12-24 22:21:40 +00:00
Stefano Mazzocchi
e6415bab4f ISSUE-258 avoid ignoring the JAVA_HOME environment if java is not in the path
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1957 7d457c2a-affb-35e4-300a-418c747d4874
2010-12-24 21:17:21 +00:00
Stefano Mazzocchi
9c98842132 ISSUE-295 use "mktemp" instead of creating files locally which makes it possible to install Refine under a different user than the one executing it on *nix systems
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1956 7d457c2a-affb-35e4-300a-418c747d4874
2010-12-24 20:05:57 +00:00
Tom Morris
209f157656 RESOLVED - task 202: Sort text with accents
http://code.google.com/p/google-refine/issues/detail?id=202

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1951 7d457c2a-affb-35e4-300a-418c747d4874
2010-12-12 18:16:29 +00:00
Tom Morris
dda74792bc Add jsoup library so project will compile
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1950 7d457c2a-affb-35e4-300a-418c747d4874
2010-12-08 05:35:07 +00:00
Iain Sproat
f55f11cd0d Adding classes to now make it possible to parse HTML in GREL. Uses small subset of methods from the JSoup library, licensed under the MIT license.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1948 7d457c2a-affb-35e4-300a-418c747d4874
2010-12-06 23:15:24 +00:00
Tom Morris
9aaa1c9919 Replace tabs with spaces. No functional change.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1947 7d457c2a-affb-35e4-300a-418c747d4874
2010-12-05 20:50:03 +00:00
Tom Morris
a560cb56df Replace tabs with spaces. No functional changes.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1942 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-29 06:27:06 +00:00
Tom Morris
3a8f9306bd Add some toString() methods to help with debugging
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1941 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-29 06:24:50 +00:00
Tom Morris
af20157532 Fix indentation so indent levels match logical block levels. No code changes.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1940 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-28 17:46:57 +00:00
Tom Morris
748b5699b9 Issue 61 - Turn on text coalescing and XML entity reference replacement
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1939 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-27 22:07:15 +00:00
Tom Morris
e19148c375 Make sure we at least log an error if the import fails
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1938 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-27 22:05:45 +00:00
Tom Morris
824f445530 Unused import
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1937 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-27 17:54:16 +00:00
Tom Morris
b9fa100d31 Don't try to save a null encoding
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1936 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-27 17:54:01 +00:00
Tom Morris
850c43d6f3 Issue 107 - set encoding on response
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1935 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-27 02:46:10 +00:00
Tom Morris
3d6458a0e5 Replace tabs with spaces
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1934 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-27 01:38:32 +00:00