Tom Morris
2c52a00f55
Fixed - issue 544,600,618: Clean up handling of compressed files & archives with multi-segment paths
...
http://code.google.com/p/google-refine/issues/detail?id=600
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2569 7d457c2a-affb-35e4-300a-418c747d4874
2012-09-22 18:08:56 +00:00
Tom Morris
748e205ae8
FIXED - task 616: Support bzip2 decompression on import
...
http://code.google.com/p/google-refine/issues/detail?id=616
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2568 7d457c2a-affb-35e4-300a-418c747d4874
2012-09-22 16:00:42 +00:00
Tom Morris
27e3c0c8dc
FIXED - task 614: Use same instance of OAuthProvider in OAuth dance. Patch supplied by sdeo@google.com
...
http://code.google.com/p/google-refine/issues/detail?id=614
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2566 7d457c2a-affb-35e4-300a-418c747d4874
2012-09-19 23:16:29 +00:00
Tom Morris
b3f5fada95
FIXED - task 578 & 596: Clean up JSON importer
...
http://code.google.com/p/google-refine/issues/detail?id=578
http://code.google.com/p/google-refine/issues/detail?id=596
Extend tree parser framework to allow any Serializable instead of just Strings. Use this in the JSON importer to: import the keywords null, true, and false; import empty strings and don't trim whitespace from strings on import; import numbers directly instead of importing them as text and then parsing them ourselves. Add tests to verify all of this.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2543 7d457c2a-affb-35e4-300a-418c747d4874
2012-09-08 01:20:25 +00:00
Tom Morris
93d6e176d6
Task 478: Default "guess datatypes" to False so importers which don't specify it (e.g. gData & Excel) aren't affected
...
http://code.google.com/p/google-refine/issues/detail?id=478
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2541 7d457c2a-affb-35e4-300a-418c747d4874
2012-09-07 21:17:34 +00:00
Tom Morris
83dce305cb
FIXED - task 432: cross() failing - flush join cache table when column changes
...
http://code.google.com/p/google-refine/issues/detail?id=432
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2539 7d457c2a-affb-35e4-300a-418c747d4874
2012-08-30 16:31:22 +00:00
Tom Morris
9b54a8f29e
FIXED - task 559: Deadlock between autosave thread and history code
...
http://code.google.com/p/google-refine/issues/detail?id=559
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2538 7d457c2a-affb-35e4-300a-418c747d4874
2012-08-30 16:22:28 +00:00
Stefano Mazzocchi
ba89daec1c
make oauth against freebase work again in chrome
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2537 7d457c2a-affb-35e4-300a-418c747d4874
2012-08-23 20:58:08 +00:00
Tom Morris
12a61b6ec6
task 603: range check column move commands
...
http://code.google.com/p/google-refine/issues/detail?id=603
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2534 7d457c2a-affb-35e4-300a-418c747d4874
2012-08-18 22:01:23 +00:00
Tom Morris
202018fac4
Add Javadoc. No code changes.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2533 7d457c2a-affb-35e4-300a-418c747d4874
2012-08-18 22:00:41 +00:00
Tom Morris
4bb6c43982
task 604: add Guava to main project so that we're not dependent on an extension
...
http://code.google.com/p/google-refine/issues/detail?id=604
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2531 7d457c2a-affb-35e4-300a-418c747d4874
2012-08-15 13:33:17 +00:00
Tom Morris
1e043dcc94
FIXED - task 604: The common transform “Trim leading and trailing whitespace” doesn’t trim non-breaking spaces
...
http://code.google.com/p/google-refine/issues/detail?id=604
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2529 7d457c2a-affb-35e4-300a-418c747d4874
2012-08-14 23:00:13 +00:00
Tom Morris
f29f77e8f8
STARTED - task 604: The common transform “Trim leading and trailing whitespace” doesn’t trim non-breaking spaces
...
http://code.google.com/p/google-refine/issues/detail?id=604
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2528 7d457c2a-affb-35e4-300a-418c747d4874
2012-08-14 20:09:49 +00:00
Tom Morris
4bf212c03d
FIXED - task 154: Can't import RDF/XML Data
...
http://code.google.com/p/google-refine/issues/detail?id=154
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2526 7d457c2a-affb-35e4-300a-418c747d4874
2012-08-05 16:31:41 +00:00
Tom Morris
5881addac8
Throw an exception if unsupported verb is used
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2525 7d457c2a-affb-35e4-300a-418c747d4874
2012-08-05 15:36:23 +00:00
Tom Morris
b2ae74d23f
FIXED - task 586: Only one parse date format is attempted from list in toDate(format1,format2)
...
http://code.google.com/p/google-refine/issues/detail?id=586
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2520 7d457c2a-affb-35e4-300a-418c747d4874
2012-08-03 18:01:01 +00:00
Tom Morris
4319314675
FIXED - task 594: Date diff function doesn't work for two Calendar objects
...
http://code.google.com/p/google-refine/issues/detail?id=594
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2519 7d457c2a-affb-35e4-300a-418c747d4874
2012-08-02 21:41:19 +00:00
Tom Morris
efa58630cf
Add constructor that takes a Throwable to eliminate redundant code from callers.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2518 7d457c2a-affb-35e4-300a-418c747d4874
2012-08-02 21:38:00 +00:00
Stefano Mazzocchi
2cb31b8b29
fixing oauth problems with redirection for the Freebase API
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2516 7d457c2a-affb-35e4-300a-418c747d4874
2012-08-01 21:46:53 +00:00
David Huynh
4cfb921082
Added getStringKey() method for when it is difficult to generate integer keys that don't collide
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2515 7d457c2a-affb-35e4-300a-418c747d4874
2012-07-19 00:25:41 +00:00
Stefano Mazzocchi
6e41f4ad91
make the latest eclipse happy (it triggers a warning)
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2513 7d457c2a-affb-35e4-300a-418c747d4874
2012-07-12 01:55:11 +00:00
Stefano Mazzocchi
bccea8cebe
we could be leaking file descriptors here
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2506 7d457c2a-affb-35e4-300a-418c747d4874
2012-06-30 07:05:08 +00:00
Stefano Mazzocchi
f84dcff900
moving oauth authorize and deauthorize into the core module because they are reusable across extensions
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2505 7d457c2a-affb-35e4-300a-418c747d4874
2012-06-29 19:39:42 +00:00
Tom Morris
8872c1b0a1
Keep track of when we have unsaved preference changes
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2502 7d457c2a-affb-35e4-300a-418c747d4874
2012-06-02 21:06:46 +00:00
Tom Morris
a0812c5751
Be slightly more tolerant of weird spreadsheet data
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2501 7d457c2a-affb-35e4-300a-418c747d4874
2012-06-02 21:00:30 +00:00
Tom Morris
c47b1e0ab7
Mark project as modified when metadata is changed
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2491 7d457c2a-affb-35e4-300a-418c747d4874
2012-04-14 14:10:11 +00:00
Tom Morris
8d22ede1f8
Issue 554 - rank formats *before* serializing them.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2482 7d457c2a-affb-35e4-300a-418c747d4874
2012-03-16 20:21:57 +00:00
Tom Morris
b3f8ce83c1
Issue 553 - Make sure we have a usable filename when importing from a URL
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2481 7d457c2a-affb-35e4-300a-418c747d4874
2012-03-16 20:16:18 +00:00
Tom Morris
51c586bc2c
Issue 543 - Handle HTTP responses with Content-Encoding of gzip
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2480 7d457c2a-affb-35e4-300a-418c747d4874
2012-03-16 20:12:10 +00:00
Tom Morris
a8cb23ca51
Issue 544 - preserve directory path after decompressing file
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2479 7d457c2a-affb-35e4-300a-418c747d4874
2012-03-16 20:06:54 +00:00
Tom Morris
e97e7523b2
Issue 548 - Convert non-strings to strings before escaping
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2463 7d457c2a-affb-35e4-300a-418c747d4874
2012-03-14 03:06:11 +00:00
Tom Morris
18b780bebe
Issue 517 - Fix combin() function to a) increase upper limit and b) keep it from continually recomputing the same values in recursion
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2459 7d457c2a-affb-35e4-300a-418c747d4874
2012-03-08 22:53:21 +00:00
Tom Morris
28ff2295fd
Issue 490 - Handle separator guessing for CSVs with quoted fields containing commas
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2458 7d457c2a-affb-35e4-300a-418c747d4874
2012-03-08 15:53:55 +00:00
Tom Morris
9a680e8307
Switch to class name for logging, per convention
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2457 7d457c2a-affb-35e4-300a-418c747d4874
2012-03-08 14:53:27 +00:00
Tom Morris
ddd3680128
Add a TODO for recon failure retries on HTTP 500s - no functional changes
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2455 7d457c2a-affb-35e4-300a-418c747d4874
2012-03-08 14:45:53 +00:00
Tom Morris
5a962b1768
Issue 534 - Attempt to recover recon links which have become corrupted
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2454 7d457c2a-affb-35e4-300a-418c747d4874
2012-03-08 00:37:29 +00:00
Tom Morris
dbdbd906b7
Issue 547 - Decompress kmz files
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2453 7d457c2a-affb-35e4-300a-418c747d4874
2012-03-08 00:29:25 +00:00
Tom Morris
4a99abf25d
Issue 542 - allow integers to be converted to dates
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2450 7d457c2a-affb-35e4-300a-418c747d4874
2012-03-03 21:36:36 +00:00
Tom Morris
5d080e5b3e
Wrap if statement in a block to avoid future problems.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2447 7d457c2a-affb-35e4-300a-418c747d4874
2012-03-01 18:10:59 +00:00
Tom Morris
c583ad4367
Issue 537 - Try to convert to Long first before converting to Double. Matches behavior on import.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2446 7d457c2a-affb-35e4-300a-418c747d4874
2012-02-26 17:27:00 +00:00
Tom Morris
190e817fb8
Protect against NullPointerException
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2444 7d457c2a-affb-35e4-300a-418c747d4874
2012-02-22 20:06:03 +00:00
David Huynh
e21ae32722
Make sure project ID is completely numeric. Slightly better error reporting on project page when project ID is not valid.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2441 7d457c2a-affb-35e4-300a-418c747d4874
2012-01-29 21:16:13 +00:00
Tom Morris
6414ae7f87
Remove redundant test
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2436 7d457c2a-affb-35e4-300a-418c747d4874
2012-01-27 20:38:55 +00:00
Tom Morris
40183aa0ba
Issue 513 - get rid of exception at end of import in JSON parser
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2435 7d457c2a-affb-35e4-300a-418c747d4874
2012-01-27 17:05:45 +00:00
Tom Morris
fdac0c30cf
Issue 524 - shorten __anonymous__ names for JSON importer
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2432 7d457c2a-affb-35e4-300a-418c747d4874
2012-01-26 22:38:25 +00:00
Tom Morris
df45d06b2b
Issue 523 - On URL fetch error, return HTTP error code, message, and contents of error stream (HTML page) if available
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2429 7d457c2a-affb-35e4-300a-418c747d4874
2012-01-26 18:47:30 +00:00
David Huynh
794629eee6
ChangeSequence did not save/load properly at all.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2427 7d457c2a-affb-35e4-300a-418c747d4874
2012-01-25 02:04:52 +00:00
David Huynh
893b767c01
ChangeSequence did not revert properly at all.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2426 7d457c2a-affb-35e4-300a-418c747d4874
2012-01-25 00:46:52 +00:00
Tom Morris
fa2e6fe608
Issue 517 - add some interim error checking and reporting
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2420 7d457c2a-affb-35e4-300a-418c747d4874
2012-01-12 06:20:28 +00:00
Tom Morris
8ec10a6ea6
Fix error message to match code
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2419 7d457c2a-affb-35e4-300a-418c747d4874
2012-01-12 05:51:16 +00:00
Tom Morris
b409ef5670
Issue 491 - fix off-by-one error in column counts
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2405 7d457c2a-affb-35e4-300a-418c747d4874
2011-12-09 23:50:40 +00:00
Tom Morris
b3bcb3361b
Issue 483 - make custom metadata available to the client
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2404 7d457c2a-affb-35e4-300a-418c747d4874
2011-12-09 23:05:42 +00:00
David Huynh
ae771a7ccb
Fixed Issue 502 in google-refine: Fetch URLs does not return the exact HTTP payload, like Create Project from URLs does.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2398 7d457c2a-affb-35e4-300a-418c747d4874
2011-12-02 20:44:13 +00:00
David Huynh
a7e2704655
Attempt at fixing Issue 500: Sequential creation of related columns using apply-operation command
...
by letting long-running processes report errors.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2394 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-30 23:54:40 +00:00
David Huynh
d419f4bbc7
For reinterpret function, swapped encoder and decoder arguments if decoder is specified, as discussed here:
...
http://groups.google.com/group/google-refine/msg/629dbf11b073e129
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2392 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-29 19:55:08 +00:00
Tom Morris
3b4bdbecdf
Issue 378 - JSONize NaNs as their string equivalent to keep JSONwriter from throwing an exception
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2391 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-29 07:57:36 +00:00
David Huynh
76802d328d
Default the encoding of clipboard data to UTF-8.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2390 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-29 05:21:26 +00:00
David Huynh
cdca6fff8f
Checked in Shardul Deo's patch from
...
http://groups.google.com/group/google-refine-dev/browse_thread/thread/5222a68396c56405
to support HTTP PUT and DELETE.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2387 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-25 03:48:03 +00:00
Tom Morris
f1b567bc31
Issue 487 - Add support for ISO 8601 date parsing
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2383 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-18 22:05:45 +00:00
Tom Morris
80c13e4b59
Issue 486 - make sure project character encoding doesn't get set to ""
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2381 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-18 20:52:49 +00:00
Tom Morris
d5dd04965a
Allow user to optionally override source encoding in reinterpret function so they can fix up bad projects. Interpret empty string as system default encoding.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2380 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-18 20:50:55 +00:00
Tom Morris
23ac625818
Issue 430 - Fix timeline facet to handle Calendar type as well as Date
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2379 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-17 23:52:32 +00:00
David Huynh
dbeaefb00b
Minor bug fix to previous check-in: made sure blank cells in the 2 newly generated columns don't get filled in.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2368 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-07 19:53:26 +00:00
David Huynh
d01745284b
Added option to "transpose columns into rows" operation for filling in other columns.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2367 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-07 19:39:23 +00:00
David Huynh
5aec75696d
Fixed Issue 477 in google-refine: Implement or remove the line separator option.
...
Also fixed a display bug in the fixed-width parser UI: previously, tab characters forced columns to be wider.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2364 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-06 20:13:05 +00:00
David Huynh
a35b9f53f7
Made operation "Transpose columns into rows" support the option of transposing into 2 new columns rather than just one.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2362 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-06 02:50:33 +00:00
Tom Morris
85a37d23f9
Issue 474 - implement record limit for XML and JSON importers
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2359 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-05 16:38:19 +00:00
David Huynh
b36b229ba4
Fixed Issue 465: Data text file with extension .dta within a .ZIP is not automatically extracted
...
.dta isn't recognized so there's no best format detected. But now we default to text/line-based and always select all files if no file gets selected by default.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2358 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-04 22:33:38 +00:00
David Huynh
41a90ad71f
Fixed Issue 459: Undefined error with some CSV files (incorrectly detected as EXCEL)
...
by favoring file name-based format over mime type-based format (because the user's computer might have .csv registered as an Excel format).
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2357 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-04 21:52:12 +00:00
David Huynh
2f6b635f66
Added initial implementation of Key/value Columnize operation and command.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2356 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-04 21:00:32 +00:00
Tom Morris
a7c81880a8
Issue 475 - Support escaped custom separators
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2355 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-04 19:04:16 +00:00
Tom Morris
cacbedd352
Fix index out of bounds exception when separator is the empty string
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2354 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-04 17:31:51 +00:00
Stefano Mazzocchi
856ef6a65a
commented out unused variables
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2352 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-01 21:47:24 +00:00
Tom Morris
71492c706c
Just some TODOs
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2349 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-28 17:51:20 +00:00
Tom Morris
ad8705e299
Javadoc only
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2348 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-28 17:29:35 +00:00
Tom Morris
a870e782f5
Make sure our counts are current before attempting to use them for sorting
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2347 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-28 17:28:27 +00:00
Tom Morris
5dad4d6a0b
Handle legacy projects which have an empty slot 0 for the column model (old off-by-one bug)
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2346 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-23 19:29:44 +00:00
Tom Morris
ab950689dd
Add debugging info - mostly toString() methods for types missing them
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2343 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-21 16:46:55 +00:00
Tom Morris
b2781bda3f
Javadoc only
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2342 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-21 16:30:37 +00:00
Tom Morris
9a9f4c1354
Issue 467 - provide JVM heap usage as part of the progress monitor during project creation.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2341 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-21 16:28:40 +00:00
David Huynh
f4b2ee3715
"Transpose columns into rows" operation now supports specifying the ending column to be the last column regardless of its name.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2337 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-19 13:42:50 +00:00
David Huynh
223074bb25
Xml importer should stop trying to skip over initial non-xml content after some number of characters.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2336 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-18 15:25:31 +00:00
Tom Morris
9710521ef8
Correct column counting so maxCellIndex represents current count rather than next column
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2335 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-14 21:00:50 +00:00
Tom Morris
5d6ab76b7c
Issue 313 - fix cell format so dates export as dates rather than numbers.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2334 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-14 20:23:59 +00:00
Tom Morris
2d5125af1e
Issue 462 - don't trim whitespace from string-valued cell contents on import
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2330 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-12 23:45:52 +00:00
Tom Morris
5c95c9c1f9
New exporter - Open Document Format (ODF) spreadsheets (.ods)
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2326 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-11 21:02:23 +00:00
Tom Morris
3bd84088da
Rename OO/ODS importer with more generic name
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2325 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-11 21:01:45 +00:00
Tom Morris
ee0fb9033e
Javadoc
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2324 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-11 20:57:40 +00:00
Tom Morris
ca17e1ef0a
New importer for Open Document Format (ODF) spreadsheet files (.ods)
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2323 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-11 20:27:40 +00:00
Tom Morris
2726f61a61
Add toString methods to help with debugging
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2321 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-11 20:19:53 +00:00
Tom Morris
5c856179cb
Add TODO for suspicious code
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2320 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-11 20:14:57 +00:00
Tom Morris
16421303cb
Add Javadoc
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2318 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-11 20:12:24 +00:00
David Huynh
55c3fdebab
Bumped up version to 2.5.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2314 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-10 21:58:42 +00:00
David Huynh
1a14d82393
For XML files, ignore not just leading whitespace but anything except <.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2313 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-10 20:51:00 +00:00
Tom Morris
fffd24d64b
Parse parameters from multipart/form-data POSTs rather than just dropping them (needed for Windmill tests, among other things)
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2302 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-07 23:15:55 +00:00
Stefano Mazzocchi
1f67866258
fixing a bunch of inconsistencies and potential bugs as indicated by findbugs, pmd and eclipse
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2301 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-07 21:23:23 +00:00
Tom Morris
31073d7712
Refactor importer interfaces to narrow exceptions thrown and handled
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2296 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-07 19:06:53 +00:00
Tom Morris
50927b33dc
Javadoc
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2295 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-07 18:56:23 +00:00
Tom Morris
4a230abb44
Narrow exception handling
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2294 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-07 18:55:46 +00:00
Tom Morris
29cbc5af20
Remove some obsolete TODOs. No functional changes.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2290 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-07 17:29:30 +00:00
David Huynh
18f32ed7e8
Fixed up Rdf Triples importer, added a parser UI for it, and got its tests to pass.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2283 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-06 21:28:20 +00:00
David Huynh
1c5dc32b88
Fixed tsv/csv tests.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2276 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-06 06:22:30 +00:00
Tom Morris
ac4a0ca747
Store blank cells as nulls if that's what the user requests
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2272 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-05 23:41:52 +00:00
Tom Morris
0ce0a0a8d3
Add toString support for null cells to help with debugging
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2271 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-05 23:33:17 +00:00
David Huynh
e7e9dbc74d
Minor fixes to pass some exporter tests.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2269 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-03 16:38:07 +00:00
David Huynh
7935dfd60e
Stricter detection of json and xml formats on import, by checking for initial nonspace character.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2266 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-30 01:47:42 +00:00
David Huynh
d047acf1d1
Fixed Issue 452: Importing using Clipboard function does not guess structure correctly for XML or JSON
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2263 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-29 14:02:12 +00:00
David Huynh
5762efebf6
Fixed Issue 397: New UI Importer Branch - individual JSON record nodes do not preview well.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2258 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-28 03:38:23 +00:00
Tom Morris
1b197d93d8
Issue 447 - allow users to specify delimiters for toTitlecase function
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2253 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-20 05:07:46 +00:00
David Huynh
e1184003df
Color-code date values in data table.
...
Fixed Issue 426: filter with custom facet adds zero lines choice
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2251 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-20 01:36:47 +00:00
Tom Morris
59d6020979
Add basic test coverage for ToTitleCase and (commented out) support for 2nd parameter
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2250 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-19 15:47:33 +00:00
David Huynh
82cc76f076
Fixed bug where a blank row used to corrupt the whole project because it could not be re-loaded from file.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2248 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-19 10:36:38 +00:00
David Huynh
9111157172
Fixed Issue 447: Extend toTitlecase() function with support for char[] delimiters in Apache WordUtils.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2247 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-19 09:48:37 +00:00
David Huynh
db3bbb5c86
Fixed xml parsing error due to whitespace in front of <?xml>.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2246 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-19 09:06:36 +00:00
David Huynh
66cf0b6596
Fixed Issue 449: Uncaught exception from Excel importer.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2245 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-19 08:49:35 +00:00
David Huynh
5c446d28d0
Support uploading directly to a new Google spreadsheet.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2243 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-16 18:04:55 +00:00
David Huynh
02c58e2c56
Periodically clean up stale importing jobs to free up disk space.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2240 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-15 23:52:05 +00:00
David Huynh
0693205430
Added support for importing from fusion tables.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2239 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-15 21:40:40 +00:00
Tom Morris
ebede9b424
Issue 441 - return EvalError if we can't parse a date
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2237 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-13 20:58:43 +00:00
Tom Morris
131ff81c0d
Don't reschedule a canceled timer
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2236 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-13 20:38:34 +00:00
David Huynh
57c11d0238
Fixed issue 442: Two column transforms to date on the same column turns the cells blank
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2230 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-01 22:11:45 +00:00
David Huynh
a88ccd2c32
Reduced amount of logging.
...
Suppressed logging for the GetProcessesCommand, which gets pinged often while a long-running operation is being executed (e.g., reconciling, fetching URLs).
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2228 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-01 18:26:45 +00:00
David Huynh
a8815956cd
Implemented back-end of customizable tabular exporting support.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2225 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-30 19:19:46 +00:00
Tom Morris
e174bb163a
Issue 440 - Don't purge from memory those projects with pending operations
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2222 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-28 22:00:02 +00:00
David Huynh
420e74c6f4
Made CreateProjectCommand scriptable again, so it can be called from client libraries.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2216 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-23 18:49:47 +00:00
David Huynh
4113a10b5b
Catch/log exceptions in the importers a bit more carefully.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2215 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-22 21:47:15 +00:00
David Huynh
f023b922e1
Implemented encoding selectors in a few importing parser UIs.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2214 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-22 17:55:06 +00:00
Tom Morris
bde63ff417
Last set of indentation cleanups - no functional changes
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2211 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-18 17:46:36 +00:00
Tom Morris
9d7b8a5279
Don't die if we get passed no candidates
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2210 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-18 17:39:18 +00:00
David Huynh
afb7953eac
Fixed problem for importing from an archive file containing fixed width column files: we used to create totally new columns for each contained file, yielding too many columns.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2203 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-14 02:53:19 +00:00
David Huynh
33d99186ea
Made fixed width column guessing slightly better.
...
Made sure fixed width parser UI takes into account the File column.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2202 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-14 02:05:18 +00:00
David Huynh
41e4e1cd70
Some more JS indentation fixes.
...
Fixed issue 31: "Maximum number of facet values should be configurable." Now when we're showing "too many choices" we also display exactly how many choices there are and show a link to change the limit.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2201 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-14 01:05:43 +00:00
David Huynh
e955ed05ae
Made sure busy indicator shows up for GData importing when needed.
...
Fixed radio button issue with GData worksheet selection.
Fixed resizing issue with open project action area.
Fixed NullPointerException in RecordModel.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2198 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-12 19:15:58 +00:00
David Huynh
823729776d
Google spreadsheets can now be imported directly from within Refine.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2192 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-11 00:35:01 +00:00
David Huynh
c5078d1887
Fixed issue 428: Excel import sometimes drops last row of data.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2189 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-06 19:37:23 +00:00
Tom Morris
da7347e7b1
Make sure all conditionals and loops are in blocks (too bug-prone otherwise)
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2183 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-02 22:21:47 +00:00
Tom Morris
c16a2378f9
Ask people not to reformat since this is imported code.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2182 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-02 22:18:50 +00:00
Tom Morris
539fea6eb3
Simplify some for loops using new Java 5 syntax
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2181 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-02 21:17:41 +00:00
Tom Morris
97a0f2a33e
Organize imports. com.google.refine last in a section of its own. Everything alphabetical in its section.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2180 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-02 21:10:22 +00:00
Tom Morris
5497fa4685
Remove unnecessary casts
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2173 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-02 20:33:57 +00:00
Tom Morris
7fd6e22af4
Convert tabs to spaces. No functional changes.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2172 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-02 20:26:32 +00:00
Tom Morris
123614539d
Add missing @Override annotations
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2171 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-02 19:30:23 +00:00
David Huynh
78edff6f7f
Merged new importer UI work from branch over.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2170 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-02 03:34:47 +00:00
Tom Morris
b82448037a
Add @Override annotations. No functional changes.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2124 7d457c2a-affb-35e4-300a-418c747d4874
2011-06-24 04:01:47 +00:00
Tom Morris
eb38ab75a4
FIXED - task 415: Evaluation precedence wrong for arithmetic expressions
...
http://code.google.com/p/google-refine/issues/detail?id=415
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2123 7d457c2a-affb-35e4-300a-418c747d4874
2011-06-23 23:42:12 +00:00
Tom Morris
2af22f9485
Issue 404 - Fix indeterminate behavior in character encoding guesser. Thanks to Paul Makepeace.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2120 7d457c2a-affb-35e4-300a-418c747d4874
2011-06-14 04:29:44 +00:00
Tom Morris
8da1291650
Issue 399 - Add Cologne Phonetic Keyer and allow it to be used for clustering
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2102 7d457c2a-affb-35e4-300a-418c747d4874
2011-06-09 19:42:05 +00:00
Tom Morris
51c898d602
Issue 351 - truncate exports to Excel at 256 columns (limitation of Excel format)
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2094 7d457c2a-affb-35e4-300a-418c747d4874
2011-06-07 23:55:00 +00:00
Tom Morris
6a14049652
Issue 401 - use default exception handling for ExportRows command instead of JSON response
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2093 7d457c2a-affb-35e4-300a-418c747d4874
2011-06-07 23:52:23 +00:00
Tom Morris
2cd3ae03d0
@Override annotations. No functional changes.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2092 7d457c2a-affb-35e4-300a-418c747d4874
2011-06-07 23:51:16 +00:00
Tom Morris
a52c25272e
Issue 342 - help text update
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2090 7d457c2a-affb-35e4-300a-418c747d4874
2011-06-06 22:38:50 +00:00
Tom Morris
eebc225abc
Add missing @Override annotations (issue 316, 317, 319, 320 among others)
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2089 7d457c2a-affb-35e4-300a-418c747d4874
2011-06-06 22:35:01 +00:00
Tom Morris
73acd497e9
Fix for issue 358 from Tomaz Solc. Don't return a NaN when comparing two 0-length word lists.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2088 7d457c2a-affb-35e4-300a-418c747d4874
2011-06-06 21:30:46 +00:00
David Huynh
11cf415ee8
Exposed more fields for each record.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2081 7d457c2a-affb-35e4-300a-418c747d4874
2011-06-06 20:19:20 +00:00
Tom Morris
4dc3ef8caa
Bump version to 2.1
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2080 7d457c2a-affb-35e4-300a-418c747d4874
2011-06-06 20:16:19 +00:00
David Huynh
b75a5efe71
Applied patch for Issue 222: save favorite transforms.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2079 7d457c2a-affb-35e4-300a-418c747d4874
2011-06-06 18:49:36 +00:00
David Huynh
f7c33fba45
Fixed issue 196: failure and error dialog attempting to remove columns
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2077 7d457c2a-affb-35e4-300a-418c747d4874
2011-06-05 04:31:51 +00:00
David Huynh
cecfa244e0
Changed to UTF-8 encoding
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2039 7d457c2a-affb-35e4-300a-418c747d4874
2011-04-06 21:09:21 +00:00
Stefano Mazzocchi
610de0d33a
adding Metaphone3 algorithm
...
Many thanks to Lawrence Philips for donating the code to us under the BSD license.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2029 7d457c2a-affb-35e4-300a-418c747d4874
2011-03-01 00:17:48 +00:00
Stefano Mazzocchi
87e7f9a7a4
remove unused variable
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2028 7d457c2a-affb-35e4-300a-418c747d4874
2011-02-25 22:51:58 +00:00
Tom Morris
c5312a2e6a
Issue 338 - patch from Thad Guidry to provide function which calls JSoup ownText() method
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2025 7d457c2a-affb-35e4-300a-418c747d4874
2011-02-23 19:40:35 +00:00
Tom Morris
5b9362e956
Issue 334 - Make sure URLs are encoded before using them.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2007 7d457c2a-affb-35e4-300a-418c747d4874
2011-02-15 23:15:09 +00:00
Tom Morris
06e2487189
Issue 276 - patch from pxb1... to fix character encoding issue with CreateProject command, slightly modified to preserve the request encoding if it has one
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2000 7d457c2a-affb-35e4-300a-418c747d4874
2011-02-04 03:15:12 +00:00
David Huynh
d7b482be06
Attempt at fixing issue 185. Will need someone else to verify.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1989 7d457c2a-affb-35e4-300a-418c747d4874
2011-01-20 22:49:36 +00:00
David Huynh
44652a3ee2
Make copy of Calendar object before modifying it. Also handle Date type.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1982 7d457c2a-affb-35e4-300a-418c747d4874
2011-01-10 23:06:28 +00:00
David Huynh
90794d5039
Started working on new import UI. Not much to see yet, but if you append ?new=1 to the index page URL then you see the new form. It can only upload a file at the moment.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1971 7d457c2a-affb-35e4-300a-418c747d4874
2011-01-02 23:09:08 +00:00
David Huynh
6fb2b05739
Fixed issue 294: "Exporting date type column to TSV/CSV shows java debugging information instead of value" with help from Gabriel Sjoberg.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1967 7d457c2a-affb-35e4-300a-418c747d4874
2010-12-28 15:54:24 +00:00
David Huynh
53442c5ef2
Handle the case where an excel cell has a formula but the cached result of that formula is an error.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1962 7d457c2a-affb-35e4-300a-418c747d4874
2010-12-25 21:41:21 +00:00
David Huynh
687e9064df
A shorter fix for toString() to handle Date than the last commit.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1961 7d457c2a-affb-35e4-300a-418c747d4874
2010-12-25 21:36:51 +00:00
David Huynh
0ff40eabbd
toString() should handle Date, too, rather than just Calendar.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1960 7d457c2a-affb-35e4-300a-418c747d4874
2010-12-25 21:33:59 +00:00
Tom Morris
209f157656
RESOLVED - task 202: Sort text with accents
...
http://code.google.com/p/google-refine/issues/detail?id=202
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1951 7d457c2a-affb-35e4-300a-418c747d4874
2010-12-12 18:16:29 +00:00
Iain Sproat
f55f11cd0d
Adding classes to make it possible to parse HTML in GREL. Uses a small subset of methods from the JSoup library, licensed under the MIT license.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1948 7d457c2a-affb-35e4-300a-418c747d4874
2010-12-06 23:15:24 +00:00
Tom Morris
9aaa1c9919
Replace tabs with spaces. No functional change.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1947 7d457c2a-affb-35e4-300a-418c747d4874
2010-12-05 20:50:03 +00:00
Tom Morris
a560cb56df
Replace tabs with spaces. No functional changes.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1942 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-29 06:27:06 +00:00
Tom Morris
3a8f9306bd
Add some toString() methods to help with debugging
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1941 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-29 06:24:50 +00:00
Tom Morris
af20157532
Fix indentation so indent levels match logical block levels. No code changes.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1940 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-28 17:46:57 +00:00
Tom Morris
748b5699b9
Issue 61 - Turn on text coalescing and XML entity reference replacement
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1939 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-27 22:07:15 +00:00
Tom Morris
e19148c375
Make sure we at least log an error if the import fails
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1938 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-27 22:05:45 +00:00
Tom Morris
824f445530
Unused import
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1937 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-27 17:54:16 +00:00
Tom Morris
b9fa100d31
Don't try to save a null encoding
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1936 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-27 17:54:01 +00:00
Tom Morris
850c43d6f3
Issue 107 - set encoding on response
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1935 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-27 02:46:10 +00:00
Tom Morris
3d6458a0e5
Replace tabs with spaces
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1934 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-27 01:38:32 +00:00
Tom Morris
bc8637f638
Issue 257 - Don't return a String where a Date is required (using generics in Criterion API would prevent this kind of problem, but that's incompatible with the use of the Eval_Error class)
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1933 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-27 01:18:36 +00:00
Tom Morris
c7b0f4d024
Issue 184 - use default locale date formatting if no format string is specified
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1932 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-26 23:47:09 +00:00
Tom Morris
080ec5332e
Issue 237 - Make sure project's character encoding is always set. Lower minimum confidence threshold for guesser.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1931 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-26 22:23:31 +00:00
David Huynh
1e2af79851
Let's handle .tar files as well rather than requiring .tar.gz.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1919 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-21 03:00:43 +00:00
David Huynh
c496f1e941
Helped toward fixing issue 228: ButterflyServlet already tracks the ServletConfig, so there's no need for RefineServlet to do that, too.
...
Importing archive files has another big problem at the moment: even if the many files in a single archive file share several columns, they still cause columns with the same names to be created over and over again as each file gets imported. This is because each individual importer was written with the assumption that it imports into an empty project with no columns.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1918 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-21 02:58:15 +00:00
Iain Sproat
09fa36198c
Additions to GREL:
...
* Factorial function allowing variable steps
* GreatestCommonDenominator function
* LeastCommonMultiple function
* Multinomial function
* Quotient function
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1910 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-20 18:04:11 +00:00
Iain Sproat
43d0de2d8a
Fixed registered name of GREL combination function
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1909 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-20 12:15:31 +00:00
Iain Sproat
f1643565b8
Additions to GREL:
...
* modulo operator, %
* cos, sin and tan functions
* acos, asin, atan and atan2 functions
* cosh, sinh and tanh functions
* fact and combin functions
* degrees and radians functions
* odd and even functions
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1908 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-20 12:11:37 +00:00
Iain Sproat
1ec7cb9f7b
PI constant added to GREL
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1904 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-18 23:53:07 +00:00
Tom Morris
675714d03d
Add toString() methods to help with debugging
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1894 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-18 08:19:05 +00:00
Iain Sproat
dd333d5b43
Abs function now available in GREL
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1890 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-17 09:38:51 +00:00
Iain Sproat
74e9288229
Additional error dialog for Issue 188
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1858 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-11 14:25:46 +00:00
Iain Sproat
2f564589f5
Adding a Fixed Width data importer (Issue 85) and associated tests.
...
Although this importer is 'wired up', it requires a property "fixed-column-widths" which is not (yet) implemented in the UI. But the ImporterRegister.guessImporter method will probably select the CsvTsvImporter before the FixedWidthImporter anyway. I suggest that an improvement to the project creation UI and/or the guessImporter method will be required.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1857 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-11 13:15:41 +00:00
David Huynh
703d2dbd19
IsTest should catch errors and wrap them.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1833 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-08 21:19:25 +00:00
David Huynh
5d915be096
Numeric comparisons == and != should be special-cased, too.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1780 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-05 19:17:57 +00:00
David Huynh
fe08a43e0c
FunctionCall and ControlCall should catch exceptions and wrap them as EvalErrors.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1777 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-05 04:26:10 +00:00
David Huynh
faaca5beea
Fixed the GREL round function.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1749 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-01 20:28:42 +00:00
David Huynh
1f12bfb409
Fixed bug in HasFieldsListImpl where list members weren't tested for being null.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1735 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-01 18:13:53 +00:00
David Huynh
1eebe2e4a3
Fixed transpose-rows-into-columns command, which previously duplicated columns that precede the column being transposed.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1734 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-01 17:59:58 +00:00
David Huynh
764558c48a
In numeric bin index, count infinity values as errors
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1700 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-26 22:49:59 +00:00
David Huynh
8d422e2e54
Fixed Calendar vs. Date bug in time range facet
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1699 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-26 21:06:16 +00:00
David Huynh
8ccf9d1bf8
The judgment facet created after a recon operation is done should also show (blank) and (unreconciled) choices
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1696 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-26 04:57:02 +00:00
David Huynh
e601ad8d40
bug: autoMatch flag wasn't actually used before
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1627 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-25 01:41:07 +00:00
David Huynh
2d9e7c87f6
Increased recon batch size to 10 again. Various style tweaks. Polished up freebase extension's dialogs to be a bit more helpful
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1625 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-25 01:07:25 +00:00
David Huynh
345c1c62ac
Added new recon commands:
...
- clear recon data for all matching rows
- clear recon data for one cell
- clear recon data for similar cells
- copy recon judgments across columns
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1618 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-22 07:25:27 +00:00
David Huynh
5a17acfd70
Prepended license text to java source
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1613 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-20 20:45:52 +00:00
David Huynh
9b8206da29
Fixed new bug for query-based reconciliation introduced by factoring out the freebase extension
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1611 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-20 04:52:35 +00:00
Tom Morris
7dcd0c073d
Revert bad commit r1600
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1601 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-18 05:19:05 +00:00
Tom Morris
79c00bab36
Incomplete - task 157: Integrate Google Spreadsheet import/export plugin
...
http://code.google.com/p/google-refine/issues/detail?id=157
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1600 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-18 04:59:39 +00:00
David Huynh
e7184ec9ab
Deleted old empty protograph dirs. Use a default assign version even if running from trunk; this is so that we have at least some clue about an imported project file.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1598 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-18 04:18:09 +00:00
David Huynh
c8dcc10ab8
Be sure to use UTF-8 when saving data.txt, pool.txt, and change files.
...
Fix issue 163: Refine doesn't retain the characters for flat or sharp.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1588 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-17 23:43:02 +00:00
David Huynh
a62638e88d
For each recon group, try at least 3 times if the service keeps failing. Log more errors for debugging purposes.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1578 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-16 00:19:31 +00:00
Stefano Mazzocchi
f50880905e
fixed warnings
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1577 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-16 00:01:42 +00:00
Tom Morris
47dd5f8da6
Make sure the stream/writer is flushed in case the exporter forgets to do it
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1569 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-15 17:10:37 +00:00
Tom Morris
bbebb4d2dc
Add @Overrides so we get warned about API changes
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1565 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-15 13:26:25 +00:00
David Huynh
7e9df21b70
Exporters need to implement either WriterExporter or StreamExporter.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1558 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-15 06:18:20 +00:00
David Huynh
73042712ed
Made csv/tsv importer not trim whitespace even if "guess cells' types" is checked (for cells that are strings).
...
Updated csv tests to expect un-trimmed cells.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1557 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-15 05:30:15 +00:00
David Huynh
9e35ea3775
Better error message for numeric range facet if there's no numeric value.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1551 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-15 01:00:51 +00:00
Tom Morris
083abd4329
Refactor exporter interface along same lines as importer
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1547 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-14 21:33:50 +00:00
David Huynh
4ccdbc8716
Fixed bug in which a newly created and unedited project would never get saved because it had the same modified time and last save time.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1530 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-14 01:43:26 +00:00
David Huynh
dc49047092
We have previously changed the standard-reconcile acre app to return mids, but we still need to make sure its metadata says that its identifier space is mid, not id. And we need Refine to test for the mid identifier space as well.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1479 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-13 18:33:27 +00:00
David Huynh
a16df8f2d6
For unrecoverable projects, rename them with a suffix so the next time we won't try to recover them again.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1472 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-13 07:05:34 +00:00
David Huynh
91ffe71d17
Lowering recon batch size from 7 to 3 to avoid timeout problem. This is a temporary fix only for
...
Issue 156: Reconcile is not picking up alias hints or even type hints correctly
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1470 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-13 05:03:49 +00:00
David Huynh
208152b55c
Added .vt template for reporting errors with stacktraces.
...
Fixed Issue 155: Blank browser shown when non-GZIP format is detected during import
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1469 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-13 04:51:01 +00:00
David Huynh
7cd5a47fbf
We haven't been using non-split row parser, so we need to fix the trimming problem in the tsv/csv importer instead.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1467 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-12 23:24:16 +00:00
David Huynh
2d276fa1e6
Non split row parser shouldn't trim lines because whitespace is significant
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1465 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-12 22:45:30 +00:00
David Huynh
69c338c728
Text filter was throwing an exception if the column went away (which happened when the column got split).
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1464 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-11 23:15:13 +00:00
David Huynh
336a773069
Only try to create the workspace dir if it doesn't exist.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1463 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-11 23:04:06 +00:00
Tom Morris
c42c78dc0a
Log errors if things don't go as expected
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1462 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-11 22:28:22 +00:00
Iain Sproat
142591a090
Added a mention of the new JsonImporter to CHANGES.txt
...
Corrected the logger name in JsonImporter.java
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1455 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-08 07:58:59 +00:00
David Huynh
ad0d227ab3
Remove remaining Freebase related functionalities.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1453 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-08 02:04:47 +00:00
David Huynh
6ddd945a80
The Freebase functionalities have been extracted out in the last commit. We're removing them from the core module now. This is not a complete checkin. SVN is having some trouble with some directories.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1452 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-08 01:54:00 +00:00
Tom Morris
5040b06d9f
Make exceptions more specific for load errors. Still no error returned to user though (just hangs)
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1450 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-07 14:20:28 +00:00
Tom Morris
ea28784e8b
Don't save null project if load failed
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1449 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-07 14:19:42 +00:00
Stefano Mazzocchi
215165ed97
spell out tweezer parameters
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1444 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-07 08:20:46 +00:00
David Huynh
9ea477c80d
Allowed a single operation class to be registered under several names, so that we can rename operations (to better names) while maintaining backward compatibility.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1443 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-07 05:42:01 +00:00
David Huynh
1de5e7c00e
Renamed package gel to grel.
...
Replaced gel with grel in other places in the code base while maintaining backward compatibility.
Changed layout in expression preview dialog to accommodate long GREL name.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1442 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-07 05:19:35 +00:00
David Huynh
90d1111ebc
Added "project" argument to OverlayModel methods, as suggested by Fadi Maali.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1439 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-06 20:47:11 +00:00
David Huynh
3ba8e63249
Register Json importer.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1426 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-04 18:53:41 +00:00
Iain Sproat
d977f42f51
Changed behaviour of the XmlImporter to make it more permissive, and allow arrays within mixed elements to be used as candidates for importing to Refine.
...
This change has also allowed the JsonImporter to pass all its unit tests without any further modification.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1425 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-04 18:33:59 +00:00
Iain Sproat
ec9898ba92
Some tidying up of the XmlImporter which reduces the number of generic TreeParser tokens to a minimum - and should allow elements such as comments and CDATA to be ignored/skipped.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1422 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-04 15:02:09 +00:00
Iain Sproat
d3f223c196
The JsonImporter now passes all current unit tests.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1421 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-04 10:02:50 +00:00
Stefano Mazzocchi
2b9b38368f
use the new FreeQ 'refine' queue instead of the old 'gridworks' one
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1410 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-30 00:26:02 +00:00
Stefano Mazzocchi
b62e63306a
- make the correct version + revision available also to the java side (thru web.xml)
...
- add @Override metadata to the commands that were missing it
- make the version information appear even when using trunk (Fixes Issue 136)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1406 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-29 01:50:57 +00:00
David Huynh
935355cb50
Comments in an XML file caused the record detection code to fail, so we added an ignorable element type that we can skip over.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1392 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 19:16:43 +00:00
Iain Sproat
bd3ded0828
Correcting JsonImporter to use the correct parser.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1388 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 14:19:19 +00:00
Iain Sproat
855df20481
XmlImportUtilities no longer relies on XMLStreamConstants, and is now independent of any specific type of tree data (Xml or otherwise).
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1378 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 10:46:33 +00:00
Iain Sproat
b21961be89
Another small step towards making XmlImportUtilities generic for all tree structured data, and less XML centric. Some calls to XMLStreamConstant in XmlImportUtilities now work with a generic TreeParserToken, with methods to convert between TreeParserToken and XMLStreamConstant/JsonToken in the respective parsers.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1377 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 10:04:56 +00:00