Commit Graph

1096 Commits

Author SHA1 Message Date
Antonin Delpeuch
dbb071da30 Merge branch 'default-to-english' of https://github.com/RBGKew/OpenRefine into RBGKew-default-to-english 2017-08-09 14:07:22 +01:00
Jacky
275dac976e fix #137 2017-08-07 21:53:35 -04:00
Antonin Delpeuch
66eac0fae9 Ensure null values are not cached in URL fetching operation. Closes #1219. 2017-08-01 13:05:29 +01:00
jackyq2015
53baa5a833 put the correct params description 2017-07-28 20:37:20 -04:00
jackyq2015
4950d29074 add backward compatibility for cross function 2017-07-23 19:19:58 -04:00
Thad Guidry
7f92251ed1 Merge pull request #1210 from wetneb/extend
Add data extension capabilities to the reconciliation API
2017-07-17 18:01:37 -05:00
Antonin Delpeuch
84c06821ee Data extension tests 2017-07-16 11:47:12 +01:00
Antonin Delpeuch
05873f283d Integration of constraints with service-defined forms 2017-07-14 22:17:40 +01:00
Antonin Delpeuch
3eadefe613 Do not add reconciliation statistics on columns without types 2017-07-14 12:53:54 +01:00
Antonin Delpeuch
6501c235e8 Pass the identifier and schema spaces along to create better ReconCandidates 2017-07-14 12:30:39 +01:00
Antonin Delpeuch
cc991cab21 Add nicer spinning gif while preview is loading.
Fix bug of multiple ColumnInfo being generated.
2017-07-14 11:30:17 +01:00
Antonin Delpeuch
d99128c330 Retrieve types from the extend service 2017-07-06 21:15:37 +02:00
Antonin Delpeuch
ad3a174abd Starting to migrate data extension to standard reconciliation services 2017-07-04 23:14:19 +02:00
jackyq2015
1ee339cbbd cross function test suite. #1204 2017-06-28 08:12:36 -04:00
jackyq2015
f03be76475 Extend cross() function to take either a cell or a value #1204 2017-06-25 21:04:00 -04:00
Felix Lohmeier
2557cc5419 bugfix for new option autosave period 2017-06-24 22:42:49 +02:00
Felix Lohmeier
e54199a6f1 added options for initial java heap space and autosave period 2017-06-22 12:27:55 +02:00
Adi Eyal
09c00c6a19 Fixes #1181 2017-05-05 23:38:37 +02:00
Bob Harper
909df1b6a7 xor can also accept 2+ params, rewrite tests to be consistent 2017-04-27 11:20:48 +01:00
Bob Harper
ef4e039998 allow more than 2 AND and OR conditions 2017-04-26 20:51:58 +01:00
wangwenxiang
660df900d4 Fix bug: load wrong new value for RowStarChange 2017-03-15 12:54:01 +08:00
wangwenxiang
0314f49f36 Fix bug: load wrong new value for RowFlagChange 2017-03-15 10:39:33 +08:00
Jacky
912600f0bd Merge pull request #1178 from wetneb/url_caching
Add caching in URL fetching
2017-03-09 17:28:38 -05:00
Antonin Delpeuch
22124ac57e Add checkbox to disable caching 2017-03-09 00:21:34 +00:00
Antonin Delpeuch
32c232c2d6 Move to Guava's cache for ColumnAdditionByFetchingURLsOperation 2017-03-08 09:32:34 +00:00
Antonin Delpeuch
a9c4b0af16 Cache String, not URL, in ColumnAdditionByFetchingURLsOperation 2017-03-08 07:45:11 +00:00
Antonin Delpeuch
782a2f5b48 Add caching in URL fetching 2017-03-07 20:24:50 +00:00
Jacky
5aede573dc bump version to 2.7 2017-02-10 15:55:58 -05:00
Qi Cui
773151380e fix #1138. column transpose 2016-08-24 13:56:35 -04:00
Tom Morris
aa65bc5c18 Throw exception on error instead of logging to console 2016-05-17 15:10:09 -04:00
Tom Morris
6df822e5a6 Set ContentType to application/json 2016-05-17 15:10:09 -04:00
Tom Morris
5d45566455 Protect against NPE when content type is missing 2016-05-17 15:10:09 -04:00
Scott Wiedemann
16b0453b74 Update ToDate.java
Updating SimpleDateFormat api doc url for ToDate function.
2015-11-13 12:27:16 -07:00
Steffen Stundzig
7f5e58ef51 #1086 add support for quote character 2015-10-30 14:32:46 +01:00
Tom Morris
be7f880cbe Revert addition of synchronized methods 2015-10-16 19:33:15 -04:00
Tom Morris
e3858da843 Escape cell data for HTML - fixes #1049 2015-10-16 15:41:03 -04:00
Martin Magdinier
8b4a1d577a Merge pull request #1079 from RefinePro/issue-796
fixed issue #796 Columnize by key/value columns creates empty lines
2015-10-08 14:01:07 -04:00
jackyq2015
7a2a0eb52f fixed issue #796 Columnize by key/value columns creates empty lines 2015-09-29 20:12:05 -04:00
Tom Morris
48681e8877 Move assert where it belongs 2015-09-25 20:01:27 -04:00
Tom Morris
be936a86eb Clean up PR #1055 2015-09-25 19:01:16 -04:00
Tom Morris
de66afa512 Revert " Use new algorithm for levenshtein clustering" 2015-09-25 16:44:25 -04:00
Thad Guidry
175f4a5319 Merge pull request #1047 from lemmingapex/master
Fixed #1046 Combine xls and xlsx formats by inspecting file header information in ExcelImporter
2015-09-21 20:33:05 -05:00
Thad Guidry
94e219042e Merge pull request #1007 from lispc/master
Use new algorithm for levenshtein clustering
2015-09-21 20:23:45 -05:00
Thad Guidry
85ffce60d2 Merge pull request #1070 from RefinePro/issue-995
fix issue #995
2015-09-21 20:12:51 -05:00
jackyq2015
d671d7784b fix issue #995 2015-09-21 21:03:25 -04:00
magdmartin
ab56b73db9 Merge pull request #993 from RefinePro/OpenRefine-trunk
prevent the multiple sorting
2015-09-20 09:32:17 -04:00
magdmartin
b635f4e067 Merge pull request #1055 from RefinePro/issue-512
fix issue #512 to save the file location as a table column
2015-09-20 09:31:16 -04:00
magdmartin
ab6e2951e9 Merge pull request #1051 from RefinePro/issue-1015
Issue 1015. add the meta utf-8
2015-09-20 09:28:10 -04:00
jackyq2015
4e6f584cde fix issue #512 to save the file location as a table column 2015-08-27 15:13:20 -04:00
jackyq2015
dc7535c63e 1. take out of issue #1021 fix which was mistakenly put in
2. fix the expected value for JUNIT
2015-08-06 21:31:37 -04:00
Scott Wiedemann
5eab8893cc Fixed #1046 Combine xls and xlsx formats by inspecting file header information in ExcelImporter. 2015-07-30 16:19:26 -06:00
jackyq2015
819e1ba5c6 patch for issue #708. fix few hanging UIs when importing file 2015-07-18 10:27:35 -04:00
lispc
43e441a4d0 Use new algorithm for levenshtein clustering 2015-06-01 20:35:21 +08:00
Jacky
ca862970a4 prevent the multiple sorting 2015-05-01 15:04:51 -04:00
magdmartin
383f8c5e50 Changed GREL to *General Refine Expression Language* as agreed in 2013 when drafting *Using OpenRefine* 2015-04-21 10:35:52 -04:00
Matthew Blissett
5cdc6d7b5a Fallback to English language to avoid need to maintain 'default' translation files. 2015-02-10 12:33:08 +00:00
QI CUI
495dcd7bd5 use the LinkedHashMap instead of HashMap to preserve the retrieval order 2015-01-30 15:03:20 -05:00
Tom Morris
83da996a36 Change to Java 5 loop syntax 2014-12-23 00:04:24 -05:00
Tom Morris
ddfaecb3e6 Merge pull request #914 from opendatatrentino/rev-masschange
Fix wrong revert order in MassChange
2014-12-22 23:50:31 -05:00
David Leoni
4d2b90ad60 added MassChangeTests 2014-12-22 12:23:49 +01:00
Tom Morris
ea723413cb Use StringUtils.toString() convenience method 2014-12-21 11:39:34 -05:00
Tom Morris
4eb6eb6eda Merge pull request #915 from opendatatrentino/fixNullCellToString
Fixes Cell.toString failing on null value
2014-12-21 11:13:34 -05:00
Matthew Blissett
f3e2b9622a Add charset=UTF-8 to HTTP Content-Type for reconciliation queries.
Fixes problem where non-ASCII characters would be URL encoded as UTF-8, but interpreted according to the whims of the server.
2014-11-28 10:45:22 +00:00
David Leoni
c3884c57f5 Fixes Cell.toString failing on null value 2014-11-27 18:45:01 +01:00
David Leoni
d29bf230b5 Fixes wrong revert order in MassChange 2014-11-27 18:12:54 +01:00
Thad Guidry
cdda1edcf0 Fixed issue with null cells after Fetch URL
Some websites do not set the charset= properly and use enclosing quotes.  Tested and Verified.
2014-08-13 21:39:30 -05:00
Tom Morris
536493c5d3 Fix AbstractMethodError 500 - fixes #589 2014-08-05 14:55:45 -04:00
Tom Morris
2fa9cf11c8 Merge pull request #859 from Arcadelia/Job-lastTouched-fix
Initialized ImportingJob.lastTouched
2014-07-03 10:36:48 -04:00
Tom Morris
655e0b0dc1 Wrap conditional statement in block 2014-07-03 10:35:24 -04:00
Tom Morris
b21cb56149 Merge pull request #852 from Arcadelia/Duplicate-job-id-fix
Import job duplicate id fix
2014-07-03 10:34:29 -04:00
Tom Morris
4333b1b2e7 Merge pull request #881 from zsxwing/simple-date-format-bug
Put ISO8601_FORMAT into ThreadLocal to fix the concurrency issue
2014-07-03 10:15:03 -04:00
Tom Morris
d106d61b25 Improve error messages - fixes #878 2014-05-30 01:47:22 -04:00
Tom Morris
5799c3d92b Synchronize access to processes list - fixes #862 2014-05-30 01:47:21 -04:00
zsxwing
4ee8e079c9 Put ISO8601_FORMAT into ThreadLocal to fix the concurrency issue 2014-05-30 11:45:28 +08:00
Tom Morris
a4d03968a5 Merge pull request #867 from abhillman/exceloutput255bugfix
Report error to user when attempting to export >255 columns, rather than generic 500 ISE
2014-04-20 23:43:19 -04:00
Aryeh Hillman
2bf35e5f0d Fix when exporting to excel files
When exporting to excel, there cannot be more than 255 columns.
If there are more columns than that, we write "ERROR: TOO MANY
COLUMNS" to the 255th column. Formerly, OpenRefine reported
a 500 Server error.
2014-04-12 16:41:54 -07:00
Frank Wennerdahl
8c02a13429 Initialized ImportingJob.lastTouched
Prevents the CleaningTimerTask from disposing newly created
ImportingJobs which have not yet been touched.
2014-02-19 16:02:45 +01:00
Frank Wennerdahl
a0d4eb0058 Job id duplicate fix
Changed how job id's are created to avoid the same id to be assigned to
two concurrent jobs.
2014-02-05 12:21:50 +01:00
Frank Wennerdahl
6dedae37a1 Fixed too frequent job cleanups
The ImportingManager cleans up jobs that have not been touched in 60ms.
According to comment this should be 60 minutes but was changed in
4529310237.
2014-02-05 11:07:41 +01:00
Tom Morris
bc801546cc Remove references to obsolete splitIntoColumns option 2013-09-18 18:44:58 -04:00
Tom Morris
4f2ebed676 Make localization language list dynamic - fixes #807
- refactor LoadLanguageCommand so language loading can be reused
- add GetLanguagesCommand for the server
- change GUI to fetch language list and update selection list with it
2013-09-18 13:16:24 -04:00
Tom Morris
1261734f15 Partial solution for #816 plus improved conversion test coverage 2013-09-18 11:14:48 -04:00
Tom Morris
d84f897ae0 Improve help message to specify an integer is returned 2013-09-18 11:12:34 -04:00
Tom Morris
f344e3da1c Return "null" for toString(null) - fixes #783
- also fixed grammar in error message
2013-09-18 10:20:17 -04:00
Tom Morris
daed3bd90c Move MARC->XML conversion to earlier in process - issue #794
- functional now, but probably not good enough to release yet
2013-09-17 19:19:50 -04:00
Tom Morris
6bd6a5934b Start wiring up MARC importer - issue #794 2013-09-17 17:17:23 -04:00
Tom Morris
cce480ff38 Fix implementation for #466 to handle default empty string 2013-09-04 18:59:13 -04:00
Tom Morris
889245fdf4 Make the number of reconciliation results configurable - closes #466 2013-09-04 18:07:12 -04:00
Thad Guidry
f2c4e3ab48 Added ability to extract MILLISECOND to datePart (milliseconds,ms,S) 2013-08-30 09:09:54 -05:00
Tom Morris
c68c1bb2b1 Upgrade to Clojure 1.5.1 & switch to clojure-slim JAR - #792 2013-08-26 19:40:37 -04:00
Tom Morris
62b8c476f1 Use Java's built-in Number formatter instead of ICU4J which is
massive - #792
2013-08-26 15:47:12 -04:00
Tom Morris
4529310237 Switch from TimerTask to ScheduledExecutorService for more robustness 2013-08-18 11:31:03 -04:00
Tom Morris
e93bfa798e Use iterator when removing to avoid ConcurrentModificationException -
fixes #652
2013-08-17 13:45:22 -04:00
Tom Morris
3315136681 Allow reinitialization of ProjectManager singleton - fixes #787 2013-08-17 12:47:57 -04:00
Tom Morris
25f02dd9b9 Fix Java 6 incompatibility 2013-08-15 15:57:24 -04:00
Tom Morris
fa072df85c Add locale support to toDate() - fixes #729 2013-08-15 15:19:01 -04:00
Tom Morris
ab42df6ea3 Merge pull request #658 from Arcadelia/CSV_Multi-char-separator_support
Support for multi-char-separators in CSV
2013-08-14 07:29:45 -07:00
Tom Morris
37d8abc114 Minor improvement to recon error handling 2013-08-10 18:03:06 -04:00
Tom Morris
1d8784e059 Make workspace saving and loading more robust - fixes #528
- don't overwrite old files if we get an error writing new ones
- don't write unchanged data
- keep backup files around until next write rather than deleting
immediately
- attempt to recreate missing metadata as best as possible
2013-08-09 19:53:53 -04:00
Tom Morris
579d71b7eb Switch back to NUL character for quote now that OpenCSV handles it -
fixes #653
2013-08-07 17:07:17 -04:00
Tom Morris
7b5b549113 More project saving changes for #528
- reduce project retention in memory from 1 hr to 15 min.
- free all unmodified projects if we get an error on save (we could be
running low on memory)
- make sure exceptions propagate up to where they can be usefully
handled
2013-08-05 14:13:56 -04:00
Tom Morris
190a031a8a Comments only. No code changes. 2013-08-05 14:11:06 -04:00
Tom Morris
3500f20e47 Save all modified projects before importing new one - hopefully helps
#528
2013-08-05 14:10:26 -04:00
Tom Morris
57f5e9873d Add Javadoc. No code changes. 2013-08-05 13:08:30 -04:00
Tom Morris
c3cab0524a Narrow exceptions thrown and let them propagate up so we know
workspace file isn't valid - first step for #528
2013-08-05 13:08:02 -04:00
Tom Morris
a7273625d7 Add support for Basic Authentication over HTTPS - addresses #217 2013-08-02 19:15:24 -04:00
Tom Morris
4f7da9d18e Switch to Apache HTTP client for downloads - fixes #748 2013-08-02 18:13:41 -04:00
Tom Morris
d7531bbbd8 Handle quoted fields with embedded new lines. Sort separators by score
rather than just standard deviation
2013-08-02 17:59:09 -04:00
Tom Morris
f4ff227340 Clean up localization - fixes #760, modifies pull request #755
- make all file loading relative to module base
- move core language files into appropriate place
- eliminate all SetLanguage commands and use SetPreference instead
- eliminate all LoadLanguage commands except for core's
- fix duplicate keys in JSON language files
- remove BOM from JSON language files

OPEN - task 760: Translations not being loaded from built kit 
http://github.com/OpenRefine/OpenRefine/issues/issue/760
2013-07-31 00:31:31 -04:00
Tom Morris
9450d483ce Fix up line endings 2013-07-29 15:49:20 -04:00
Tom Morris
3003c1a709 Make importers more robust to preview errors when someone selects the
wrong importer/parser
2013-07-27 13:35:12 -04:00
Tom Morris
57ca70132c Turn all import conversions off by default - fixes #478 2013-07-27 13:32:26 -04:00
Tom Morris
5123dad6a8 More conservative approach for locking of jobs table 2013-07-26 18:51:08 -04:00
Tom Morris
0dc14af1aa Fix bug in refactoring of ImportingJob from commit
1e5f89e84c
2013-07-26 18:50:03 -04:00
Tom Morris
46a1e198d8 Recompute max cell index when rebuilding maps in ColumnModel - fixes #406 2013-07-26 18:48:20 -04:00
Tom Morris
7edc550618 Give a reasonable error message on Excel 95 import failure - fixes #564 2013-07-26 16:24:56 -04:00
Tom Morris
dc4d04c132 Allow arrays containing null in Filter & ForEach - fixes #741 2013-07-26 15:20:44 -04:00
Tom Morris
1e5f89e84c Centralize handling of import job config object & synchronize to allow
multiple accessors
2013-07-25 15:41:08 -04:00
Tom Morris
dc206e1889 Switch to ConcurrentHashMap for jobs table to allow multiple accessors 2013-07-25 15:36:54 -04:00
Tom Morris
0ff2d7ed9f Simplify implementation from pull request #728 2013-07-25 13:45:44 -04:00
Tom Morris
6dd4b8ea23 Add tests for boolean functions and tighten up error handling 2013-07-25 13:45:04 -04:00
Tom Morris
2c2c0d3d68 Merge pull request #728 from jmcastagnetto/master
Implements Xor operation
2013-07-25 10:00:11 -07:00
Blakko
6e90bc41f6 Merge remote-tracking branch 'origin/master' into internationalization
Conflicts:
	extensions/freebase/module/scripts/dialogs/schema-alignment/schema-alignment-dialog.html
	main/webapp/modules/core/index.vt
	main/webapp/modules/core/project.vt
	main/webapp/modules/core/scripts/project/browsing-engine.js
	main/webapp/modules/core/scripts/project/history-panel.html
2013-07-25 11:07:59 +02:00
Blakko
e6e6c8c002 Added a "Language Settings" menu at index
Now the language manually set has priority over the browser lang
Update translations
2013-07-12 11:12:33 +02:00
Tom Morris
92e4427c39 Adding a TODO 2013-07-10 15:13:22 -04:00
Tom Morris
32773122c4 Fix CollationKey creation - fixes #753 2013-07-10 15:12:49 -04:00
Blakko
552b0bf94b Internationalization of the index part (create/open/update) of refine 2013-07-02 13:40:50 +02:00
Tom Morris
5b6bc888f7 Fix template escape processing. Fixes #752. 2013-06-30 12:21:26 -04:00
Tom Morris
a3b4b45e4e Support non-string types in facetCount() - fixes #591 2013-06-23 12:04:48 -04:00
Tom Morris
51c1bc4a2f Refactor default toString with date support into separate utility 2013-06-23 12:02:13 -04:00
Tom Morris
c961bb64de Flush all column caches on row removals/changes. Fixes issue 567. 2013-06-22 18:44:26 -04:00
Tom Morris
fd58bd3327 Move documentation to Javadoc where it's visible 2013-06-22 16:27:18 -04:00
Tom Morris
6e88d068ee Throw a narrower exception 2013-06-22 16:26:45 -04:00
Jesus M. Castagnetto
0795bd8422 resolved .gitignore conflict 2013-06-19 12:10:32 -05:00
Jesus M. Castagnetto
b09bb4463e fix error in index caught by thadguidry 2013-06-19 11:21:26 -05:00
Tom Morris
b91fc8a2b1 Use CollationKeys when sorting text. Fixes issue 738 2013-06-17 15:51:29 -04:00
Tom Morris
067fcacec7 Clean up to pass tests:
- don't include TAB in control characters which get stripped so we can
use it for splitting
- remove trailing space from normalize strings
2013-05-31 17:06:03 -04:00
Tom Morris
000c0a38a8 Compute delay from request issue, not response return. Fixes #721 2013-05-26 10:13:16 -04:00
Tom Morris
4a5d3d4662 Convert dates to ISO 8601 for reconciliation. Fixes #688. 2013-05-26 10:08:55 -04:00
Tom Morris
7615db97cf Add Javadoc, clean up variable naming. No functional change. 2013-05-26 10:07:37 -04:00
Tom Morris
36dd95c263 Add TODO for record mode operation 2013-05-26 07:54:33 -04:00
Tom Morris
567da6aa9f Normalize line endings
Add .gitattributes & do one-time normalization of line endings
2013-03-23 18:46:20 -04:00
Tom Morris
6a91b5d75b Use InputStream instead of Reader for JSON import - fixes #698 2013-03-23 18:36:05 -04:00
Tom Morris
6b3592982e Remove O(n^2) issue in tree importers - fixes #699
- Add sparse/based list implementation for ImportRecord
2013-03-23 12:02:51 -04:00
Tom Morris
f78dfadcf3 Clean up tree import utilities for #699
- lazy allocate objects
- conditionalize logging to prevent calls to StringBuilder & toString()

These are secondary issues, but still worth cleaning up.
2013-03-23 11:56:58 -04:00
Tom Morris
0a2ba1b1ae Switch from LinkedList to ArrayList
Just a simple list. No need for extra overhead.
2013-03-23 08:16:23 -04:00
Tom Morris
bfa7c34d17 Merge pull request #659 - closes #659 2013-03-18 21:24:01 -04:00
Tom Morris
31cffa1181 Merge remote-tracking branch 'upstream/master' 2013-03-18 21:16:55 -04:00
Tom Morris
8a61cf731b Merge pull request #664 from Arcadelia/Preserve_Quotes
Quotes should not be removed from values
2013-03-18 18:12:51 -07:00
Tom Morris
fe943fe3ea Flag English-specific stopwords for cleanup 2013-03-18 20:20:46 -04:00
Tom Morris
7b9f6836e1 Update key & id recon to new Freebase APIs - part of #696 2013-03-12 16:50:23 -04:00
Tom Morris
7578d3375f Add logger and logging
- fix exception printing that goes nowhere
- make logger available for subclasses to use
2013-03-11 13:14:20 -04:00
Tom Morris
a2a8f4af2e Patch applied - closed #315 2013-03-06 21:45:54 -05:00
Tom Morris
d8d82bf8b7 Clean up a couple more format guessing issues left over from #685 2013-03-06 20:39:39 -05:00
Tom Morris
369bfffb2f Don't guess field widths unless we have at least 3 lines
- Investigation of #685 showed that single line files were being guessed
as fixed field width
2013-03-04 17:47:06 -05:00
Tom Morris
6b676f7513 Handle MIME media types which have charset param - fixes #685 2013-03-04 17:45:34 -05:00
Tom Morris
10bd7e3b75 Make upper bound of time facet inclusive - fixes issue #648 2013-03-03 16:06:20 -05:00
Tom Morris
eba03fc69e Protect joins map with mutex - fixes issue #652 2013-03-03 09:36:43 -05:00
Tom Morris
7b3379afc7 fix range check in getFields - fixes issue 687 2013-02-26 16:35:21 -05:00
Tom Morris
389e762251 Merge remote-tracking branch 'upstream/master' 2013-02-26 00:01:06 -05:00
Tom Morris
95e13eac50 Improve recon error handling 2013-02-26 00:00:03 -05:00
Tom Morris
50888c6f2e Merge pull request #666 from Arcadelia/Temp-file_removal
Fixed removal of upload temp files
2013-02-11 15:11:24 -08:00
Tom Morris
1033ce973e TODO about memory usage 2013-02-03 15:56:54 -05:00
Jesus M. Castagnetto
71f3196048 added comment on implementation 2013-02-01 23:45:43 -05:00
Jesus M. Castagnetto
36d2c4ac44 Added full text of BSD 2-clause 2013-02-01 23:44:35 -05:00
Jesus M. Castagnetto
df450b20f7 Registering new XOR command 2013-02-01 22:42:01 -05:00
Jesus Castagnetto
fec35a8bc6 Update main/src/com/google/refine/expr/functions/booleans/Xor.java 2013-02-01 21:07:42 -05:00
Jesus Castagnetto
ebec459cfd indentation change 2013-02-01 21:00:36 -05:00
Jesus Castagnetto
473e2f367f Implementing Xor operation 2013-02-01 17:59:16 -08:00
Tom Morris
c0347225b8 Switch escape character from NUL to DEL in hopes that it's rarer. 2013-02-01 17:12:07 -05:00
Frank Wennerdahl
2c59a0059f Fixed removal of upload temp files
Fixed an issue with an unclosed stream preventing upload temp files from
being removed after use. Also removed the use of FileCleaningTracker and
instead added manual removal of all tempfiles. By doing this the reaper
threads in FileCleaningTracker are avoided and files are removed
directly after use.
2013-01-24 09:59:09 +01:00
Frank Wennerdahl
64cf62e081 Fixed history and header update in IE
Due to Internet Explorer caching GET requests the Undo/Redo list and
column headers were not updated, leaving essential parts of the user
interface crippled even if Google Frame is installed. Adding
Cache-Control headers to the responses fixes this.
2013-01-24 09:39:12 +01:00
Frank Wennerdahl
1f7ab046c7 Quotes should not be removed from values
Leading quotation marks should not be removed from values. If they have
been left by the importing parser they should be considered part of the
value.
2013-01-24 09:04:17 +01:00
Frank Wennerdahl
ebdc40ad71 Added CSV quote options
Added two additional CSV options, one for parsing and one for export.

Specifying strict quotes when parsing will ignore all data not quoted.
Specifying quote all when exporting will enclose all values in quotes.

No front-end changes made, just added the support for the options in the
requests.
2013-01-21 08:21:16 +01:00
Frank Wennerdahl
f837643f1e Support for multi-char-separators in CSV
This change requires that the following patch is applied to OpenCSV:

http://sourceforge.net/tracker/index.php?func=detail&aid=3599477&group_id=148905&atid=773543
2013-01-18 16:28:27 +01:00
Tom Morris
33aa1132d7 Clarify wording/naming of blank rows export option - fixes issue #651
- clarify that it refers to all non-null cells
- rename variables without compatibility constraints to match actual
function
2013-01-14 16:36:09 -05:00
Tom Morris
0bd2104a16 Issue 630: Change branding from Google Refine to OpenRefine
** The first native GitHub commit (i.e. not one converted from SVN) **
Change Google Refine to OpenRefine or just Refine.  
Change icon filenames and add some placeholder icons
2012-10-18 19:40:31 -04:00
Tom Morris
068e0916a2 FIXED - task 587: Correct initialization of the temporary directory - patch from the Wikier project
http://code.google.com/p/google-refine/issues/detail?id=587
https://bitbucket.org/wikier/google-refine/changeset/f3dbdb16a320#chg-main/src/com/google/refine/RefineServlet.java

git-svn-id: http://google-refine.googlecode.com/svn/trunk@2583 7d457c2a-affb-35e4-300a-418c747d4874
2012-10-13 15:58:44 +00:00
Tom Morris
4d48741ce0 FIXED - task 574: create safe sheet names for Excel export - patch from jd@tekii.com.ar
http://code.google.com/p/google-refine/issues/detail?id=574

git-svn-id: http://google-refine.googlecode.com/svn/trunk@2582 7d457c2a-affb-35e4-300a-418c747d4874
2012-10-12 23:05:17 +00:00
Tom Morris
ca2e959957 FIXED - task 529: Add support for key/value transpose with only two columns as well as repeating key fields in a single record.
http://code.google.com/p/google-refine/issues/detail?id=529

git-svn-id: http://google-refine.googlecode.com/svn/trunk@2574 7d457c2a-affb-35e4-300a-418c747d4874
2012-10-05 23:31:25 +00:00
Tom Morris
ffe674729c Just a little Javadoc. No functional changes.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2573 7d457c2a-affb-35e4-300a-418c747d4874
2012-10-05 21:10:32 +00:00
Tom Morris
2c52a00f55 Fixed - issue 544,600,618: Clean up handling of compressed files & archives with multi-segment paths
http://code.google.com/p/google-refine/issues/detail?id=600


git-svn-id: http://google-refine.googlecode.com/svn/trunk@2569 7d457c2a-affb-35e4-300a-418c747d4874
2012-09-22 18:08:56 +00:00
Tom Morris
748e205ae8 FIXED - task 616: Support bzip2 decompression on import
http://code.google.com/p/google-refine/issues/detail?id=616

git-svn-id: http://google-refine.googlecode.com/svn/trunk@2568 7d457c2a-affb-35e4-300a-418c747d4874
2012-09-22 16:00:42 +00:00
Tom Morris
27e3c0c8dc FIXED - task 614: Use same instance of OAuthProvider in OAuth dance. Patch supplied by sdeo@google.com
http://code.google.com/p/google-refine/issues/detail?id=614

git-svn-id: http://google-refine.googlecode.com/svn/trunk@2566 7d457c2a-affb-35e4-300a-418c747d4874
2012-09-19 23:16:29 +00:00
Tom Morris
b3f5fada95 FIXED - task 578 & 596: Clean up JSON importer
http://code.google.com/p/google-refine/issues/detail?id=578
http://code.google.com/p/google-refine/issues/detail?id=596

Extend tree parser framework to allow any Serializable instead of just Strings. Use this in JSON importer to: Import keywords null, true, false; Import empty strings and don't trim whitespace from strings on import;  Import numbers directly instead of importing them as text and then parsing them ourselves. Add tests to verify all this stuff

git-svn-id: http://google-refine.googlecode.com/svn/trunk@2543 7d457c2a-affb-35e4-300a-418c747d4874
2012-09-08 01:20:25 +00:00
Tom Morris
93d6e176d6 Task 478: Default "guess datatypes" to False so importers which don't specify it (e.g. gData & Excel) aren't affected
http://code.google.com/p/google-refine/issues/detail?id=478

git-svn-id: http://google-refine.googlecode.com/svn/trunk@2541 7d457c2a-affb-35e4-300a-418c747d4874
2012-09-07 21:17:34 +00:00
Tom Morris
83dce305cb FIXED - task 432: cross() failing - flush join cache table when column changes
http://code.google.com/p/google-refine/issues/detail?id=432

git-svn-id: http://google-refine.googlecode.com/svn/trunk@2539 7d457c2a-affb-35e4-300a-418c747d4874
2012-08-30 16:31:22 +00:00
Tom Morris
9b54a8f29e FIXED - task 559: Deadlock between autosave thread and history code
http://code.google.com/p/google-refine/issues/detail?id=559

git-svn-id: http://google-refine.googlecode.com/svn/trunk@2538 7d457c2a-affb-35e4-300a-418c747d4874
2012-08-30 16:22:28 +00:00
Stefano Mazzocchi
ba89daec1c make oauth against freebase work again in chrome
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2537 7d457c2a-affb-35e4-300a-418c747d4874
2012-08-23 20:58:08 +00:00
Tom Morris
12a61b6ec6 task 603: range check column move commands
http://code.google.com/p/google-refine/issues/detail?id=603

git-svn-id: http://google-refine.googlecode.com/svn/trunk@2534 7d457c2a-affb-35e4-300a-418c747d4874
2012-08-18 22:01:23 +00:00
Tom Morris
202018fac4 Add Javadoc. No code changes.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2533 7d457c2a-affb-35e4-300a-418c747d4874
2012-08-18 22:00:41 +00:00
Tom Morris
4bb6c43982 task 604: add Guava to main project so that we're not dependent on an extension
http://code.google.com/p/google-refine/issues/detail?id=604

git-svn-id: http://google-refine.googlecode.com/svn/trunk@2531 7d457c2a-affb-35e4-300a-418c747d4874
2012-08-15 13:33:17 +00:00
Tom Morris
1e043dcc94 FIXED - task 604: The common transform “Trim leading and trailing whitespace” doesn’t trim non-breaking spaces
http://code.google.com/p/google-refine/issues/detail?id=604

git-svn-id: http://google-refine.googlecode.com/svn/trunk@2529 7d457c2a-affb-35e4-300a-418c747d4874
2012-08-14 23:00:13 +00:00
Tom Morris
f29f77e8f8 STARTED - task 604: The common transform “Trim leading and trailing whitespace” doesn’t trim non-breaking spaces
http://code.google.com/p/google-refine/issues/detail?id=604

git-svn-id: http://google-refine.googlecode.com/svn/trunk@2528 7d457c2a-affb-35e4-300a-418c747d4874
2012-08-14 20:09:49 +00:00
Tom Morris
4bf212c03d FIXED - task 154: Can't import RDF/XML Data
http://code.google.com/p/google-refine/issues/detail?id=154

git-svn-id: http://google-refine.googlecode.com/svn/trunk@2526 7d457c2a-affb-35e4-300a-418c747d4874
2012-08-05 16:31:41 +00:00
Tom Morris
5881addac8 Throw an exception if unsupported verb is used
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2525 7d457c2a-affb-35e4-300a-418c747d4874
2012-08-05 15:36:23 +00:00
Tom Morris
b2ae74d23f FIXED - task 586: Only one parse date format is attempted from list in toDate(format1,format2)
http://code.google.com/p/google-refine/issues/detail?id=586

git-svn-id: http://google-refine.googlecode.com/svn/trunk@2520 7d457c2a-affb-35e4-300a-418c747d4874
2012-08-03 18:01:01 +00:00
Tom Morris
4319314675 FIXED - task 594: Date diff function doesn't work for two Calendar objects
http://code.google.com/p/google-refine/issues/detail?id=594

git-svn-id: http://google-refine.googlecode.com/svn/trunk@2519 7d457c2a-affb-35e4-300a-418c747d4874
2012-08-02 21:41:19 +00:00
Tom Morris
efa58630cf Add constructor that takes a Throwable to eliminate redundant code from callers.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2518 7d457c2a-affb-35e4-300a-418c747d4874
2012-08-02 21:38:00 +00:00
Stefano Mazzocchi
2cb31b8b29 fixing oauth problems with redirection for the Freebase API
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2516 7d457c2a-affb-35e4-300a-418c747d4874
2012-08-01 21:46:53 +00:00
David Huynh
4cfb921082 Added getStringKey() method for when it is difficult to generate integer keys that don't collide
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2515 7d457c2a-affb-35e4-300a-418c747d4874
2012-07-19 00:25:41 +00:00
Stefano Mazzocchi
6e41f4ad91 make the latest eclipse happy (it triggers a warning)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2513 7d457c2a-affb-35e4-300a-418c747d4874
2012-07-12 01:55:11 +00:00
Stefano Mazzocchi
bccea8cebe we could be leaking file descriptors here
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2506 7d457c2a-affb-35e4-300a-418c747d4874
2012-06-30 07:05:08 +00:00
Stefano Mazzocchi
f84dcff900 moving oauth authorize and deauthorize into the core module because they are reusable across extensions
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2505 7d457c2a-affb-35e4-300a-418c747d4874
2012-06-29 19:39:42 +00:00
Tom Morris
8872c1b0a1 Keep track of when we have unsaved preference changes
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2502 7d457c2a-affb-35e4-300a-418c747d4874
2012-06-02 21:06:46 +00:00
Tom Morris
a0812c5751 Be slightly more tolerant of weird spreadsheet data
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2501 7d457c2a-affb-35e4-300a-418c747d4874
2012-06-02 21:00:30 +00:00
Tom Morris
c47b1e0ab7 Mark project as modified when metadata is changed
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2491 7d457c2a-affb-35e4-300a-418c747d4874
2012-04-14 14:10:11 +00:00
Tom Morris
8d22ede1f8 Issue 554 - rank formats *before* serializing them.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2482 7d457c2a-affb-35e4-300a-418c747d4874
2012-03-16 20:21:57 +00:00
Tom Morris
b3f8ce83c1 Issue 553 - Make sure we have a usable filename when importing from a URL
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2481 7d457c2a-affb-35e4-300a-418c747d4874
2012-03-16 20:16:18 +00:00
Tom Morris
51c586bc2c Issue 543 - Handle HTTP responses with Content-Encoding of gzip
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2480 7d457c2a-affb-35e4-300a-418c747d4874
2012-03-16 20:12:10 +00:00
Tom Morris
a8cb23ca51 Issue 544 - preserve directory path after decompressing file
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2479 7d457c2a-affb-35e4-300a-418c747d4874
2012-03-16 20:06:54 +00:00
Tom Morris
e97e7523b2 Issue 548 - Convert non-strings to strings before escaping
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2463 7d457c2a-affb-35e4-300a-418c747d4874
2012-03-14 03:06:11 +00:00
Tom Morris
18b780bebe Issue 517 - Fix combin() function to a) increase upper limit and b) keep it from continually recomputing the same values in recursion
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2459 7d457c2a-affb-35e4-300a-418c747d4874
2012-03-08 22:53:21 +00:00
Tom Morris
28ff2295fd Issue 490 - Handle separator guessing for CSVs with quoted fields containing commas
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2458 7d457c2a-affb-35e4-300a-418c747d4874
2012-03-08 15:53:55 +00:00
Tom Morris
9a680e8307 Switch to class name for logging, per convention
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2457 7d457c2a-affb-35e4-300a-418c747d4874
2012-03-08 14:53:27 +00:00
Tom Morris
ddd3680128 Add a TODO for recon failure retries on HTTP 500s - no functional changes
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2455 7d457c2a-affb-35e4-300a-418c747d4874
2012-03-08 14:45:53 +00:00
Tom Morris
5a962b1768 Issue 534 - Attempt to recover recon links which have become corrupted
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2454 7d457c2a-affb-35e4-300a-418c747d4874
2012-03-08 00:37:29 +00:00
Tom Morris
dbdbd906b7 Issue 547 - Decompress kmz files
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2453 7d457c2a-affb-35e4-300a-418c747d4874
2012-03-08 00:29:25 +00:00
Tom Morris
4a99abf25d Issue 542 - allow integers to be converted to dates
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2450 7d457c2a-affb-35e4-300a-418c747d4874
2012-03-03 21:36:36 +00:00
Tom Morris
5d080e5b3e Wrap if statement in a block to avoid future problems.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2447 7d457c2a-affb-35e4-300a-418c747d4874
2012-03-01 18:10:59 +00:00
Tom Morris
c583ad4367 Issue 537 - Try to convert to Long first before converting to Double. Matches behavior on import.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2446 7d457c2a-affb-35e4-300a-418c747d4874
2012-02-26 17:27:00 +00:00
Tom Morris
190e817fb8 Protect against NullPointerException
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2444 7d457c2a-affb-35e4-300a-418c747d4874
2012-02-22 20:06:03 +00:00
David Huynh
e21ae32722 Make sure project ID is completely numeric. Slightly better error reporting on project page when project ID is not valid.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2441 7d457c2a-affb-35e4-300a-418c747d4874
2012-01-29 21:16:13 +00:00
Tom Morris
6414ae7f87 Remove redundant test
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2436 7d457c2a-affb-35e4-300a-418c747d4874
2012-01-27 20:38:55 +00:00
Tom Morris
40183aa0ba Issue 513 - get rid of exception at end of import in JSON parser
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2435 7d457c2a-affb-35e4-300a-418c747d4874
2012-01-27 17:05:45 +00:00
Tom Morris
fdac0c30cf Issue 524 - shorten __anonymous__ names for JSON importer
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2432 7d457c2a-affb-35e4-300a-418c747d4874
2012-01-26 22:38:25 +00:00
Tom Morris
df45d06b2b Issue 523 - On URL fetch error, return HTTP error code, message, and contents of error stream (HTML page) if available
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2429 7d457c2a-affb-35e4-300a-418c747d4874
2012-01-26 18:47:30 +00:00
David Huynh
794629eee6 ChangeSequence did not save/load properly at all.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2427 7d457c2a-affb-35e4-300a-418c747d4874
2012-01-25 02:04:52 +00:00
David Huynh
893b767c01 ChangeSequence did not revert properly at all.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2426 7d457c2a-affb-35e4-300a-418c747d4874
2012-01-25 00:46:52 +00:00
Tom Morris
fa2e6fe608 Issue 517 - add some interim error checking and reporting
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2420 7d457c2a-affb-35e4-300a-418c747d4874
2012-01-12 06:20:28 +00:00
Tom Morris
8ec10a6ea6 Fix error message to match code
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2419 7d457c2a-affb-35e4-300a-418c747d4874
2012-01-12 05:51:16 +00:00
Tom Morris
b409ef5670 Issue 491 - fix off-by-one error in column counts
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2405 7d457c2a-affb-35e4-300a-418c747d4874
2011-12-09 23:50:40 +00:00
Tom Morris
b3bcb3361b Issue 483 - make custom metadata available to the client
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2404 7d457c2a-affb-35e4-300a-418c747d4874
2011-12-09 23:05:42 +00:00
David Huynh
ae771a7ccb Fixed Issue 502 in google-refine: Fetch URLs does not return the exact HTTP payload, like Create Project from URLs does.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2398 7d457c2a-affb-35e4-300a-418c747d4874
2011-12-02 20:44:13 +00:00
David Huynh
a7e2704655 Attempt at fixing Issue 500: Sequential creation of related columns using apply-operation command
by letting long-running processes report errors.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@2394 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-30 23:54:40 +00:00
David Huynh
d419f4bbc7 For reinterpret function, swapped encoder and decoder arguments if decoder is specified, as discussed here:
http://groups.google.com/group/google-refine/msg/629dbf11b073e129

git-svn-id: http://google-refine.googlecode.com/svn/trunk@2392 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-29 19:55:08 +00:00
Tom Morris
3b4bdbecdf Issue 378 - JSONize NaNs as their string equivalent to keep JSONwriter from throwing an exception
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2391 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-29 07:57:36 +00:00
David Huynh
76802d328d Default the encoding of clipboard data to UTF-8.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2390 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-29 05:21:26 +00:00
David Huynh
cdca6fff8f Checked in Shardul Deo's patch from
http://groups.google.com/group/google-refine-dev/browse_thread/thread/5222a68396c56405
to support HTTP PUT and DELETE.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@2387 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-25 03:48:03 +00:00
Tom Morris
f1b567bc31 Issue 487 - Add support for ISO 8601 date parsing
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2383 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-18 22:05:45 +00:00
Tom Morris
80c13e4b59 Issue 486 - make sure project character encoding doesn't get set to ""
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2381 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-18 20:52:49 +00:00
Tom Morris
d5dd04965a Allow user to optionally override source encoding in reinterpret function so they can fix up bad projects. Interpret empty string as system default encoding.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2380 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-18 20:50:55 +00:00
Tom Morris
23ac625818 Issue 430 - Fix timeline facet to handle Calendar type as well as Date
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2379 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-17 23:52:32 +00:00
David Huynh
dbeaefb00b Minor bug fix to previous check-in: made sure blank cells in the 2 newly generated columns don't get filled in.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2368 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-07 19:53:26 +00:00
David Huynh
d01745284b Added option to "transpose columns into rows" operation for filling in other columns.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2367 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-07 19:39:23 +00:00
David Huynh
5aec75696d Fixed Issue 477 in google-refine: Implement or remove the line separator option.
Also, fixed displaying bug in the fixed-width parser UI: previously, tab characters forced columns to be wider.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@2364 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-06 20:13:05 +00:00
David Huynh
a35b9f53f7 Made operation "Transpose columns into rows" support the option of transposing into 2 new columns rather than just one.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2362 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-06 02:50:33 +00:00
Tom Morris
85a37d23f9 Issue 474 - implement record limit for XML and JSON importers
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2359 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-05 16:38:19 +00:00
David Huynh
b36b229ba4 Fixed Issue 465: Data text file with extension .dta within a .ZIP is not automatically extracted
.dta isn't recognized so there's no best format detected. But now we default to text/line-based and always select all files if no file gets selected by default.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@2358 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-04 22:33:38 +00:00
David Huynh
41a90ad71f Fixed Issue 459: Undefined error with some CSV files (incorrectly detected as EXCEL)
by favoring file name-based format over mime type-based format (because the user's computer might have .csv registered as an Excel format).

git-svn-id: http://google-refine.googlecode.com/svn/trunk@2357 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-04 21:52:12 +00:00
David Huynh
2f6b635f66 Added initial implementation of Key/value Columnize operation and command.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2356 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-04 21:00:32 +00:00
Tom Morris
a7c81880a8 Issue 475 - Support escaped custom separators
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2355 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-04 19:04:16 +00:00
Tom Morris
cacbedd352 Fix index out of bounds exception when separator is the empty string
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2354 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-04 17:31:51 +00:00
Stefano Mazzocchi
856ef6a65a commented out unused variables
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2352 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-01 21:47:24 +00:00
Tom Morris
71492c706c Just some TODOs
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2349 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-28 17:51:20 +00:00
Tom Morris
ad8705e299 Javadoc only
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2348 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-28 17:29:35 +00:00
Tom Morris
a870e782f5 Make sure our counts are current before attempting to use them for sorting
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2347 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-28 17:28:27 +00:00
Tom Morris
5dad4d6a0b Handle legacy projects which have an empty slot 0 for the column model (old off-by-one bug)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2346 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-23 19:29:44 +00:00
Tom Morris
ab950689dd Add debugging info - mostly toString() methods for types missing them
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2343 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-21 16:46:55 +00:00
Tom Morris
b2781bda3f Javadoc only
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2342 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-21 16:30:37 +00:00
Tom Morris
9a9f4c1354 Issue 467 - provide JVM heap usage as part of the progress monitor during project creation.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2341 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-21 16:28:40 +00:00
David Huynh
f4b2ee3715 "Transpose columns into rows" operation now supports specifying the ending column to be the last column regardless of its name.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2337 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-19 13:42:50 +00:00
David Huynh
223074bb25 Xml importer should stop trying to skip over initial non-xml content after some number of characters.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2336 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-18 15:25:31 +00:00
Tom Morris
9710521ef8 Correct column counting so maxCellIndex represents current count rather than next column
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2335 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-14 21:00:50 +00:00
Tom Morris
5d6ab76b7c Issue 313 - fix cell format so dates export as dates rather than numbers.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2334 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-14 20:23:59 +00:00
Tom Morris
2d5125af1e Issue 462 - don't trim whitespace from string-valued cell contents on import
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2330 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-12 23:45:52 +00:00
Tom Morris
5c95c9c1f9 New exporter - Open Document Format (ODF) spreadsheets (.ods)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2326 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-11 21:02:23 +00:00
Tom Morris
3bd84088da Rename OO/ODS importer with more generic name
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2325 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-11 21:01:45 +00:00
Tom Morris
ee0fb9033e Javadoc
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2324 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-11 20:57:40 +00:00
Tom Morris
ca17e1ef0a New importer for Open Document Format (ODF) spreadsheet files (.ods)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2323 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-11 20:27:40 +00:00
Tom Morris
2726f61a61 Add toString methods to help with debugging
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2321 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-11 20:19:53 +00:00
Tom Morris
5c856179cb Add TODO for suspicious code
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2320 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-11 20:14:57 +00:00
Tom Morris
16421303cb Add Javadoc
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2318 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-11 20:12:24 +00:00
David Huynh
55c3fdebab Bumped up version to 2.5.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2314 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-10 21:58:42 +00:00
David Huynh
1a14d82393 For XML files, ignore not just leading whitespace but anything except <.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2313 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-10 20:51:00 +00:00
Tom Morris
fffd24d64b Parse parameters from multipart/form-data POSTs rather than just dropping them (needed for Windmill tests, among other things)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2302 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-07 23:15:55 +00:00
Stefano Mazzocchi
1f67866258 fixing a bunch of inconsistencies and potential bugs as indicated by findbugs, pmd and eclipse
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2301 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-07 21:23:23 +00:00
Tom Morris
31073d7712 Refactor importer interfaces to narrow exceptions thrown and handled
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2296 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-07 19:06:53 +00:00
Tom Morris
50927b33dc Javadoc
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2295 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-07 18:56:23 +00:00
Tom Morris
4a230abb44 Narrow exception handling
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2294 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-07 18:55:46 +00:00
Tom Morris
29cbc5af20 Remove some obsolete TODOs. No functional changes.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2290 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-07 17:29:30 +00:00
David Huynh
18f32ed7e8 Fixed up Rdf Triples importer, added a parser UI for it, and got its tests to pass.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2283 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-06 21:28:20 +00:00
David Huynh
1c5dc32b88 Fixed tsv/csv tests.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2276 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-06 06:22:30 +00:00
Tom Morris
ac4a0ca747 Store blank cells as nulls if that's what the user requests
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2272 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-05 23:41:52 +00:00
Tom Morris
0ce0a0a8d3 Add toString support for null cells to help with debugging
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2271 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-05 23:33:17 +00:00
David Huynh
e7e9dbc74d Minor fixes to pass some exporter tests.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2269 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-03 16:38:07 +00:00
David Huynh
7935dfd60e Stricter detection of json and xml formats on import, by checking for initial nonspace character.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2266 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-30 01:47:42 +00:00
David Huynh
d047acf1d1 Fixed Issue 452: Importing using Clipboard function does not guess structure correctly for XML or JSON
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2263 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-29 14:02:12 +00:00
David Huynh
5762efebf6 Fixed Issue 397: New UI Importer Branch - individual JSON record nodes do not preview well.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2258 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-28 03:38:23 +00:00
Tom Morris
1b197d93d8 Issue 447 - allow users to specify delimiters for toTitlecase function
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2253 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-20 05:07:46 +00:00
David Huynh
e1184003df Color-code date values in data table.
Fixed Issue 426: filter with custom facet adds zero lines choice

git-svn-id: http://google-refine.googlecode.com/svn/trunk@2251 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-20 01:36:47 +00:00
Tom Morris
59d6020979 Add basic test coverage for ToTitleCase and (commented out) support for 2nd parameter
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2250 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-19 15:47:33 +00:00
David Huynh
82cc76f076 Fixed bug where a blank row used to corrupt the whole project because it could not be re-loaded from file.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2248 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-19 10:36:38 +00:00
David Huynh
9111157172 Fixed Issue 447: Extend toTitlecase() function with support for char[] delimiters in Apache WordUtils.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2247 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-19 09:48:37 +00:00
David Huynh
db3bbb5c86 Fixed xml parsing error due to whitespaces in front of <?xml>.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2246 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-19 09:06:36 +00:00
David Huynh
66cf0b6596 Fixed Issue 449: Uncaught exception from Excel importer.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2245 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-19 08:49:35 +00:00
David Huynh
5c446d28d0 Support uploading directly to a new Google spreadsheet.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2243 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-16 18:04:55 +00:00
David Huynh
02c58e2c56 Periodically clean up stale importing jobs to free up disk space.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2240 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-15 23:52:05 +00:00
David Huynh
0693205430 Added support for importing from fusion tables.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2239 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-15 21:40:40 +00:00
Tom Morris
ebede9b424 Issue 441 - return EvalError if we can't parse a date
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2237 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-13 20:58:43 +00:00
Tom Morris
131ff81c0d Don't reschedule a canceled timer
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2236 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-13 20:38:34 +00:00
David Huynh
57c11d0238 Fixed issue 442: Two column transforms to date on the same column turns the cells blank
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2230 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-01 22:11:45 +00:00
David Huynh
a88ccd2c32 Reduced amount of logging.
Suppressed logging for the GetProcessesCommand, which gets pinged often while a long-running operation is being executed (e.g., reconciling, fetching URLs).

git-svn-id: http://google-refine.googlecode.com/svn/trunk@2228 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-01 18:26:45 +00:00
David Huynh
a8815956cd Implemented back-end of customizable tabular exporting support.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2225 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-30 19:19:46 +00:00
Tom Morris
e174bb163a Issue 440 - Don't purge from memory those projects with pending operations
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2222 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-28 22:00:02 +00:00
David Huynh
420e74c6f4 Made CreateProjectCommand scriptable again, so it can be called from client libraries.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2216 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-23 18:49:47 +00:00
David Huynh
4113a10b5b Catch/log exceptions in the importers a bit more carefully.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2215 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-22 21:47:15 +00:00
David Huynh
f023b922e1 Implemented encoding selectors in a few importing parser UIs.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2214 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-22 17:55:06 +00:00
Tom Morris
bde63ff417 Last set of indentation cleanups - no functional changes
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2211 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-18 17:46:36 +00:00
Tom Morris
9d7b8a5279 Don't die if we get passed no candidates
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2210 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-18 17:39:18 +00:00
David Huynh
afb7953eac Fixed problem for importing from an archive file containing fixed width column files: we used to create totally new columns for each contained file, yielding too many columns.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2203 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-14 02:53:19 +00:00
David Huynh
33d99186ea Made fixed width column guessing slightly better.
Made sure fixed width parser UI takes into account the File column.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@2202 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-14 02:05:18 +00:00
David Huynh
41e4e1cd70 Some more JS indentation fixes.
Fixed issue 31: "Maximum number of facet values should be configurable." Now when we're showing "too many choices" we also display exactly how many choices there are and show a link to change the limit.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@2201 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-14 01:05:43 +00:00
David Huynh
e955ed05ae Made sure busy indicator shows up for GData importing when needed.
Fixed radio button issue with GData worksheet selection.
Fixed resizing issue with open project action area.
Fixed NullPointerException in RecordModel.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@2198 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-12 19:15:58 +00:00
David Huynh
823729776d Google spreadsheets can now be imported directly from within Refine.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2192 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-11 00:35:01 +00:00
David Huynh
c5078d1887 Fixed issue 428: Excel import sometimes drops last row of data.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2189 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-06 19:37:23 +00:00
Tom Morris
da7347e7b1 Make sure all conditionals and loops are in blocks (too bug-prone otherwise)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2183 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-02 22:21:47 +00:00
Tom Morris
c16a2378f9 Ask people not to reformat since this is imported code.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2182 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-02 22:18:50 +00:00
Tom Morris
539fea6eb3 Simplify some for loops using new Java 5 syntax
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2181 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-02 21:17:41 +00:00
Tom Morris
97a0f2a33e Organize imports. com.google.refine last in a section of its own. Everything alphabetical in its section.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2180 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-02 21:10:22 +00:00
Tom Morris
5497fa4685 Remove unnecessary casts
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2173 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-02 20:33:57 +00:00
Tom Morris
7fd6e22af4 Convert tabs to spaces. No functional changes.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2172 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-02 20:26:32 +00:00
Tom Morris
123614539d Add missing @Override annotations
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2171 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-02 19:30:23 +00:00
David Huynh
78edff6f7f Merged new importer UI work from branch over.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2170 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-02 03:34:47 +00:00
Tom Morris
b82448037a Add @Override annotations. No functional changes.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2124 7d457c2a-affb-35e4-300a-418c747d4874
2011-06-24 04:01:47 +00:00
Tom Morris
eb38ab75a4 FIXED - task 415: Evaluation precedence wrong for arithmetic expressions
http://code.google.com/p/google-refine/issues/detail?id=415

git-svn-id: http://google-refine.googlecode.com/svn/trunk@2123 7d457c2a-affb-35e4-300a-418c747d4874
2011-06-23 23:42:12 +00:00
Tom Morris
2af22f9485 Issue 404 - Fix indeterminate behavior in character encoding guesser. Thanks to Paul Makepeace.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2120 7d457c2a-affb-35e4-300a-418c747d4874
2011-06-14 04:29:44 +00:00
Tom Morris
8da1291650 Issue 399 - Add Cologne Phonetic Keyer and allow it to be used for clustering
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2102 7d457c2a-affb-35e4-300a-418c747d4874
2011-06-09 19:42:05 +00:00
Tom Morris
51c898d602 Issue 351 - truncate exports to Excel at 256 columns (limitation of Excel format)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2094 7d457c2a-affb-35e4-300a-418c747d4874
2011-06-07 23:55:00 +00:00
Tom Morris
6a14049652 Issue 401 - use default exception handling for ExportRows command instead of JSON response
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2093 7d457c2a-affb-35e4-300a-418c747d4874
2011-06-07 23:52:23 +00:00
Tom Morris
2cd3ae03d0 @Override annotations. No functional changes.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2092 7d457c2a-affb-35e4-300a-418c747d4874
2011-06-07 23:51:16 +00:00
Tom Morris
a52c25272e Issue 342 - help text update
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2090 7d457c2a-affb-35e4-300a-418c747d4874
2011-06-06 22:38:50 +00:00
Tom Morris
eebc225abc Add missing @Override annotations (issue 316, 317, 319, 320 among others)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2089 7d457c2a-affb-35e4-300a-418c747d4874
2011-06-06 22:35:01 +00:00
Tom Morris
73acd497e9 Fix for issue 358 from Tomaz Solc. Don't return a NaN when comparing two 0-length word lists.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2088 7d457c2a-affb-35e4-300a-418c747d4874
2011-06-06 21:30:46 +00:00
David Huynh
11cf415ee8 Exposed more fields for each record.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2081 7d457c2a-affb-35e4-300a-418c747d4874
2011-06-06 20:19:20 +00:00
Tom Morris
4dc3ef8caa Bump version to 2.1
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2080 7d457c2a-affb-35e4-300a-418c747d4874
2011-06-06 20:16:19 +00:00
David Huynh
b75a5efe71 Applied patch for Issue 222: save favorite transforms.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2079 7d457c2a-affb-35e4-300a-418c747d4874
2011-06-06 18:49:36 +00:00
David Huynh
f7c33fba45 Fixed issue 196: failure and error dialog attempting to remove columns
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2077 7d457c2a-affb-35e4-300a-418c747d4874
2011-06-05 04:31:51 +00:00
David Huynh
cecfa244e0 Changed to UTF-8 encoding
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2039 7d457c2a-affb-35e4-300a-418c747d4874
2011-04-06 21:09:21 +00:00
Stefano Mazzocchi
610de0d33a adding Metaphone3 algorithm
Many thanks to Lawrence Philips for donating the code to us under the BSD license.


git-svn-id: http://google-refine.googlecode.com/svn/trunk@2029 7d457c2a-affb-35e4-300a-418c747d4874
2011-03-01 00:17:48 +00:00
Stefano Mazzocchi
87e7f9a7a4 remove unused variable
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2028 7d457c2a-affb-35e4-300a-418c747d4874
2011-02-25 22:51:58 +00:00
Tom Morris
c5312a2e6a Issue 338 - patch from Thad Guidry to provide function which calls JSoup ownText() method
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2025 7d457c2a-affb-35e4-300a-418c747d4874
2011-02-23 19:40:35 +00:00
Tom Morris
5b9362e956 Issue 334 - Make sure URLs are encoded before using them.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2007 7d457c2a-affb-35e4-300a-418c747d4874
2011-02-15 23:15:09 +00:00
Tom Morris
06e2487189 Issue 276 - patch from pxb1... to fix character encoding issue with CreateProject command slightly modified to preserve request encoding if it has one
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2000 7d457c2a-affb-35e4-300a-418c747d4874
2011-02-04 03:15:12 +00:00
David Huynh
d7b482be06 Attempt at fixing issue 185. Will need someone else to verify.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1989 7d457c2a-affb-35e4-300a-418c747d4874
2011-01-20 22:49:36 +00:00
David Huynh
44652a3ee2 Make copy of Calendar object before modifying it. Also handle Date type.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1982 7d457c2a-affb-35e4-300a-418c747d4874
2011-01-10 23:06:28 +00:00
David Huynh
90794d5039 Started working on new import UI. Not much to see yet, but if you append ?new=1 to the index page URL then you see the new form. It can only upload a file at the moment.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1971 7d457c2a-affb-35e4-300a-418c747d4874
2011-01-02 23:09:08 +00:00
David Huynh
6fb2b05739 Fixed issue 294: "Exporting date type column to TSV/CSV shows java debugging information instead of value" with help from Gabriel Sjoberg.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1967 7d457c2a-affb-35e4-300a-418c747d4874
2010-12-28 15:54:24 +00:00
David Huynh
53442c5ef2 Handle the case where an excel cell has a formula but the cached result of that formula is an error.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1962 7d457c2a-affb-35e4-300a-418c747d4874
2010-12-25 21:41:21 +00:00
David Huynh
687e9064df A shorter fix for toString() to handle Date than the last commit.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1961 7d457c2a-affb-35e4-300a-418c747d4874
2010-12-25 21:36:51 +00:00
David Huynh
0ff40eabbd toString() should handle Date, too, rather than just Calendar.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1960 7d457c2a-affb-35e4-300a-418c747d4874
2010-12-25 21:33:59 +00:00
Tom Morris
209f157656 RESOLVED - task 202: Sort text with accents
http://code.google.com/p/google-refine/issues/detail?id=202

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1951 7d457c2a-affb-35e4-300a-418c747d4874
2010-12-12 18:16:29 +00:00
Iain Sproat
f55f11cd0d Adding classes to now make it possible to parse Html in GREL. Uses small subset of methods from the JSoup library, licensed under the MIT license.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1948 7d457c2a-affb-35e4-300a-418c747d4874
2010-12-06 23:15:24 +00:00
Tom Morris
9aaa1c9919 Replace tabs with spaces. No functional change.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1947 7d457c2a-affb-35e4-300a-418c747d4874
2010-12-05 20:50:03 +00:00
Tom Morris
a560cb56df Replace tabs with spaces. No functional changes.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1942 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-29 06:27:06 +00:00
Tom Morris
3a8f9306bd Add some toString() methods to help with debugging
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1941 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-29 06:24:50 +00:00
Tom Morris
af20157532 Fix indentation so indent levels match logical block levels. No code changes.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1940 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-28 17:46:57 +00:00
Tom Morris
748b5699b9 Issue 61 - Turn on text coalescing and XML entity reference replacement
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1939 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-27 22:07:15 +00:00
Tom Morris
e19148c375 Make sure we at least log an error if the import fails
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1938 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-27 22:05:45 +00:00
Tom Morris
824f445530 Unused import
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1937 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-27 17:54:16 +00:00
Tom Morris
b9fa100d31 Don't try to save a null encoding
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1936 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-27 17:54:01 +00:00
Tom Morris
850c43d6f3 Issue 107 - set encoding on response
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1935 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-27 02:46:10 +00:00
Tom Morris
3d6458a0e5 Replace tabs with spaces
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1934 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-27 01:38:32 +00:00
Tom Morris
bc8637f638 Issue 257 - Don't return a String where a Date is required (using generics in Criterion API would prevent this kind of problem, but that's incompatible with the use of the Eval_Error class)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1933 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-27 01:18:36 +00:00
Tom Morris
c7b0f4d024 Issue 184 - use default locale date formatting if no format string is specified
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1932 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-26 23:47:09 +00:00
Tom Morris
080ec5332e Issue 237 - Make sure project's character encoding is always set. Lower minimum confidence threshold for guesser.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1931 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-26 22:23:31 +00:00
David Huynh
1e2af79851 Let's handle .tar files as well rather than requiring .tar.gz.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1919 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-21 03:00:43 +00:00
David Huynh
c496f1e941 Helped toward fixing issue 228: ButterflyServlet already tracks the ServletConfig, so there's no need for RefineServlet to do that, too.
Importing archive files has another big problem at the moment: namely, even if the many files in a single archive file share several columns, they still cause columns with the same names to be created over and over again as each file gets imported. This is because each individual importer was written with the assumption that it imports into an empty project with no columns.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1918 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-21 02:58:15 +00:00
Iain Sproat
09fa36198c Additions to GREL:
* Factorial function allowing variable steps
* GreatestCommonDenominator function
* LeastCommonMultiple function
* Multinomial function
* Quotient function

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1910 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-20 18:04:11 +00:00
Iain Sproat
43d0de2d8a Fixed registered name of GREL combination function
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1909 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-20 12:15:31 +00:00
Iain Sproat
f1643565b8 Additions to GREL:
* modulo operator, %
* cos, sin and tan functions
* acos, asin, atan and atan2 functions
* cosh, sinh and tanh functions
* fact and combin functions
* degrees and radians functions
* odd and even functions

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1908 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-20 12:11:37 +00:00
Iain Sproat
1ec7cb9f7b PI constant added to GREL
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1904 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-18 23:53:07 +00:00
Tom Morris
675714d03d Add toString() methods to help with debugging
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1894 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-18 08:19:05 +00:00
Iain Sproat
dd333d5b43 Abs function now available in GREL
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1890 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-17 09:38:51 +00:00
Iain Sproat
74e9288229 Additional error dialog for Issue 188
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1858 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-11 14:25:46 +00:00
Iain Sproat
2f564589f5 Adding a Fixed Width data importer (Issue 85) and associated tests.
Although this importer is 'wired up', it requires a property "fixed-column-widths" which is not (yet) implemented in the UI.  But the ImporterRegister.guessImporter method will probably select the CsvTsvImporter before the FixedWidthImporter anyway.  I suggest an improvement to the project creation UI and/or the guessImporter method will be required.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1857 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-11 13:15:41 +00:00
David Huynh
703d2dbd19 IsTest should catch errors and wrap them.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1833 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-08 21:19:25 +00:00
David Huynh
5d915be096 Numeric comparisons == and != should be special-cased, too.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1780 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-05 19:17:57 +00:00
David Huynh
fe08a43e0c FunctionCall and ControlCall should catch exceptions and wrap them as EvalError's.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1777 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-05 04:26:10 +00:00
David Huynh
faaca5beea Fixed the GREL round function.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1749 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-01 20:28:42 +00:00
David Huynh
1f12bfb409 Fixed bug in HasFieldsListImpl where list members weren't tested for being null.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1735 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-01 18:13:53 +00:00
David Huynh
1eebe2e4a3 Fixed transpose-rows-into-columns command, which previously duplicated columns that precede the column being transposed.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1734 7d457c2a-affb-35e4-300a-418c747d4874
2010-11-01 17:59:58 +00:00
David Huynh
764558c48a In numeric bin index, count infinity values as errors
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1700 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-26 22:49:59 +00:00
David Huynh
8d422e2e54 Fixed Calendar vs. Date bug in time range facet
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1699 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-26 21:06:16 +00:00
David Huynh
8ccf9d1bf8 The judgment facet created after a recon operation is done should also show (blank) and (unreconciled) choices
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1696 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-26 04:57:02 +00:00
David Huynh
e601ad8d40 bug: autoMatch flag wasn't actually used before
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1627 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-25 01:41:07 +00:00
David Huynh
2d9e7c87f6 Increased recon batch size to 10 again. Various style tweaks. Polished up freebase extension's dialogs to be a bit more helpful
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1625 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-25 01:07:25 +00:00
David Huynh
345c1c62ac Added new recon commands:
- clear recon data for all matching rows
- clear recon data for one cell
- clear recon data for similar cells
- copy recon judgments across columns

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1618 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-22 07:25:27 +00:00
David Huynh
5a17acfd70 Prepended license text to java source
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1613 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-20 20:45:52 +00:00
David Huynh
9b8206da29 Fixed new bug for query-based reconciliation introduced by factoring out the freebase extension
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1611 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-20 04:52:35 +00:00
Tom Morris
7dcd0c073d Revert bad commit r1600
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1601 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-18 05:19:05 +00:00
Tom Morris
79c00bab36 Incomplete - task 157: Integrate Google Spreadsheet import/export plugin
http://code.google.com/p/google-refine/issues/detail?id=157

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1600 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-18 04:59:39 +00:00
David Huynh
e7184ec9ab Deleted old empty protograph dirs. Use a default assign version even if running from trunk; this is so that we have at least some clue about an imported project file.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1598 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-18 04:18:09 +00:00
David Huynh
c8dcc10ab8 Be sure to use UTF-8 when saving data.txt, pool.txt, and change files.
Fix issue 163: Refine doesn't retain the characters for flat or sharp.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1588 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-17 23:43:02 +00:00
David Huynh
a62638e88d For each recon group, try at least 3 times if the service keeps failing. Log errors more for debugging purposes.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1578 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-16 00:19:31 +00:00
Stefano Mazzocchi
f50880905e fixed warnings
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1577 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-16 00:01:42 +00:00
Tom Morris
47dd5f8da6 Make sure the stream/writer is flushed in case the exporter forgets to do it
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1569 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-15 17:10:37 +00:00
Tom Morris
bbebb4d2dc Add @Overrides so we get warned about API changes
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1565 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-15 13:26:25 +00:00
David Huynh
7e9df21b70 Exporters need to implement either WriterExporter or StreamExporter.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1558 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-15 06:18:20 +00:00
David Huynh
73042712ed Made csv/tsv importer not trim whitespace even if "guess cells' types" is checked (for cells that are strings).
Updated csv tests to expect un-trimmed cells.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1557 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-15 05:30:15 +00:00
David Huynh
9e35ea3775 Better error message for numeric range facet if there's no numeric value.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1551 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-15 01:00:51 +00:00
Tom Morris
083abd4329 Refactor exporter interface along same lines as importer
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1547 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-14 21:33:50 +00:00
David Huynh
4ccdbc8716 Fixed bug in which a newly created and unedited project would never get saved because it had the same modified time and last save time.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1530 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-14 01:43:26 +00:00
David Huynh
dc49047092 We have previously changed the standard-reconcile acre app to return mids, but we still need to make sure its metadata says that its identifier space is mid, not id. And we need Refine to test for the mid identifier space as well.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1479 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-13 18:33:27 +00:00
David Huynh
a16df8f2d6 For unrecoverable projects, rename them with a suffix so the next time we won't try to recover them again.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1472 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-13 07:05:34 +00:00
David Huynh
91ffe71d17 Lowering recon batch size from 7 to 3 to avoid timeout problem. This is a temporary fix only for
Issue 156: Reconcile is not picking up alias hints or even type hints correctly

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1470 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-13 05:03:49 +00:00
David Huynh
208152b55c Added .vt template for reporting errors with stacktraces.
Fixed Issue 155: Blank browser shown when non-GZIP format is detected during import

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1469 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-13 04:51:01 +00:00
David Huynh
7cd5a47fbf We haven't been using non-split row parser, so we need to fix the trimming problem in the tsv/csv importer instead.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1467 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-12 23:24:16 +00:00
David Huynh
2d276fa1e6 Non split row parser shouldn't trim lines because whitespaces are significant
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1465 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-12 22:45:30 +00:00
David Huynh
69c338c728 Text filter was throwing an exception if the column went away (which happened when the column got split).
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1464 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-11 23:15:13 +00:00
David Huynh
336a773069 Only try to create the workspace dir if it doesn't exist.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1463 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-11 23:04:06 +00:00
Tom Morris
c42c78dc0a Log errors if things don't go as expected
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1462 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-11 22:28:22 +00:00
Iain Sproat
142591a090 Added a mention of the new JsonImporter to CHANGES.txt
Corrected the logger name in JsonImporter.java

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1455 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-08 07:58:59 +00:00
David Huynh
ad0d227ab3 Remove remaining Freebase related functionalities.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1453 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-08 02:04:47 +00:00
David Huynh
6ddd945a80 The Freebase functionalities have been extracted out in the last commit. We're removing them from the core module now. This is not a complete checkin. SVN is having some trouble with some directories.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1452 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-08 01:54:00 +00:00
Tom Morris
5040b06d9f Make exceptions more specific for load errors. Still no error returned to user though (just hangs)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1450 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-07 14:20:28 +00:00
Tom Morris
ea28784e8b Don't save null project if load failed
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1449 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-07 14:19:42 +00:00
Stefano Mazzocchi
215165ed97 spell out tweezer parameters
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1444 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-07 08:20:46 +00:00
David Huynh
9ea477c80d Allowed a single operation class to be registered under several names, so that we can rename operations (to better names) while maintaining backward compatibility.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1443 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-07 05:42:01 +00:00
David Huynh
1de5e7c00e Renamed package gel to grel.
Replaced gel with grel in other places in the code base while maintaining backward compatibility.
Changed layout in expression preview dialog to accommodate long GREL name.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1442 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-07 05:19:35 +00:00
David Huynh
90d1111ebc Added "project" argument to OverlayModel methods, as suggested by Fadi Maali.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1439 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-06 20:47:11 +00:00
David Huynh
3ba8e63249 Register Json importer.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1426 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-04 18:53:41 +00:00
Iain Sproat
d977f42f51 Changed behaviour of the XmlImporter to make it more permissive, and allow arrays within mixed elements to be used as candidates for importing to Refine.
This change has also allowed the JsonImporter to pass all its unit tests without any further modification.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1425 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-04 18:33:59 +00:00
Iain Sproat
ec9898ba92 Some tidying up of the XmlImporter which reduces the number of generic TreeParser tokens to a minimum - and should allow elements such as comments and CDATA to be ignored/skipped.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1422 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-04 15:02:09 +00:00
Iain Sproat
d3f223c196 The JsonImporter now passes all current unit tests.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1421 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-04 10:02:50 +00:00
Stefano Mazzocchi
2b9b38368f use the new FreeQ 'refine' queue instead of the old 'gridworks' one
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1410 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-30 00:26:02 +00:00
Stefano Mazzocchi
b62e63306a - make the correct version + revision available also to the java side (thru web.xml)
- add @Override metadata to the commands that were missing it
- make the version information appear even when using trunk (Fixes Issue 136)


git-svn-id: http://google-refine.googlecode.com/svn/trunk@1406 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-29 01:50:57 +00:00
David Huynh
935355cb50 Comments in XML file caused the record detection code to fail. So we added ignorable element type that we can skip over.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1392 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 19:16:43 +00:00
Iain Sproat
bd3ded0828 Correcting JsonImporter to use the correct parser.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1388 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 14:19:19 +00:00
Iain Sproat
855df20481 XmlImportUtilities no longer relies on XMLStreamConstants, and is now independent of any specific type of tree data (Xml or otherwise).
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1378 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 10:46:33 +00:00
Iain Sproat
b21961be89 Another small step towards making XmlImportUtilities generic for all tree structured data, and less XML centric. Some calls to XMLStreamConstant in XmlImportUtilities are now working with a generic TreeParserToken, with methods to convert between TreeParserToken and XMLStreamConstant/JsonToken in the respective parsers.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1377 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 10:04:56 +00:00
David Huynh
740caedf46 Updated to version 2.0
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1376 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 06:03:07 +00:00
David Huynh
e587614c22 Fixed Issue 126: Large integers formatted in scientific notation in formulas
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1373 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 04:21:44 +00:00
Tom Morris
bc6f05f41b Issue 140 - Fix Open Workspace command for non-Mac platforms (requires Java 6)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1372 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 03:54:54 +00:00
David Huynh
194fb5e706 Fixed Issue 122: Exporting to Excel on attached project raises server exception
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1370 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 03:44:30 +00:00
David Huynh
f2ce1b7161 Fixed Issue 121: Importing attached file strips backslashes
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1369 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 03:35:42 +00:00
Stefano Mazzocchi
c976091624 new hooks to the Freebase Refinery
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1368 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 02:19:50 +00:00
David Huynh
823fe989a4 Fixed Issue 110: Import of single column text file with Postal Codes shows only 1 row with lots of � chars (?).
(by enforcing a confidence threshold on the encoding guessing)

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1367 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 00:26:53 +00:00
Stefano Mazzocchi
14d046bb7a silence velocity's logs
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1366 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-28 00:01:53 +00:00
Iain Sproat
c3c23a87b0 The renaming of TreeImporter to TreeImportUtilities didn't seem to get committed last time. Trying again.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1362 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-27 22:57:26 +00:00
Iain Sproat
d285999da8 New JsonImporter, JsonParser and JsonImporterTests (copy of XmlImporterTests with syntax of the example data altered for Json).
Renaming of TreeImporter to TreeImportUtilities (as per the current convention with the XmlImporter and XmlImportUtilities).

NB the new JsonParser class does not work, and 5 of the new unit tests for JsonImporter currently fail.  To be fixed in due course.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1361 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-27 22:53:17 +00:00
Stefano Mazzocchi
86f810a324 hardening the timeline facet
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1353 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-27 21:59:17 +00:00
Iain Sproat
e5ddfa6fdc All methods in XmlImportUtilities now use the TreeParser interface.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1323 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-27 17:59:53 +00:00
Iain Sproat
d71c563831 XmlImportUtilities.detectPathFromTag and XmlImportUtilities.detectRecordElement methods now use a generic TreeParser interface. A lightweight wrapper XmlParser wraps XMLStreamReader to provide parsing for xml data.
This is another small step towards a generic importer for tree structured data.  My plan is to refactor more of XmlImportUtilities' methods to use the TreeParser interface so that XmlStreamReader is no longer called directly from XmlImportUtilities.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1322 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-27 17:40:51 +00:00
Iain Sproat
1bda46d40f Methods which are generic to any tree structured data and don't rely on an XmlParser have been moved to a new TreeImporter class. This is a small step towards supporting importers for other tree structured data.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1321 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-27 16:09:44 +00:00
Stefano Mazzocchi
6273332cef the sandbox->freebase loading conduit is now named "refinery"
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1313 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-25 01:57:56 +00:00
David Huynh
a112ffa9ab Caught a stray rename miss. Added more generic support for renaming old Java classes so that extensions could remain backward-compatible, too
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1297 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-22 23:59:57 +00:00
David Huynh
1367ce301e More renaming, except for: client-side code, build scripts, anything to do with data loading and QA, workspace path. Refine can still run, and undo/redo on existing projects is working.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1290 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-22 18:36:33 +00:00
David Huynh
e6bc603a11 Renamed Java classes whose names contain 'Gridworks'. Refine is still able to start. But don't check out the code just yet.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1289 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-22 17:46:39 +00:00
David Huynh
edb23eb263 Changed Java packages com.google.gridworks.* to com.google.refine.* and modified other code just enough to start grefine up without error. Much remains to be done. Do not check out the code just yet.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1288 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-22 17:04:10 +00:00
David Huynh
362a277c58 Added main menu command to open system file explorer at the workspace directory.
Made project manager more careful at disposing projects, in case any of them is null.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1272 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-17 06:52:10 +00:00
David Huynh
2609c4049d Fixed issue 114: "Refactor project manager api to allow importers to create project metadata" by incorporating tfmorris' patch.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1271 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-17 01:00:23 +00:00
David Huynh
8d1f2d44b9 Patched the json lib to allow up to 100 levels of nesting.
Fixed ImportProjectCommand to redirect from the error page back to /index rather than /index.html.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1270 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-17 00:21:54 +00:00
Stefano Mazzocchi
eee4514643 fixing Issue-125
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1269 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-16 23:05:53 +00:00
Stefano Mazzocchi
df0a30e22d this is really a debug log
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1262 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-10 21:52:02 +00:00
David Huynh
9acd3dbe05 Fixed issue 127 - Add column from Freebase raises exception. Made sure DataExtensionChange saves properly.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1261 7d457c2a-affb-35e4-300a-418c747d4874
2010-09-10 04:53:37 +00:00
Stefano Mazzocchi
e973fd3e89 d'oh, wrong object counter (thanks again to knut.forkalsrud for spotting my mistakes)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1250 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-31 23:25:16 +00:00
Stefano Mazzocchi
e5c6dda178 Fixed Issue-116
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1243 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-31 19:33:05 +00:00
Stefano Mazzocchi
7df259008b more whitespace
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1242 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-31 19:32:59 +00:00
Stefano Mazzocchi
cf66d00854 only whitespace (no functional changes)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1240 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-31 19:32:48 +00:00
Stefano Mazzocchi
860d6c4ee2 a little more solid (it's possible to have both Dates and Calendars in there)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1239 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-31 19:12:11 +00:00
Stefano Mazzocchi
3648883e0c ISSUE-99 thanks to knut.forkalsrud for providing the patch!
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1238 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-31 18:56:35 +00:00
Stefano Mazzocchi
5d788c9260 added timeline facet (like the numeric binning facet but working on dates instead of numbers and with date-specific binning logic)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1234 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-31 17:57:54 +00:00
David Huynh
bd7453adba Made sure to strip off charset from content-type when importing from URLs before looking up the right importer.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1229 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-25 17:35:16 +00:00
David Huynh
367796488e Fixed xml importer: subgroups should now line up properly by rows.
Added command to reorder columns using drag and drop.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1227 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-25 06:17:08 +00:00
David Huynh
276fae8938 Save templating exporter's template.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1221 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-24 06:36:49 +00:00
David Huynh
baa4e0db8c Added command to browse to the data load page on the Gridworks QA dashboard.
Save the data load job name and fill it in the next time the Load into Freebase dialog is opened.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1220 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-24 06:06:07 +00:00
David Huynh
e4af19f8a6 Namespaced operations' names by their modules' names.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1215 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-24 04:02:36 +00:00
David Huynh
1f69fba43c Added command Add Column by Fetching URLs.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1203 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-22 23:55:07 +00:00
David Huynh
9041ebf7b9 Bumped version to 1.5.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1195 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-22 06:42:21 +00:00
David Huynh
c94abd0427 Commands are now registered in association with their modules, so as to avoid name collisions.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1193 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-22 06:16:13 +00:00
David Huynh
95e2e30c8a Added events to OverlayModel interface, so overlay models can react to saving events and to disposing event from the project.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1191 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-22 05:06:36 +00:00
David Huynh
4ea765b689 Factored out registries of importers and exporters.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1183 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-22 01:46:32 +00:00
David Huynh
99b8c4dc7a Setting rabj=true when uploading to freeq.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1170 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-19 21:13:00 +00:00
Stefano Mazzocchi
fcc54e2ab3 removing what turned out to be dead code
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1162 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-16 16:09:52 +00:00
Stefano Mazzocchi
bb7d3c388c ISSUE-115 datePart('month') should return January as 1 not 0
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1161 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-16 16:09:39 +00:00
David Huynh
a90a9c724e Forgot to register blank down operation in operation registry previously.
Added uniques GEL function for eliminating duplicates in an array.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1158 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-16 03:00:43 +00:00
David Huynh
fa816007a7 Fixed copy-and-paste string mistake in BlankDownOperation.
Fixed minor bug in Row.isValueBlank that returns true for non-string values.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1157 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-16 02:16:41 +00:00
David Huynh
e61655506a Added new command to import QA results, so any reconciliation action that yields conflicting or uncertain opinions among reviewers can be examined inside Gridworks.
Added new customized facets for checking QA results. 

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1156 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-13 16:26:33 +00:00
David Huynh
8f071ede31 Added command Transpose Cells in Rows into Columns (Issue 82).
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1147 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-08 06:30:30 +00:00
David Huynh
d1a66e2e63 Added JSON support in GEL.
Added GEL functions: escape, parseJson, hasField.
Fixed bug in preference store: expression history was still not loaded properly.
Integers are now rendered without decimals in the expression preview dialogs.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1145 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-07 22:57:48 +00:00
David Huynh
e70f16025b Fixed bug introduced recently by changing the preference key of the expression history from "expressions" to "scripting.expressions".
Added code in FileProjectManager for trying to recover projects that are in the workspace dir but are not recorded in the workspace json file.


git-svn-id: http://google-refine.googlecode.com/svn/trunk@1144 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-07 20:25:31 +00:00
David Huynh
0500d7aa10 Added commands Move Column to Beginning, Move Column to End, Move Column Left, Move Column Right.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1142 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-07 01:24:48 +00:00
David Huynh
f0eae04c0c Forgot to add 2 files in the last commit
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1141 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-06 23:50:22 +00:00
David Huynh
a8ee9b9e08 Added Fill Down and Blank Down commands.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1140 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-06 20:33:28 +00:00
David Huynh
3bda9d035d Added support for creating a project by pointing to a data file URL.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1139 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-06 06:15:05 +00:00
David Huynh
f411dc9104 - Issue 112: Refactor Importer API (patch from tfmorris)
- Added support for storing custom metadata in ProjectMetadata.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1138 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-06 05:04:25 +00:00
David Huynh
00c6865d95 - Select All and Unselect All buttons in History Extract dialog
- Schema skeleton: support for multiple cells per cell-as nodes, and for conditional links


git-svn-id: http://google-refine.googlecode.com/svn/trunk@1137 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-05 20:27:39 +00:00
David Huynh
5cb3f924f6 Added support in protograph for specifying several column names per cell-as nodes.
Started to add support for conditional links in protograph. The UI is not hooked up yet.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1136 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-05 08:29:34 +00:00
David Huynh
b8ad56c6db Made sure the type is always recorded in the "cell-as-topic" section of a node's dialog box in the schema skeleton dialog.
In the triple loader transposed node factory, use the column's recon config to generate new topics' type.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1135 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-05 00:53:08 +00:00
David Huynh
dcc3ac8534 Renamed packages com.metaweb.* to com.google.*.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1130 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-03 23:01:18 +00:00
Stefano Mazzocchi
8c56b437fa more fixes
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1129 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-03 20:19:48 +00:00
David Huynh
762a9f13eb Text facet's choice count limit is now configurable through preference page. Preference page needs polishing.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1127 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-02 01:49:10 +00:00
David Huynh
965ef20790 Made sure commands that create new columns check for duplicate column names.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1126 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-01 04:44:21 +00:00
David Huynh
4ad31ffcde Excel importer now supports "header lines" parameter.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1125 7d457c2a-affb-35e4-300a-418c747d4874
2010-08-01 04:22:45 +00:00
David Huynh
7bb6674e5b Fixed recently introduced bug: expressions were not logged because preference stores were not initialized properly.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1124 7d457c2a-affb-35e4-300a-418c747d4874
2010-07-31 07:01:44 +00:00
David Huynh
f069780bfa Added support for bundling .js files to shave off some loading time.
For GetRowsCommand, tried to use jsonp but that didn't seem to improve performance much.
Gzip http responses of various text-based mime types.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1122 7d457c2a-affb-35e4-300a-418c747d4874
2010-07-31 06:51:11 +00:00
David Huynh
d71d84194f Register new operation Transpose Cells in Columns into Rows.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1112 7d457c2a-affb-35e4-300a-418c747d4874
2010-07-30 02:28:33 +00:00
David Huynh
ee14955605 Added new command Transpose Cells in Columns into Rows.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1111 7d457c2a-affb-35e4-300a-418c747d4874
2010-07-30 02:25:58 +00:00
David Huynh
a192674118 - added smartSplit GEL function that can handle quoted values
- added max width to operation extract dialog
- made GEL get and slice functions handle HasFieldsList
- fixed versioned standard-reconcile URLs (they need userid.user.dev)

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1110 7d457c2a-affb-35e4-300a-418c747d4874
2010-07-26 22:50:00 +00:00
David Huynh
2ff0184c65 - switched to accessing versioned standard-reconcile app
- standardized preference keys to using dot separated format
- added support to override freeq url from workspace preferences
- added GEL controls: forEachIndex, forRange, filter
- enforced max-width on preview table columns in expression preview dialog
- added preservedAllTokens option to split GEL function

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1109 7d457c2a-affb-35e4-300a-418c747d4874
2010-07-26 21:12:40 +00:00
David Huynh
4522b98f32 Store and use job ID to retrieve MDO ID and send that in subsequent loads.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1100 7d457c2a-affb-35e4-300a-418c747d4874
2010-07-16 17:32:06 +00:00
David Huynh
4373e7276f Pass target Freebase type IDs in recon objects to freeq.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1099 7d457c2a-affb-35e4-300a-418c747d4874
2010-07-15 21:45:17 +00:00
David Huynh
b854f99ef5 Removed extra closing brace.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1096 7d457c2a-affb-35e4-300a-418c747d4874
2010-07-11 05:54:06 +00:00
David Huynh
43dadf40da Added ignore:true to any triple that shouldn't be loaded.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1095 7d457c2a-affb-35e4-300a-418c747d4874
2010-07-10 05:07:24 +00:00
David Huynh
513283d4d1 Support creation of cache directories, so the rdf importer can store its lucene indexes.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1090 7d457c2a-affb-35e4-300a-418c747d4874
2010-07-08 23:22:29 +00:00
David Huynh
f5fc44e24e Refactoring to expose extension points that the rdf-exporter extension will plug into.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1074 7d457c2a-affb-35e4-300a-418c747d4874
2010-07-06 00:14:07 +00:00
David Huynh
ab82562016 Tripleloader protograph transposer now generates more context information for QA.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1073 7d457c2a-affb-35e4-300a-418c747d4874
2010-07-03 01:39:14 +00:00
David Huynh
217fb7b25c Fixed Issue 66: Records not excluded with inverted text facet.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1064 7d457c2a-affb-35e4-300a-418c747d4874
2010-07-01 20:26:54 +00:00
Stefano Mazzocchi
a682d6b36f fixing eclipse warnings
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1063 7d457c2a-affb-35e4-300a-418c747d4874
2010-07-01 08:03:42 +00:00
Stefano Mazzocchi
9fbff0640b make sure that splitting values maintains empty cells if the separator is repeated
(this is useful in case the cells contains a rigid structure across multiple columns)

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1062 7d457c2a-affb-35e4-300a-418c747d4874
2010-07-01 07:47:57 +00:00
Stefano Mazzocchi
2302d017d8 remove eclipse warnings
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1061 7d457c2a-affb-35e4-300a-418c747d4874
2010-07-01 07:47:52 +00:00
David Huynh
18b720b913 Fixed CSV and TSV export bug.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1059 7d457c2a-affb-35e4-300a-418c747d4874
2010-07-01 02:32:03 +00:00
David Huynh
2e3984d54a When transposing data to triple loader output, pass row indices and cell indices deep down so later we can generate more context information for recon.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1051 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-29 17:30:16 +00:00
David Huynh
0e4781cb58 Forgot a console.log() call.
Allow reconciling against no particular type.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1043 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-27 04:20:35 +00:00
David Huynh
76c8cd77eb "search for match" links in data table cells now use recon service's entity suggest options.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1041 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-27 03:31:56 +00:00
David Huynh
ecfb893e98 More work on the recon UI. Standard services can now be added.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1038 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-26 01:10:23 +00:00
David Huynh
1342ceacea Careful not to load all projects in an autosave cycle.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1037 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-25 22:43:30 +00:00
David Huynh
058e86b4c8 First pass in trying to generalize standard reconciliation service UI. A lot of pieces are still Freebase-centric.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1032 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-24 20:13:51 +00:00
Iain Sproat
f0ed50e468 issue 69 fixed. ControlFunctionRegistry now correctly registers Chomp expression under the "chomp" key.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1024 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-23 17:53:29 +00:00
David Huynh
a9f77d0f51 Minor bug.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1020 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-22 21:29:15 +00:00
Iain Sproat
0d7b3b0e9c ProjectManager is now partially unit tested.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1015 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-21 19:57:31 +00:00
Iain Sproat
dcf6919900 Functionality which didn't need to be moved to FileProjectManager as it wasn't file system specific has been moved back to ProjectManager. importProject function is now named loadProjectMetadata to avoid confusion.
Some additional source code documentation added to ProjectManager, and methods rearranged in more readable fashion.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1011 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-20 16:32:49 +00:00
Iain Sproat
7ced0cb31e New feature for importing text files (CSV and TSV). By selecting the checkbox in index.html it allows the effects of quotation marks around data values to be ignored.
Unit test added for this.

This has required a further branch to opencsv - patch sent to opencsv project and can be tracked at https://sourceforge.net/tracker/?func=detail&aid=3018599&group_id=148905&atid=773543

git-svn-id: http://google-refine.googlecode.com/svn/trunk@1010 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-20 14:47:45 +00:00
Iain Sproat
0af7e5fcf5 More functionality which didn't need to be moved to FileProjectManager, as it wasn't file system specific, has been moved back to ProjectManager.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@992 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-18 20:47:15 +00:00
Iain Sproat
c72b4571a5 Functionality which didn't need to be moved to FileProjectManager as it wasn't file system specific has been moved back to ProjectManager.
Some additional source code documentation added.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@991 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-18 19:16:30 +00:00
Iain Sproat
846cf1d57e Fixed bug in CsvExporter, all unit tests for CsvExporter and TsvExporter now working.
History now has the beginnings of a unit test.

Additional source documentation on public methods in ProjectManager and History.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@989 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-17 15:37:28 +00:00
David Huynh
e7d0fc5ed6 Implemented a generic preference store for both the whole workspace and each project.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@988 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-17 02:15:26 +00:00
Iain Sproat
18e319bb76 Moved call to FileHistoryEntryManager from ProjectManager to FileProjectManager.
Added interface HistoryEntryManager, which seems to have been forgotten from last commit.
FileHistoryEntry is now named FileHistoryEntryManager.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@983 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-16 14:17:17 +00:00
Iain Sproat
f92fc2d056 Internal refactor for IO - HistoryEntry is now a concrete class, so can be instantiated (reverting Operations classes back to r972 which were changed as a result of HistoryEntry being abstract).
HistoryEntry now deals with backend (filesystem etc.) through classes which implement HistoryEntryManager.  This HistoryEntryManager is held by ProjectManager, which allows for FileProjectManager to create FileHistoryEntryManager as appropriate.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@982 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-16 12:35:37 +00:00
Iain Sproat
280daad2f6 Refactored ImportProjectCommand and ExportProjectCommand. These are no longer dependent on the File System, and all file system related work is done in FileProjectManager.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@981 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-16 07:44:46 +00:00
Iain Sproat
f47cb75525 Fixed ImportProjectCommand so it no longer contains references to project.html, a file previously removed from the project.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@980 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-16 07:06:52 +00:00
Iain Sproat
17f1dc2e6f The file system coupled method getProjectDirectory is now removed from ProjectManager.
Methods of HistoryEntry which directly work with the File System have been moved to FileHistoryEntry in the io directory, and HistoryEntry made abstract.

As the abstract HistoryEntry cannot be instantiated directly, the ProjectManager is now responsible for creating new HistoryEntry.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@973 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-15 22:11:35 +00:00
Iain Sproat
b07075bed5 FileProjectManager and portions of Project and ProjectMetadata classes which deal with io are moved to an io directory.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@972 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-15 20:55:38 +00:00
Iain Sproat
c94957b6a0 CreateProjectCommand no longer contains references to project.html, a file previously removed from the project.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@971 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-15 20:28:18 +00:00
Iain Sproat
dc7060d390 portion of ProjectManager which interacts with File System has been moved to FileProjectManager, which extends ProjectManager. ProjectManager is now abstract.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@970 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-15 19:34:40 +00:00
Iain Sproat
a671551289 Two more XmlImport tests now work. Some documentation stubs were added to XmlImporterUtilities.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@967 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-15 16:22:53 +00:00
David Huynh
f7fe44dccc Converted project.html to project.vt and added a client side resource manager, where extensions can register scripts and styles to be included in .vt files
git-svn-id: http://google-refine.googlecode.com/svn/trunk@965 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-15 00:35:23 +00:00
David Huynh
b0389d8c6a Jython integration has been moved out to an extension.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@964 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-14 22:11:34 +00:00
Stefano Mazzocchi
af48cb799e moving Gridworks to use the Butterfly webapp framework (this will allow us to make gw more extensible without excessive complexity... as a bonus we gain server side javascript support which might end up being useful)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@940 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-05 00:50:18 +00:00
Stefano Mazzocchi
0648e8725e adding regexp group capturing GEL function
git-svn-id: http://google-refine.googlecode.com/svn/trunk@932 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-01 08:54:17 +00:00
Stefano Mazzocchi
5e0acf28d0 forgot to add the ngram class itself
git-svn-id: http://google-refine.googlecode.com/svn/trunk@931 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-01 08:04:40 +00:00
Stefano Mazzocchi
b3173211e3 adding an ngram function to GEL
git-svn-id: http://google-refine.googlecode.com/svn/trunk@930 7d457c2a-affb-35e4-300a-418c747d4874
2010-06-01 08:02:28 +00:00
Stefano Mazzocchi
3b7f132430 fixing jython initialization logic
git-svn-id: http://google-refine.googlecode.com/svn/trunk@924 7d457c2a-affb-35e4-300a-418c747d4874
2010-05-31 09:00:42 +00:00
Stefano Mazzocchi
e3fc7ab603 bringing the refactor branch up to speed with trunk
(everything works like in trunk for now, although some tests still fail)


git-svn-id: http://google-refine.googlecode.com/svn/branches/split-refactor@915 7d457c2a-affb-35e4-300a-418c747d4874
2010-05-30 18:18:59 +00:00
Stefano Mazzocchi
aa4de48f95 some renaming, moving tests into main
git-svn-id: http://google-refine.googlecode.com/svn/branches/split-refactor@906 7d457c2a-affb-35e4-300a-418c747d4874
2010-05-30 16:55:53 +00:00