Owen Stephens
cccf1e55c9
Update split multi-valued cells to support split by regex and split by lengths
2017-10-22 23:54:18 +01:00
Jacky
63c1714d0a
add fields for metadata
2017-10-22 00:37:59 -04:00
Jacky
f1ab6b8cd6
Merge branch 'master' of https://github.com/OpenRefine/OpenRefine
2017-10-21 23:49:58 -04:00
Jacky
818e139b43
add the import options to metadata
2017-10-21 23:41:11 -04:00
Antonin Delpeuch
21f4d62474
Merge pull request #1275 from OpenRefine/wikitext-url-fix
...
Forbid pipe characters in URL references to ease parsing.
2017-10-20 16:41:00 +02:00
Antonin Delpeuch
e2a22a6994
Forbid pipe characters in URL references to ease parsing.
...
This is a temporary fix before we do full Wikitext parsing inside references
(this needs a change upstream). See https://github.com/sweble/sweble-wikitext/issues/67 .
2017-10-20 15:32:58 +01:00
Antonin Delpeuch
54acf10edf
Change "topic" to "item" in the UI
2017-10-18 12:39:40 +01:00
Antonin Delpeuch
473b1b135d
Merge pull request #1264 from OpenRefine/issue1262
...
Update Jackson to 2.9.1
2017-10-09 20:09:49 +02:00
Antonin Delpeuch
c9cc4fb262
Update Jackson to 2.9.1
...
Closes #1262
2017-10-09 17:38:09 +01:00
Owen Stephens
bb6b8378d3
Ensure _max is never less than _min
2017-10-09 17:13:43 +01:00
Antonin Delpeuch
1da3c00cb1
Perform ASCII normalization earlier in FingerprintKeyer.
...
This closes #1256 .
2017-09-27 09:23:40 +01:00
Antonin Delpeuch
cfc0b95cd1
Fix string comparison in Wikitext exporter
2017-09-23 23:13:18 +01:00
Antonin Delpeuch
a1b2c9b683
Add support for references in Wikitable importer.
...
Closes #1243 .
2017-09-23 22:54:43 +01:00
Antonin Delpeuch
49564e8905
Fix bug when an extra column starts in the middle of the table
2017-09-19 13:54:27 +01:00
Antonin Delpeuch
00f8e4fc6b
Merge pull request #1237 .
...
Conflicts:
.classpath
main/webapp/modules/core/langs/translation-en.json
main/webapp/modules/core/scripts/dialogs/extend-data-preview-dialog.js
Closes #363 and #56 .
2017-08-28 16:25:50 +01:00
Antonin Delpeuch
c66e609b1d
Cleanup wikitext PR for Codacy
2017-08-26 21:50:02 +01:00
Antonin Delpeuch
0a00fd9318
Add option to include raw templates as cells
2017-08-25 14:28:30 +01:00
Antonin Delpeuch
554b75fa7b
Fix parsing of newlines in cells
2017-08-17 19:18:50 +01:00
Antonin Delpeuch
7989aacc58
Cleanup for Codacy
2017-08-17 12:40:56 +01:00
Antonin Delpeuch
637e69db9d
Better error reporting and testing for Wikitext import
2017-08-16 10:30:51 +01:00
Antonin Delpeuch
3dcda5a42c
Add reconciliation config in wikitext import.
2017-08-16 00:05:40 +01:00
Antonin Delpeuch
86dc240335
Support reconciliation via sitelinks.
...
Wikilinks are automatically reconciled at import time.
Related to #56 .
2017-08-15 20:17:34 +01:00
Antonin Delpeuch
aa4517ba58
Add support for colspan and rowspan in Wikitext
2017-08-15 11:28:43 +01:00
Antonin Delpeuch
73f7fdc036
Update TextFormatGuesser to support wikitext
2017-08-14 15:58:27 +01:00
Antonin Delpeuch
e168c900e8
Add support for table headers
2017-08-13 20:14:48 +01:00
Jacky
c3e04010b1
Merge branch 'master' into master
2017-08-13 14:09:56 -04:00
Antonin Delpeuch
b8a781d366
Add support for links (unreconciled for now)
2017-08-13 12:57:46 +01:00
Antonin Delpeuch
e6406f56ef
Initial version of the wikitext importer
2017-08-13 11:26:59 +01:00
Antonin Delpeuch
dbb071da30
Merge branch 'default-to-english' of https://github.com/RBGKew/OpenRefine into RBGKew-default-to-english
2017-08-09 14:07:22 +01:00
Jacky
275dac976e
fix #137
2017-08-07 21:53:35 -04:00
Antonin Delpeuch
66eac0fae9
Ensure null values are not cached in URL fetching operation. Closes #1219 .
2017-08-01 13:05:29 +01:00
jackyq2015
53baa5a833
put the correct params description
2017-07-28 20:37:20 -04:00
jackyq2015
4950d29074
add backward compatibility for cross function
2017-07-23 19:19:58 -04:00
Thad Guidry
7f92251ed1
Merge pull request #1210 from wetneb/extend
...
Add data extension capabilities to the reconciliation API
2017-07-17 18:01:37 -05:00
Antonin Delpeuch
84c06821ee
Data extension tests
2017-07-16 11:47:12 +01:00
Antonin Delpeuch
05873f283d
Integration of constraints with service-defined forms
2017-07-14 22:17:40 +01:00
Antonin Delpeuch
3eadefe613
Do not add reconciliation statistics on columns without types
2017-07-14 12:53:54 +01:00
Antonin Delpeuch
6501c235e8
Pass the identifier and schema spaces along to create better ReconCandidates
2017-07-14 12:30:39 +01:00
Antonin Delpeuch
cc991cab21
Add nicer spinning gif while preview is loading.
...
Fix bug of multiple ColumnInfo being generated.
2017-07-14 11:30:17 +01:00
Antonin Delpeuch
d99128c330
Retrieve types from the extend service
2017-07-06 21:15:37 +02:00
Antonin Delpeuch
ad3a174abd
Starting to migrate data extension to standard reconciliation services
2017-07-04 23:14:19 +02:00
jackyq2015
1ee339cbbd
cross function test suite. #1204
2017-06-28 08:12:36 -04:00
jackyq2015
f03be76475
Extend cross() function to take either a cell or a value #1204
2017-06-25 21:04:00 -04:00
Felix Lohmeier
2557cc5419
bugfix for new option autosave period
2017-06-24 22:42:49 +02:00
Felix Lohmeier
e54199a6f1
added options for initial java heap space and autosave period
2017-06-22 12:27:55 +02:00
Adi Eyal
09c00c6a19
Fixes #1181
2017-05-05 23:38:37 +02:00
Bob Harper
909df1b6a7
xor can also accept 2+ params, rewrite tests to be consistent
2017-04-27 11:20:48 +01:00
Bob Harper
ef4e039998
allow more than 2 AND and OR conditions
2017-04-26 20:51:58 +01:00
wangwenxiang
660df900d4
Fix bug: load wrong new value for RowStarChange
2017-03-15 12:54:01 +08:00
wangwenxiang
0314f49f36
Fix bug: load wrong new value for RowFlagChange
2017-03-15 10:39:33 +08:00
Jacky
912600f0bd
Merge pull request #1178 from wetneb/url_caching
...
Add caching in URL fetching
2017-03-09 17:28:38 -05:00
Antonin Delpeuch
22124ac57e
Add checkbox to disable caching
2017-03-09 00:21:34 +00:00
Antonin Delpeuch
32c232c2d6
Move to Guava's cache for ColumnAdditionByFetchingURLsOperation
2017-03-08 09:32:34 +00:00
Antonin Delpeuch
a9c4b0af16
Cache String, not URL, in ColumnAdditionByFetchingURLsOperation
2017-03-08 07:45:11 +00:00
Antonin Delpeuch
782a2f5b48
Add caching in URL fetching
2017-03-07 20:24:50 +00:00
Jacky
5aede573dc
bump version to 2.7
2017-02-10 15:55:58 -05:00
Qi Cui
773151380e
fix #1138 . column transpose
2016-08-24 13:56:35 -04:00
Tom Morris
aa65bc5c18
Throw exception on error instead of logging to console
2016-05-17 15:10:09 -04:00
Tom Morris
6df822e5a6
Set ContentType to application/json
2016-05-17 15:10:09 -04:00
Tom Morris
5d45566455
Protect against NPE when content type is missing
2016-05-17 15:10:09 -04:00
Scott Wiedemann
16b0453b74
Update ToDate.java
...
Updating SimpleDateFormat api doc url for ToDate function.
2015-11-13 12:27:16 -07:00
Steffen Stundzig
7f5e58ef51
#1086 add support for quote character
2015-10-30 14:32:46 +01:00
Tom Morris
be7f880cbe
Revert addition of synchronized methods
2015-10-16 19:33:15 -04:00
Tom Morris
e3858da843
Escape cell data for HTML - fixes #1049
2015-10-16 15:41:03 -04:00
Martin Magdinier
8b4a1d577a
Merge pull request #1079 from RefinePro/issue-796
...
fixed issue #796 Columnize by key/value columns creates empty lines
2015-10-08 14:01:07 -04:00
jackyq2015
7a2a0eb52f
fixed issue #796 Columnize by key/value columns creates empty lines
2015-09-29 20:12:05 -04:00
Tom Morris
48681e8877
Move assert where it belongs
2015-09-25 20:01:27 -04:00
Tom Morris
be936a86eb
Clean up PR #1055
2015-09-25 19:01:16 -04:00
Tom Morris
de66afa512
Revert "Use new algorithm for levenshtein clustering"
2015-09-25 16:44:25 -04:00
Thad Guidry
175f4a5319
Merge pull request #1047 from lemmingapex/master
...
Fixed #1046 Combine xls and xlsx formats by inspecting file header information in ExcelImporter
2015-09-21 20:33:05 -05:00
Thad Guidry
94e219042e
Merge pull request #1007 from lispc/master
...
Use new algorithm for levenshtein clustering
2015-09-21 20:23:45 -05:00
Thad Guidry
85ffce60d2
Merge pull request #1070 from RefinePro/issue-995
...
fix issue #995
2015-09-21 20:12:51 -05:00
jackyq2015
d671d7784b
fix issue #995
2015-09-21 21:03:25 -04:00
magdmartin
ab56b73db9
Merge pull request #993 from RefinePro/OpenRefine-trunk
...
prevent the multiple sorting
2015-09-20 09:32:17 -04:00
magdmartin
b635f4e067
Merge pull request #1055 from RefinePro/issue-512
...
fix issue #512 to save the file location as a table column
2015-09-20 09:31:16 -04:00
magdmartin
ab6e2951e9
Merge pull request #1051 from RefinePro/issue-1015
...
Issue 1015. add the meta utf-8
2015-09-20 09:28:10 -04:00
jackyq2015
4e6f584cde
fix issue #512 to save the file location as a table column
2015-08-27 15:13:20 -04:00
jackyq2015
dc7535c63e
1. take out the issue #1021 fix which was mistakenly put in
...
2. fix the expected value for JUNIT
2015-08-06 21:31:37 -04:00
Scott Wiedemann
5eab8893cc
Fixed #1046 Combine xls and xlsx formats by inspecting file header information in ExcelImporter.
2015-07-30 16:19:26 -06:00
jackyq2015
819e1ba5c6
patch for issue #708 . fix few hanging UIs when importing file
2015-07-18 10:27:35 -04:00
lispc
43e441a4d0
Use new algorithm for levenshtein clustering
2015-06-01 20:35:21 +08:00
Jacky
ca862970a4
prevent the multiple sorting
2015-05-01 15:04:51 -04:00
magdmartin
383f8c5e50
Changed GREL to *General Refine Expression Language* as agreed in 2013 when drafting *Using OpenRefine*
2015-04-21 10:35:52 -04:00
Matthew Blissett
5cdc6d7b5a
Fallback to English language to avoid need to maintain 'default' translation files.
2015-02-10 12:33:08 +00:00
QI CUI
495dcd7bd5
use LinkedHashMap instead of HashMap to preserve the retrieval order
2015-01-30 15:03:20 -05:00
Tom Morris
83da996a36
Change to Java 5 loop syntax
2014-12-23 00:04:24 -05:00
Tom Morris
ddfaecb3e6
Merge pull request #914 from opendatatrentino/rev-masschange
...
Fix wrong revert order in MassChange
2014-12-22 23:50:31 -05:00
David Leoni
4d2b90ad60
added MassChangeTests
2014-12-22 12:23:49 +01:00
Tom Morris
ea723413cb
Use StringUtils.toString() convenience method
2014-12-21 11:39:34 -05:00
Tom Morris
4eb6eb6eda
Merge pull request #915 from opendatatrentino/fixNullCellToString
...
Fixes Cell.toString failing on null value
2014-12-21 11:13:34 -05:00
Matthew Blissett
f3e2b9622a
Add charset=UTF-8 to HTTP Content-Type for reconciliation queries.
...
Fixes problem where non-ASCII characters would be URL encoded as UTF-8, but interpreted according to the whims of the server.
2014-11-28 10:45:22 +00:00
David Leoni
c3884c57f5
Fixes Cell.toString failing on null value
2014-11-27 18:45:01 +01:00
David Leoni
d29bf230b5
Fixes wrong revert order in MassChange
2014-11-27 18:12:54 +01:00
Thad Guidry
cdda1edcf0
Fixed issue with null cells after Fetch URL
...
Some websites do not set the charset= properly and use enclosing quotes. Tested and Verified.
2014-08-13 21:39:30 -05:00
Tom Morris
536493c5d3
Fix AbstractMethodError 500 - fixes #589
2014-08-05 14:55:45 -04:00
Tom Morris
2fa9cf11c8
Merge pull request #859 from Arcadelia/Job-lastTouched-fix
...
Initialized ImportingJob.lastTouched
2014-07-03 10:36:48 -04:00
Tom Morris
655e0b0dc1
Wrap conditional statement in block
2014-07-03 10:35:24 -04:00
Tom Morris
b21cb56149
Merge pull request #852 from Arcadelia/Duplicate-job-id-fix
...
Import job duplicate id fix
2014-07-03 10:34:29 -04:00
Tom Morris
4333b1b2e7
Merge pull request #881 from zsxwing/simple-date-format-bug
...
Put ISO8601_FORMAT into ThreadLocal to fix the concurrency issue
2014-07-03 10:15:03 -04:00
Tom Morris
d106d61b25
Improve error messages - fixes #878
2014-05-30 01:47:22 -04:00
Tom Morris
5799c3d92b
Synchronize access to processes list - fixes #862
2014-05-30 01:47:21 -04:00
zsxwing
4ee8e079c9
Put ISO8601_FORMAT into ThreadLocal to fix the concurrency issue
2014-05-30 11:45:28 +08:00
Tom Morris
a4d03968a5
Merge pull request #867 from abhillman/exceloutput255bugfix
...
Report error to user when attempting to export >255 columns, rather than generic 500 ISE
2014-04-20 23:43:19 -04:00
Aryeh Hillman
2bf35e5f0d
Fix when exporting to excel files
...
When exporting to excel, there cannot be more than 255 columns.
If there are more columns than that, we write "ERROR: TOO MANY
COLUMNS" to the 255th column. Formerly, OpenRefine reported
a 500 Server error.
2014-04-12 16:41:54 -07:00
Frank Wennerdahl
8c02a13429
Initialized ImportingJob.lastTouched
...
Prevents the CleaningTimerTask from disposing newly created
ImportingJobs which have not yet been touched.
2014-02-19 16:02:45 +01:00
Frank Wennerdahl
a0d4eb0058
Job id duplicate fix
...
Changed how job ids are created to avoid the same id being assigned to
two concurrent jobs.
2014-02-05 12:21:50 +01:00
Frank Wennerdahl
6dedae37a1
Fixed too frequent job cleanups
...
The ImportingManager cleans up jobs that have not been touched in 60ms.
According to the comment this should be 60 minutes, but it was changed in
4529310237.
2014-02-05 11:07:41 +01:00
Tom Morris
bc801546cc
Remove references to obsolete splitIntoColumns option
2013-09-18 18:44:58 -04:00
Tom Morris
4f2ebed676
Make localization language list dynamic - fixes #807
...
- refactor LoadLanguageCommand so language loading can be reused
- add GetLanguagesCommand for the server
- change GUI to fetch language list and update selection list with it
2013-09-18 13:16:24 -04:00
Tom Morris
1261734f15
Partial solution for #816 plus improved conversion test coverage
2013-09-18 11:14:48 -04:00
Tom Morris
d84f897ae0
Improve help message to specify an integer is returned
2013-09-18 11:12:34 -04:00
Tom Morris
f344e3da1c
Return "null" for toString(null) - fixes #783
...
- also fixed grammar in error message
2013-09-18 10:20:17 -04:00
Tom Morris
daed3bd90c
Move MARC->XML conversion to earlier in process - issue #794
...
- functional now, but probably not good enough to release yet
2013-09-17 19:19:50 -04:00
Tom Morris
6bd6a5934b
Start wiring up MARC importer - issue #794
2013-09-17 17:17:23 -04:00
Tom Morris
cce480ff38
Fix implementation for #466 to handle default empty string
2013-09-04 18:59:13 -04:00
Tom Morris
889245fdf4
Make the number of reconciliation results configurable - closes #466
2013-09-04 18:07:12 -04:00
Thad Guidry
f2c4e3ab48
Added ability to extract MILLISECOND to datePart (milliseconds,ms,S)
2013-08-30 09:09:54 -05:00
Tom Morris
c68c1bb2b1
Upgrade to Clojure 1.5.1 & switch to clojure-slim JAR - #792
2013-08-26 19:40:37 -04:00
Tom Morris
62b8c476f1
Use Java's built-in Number formatter instead of ICU4J which is
...
massive - #792
2013-08-26 15:47:12 -04:00
Tom Morris
4529310237
Switch from TimerTask to ScheduledExecutorService for more robustness
2013-08-18 11:31:03 -04:00
Tom Morris
e93bfa798e
Use iterator when removing to avoid ConcurrentModificationException -
...
fixes #652
2013-08-17 13:45:22 -04:00
Tom Morris
3315136681
Allow reinitialization of ProjectManager singleton - fixes #787
2013-08-17 12:47:57 -04:00
Tom Morris
25f02dd9b9
Fix Java 6 incompatibility
2013-08-15 15:57:24 -04:00
Tom Morris
fa072df85c
Add locale support to toDate() - fixes #729
2013-08-15 15:19:01 -04:00
Tom Morris
ab42df6ea3
Merge pull request #658 from Arcadelia/CSV_Multi-char-separator_support
...
Support for multi-char-separators in CSV
2013-08-14 07:29:45 -07:00
Tom Morris
37d8abc114
Minor improvement to recon error handling
2013-08-10 18:03:06 -04:00
Tom Morris
1d8784e059
Make workspace saving and loading more robust - fixes #528
...
- don't overwrite old files if we get an error writing new ones
- don't write unchanged data
- keep backup files around until next write rather than deleting
immediately
- attempt to recreate missing metadata as best as possible
2013-08-09 19:53:53 -04:00
Tom Morris
579d71b7eb
Switch back to NUL character for quote now that OpenCSV handles it -
...
fixes #653
2013-08-07 17:07:17 -04:00
Tom Morris
7b5b549113
More project saving changes for #528
...
- reduce project retention in memory from 1 hr to 15 min.
- free all unmodified projects if we get an error on save (we could be
running low on memory)
- make sure exceptions propagate up to where they can be usefully
handled
2013-08-05 14:13:56 -04:00
Tom Morris
190a031a8a
Comments only. No code changes.
2013-08-05 14:11:06 -04:00
Tom Morris
3500f20e47
Save all modified projects before importing new one - hopefully helps
...
#528
2013-08-05 14:10:26 -04:00
Tom Morris
57f5e9873d
Add Javadoc. No code changes.
2013-08-05 13:08:30 -04:00
Tom Morris
c3cab0524a
Narrow exceptions thrown and let them propagate up so we know
...
workspace file isn't valid - first step for #528
2013-08-05 13:08:02 -04:00
Tom Morris
a7273625d7
Add support for Basic Authentication over HTTPS - addresses #217
2013-08-02 19:15:24 -04:00
Tom Morris
4f7da9d18e
Switch to Apache HTTP client for downloads - fixes #748
2013-08-02 18:13:41 -04:00
Tom Morris
d7531bbbd8
Handle quoted fields with embedded new lines. Sort separators by score
...
rather than just standard deviation
2013-08-02 17:59:09 -04:00
Tom Morris
f4ff227340
Clean up localization - fixes #760 , modifies pull request #755
...
- make all file loading relative to module base
- move core language files into appropriate place
- eliminate all SetLanguage commands and use SetPreference instead
- eliminate all LoadLanguage commands except for core's
- fix duplicate keys in JSON language files
- remove BOM from JSON language files
OPEN - task 760: Translations not being loaded from built kit
http://github.com/OpenRefine/OpenRefine/issues/issue/760
2013-07-31 00:31:31 -04:00
Tom Morris
9450d483ce
Fix up line endings
2013-07-29 15:49:20 -04:00
Tom Morris
3003c1a709
Make importers more robust to preview errors when someone selects the
...
wrong importer/parser
2013-07-27 13:35:12 -04:00
Tom Morris
57ca70132c
Turn all import conversions off by default - fixes #478
2013-07-27 13:32:26 -04:00
Tom Morris
5123dad6a8
More conservative approach for locking of jobs table
2013-07-26 18:51:08 -04:00
Tom Morris
0dc14af1aa
Fix bug in refactoring of ImportingJob from commit
...
1e5f89e84c
2013-07-26 18:50:03 -04:00
Tom Morris
46a1e198d8
Recompute max cell index when rebuilding maps in ColumnModel - fixes #406
2013-07-26 18:48:20 -04:00
Tom Morris
7edc550618
Give a reasonable error message on Excel 95 import failure - fixes #564
2013-07-26 16:24:56 -04:00
Tom Morris
dc4d04c132
Allow arrays containing null in Filter & ForEach - fixes #741
2013-07-26 15:20:44 -04:00
Tom Morris
1e5f89e84c
Centralize handling of import job config object & synchronize to allow
...
multiple accessors
2013-07-25 15:41:08 -04:00
Tom Morris
dc206e1889
Switch to ConcurrentHashMap for jobs table to allow multiple accessors
2013-07-25 15:36:54 -04:00
Tom Morris
0ff2d7ed9f
Simplify implementation from pull request #728
2013-07-25 13:45:44 -04:00
Tom Morris
6dd4b8ea23
Add tests for boolean functions and tighten up error handling
2013-07-25 13:45:04 -04:00
Tom Morris
2c2c0d3d68
Merge pull request #728 from jmcastagnetto/master
...
Implements Xor operation
2013-07-25 10:00:11 -07:00
Blakko
6e90bc41f6
Merge remote-tracking branch 'origin/master' into internationalization
...
Conflicts:
extensions/freebase/module/scripts/dialogs/schema-alignment/schema-alignment-dialog.html
main/webapp/modules/core/index.vt
main/webapp/modules/core/project.vt
main/webapp/modules/core/scripts/project/browsing-engine.js
main/webapp/modules/core/scripts/project/history-panel.html
2013-07-25 11:07:59 +02:00
Blakko
e6e6c8c002
Added a "Language Settings" menu at index
...
Now the language manually set has priority over the browser lang
Update translations
2013-07-12 11:12:33 +02:00
Tom Morris
92e4427c39
Adding a TODO
2013-07-10 15:13:22 -04:00
Tom Morris
32773122c4
Fix CollationKey creation - fixes #753
2013-07-10 15:12:49 -04:00
Blakko
552b0bf94b
Internationalization of the index part (create/open/update) of refine
2013-07-02 13:40:50 +02:00
Tom Morris
5b6bc888f7
Fix template escape processing. Fixes #752 .
2013-06-30 12:21:26 -04:00
Tom Morris
a3b4b45e4e
Support non-string types in facetCount() - fixes #591
2013-06-23 12:04:48 -04:00
Tom Morris
51c1bc4a2f
Refactor default toString with date support into separate utility
2013-06-23 12:02:13 -04:00
Tom Morris
c961bb64de
Flush all column caches on row removals/changes. Fixes issue 567.
2013-06-22 18:44:26 -04:00
Tom Morris
fd58bd3327
Move documentation to Javadoc where it's visible
2013-06-22 16:27:18 -04:00
Tom Morris
6e88d068ee
Throw a narrower exception
2013-06-22 16:26:45 -04:00
Jesus M. Castagnetto
0795bd8422
resolved .gitignore conflict
2013-06-19 12:10:32 -05:00
Jesus M. Castagnetto
b09bb4463e
fix error in index caught by thadguidry
2013-06-19 11:21:26 -05:00
Tom Morris
b91fc8a2b1
Use CollationKeys when sorting text. Fixes issue 738
2013-06-17 15:51:29 -04:00
Tom Morris
067fcacec7
Clean up to pass tests:
...
- don't include TAB in control characters which get stripped so we can
use it for splitting
- remove trailing space from normalize strings
2013-05-31 17:06:03 -04:00
Tom Morris
000c0a38a8
Compute delay from request issue, not response return. Fixes #721
2013-05-26 10:13:16 -04:00
Tom Morris
4a5d3d4662
Convert dates to ISO 8601 for reconciliation. Fixes #688 .
2013-05-26 10:08:55 -04:00
Tom Morris
7615db97cf
Add Javadoc and clean up variable naming. No functional change.
2013-05-26 10:07:37 -04:00
Tom Morris
36dd95c263
Add TODO for record mode operation
2013-05-26 07:54:33 -04:00
Tom Morris
567da6aa9f
Normalize line endings
...
Add .gitattributes & do one-time normalization of line endings
2013-03-23 18:46:20 -04:00
Tom Morris
6a91b5d75b
Use InputStream instead of Reader for JSON import - fixes #698
2013-03-23 18:36:05 -04:00
Tom Morris
6b3592982e
Remove O(n^2) issue in tree importers - fixes #699
...
- Add sparse/based list implementation for ImportRecord
2013-03-23 12:02:51 -04:00
Tom Morris
f78dfadcf3
Clean up tree import utilities for #699
...
- lazy allocate objects
- conditionalize logging to prevent calls to StringBuilder & toString()
These are secondary issues, but still worth cleaning up.
2013-03-23 11:56:58 -04:00
Tom Morris
0a2ba1b1ae
Switch from LinkedList to ArrayList
...
Just a simple list. No need for extra overhead.
2013-03-23 08:16:23 -04:00
Tom Morris
bfa7c34d17
Merge pull request #659 - closes #659
2013-03-18 21:24:01 -04:00
Tom Morris
31cffa1181
Merge remote-tracking branch 'upstream/master'
2013-03-18 21:16:55 -04:00
Tom Morris
8a61cf731b
Merge pull request #664 from Arcadelia/Preserve_Quotes
...
Quotes should not be removed from values
2013-03-18 18:12:51 -07:00
Tom Morris
fe943fe3ea
Flag English-specific stopwords for cleanup
2013-03-18 20:20:46 -04:00
Tom Morris
7b9f6836e1
Update key & id recon to new Freebase APIs - part of #696
2013-03-12 16:50:23 -04:00
Tom Morris
7578d3375f
Add logger and logging
...
- fix exception printing that goes nowhere
- make logger available for subclasses to use
2013-03-11 13:14:20 -04:00
Tom Morris
a2a8f4af2e
Patch applied - closed #315
2013-03-06 21:45:54 -05:00
Tom Morris
d8d82bf8b7
Clean up a couple more format guessing issues left over from #685
2013-03-06 20:39:39 -05:00
Tom Morris
369bfffb2f
Don't guess field widths unless we have at least 3 lines
...
- Investigation of #685 showed that single line files were being guessed
as fixed field width
2013-03-04 17:47:06 -05:00
Tom Morris
6b676f7513
Handle MIME media types which have charset param - fixes #685
2013-03-04 17:45:34 -05:00
Tom Morris
10bd7e3b75
Make upper bound of time facet inclusive - fixes issue #648
2013-03-03 16:06:20 -05:00
Tom Morris
eba03fc69e
Protect joins map with mutex - fixes issue #652
2013-03-03 09:36:43 -05:00
Tom Morris
7b3379afc7
fix range check in getFields - fixes issue 687
2013-02-26 16:35:21 -05:00
Tom Morris
389e762251
Merge remote-tracking branch 'upstream/master'
2013-02-26 00:01:06 -05:00
Tom Morris
95e13eac50
Improve recon error handling
2013-02-26 00:00:03 -05:00
Tom Morris
50888c6f2e
Merge pull request #666 from Arcadelia/Temp-file_removal
...
Fixed removal of upload temp files
2013-02-11 15:11:24 -08:00
Tom Morris
1033ce973e
TODO about memory usage
2013-02-03 15:56:54 -05:00
Jesus M. Castagnetto
71f3196048
added comment on implementation
2013-02-01 23:45:43 -05:00
Jesus M. Castagnetto
36d2c4ac44
Added full text of BSD 2-clause
2013-02-01 23:44:35 -05:00
Jesus M. Castagnetto
df450b20f7
Registering new XOR command
2013-02-01 22:42:01 -05:00
Jesus Castagnetto
fec35a8bc6
Update main/src/com/google/refine/expr/functions/booleans/Xor.java
2013-02-01 21:07:42 -05:00
Jesus Castagnetto
ebec459cfd
indentation change
2013-02-01 21:00:36 -05:00
Jesus Castagnetto
473e2f367f
Implementing Xor operation
2013-02-01 17:59:16 -08:00
Tom Morris
c0347225b8
Switch escape character from NUL to DEL in hopes that it's rarer.
2013-02-01 17:12:07 -05:00
Frank Wennerdahl
2c59a0059f
Fixed removal of upload temp files
...
Fixed an issue with an unclosed stream preventing upload temp files from
being removed after use. Also removed the use of FileCleaningTracker and
instead added manual removal of all tempfiles. By doing this the reaper
threads in FileCleaningTracker are avoided and files are removed
directly after use.
2013-01-24 09:59:09 +01:00
Frank Wennerdahl
64cf62e081
Fixed history and header update in IE
...
Due to Internet Explorer caching GET requests the Undo/Redo list and
column headers were not updated, leaving essential parts of the user
interface crippled even if Google Frame is installed. Adding
Cache-Control headers to the responses fixes this.
2013-01-24 09:39:12 +01:00
Frank Wennerdahl
1f7ab046c7
Quotes should not be removed from values
...
Leading quotation marks should not be removed from values. If they have
been left by the importing parser they should be considered part of the
value.
2013-01-24 09:04:17 +01:00
Frank Wennerdahl
ebdc40ad71
Added CSV quote options
...
Added two additional CSV options, one for parsing and one for export.
Specifying strict quotes when parsing will ignore all data not quoted.
Specifying quote all when exporting will enclose all values in quotes.
No front-end changes made, just added the support for the options in the
requests.
2013-01-21 08:21:16 +01:00
Frank Wennerdahl
f837643f1e
Support for multi-char-separators in CSV
...
This change requires that the following patch is applied to OpenCSV:
http://sourceforge.net/tracker/index.php?func=detail&aid=3599477&group_id=148905&atid=773543
2013-01-18 16:28:27 +01:00
Tom Morris
33aa1132d7
Clarify wording/naming of blank rows export option - fixes issue #651
...
- clarify that it refers to all non-null cells
- rename variables without compatibility constraints to match actual
function
2013-01-14 16:36:09 -05:00
Tom Morris
0bd2104a16
Issue 630: Change branding from Google Refine to OpenRefine
...
** The first native GitHub commit (i.e. not one converted from SVN) **
Change Google Refine to OpenRefine or just Refine.
Change icon filenames and add some placeholder icons
2012-10-18 19:40:31 -04:00
Tom Morris
068e0916a2
FIXED - task 587: Correct initialization of the temporary directory - patch from the Wikier project
...
http://code.google.com/p/google-refine/issues/detail?id=587
https://bitbucket.org/wikier/google-refine/changeset/f3dbdb16a320#chg-main/src/com/google/refine/RefineServlet.java
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2583 7d457c2a-affb-35e4-300a-418c747d4874
2012-10-13 15:58:44 +00:00
Tom Morris
4d48741ce0
FIXED - task 574: create safe sheet names for Excel export - patch from jd@tekii.com.ar
...
http://code.google.com/p/google-refine/issues/detail?id=574
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2582 7d457c2a-affb-35e4-300a-418c747d4874
2012-10-12 23:05:17 +00:00
Tom Morris
ca2e959957
FIXED - task 529: Add support for key/value transpose with only two columns as well as repeating key fields in a single record.
...
http://code.google.com/p/google-refine/issues/detail?id=529
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2574 7d457c2a-affb-35e4-300a-418c747d4874
2012-10-05 23:31:25 +00:00
Tom Morris
ffe674729c
Just a little Javadoc. No functional changes.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2573 7d457c2a-affb-35e4-300a-418c747d4874
2012-10-05 21:10:32 +00:00
Tom Morris
2c52a00f55
Fixed - issue 544,600,618: Clean up handling of compressed files & archives with multi-segment paths
...
http://code.google.com/p/google-refine/issues/detail?id=600
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2569 7d457c2a-affb-35e4-300a-418c747d4874
2012-09-22 18:08:56 +00:00
Tom Morris
748e205ae8
FIXED - task 616: Support bzip2 decompression on import
...
http://code.google.com/p/google-refine/issues/detail?id=616
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2568 7d457c2a-affb-35e4-300a-418c747d4874
2012-09-22 16:00:42 +00:00
Tom Morris
27e3c0c8dc
FIXED - task 614: Use same instance of OAuthProvider in OAuth dance. Patch supplied by sdeo@google.com
...
http://code.google.com/p/google-refine/issues/detail?id=614
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2566 7d457c2a-affb-35e4-300a-418c747d4874
2012-09-19 23:16:29 +00:00
Tom Morris
b3f5fada95
FIXED - task 578 & 596: Clean up JSON importer
...
http://code.google.com/p/google-refine/issues/detail?id=578
http://code.google.com/p/google-refine/issues/detail?id=596
Extend tree parser framework to allow any Serializable instead of just Strings. Use this in JSON importer to: Import keywords null, true, false; Import empty strings and don't trim whitespace from strings on import; Import numbers directly instead of importing them as text and then parsing them ourselves. Add tests to verify all this stuff
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2543 7d457c2a-affb-35e4-300a-418c747d4874
2012-09-08 01:20:25 +00:00
Tom Morris
93d6e176d6
Task 478: Default "guess datatypes" to False so importers which don't specify it (e.g. gData & Excel) aren't affected
...
http://code.google.com/p/google-refine/issues/detail?id=478
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2541 7d457c2a-affb-35e4-300a-418c747d4874
2012-09-07 21:17:34 +00:00
Tom Morris
83dce305cb
FIXED - task 432: cross() failing - flush join cache table when column changes
...
http://code.google.com/p/google-refine/issues/detail?id=432
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2539 7d457c2a-affb-35e4-300a-418c747d4874
2012-08-30 16:31:22 +00:00
Tom Morris
9b54a8f29e
FIXED - task 559: Deadlock between autosave thread and history code
...
http://code.google.com/p/google-refine/issues/detail?id=559
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2538 7d457c2a-affb-35e4-300a-418c747d4874
2012-08-30 16:22:28 +00:00
Stefano Mazzocchi
ba89daec1c
make oauth against freebase work again in chrome
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2537 7d457c2a-affb-35e4-300a-418c747d4874
2012-08-23 20:58:08 +00:00
Tom Morris
12a61b6ec6
task 603: range check column move commands
...
http://code.google.com/p/google-refine/issues/detail?id=603
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2534 7d457c2a-affb-35e4-300a-418c747d4874
2012-08-18 22:01:23 +00:00
Tom Morris
202018fac4
Add Javadoc. No code changes.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2533 7d457c2a-affb-35e4-300a-418c747d4874
2012-08-18 22:00:41 +00:00
Tom Morris
4bb6c43982
task 604: add Guava to main project so that we're not dependent on an extension
...
http://code.google.com/p/google-refine/issues/detail?id=604
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2531 7d457c2a-affb-35e4-300a-418c747d4874
2012-08-15 13:33:17 +00:00
Tom Morris
1e043dcc94
FIXED - task 604: The common transform “Trim leading and trailing whitespace” doesn’t trim non-breaking spaces
...
http://code.google.com/p/google-refine/issues/detail?id=604
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2529 7d457c2a-affb-35e4-300a-418c747d4874
2012-08-14 23:00:13 +00:00
Tom Morris
f29f77e8f8
STARTED - task 604: The common transform “Trim leading and trailing whitespace” doesn’t trim non-breaking spaces
...
http://code.google.com/p/google-refine/issues/detail?id=604
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2528 7d457c2a-affb-35e4-300a-418c747d4874
2012-08-14 20:09:49 +00:00
Tom Morris
4bf212c03d
FIXED - task 154: Can't import RDF/XML Data
...
http://code.google.com/p/google-refine/issues/detail?id=154
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2526 7d457c2a-affb-35e4-300a-418c747d4874
2012-08-05 16:31:41 +00:00
Tom Morris
5881addac8
Throw an exception if unsupported verb is used
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2525 7d457c2a-affb-35e4-300a-418c747d4874
2012-08-05 15:36:23 +00:00
Tom Morris
b2ae74d23f
FIXED - task 586: Only one parse date format is attempted from list in toDate(format1,format2)
...
http://code.google.com/p/google-refine/issues/detail?id=586
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2520 7d457c2a-affb-35e4-300a-418c747d4874
2012-08-03 18:01:01 +00:00
Tom Morris
4319314675
FIXED - task 594: Date diff function doesn't work for two Calendar objects
...
http://code.google.com/p/google-refine/issues/detail?id=594
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2519 7d457c2a-affb-35e4-300a-418c747d4874
2012-08-02 21:41:19 +00:00
Tom Morris
efa58630cf
Add constructor that takes a Throwable to eliminate redundant code from callers.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2518 7d457c2a-affb-35e4-300a-418c747d4874
2012-08-02 21:38:00 +00:00
Stefano Mazzocchi
2cb31b8b29
fixing oauth problems with redirection for the Freebase API
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2516 7d457c2a-affb-35e4-300a-418c747d4874
2012-08-01 21:46:53 +00:00
David Huynh
4cfb921082
Added getStringKey() method for when it is difficult to generate integer keys that don't collide
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2515 7d457c2a-affb-35e4-300a-418c747d4874
2012-07-19 00:25:41 +00:00
Stefano Mazzocchi
6e41f4ad91
make the latest eclipse happy (it triggers a warning)
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2513 7d457c2a-affb-35e4-300a-418c747d4874
2012-07-12 01:55:11 +00:00
Stefano Mazzocchi
bccea8cebe
we could be leaking file descriptors here
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2506 7d457c2a-affb-35e4-300a-418c747d4874
2012-06-30 07:05:08 +00:00
Stefano Mazzocchi
f84dcff900
moving oauth authorize and deauthorize into the core module because they are reusable across extensions
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2505 7d457c2a-affb-35e4-300a-418c747d4874
2012-06-29 19:39:42 +00:00
Tom Morris
8872c1b0a1
Keep track of when we have unsaved preference changes
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2502 7d457c2a-affb-35e4-300a-418c747d4874
2012-06-02 21:06:46 +00:00
Tom Morris
a0812c5751
Be slightly more tolerant of weird spreadsheet data
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2501 7d457c2a-affb-35e4-300a-418c747d4874
2012-06-02 21:00:30 +00:00
Tom Morris
c47b1e0ab7
Mark project as modified when metadata is changed
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2491 7d457c2a-affb-35e4-300a-418c747d4874
2012-04-14 14:10:11 +00:00
Tom Morris
8d22ede1f8
Issue 554 - rank formats *before* serializing them.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2482 7d457c2a-affb-35e4-300a-418c747d4874
2012-03-16 20:21:57 +00:00
Tom Morris
b3f8ce83c1
Issue 553 - Make sure we have a usable filename when importing from a URL
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2481 7d457c2a-affb-35e4-300a-418c747d4874
2012-03-16 20:16:18 +00:00
Tom Morris
51c586bc2c
Issue 543 - Handle HTTP responses with Content-Encoding of gzip
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2480 7d457c2a-affb-35e4-300a-418c747d4874
2012-03-16 20:12:10 +00:00
Tom Morris
a8cb23ca51
Issue 544 - preserve directory path after decompressing file
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2479 7d457c2a-affb-35e4-300a-418c747d4874
2012-03-16 20:06:54 +00:00
Tom Morris
e97e7523b2
Issue 548 - Convert non-strings to strings before escaping
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2463 7d457c2a-affb-35e4-300a-418c747d4874
2012-03-14 03:06:11 +00:00
Tom Morris
18b780bebe
Issue 517 - Fix combin() function to a) increase upper limit and b) keep it from continually recomputing the same values in recursion
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2459 7d457c2a-affb-35e4-300a-418c747d4874
2012-03-08 22:53:21 +00:00
Tom Morris
28ff2295fd
Issue 490 - Handle separator guessing for CSVs with quoted fields containing commas
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2458 7d457c2a-affb-35e4-300a-418c747d4874
2012-03-08 15:53:55 +00:00
Tom Morris
9a680e8307
Switch to class name for logging, per convention
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2457 7d457c2a-affb-35e4-300a-418c747d4874
2012-03-08 14:53:27 +00:00
Tom Morris
ddd3680128
Add a TODO for recon failure retries on HTTP 500s - no functional changes
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2455 7d457c2a-affb-35e4-300a-418c747d4874
2012-03-08 14:45:53 +00:00
Tom Morris
5a962b1768
Issue 534 - Attempt to recover recon links which have become corrupted
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2454 7d457c2a-affb-35e4-300a-418c747d4874
2012-03-08 00:37:29 +00:00
Tom Morris
dbdbd906b7
Issue 547 - Decompress kmz files
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2453 7d457c2a-affb-35e4-300a-418c747d4874
2012-03-08 00:29:25 +00:00
Tom Morris
4a99abf25d
Issue 542 - allow integers to be converted to dates
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2450 7d457c2a-affb-35e4-300a-418c747d4874
2012-03-03 21:36:36 +00:00
Tom Morris
5d080e5b3e
Wrap if statement in a block to avoid future problems.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2447 7d457c2a-affb-35e4-300a-418c747d4874
2012-03-01 18:10:59 +00:00
Tom Morris
c583ad4367
Issue 537 - Try to convert to Long first before converting to Double. Matches behavior on import.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2446 7d457c2a-affb-35e4-300a-418c747d4874
2012-02-26 17:27:00 +00:00
Tom Morris
190e817fb8
Protect against NullPointerException
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2444 7d457c2a-affb-35e4-300a-418c747d4874
2012-02-22 20:06:03 +00:00
David Huynh
e21ae32722
Make sure project ID is completely numeric. Slightly better error reporting on project page when project ID is not valid.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2441 7d457c2a-affb-35e4-300a-418c747d4874
2012-01-29 21:16:13 +00:00
Tom Morris
6414ae7f87
Remove redundant test
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2436 7d457c2a-affb-35e4-300a-418c747d4874
2012-01-27 20:38:55 +00:00
Tom Morris
40183aa0ba
Issue 513 - get rid of exception at end of import in JSON parser
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2435 7d457c2a-affb-35e4-300a-418c747d4874
2012-01-27 17:05:45 +00:00
Tom Morris
fdac0c30cf
Issue 524 - shorten __anonymous__ names for JSON importer
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2432 7d457c2a-affb-35e4-300a-418c747d4874
2012-01-26 22:38:25 +00:00
Tom Morris
df45d06b2b
Issue 523 - On URL fetch error, return HTTP error code, message, and contents of error stream (HTML page) if available
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2429 7d457c2a-affb-35e4-300a-418c747d4874
2012-01-26 18:47:30 +00:00
David Huynh
794629eee6
ChangeSequence did not save/load properly at all.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2427 7d457c2a-affb-35e4-300a-418c747d4874
2012-01-25 02:04:52 +00:00
David Huynh
893b767c01
ChangeSequence did not revert properly at all.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2426 7d457c2a-affb-35e4-300a-418c747d4874
2012-01-25 00:46:52 +00:00
Tom Morris
fa2e6fe608
Issue 517 - add some interim error checking and reporting
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2420 7d457c2a-affb-35e4-300a-418c747d4874
2012-01-12 06:20:28 +00:00
Tom Morris
8ec10a6ea6
Fix error message to match code
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2419 7d457c2a-affb-35e4-300a-418c747d4874
2012-01-12 05:51:16 +00:00
Tom Morris
b409ef5670
Issue 491 - fix off-by-one error in column counts
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2405 7d457c2a-affb-35e4-300a-418c747d4874
2011-12-09 23:50:40 +00:00
Tom Morris
b3bcb3361b
Issue 483 - make custom metadata available to the client
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2404 7d457c2a-affb-35e4-300a-418c747d4874
2011-12-09 23:05:42 +00:00
David Huynh
ae771a7ccb
Fixed Issue 502 in google-refine: Fetch URLs does not return the exact HTTP payload, like Create Project from URLs does.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2398 7d457c2a-affb-35e4-300a-418c747d4874
2011-12-02 20:44:13 +00:00
David Huynh
a7e2704655
Attempt at fixing Issue 500: Sequential creation of related columns using apply-operation command
...
by letting long-running processes report errors.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2394 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-30 23:54:40 +00:00
David Huynh
d419f4bbc7
For reinterpret function, swapped encoder and decoder arguments if decoder is specified, as discussed here:
...
http://groups.google.com/group/google-refine/msg/629dbf11b073e129
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2392 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-29 19:55:08 +00:00
Tom Morris
3b4bdbecdf
Issue 378 - JSONize NaNs as their string equivalent to keep JSONwriter from throwing an exception
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2391 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-29 07:57:36 +00:00
David Huynh
76802d328d
Default the encoding of clipboard data to UTF-8.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2390 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-29 05:21:26 +00:00
David Huynh
cdca6fff8f
Checked in Shardul Deo's patch from
...
http://groups.google.com/group/google-refine-dev/browse_thread/thread/5222a68396c56405
to support HTTP PUT and DELETE.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2387 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-25 03:48:03 +00:00
Tom Morris
f1b567bc31
Issue 487 - Add support for ISO 8601 date parsing
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2383 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-18 22:05:45 +00:00
Tom Morris
80c13e4b59
Issue 486 - make sure project character encoding doesn't get set to ""
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2381 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-18 20:52:49 +00:00
Tom Morris
d5dd04965a
Allow user to optionally override source encoding in reinterpret function so they can fix up bad projects. Interpret empty string as system default encoding.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2380 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-18 20:50:55 +00:00
Tom Morris
23ac625818
Issue 430 - Fix timeline facet to handle Calendar type as well as Date
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2379 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-17 23:52:32 +00:00
David Huynh
dbeaefb00b
Minor bug fix to previous check-in: made sure blank cells in the 2 newly generated columns don't get filled in.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2368 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-07 19:53:26 +00:00
David Huynh
d01745284b
Added option to "transpose columns into rows" operation for filling in other columns.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2367 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-07 19:39:23 +00:00
David Huynh
5aec75696d
Fixed Issue 477 in google-refine: Implement or remove the line separator option.
...
Also, fixed displaying bug in the fixed-width parser UI: previously, tab characters forced columns to be wider.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2364 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-06 20:13:05 +00:00
David Huynh
a35b9f53f7
Made operation "Transpose columns into rows" support the option of transposing into 2 new columns rather than just one.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2362 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-06 02:50:33 +00:00
Tom Morris
85a37d23f9
Issue 474 - implement record limit for XML and JSON importers
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2359 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-05 16:38:19 +00:00
David Huynh
b36b229ba4
Fixed Issue 465: Data text file with extension .dta within a .ZIP is not automatically extracted
...
.dta isn't recognized so there's no best format detected. But now we default to text/line-based and always select all files if no file gets selected by default.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2358 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-04 22:33:38 +00:00
David Huynh
41a90ad71f
Fixed Issue 459: Undefined error with some CSV files (incorrectly detected as EXCEL)
...
by favoring file name-based format over mime type-based format (because the user's computer might have .csv registered as an Excel format).
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2357 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-04 21:52:12 +00:00
David Huynh
2f6b635f66
Added initial implementation of Key/value Columnize operation and command.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2356 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-04 21:00:32 +00:00
Tom Morris
a7c81880a8
Issue 475 - Support escaped custom separators
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2355 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-04 19:04:16 +00:00
Tom Morris
cacbedd352
Fix index out of bounds exception when separator is the empty string
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2354 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-04 17:31:51 +00:00
Stefano Mazzocchi
856ef6a65a
commented out unused variables
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2352 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-01 21:47:24 +00:00
Tom Morris
71492c706c
Just some TODOs
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2349 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-28 17:51:20 +00:00
Tom Morris
ad8705e299
Javadoc only
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2348 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-28 17:29:35 +00:00
Tom Morris
a870e782f5
Make sure our counts are current before attempting to use them for sorting
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2347 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-28 17:28:27 +00:00
Tom Morris
5dad4d6a0b
Handle legacy projects which have an empty slot 0 for the column model (old off-by-one bug)
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2346 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-23 19:29:44 +00:00
Tom Morris
ab950689dd
Add debugging info - mostly toString() methods for types missing them
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2343 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-21 16:46:55 +00:00
Tom Morris
b2781bda3f
Javadoc only
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2342 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-21 16:30:37 +00:00
Tom Morris
9a9f4c1354
Issue 467 - provide JVM heap usage as part of the progress monitor during project creation.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2341 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-21 16:28:40 +00:00
David Huynh
f4b2ee3715
"Transpose columns into rows" operation now supports specifying the ending column to be the last column regardless of its name.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2337 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-19 13:42:50 +00:00
David Huynh
223074bb25
Xml importer should stop trying to skip over initial non-xml content after some number of characters.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2336 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-18 15:25:31 +00:00
Tom Morris
9710521ef8
Correct column counting so maxCellIndex represents current count rather than next column
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2335 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-14 21:00:50 +00:00
Tom Morris
5d6ab76b7c
Issue 313 - fix cell format so dates export as dates rather than numbers.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2334 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-14 20:23:59 +00:00
Tom Morris
2d5125af1e
Issue 462 - don't trim whitespace from string-valued cell contents on import
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2330 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-12 23:45:52 +00:00
Tom Morris
5c95c9c1f9
New exporter - Open Document Format (ODF) spreadsheets (.ods)
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2326 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-11 21:02:23 +00:00
Tom Morris
3bd84088da
Rename OO/ODS importer with more generic name
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2325 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-11 21:01:45 +00:00
Tom Morris
ee0fb9033e
Javadoc
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2324 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-11 20:57:40 +00:00
Tom Morris
ca17e1ef0a
New importer for Open Document Format (ODF) spreadsheet files (.ods)
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2323 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-11 20:27:40 +00:00
Tom Morris
2726f61a61
Add toString methods to help with debugging
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2321 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-11 20:19:53 +00:00
Tom Morris
5c856179cb
Add TODO for suspicious code
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2320 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-11 20:14:57 +00:00
Tom Morris
16421303cb
Add Javadoc
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2318 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-11 20:12:24 +00:00
David Huynh
55c3fdebab
Bumped up version to 2.5.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2314 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-10 21:58:42 +00:00
David Huynh
1a14d82393
For XML files, ignore not just leading whitespace but anything except <.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2313 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-10 20:51:00 +00:00
Tom Morris
fffd24d64b
Parse parameters from multipart/form-data POSTs rather than just dropping them (needed for Windmill tests, among other things)
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2302 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-07 23:15:55 +00:00
Stefano Mazzocchi
1f67866258
fixing a bunch of inconsistencies and potential bugs as indicated by findbugs, pmd and eclipse
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2301 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-07 21:23:23 +00:00
Tom Morris
31073d7712
Refactor importer interfaces to narrow exceptions thrown and handled
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2296 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-07 19:06:53 +00:00
Tom Morris
50927b33dc
Javadoc
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2295 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-07 18:56:23 +00:00
Tom Morris
4a230abb44
Narrow exception handling
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2294 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-07 18:55:46 +00:00
Tom Morris
29cbc5af20
Remove some obsolete TODOs. No functional changes.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2290 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-07 17:29:30 +00:00
David Huynh
18f32ed7e8
Fixed up Rdf Triples importer, added a parser UI for it, and got its tests to pass.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2283 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-06 21:28:20 +00:00
David Huynh
1c5dc32b88
Fixed tsv/csv tests.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2276 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-06 06:22:30 +00:00
Tom Morris
ac4a0ca747
Store blank cells as nulls if that's what the user requests
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2272 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-05 23:41:52 +00:00
Tom Morris
0ce0a0a8d3
Add toString support for null cells to help with debugging
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2271 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-05 23:33:17 +00:00
David Huynh
e7e9dbc74d
Minor fixes to pass some exporter tests.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2269 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-03 16:38:07 +00:00
David Huynh
7935dfd60e
Stricter detection of json and xml formats on import, by checking for initial nonspace character.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2266 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-30 01:47:42 +00:00
David Huynh
d047acf1d1
Fixed Issue 452: Importing using Clipboard function does not guess structure correctly for XML or JSON
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2263 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-29 14:02:12 +00:00
David Huynh
5762efebf6
Fixed Issue 397: New UI Importer Branch - individual JSON record nodes do not preview well.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2258 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-28 03:38:23 +00:00
Tom Morris
1b197d93d8
Issue 447 - allow users to specify delimiters for toTitlecase function
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2253 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-20 05:07:46 +00:00
David Huynh
e1184003df
Color-code date values in data table.
...
Fixed Issue 426: filter with custom facet adds zero lines choice
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2251 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-20 01:36:47 +00:00
Tom Morris
59d6020979
Add basic test coverage for ToTitleCase and (commented out) support for 2nd parameter
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2250 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-19 15:47:33 +00:00
David Huynh
82cc76f076
Fixed bug where a blank row used to corrupt the whole project because it could not be re-loaded from file.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2248 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-19 10:36:38 +00:00
David Huynh
9111157172
Fixed Issue 447: Extend toTitlecase() function with support for char[] delimiters in Apache WordUtils.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2247 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-19 09:48:37 +00:00
David Huynh
db3bbb5c86
Fixed xml parsing error due to whitespaces in front of <?xml>.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2246 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-19 09:06:36 +00:00
David Huynh
66cf0b6596
Fixed Issue 449: Uncaught exception from Excel importer.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2245 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-19 08:49:35 +00:00
David Huynh
5c446d28d0
Support uploading directly to a new Google spreadsheet.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2243 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-16 18:04:55 +00:00
David Huynh
02c58e2c56
Periodically clean up stale importing jobs to free up disk space.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2240 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-15 23:52:05 +00:00
David Huynh
0693205430
Added support for importing from fusion tables.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2239 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-15 21:40:40 +00:00
Tom Morris
ebede9b424
Issue 441 - return EvalError if we can't parse a date
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2237 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-13 20:58:43 +00:00
Tom Morris
131ff81c0d
Don't reschedule a canceled timer
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2236 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-13 20:38:34 +00:00
David Huynh
57c11d0238
Fixed issue 442: Two column transforms to date on the same column turns the cells blank
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2230 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-01 22:11:45 +00:00
David Huynh
a88ccd2c32
Reduced amount of logging.
...
Suppressed logging for the GetProcessesCommand, which gets ping'ed often while there is a long running operation being executed (e.g., reconciling, fetching URLs).
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2228 7d457c2a-affb-35e4-300a-418c747d4874
2011-09-01 18:26:45 +00:00
David Huynh
a8815956cd
Implemented back-end of customizable tabular exporting support.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2225 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-30 19:19:46 +00:00
Tom Morris
e174bb163a
Issue 440 - Don't purge from memory those projects with pending operations
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2222 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-28 22:00:02 +00:00
David Huynh
420e74c6f4
Made CreateProjectCommand scriptable again, so it can be called from client libraries.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2216 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-23 18:49:47 +00:00
David Huynh
4113a10b5b
Catch/log exceptions in the importers a bit more carefully.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2215 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-22 21:47:15 +00:00
David Huynh
f023b922e1
Implemented encoding selectors in a few importing parser UIs.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2214 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-22 17:55:06 +00:00
Tom Morris
bde63ff417
Last set of indentation cleanups - no functional changes
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2211 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-18 17:46:36 +00:00
Tom Morris
9d7b8a5279
Don't die if we get passed no candidates
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2210 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-18 17:39:18 +00:00
David Huynh
afb7953eac
Fixed problem for importing from an archive file containing fixed width column files: we used to create totally new columns for each contained file, yielding too many columns.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2203 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-14 02:53:19 +00:00
David Huynh
33d99186ea
Made fixed width column guessing slightly better.
...
Made sure fixed width parser UI takes into account the File column.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2202 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-14 02:05:18 +00:00
David Huynh
41e4e1cd70
Some more JS indentation fixes.
...
Fixed issue 31: "Maximum number of facet values should be configurable." Now when we're showing "too many choices" we also display exactly how many choices there are and show a link to change the limit.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2201 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-14 01:05:43 +00:00
David Huynh
e955ed05ae
Made sure busy indicator shows up for GData importing when needed.
...
Fixed radio button issue with GData worksheet selection.
Fixed resizing issue with open project action area.
Fixed NullPointerException in RecordModel.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2198 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-12 19:15:58 +00:00
David Huynh
823729776d
Google spreadsheets can now be imported directly from within Refine.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2192 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-11 00:35:01 +00:00
David Huynh
c5078d1887
Fixed issue 428: Excel import sometimes drops last row of data.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2189 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-06 19:37:23 +00:00
Tom Morris
da7347e7b1
Make sure all conditionals and loops are in blocks (too bug-prone otherwise)
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2183 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-02 22:21:47 +00:00
Tom Morris
c16a2378f9
Ask people not to reformat since this is imported code.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2182 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-02 22:18:50 +00:00
Tom Morris
539fea6eb3
Simplify some for loops using new Java 5 syntax
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2181 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-02 21:17:41 +00:00
Tom Morris
97a0f2a33e
Organize imports. com.google.refine last in a section of its own. Everything alphabetical in its section.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2180 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-02 21:10:22 +00:00
Tom Morris
5497fa4685
Remove unnecessary casts
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2173 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-02 20:33:57 +00:00
Tom Morris
7fd6e22af4
Convert tabs to spaces. No functional changes.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2172 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-02 20:26:32 +00:00