2525 Commits

Author SHA1 Message Date
Tom Morris
3717111db8
Fix Open Office Spreadsheet (ODS) dates (#2843)
* Truncate any completely empty columns on the right

Fixes #565
The current versions of Open Office create default spreadsheets
with over 1000 empty columns. Keep track of the rightmost
non-empty column when importing and truncate everything else.

Also adds a basic ODS import test.

* Fix dates in ODS spreadsheets

Fixes #2224
2020-07-04 08:42:33 +02:00
Antonin Delpeuch
f4692de9e1 Increase maximum wait for testInvalidUrl, follow-up for #2876 #2875 2020-07-03 21:48:43 +02:00
Tom Morris
df8d092132
Micro benchmark harness & ToNumber optimizations (#2859)
* Performance optimized version of ToNumber

Approximately 5x faster for floats (data dependent)
and about the same speed for integers.

- Instead of blindly trying to parse as Long, do a quick check
  for obvious problems (e.g. decimal point).
- Don't trim. It's already done by called methods.
- Use valueOf() instead of parse() to avoid object creation

* Add Java Microbenchmark Harness

The shaded JAR is missing the OpenRefine classes, for a reason
that I haven't figured out, so requires openrefine-main.jar at runtime.

* Remove old implementations of ToNumber

* Remove unneeded dependencies from main project

* Clean up and reformat
2020-07-03 21:42:44 +02:00
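The ToNumber optimization described in df8d092132 (check for obvious problems such as a decimal point before attempting a Long parse, and use valueOf()) could look roughly like the sketch below. This is an illustration only, not OpenRefine's actual code: the class name `ToNumberSketch`, the method `toNumber`, and the choice to return null for non-numeric text are all assumptions made for the example.

```java
final class ToNumberSketch {
    /**
     * Hypothetical helper: returns a Long or Double for numeric text,
     * or null when the text cannot be parsed as a number.
     */
    static Number toNumber(String s) {
        if (s == null || s.isEmpty()) {
            return null;
        }
        // Cheap pre-check: a decimal point or exponent marker rules out a long,
        // so skip the Long attempt (and the cost of its exception) entirely.
        boolean maybeLong = true;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '.' || c == 'e' || c == 'E') {
                maybeLong = false;
                break;
            }
        }
        if (maybeLong) {
            try {
                return Long.valueOf(s);
            } catch (NumberFormatException e) {
                // Not a long after all; fall through to the floating-point attempt.
            }
        }
        try {
            return Double.valueOf(s);
        } catch (NumberFormatException e) {
            return null;
        }
    }
}
```

The pre-check matters because a failed Long parse throws an exception, which is comparatively expensive when most inputs are floats.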
Tom Morris
5d6af9cb6c
Merge pull request #2865 from tfmorris/2863-tree-column-ordering
Remove shortest-column-name ordering - fixes #2863
2020-07-03 15:23:36 -04:00
Tom Morris
f5786afa35
Increase test timeout - fixes #2875 (#2876) 2020-07-03 21:20:01 +02:00
Thad Guidry
49fd21759c
remove English sentence from French translation (#2871) 2020-07-03 16:12:43 +02:00
Tom Morris
139019f6e3
Internationalize clipboard default project name (#2814)
Fixes #2776
2020-07-03 14:22:44 +02:00
chetan
3932b23eb6 Fixed the guessing of JSON for .txt (2820) 2020-07-03 10:46:07 +05:30
Tom Morris
d3db73aa67 Remove shortest-column-name ordering
Refs #2863
The tree importer sorts columns/column groups by how populated
they are, which is of arguable utility, but the tie-breaker
of ordering by shortest column name is completely silly.

This change removes that and, in conjunction with a stable sort
algorithm, will preserve the original order of the columns.
2020-07-02 16:12:55 -04:00
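To illustrate the point made in d3db73aa67, the sketch below sorts columns by how populated they are and relies on a stable sort to keep ties in their original order. It is not the tree importer's code: the `Column` class, its `nonBlankCount` field, and the `orderedNames` method are hypothetical names introduced for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class ColumnOrderSketch {
    /** Hypothetical stand-in for an imported column: a name plus a count of non-blank cells. */
    static final class Column {
        final String name;
        final int nonBlankCount;

        Column(String name, int nonBlankCount) {
            this.name = name;
            this.nonBlankCount = nonBlankCount;
        }
    }

    /** Returns column names ordered by descending population, with ties left in input order. */
    static List<String> orderedNames(List<Column> columns) {
        List<Column> sorted = new ArrayList<>(columns);
        // List.sort is documented to be stable, so columns with equal counts
        // keep their relative input order; no name-based tie-breaker is needed.
        sorted.sort(Comparator.comparingInt((Column c) -> c.nonBlankCount).reversed());
        List<String> names = new ArrayList<>();
        for (Column c : sorted) {
            names.add(c.name);
        }
        return names;
    }
}
```

With a name-length tie-breaker, the two equally populated columns in a file could swap places depending on their names; dropping it leaves them as they appeared in the source.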
Tom Morris
28a9f68236
Unit test improvements (#2856)
* Fix two deprecated method usages

* Test ToNumber conversions

* Test behavior of all functions when passed 0 or 8 arguments

There are 16 which fail currently on 0 args (return null or
False instead of EvalError), but have been whitelisted until
we can verify whether it's safe to change them without introducing
compatibility issues.

There are 19 which fail to return an error on too many (ie 8) args.
2020-07-02 20:29:21 +02:00
Tom Morris
54291ef441
Use Apache IO Commons IOUtils instead of homerolled (#2845)
Probably should remove the funky Gzip support with the
overloaded use of the encoding parameter, but this is
a start.
2020-06-30 13:49:47 +02:00
Chetan Verma
e2a2dd2a4e
Fix misstatement about supported formats in import project screen (#2841)
Closes #2753.
2020-06-30 08:25:15 +02:00
Tom Morris
0f3a6006f3
Add Excel95 import test and improve other importer tests (#2844)
No issue.
- we don't support Excel95, but make sure that it generates an exception
- move the test data file into the appropriate directory
- for any normal test, consider exceptions a failure
2020-06-30 08:20:56 +02:00
Tom Morris
421974cc3d
Truncate any completely empty columns on the right (#2842)
Fixes #565
The current versions of Open Office create default spreadsheets
with over 1000 empty columns. Keep track of the rightmost
non-empty column when importing and truncate everything else.

Also adds a basic ODS import test.
2020-06-30 08:19:00 +02:00
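The approach described in 421974cc3d (track the rightmost non-empty column while importing, then truncate everything to its right) can be sketched as follows. This is a simplified illustration over in-memory rows of strings, not the ODS importer itself; `TrailingColumnsSketch` and `truncate` are hypothetical names, and treating null or empty strings as "empty" is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

final class TrailingColumnsSketch {
    /** Returns a copy of the rows with completely empty columns on the right removed. */
    static List<List<String>> truncate(List<List<String>> rows) {
        // Index of the rightmost column holding any non-empty cell, or -1 if none.
        int rightmost = -1;
        for (List<String> row : rows) {
            // Scan from the right; only columns beyond the current maximum can raise it.
            for (int col = row.size() - 1; col > rightmost; col--) {
                String cell = row.get(col);
                if (cell != null && !cell.isEmpty()) {
                    rightmost = col;
                    break;
                }
            }
        }
        List<List<String>> result = new ArrayList<>();
        for (List<String> row : rows) {
            int end = Math.min(row.size(), rightmost + 1);
            result.add(new ArrayList<>(row.subList(0, end)));
        }
        return result;
    }
}
```

Empty columns in the middle of the sheet are kept; only the run of empty columns after the last populated one is dropped.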
Tom Morris
83f52d4ba5
Fall back to Apache Jena 3.9.0 (from 3.15.0) (#2826)
Fixes #2824
Versions up through 3.14.0 appear to work, but since odfdom bundles
Jena 3.9.0, we're going to be conservative and match that.

As an added bonus, includes a blank node test which will trigger
the failure.
2020-06-27 23:40:21 +02:00
Antoine Beaubien
043e595ea0
Change pref name for ui.browsing.pageSize (#2817)
Change the preference key name from ui.gridPaginationSize to ui.browsing.pageSize.
2020-06-27 21:58:48 +02:00
Lisa Chandra
7b8f8486f6
Adds a default separator preference for split/join multi valued cells (#2520)
* default value for split/join

* using the new preference interface

* changed preference name to ui.cell.rowSplitDefaultSeparator
2020-06-25 14:35:53 +02:00
Tom Morris
cfa1038066
Remove commons-digester dependency (#2798) 2020-06-25 14:16:25 +02:00
Tom Morris
4b146acc6e
Create Project import improvements (#2806)
* Fix charset encoding & MIME type handling

Character set (ie what we call "encoding") is part of the Content-Type,
*not* the Content-Encoding, which specifies compression (e.g. gzip).

This correctly sets the character set encoding as well as cleaning
the MIME type so that additional parsing doesn't need to be done
downstream (and removes that code).

* Use "text" instead of "text/line-based" as default fallback format

The TextLineBasedGuesser only tries a limited number of
formats (CSV, TSV, fixed), so we can't get out of that hole to
find JSON, XML, etc.

Start with a more general format instead to improve our
guessing odds.

* Support content type Structured Name Syntax Suffixes (+json +xml)

If we can't find a fully specified content type in our lookup,
fall back to just the suffix (which is registered with a leading +)
Fixes #2800 Fixes #2805
2020-06-25 08:36:57 +02:00
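Two ideas from 4b146acc6e lend themselves to a short sketch: the character set is a parameter of the Content-Type header (not Content-Encoding), and a lookup that misses on the full MIME type can fall back to its structured syntax suffix such as +json or +xml. The code below is a hedged illustration, not OpenRefine's implementation: the class `ContentTypeSketch`, its methods, and the format names in the `FORMATS` table are placeholders invented for the example.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class ContentTypeSketch {
    // Hypothetical lookup table; keys with a leading '+' are suffix entries.
    static final Map<String, String> FORMATS = new HashMap<>();
    static {
        FORMATS.put("text/csv", "csv");
        FORMATS.put("application/json", "json");
        FORMATS.put("+json", "json");
        FORMATS.put("+xml", "xml");
    }

    /** Returns the bare MIME type, lower-cased, with any parameters removed. */
    static String mimeType(String contentType) {
        int semi = contentType.indexOf(';');
        String type = semi < 0 ? contentType : contentType.substring(0, semi);
        return type.trim().toLowerCase(Locale.ROOT);
    }

    /** Returns the charset parameter of a Content-Type value, or null if absent. */
    static String charset(String contentType) {
        String[] parts = contentType.split(";");
        for (int i = 1; i < parts.length; i++) {
            String p = parts[i].trim();
            if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                String value = p.substring(8).trim();
                // Strip optional surrounding double quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? null : value;
            }
        }
        return null;
    }

    /** Looks up the full MIME type first, then falls back to its "+suffix" entry. */
    static String guessFormat(String contentType) {
        String mime = mimeType(contentType);
        String format = FORMATS.get(mime);
        if (format == null) {
            int plus = mime.lastIndexOf('+');
            if (plus >= 0) {
                format = FORMATS.get(mime.substring(plus));
            }
        }
        return format;
    }
}
```

For a header value such as `application/ld+json; charset=UTF-8`, the full type is not in the table, so the lookup falls back to the `+json` entry, while the charset is read from the same header.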
Tom Morris
f9eb819b01
Merge pull request #2737 from OpenRefine/dependabot/maven/org.slf4j-slf4j-log4j12-1.7.30
Bump slf4j-log4j12 from 1.7.18 to 1.7.30
2020-06-24 16:00:22 -04:00
Hosted Weblate
07fbc70ada
Merge branch 'origin/master' into Weblate. 2020-06-24 21:41:53 +02:00
Isao Matsunami
1b30d61b2f
Translated using Weblate (Japanese)
Currently translated at 100.0% (753 of 753 strings)

Translation: OpenRefine/Translations
Translate-URL: https://hosted.weblate.org/projects/openrefine/translations/ja/
2020-06-24 21:41:46 +02:00
Adolfo Jayme Barrientos
cf388fc5f4
Translated using Weblate (Spanish)
Currently translated at 99.4% (749 of 753 strings)

Translation: OpenRefine/Translations
Translate-URL: https://hosted.weblate.org/projects/openrefine/translations/es/
2020-06-24 21:41:46 +02:00
Tom Morris
76d30ee1f0
Merge pull request #2794 from OpenRefine/dependabot/maven/org.codehaus.mojo-build-helper-maven-plugin-3.2.0
Bump build-helper-maven-plugin from 3.1.0 to 3.2.0
2020-06-23 17:08:40 -04:00
Tom Morris
1849e62234
Better error handling for reconciliation process - fixes #2590 (#2671)
* Harden reconciliation - Fixes #2590

- check for non-JSON / unparseable JSON returns
- handle malformed results response with no name for candidates
- catch any Exception, not just IOExceptions
- call processManager.onFailedProcess() for cleanup on error

* Add default constructor for Jackson

Jackson complains about needing a default constructor for the
NON_DEFAULT annotation, but I'm not sure why this worked before.

* Clean up indentation and unused variable - no functional changes

Make indentation consistent throughout the module, changing recently
added lines to use the standard all spaces convention.

Remove unused count variable

* Simplify control flow

* Update limit parameter comment. No functional change.

* Replace ternary expression which is causing NPE

* Add reconciliation tests using mock HTTP server
2020-06-23 21:54:54 +02:00
dependabot-preview[bot]
408b782117
Bump build-helper-maven-plugin from 3.1.0 to 3.2.0
Bumps [build-helper-maven-plugin](https://github.com/mojohaus/build-helper-maven-plugin) from 3.1.0 to 3.2.0.
- [Release notes](https://github.com/mojohaus/build-helper-maven-plugin/releases)
- [Commits](https://github.com/mojohaus/build-helper-maven-plugin/compare/build-helper-maven-plugin-3.1.0...build-helper-maven-plugin-3.2.0)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-06-23 08:45:07 +00:00
Tom Morris
0bfa3dd68a
Merge pull request #2745 from OpenRefine/dependabot/maven/org.codehaus.mojo-build-helper-maven-plugin-3.1.0
Bump build-helper-maven-plugin from 1.8 to 3.1.0
2020-06-22 13:06:02 -04:00
Tom Morris
bf57667a47
Merge pull request #2789 from OpenRefine/dependabot/maven/org.apache.maven.plugins-maven-resources-plugin-3.1.0
Bump maven-resources-plugin from 2.6 to 3.1.0
2020-06-22 12:48:59 -04:00
Hosted Weblate
1c63f6bf96
Merge branch 'origin/master' into Weblate. 2020-06-22 12:03:51 +02:00
Rafael Fontenelle
0da1b34095
Translated using Weblate (Portuguese (Brazil))
Currently translated at 100.0% (753 of 753 strings)

Translation: OpenRefine/Translations
Translate-URL: https://hosted.weblate.org/projects/openrefine/translations/pt_BR/
2020-06-22 12:03:51 +02:00
Adolfo Jayme Barrientos
e35358709a
Translated using Weblate (Spanish)
Currently translated at 99.4% (749 of 753 strings)

Translation: OpenRefine/Translations
Translate-URL: https://hosted.weblate.org/projects/openrefine/translations/es/
2020-06-22 12:03:51 +02:00
dependabot-preview[bot]
a8ae5d37ed
Bump maven-resources-plugin from 2.6 to 3.1.0
Bumps [maven-resources-plugin](https://github.com/apache/maven-resources-plugin) from 2.6 to 3.1.0.
- [Release notes](https://github.com/apache/maven-resources-plugin/releases)
- [Commits](https://github.com/apache/maven-resources-plugin/compare/maven-resources-plugin-2.6...maven-resources-plugin-3.1.0)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-06-22 08:45:55 +00:00
Tom Morris
e293602897
Restore character encoding guesser (#2755)
* Fixes #486. Builds on code from Steffen Stundzig

- Switch from ICU4J to juniversalchardet
  (Java port of Mozilla charset detector)
- Replace org.json code with Jackson
- Add tests
- Add TODO for multi-file character encoding mismatches

* Restore dependency lost in rebase

Co-authored-by: Steffen Stundzig <git@stundzig.de>
2020-06-22 06:04:51 +02:00
Hosted Weblate
ba4b70db4e
Merge branch 'origin/master' into Weblate. 2020-06-21 21:41:51 +02:00
Rafael Fontenelle
cc787b4257
Translated using Weblate (Portuguese (Brazil))
Currently translated at 100.0% (752 of 752 strings)

Translation: OpenRefine/Translations
Translate-URL: https://hosted.weblate.org/projects/openrefine/translations/pt_BR/
2020-06-21 21:41:47 +02:00
Isao Matsunami
0383a7385e
Translated using Weblate (Japanese)
Currently translated at 100.0% (752 of 752 strings)

Translation: OpenRefine/Translations
Translate-URL: https://hosted.weblate.org/projects/openrefine/translations/ja/
2020-06-21 21:41:46 +02:00
Adolfo Jayme Barrientos
b1625c714d
Translated using Weblate (Spanish)
Currently translated at 99.4% (748 of 752 strings)

Translation: OpenRefine/Translations
Translate-URL: https://hosted.weblate.org/projects/openrefine/translations/es/
2020-06-21 21:41:45 +02:00
Tom Morris
2977ffa167
Merge pull request #2778 from OpenRefine/2777-test-dependencies
Restrict copied jars to runtime dependencies
2020-06-21 15:09:28 -04:00
Antonin Delpeuch
e92200a35f Move jaxb-api dependency out of the test section 2020-06-21 20:56:33 +02:00
Tom Morris
5d6d0ad6ba
Add missing wiring for i18n plurals. (#2774)
* Add missing wiring for i18n plurals parser

* Fix goto page plural for French
2020-06-21 15:57:17 +02:00
Antonin Delpeuch
62cb20a201 Restrict copied jars to runtime dependencies. Fixes #2777 2020-06-21 15:36:17 +02:00
Antoine Beaubien
0cf7880391
(I #2765) Fix i18n not working in the Edit Pref Window (#2766)
* Fix i18n not working in the Edit Pref Window

Fix i18n not working in the Edit Pref Window, added an error message on failure.

* réussie -> réussi.
2020-06-21 08:26:00 +02:00
Hosted Weblate
eb752d7d5a
Merge branch 'origin/master' into Weblate. 2020-06-20 12:41:48 +02:00
Rafael Fontenelle
7d12275881
Translated using Weblate (Portuguese (Brazil))
Currently translated at 100.0% (743 of 743 strings)

Translation: OpenRefine/Translations
Translate-URL: https://hosted.weblate.org/projects/openrefine/translations/pt_BR/
2020-06-20 12:41:45 +02:00
Isao Matsunami
842de012df
Translated using Weblate (Japanese)
Currently translated at 100.0% (743 of 743 strings)

Translation: OpenRefine/Translations
Translate-URL: https://hosted.weblate.org/projects/openrefine/translations/ja/
2020-06-20 12:41:45 +02:00
Adolfo Jayme Barrientos
92ce62b0da
Translated using Weblate (Spanish)
Currently translated at 100.0% (743 of 743 strings)

Translation: OpenRefine/Translations
Translate-URL: https://hosted.weblate.org/projects/openrefine/translations/es/
2020-06-20 12:41:44 +02:00
dependabot-preview[bot]
479bc63eaf
Bump httpcore from 4.4.9 to 4.4.13
Bumps httpcore from 4.4.9 to 4.4.13.

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-06-19 08:20:00 +00:00
Tom Morris
f88c0e3657
Preserve international characters in project/file names on import/export (#2720)
* Preserve international characters on import/export

Fixes #1352. Preserve non-ASCII characters in project names on
project creation and filenames on export.

Uses existing filename cleaner with the addition of a few
more characters from StackOverflow, plus "#" which messes
up the download URL. Also URIencode download URL.

Removes unused I18N-incompatible cleaning function from
Wikidata extension rather than fixing it.

* Use common name cleaner function

Also preview cleaned table name instead of raw name, so user can see it.
Also add a TODO for better preview of column names
2020-06-18 22:06:46 +02:00
Antoine Beaubien
7793ffbbe9
(I #2624) Add preferences for the grid's page size (#2626)
* Add preferences for the row display quantity

Be able to control the choices for the quantity of rows displayed.

* Added _checkPaginationSize(gridPageSize, defaultGridPageSize)

Added DataTableView._checkPaginationSize(gridPageSize, defaultGridPageSize), gridPageSize = smallest size.

* Update data-table-view.js

Fix missing semicolon.

* Fix typeof gridPageSize != "object" not working for null

Fix typeof gridPageSize != "object" not working for null

* Update data-table-view.js

* Fix tableHeader instead of headerTable

Fix tableHeader instead of headerTable
2020-06-18 10:35:37 +02:00
Tom Morris
77b858db18
Fix race in Process Manager (#2748)
* Remove redundant JSON diff logging

* Fix race in process manager test causing intermittent failure
2020-06-17 21:24:25 +02:00