5259 Commits

Author SHA1 Message Date
Tom Morris
4b146acc6e
Create Project import improvements (#2806)
* Fix charset encoding & MIME type handling

Character set (i.e. what we call "encoding") is part of the Content-Type,
*not* the Content-Encoding, which specifies compression (e.g. gzip).

This correctly sets the character set encoding as well as cleaning
the MIME type so that additional parsing doesn't need to be done
downstream (and removes that code).

* Use "text" instead of "text/line-based" as default fallback format

The TextLineBasedGuesser only tries a limited number of
formats (CSV, TSV, fixed), so we can't get out of that hole to
find JSON, XML, etc.

Start with a more general format instead to improve our
guessing odds.

* Support content type Structured Name Syntax Suffixes (+json +xml)

If we can't find a fully specified content type in our lookup,
fall back to just the suffix (which is registered with a leading +)
Fixes #2800 Fixes #2805
2020-06-25 08:36:57 +02:00
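
The Content-Type handling described in the commit above (4b146acc6e) can be sketched roughly as follows. This is a minimal illustration, not OpenRefine's actual code: the class `ContentTypeSketch`, its methods, and the `FORMATS` table with its format names are all hypothetical. Only the fallback format `text` and the `+json` / `+xml` suffix idea come from the commit message. Parameters are split naively on `;`, so quoted values containing semicolons are not handled.

```java
import java.util.Locale;
import java.util.Map;

/** Sketch only: all names here are hypothetical, not OpenRefine's API. */
final class ContentTypeSketch {

    /** Illustrative lookup from MIME type (or "+suffix") to an importer format name. */
    static final Map<String, String> FORMATS = Map.of(
            "application/json", "text/json",
            "application/xml", "text/xml",
            "+json", "text/json",
            "+xml", "text/xml");

    /** Returns the bare, lower-cased MIME type with any parameters removed. */
    static String mimeType(String contentType) {
        int semicolon = contentType.indexOf(';');
        String bare = semicolon < 0 ? contentType : contentType.substring(0, semicolon);
        return bare.trim().toLowerCase(Locale.ROOT);
    }

    /** Returns the charset parameter of a Content-Type value, or null if absent. */
    static String charset(String contentType) {
        String[] parts = contentType.split(";");
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = param.substring("charset=".length()).trim();
                // Strip optional surrounding quotes, as in charset="UTF-8".
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? null : value;
            }
        }
        return null;
    }

    /** Looks up the full type first, then only the structured syntax suffix, then "text". */
    static String guessFormat(String contentType) {
        String mime = mimeType(contentType);
        String format = FORMATS.get(mime);
        if (format == null) {
            int plus = mime.lastIndexOf('+');
            if (plus >= 0) {
                // The suffix is registered with its leading '+', e.g. "+json".
                format = FORMATS.get(mime.substring(plus));
            }
        }
        return format != null ? format : "text";
    }

    public static void main(String[] args) {
        String header = "application/ld+json; charset=UTF-8";
        System.out.println(mimeType(header) + " / " + charset(header) + " / " + guessFormat(header));
    }
}
```

Note that `Content-Encoding` never enters the picture here: it describes compression such as gzip, while the character set travels as a parameter of `Content-Type`.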
Tom Morris
3aa610d6aa
Improve Google Sheets upload (#2784)
* Support more than 26 columns

Google Sheets default to just 26 columns (A-Z) and we need to
explicitly add more columns if we need them.

Fixes #2760

* Improve Google Sheets upload

- upload in chunks instead of serializing the entire document at once
- Free up resources as we go
- stop if an error occurs
- reduce batch size to try and stay in 10MB request size limit
  (but need a more dynamic way to do this probably for very wide
   sheets or sheets with large values)

* Add basic test and do some cleanup

- add test for columns > 26
- refactor to allow testing and not depend on unnecessary fields
- add i18n TODO for translating spreadsheet description

* Preserve cell data types

Fixes #2785
- integers and floats are sent as Doubles
- bools as Boolean
- DateTimes as Strings
- nulls as the empty string
- anything else as Strings using .toString()

* Fix LGTM-flagged potentially null pointer dereference
2020-06-25 08:18:28 +02:00
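
The cell type mapping and chunked upload listed in the commit above (3aa610d6aa) might look something like this in outline. This is a sketch under assumptions: `SheetUploadSketch`, `BatchUploader`, and `uploadInBatches` are hypothetical names, `OffsetDateTime` is assumed as the date-time cell type, and no Google Sheets API call is shown. The conversion rules themselves follow the list in the commit message.

```java
import java.time.OffsetDateTime;
import java.util.ArrayList;
import java.util.List;

/** Sketch only: hypothetical names, no real Google Sheets client involved. */
final class SheetUploadSketch {

    /** Stand-in for whatever actually sends one batch of rows to the spreadsheet. */
    interface BatchUploader {
        void upload(List<List<Object>> batch) throws Exception;
    }

    /** Maps a cell value to the value sent to the sheet, per the commit message. */
    static Object toSheetValue(Object cell) {
        if (cell == null) {
            return "";                                // nulls as the empty string
        }
        if (cell instanceof Number) {
            return ((Number) cell).doubleValue();     // integers and floats as Doubles
        }
        if (cell instanceof Boolean) {
            return cell;                              // bools as Boolean
        }
        if (cell instanceof OffsetDateTime) {
            return cell.toString();                   // date-times as Strings (ISO-8601)
        }
        return cell.toString();                       // anything else via toString()
    }

    /**
     * Converts and uploads rows in batches of at most batchSize rows,
     * stopping at the first failed batch. Returns the number of rows uploaded.
     */
    static int uploadInBatches(List<List<Object>> rows, int batchSize, BatchUploader uploader) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int uploaded = 0;
        for (int start = 0; start < rows.size(); start += batchSize) {
            int end = Math.min(start + batchSize, rows.size());
            // Only one batch is materialized at a time, so it can be freed after sending.
            List<List<Object>> batch = new ArrayList<>();
            for (List<Object> row : rows.subList(start, end)) {
                List<Object> converted = new ArrayList<>();
                for (Object cell : row) {
                    converted.add(toSheetValue(cell));
                }
                batch.add(converted);
            }
            try {
                uploader.upload(batch);
            } catch (Exception e) {
                break;                                // stop if an error occurs
            }
            uploaded += batch.size();
        }
        return uploaded;
    }

    public static void main(String[] args) {
        List<List<Object>> rows = new ArrayList<>();
        rows.add(java.util.Arrays.asList((Object) 1, true, null));
        int sent = uploadInBatches(rows, 100, batch -> System.out.println(batch));
        System.out.println("rows uploaded: " + sent);
    }
}
```

A fixed row count per batch is the simplest choice. As the commit message itself notes, very wide sheets or large values would need the batch size to be derived from the payload size instead.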
dependabot-preview[bot]
de309158c9
Bump plexus-archiver from 4.0.0 to 4.2.2 (#2736)
* Bump plexus-archiver from 4.0.0 to 4.2.2

Bumps [plexus-archiver](https://github.com/codehaus-plexus/plexus-archiver) from 4.0.0 to 4.2.2.
- [Release notes](https://github.com/codehaus-plexus/plexus-archiver/releases)
- [Changelog](https://github.com/codehaus-plexus/plexus-archiver/blob/master/ReleaseNotes.md)
- [Commits](https://github.com/codehaus-plexus/plexus-archiver/compare/plexus-archiver-4.0.0...plexus-archiver-4.2.2)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>

* Add comment to explain dependency override

Co-authored-by: dependabot-preview[bot] <27856297+dependabot-preview[bot]@users.noreply.github.com>
Co-authored-by: Antonin Delpeuch <antonin@delpeuch.eu>
2020-06-25 08:03:48 +02:00
Tom Morris
7f435bd3df
Remove obsolete Google API key reference (#2809)
This key was used for the Freebase APIs and is no longer
referenced anywhere.
2020-06-25 07:57:04 +02:00
Tom Morris
a24f2f3feb
Merge pull request #2802 from OpenRefine/dependabot/maven/com.google.apis-google-api-services-drive-v3-rev20200609-1.30.9
Bump google-api-services-drive from v3-rev20200413-1.30.9 to v3-rev20200609-1.30.9
2020-06-24 23:37:46 -04:00
dependabot-preview[bot]
3c4712fb43
Bump google-api-services-drive
Bumps google-api-services-drive from v3-rev20200413-1.30.9 to v3-rev20200609-1.30.9.

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-06-25 03:08:30 +00:00
Tom Morris
f9eb819b01
Merge pull request #2737 from OpenRefine/dependabot/maven/org.slf4j-slf4j-log4j12-1.7.30
Bump slf4j-log4j12 from 1.7.18 to 1.7.30
2020-06-24 16:00:22 -04:00
Antonin Delpeuch
d5dc123bb1
Merge pull request #2808 from weblate/weblate-openrefine-translations
Translations update from Weblate
2020-06-24 21:56:55 +02:00
Hosted Weblate
07fbc70ada
Merge branch 'origin/master' into Weblate.
2020-06-24 21:41:53 +02:00
Adolfo Jayme Barrientos
b581721ede
Translated using Weblate (Spanish)
Currently translated at 94.3% (182 of 193 strings)

Translation: OpenRefine/wikidata
Translate-URL: https://hosted.weblate.org/projects/openrefine/wikidata/es/
2020-06-24 21:41:50 +02:00
Isao Matsunami
1b30d61b2f
Translated using Weblate (Japanese)
Currently translated at 100.0% (753 of 753 strings)

Translation: OpenRefine/Translations
Translate-URL: https://hosted.weblate.org/projects/openrefine/translations/ja/
2020-06-24 21:41:46 +02:00
Adolfo Jayme Barrientos
cf388fc5f4
Translated using Weblate (Spanish)
Currently translated at 99.4% (749 of 753 strings)

Translation: OpenRefine/Translations
Translate-URL: https://hosted.weblate.org/projects/openrefine/translations/es/
2020-06-24 21:41:46 +02:00
dependabot-preview[bot]
55171a85eb
Bump mariadb-java-client from 2.6.0 to 2.6.1 (#2801)
Bumps [mariadb-java-client](https://github.com/mariadb-corporation/mariadb-connector-j) from 2.6.0 to 2.6.1.
- [Release notes](https://github.com/mariadb-corporation/mariadb-connector-j/releases)
- [Changelog](https://github.com/mariadb-corporation/mariadb-connector-j/blob/master/CHANGELOG.md)
- [Commits](https://github.com/mariadb-corporation/mariadb-connector-j/compare/2.6.0...2.6.1)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>

Co-authored-by: dependabot-preview[bot] <27856297+dependabot-preview[bot]@users.noreply.github.com>
2020-06-24 11:59:48 +02:00
dependabot-preview[bot]
d297026f29
Bump maven-antrun-plugin from 1.4 to 3.0.0 (#2795)
* Bump maven-antrun-plugin from 1.4 to 3.0.0

Bumps [maven-antrun-plugin](https://github.com/apache/maven-antrun-plugin) from 1.4 to 3.0.0.
- [Release notes](https://github.com/apache/maven-antrun-plugin/releases)
- [Commits](https://github.com/apache/maven-antrun-plugin/compare/maven-antrun-plugin-1.4...maven-antrun-plugin-3.0.0)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>

* Replace <tasks> with <target> for maven-antrun-plugin 3.0.0

Co-authored-by: dependabot-preview[bot] <27856297+dependabot-preview[bot]@users.noreply.github.com>
Co-authored-by: Tom Morris <tfmorris@gmail.com>
2020-06-24 09:01:24 +02:00
Tom Morris
76d30ee1f0
Merge pull request #2794 from OpenRefine/dependabot/maven/org.codehaus.mojo-build-helper-maven-plugin-3.2.0
Bump build-helper-maven-plugin from 3.1.0 to 3.2.0
2020-06-23 17:08:40 -04:00
Tom Morris
d97d6c66b8
Update Google API dependencies for GData extension (#2754)
* Update Google API dependencies for Sheets & Drive

Remove unnecessary direct dependencies which are transitive
dependencies of those.

* Fix use of deprecated class
2020-06-23 21:55:46 +02:00
Tom Morris
1849e62234
Better error handling for reconciliation process - fixes #2590 (#2671)
* Harden reconciliation - Fixes #2590

- check for non-JSON / unparseable JSON returns
- handle malformed results response with no name for candidates
- catch any Exception, not just IOExceptions
- call processManager.onFailedProcess() for cleanup on error

* Add default constructor for Jackson

Jackson complains about needing a default constructor for the
NON_DEFAULT annotation, but I'm not sure why this worked before.

* Clean up indentation and unused variable - no functional changes

Make indentation consistent throughout the module, changing recently
added lines to use the standard all spaces convention.

Remove unused count variable

* Simplify control flow

* Update limit parameter comment. No functional change.

* Replace ternary expression which is causing NPE

* Add reconciliation tests using mock HTTP server
2020-06-23 21:54:54 +02:00
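
The hardening steps listed in the commit above (1849e62234) follow a common defensive pattern, sketched below. Everything here is an assumption made for illustration: `ReconJobSketch`, `FailureListener`, and the helper methods are hypothetical, the `FailureListener` interface merely stands in for the `processManager.onFailedProcess()` call named in the commit, and a cheap shape check replaces real JSON parsing (the actual code uses Jackson).

```java
import java.util.Map;
import java.util.concurrent.Callable;

/** Sketch only: hypothetical names standing in for the real reconciliation code. */
final class ReconJobSketch {

    /** Stand-in for the cleanup hook the commit calls on error. */
    interface FailureListener {
        void onFailedProcess(Exception cause);
    }

    /** Cheap pre-check for responses that are obviously not a JSON object (e.g. an HTML error page). */
    static boolean looksLikeJsonObject(String body) {
        if (body == null) {
            return false;
        }
        String trimmed = body.trim();
        return trimmed.startsWith("{") && trimmed.endsWith("}");
    }

    /** Returns the candidate's name, tolerating malformed results that omit it. */
    static String candidateName(Map<String, Object> candidate) {
        Object name = candidate.get("name");
        if (name == null) {
            return "";
        }
        return name.toString();
    }

    /** Runs a job, catching any Exception (not just IOException) and reporting it for cleanup. */
    static <T> T runGuarded(Callable<T> job, FailureListener listener) {
        try {
            return job.call();
        } catch (Exception e) {
            listener.onFailedProcess(e);
            return null;
        }
    }

    public static void main(String[] args) {
        String result = runGuarded(() -> {
            String body = "<html>502 Bad Gateway</html>";
            if (!looksLikeJsonObject(body)) {
                throw new IllegalStateException("Service returned non-JSON response");
            }
            return body;
        }, cause -> System.out.println("failed: " + cause.getMessage()));
        System.out.println("result: " + result);
    }
}
```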
dependabot-preview[bot]
408b782117
Bump build-helper-maven-plugin from 3.1.0 to 3.2.0
Bumps [build-helper-maven-plugin](https://github.com/mojohaus/build-helper-maven-plugin) from 3.1.0 to 3.2.0.
- [Release notes](https://github.com/mojohaus/build-helper-maven-plugin/releases)
- [Commits](https://github.com/mojohaus/build-helper-maven-plugin/compare/build-helper-maven-plugin-3.1.0...build-helper-maven-plugin-3.2.0)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-06-23 08:45:07 +00:00
Tom Morris
6e66cb5144
Merge pull request #2734 from OpenRefine/dependabot/maven/org.codehaus.mojo-exec-maven-plugin-3.0.0
Bump exec-maven-plugin from 1.3 to 3.0.0
2020-06-22 13:08:11 -04:00
Tom Morris
0bfa3dd68a
Merge pull request #2745 from OpenRefine/dependabot/maven/org.codehaus.mojo-build-helper-maven-plugin-3.1.0
Bump build-helper-maven-plugin from 1.8 to 3.1.0
2020-06-22 13:06:02 -04:00
Tom Morris
6b00c7b602
Merge pull request #2782 from tfmorris/2306-gdata-empty-cells
Fix Google Sheets export with empty cells
2020-06-22 13:01:23 -04:00
Tom Morris
bf57667a47
Merge pull request #2789 from OpenRefine/dependabot/maven/org.apache.maven.plugins-maven-resources-plugin-3.1.0
Bump maven-resources-plugin from 2.6 to 3.1.0
2020-06-22 12:48:59 -04:00
Tom Morris
445395d18d
Merge pull request #2788 from OpenRefine/dependabot/maven/org.xerial-sqlite-jdbc-3.32.3
Bump sqlite-jdbc from 3.31.1 to 3.32.3
2020-06-22 12:36:31 -04:00
Tom Morris
5063466f16
Merge pull request #2781 from tfmorris/2780-missing-surefireArgs
Fix reference to undefined surfireArgs param
2020-06-22 11:24:01 -04:00
Antonin Delpeuch
17df444630
Merge pull request #2791 from weblate/weblate-openrefine-translations
Translations update from Weblate
2020-06-22 12:23:59 +02:00
Hosted Weblate
9524543eb2
Merge branch 'origin/master' into Weblate.
2020-06-22 12:22:58 +02:00
Adolfo Jayme Barrientos
c43214203b
Translated using Weblate (Spanish)
Currently translated at 100.0% (47 of 47 strings)

Translation: OpenRefine/gdata
Translate-URL: https://hosted.weblate.org/projects/openrefine/gdata/es/
2020-06-22 12:22:57 +02:00
Adolfo Jayme Barrientos
b94be34b73
Added translation using Weblate (Spanish)
2020-06-22 12:22:54 +02:00
Antonin Delpeuch
d30b21cf0f
Merge pull request #2790 from weblate/weblate-openrefine-translations
Translations update from Weblate
2020-06-22 12:22:23 +02:00
Hosted Weblate
1c63f6bf96
Merge branch 'origin/master' into Weblate.
2020-06-22 12:03:51 +02:00
Rafael Fontenelle
0da1b34095
Translated using Weblate (Portuguese (Brazil))
Currently translated at 100.0% (753 of 753 strings)

Translation: OpenRefine/Translations
Translate-URL: https://hosted.weblate.org/projects/openrefine/translations/pt_BR/
2020-06-22 12:03:51 +02:00
Adolfo Jayme Barrientos
e35358709a
Translated using Weblate (Spanish)
Currently translated at 99.4% (749 of 753 strings)

Translation: OpenRefine/Translations
Translate-URL: https://hosted.weblate.org/projects/openrefine/translations/es/
2020-06-22 12:03:51 +02:00
Adolfo Jayme Barrientos
644e3b6499
Added translation using Weblate (Spanish)
2020-06-22 12:03:46 +02:00
dependabot-preview[bot]
a8ae5d37ed
Bump maven-resources-plugin from 2.6 to 3.1.0
Bumps [maven-resources-plugin](https://github.com/apache/maven-resources-plugin) from 2.6 to 3.1.0.
- [Release notes](https://github.com/apache/maven-resources-plugin/releases)
- [Commits](https://github.com/apache/maven-resources-plugin/compare/maven-resources-plugin-2.6...maven-resources-plugin-3.1.0)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-06-22 08:45:55 +00:00
dependabot-preview[bot]
71028eb7ab
Bump sqlite-jdbc from 3.31.1 to 3.32.3
Bumps [sqlite-jdbc](https://github.com/xerial/sqlite-jdbc) from 3.31.1 to 3.32.3.
- [Release notes](https://github.com/xerial/sqlite-jdbc/releases)
- [Changelog](https://github.com/xerial/sqlite-jdbc/blob/master/CHANGELOG)
- [Commits](https://github.com/xerial/sqlite-jdbc/compare/3.31.1...3.32.3)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-06-22 08:45:54 +00:00
Antonin Delpeuch
3c355b2a5f
Fix evaluation stage for surefireArgs
2020-06-22 06:28:13 +02:00
Tom Morris
85d4de8e2c
Fix reference to undefined surfireArgs param
Restore parameter with an empty value
2020-06-22 00:21:07 -04:00
Tom Morris
e293602897
Restore character encoding guesser (#2755)
* Fixes #486. Builds on code from Steffen Stundzig

- Switch from ICU4J to juniversalchardet
  (Java port of Mozilla charset detector)
- Replace org.json code with Jackson
- Add tests
- Add TODO for multi-file character encoding mismatches

* Restore dependency lost in rebase

Co-authored-by: Steffen Stundzig <git@stundzig.de>
2020-06-22 06:04:51 +02:00
Tom Morris
9b8e750550
Don't skip empty cells on export
Make sure we output at least an empty string as a placeholder.
Fixes #2306
2020-06-21 23:16:46 -04:00
Tom Morris
7a1451f561
Report errors to user
No errors were being reported before.
Also add TODO for progress indicator on long uploads
2020-06-21 23:09:47 -04:00
Tom Morris
60ec57aff4
Merge pull request #2779 from weblate/weblate-openrefine-translations
Translations update from Weblate
2020-06-21 16:19:20 -04:00
Hosted Weblate
ba4b70db4e
Merge branch 'origin/master' into Weblate.
2020-06-21 21:41:51 +02:00
Isao Matsunami
8451e97fc8
Translated using Weblate (Japanese)
Currently translated at 100.0% (193 of 193 strings)

Translation: OpenRefine/wikidata
Translate-URL: https://hosted.weblate.org/projects/openrefine/wikidata/ja/
2020-06-21 21:41:48 +02:00
Rafael Fontenelle
cc787b4257
Translated using Weblate (Portuguese (Brazil))
Currently translated at 100.0% (752 of 752 strings)

Translation: OpenRefine/Translations
Translate-URL: https://hosted.weblate.org/projects/openrefine/translations/pt_BR/
2020-06-21 21:41:47 +02:00
Isao Matsunami
0383a7385e
Translated using Weblate (Japanese)
Currently translated at 100.0% (752 of 752 strings)

Translation: OpenRefine/Translations
Translate-URL: https://hosted.weblate.org/projects/openrefine/translations/ja/
2020-06-21 21:41:46 +02:00
Adolfo Jayme Barrientos
b1625c714d
Translated using Weblate (Spanish)
Currently translated at 99.4% (748 of 752 strings)

Translation: OpenRefine/Translations
Translate-URL: https://hosted.weblate.org/projects/openrefine/translations/es/
2020-06-21 21:41:45 +02:00
Tom Morris
2977ffa167
Merge pull request #2778 from OpenRefine/2777-test-dependencies
Restrict copied jars to runtime dependencies
2020-06-21 15:09:28 -04:00
Antonin Delpeuch
e92200a35f
Move jaxb-api dependency out of the test section
2020-06-21 20:56:33 +02:00
Tom Morris
5d6d0ad6ba
Add missing wiring for i18n plurals. (#2774)
* Add missing wiring for i18n plurals parser

* Fix goto page plural for French
2020-06-21 15:57:17 +02:00
Antonin Delpeuch
62cb20a201
Restrict copied jars to runtime dependencies. Fixes #2777
2020-06-21 15:36:17 +02:00