Commit Graph

5288 Commits

Author SHA1 Message Date
Tom Morris
d3db73aa67 Remove shortest-column-name ordering
Refs #2863
The tree importer sorts columns/column groups by how populated
they are, which is of arguable utility, but the tie-breaker
of ordering by shortest column name is completely silly.

This change removes that and, in conjunction with a stable sort
algorithm, will preserve the original order of the columns.
2020-07-02 16:12:55 -04:00
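
The ordering this commit describes can be sketched as follows. This is a minimal illustration, not the tree importer's actual code: the `Column` class, its `nonBlankCount` field, and `orderByPopulation` are hypothetical names. It relies only on the documented guarantee that Java's `List.sort` is stable, so columns with equal population keep their original relative order once the name-length tie-breaker is gone.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class ColumnOrderSketch {
    // Hypothetical stand-in for an imported column: a name plus the number of non-blank cells.
    static final class Column {
        final String name;
        final int nonBlankCount;

        Column(String name, int nonBlankCount) {
            this.name = name;
            this.nonBlankCount = nonBlankCount;
        }
    }

    // Sorts most-populated first. There is no secondary key: because List.sort is stable,
    // columns with equal counts stay in the order they had in the input.
    static List<String> orderByPopulation(List<Column> columns) {
        List<Column> sorted = new ArrayList<>(columns);
        sorted.sort(Comparator.comparingInt((Column c) -> c.nonBlankCount).reversed());
        List<String> names = new ArrayList<>();
        for (Column c : sorted) {
            names.add(c.name);
        }
        return names;
    }

    public static void main(String[] args) {
        List<Column> columns = Arrays.asList(
                new Column("postal_code", 3),
                new Column("id", 5),
                new Column("city", 3));
        System.out.println(orderByPopulation(columns));
    }
}
```

In this example `postal_code` stays ahead of `city` even though `city` has the shorter name, because the two are equally populated and `postal_code` came first in the input.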
Tom Morris
28a9f68236
Unit test improvements (#2856)
* Fix two deprecated method usages

* Test ToNumber conversions

* Test behavior of all functions when passed 0 or 8 arguments

There are 16 which fail currently on 0 args (return null or
False instead of EvalError), but have been whitelisted until
we can verify whether it's safe to change them without introducing
compatibility issues.

There are 19 which fail to return an error on too many (ie 8) args.
2020-07-02 20:29:21 +02:00
Ekta Mishra
cd0ed11dad
Implemented Format Scrutinizer tests using Mockito (#2849)
* Implemented Format Scrutinizer tests using Mockito

Updated implementation of the scrutinizer & tests

* Testcases updated in FormatScrutinizerTest
2020-07-02 16:28:56 +02:00
Ekta Mishra
9dfb9114c4
Implemented QualifierCompatibility Scrutinizer tests using Mockito (#2860)
Updated test cases & added AllowedQualifierConstraint and MandatoryQualifierConstraint classes.
2020-07-02 14:22:50 +02:00
Ekta Mishra
67bc8581ce
Implemented InverseScrutinizer tests using Mocks (#2855)
* Implemented InverseScrutinizer tests using Mocks

updated testcases and added InverseConstraint Class

* Test cases updated & working fine
2020-07-01 20:49:15 +02:00
Tom Morris
2d1d740b44
Merge pull request #2853 from OpenRefine/dependabot/maven/com.google.http-client-google-http-client-jackson2-1.36.0
Bump google-http-client-jackson2 from 1.35.0 to 1.36.0
2020-07-01 08:58:19 -04:00
dependabot-preview[bot]
cd0d4bdda9
Bump google-api-services-sheets
Bumps google-api-services-sheets from v4-rev20200508-1.30.9 to v4-rev20200616-1.30.9.

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-07-01 08:26:36 +00:00
dependabot-preview[bot]
b9dedc4438
Bump google-http-client-jackson2 from 1.35.0 to 1.36.0
Bumps [google-http-client-jackson2](https://github.com/googleapis/google-http-java-client) from 1.35.0 to 1.36.0.
- [Release notes](https://github.com/googleapis/google-http-java-client/releases)
- [Changelog](https://github.com/googleapis/google-http-java-client/blob/master/CHANGELOG.md)
- [Commits](https://github.com/googleapis/google-http-java-client/compare/v1.35.0...v1.36.0)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-07-01 08:26:18 +00:00
Thad Guidry
b31d2457b0
Add Chan Zuckerberg Initiative to our backers file (#2852)
2020-07-01 08:08:11 +02:00
dependabot[bot]
3ec20eecb6
Bump xstream from 1.4.9 to 1.4.10-java7 in /packaging
Bumps [xstream](https://github.com/x-stream/xstream) from 1.4.9 to 1.4.10-java7.
- [Release notes](https://github.com/x-stream/xstream/releases)
- [Commits](https://github.com/x-stream/xstream/commits)

Signed-off-by: dependabot[bot] <support@github.com>
2020-06-30 23:27:53 +00:00
Ekta Mishra
cef2e84e7f
Implemented EntityTypeScrutinizer tests using mocks (#2839)
Updates all the testcases in EntityTypeScrutinizerTest
2020-06-30 22:59:43 +02:00
Tom Morris
54291ef441
Use Apache IO Commons IOUtils instead of homerolled (#2845)
Probably should remove the funky Gzip support with the
overloaded use of the encoding parameter, but this is
a start.
2020-06-30 13:49:47 +02:00
Chetan Verma
e2a2dd2a4e
Fix misstatement about supported formats in import project screen (#2841)
Closes #2753.
2020-06-30 08:25:15 +02:00
Tom Morris
b64cbfea4f
Fix i18n. Fixes #2805 (#2847)
Fix database extensions exporter which is corrupting the dictionary
name with the value of the language.
2020-06-30 08:22:12 +02:00
Tom Morris
0f3a6006f3
Add Excel95 import test and improve other importer tests (#2844)
No issue.
- we don't support Excel95, but make sure that it generates an exception
- move the test data file into the appropriate directory
- for any normal test, consider exceptions a failure
2020-06-30 08:20:56 +02:00
Tom Morris
421974cc3d
Truncate any completely empty columns on the right (#2842)
Fixes #565
The current versions of Open Office create default spreadsheets
with over 1000 empty columns. Keep track of the rightmost
non-empty column when importing and truncate everything else.

Also adds a basic ODS import test.
2020-06-30 08:19:00 +02:00
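
The truncation described in this commit can be sketched roughly as below. This is an illustration under assumptions, not the ODS importer's code: rows are modeled as plain lists of strings, and `trimEmptyRightColumns` is a hypothetical helper. It tracks the rightmost non-empty column across all rows and drops everything to the right of it, leaving empty columns in the middle of the sheet untouched.

```java
import java.util.ArrayList;
import java.util.List;

class TrailingColumnTrimSketch {
    // Returns a copy of the rows with every column to the right of the
    // rightmost non-empty cell removed. Null and "" both count as empty.
    static List<List<String>> trimEmptyRightColumns(List<List<String>> rows) {
        int rightmost = -1; // index of the rightmost column seen so far with a non-empty cell
        for (List<String> row : rows) {
            // Scan from the right, and only as far as columns beyond the current maximum.
            for (int i = row.size() - 1; i > rightmost; i--) {
                String cell = row.get(i);
                if (cell != null && !cell.isEmpty()) {
                    rightmost = i;
                    break;
                }
            }
        }
        List<List<String>> trimmed = new ArrayList<>();
        for (List<String> row : rows) {
            int end = Math.min(row.size(), rightmost + 1);
            trimmed.add(new ArrayList<>(row.subList(0, end)));
        }
        return trimmed;
    }
}
```

Scanning each row from the right keeps the cost low for sheets that are mostly empty padding, which is the case the commit message describes.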
Ekta Mishra
bc672047f6
Implemented DistinctValueScrutinizer tests using mockito (#2833)
* Implemented DistinctValueScrutinizer tests using Mockito

Added inner class to the scrutinizer and updated the tests using mocks.

* Tests updated; testNoIssue added

* all tests updated & working fine
2020-06-29 16:00:37 +02:00
Ekta Mishra
46c510b5e2
Implemented SingleValue Scrutinizer tests using mocks (#2818)
* Implemented SingleValue Scrutinizer tests using mocks

Updated test class & added inner class to the scrutinizer

* tests updated

* Updated SingleValueConstraint class
2020-06-29 15:59:53 +02:00
Thad Guidry
2a34c8b5e6
begin Docusaurus 2 migration (#2799)
* begin Docusaurus 2 migration

* Need help fixing the broken 'index'
* needs further customizing footer if we want

* fix README.md

* fixed Pages and Sidebar not loading

Yeah!

* Revert "fixed Pages and Sidebar not loading"

This reverts commit b1588387fc89d650b391c5a8883b6100c4714fbd.

* Revert "fix README.md"

This reverts commit a81509c3c62f11370df40096e55dfd544dad2f87.

* Revert "begin Docusaurus 2 migration"

This reverts commit 59d59c355b8d2a1a270a5655922d53a0577d6414.

* clean move the files for Antonin

* fix broken Navbar links

* fix wrong GitHub link pointing to Docusaurus href

* Fix the edit link for GitHub in top right corner

* Copy content from wiki into Technical Reference

* Copy pages from wiki for top level Architecture

* fix sidebar ordering for Tech

* Add colors from our logo into Infima color matrix

* add comment about colors

* shift primary color by 1 shade in matrix
2020-06-29 08:45:24 +02:00
Ekta Mishra
f32f6a6ea2
Change return type of getConstraintsByType method (#2838)
Changed the return type of the getConstraintsByType method from Stream<Statement> to List<Statement>
2020-06-29 08:43:38 +02:00
Tom Morris
bc540a880e
Fix update to deprecated Google Drive credential code (#2828)
No issue. Restore missing piece of commit 42354c0 so that Builder
has the method parameter that it needs.
2020-06-28 23:07:06 +02:00
Ekta Mishra
1b04927d12
Add constraint class (#2822)
* Add constraint class

* Add constraint class

* updated names
2020-06-28 10:20:18 +02:00
Tom Morris
83f52d4ba5
Fall back to Apache Jena 3.9.0 (from 3.15.0) (#2826)
Fixes #2824
Versions up through 3.14.0 appear to work, but since odfdom bundles
Jena 3.9.0, we're going to be conservative and match that.

As an added bonus, includes a blank node test which will trigger
the failure.
2020-06-27 23:40:21 +02:00
Antoine Beaubien
043e595ea0
Change pref name for ui.browsing.pageSize (#2817)
Change the preference key name from ui.gridPaginationSize to ui.browsing.pageSize.
2020-06-27 21:58:48 +02:00
Ekta Mishra
7ac41b4609
Implemented ConflictsWithScrutinizer tests using Mockito (#2804)
updated test class by creating mocks for ConstraintFetcher

Implemented tests for conflicts-with scrutinizer using mocks

Added testcase for no statementList & multiple constraint.

Implemented tests using mock for conflicts-with scrutinizer

Added test case for multiple constraints
2020-06-27 17:17:20 +02:00
Ekta Mishra
8c1d8cdcb7
New implementation for Multivalue Scrutinizer (#2807)
Created inner class for Multivalue & mocks for unit tests

New implementation for multivalue scrutinizer

tests updated
2020-06-26 10:14:34 +02:00
Lisa Chandra
7b8f8486f6
Adds a default separator preference for split/join multi valued cells (#2520)
* default value for split/join

* using the new preference interface

* changed preference name to ui.cell.rowSplitDefaultSeparator
2020-06-25 14:35:53 +02:00
Tom Morris
cfa1038066
Remove commons-digester dependency (#2798)
2020-06-25 14:16:25 +02:00
dependabot-preview[bot]
c09e1d5baa
Bump jackson.version from 2.11.0 to 2.11.1 (#2811)
Bumps `jackson.version` from 2.11.0 to 2.11.1.

Updates `jackson-databind` from 2.11.0 to 2.11.1
- [Release notes](https://github.com/FasterXML/jackson/releases)
- [Commits](https://github.com/FasterXML/jackson/commits)

Updates `jackson-annotations` from 2.11.0 to 2.11.1
- [Release notes](https://github.com/FasterXML/jackson/releases)
- [Commits](https://github.com/FasterXML/jackson/commits)

Updates `jackson-core` from 2.11.0 to 2.11.1
- [Release notes](https://github.com/FasterXML/jackson-core/releases)
- [Commits](https://github.com/FasterXML/jackson-core/compare/jackson-core-2.11.0...jackson-core-2.11.1)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>

Co-authored-by: dependabot-preview[bot] <27856297+dependabot-preview[bot]@users.noreply.github.com>
2020-06-25 10:39:29 +02:00
Tom Morris
4b146acc6e
Create Project import improvements (#2806)
* Fix charset encoding & MIME type handling

Character set (i.e. what we call "encoding") is part of the Content-Type,
*not* the Content-Encoding, which specifies compression (e.g. gzip).

This correctly sets the character set encoding as well as cleaning
the MIME type so that additional parsing doesn't need to be done
downstream (and removes that code).

* Use "text" instead of "text/line-based" as default fallback format

The TextLineBasedGuesser only tries a limited number of
formats (CSV, TSV, fixed), so we can't get out of that hole to
find JSON, XML, etc.

Start with a more general format instead to improve our
guessing odds.

* Support content type Structured Name Syntax Suffixes (+json +xml)

If we can't find a fully specified content type in our lookup,
fall back to just the suffix (which is registered with a leading +).
Fixes #2800 Fixes #2805
2020-06-25 08:36:57 +02:00
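
The content-type handling described in this commit can be sketched as follows. This is a minimal illustration under assumptions, not the project's import code: the `FORMATS` table, its format names, and the three helper methods are hypothetical. It shows the three ideas from the message: the charset is read from a `Content-Type` parameter, the MIME type is cleaned of parameters once up front, and the lookup falls back first to the structured syntax suffix (`+json`, `+xml`) and then to the general `text` format.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class ContentTypeSketch {
    // Hypothetical lookup table; the format names on the right are illustrative only.
    static final Map<String, String> FORMATS = new HashMap<>();
    static {
        FORMATS.put("text/csv", "text/line-based/*sv");
        FORMATS.put("application/json", "text/json");
        FORMATS.put("+json", "text/json"); // suffix entries are keyed with the leading '+'
        FORMATS.put("+xml", "text/xml");
    }

    // "application/ld+json; charset=UTF-8" -> "application/ld+json"
    static String mimeType(String contentType) {
        return contentType.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
    }

    // Returns the charset parameter of a Content-Type value, or null if there is none.
    static String charset(String contentType) {
        String[] parts = contentType.split(";");
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = param.substring("charset=".length()).trim();
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1); // strip optional quotes
                }
                return value.isEmpty() ? null : value;
            }
        }
        return null;
    }

    // Full type first, then the structured syntax suffix, then the general "text" fallback.
    static String guessFormat(String contentType) {
        String mime = mimeType(contentType);
        String format = FORMATS.get(mime);
        if (format == null) {
            int plus = mime.lastIndexOf('+');
            if (plus >= 0) {
                format = FORMATS.get(mime.substring(plus));
            }
        }
        return format != null ? format : "text";
    }
}
```

With this table, `application/ld+json; charset=UTF-8` resolves to the JSON format through its `+json` suffix, while an unrecognized type such as `application/octet-stream` falls through to `text`.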
Tom Morris
3aa610d6aa
Improve Google Sheets upload (#2784)
* Support more than 26 columns

Google Sheets default to just 26 columns (A-Z) and we need to
explicitly add more columns if we need them.

Fixes #2760

* Improve Google Sheets upload

- upload in chunks instead of serializing the entire document at once
- Free up resources as we go
- stop if an error occurs
- reduce batch size to try and stay in 10MB request size limit
  (but need a more dynamic way to do this probably for very wide
   sheets or sheets with large values)

* Add basic test and do some cleanup

- add test for columns > 26
- refactor to allow testing and not depend on unnecessary fields
- add i18n TODO for translating spreadsheet description

* Preserve cell data types

Fixes #2785
- integers and floats are sent as Doubles
- bools as Boolean
- DateTimes as Strings
- nulls as the empty string
- anything else as Strings using .toString()

* Fix LGTM-flagged potential null pointer dereference
2020-06-25 08:18:28 +02:00
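
The cell type mapping listed under "Preserve cell data types" can be sketched as below. This is an illustration, not the extension's actual serializer: `toSheetValue` is a hypothetical helper, and representing date-times as `OffsetDateTime` rendered with `toString()` (ISO-8601) is an assumption, since the commit message only says they are sent as Strings.

```java
import java.time.OffsetDateTime;

class SheetCellValueSketch {
    // Maps a cell value to the representation described in the commit message.
    static Object toSheetValue(Object value) {
        if (value == null) {
            return ""; // nulls become the empty string
        }
        if (value instanceof Number) {
            return ((Number) value).doubleValue(); // integers and floats are sent as Doubles
        }
        if (value instanceof Boolean) {
            return value; // booleans pass through unchanged
        }
        if (value instanceof OffsetDateTime) {
            return value.toString(); // assumption: ISO-8601 text; the real format is not stated
        }
        return value.toString(); // anything else is sent as a String
    }

    public static void main(String[] args) {
        System.out.println(toSheetValue(42));
        System.out.println(toSheetValue(Boolean.TRUE));
        System.out.println(toSheetValue(OffsetDateTime.parse("2020-06-25T08:18:28+02:00")));
    }
}
```

The date-time branch is kept separate from the generic `toString()` fallback only to mark where a specific date format would go.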
dependabot-preview[bot]
de309158c9
Bump plexus-archiver from 4.0.0 to 4.2.2 (#2736)
* Bump plexus-archiver from 4.0.0 to 4.2.2

Bumps [plexus-archiver](https://github.com/codehaus-plexus/plexus-archiver) from 4.0.0 to 4.2.2.
- [Release notes](https://github.com/codehaus-plexus/plexus-archiver/releases)
- [Changelog](https://github.com/codehaus-plexus/plexus-archiver/blob/master/ReleaseNotes.md)
- [Commits](https://github.com/codehaus-plexus/plexus-archiver/compare/plexus-archiver-4.0.0...plexus-archiver-4.2.2)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>

* Add comment to explain dependency override

Co-authored-by: dependabot-preview[bot] <27856297+dependabot-preview[bot]@users.noreply.github.com>
Co-authored-by: Antonin Delpeuch <antonin@delpeuch.eu>
2020-06-25 08:03:48 +02:00
Tom Morris
7f435bd3df
Remove obsolete Google API key reference (#2809)
This key was used for the Freebase APIs and is no longer
referenced anywhere.
2020-06-25 07:57:04 +02:00
Tom Morris
a24f2f3feb
Merge pull request #2802 from OpenRefine/dependabot/maven/com.google.apis-google-api-services-drive-v3-rev20200609-1.30.9
Bump google-api-services-drive from v3-rev20200413-1.30.9 to v3-rev20200609-1.30.9
2020-06-24 23:37:46 -04:00
dependabot-preview[bot]
3c4712fb43
Bump google-api-services-drive
Bumps google-api-services-drive from v3-rev20200413-1.30.9 to v3-rev20200609-1.30.9.

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-06-25 03:08:30 +00:00
Tom Morris
f9eb819b01
Merge pull request #2737 from OpenRefine/dependabot/maven/org.slf4j-slf4j-log4j12-1.7.30
Bump slf4j-log4j12 from 1.7.18 to 1.7.30
2020-06-24 16:00:22 -04:00
Antonin Delpeuch
d5dc123bb1
Merge pull request #2808 from weblate/weblate-openrefine-translations
Translations update from Weblate
2020-06-24 21:56:55 +02:00
Hosted Weblate
07fbc70ada
Merge branch 'origin/master' into Weblate.
2020-06-24 21:41:53 +02:00
Adolfo Jayme Barrientos
b581721ede
Translated using Weblate (Spanish)
Currently translated at 94.3% (182 of 193 strings)

Translation: OpenRefine/wikidata
Translate-URL: https://hosted.weblate.org/projects/openrefine/wikidata/es/
2020-06-24 21:41:50 +02:00
Isao Matsunami
1b30d61b2f
Translated using Weblate (Japanese)
Currently translated at 100.0% (753 of 753 strings)

Translation: OpenRefine/Translations
Translate-URL: https://hosted.weblate.org/projects/openrefine/translations/ja/
2020-06-24 21:41:46 +02:00
Adolfo Jayme Barrientos
cf388fc5f4
Translated using Weblate (Spanish)
Currently translated at 99.4% (749 of 753 strings)

Translation: OpenRefine/Translations
Translate-URL: https://hosted.weblate.org/projects/openrefine/translations/es/
2020-06-24 21:41:46 +02:00
dependabot-preview[bot]
55171a85eb
Bump mariadb-java-client from 2.6.0 to 2.6.1 (#2801)
Bumps [mariadb-java-client](https://github.com/mariadb-corporation/mariadb-connector-j) from 2.6.0 to 2.6.1.
- [Release notes](https://github.com/mariadb-corporation/mariadb-connector-j/releases)
- [Changelog](https://github.com/mariadb-corporation/mariadb-connector-j/blob/master/CHANGELOG.md)
- [Commits](https://github.com/mariadb-corporation/mariadb-connector-j/compare/2.6.0...2.6.1)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>

Co-authored-by: dependabot-preview[bot] <27856297+dependabot-preview[bot]@users.noreply.github.com>
2020-06-24 11:59:48 +02:00
dependabot-preview[bot]
d297026f29
Bump maven-antrun-plugin from 1.4 to 3.0.0 (#2795)
* Bump maven-antrun-plugin from 1.4 to 3.0.0

Bumps [maven-antrun-plugin](https://github.com/apache/maven-antrun-plugin) from 1.4 to 3.0.0.
- [Release notes](https://github.com/apache/maven-antrun-plugin/releases)
- [Commits](https://github.com/apache/maven-antrun-plugin/compare/maven-antrun-plugin-1.4...maven-antrun-plugin-3.0.0)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>

* Replace <tasks> with <target> for maven-antrun-plugin 3.0.0

Co-authored-by: dependabot-preview[bot] <27856297+dependabot-preview[bot]@users.noreply.github.com>
Co-authored-by: Tom Morris <tfmorris@gmail.com>
2020-06-24 09:01:24 +02:00
Tom Morris
76d30ee1f0
Merge pull request #2794 from OpenRefine/dependabot/maven/org.codehaus.mojo-build-helper-maven-plugin-3.2.0
Bump build-helper-maven-plugin from 3.1.0 to 3.2.0
2020-06-23 17:08:40 -04:00
Tom Morris
d97d6c66b8
Update Google API dependencies for GData extension (#2754)
* Update Google API dependencies for Sheets & Drive

Remove unnecessary direct dependencies which are transitive
dependencies of those.

* Fix use of deprecated class
2020-06-23 21:55:46 +02:00
Tom Morris
1849e62234
Better error handling for reconciliation process - fixes #2590 (#2671)
* Harden reconciliation - Fixes #2590

- check for non-JSON / unparseable JSON returns
- handle malformed results response with no name for candidates
- catch any Exception, not just IOExceptions
- call processManager.onFailedProcess() for cleanup on error

* Add default constructor for Jackson

Jackson complains about needing a default constructor for the
NON_DEFAULT annotation, but I'm not sure why this worked before.

* Clean up indentation and unused variable - no functional changes

Make indentation consistent throughout the module, changing recently
added lines to use the standard all spaces convention.

Remove unused count variable

* Simplify control flow

* Update limit parameter comment. No functional change.

* Replace ternary expression which is causing NPE

* Add reconciliation tests using mock HTTP server
2020-06-23 21:54:54 +02:00
dependabot-preview[bot]
408b782117
Bump build-helper-maven-plugin from 3.1.0 to 3.2.0
Bumps [build-helper-maven-plugin](https://github.com/mojohaus/build-helper-maven-plugin) from 3.1.0 to 3.2.0.
- [Release notes](https://github.com/mojohaus/build-helper-maven-plugin/releases)
- [Commits](https://github.com/mojohaus/build-helper-maven-plugin/compare/build-helper-maven-plugin-3.1.0...build-helper-maven-plugin-3.2.0)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-06-23 08:45:07 +00:00
Tom Morris
6e66cb5144
Merge pull request #2734 from OpenRefine/dependabot/maven/org.codehaus.mojo-exec-maven-plugin-3.0.0
Bump exec-maven-plugin from 1.3 to 3.0.0
2020-06-22 13:08:11 -04:00
Tom Morris
0bfa3dd68a
Merge pull request #2745 from OpenRefine/dependabot/maven/org.codehaus.mojo-build-helper-maven-plugin-3.1.0
Bump build-helper-maven-plugin from 1.8 to 3.1.0
2020-06-22 13:06:02 -04:00
Tom Morris
6b00c7b602
Merge pull request #2782 from tfmorris/2306-gdata-empty-cells
Fix Google Sheets export with empty cells
2020-06-22 13:01:23 -04:00