* Implemented RestrictedPosition Scrutinizer tests using mocks
Added RestrictedPositionConstraint class and updated test cases using mocks
* Tests updated & working fine
* Truncate any completely empty columns on the right
Fixes #565
The current versions of Open Office create default spreadsheets
with over 1000 empty columns. Keep track of the rightmost
non-empty column when importing and truncate everything else.
Also adds a basic ODS import test.
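
The truncation described above can be sketched as follows. This is only an illustration, assuming rows are held as lists of strings; the class and method names are hypothetical and are not the importer's actual code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: rows are modelled as List<String>, not real importer cells.
final class TrailingColumnTrimmer {

    /** Returns the index of the rightmost non-empty cell in any row, or -1 if all are empty. */
    static int rightmostNonEmpty(List<List<String>> rows) {
        int max = -1;
        for (List<String> row : rows) {
            // Scan from the right, and only over positions beyond the current maximum.
            for (int i = row.size() - 1; i > max; i--) {
                String cell = row.get(i);
                if (cell != null && !cell.isEmpty()) {
                    max = i;
                    break;
                }
            }
        }
        return max;
    }

    /** Returns copies of the rows with everything right of the last non-empty column dropped. */
    static List<List<String>> truncate(List<List<String>> rows) {
        int keep = rightmostNonEmpty(rows) + 1;
        List<List<String>> out = new ArrayList<>();
        for (List<String> row : rows) {
            out.add(new ArrayList<>(row.subList(0, Math.min(keep, row.size()))));
        }
        return out;
    }
}
```

Empty cells to the left of the rightmost non-empty column are kept, so column positions do not shift.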
* Fix dates in ODS spreadsheets
Fixes #2224
* Performance optimized version of ToNumber
Approximately 5x faster for floats (data dependent)
and about the same speed for integers.
- Instead of blindly trying to parse as Long, do a quick check
for obvious problems (e.g. decimal point).
- Don't trim; the called methods already do it.
- Use valueOf() instead of parse() to avoid object creation
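
A minimal sketch of the fast path described in the list above, assuming string input only. The class name is hypothetical, and returning null for unparseable input is a simplification for illustration; it is not necessarily what ToNumber itself returns.

```java
// Hypothetical sketch of the approach; not the actual ToNumber implementation.
final class ToNumberSketch {

    static Number toNumber(String s) {
        if (s == null || s.isEmpty()) {
            return null; // assumption: the sketch signals failure with null
        }
        // Quick check for an obvious problem: a decimal point can never parse as a Long,
        // so skip the Long attempt (and its exception) entirely in that case.
        if (s.indexOf('.') < 0) {
            try {
                return Long.valueOf(s);
            } catch (NumberFormatException e) {
                // Not an integer (e.g. "1e3"); fall through to the Double attempt.
            }
        }
        try {
            // No explicit trim: Double.valueOf ignores leading and trailing whitespace itself.
            return Double.valueOf(s);
        } catch (NumberFormatException e) {
            return null;
        }
    }
}
```

Skipping the Long attempt for strings containing a decimal point avoids constructing and catching a NumberFormatException on most floats, which is presumably where the bulk of the speedup comes from.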
* Add Java Microbenchmark Harness
The shaded JAR is missing the OpenRefine classes, for a reason
that I haven't figured out, so it requires openrefine-main.jar at runtime.
* Remove old implementations of ToNumber
* Remove unneeded dependencies from main project
* Clean up and reformat
Refs #2863
The tree importer sorts columns/column groups by how populated
they are, which is of arguable utility, but the tie-breaker
of ordering by shortest column name is completely silly.
This change removes that and, in conjunction with a stable sort
algorithm, will preserve the original order of the columns.
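
To illustrate why dropping the tie-breaker preserves the original order, here is a small sketch. The Column class and its fields are hypothetical stand-ins, not the tree importer's real types. It relies on the stable sort used by ArrayList.sort.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch; Column is a stand-in for the importer's column/column-group records.
final class StableColumnOrder {

    static final class Column {
        final String name;
        final int nonBlankCount;

        Column(String name, int nonBlankCount) {
            this.name = name;
            this.nonBlankCount = nonBlankCount;
        }
    }

    /** Sorts most-populated first; equally populated columns keep their input order. */
    static List<Column> sortByPopulation(List<Column> columns) {
        List<Column> sorted = new ArrayList<>(columns);
        // No secondary comparison on name length: ties compare as equal,
        // and the stable sort leaves them in their original relative order.
        sorted.sort(Comparator.comparingInt((Column c) -> c.nonBlankCount).reversed());
        return sorted;
    }
}
```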
* Fix two deprecated method usages
* Test ToNumber conversions
* Test behavior of all functions when passed 0 or 8 arguments
There are 16 which fail currently on 0 args (return null or
False instead of EvalError), but have been whitelisted until
we can verify whether it's safe to change them without introducing
compatibility issues.
There are 19 which fail to return an error on too many (i.e. 8) args.
No issue.
- we don't support Excel95, but make sure that it generates an exception
- move the test data file into the appropriate directory
- for any normal test, consider exceptions a failure
* Implemented DistinctValueScrutinizer tests using Mockito
Added inner class to the scrutinizer and updated the tests using mocks.
* Tests updated; testNoIssue added
* all tests updated & working fine
* Implemented SingleValue Scrutinizer tests using mocks
Updated test class & added inner class to the scrutinizer
* tests updated
* Updated SingleValueConstraint class
* begin Docusaurus 2 migration
* Need help fixing the broken 'index'
* needs further customizing footer if we want
* fix README.md
* fixed Pages and Sidebar not loading
Yeah!
* Revert "fixed Pages and Sidebar not loading"
This reverts commit b1588387fc89d650b391c5a8883b6100c4714fbd.
* Revert "fix README.md"
This reverts commit a81509c3c62f11370df40096e55dfd544dad2f87.
* Revert "begin Docusaurus 2 migration"
This reverts commit 59d59c355b8d2a1a270a5655922d53a0577d6414.
* clean move the files for Antonin
* fix broken Navbar links
* fix wrong GitHub link pointing to Docusaurus href
* Fix the edit link for GitHub in top right corner
* Copy content from wiki into Technical Reference
* Copy pages from wiki for top level Architecture
* fix sidebar ordering for Tech
* Add colors from our logo into Infima color matrix
* add comment about colors
* shift primary color by 1 shade in matrix
Fixes #2824
Versions up through 3.14.0 appear to work, but since odfdom bundles
Jena 3.9.0, we're going to be conservative and match that.
As an added bonus, includes a blank node test which will trigger
the failure.
updated test class by creating mocks for ConstraintFetcher
Implemented tests for conflicts-with scrutinizer using mocks
Added testcase for no statementList & multiple constraint.
Implemented tests using mock for conflicts-with scrutinizer
Added test case for multiple constraints
* Fix charset encoding & MIME type handling
Character set (i.e. what we call "encoding") is part of the Content-Type,
*not* the Content-Encoding, which specifies compression (e.g. gzip).
This correctly sets the character set encoding as well as cleaning
the MIME type so that additional parsing doesn't need to be done
downstream (and removes that code).
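
As an illustration of the distinction above, the sketch below splits a Content-Type header value into a cleaned MIME type and its charset parameter. The class and method names are hypothetical, and the parsing is deliberately simplified (it does not handle semicolons inside quoted parameter values).

```java
import java.util.Locale;

// Hypothetical sketch; simplified parsing of a Content-Type header value.
final class ContentTypeSketch {

    /** Returns the bare, lower-cased MIME type with any parameters removed. */
    static String mimeType(String contentType) {
        if (contentType == null) {
            return null;
        }
        int semi = contentType.indexOf(';');
        String mime = semi < 0 ? contentType : contentType.substring(0, semi);
        return mime.trim().toLowerCase(Locale.ROOT);
    }

    /** Returns the charset parameter of the Content-Type, or null if absent. */
    static String charset(String contentType) {
        if (contentType == null) {
            return null;
        }
        String[] parts = contentType.split(";");
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            // Parameter names are case-insensitive.
            if (param.regionMatches(true, 0, "charset=", 0, 8)) {
                String value = param.substring(8).trim();
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1); // strip surrounding quotes
                }
                return value.isEmpty() ? null : value;
            }
        }
        return null;
    }
}
```

For example, `Text/CSV; charset="UTF-8"` yields the MIME type `text/csv` and the charset `UTF-8`. A Content-Encoding of `gzip` plays no part in either.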
* Use "text" instead of "text/line-based" as default fallback format
The TextLineBasedGuesser only tries a limited number of
formats (CSV, TSV, fixed), so we can't get out of that hole to
find JSON, XML, etc.
Start with a more general format instead to improve our
guessing odds.
* Support content type Structured Name Syntax Suffixes (+json +xml)
If we can't find a fully specified content type in our lookup,
fall back to just the suffix (which is registered with a leading +)
Fixes #2800
Fixes #2805
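
The suffix fallback for structured content types can be sketched as below. The registry is modelled as a plain map and the format names in the usage note are placeholders; neither reflects the actual lookup tables.

```java
import java.util.Map;

// Hypothetical sketch; the registry is a plain map keyed by MIME type or "+suffix".
final class SuffixFallbackSketch {

    /** Looks up the full MIME type first, then falls back to its "+suffix" entry. */
    static String lookupFormat(Map<String, String> registry, String mimeType) {
        String format = registry.get(mimeType);
        if (format != null) {
            return format;
        }
        int plus = mimeType.lastIndexOf('+');
        if (plus >= 0) {
            // Suffixes are registered with their leading '+', e.g. "+json".
            return registry.get(mimeType.substring(plus));
        }
        return null;
    }
}
```

With entries for `application/json` and `+json` both mapped to a JSON format, `application/ld+json` resolves through the `+json` entry, while a type with no match and no suffix returns null.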
* Support more than 26 columns
Google Sheets default to just 26 columns (A-Z) and we need to
explicitly add more columns if we need them.
Fixes #2760
* Improve Google Sheets upload
- upload in chunks instead of serializing the entire document at once
- Free up resources as we go
- stop if an error occurs
- reduce batch size to try to stay within the 10MB request size limit
(but we probably need a more dynamic way to do this for very wide
sheets or sheets with large values)
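
A rough sketch of the chunked upload loop described above. The uploader callback is a hypothetical stand-in for the actual Google Sheets request, and a fixed row count per batch is assumed rather than a byte-size estimate.

```java
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch; the uploader callback stands in for the real API request.
final class ChunkedUploadSketch {

    /** Sends rows in batches of at most batchSize and returns the number of batches sent. */
    static <T> int uploadInChunks(List<T> rows, int batchSize, Consumer<List<T>> uploader) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int sent = 0;
        for (int start = 0; start < rows.size(); start += batchSize) {
            int end = Math.min(start + batchSize, rows.size());
            // Only the current slice is handed to the uploader. If it throws,
            // the exception propagates and no further batches are sent.
            uploader.accept(rows.subList(start, end));
            sent++;
        }
        return sent;
    }
}
```

A fixed row count does not bound the request size for very wide sheets or large values, which is the limitation noted above.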
* Add basic test and do some cleanup
- add test for columns > 26
- refactor to allow testing and not depend on unnecessary fields
- add i18n TODO for translating spreadsheet description
* Preserve cell data types
Fixes #2785
- integers and floats are sent as Doubles
- bools as Boolean
- DateTimes as Strings
- nulls as the empty string
- anything else as Strings using .toString()
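
The mapping in the list above could look roughly like this. The class and method names are hypothetical, and representing date cells as java.time.OffsetDateTime is an assumption made for the sketch.

```java
import java.time.OffsetDateTime;

// Hypothetical sketch of the type mapping; not the actual exporter code.
final class CellValueSketch {

    /** Converts a cell value into the value sent to the spreadsheet. */
    static Object toSheetValue(Object value) {
        if (value == null) {
            return ""; // nulls become the empty string
        }
        if (value instanceof Number) {
            return ((Number) value).doubleValue(); // integers and floats are sent as Doubles
        }
        if (value instanceof Boolean) {
            return value; // booleans are passed through unchanged
        }
        if (value instanceof OffsetDateTime) {
            return value.toString(); // dates are sent as ISO-8601 strings (assumed date type)
        }
        return value.toString(); // anything else falls back to toString()
    }
}
```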
* Fix LGTM-flagged potential null pointer dereference