* Add utility functions to check/convert dates
* Add date tests and refactor to DRY up
* Fix date import - fixes #1908
Change from java.util.Date to OpenRefine 3.0+'s OffsetDateTime
Fixes #1908
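A minimal sketch of the kind of conversion this entry describes, going from a legacy java.util.Date to an OffsetDateTime and back. The class and method names are hypothetical and are not the helpers added in this change; the time zone is passed explicitly because the log does not say which zone the real code uses.

```java
import java.time.OffsetDateTime;
import java.time.ZoneId;
import java.util.Date;

public class DateConversionSketch {

    // Hypothetical helper: interpret the Date's instant in the given zone.
    static OffsetDateTime toOffsetDateTime(Date date, ZoneId zone) {
        return date.toInstant().atZone(zone).toOffsetDateTime();
    }

    // Hypothetical helper: the reverse direction, which drops the offset
    // and keeps only the instant.
    static Date toDate(OffsetDateTime dateTime) {
        return Date.from(dateTime.toInstant());
    }
}
```

Round-tripping through these two methods preserves the instant but not the offset, which is one reason to centralize the conversion in a single place.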
* Centralize date conversion
* Moving utility methods to ParsingUtilities
* Fix tests
* Sleep to wait for database servers on Mac - refs #2861
* Tweak JDK build settings (again) - refs #2861
- Use default JDK on platforms where possible
- Make problematic builds (requiring Java installs) optional
* Use standard text normalization - fixes #2898
Fixes #2898. Fixes #409. Refs #650
Replaces homegrown ISO Latin-1 only character substitution
with standard Java normalization to NFD, followed by diacritic
removal and a few custom character expansions/replacements.
* Fix Mac build
* Improve compatibility with previous code
One intentional change is folding O with stroke to
oe instead of o.
- Use more powerful NFKD instead of NFD
- Strip punctuation after decomposition since it can generate
new punctuation
- Add compatibility test for old asciify() method
- Add some graphically similar characters to substitution table
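The normalization steps described in the entries above can be sketched roughly as follows. This is an illustration under assumptions, not the code from this change: the class name, method name, and the two-entry substitution table are hypothetical, and only ASCII punctuation is stripped here.

```java
import java.text.Normalizer;

public class AsciifySketch {

    // Hypothetical folding routine following the order given in the log:
    // decompose, drop diacritics, apply custom replacements, strip punctuation.
    static String fold(String s) {
        // NFKD also expands compatibility characters such as ligatures.
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFKD);
        // Remove the combining marks left behind by decomposition.
        String stripped = decomposed.replaceAll("\\p{M}+", "");
        // Characters with no Unicode decomposition need explicit entries.
        // (Tiny illustrative table; the real one is larger.)
        String replaced = stripped
                .replace("\u00F8", "oe")   // o with stroke
                .replace("\u0153", "oe");  // oe ligature
        // Strip punctuation last, because decomposition can create it:
        // U+2026 (horizontal ellipsis) becomes three full stops under NFKD.
        return replaced.replaceAll("\\p{Punct}", "");
    }
}
```

Stripping punctuation before decomposition would leave the three full stops produced from the ellipsis in place, which is the ordering problem the entry mentions.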
* Add oe character/ligature & more long S forms
* More tests for ligatures and Latin Extended
* Add Latin-1 Supplement tests
Fixes #1161
This change parallels what was done in #12571da3c00 to fix
the FingerprintKeyer and moves the diacritic removal before
the deduping. Includes a test.
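To illustrate why the ordering matters, here is a small sketch of a token-based keyer that removes diacritics before deduplicating. The class and method names are hypothetical and this is not the keyer code touched by the change; it only shows the effect of the ordering.

```java
import java.text.Normalizer;
import java.util.Locale;
import java.util.TreeSet;

public class KeyerOrderSketch {

    // Hypothetical diacritic removal: decompose, then drop combining marks.
    static String stripDiacritics(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFKD)
                .replaceAll("\\p{M}+", "");
    }

    // Hypothetical keyer: fold each token first, then let the sorted set
    // discard duplicates.
    static String key(String s) {
        TreeSet<String> tokens = new TreeSet<>();
        for (String token : s.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            tokens.add(stripDiacritics(token));
        }
        return String.join(" ", tokens);
    }
}
```

If the tokens were deduplicated first and folded afterwards, an accented token and its unaccented twin would both survive and the key would contain the same word twice.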
* Adjust Travis build environments - fixes #2861
Fixes #2861
- Only builds one each of JDK 11-14
- Fixes all validator warnings
- Switches default build environment to bionic
- Uses trusty for an Oracle JDK 8 build
- Adds OS X build
- Adds JDK 13 & 14 builds
- Adds placeholder for JDK 16 builds
(but Jacoco doesn't currently support it,
so commented out)
- Reorder build jobs so that most informative ones run first
- Split before_install into before_install and
before_script sections
* Drop redundant JDK 13 build
* Swap OS X to JDK 14 instead of JDK 13
This doesn't have anything to do with JDK or OS X versions,
but instead the Travis CI build images. A bug in the homebrew
support was only fixed in recent images, so we need to use
an xcode11 build which implies macOS 10.14 or 10.15 and
JDK 14 or 14.0.1.
* Implemented RestrictedPosition Scrutinizer tests using mocks
Added RestrictedPositionConstraint class and updated test cases using mocks
* Tests updated & working fine
* Truncate any completely empty columns on the right
Fixes #565
The current versions of Open Office create default spreadsheets
with over 1000 empty columns. Keep track of the rightmost
non-empty column when importing and truncate everything else.
Also adds a basic ODS import test.
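The truncation idea can be sketched as below: track the rightmost column that holds a non-empty cell, then cut every row at that point. The class name and the list-of-lists row representation are assumptions made for illustration; the importer itself works on spreadsheet cells rather than strings.

```java
import java.util.ArrayList;
import java.util.List;

public class TrimColumnsSketch {

    // Hypothetical helper: drop columns to the right of the last non-empty one.
    static List<List<String>> truncate(List<List<String>> rows) {
        int rightmost = -1;
        for (List<String> row : rows) {
            // Scan from the right; stop once we reach a column already known
            // to be in use.
            for (int c = row.size() - 1; c > rightmost; c--) {
                String cell = row.get(c);
                if (cell != null && !cell.isEmpty()) {
                    rightmost = c;
                    break;
                }
            }
        }
        List<List<String>> result = new ArrayList<>();
        for (List<String> row : rows) {
            int end = Math.min(row.size(), rightmost + 1);
            result.add(new ArrayList<>(row.subList(0, end)));
        }
        return result;
    }
}
```

Empty columns to the left of the rightmost non-empty column are kept, so interior blanks still line up with their headers.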
* Fix dates in ODS spreadsheets
Fixes #2224
* Performance optimized version of ToNumber
Approximately 5x faster for floats (data dependent)
and about the same speed for integers.
- Instead of blindly trying to parse as Long, do a quick check
for obvious problems (e.g. decimal point).
- Don't trim. It's already done by called methods.
- Use valueOf() instead of parse() to avoid object creation
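A rough sketch of the approach in the list above, under assumptions: the class and method names are hypothetical, the input is assumed to be non-null, and returning null for unparseable input is a choice made here for brevity rather than the behavior of the real function.

```java
public class ToNumberSketch {

    // Hypothetical conversion: a cheap scan decides whether the Long path is
    // worth trying, so that floats skip a failed parse and its exception.
    static Number toNumber(String s) {
        boolean looksIntegral = !s.isEmpty();
        for (int i = 0; i < s.length() && looksIntegral; i++) {
            char ch = s.charAt(i);
            boolean sign = i == 0 && s.length() > 1 && (ch == '-' || ch == '+');
            boolean digit = ch >= '0' && ch <= '9';
            looksIntegral = sign || digit;
        }
        if (looksIntegral) {
            try {
                return Long.valueOf(s);
            } catch (NumberFormatException e) {
                // Too many digits for a long; fall through to Double.
            }
        }
        try {
            return Double.valueOf(s);
        } catch (NumberFormatException e) {
            return null; // not a number
        }
    }
}
```

The saving comes mostly from not throwing and catching a NumberFormatException for every value that contains a decimal point or exponent.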
* Add Java Microbenchmark Harness
The shaded JAR is missing the OpenRefine classes, for a reason
that I haven't figured out, so requires openrefine-main.jar at runtime.
* Remove old implementations of ToNumber
* Remove unneeded dependencies from main project
* Clean up and reformat