David Huynh
a1a8758c37
Added options for specifying # lines the header columns take, and the # lines to skip processing entirely initially.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@468 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-13 21:23:41 +00:00
David Huynh
a2db5590ac
Trim column names on import.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@461 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-13 06:28:13 +00:00
David Huynh
f7e830e709
Fixed bug in which editing a single cell and then starring the same row seemed to revert the cell back to its original content.
Added an option for not guessing cell value type during import.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@446 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-11 21:54:56 +00:00
David Huynh
5928a689e2
Use RowParser for parsing the header row, too.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@444 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-11 03:42:44 +00:00
Stefano Mazzocchi
d3d40d608a
bunch of PMD-induced fixes
(now the PMD report is clean)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@430 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-09 00:14:11 +00:00
David Huynh
5320cc6587
Make duplicated column names unique during import by appending indices to them.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@392 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-06 17:55:36 +00:00
Stefano Mazzocchi
798b2a36ca
- archive and compressed file importer (supports zip, tar, gz, bz2, tar.gz and tar.bz2)
(works by loading the files that have the most common extensions in the archive)
- changed default max heap to 3Gb
git-svn-id: http://google-refine.googlecode.com/svn/trunk@381 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-04 07:48:47 +00:00
Stefano Mazzocchi
dced641599
- added the ability to specify the character separator for CSV or TSV files that don't use commas or tabs (this was needed to parse a dataset that we got from the BBC to try things out)
- used commons-lang split function instead of the java String.split one, this is necessary to avoid having to escape separators that might be confused for regexps
git-svn-id: http://google-refine.googlecode.com/svn/trunk@368 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-31 22:34:21 +00:00
David Huynh
7e2667ab45
Minor bug in Excel importer: we forgot to update the max cell index.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@281 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-12 00:23:01 +00:00
David Huynh
b75f1faea8
Changed tabs to spaces. No functionality change.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@174 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-03 04:19:58 +00:00
David Huynh
bb83dcda1c
Added support for specifying number of initial rows to skip when creating a new project.
Fixed the height of the histogram images in range facets to eliminate jitters.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@135 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-24 18:52:54 +00:00
David Huynh
254853b51d
Added reverse and sort functions.
Support a limit on how many rows to load into a new project.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@134 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-23 23:22:02 +00:00
Stefano Mazzocchi
a61f35079a
make eclipse happier by removing @Override annotations when really it's an interface method implementation
(no functional changes)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@62 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-07 06:47:52 +00:00
David Huynh
a025b272bd
String.isEmpty() is no longer there (?!).
git-svn-id: http://google-refine.googlecode.com/svn/trunk@61 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-07 06:16:46 +00:00
David Huynh
16dda46a61
Refactored importers, adding support for Excel files.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@47 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-05 19:19:38 +00:00