Commit Graph

135 Commits

Author SHA1 Message Date
Stefano Mazzocchi
1b9cfbbf90 detabbing (no functional changes)
David, you might want to check your editor settings, you're mixing tabs with spaces


git-svn-id: http://google-refine.googlecode.com/svn/trunk@724 7d457c2a-affb-35e4-300a-418c747d4874
2010-05-12 09:02:41 +00:00
Stefano Mazzocchi
11da70d223 Applying patch for Issue 21 from iainsproat
git-svn-id: http://google-refine.googlecode.com/svn/trunk@722 7d457c2a-affb-35e4-300a-418c747d4874
2010-05-12 08:59:05 +00:00
Stefano Mazzocchi
fe0afa0bc3 Fixed Issue #18
git-svn-id: http://google-refine.googlecode.com/svn/trunk@721 7d457c2a-affb-35e4-300a-418c747d4874
2010-05-12 07:53:26 +00:00
David Huynh
8c03c1ddcf Prevent autosave timer events from bunching up when the computer is put into sleep mode.
Don't autosave while creating or importing projects, exporting rows, or uploading data to Freebase. Those are potentially intensive operations.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@627 7d457c2a-affb-35e4-300a-418c747d4874
2010-05-09 04:34:36 +00:00
David Huynh
fae6701493 Added support for exporting a scatterplot facet's image as a large image.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@614 7d457c2a-affb-35e4-300a-418c747d4874
2010-05-06 20:44:40 +00:00
Stefano Mazzocchi
e6d36710ff findbug cleanups
git-svn-id: http://google-refine.googlecode.com/svn/trunk@599 7d457c2a-affb-35e4-300a-418c747d4874
2010-05-05 01:35:51 +00:00
Stefano Mazzocchi
92ecc0c0f5 detab + dedos for java files (no functional changes)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@594 7d457c2a-affb-35e4-300a-418c747d4874
2010-05-04 23:24:48 +00:00
David Huynh
883fc65304 Minor bug: blank maxColumns param caused SplitColumnCommand to throw an exception.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@587 7d457c2a-affb-35e4-300a-418c747d4874
2010-05-04 16:14:13 +00:00
David Huynh
9641f28fe0 Made extend data from freebase command available on columns not officially reconciled, since some columns might contain reconciled data copied from other columns.
More error checking in the extend data utility.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@583 7d457c2a-affb-35e4-300a-418c747d4874
2010-05-02 22:27:17 +00:00
David Huynh
d303adc48e Made data upload dialog show only a limited preview of triples, but made actual uploading process and the tripleloader exporter generate all triples. Added spinner busy dialog during uploading process.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@582 7d457c2a-affb-35e4-300a-418c747d4874
2010-05-01 00:11:44 +00:00
David Huynh
bab1e8905b Jacked up jetty form upload size limit.
Added a few more array bound checks.
Reduced number of recon candidate and recon objects created by extend data operations.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@577 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-30 19:41:53 +00:00
Stefano Mazzocchi
ce40122754 - fixed oauth problem with non-127.0.0.1 hosts
- fixed scatterfacet filtering consistency
- increased size of scatterplot in the scatterfacet


git-svn-id: http://google-refine.googlecode.com/svn/trunk@573 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-30 06:12:42 +00:00
David Huynh
3f40195ea1 Implemented but disabled the denormalize operation.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@571 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-29 22:07:07 +00:00
David Huynh
89e1d8b5ac Got history entries' IDs into Recon objects so we can track from a Recon object to all others created by the same operation.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@562 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-28 17:18:44 +00:00
David Huynh
15c188ad7a Added more metadata into recon objects.
Tried to minimize number of unique recon objects created when calling Recon.dup().

git-svn-id: http://google-refine.googlecode.com/svn/trunk@560 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-27 23:17:18 +00:00
Stefano Mazzocchi
5cd0301e57 make sure that users can't easily bypass the upload badge checks simply by tweaking dom values from firebug
git-svn-id: http://google-refine.googlecode.com/svn/trunk@556 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-27 19:47:12 +00:00
Stefano Mazzocchi
b1375a8997 more polish
git-svn-id: http://google-refine.googlecode.com/svn/trunk@555 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-27 08:13:45 +00:00
Stefano Mazzocchi
0eb18633e6 implemented more conservative data loading workflow
git-svn-id: http://google-refine.googlecode.com/svn/trunk@554 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-27 07:13:11 +00:00
Stefano Mazzocchi
e6012bc14a Fixes for Freeq
git-svn-id: http://google-refine.googlecode.com/svn/trunk@552 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-27 00:18:31 +00:00
David Huynh
3ae72ea630 Minor bug.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@548 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-26 01:15:02 +00:00
David Huynh
fed3c87fa6 Added row flagging support. Fixed bug in row star change: starring or unstarring one row wasn't undo-able previously.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@547 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-26 01:08:56 +00:00
David Huynh
0778b324de Made facets' expressions editable.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@527 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-24 01:19:13 +00:00
David Huynh
f9a829758e Pool recons and recon candidates. This yields smaller project files, change files, and AJAX responses for get-rows. It should make re-loading existing projects faster.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@521 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-23 19:39:12 +00:00
Stefano Mazzocchi
6990604981 implemented the full gridworks -> freebase conduit via delegated oauth and freeq/tripleloader
(still doesn't work as argus returns a 500 but the entire conduit is in place)


git-svn-id: http://google-refine.googlecode.com/svn/trunk@519 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-23 08:25:52 +00:00
Stefano Mazzocchi
439474caeb Checkpoint for OAuth functionality in Gridworks
(doesn't work but since it's a substantial chunk of stuff, I want to get it in sooner rather than later)


git-svn-id: http://google-refine.googlecode.com/svn/trunk@516 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-21 21:08:34 +00:00
David Huynh
5ba67b7b26 Implemented column split command. It seems to be working in "by lengths" mode.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@510 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-20 23:08:14 +00:00
David Huynh
3b63e0b969 Scatterplot facet can now filter the rows.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@492 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-17 05:59:25 +00:00
Stefano Mazzocchi
7a716a4a1b - upgraded commons-codec to the latest version (needed for base64 encoding of data: uris)
- added the ability to embed the scatterplot inside the returned json data with data: uris (although it doesn't seem to work well)
- connected the selection logic to the scatterfacets (although it doesn't seem to filter the rows... and I'm puzzled as to why)
- reduced cut/paste and code overlap between the scatterplot generator and the scatterplot facet


git-svn-id: http://google-refine.googlecode.com/svn/trunk@490 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-17 03:00:38 +00:00
David Huynh
9e73a4e68c Started to work on a MARC importer. It doesn't work properly yet.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@486 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-16 19:52:01 +00:00
Stefano Mazzocchi
397861b612 - replace the 'cos' library with the apache 'commons-fileupload' for licensing reasons (the cos library had a weird arm-twisting license that forced you to buy an o'reilly book on servlets for each developer in your company... good thing I read it all)
- some tweaks on imgareaselect's look


git-svn-id: http://google-refine.googlecode.com/svn/trunk@483 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-15 18:42:41 +00:00
Stefano Mazzocchi
8cf69301a5 added a new command to get column metadata prior to creating the scatterplot half-matrix, this allows us to build a much more compact table and make the browser crawl a little less
git-svn-id: http://google-refine.googlecode.com/svn/trunk@481 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-15 06:20:56 +00:00
David Huynh
4a06c49a9a Added streaming json parser for faster re-loading of existing projects.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@470 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-13 23:57:03 +00:00
Stefano Mazzocchi
7ab1acd801 skeleton code for scatterfacet
git-svn-id: http://google-refine.googlecode.com/svn/trunk@453 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-12 19:22:49 +00:00
David Huynh
f7e830e709 Fixed bug in which editing a single cell and then starring the same row seemed to revert the cell back to its original content.
Added an option for not guessing cell value type during import.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@446 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-11 21:54:56 +00:00
Stefano Mazzocchi
81fb2f1740 first step at scatterplot facet selector
git-svn-id: http://google-refine.googlecode.com/svn/trunk@442 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-10 08:28:06 +00:00
David Huynh
a0d8c385f9 Do a bit more checking when retrieving project metadata just in case project metadata is null.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@435 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-09 04:52:32 +00:00
Stefano Mazzocchi
d3d40d608a bunch of PMD-induced fixes
(now the PMD report is clean)


git-svn-id: http://google-refine.googlecode.com/svn/trunk@430 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-09 00:14:11 +00:00
David Huynh
0996b9e1dd Gzip project export tar files.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@394 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-06 22:28:30 +00:00
David Huynh
9d9329ca96 Implemented row remove command.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@391 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-06 07:47:44 +00:00
David Huynh
1fd85c62bf Implemented column rename command.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@390 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-06 07:15:34 +00:00
David Huynh
f402db10af Implemented inter-project joins.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@387 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-06 05:35:48 +00:00
Stefano Mazzocchi
771810bc0d avoid exception if there is only one extension in the whole archive
git-svn-id: http://google-refine.googlecode.com/svn/trunk@385 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-05 21:36:27 +00:00
Stefano Mazzocchi
2efbf0031f - removed the 'thirdparty' directory (now the 'gridworks' script will download and install needed tools if they are not present in the system already)
- added 'findbugs' command that uses the findbugs static analyzer to look for problems in the code
- fixed a bunch of issues that findbugs found (a few methods would go a little faster, and a few NPEs will be avoided... nothing major but good to have)


git-svn-id: http://google-refine.googlecode.com/svn/trunk@382 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-05 07:15:16 +00:00
Stefano Mazzocchi
798b2a36ca - archive and compressed file importer (supports zip, tar, gz, bz2, tar.gz and tar.bz2)
(works by loading the files that have the most common extensions in the archive)
- changed default max heap to 3Gb


git-svn-id: http://google-refine.googlecode.com/svn/trunk@381 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-04 07:48:47 +00:00
Stefano Mazzocchi
c24ec94835 had to shuffle around a bunch of classes to separate the main server classloader from the context classloader and allow reloading to happen for real
git-svn-id: http://google-refine.googlecode.com/svn/trunk@377 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-03 01:01:11 +00:00
Stefano Mazzocchi
72203cd3d5 - moved all code that contained MIT IP outside (http://code.google.com/p/simile-vicino/)
- moved bzip2 and tar code from apache ant into their own jar files
- now gridworks source contains only com.metaweb.* code everything else is a jar dependency
- started to work on archive importer


git-svn-id: http://google-refine.googlecode.com/svn/trunk@376 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-02 23:40:12 +00:00
Stefano Mazzocchi
62f5f21ca3 atom is handled as well by the XML importer
git-svn-id: http://google-refine.googlecode.com/svn/trunk@374 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-02 06:44:05 +00:00
Stefano Mazzocchi
0e07ec7acc crude, I know, but for now make Gridworks digest RDF/XML as if it were XML (works surprisingly well, btw)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@369 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-01 16:56:38 +00:00
Stefano Mazzocchi
dced641599 - added the ability to specify the character separator for CSV or TSV files that don't use commas or tabs (this was needed to parse a dataset that we got from the BBC to try things out)
- used commons-lang split function instead of the java String.split one, this is necessary to avoid having to escape separators that might be confused for regexps


git-svn-id: http://google-refine.googlecode.com/svn/trunk@368 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-31 22:34:21 +00:00
David Huynh
df7389876f First shot at XML import.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@354 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-24 23:08:08 +00:00