Commit Graph

286 Commits

Author SHA1 Message Date
David Huynh
15c188ad7a Added more metadata into recon objects.
Tried to minimize number of unique recon objects created when calling Recon.dup().

git-svn-id: http://google-refine.googlecode.com/svn/trunk@560 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-27 23:17:18 +00:00
David Huynh
e77b99e58b For relevance service, auto-match only if the type matches, the score is at least 100, and if there is more than one result, the ratio of the first result's score over the second result's score must be at least 1.5.
For recon service, auto-match only if the result has match:true.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@559 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-27 21:38:47 +00:00
David Huynh
ca2bc0a304 Fixed null pointer exception problem in HeuristicReconConfig when trying to use "recon" service.
Made custom suggest widget rely on gridworks-helper acre app for fetching property suggestions.
Made various property suggest in recon dialog use our custom suggest widget.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@557 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-27 20:34:36 +00:00
Stefano Mazzocchi
5cd0301e57 make sure that users can't easily bypass the upload badge checks simply by tweaking DOM values from Firebug
git-svn-id: http://google-refine.googlecode.com/svn/trunk@556 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-27 19:47:12 +00:00
Stefano Mazzocchi
b1375a8997 more polish
git-svn-id: http://google-refine.googlecode.com/svn/trunk@555 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-27 08:13:45 +00:00
Stefano Mazzocchi
0eb18633e6 implemented more conservative data loading workflow
git-svn-id: http://google-refine.googlecode.com/svn/trunk@554 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-27 07:13:11 +00:00
Stefano Mazzocchi
e6012bc14a Fixes for Freeq
git-svn-id: http://google-refine.googlecode.com/svn/trunk@552 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-27 00:18:31 +00:00
David Huynh
fece6187bf Jython libraries should now be properly imported on Windows as well.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@551 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-26 17:57:52 +00:00
David Huynh
53d7bd3287 Another star to flag copy and paste bug.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@549 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-26 01:18:57 +00:00
David Huynh
3ae72ea630 Minor bug.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@548 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-26 01:15:02 +00:00
David Huynh
fed3c87fa6 Added row flagging support. Fixed bug in row star change: starring or unstarring one row wasn't undo-able previously.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@547 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-26 01:08:56 +00:00
David Huynh
a734a9c6cb Initialize the Jython library with the custom lib/jython/ path if we're running as a packaged app on Mac OS X.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@542 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-25 20:05:43 +00:00
David Huynh
007da57c1e More work on the extend data dialog. The suggested properties are now populated by the gridworks-helper acre app. Constraints can be specified per column, in the free form of a MQL query. It's a temporary solution.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@540 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-25 06:39:03 +00:00
Stefano Mazzocchi
1a6d1cf6b2 more polish
git-svn-id: http://google-refine.googlecode.com/svn/trunk@528 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-24 02:36:56 +00:00
David Huynh
0778b324de Made facets' expressions editable.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@527 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-24 01:19:13 +00:00
David Huynh
f9a829758e Pool recons and recon candidates. This yields smaller project files, change files, and AJAX responses for get-rows. It should make re-loading existing projects faster.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@521 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-23 19:39:12 +00:00
Stefano Mazzocchi
3e37970540 polishing (no functional changes)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@520 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-23 17:30:18 +00:00
Stefano Mazzocchi
6990604981 implemented the full gridworks -> freebase conduit via delegated oauth and freeq/tripleloader
(still doesn't work as argus returns a 500 but the entire conduit is in place)

git-svn-id: http://google-refine.googlecode.com/svn/trunk@519 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-23 08:25:52 +00:00
Stefano Mazzocchi
439474caeb Checkpoint for OAuth functionality in Gridworks
(doesn't work but since it's a substantial chunk of stuff, I want to get it in sooner rather than later)

git-svn-id: http://google-refine.googlecode.com/svn/trunk@516 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-21 21:08:34 +00:00
David Huynh
5ba67b7b26 Implemented column split command. It seems to be working in "by lengths" mode.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@510 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-20 23:08:14 +00:00
Vishal Talwar
d0df704d8a added python code part of jython distribution in lib/jython-2.5.1
added python.path vm arg to startup script
fixed infinite loop in unwrap() when displaying sequences of sequences

git-svn-id: http://google-refine.googlecode.com/svn/trunk@509 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-20 18:50:24 +00:00
David Huynh
35da36b0e8 Fixed misspelling in clustering dialog.
Added option for not splitting lines into columns on import.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@508 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-20 07:26:07 +00:00
David Huynh
d85a0e1851 Retrieve dates correctly from Excel files.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@507 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-20 04:43:39 +00:00
David Huynh
2226d77c27 Oops, minor bug in range facet introduced in last check-in.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@504 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-19 23:45:18 +00:00
David Huynh
d1b0de95de Made our own slider widget to use in conjunction with our histogram widget.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@503 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-19 23:27:57 +00:00
David Huynh
72f1f0956e More polishing on the facet panel.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@498 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-18 00:56:09 +00:00
David Huynh
3b63e0b969 Scatterplot facet can now filter the rows.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@492 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-17 05:59:25 +00:00
Stefano Mazzocchi
85d7ed6b89 cleanup
git-svn-id: http://google-refine.googlecode.com/svn/trunk@491 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-17 03:02:34 +00:00
Stefano Mazzocchi
7a716a4a1b - upgraded commons-codec to the latest version (needed for base64 encoding of data: URIs)
- added the ability to embed the scatterplot inside the returned JSON data with data: URIs (although it doesn't seem to work well)
- connected the selection logic to the scatterfacets (although it doesn't seem to filter the rows... and I'm puzzled as to why)
- reduced cut/paste and code overlap between the scatterplot generator and the scatterplot facet

git-svn-id: http://google-refine.googlecode.com/svn/trunk@490 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-17 03:00:38 +00:00
David Huynh
8085208cf0 Fixed toTitlecase to handle fully capitalized text.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@489 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-16 23:42:52 +00:00
David Huynh
9e73a4e68c Started to work on a MARC importer. It doesn't work properly yet.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@486 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-16 19:52:01 +00:00
David Huynh
67662fcc96 Escape strings from TSV exporter.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@485 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-16 06:45:57 +00:00
Stefano Mazzocchi
1e5a787281 avoid ArrayIndexOutOfBoundsException
git-svn-id: http://google-refine.googlecode.com/svn/trunk@484 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-15 20:24:23 +00:00
Stefano Mazzocchi
397861b612 - replaced the 'cos' library with the Apache 'commons-fileupload' for licensing reasons (the cos library had a weird arm-twisting license that forced you to buy an O'Reilly book on servlets for each developer in your company... good thing I read it all)
- some tweaks on imgareaselect's look

git-svn-id: http://google-refine.googlecode.com/svn/trunk@483 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-15 18:42:41 +00:00
Stefano Mazzocchi
8cf69301a5 added a new command to get column metadata prior to creating the scatterplot half-matrix; this allows us to build a much more compact table and make the browser crawl a little less
git-svn-id: http://google-refine.googlecode.com/svn/trunk@481 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-15 06:20:56 +00:00
David Huynh
155b5a483a When deleting project dirs, we need to recurse into them ourselves.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@480 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-15 00:17:08 +00:00
David Huynh
1d938bc4d0 Better MQL batching during extending data operations.
Tried to use JSON streaming in changes as well.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@479 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-15 00:09:14 +00:00
David Huynh
2277f45ef6 For jython, wrap native values properly using Py.java2py().
git-svn-id: http://google-refine.googlecode.com/svn/trunk@478 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-14 21:30:39 +00:00
David Huynh
24a7ea91b6 Fixed bugs
- MassEditOperation was barfing when engineConfig was missing
- When parsing JSON in streaming mode, get long instead of int and double instead of float so that we won't get overflow exception.

git-svn-id: http://google-refine.googlecode.com/svn/trunk@476 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-14 20:34:29 +00:00
Stefano Mazzocchi
3bae823010 fixed Eclipse warning (no functional change)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@473 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-14 02:39:24 +00:00
David Huynh
4a06c49a9a Added streaming json parser for faster re-loading of existing projects.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@470 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-13 23:57:03 +00:00
David Huynh
a1a8758c37 Added options for specifying # lines the header columns take, and the # lines to skip processing entirely initially.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@468 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-13 21:23:41 +00:00
Stefano Mazzocchi
dc4b63d2bf forgot a piece
git-svn-id: http://google-refine.googlecode.com/svn/trunk@465 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-13 07:27:20 +00:00
David Huynh
a2db5590ac Trim column names on import.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@461 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-13 06:28:13 +00:00
Stefano Mazzocchi
e232a90a73 progress but still no worky on the scatterfacet
git-svn-id: http://google-refine.googlecode.com/svn/trunk@457 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-13 03:44:42 +00:00
Stefano Mazzocchi
ba85f50e39 adding log-log support to the scatterplot matrix and more controls
(the scatterfacet still doesn't work but this is already more useful)

git-svn-id: http://google-refine.googlecode.com/svn/trunk@456 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-13 00:25:43 +00:00
David Huynh
8b95248c75 Fixed bug where reconciling by ID, GUID, or key would generate a buggy numeric range facet, since all the scores were artificially the same.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@454 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-12 22:19:44 +00:00
Stefano Mazzocchi
7ab1acd801 skeleton code for scatterfacet
git-svn-id: http://google-refine.googlecode.com/svn/trunk@453 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-12 19:22:49 +00:00
David Huynh
8fb23913ce Added "time" part option to datePart function.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@448 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-11 23:07:56 +00:00
David Huynh
ce8963d009 Added datePart function.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@447 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-11 23:01:34 +00:00
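
Note on commit e77b99e58b: the auto-match rule it describes for the relevance service (type matches, score at least 100, and a first-to-second score ratio of at least 1.5 when there is more than one result) can be sketched roughly as follows. This is an illustrative reading of the commit message only, not the project's code. The `Candidate` class, the `should_auto_match` function, and their parameter names are hypothetical, and the sketch assumes the type check applies to the top-scoring result.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Candidate:
    """Hypothetical reconciliation candidate: a score plus its type IDs."""
    score: float
    types: List[str] = field(default_factory=list)


def should_auto_match(candidates, expected_type, min_score=100.0, min_ratio=1.5):
    """Return True if the top candidate passes the rule from the commit message.

    Assumed interpretation: the best candidate must carry the expected type,
    score at least `min_score`, and, if a second candidate exists, beat it by
    a score ratio of at least `min_ratio`.
    """
    if not candidates:
        return False

    # Rank by score, highest first, so "first" and "second" are well defined.
    ranked = sorted(candidates, key=lambda c: c.score, reverse=True)
    best = ranked[0]

    if expected_type not in best.types:
        return False
    if best.score < min_score:
        return False

    if len(ranked) > 1:
        second = ranked[1]
        # Multiply instead of dividing to avoid a division by zero.
        if best.score < min_ratio * second.score:
            return False

    return True
```

With these assumptions, a single candidate of the right type scoring 150 would auto-match, while the same candidate followed by a runner-up scoring 120 would not, since 150 / 120 is below 1.5.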