Tom Morris
b3f5fada95
FIXED - task 578 & 596: Clean up JSON importer
...
http://code.google.com/p/google-refine/issues/detail?id=578
http://code.google.com/p/google-refine/issues/detail?id=596
Extend tree parser framework to allow any Serializable instead of just Strings. Use this in JSON importer to:
- import keywords null, true, false
- import empty strings and don't trim whitespace from strings on import
- import numbers directly instead of importing them as text and then parsing them ourselves
Add tests to verify all this stuff.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2543 7d457c2a-affb-35e4-300a-418c747d4874
2012-09-08 01:20:25 +00:00
Tom Morris
abc162a0d0
Switch back to old JSON lib for now
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2536 7d457c2a-affb-35e4-300a-418c747d4874
2012-08-21 17:33:17 +00:00
Tom Morris
60c3a31242
Update Jackson and JSON libs
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2532 7d457c2a-affb-35e4-300a-418c747d4874
2012-08-18 21:46:49 +00:00
Tom Morris
4bb6c43982
task 604: add Guava to main project so that we're not dependent on an extension
...
http://code.google.com/p/google-refine/issues/detail?id=604
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2531 7d457c2a-affb-35e4-300a-418c747d4874
2012-08-15 13:33:17 +00:00
Tom Morris
d6e00fb3c7
Add JRDF source jar
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2524 7d457c2a-affb-35e4-300a-418c747d4874
2012-08-05 15:01:55 +00:00
Stefano Mazzocchi
2947ebba0e
updating the signpost library and attaching sources for easier inspection
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2517 7d457c2a-affb-35e4-300a-418c747d4874
2012-08-01 21:47:21 +00:00
Stefano Mazzocchi
5dffd249de
updating signpost to the latest release
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2514 7d457c2a-affb-35e4-300a-418c747d4874
2012-07-13 06:50:53 +00:00
Tom Morris
1df0dd62ce
Issue 566 - export httpclient libs
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2494 7d457c2a-affb-35e4-300a-418c747d4874
2012-05-03 15:43:22 +00:00
Tom Morris
166b176ba2
Update to Apache POI 3.8
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2486 7d457c2a-affb-35e4-300a-418c747d4874
2012-03-29 04:41:42 +00:00
Tom Morris
8ff6c5617f
Update Jackson parser to 1.9.5
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2448 7d457c2a-affb-35e4-300a-418c747d4874
2012-03-01 18:11:28 +00:00
David Huynh
94e0369af7
Added extension for importing PC-Axis files.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2365 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-07 07:27:35 +00:00
Stefano Mazzocchi
8184e16bb9
updating http client and http core to the latest released versions
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2351 7d457c2a-affb-35e4-300a-418c747d4874
2011-11-01 21:46:56 +00:00
David Huynh
ff7bbc8ec0
Export libraries.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2338 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-20 09:20:53 +00:00
Tom Morris
ca17e1ef0a
New importer for Open Document Format (ODF) spreadsheet files (.ods)
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2323 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-11 20:27:40 +00:00
Tom Morris
496d6b0b6a
Update Eclipse classpath for Jackson 1.8.6
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2289 7d457c2a-affb-35e4-300a-418c747d4874
2011-10-07 17:03:33 +00:00
David Huynh
c42382f3ae
Started deeper integration of GData: we now have a "Google Data" importing source, which lets you sign in and authorize access to your docs. It then lists all the spreadsheets you have access to. It does not yet let you import those spreadsheets.
...
Minor fixes to the open project action area; fixes to render relative dates properly.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2190 7d457c2a-affb-35e4-300a-418c747d4874
2011-08-07 23:26:51 +00:00
Tom Morris
527d383bc5
Update to Apache commons codec 1.5
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2108 7d457c2a-affb-35e4-300a-418c747d4874
2011-06-11 22:53:17 +00:00
Tom Morris
297809847d
Add references to source jars
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2091 7d457c2a-affb-35e4-300a-418c747d4874
2011-06-07 23:50:10 +00:00
Tom Morris
f674a96973
Add source directory for tests and necessary libraries
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2087 7d457c2a-affb-35e4-300a-418c747d4874
2011-06-06 21:27:24 +00:00
Tom Morris
6289f80da5
Update Eclipse classpath for POI 3.7
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2064 7d457c2a-affb-35e4-300a-418c747d4874
2011-05-25 06:37:50 +00:00
Stefano Mazzocchi
610de0d33a
adding Metaphone3 algorithm
...
Many thanks to Lawrence Philips for donating the code to us under the BSD license.
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2029 7d457c2a-affb-35e4-300a-418c747d4874
2011-03-01 00:17:48 +00:00
Iain Sproat
f55f11cd0d
Adding classes to make it possible to parse HTML in GREL. Uses a small subset of methods from the JSoup library, licensed under the MIT license.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1948 7d457c2a-affb-35e4-300a-418c747d4874
2010-12-06 23:15:24 +00:00
Tom Morris
b963fc2fc7
Allow top level directory to be imported as Eclipse project grefine-all
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@1564 7d457c2a-affb-35e4-300a-418c747d4874
2010-10-15 12:47:40 +00:00
Stefano Mazzocchi
2c8595098c
Major refactor to separate the webapp part from the embedded servlet engine part
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@883 7d457c2a-affb-35e4-300a-418c747d4874
2010-05-28 23:19:08 +00:00
Iain Sproat
25d3a9dfc1
Added a basic RDF triple importer plus unit tests. Some more work required - it's not plugged into the client and it creates a very sparse data structure (each triple is a new row). It uses JRDF library (Apache 1.1 license).
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@813 7d457c2a-affb-35e4-300a-418c747d4874
2010-05-18 12:41:40 +00:00
Stefano Mazzocchi
2cf360b723
adding even runtime jars to the eclipse build path so that people running gw from IDEs don't get ClassNotFound messages at runtime
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@798 7d457c2a-affb-35e4-300a-418c747d4874
2010-05-17 16:35:20 +00:00
Stefano Mazzocchi
f28e23e503
Committing patches by Iain:
...
- use OpenCSV parser instead of our own
- use TestNG instead of JUnit, which is a lot more configurable in test selection (and allows us to do a much better job of leaving the tree green even while developing tests that are known to fail)
- integrated TestNG in './gridworks test'
- added Iain to the list of contributors in README.txt
- changed the Eclipse test launch file to use the TestNG launcher (unfortunately, this is not shipped by default in Eclipse, so you have to install it yourself from the http://beust.com/eclipse update file, I'll add this to the wiki shortly)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@782 7d457c2a-affb-35e4-300a-418c747d4874
2010-05-16 18:42:52 +00:00
Stefano Mazzocchi
ea459aed07
Applied a bunch of patches from Tom Morris (Issue 25, 26 and 27)
...
- make java6 dependency explicit in eclipse project files
- avoid using NotImplementedException, especially the sun.* one
- avoid using internal sun signal handling and rely on standard java.* APIs
(I tested this one and it seems to be working fine)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@756 7d457c2a-affb-35e4-300a-418c747d4874
2010-05-13 21:02:19 +00:00
Stefano Mazzocchi
1f2531f303
uniform newlines and setting the proper svn controls for native line endings
...
(so that diffs from windows don't end up all screwed up)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@751 7d457c2a-affb-35e4-300a-418c747d4874
2010-05-13 02:22:31 +00:00
Stefano Mazzocchi
86465c2d6f
forgot these pieces for the previous commit
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@723 7d457c2a-affb-35e4-300a-418c747d4874
2010-05-12 09:00:38 +00:00
Stefano Mazzocchi
8285083fb9
fixing classpath so that gw can be run directly from eclipse
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@712 7d457c2a-affb-35e4-300a-418c747d4874
2010-05-12 00:25:39 +00:00
Stefano Mazzocchi
6990604981
implemented the full gridworks -> freebase conduit via delegated oauth and freeq/tripleloader
...
(still doesn't work as argus returns a 500 but the entire conduit is in place)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@519 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-23 08:25:52 +00:00
Stefano Mazzocchi
439474caeb
Checkpoint for OAuth functionality in Gridworks
...
(doesn't work but since it's a substantial chunk of stuff, I want to get it in sooner rather than later)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@516 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-21 21:08:34 +00:00
Stefano Mazzocchi
7a716a4a1b
- upgraded commons-codec to the latest version (needed for base64 encoding of data: uris)
...
- added the ability to embed the scatterplot inside the returned json data with data: uris (although it doesn't seem to work well)
- connected the selection logic to the scatterfacets (although it doesn't seem to filter the rows... and I'm puzzled as to why)
- reduced cut/paste and code overlap between the scatterplot generator and the scatterplot facet
git-svn-id: http://google-refine.googlecode.com/svn/trunk@490 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-17 03:00:38 +00:00
David Huynh
9e73a4e68c
Started to work on a MARC importer. It doesn't work properly yet.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@486 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-16 19:52:01 +00:00
Stefano Mazzocchi
397861b612
- replace the 'cos' library with the apache 'commons-fileupload' for licensing reasons (the cos library had a weird arm-twisting license that forced you to buy an o'reilly book on servlets for each developer in your company... good thing I read it all)
...
- some tweaks on imgareaselect's look
git-svn-id: http://google-refine.googlecode.com/svn/trunk@483 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-15 18:42:41 +00:00
Stefano Mazzocchi
93a8f78192
- updated to latest jquery (1.4.2)
...
- removed commons-math which I don't use anymore
- added imgareaselect
- added a bunch of licenses for the javascript library dependencies
git-svn-id: http://google-refine.googlecode.com/svn/trunk@482 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-15 06:56:07 +00:00
David Huynh
4a06c49a9a
Added streaming json parser for faster re-loading of existing projects.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@470 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-13 23:57:03 +00:00
Stefano Mazzocchi
60d61b7808
add commons-math library (I'm going to need this for more advanced facets)
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@451 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-12 04:25:50 +00:00
Stefano Mazzocchi
6114530723
make sure the junit tests still work
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@405 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-07 01:17:14 +00:00
Stefano Mazzocchi
8f5c35799b
making room for windmill tests
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@403 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-07 00:35:59 +00:00
Stefano Mazzocchi
c24ec94835
had to shuffle around a bunch of classes to separate the main server classloader from the context classloader and allow reloading to happen for real
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@377 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-03 01:01:11 +00:00
Stefano Mazzocchi
72203cd3d5
- moved all code that contained MIT IP outside ( http://code.google.com/p/simile-vicino/ )
...
- moved bzip2 and tar code from apache ant into their own jar files
- now gridworks source contains only com.metaweb.* code everything else is a jar dependency
- started to work on archive importer
git-svn-id: http://google-refine.googlecode.com/svn/trunk@376 7d457c2a-affb-35e4-300a-418c747d4874
2010-04-02 23:40:12 +00:00
David Huynh
c3ebb5a9f4
Got Vishal's jython integration to work.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@277 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-11 19:56:43 +00:00
Stefano Mazzocchi
358586ac8f
adding minimal unit testing framework (type ./gridworks test to run)
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@253 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-09 08:08:35 +00:00
David Huynh
5d3a57eeeb
Implemented project import and export commands (from/to .tar files).
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@234 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-08 02:34:25 +00:00
Stefano Mazzocchi
404883da92
forgot to remove from the eclipse build path
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@231 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-07 23:57:03 +00:00
Stefano Mazzocchi
a8177131b4
adding the protocol buffer library that is needed by the generated code
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@198 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-04 17:25:51 +00:00
Stefano Mazzocchi
c07431fb88
- cataloged all the licenses for the libraries Gridworks depends on
...
- added the secondstring library that contains all sorts of useful string distance functions
- added a java arithmetic coding library (used to implement a string distance based on PPM arithmetic coding)
- added the vicino kNN string clustering library (from MIT's SIMILE)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@181 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-04 00:29:52 +00:00
Stefano Mazzocchi
2691ee50d7
adding OS-specific data paths
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@173 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-03 02:53:07 +00:00
Stefano Mazzocchi
0c6590fe2c
- added an encoding guesser
...
- fixed a bunch of encoding issues
- added a function to reinterpret cell content in another encoding
- added a 'phonetic' function to the expression language that supports metaphone and soundex
- updated the COS library to the latest released version
- added the IBM ICU4j library (that contains the encoding guesser)
- added examples with same content but different encodings
git-svn-id: http://google-refine.googlecode.com/svn/trunk@154 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-28 21:51:33 +00:00
Stefano Mazzocchi
f1923758e7
- add a bunch of new functions
...
- very lax date parser
- lots of new advanced string functions
- new version of commons-lang
git-svn-id: http://google-refine.googlecode.com/svn/trunk@152 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-27 08:56:04 +00:00
David Huynh
cd376c7532
Added support for Excel 2007 XML file format.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@73 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-08 23:44:33 +00:00
Stefano Mazzocchi
2b985bf45a
moving json support in its own jar (code was taken today directly from json.org and compiled and packaged by me)
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@70 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-08 20:46:27 +00:00
Stefano Mazzocchi
1343162a75
major rewrite of the foundation:
...
- de-mavenization (uses the same code that Acre uses to drive jetty directly)
- removed all dependencies on external javascript code (jquery and suggest) by making a local copy (this makes gridworks totally self-serving, meaning that you can use it even if you don't have any internet connectivity)
- fixed a NPE when the servlet is shutdown before any project is loaded
- found a way to spawn a browser directly from the java code (untested in windows)
- added two ant tasks to generate windows and macosx stand-alone binaries (unused just yet)
To run, just type "./gridworks run" at the command line
git-svn-id: http://google-refine.googlecode.com/svn/trunk@65 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-07 23:15:50 +00:00
Stefano Mazzocchi
2077d3f094
adding unix and windows startup scripts
...
use maven to build the eclipse scripts instead of committing them in svn which makes them less portable
(do './gridworks eclipse' at the beginning to regenerate your eclipse project files, then reload in eclipse)
git-svn-id: http://google-refine.googlecode.com/svn/trunk@59 7d457c2a-affb-35e4-300a-418c747d4874
2010-02-07 05:25:44 +00:00
David Huynh
22040a8348
Initial import.
...
git-svn-id: http://google-refine.googlecode.com/svn/trunk@2 7d457c2a-affb-35e4-300a-418c747d4874
2010-01-24 21:09:50 +00:00