Stefano Mazzocchi a7d4951725 several improvements for clustering
 - added Unicode-to-ASCII folding to the fingerprinting functions
 - removed all kNN distance functions that didn't seem to do anything useful
 - added the ability to choose the value used as the cluster centroid by simply clicking on it
   (this is useful for names with non-ASCII characters that might not even be on your keyboard, where cut/paste is error-prone and cumbersome)
 - added a 10x multiplier to the PPM compression distance, which brings it more in line with the Levenshtein ones
 - made sure that we construct a phonetic fingerprint for the whole string and not just the beginning subset
   (performance is still not ideal but it's now reasonable)


git-svn-id: http://google-refine.googlecode.com/svn/trunk@268 7d457c2a-affb-35e4-300a-418c747d4874
2010-03-10 07:45:14 +00:00
.settings      major rewrite of the foundation:                                      2010-02-07 23:15:50 +00:00
lib            Implemented project import and export commands (from/to .tar files).  2010-03-08 02:34:25 +00:00
lib-src        and more protbuf stuff removed                                        2010-03-07 23:56:34 +00:00
licenses       and more protbuf stuff removed                                        2010-03-07 23:56:34 +00:00
src            several improvements for clustering                                   2010-03-10 07:45:14 +00:00
tests          adding minimal unit testing framework (type ./gridworks test to run)  2010-03-09 08:08:35 +00:00
thirdparty     no more reason to use protocol buffers                                2010-03-07 23:54:18 +00:00
.classpath     adding minimal unit testing framework (type ./gridworks test to run)  2010-03-09 08:08:35 +00:00
.project       major rewrite of the foundation:                                      2010-02-07 23:15:50 +00:00
build.xml      adding minimal unit testing framework (type ./gridworks test to run)  2010-03-09 08:08:35 +00:00
gridworks      fixing typo                                                           2010-03-09 20:16:04 +00:00
gridworks.bat  Implemented project import and export commands (from/to .tar files).  2010-03-08 02:34:25 +00:00
LICENSE.txt    and more protbuf stuff removed                                        2010-03-07 23:56:34 +00:00
README.txt     major rewrite of the foundation:                                      2010-02-07 23:15:50 +00:00



                             G r i d w o r k s
                            -------------------


                                    
  What is this?
  -------------
  
Gridworks is a tabular data exploration and manipulation tool.
   
   
   [more soon]
  

                                  - o -
                                                
                                                
   Thank you for your interest.