Across-the-docs updates
Parent: 4f97fd55a5
Commit: f7404799e3
@ -56,7 +56,7 @@ For example, the following column of strings on the left will transform into the
|
||||
|today|>|today|
|
||||
|never|>|never|
|
||||
|
||||
This is based on OpenRefine’s ability to recognize dates with the [`toDate()` function](expressions#dates).
|
||||
This is based on OpenRefine’s ability to recognize dates with the [`toDate()` function](expressions#date-functions).
|
||||
|
||||
Clicking the “today” cell and editing its data type manually will convert “today” into a value such as “2020-08-14T00:00:00Z”. Attempting the same data-type change on “never” will give you an error message and refuse to proceed.
|
||||
|
||||
@ -123,13 +123,17 @@ The clustering pop-up window offers you a variety of clustering methods:
|
||||
* levenshtein
|
||||
* ppm
|
||||
|
||||
#### Key collision
|
||||
|
||||
**Key collisions** are very fast and can process millions of cells in seconds:
|
||||
|
||||
**Fingerprinting** is the least likely to produce false positives, so it’s a good place to start. It does the same kind of data-cleaning behind the scenes that you might think to do manually: fix whitespace into single spaces, put all uppercase letters into lowercase, discard punctuation, remove diacritics (e.g. accents) from characters, split all strings (words) and sort them alphabetically (so “Zhenyi, Wang” becomes “Wang Zhenyi”). This makes comparing those types of name values very easy.
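The cleaning steps described for fingerprinting can be sketched in a few lines of Python. This is an illustrative approximation under stated assumptions, not OpenRefine's actual implementation: the function name `fingerprint` is hypothetical, and the sketch only strips ASCII punctuation and combining accent marks.

```python
import string
import unicodedata


def fingerprint(value: str) -> str:
    """Build an approximate key-collision fingerprint for one cell value."""
    # Trim the ends and lowercase everything.
    text = value.strip().lower()
    # Decompose accented characters, then drop the combining marks (diacritics).
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Discard ASCII punctuation (an assumption; other punctuation is left alone).
    text = text.translate(str.maketrans("", "", string.punctuation))
    # Split on any run of whitespace, sort the words, and rejoin with single spaces.
    return " ".join(sorted(text.split()))


print(fingerprint("Zhenyi, Wang"))     # wang zhenyi
print(fingerprint("  Wang   ZHENYI"))  # wang zhenyi
```

Because both values reduce to the same key, they would land in the same cluster. Note that the key is lowercased, so it is a matching key rather than a display value.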
|
||||
|
||||
**N-gram fingerprinting** allows you to set the _n_ value to whatever number you’d like, and will create n-grams of _n_ size (after doing some cleaning), alphabetize them, then join them back together into a _fingerprint_. For example, a 1-gram fingerprint will simply organize all the letters in the cell into alphabetical order - by creating segments one character in length. A 2-gram fingerprint will find all the two-character segments, remove duplicates, alphabetize them, and join them back together (for example, “banana” generates “ba an na an na,” which becomes “anbana”). This can help match cells that have typos, or incorrect spaces (such as matching “lookout” and “look out,” which fingerprinting itself won’t identify). The higher the _n_ value, the fewer clusters will be identified. With 1-grams, keep an eye out for mismatched values that are near-anagrams of each other (such as “Wellington” and “Elgin Town”).
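The n-gram procedure can be sketched the same way. The helper name `ngram_fingerprint` is hypothetical, and the cleaning step here (lowercasing and keeping only letters and digits) is an assumption standing in for whatever cleaning OpenRefine performs.

```python
def ngram_fingerprint(value: str, n: int = 2) -> str:
    """Build an approximate n-gram fingerprint for one cell value."""
    # Assumed cleaning step: lowercase, then drop whitespace and punctuation.
    cleaned = "".join(ch for ch in value.lower() if ch.isalnum())
    # Collect every n-character segment; the set removes duplicates.
    grams = {cleaned[i:i + n] for i in range(len(cleaned) - n + 1)}
    # Alphabetize the segments and join them into one key.
    return "".join(sorted(grams))


print(ngram_fingerprint("banana", 2))  # anbana
print(ngram_fingerprint("lookout", 2) == ngram_fingerprint("look out", 2))  # True
```

With `n=1`, "Wellington" and "Elgin Town" produce the same key, which illustrates why near-anagrams need a second look.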
|
||||
|
||||
The next four methods are phonetic algorithsm: they know whether two letters sound the same when pronounced out loud, and assess text values based on that (such as knowing that a word with an “S” might be a mistype of a word with a “Z”). They are great for spotting mistakes made by not knowing the spelling of a word or name after only hearing it spoken aloud.
|
||||
##### Phonetic clustering
|
||||
|
||||
The next four methods are phonetic algorithms: they know whether two letters sound the same when pronounced out loud, and assess text values based on that (such as knowing that a word with an “S” might be a mistype of a word with a “Z”). They are great for spotting mistakes made by not knowing the spelling of a word or name after only hearing it spoken aloud.
|
||||
|
||||
**Metaphone3 fingerprinting** is an English-language phonetic algorithm. For example, “Reuben Gevorkiantz” and “Ruben Gevorkyants” share the same phonetic fingerprint in English.
|
||||
|
||||
@ -139,6 +143,8 @@ The next four methods are phonetic algorithsm: they know whether two letters sou
|
||||
|
||||
Regardless of the language of your data, applying each of them might find different potential matches: for example, Metaphone clusters “Cornwall” and “Corn Hill” and “Green Hill,” while Cologne clusters “Greenvale” and “Granville” and “Cornwall” and “Green Wall.”
|
||||
|
||||
#### Nearest neighbor
|
||||
|
||||
**Nearest neighbor** clustering methods are slower than key collision methods. They allow the user to set a radius - a threshold for matching or not matching. OpenRefine uses a “blocking” method first, which sorts values based on whether they have a certain amount of similarity (the default is “6” for a six-character string of identical characters) and then runs the nearest-neighbor operations on those sorted groups. We recommend setting the block number to at least 3, and then increasing it if you need to be more strict (for example, if every value with “river” is being matched, you should increase it to 6 or more). Note bigger block values will take much longer to process, while smaller blocks may miss matches. Increasing the radius will make the matches more lax, as bigger differences will be clustered:
|
||||
|
||||
**Levenshtein distance** counts the number of edits required to make one value perfectly match another. As in the key collision methods above, it will do things like change uppercase to lowercase, fix whitespace, change special characters, etc. Each character that gets changed counts as 1 “distance.” “New York” and “newyork” have an edit distance value of 3 (“N” to “n”, “Y” to “y,” remove the space). It can do relatively advanced edits, such as understand the distance between “M. Makeba” and “Miriam Makeba” (5), but it may create false positives if these distances are greater than other, simpler transformations (such as the one-character distance to “B. Makeba,” another person entirely).
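The edit distances quoted above can be checked with a standard dynamic-programming version of Levenshtein distance. This is a minimal sketch of the textbook algorithm, not OpenRefine's own code, and the function name `levenshtein` is an assumption.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the single-character edits needed to turn a into b."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # Turning a[:i] into an empty string takes i deletions.
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


print(levenshtein("New York", "newyork"))          # 3
print(levenshtein("M. Makeba", "Miriam Makeba"))   # 5
print(levenshtein("M. Makeba", "B. Makeba"))       # 1
```

The last two results show the false-positive risk: with a radius of 5, "B. Makeba" is closer to "M. Makeba" than the intended "Miriam Makeba" is.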
|
||||
|
@ -42,6 +42,20 @@ Converting a cell's data type is not the same operation as transforming its cont
|
||||
|
||||
To transform data from one type to another, see [Transforming data](transforming#transform) for information on using common transforms, and see [Expressions](expressions) for information on using `toString()`, `toDate()`, and other functions.
|
||||
|
||||
### Dates
|
||||
|
||||
Date-formatted data in OpenRefine relies on a number of conversion tools and standards. When you convert a cell into a "date" data type, what you'll be doing is trying to transform the original contents in an ISO-8601-compliant extended format with time in UTC: YYYY-MM-DDTHH:MM:SSZ.
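To make the target format concrete, the following Python sketch renders a timezone-aware timestamp in that ISO 8601 extended form. The helper name `to_iso_utc` is hypothetical, and this is only an illustration of the format, not a description of how OpenRefine converts dates internally.

```python
from datetime import datetime, timedelta, timezone


def to_iso_utc(moment: datetime) -> str:
    """Format a timezone-aware datetime as YYYY-MM-DDTHH:MM:SSZ in UTC."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    # Convert to UTC first, then print with the literal "Z" suffix.
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


print(to_iso_utc(datetime(2020, 8, 14, tzinfo=timezone.utc)))
# 2020-08-14T00:00:00Z

pacific_daylight = timezone(timedelta(hours=-7))
print(to_iso_utc(datetime(2009, 6, 30, 7, 3, 47, tzinfo=pacific_daylight)))
# 2009-06-30T14:03:47Z
```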
|
||||
|
||||
You can convert dates when you [export your data using the custom tabular exporter](exporting#custom-tabular-exporter). You are given the option to keep your dates in ISO 8601 format, or to output short, medium, long, or full locale formats. This means that you can format your dates into, for example, MM/DD/YY (the US short standard) with or without including the time, after manipulating your data formatted into ISO 8601.
|
||||
|
||||
The following table shows the [date and time formatting styles for the U.S. and French locales](https://docs.oracle.com/javase/tutorial/i18n/format/dateFormat.html):
|
||||
|Style |U.S. Locale |French Locale|
|---|---|---|
|DEFAULT |Jun 30, 2009 7:03:47 AM |30 juin 2009 07:03:47|
|SHORT |6/30/09 7:03 AM |30/06/09 07:03|
|MEDIUM |Jun 30, 2009 7:03:47 AM |30 juin 2009 07:03:47|
|LONG |June 30, 2009 7:03:47 AM PDT |30 juin 2009 07:03:47 PDT|
|FULL |Tuesday, June 30, 2009 7:03:47 AM PDT |mardi 30 juin 2009 07 h 03 PDT|
|
||||
|
||||
## Rows vs. records
|
||||
|
||||
A row is a simple way to organize data: a series of cells, one cell per column. Sometimes there are multiple pieces of information in one cell, such as when a survey respondent can select more than one response. In cases where there is more than one value for a single column in one or more rows, you may wish to use OpenRefine’s records mode: this defines a single record (a survey response, for example) as potentially containing more than one row. From there you can transform cells into multiple rows, each cell containing one value you’d like to work with.
|
||||
|
@ -1,5 +0,0 @@
|
||||
---
|
||||
id: glossary
|
||||
title: OpenRefine Glossary
|
||||
sidebar_label: Glossary
|
||||
---
|
@ -31,28 +31,26 @@ We are aware of some minor rendering and performance issues on other browsers su
|
||||
|
||||
### Release versions
|
||||
|
||||
OpenRefine always has a latest stable release as well as some more recent work available in beta, release candidate, or nightly release versions.
|
||||
|
||||
If you are installing for the first time, we recommend [the latest stable release](https://github.com/OpenRefine/OpenRefine/releases/latest).
|
||||
OpenRefine always has a [latest stable release](https://github.com/OpenRefine/OpenRefine/releases/latest), as well as some more recent developments available in beta, release candidate, or [snapshot releases](https://github.com/OpenRefine/OpenRefine-snapshot-releases/releases). If you are installing for the first time, we recommend [the latest stable release](https://github.com/OpenRefine/OpenRefine/releases/latest).
|
||||
|
||||
If you wish to use an extension that is only compatible with an earlier version of OpenRefine, and do not require the latest features, you may find that [an older stable version is best for you](https://github.com/OpenRefine/OpenRefine/releases) in our list of releases. Look at later releases to see which security vulnerabilities are being fixed, in order to assess your own risk tolerance for using earlier versions. Look for “final release” versions instead of “beta” or “release candidate” versions.
|
||||
|
||||
#### Unstable versions
|
||||
|
||||
If you need a recently developed function, and are willing to risk some untested code, you can look at [the most recent items in the reverse-chronological list](https://github.com/OpenRefine/OpenRefine/releases) and see what changes appeal to you.
|
||||
If you need a recently developed function, and are willing to risk some untested code, you can look at [the most recent items in the list](https://github.com/OpenRefine/OpenRefine/releases) and see what changes appeal to you.
|
||||
|
||||
“Beta” and “release candidate” versions may both have unreported bugs and are most suitable for people who are willing to help us troubleshoot these versions by [creating bug reports](https://github.com/OpenRefine/OpenRefine/issues).
|
||||
|
||||
For the absolute latest development updates, see the [snapshot releases](https://github.com/OpenRefine/OpenRefine-nightly-releases/releases). These are created with every commit.
|
||||
For the absolute latest development updates, see the [snapshot releases](https://github.com/OpenRefine/OpenRefine-snapshot-releases/releases). These are created with every commit.
|
||||
|
||||
#### What’s changed
|
||||
|
||||
Our [latest release is at the time of writing is OpenRefine 3.4](**link goes here!**), released **XXXX XX 2020**. The major changes in this version are listed on the [3.4 final release page](**link goes here!**) with the downloadable packages.
|
||||
Our [latest version is OpenRefine 3.4.1](https://github.com/OpenRefine/OpenRefine/releases/tag/3.4.1), released September 24th 2020. The major changes in this version are listed on the [3.4 release page](https://github.com/OpenRefine/OpenRefine/releases/tag/3.4.1) with the downloadable packages.
|
||||
|
||||
You can find information about all of our releases on the [Releases page on Github](https://github.com/OpenRefine/OpenRefine/releases).
|
||||
You can find information about all OpenRefine versions on the [Releases page on GitHub](https://github.com/OpenRefine/OpenRefine/releases).
|
||||
|
||||
:::info Other distributions
|
||||
OpenRefine may also work in other environments, such as [Chromebooks](https://gist.github.com/organisciak/3e12e5138e44a2fed75240f4a4985b4f) where Linux terminals are available. Look at our list of [Other Distributions](https://openrefine.org/download.html) on the Downloads page for other ways of running OpenRefine, and refer to our contributor community to see new environments in development.
|
||||
OpenRefine may also work in other environments, such as [Chromebooks](https://gist.github.com/organisciak/3e12e5138e44a2fed75240f4a4985b4f) where Linux terminals are available. Look at our list of [Other Distributions on the Downloads page](https://openrefine.org/download.html) for other ways of running OpenRefine, and refer to our contributor community to see new environments in development.
|
||||
:::
|
||||
|
||||
## Installing or upgrading
|
||||
@ -60,9 +58,9 @@ OpenRefine may also work in other environments, such as [Chromebooks](https://gi
|
||||
|
||||
If you are upgrading from an older version of OpenRefine and have projects already on your computer, you should create backups of those projects before you install a new version.
|
||||
|
||||
First, [locate your workspace directory](installing.md#where-is-data-stored). Then copy everything you find there and paste it into a folder elsewhere on your computer.
|
||||
First, [locate your workspace directory](#where-is-data-stored). Then copy everything you find there and paste it into a folder elsewhere on your computer.
|
||||
|
||||
For extra security you can [export your existing OpenRefine projects](exporting.md#export-a-project).
|
||||
For extra security you can [export your existing OpenRefine projects](exporting#export-a-project).
|
||||
|
||||
:::caution
|
||||
Take note of the [extensions](#installing-extensions) you have currently installed. They may not be compatible with the upgraded version of OpenRefine. Extensions can be installed in two places, so be sure to check both your workspace directory and the existing installation directory.
|
||||
@ -93,16 +91,16 @@ import TabItem from '@theme/TabItem';
|
||||
|
||||
<TabItem value="win">
|
||||
|
||||
1. On Windows 10, click the Windows start menu button, type `env`, and look at the search results. **Edit the system environment** variables. (If you are using an earlier version of Windows, use the **Search** or **Search programs and files** box in the start menu.)
|
||||
1. On Windows 10, click the Windows start menu button, type “env,” and look at the search results. Click “Edit the system environment variables.” (If you are using an earlier version of Windows, use the “Search” or “Search programs and files” box in the start menu.)
|
||||
|
||||
![A screenshot of the search results for 'env'.](/img/env.png "A screenshot of the search results for 'env'.")
|
||||
|
||||
2. Click **Environment Variables…** at the bottom of the **Advanced** window that appears.
|
||||
3. In the **Environment Variables** dialog that appears, click **New…** and create a variable with the key `JAVA_HOME`. You can set the variable for only your user account, as in the screenshot below, or set it as a system variable - it will work either way.
|
||||
2. Click “Environment Variables…” at the bottom of the “Advanced” window that appears.
|
||||
3. In the “Environment Variables” dialog that appears, click “New…” and create a variable with the key `JAVA_HOME`. You can set the variable for only your user account, as in the screenshot below, or set it as a system variable - it will work either way.
|
||||
|
||||
![A screenshot of 'Environment Variables'.](/img/javahome.png "A screenshot of 'Environment Variables'.")
|
||||
|
||||
4. Set the **Value** to the folder where you installed JDK, in the format `D:\Programs\OpenJDK`. You can locate this folder with the **Browse directory...** button.
|
||||
4. Set the `Value` to the folder where you installed JDK, in the format `D:\Programs\OpenJDK`. You can locate this folder with the “Browse directory...” button.
|
||||
|
||||
</TabItem>
|
||||
|
||||
@ -110,19 +108,27 @@ import TabItem from '@theme/TabItem';
|
||||
|
||||
First, find where Java is on your computer with this command:
|
||||
|
||||
```which java```
|
||||
```
which java
```
|
||||
|
||||
Check the environment variable `JAVA_HOME` with:
|
||||
|
||||
```$JAVA_HOME/bin/java --version```
|
||||
```
$JAVA_HOME/bin/java --version
```
|
||||
|
||||
To set the environment variable for the current Java version of your MacOS:
|
||||
|
||||
```export JAVA_HOME="$(/usr/libexec/java_home)"```
|
||||
```
export JAVA_HOME="$(/usr/libexec/java_home)"
```
|
||||
|
||||
Or, for Java 13.x:
|
||||
|
||||
```export JAVA_HOME="$(/usr/libexec/java_home -v 13)"```
|
||||
```
export JAVA_HOME="$(/usr/libexec/java_home -v 13)"
```
|
||||
|
||||
</TabItem>
|
||||
|
||||
@ -132,20 +138,27 @@ Or, for Java 13.x:
|
||||
|
||||
Enter the following:
|
||||
|
||||
```sudo apt install default-jre```
|
||||
```
sudo apt install default-jre
```
|
||||
|
||||
This probably won’t install the latest JDK package available on the Java website, but it is faster and more straightforward. (At the time of writing, it installs OpenJDK 11.0.7.)
|
||||
|
||||
##### Manually
|
||||
|
||||
First, [extract the JDK package](https://openjdk.java.net/install/) to the new directory `/usr/lib/jvm`:
|
||||
|
||||
```
sudo mkdir -p /usr/lib/jvm
sudo tar -x -C /usr/lib/jvm -f /tmp/openjdk-14.0.1_linux-x64_bin.tar.gz
```
|
||||
Then, navigate to this folder and confirm the final path (in this case, `usr/lib/jvm/jdk-14.0.1`.
|
||||
Open a terminal and type
|
||||
```sudo gedit /etc/profile```
|
||||
|
||||
Then, navigate to this folder and confirm the final path (in this case, `/usr/lib/jvm/jdk-14.0.1`). Open a terminal and type
|
||||
|
||||
```
sudo gedit /etc/profile
```
|
||||
|
||||
In the text window that opens, insert the following lines at the end of the `profile` file, using the path above:
|
||||
|
||||
```
|
||||
@ -154,10 +167,18 @@ PATH=$PATH:$HOME/bin:$JAVA_HOME/bin
|
||||
export JAVA_HOME
|
||||
export PATH
|
||||
```
|
||||
|
||||
Save and close the file. When you are back in the terminal, type
|
||||
```source /etc/environment```
|
||||
|
||||
```
source /etc/environment
```
|
||||
|
||||
Exit the terminal and restart your system. You can then check that JAVA_HOME is set properly by opening another terminal and typing
|
||||
```echo $JAVA_HOME```
|
||||
```
echo $JAVA_HOME
```
|
||||
|
||||
It should show the path you set above.
|
||||
|
||||
</TabItem>
|
||||
@ -168,7 +189,7 @@ It should show the path you set above.
|
||||
|
||||
### Install or upgrade OpenRefine
|
||||
|
||||
If you are upgrading an existing OpenRefine installation, you can delete the old program files and install the new files into the same space. Do not overwrite the files as some obsolete files may be left over unnecessarily.
|
||||
If you are upgrading an existing OpenRefine installation, you can delete the old program files and install the new files into the same space. Do not overwrite the files as some obsolete files may be left over unnecessarily.
|
||||
|
||||
:::caution
|
||||
If you have extensions installed, do not delete the `webapp\extensions` folder where you installed them. You may wish to install extensions into the workspace directory instead of the program directory. There is no guarantee that extensions will be forward-compatible with new versions of OpenRefine, and we do not maintain extensions.
|
||||
@ -201,19 +222,21 @@ Once you have downloaded the `.dmg` file, open it and drag the OpenRefine icon o
|
||||
|
||||
The quick version:
|
||||
|
||||
1. Install[ Homebrew from here](http://brew.sh)
|
||||
1. Install [Homebrew](http://brew.sh)
|
||||
2. In Terminal, enter `brew cask install openrefine`
|
||||
3. Then find OpenRefine in your Applications folder.
|
||||
|
||||
The long version:
|
||||
|
||||
[Homebrew](http://brew.sh) is a popular command-line package manager for Mac. Installing Homebrew is accomplished by pasting the installation command on the Homebrew website into a Terminal window. Once Homebrew is installed, applications like OpenRefine can be installed via a simple command. You can [install Homebrew from their website]([http://brew.sh](http://brew.sh)).
|
||||
[Homebrew](http://brew.sh) is a popular command-line package manager for Mac. Installing Homebrew is accomplished by pasting the installation command on the Homebrew website into a Terminal window. Once Homebrew is installed, applications like OpenRefine can be installed via a simple command. You can [install Homebrew from their website](http://brew.sh).
|
||||
|
||||
###### Install
|
||||
|
||||
Install OpenRefine with this command:
|
||||
|
||||
``` brew cask install openrefine```
|
||||
```
brew cask install openrefine
```
|
||||
|
||||
You should see output like this:
|
||||
|
||||
@ -228,23 +251,29 @@ You should see output like this:
|
||||
|
||||
Behind the scenes, this command causes Homebrew to download the OpenRefine installer, verify the file’s authenticity (using a SHA-256 checksum), mount the disk image, copy the `OpenRefine.app` application bundle into the Applications folder, unmount the disk image, and save a copy of the installer and metadata about the installation for future use.
|
||||
|
||||
_If an existing `OpenRefine.app` is found in the Applications folder, Homebrew will not overwrite it, so installing via Homebrew requires either deleting or renaming previously installed copies._
|
||||
If an existing `OpenRefine.app` is found in the Applications folder, Homebrew will not overwrite it, so installing via Homebrew requires either deleting or renaming previously installed copies.
|
||||
|
||||
###### Uninstall
|
||||
|
||||
To uninstall OpenRefine, paste this command into the Terminal:
|
||||
|
||||
``` brew cask uninstall openrefine```
|
||||
```
brew cask uninstall openrefine
```
|
||||
|
||||
You should see output like this:
|
||||
|
||||
``` ==> Removing App '/Applications/OpenRefine.app'.```
|
||||
```
==> Removing App '/Applications/OpenRefine.app'.
```
|
||||
|
||||
###### Update
|
||||
|
||||
To update to the latest version of OpenRefine, paste this command into the Terminal:
|
||||
|
||||
``` brew cask reinstall openrefine```
|
||||
```
brew cask reinstall openrefine
```
|
||||
|
||||
You should see output like this:
|
||||
|
||||
@ -269,7 +298,9 @@ If you had previously installed the `openrefine-dev` cask (containing a release
|
||||
|
||||
Once you have downloaded the `.tar.gz` file, open a shell, navigate to the folder containing the download, and type:
|
||||
|
||||
```tar xzf openrefine-linux-3.4.tar.gz```
|
||||
```
tar xzf openrefine-linux-3.4.tar.gz
```
|
||||
|
||||
</TabItem>
|
||||
|
||||
@ -280,7 +311,7 @@ Once you have downloaded the `.tar.gz` file, open a shell, navigate to the folde
|
||||
|
||||
### Set where data is stored
|
||||
|
||||
OpenRefine stores data in two places: program files in the program directory, wherever it is you’ve installed it; and project files in what we call the “workspace directory.” You can access this folder easily from OpenRefine by going to the [home screen](running.md#the-home-screen) (at [http://127.0.0.1:3333/](http://127.0.0.1:3333/)) and clicking "Browse workspace directory."
|
||||
OpenRefine stores data in two places: program files in the program directory, wherever it is you’ve installed it; and project files in what we call the “workspace directory.” You can access this folder easily from OpenRefine by going to the [home screen](running#the-home-screen) (at [http://127.0.0.1:3333/](http://127.0.0.1:3333/)) and clicking “Browse workspace directory.”
|
||||
|
||||
By default this is:
|
||||
|
||||
@ -308,11 +339,15 @@ For older Google Refine releases, replace `OpenRefine` with `Google\Refine`.
|
||||
|
||||
You can change this by adding this line to the file `openrefine.l4j.ini` and specifying your desired drive and folder path:
|
||||
|
||||
```-Drefine.data_dir=D:\MyDesiredFolder```
|
||||
```
-Drefine.data_dir=D:\MyDesiredFolder
```
|
||||
|
||||
If your folder path has spaces, use neutral quotation marks around it:
|
||||
|
||||
```-Drefine.data_dir="D:\My Desired Folder"```
|
||||
```
-Drefine.data_dir="D:\My Desired Folder"
```
|
||||
|
||||
If the folder does not exist, OpenRefine will create it.
|
||||
|
||||
@ -320,11 +355,15 @@ If the folder does not exist, OpenRefine will create it.
|
||||
|
||||
<TabItem value="mac">
|
||||
|
||||
```~/Library/Application Support/OpenRefine/```
|
||||
```
~/Library/Application Support/OpenRefine/
```
|
||||
|
||||
For older versions, released as Google Refine:
|
||||
|
||||
```~/Library/Application Support/Google/Refine/ ```
|
||||
```
~/Library/Application Support/Google/Refine/
```
|
||||
|
||||
Logging is to `/var/log/daemon.log` - grep for `com.google.refine.Refine`.
|
||||
|
||||
@ -332,11 +371,15 @@ Logging is to `/var/log/daemon.log` - grep for `com.google.refine.Refine`.
|
||||
|
||||
<TabItem value="linux">
|
||||
|
||||
```~/.local/share/openrefine/```
|
||||
```
~/.local/share/openrefine/
```
|
||||
|
||||
You can change this when you run OpenRefine from the terminal, by pointing to the workspace directory through the `-d` parameter:
|
||||
|
||||
``` ./refine -p 3333 -i 0.0.0.0 -m 6000M -d /My/Desired/Folder```
|
||||
```
./refine -p 3333 -i 0.0.0.0 -m 6000M -d /My/Desired/Folder
```
|
||||
|
||||
</TabItem>
|
||||
|
||||
@ -362,10 +405,10 @@ OpenRefine does not currently output an error log, but because the OpenRefine co
|
||||
You can access OpenRefine server logs from the terminal on Mac:
|
||||
|
||||
* Find the OpenRefine app/icon in Finder
|
||||
* Ctrl+Click on the icon and select **Show Package Contents** from the context menu that displays
|
||||
* This should open a new Finder menu showing a folder called **Contents** - navigate into this folder then into the **MacOS** folder
|
||||
* Ctrl+Click on **JavaAppLauncher**
|
||||
* Choose **Open With** from menu, and select **Terminal**
|
||||
* control-click on the icon and select “Show Package Contents” from the context menu that displays
|
||||
* This should open a new Finder menu showing a folder called “Contents” - navigate into this folder then into the “MacOS” folder
|
||||
* control-click on “JavaAppLauncher”
|
||||
* Choose “Open With” from the menu, and select “Terminal”
|
||||
|
||||
---
|
||||
|
||||
@ -373,25 +416,21 @@ You can access OpenRefine server logs from the terminal on Mac:
|
||||
|
||||
</Tabs>
|
||||
|
||||
|
||||
|
||||
|
||||
## Increasing memory allocation
|
||||
|
||||
OpenRefine relies on having computer memory available to it to work effectively. If you are planning to work with large data sets, you may wish to set up OpenRefine to handle it at the outset. By “large” we generally mean one of the following indicators:
|
||||
* more than **one million** rows
|
||||
* more than **one million **total cells
|
||||
* more than one million total cells
|
||||
* an input file size of more than 50 megabytes (MB)
|
||||
* more than **50** [rows per record in records mode](**running.md#records-mode**)
|
||||
* more than 50 [rows per record in records mode](running#records-mode)
|
||||
|
||||
By default OpenRefine is set to operate with 1 gigabyte (GB) of memory (1024MB). If OpenRefine is running slowly, or you are getting "out of memory" errors (for example, `java.lang.OutOfMemoryError`), or generally feel that OpenRefine is slow, you can try allocating more memory.
|
||||
By default OpenRefine is set to operate with 1 gigabyte (GB) of memory (1024MB). If you feel that OpenRefine is running slowly, or you are getting “out of memory” errors (for example, `java.lang.OutOfMemoryError`), you can try allocating more memory.
|
||||
|
||||
A good practice is to start with no more than 50% of whatever memory is left over after the estimated usage of your operating system, to leave memory for your browser to run.
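As a rough worked example of that rule of thumb, the sketch below computes a starting allocation in megabytes. The function name and the sample figures (a 16 GB machine whose operating system uses about 4 GB) are assumptions for illustration only; substitute your own estimates.

```python
def suggested_memory_mb(total_mb: int, os_usage_mb: int) -> int:
    """Return half of the memory left over after the operating system's share."""
    leftover = max(total_mb - os_usage_mb, 0)
    return leftover // 2


# Hypothetical machine: 16384 MB installed, roughly 4096 MB used by the OS.
memory_mb = suggested_memory_mb(16384, 4096)
print(memory_mb)              # 6144
print(f"-Xmx{memory_mb}M")    # -Xmx6144M
```

Treat the result as an upper bound to start from, and raise it only if OpenRefine still reports memory errors.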
|
||||
|
||||
All of the settings below use a four-digit number to specify the megabytes (MB) used. The default is usually 1024MB, but the new value doesn't need to be a multiple of 1024.
|
||||
All of the settings below use a four-digit number to specify the megabytes (MB) used (actually [mebibytes](https://en.wikipedia.org/wiki/Mebibyte)). The default is usually 1024MB, but the new value doesn't need to be a multiple of 1024.
|
||||
|
||||
:::info Dealing with large datasets
|
||||
If your project is big enough to need more than the default amount of memory, consider turning off "Parse cell text into numbers, dates, ..." on import. It's convenient, but less efficient than explicitly converting any columns that you need as a data type other than the default "string" type.
|
||||
If your project is big enough to need more than the default amount of memory, consider turning off “Parse cell text into numbers, dates, ...” on import. It's convenient, but less efficient than explicitly converting any columns that you need as a data type other than the default “string” type.
|
||||
:::
|
||||
|
||||
<Tabs
|
||||
@ -415,7 +454,7 @@ If you run `openrefine.exe`, you will need to edit the `openrefine.l4j.ini` file
|
||||
-Xmx1024M
|
||||
```
|
||||
|
||||
The line `-Xmx1024M` defines the amount of memory available in megabytes (actually [mebibytes](https://en.wikipedia.org/wiki/Mebibyte)). Change the number “1024” - for example, edit the line to `-Xmx2048M` to make 2048MB [2GB] of memory available.
|
||||
The line “-Xmx1024M” defines the amount of memory available in megabytes. Change the number “1024” - for example, edit the line to “-Xmx2048M” to make 2048MB [2GB] of memory available.
|
||||
|
||||
:::caution openrefine.exe not running?
|
||||
Once you increase the memory allocation, you may find that you cannot run `openrefine.exe`. In this case, your computer needs a 64-bit version of [Java](https://www.java.com/en/download/help/index_installing.xml) (this is different from [Java JDK](#install-or-upgrade-java)). Look for the “Windows Offline (64-bit)” download on the Downloads page and install that. Your system must also be set to use the 64-bit version of Java by [changing the Java configuration](https://www.java.com/en/download/help/update_runtime_settings.xml).
|
||||
@ -425,46 +464,47 @@ Once you increase the memory allocation, you may find that you cannot run `openr
|
||||
|
||||
On Windows, OpenRefine can also be run by using the file `refine.bat` in the program directory. If you start OpenRefine using `refine.bat`, the memory available to OpenRefine can be specified either through command line options, or through the `refine.ini` file.
|
||||
|
||||
To set the maximum amount of memory on the command line when using `refine.bat`, 'cd' to the program directory, then type
|
||||
To set the maximum amount of memory on the command line when using `refine.bat`, "cd" to the program directory, then type
|
||||
|
||||
```refine.bat /m 2048m```
|
||||
|
||||
where "2048" is the maximum amount of MB that you want OpenRefine to use.
|
||||
where “2048” is the maximum amount of MB that you want OpenRefine to use.
|
||||
|
||||
To change the default that `refine.bat` uses, edit the `refine.ini` line that reads
|
||||
|
||||
```REFINE_MEMORY=1024M```
|
||||
|
||||
Note that this file is only read if you use `refine.bat`, not **openrefine.exe**.
|
||||
Note that this file is only read if you use `refine.bat`, not `openrefine.exe`.
|
||||
|
||||
</TabItem>
|
||||
<TabItem value="mac">
|
||||
|
||||
If you have downloaded the **.dmg** package and you start OpenRefine by double-clicking on it:
|
||||
If you have downloaded the `.dmg` package and you start OpenRefine by double-clicking on it:
|
||||
|
||||
* close OpenRefine
|
||||
* **control-click** on the OpenRefine icon (opens the contextual menu)
|
||||
* click on **show package content** (a finder window opens)
|
||||
* open the **Contents** folder
|
||||
* open the **Info.plist** file with any text editor (like Mac's default TextEdit)
|
||||
* Change `-Xmx1024M` into, for example, `-Xmx2048M` or `-Xmx8G`
|
||||
* control-click on the OpenRefine icon (opens the contextual menu)
|
||||
* click on “Show Package Contents” (a Finder window opens)
|
||||
* open the “Contents” folder
|
||||
* open the `Info.plist` file with any text editor (like Mac's default TextEdit)
|
||||
* Change “-Xmx1024M” into, for example, “-Xmx2048M” or “-Xmx8G”
|
||||
* save the file
|
||||
* restart OpenRefine.
|
||||
|
||||
</TabItem>
|
||||
<TabItem value="linux">
|
||||
|
||||
If you have downloaded the `.tar.gz` package and you start OpenRefine from the command line, add the `-m xxxxM` parameter like this:
|
||||
|
||||
If you have downloaded the `.tar.gz` package and you start OpenRefine from the command line, add the “-m xxxxM” parameter like this:
|
||||
`./refine -m 2048m`
|
||||
|
||||
#### Setting a default
|
||||
|
||||
If you don't want to set this option on the command line each time, you can also set it in the `refine.ini` file. Edit the line
|
||||
|
||||
```REFINE_MEMORY=1024M```
|
||||
```
|
||||
REFINE_MEMORY=1024M
|
||||
```
|
||||
|
||||
Make sure it is not commented out (that is, that the line doesn't start with a '#' character), and change `1024` to a higher value. Save the file, and when you next start OpenRefine it will use this value.
|
||||
Make sure it is not commented out (that is, that the line doesn't start with a “#” character), and change “1024” to a higher value. Save the file, and when you next start OpenRefine it will use this value.
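
As a rough illustration of the edit described above, the following Python sketch rewrites the `REFINE_MEMORY` line in the text of a `refine.ini` file and un-comments it if necessary. The helper name `set_refine_memory` and the sample file contents are made up for this example; it is not part of OpenRefine.

```python
import re


def set_refine_memory(ini_text: str, megabytes: int) -> str:
    """Return ini_text with REFINE_MEMORY set to '<megabytes>M'.

    A commented-out line ("#REFINE_MEMORY=...") is replaced by an active one.
    If no such line exists, a new one is appended.
    """
    # Match only the REFINE_MEMORY line, optionally commented out.
    pattern = re.compile(r"^[ \t]*#?[ \t]*REFINE_MEMORY=.*$", re.MULTILINE)
    new_line = f"REFINE_MEMORY={megabytes}M"
    if pattern.search(ini_text):
        return pattern.sub(new_line, ini_text, count=1)
    separator = "" if (not ini_text or ini_text.endswith("\n")) else "\n"
    return ini_text + separator + new_line + "\n"


# Hypothetical file contents, loosely modelled on the settings named above.
sample = "#REFINE_PORT=3334\n#REFINE_MEMORY=1024M\nREFINE_MIN_MEMORY=1400M\n"
updated = set_refine_memory(sample, 4096)
print(updated)
```

To use this on a real installation you would read `refine.ini`, pass its text through the function, and write the result back. Keep a backup copy of the file before changing it.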
|
||||
|
||||
</TabItem>
|
||||
|
||||
@ -483,7 +523,7 @@ If you’d like to create or modify an extension, [see our developer documentati
|
||||
|
||||
### Two ways to install extensions
|
||||
|
||||
You can [install extensions in one of two places](installing.md#set-where-data-is-stored):
|
||||
You can [install extensions in one of two places](installing#set-where-data-is-stored):
|
||||
|
||||
* Into your OpenRefine program folder, so they will only be available to that version/installation of OpenRefine (meaning the extension will not run if you upgrade OpenRefine), or
|
||||
* Into your workspace, where your projects are stored, so they will be available no matter which version of OpenRefine you’re using.
|
||||
@ -496,11 +536,11 @@ If you want to install the extension into the program folder, go to your program
|
||||
|
||||
If you want to install the extension into your workspace, you can:
|
||||
* launch OpenRefine and click <span class="menuItems">Open Project</span> in the sidebar
|
||||
* At the bottom of the screen, click <span class="menuItems"> Browse workspace directory </span>
|
||||
* At the bottom of the screen, click <span class="menuItems">Browse workspace directory</span>
|
||||
* A file-explorer or finder window will open in your workspace
|
||||
* Create a new folder called `extensions` inside the workspace if it does not exist.
|
||||
* Create a new folder called “extensions” inside the workspace if it does not exist.
|
||||
|
||||
You can also [find your workspace on each operating system using these instructions](installing.md#set-where-data-is-stored).
|
||||
You can also [find your workspace on each operating system using these instructions](installing#set-where-data-is-stored).
|
||||
|
||||
### Install the extension
|
||||
|
||||
@ -514,12 +554,4 @@ Generally, the installation process will be:
|
||||
* Extract the zip contents into the `extensions` directory, making sure all the contents go into one folder with the name of the extension
|
||||
* Start (or restart) OpenRefine.
|
||||
|
||||
To confirm that installation was a success, follow the instructions provided by the extension. Each extension will appear in its own way inside the OpenRefine interface: make sure you read the documentation to know where the functionality will appear, such as under specific dropdown menus.
|
||||
|
||||
## Advanced OpenRefine uses
|
||||
|
||||
|
||||
### Running as a server
|
||||
|
||||
|
||||
### Automating OpenRefine
|
||||
To confirm that installation was a success, follow the instructions provided by the extension. Each extension will appear in its own way inside the OpenRefine interface: make sure you read the documentation to know where the functionality will appear, such as under specific dropdown menus.
|
@ -1,152 +0,0 @@
|
||||
---
|
||||
id: key_value_columnize
|
||||
title: Columnize by key/value columns
|
||||
sidebar_label: Columnize by key/value
|
||||
---
|
||||
|
||||
This operation can be used to reshape a table which contains *key* and *value* columns, such that the repeating contents in the key column become new column names, and the contents of the value column are spread in the new columns. This operation can be invoked from
|
||||
any column menu, via **Transpose** → **Columnize by key/value columns**.
|
||||
|
||||
Overview
|
||||
--------
|
||||
|
||||
Consider the following table:
|
||||
|
||||
| Field | Data |
|
||||
|---------|-----------------------|
|
||||
| Name | Galanthus nivalis |
|
||||
| Color | White |
|
||||
| IUCN ID | 162168 |
|
||||
| Name | Narcissus cyclamineus |
|
||||
| Color | Yellow |
|
||||
| IUCN ID | 161899 |
|
||||
|
||||
In this format, each flower species is described by multiple attributes, which are spread on consecutive rows.
|
||||
In this example, the "Field" column contains the keys and the "Data" column contains the values. With
|
||||
this configuration, the *Columnize by key/value columns* operation transforms this table as follows:
|
||||
|
||||
| Name | Color | IUCN ID |
|
||||
|-----------------------|----------|---------|
|
||||
| Galanthus nivalis | White | 162168 |
|
||||
| Narcissus cyclamineus | Yellow | 161899 |
|
||||
|
||||
Entries with multiple values in the same column
|
||||
-----------------------------------------------
|
||||
|
||||
If an entry has multiple values for a given key, then these values will be grouped on consecutive rows,
|
||||
to form a [record structure](exploring#rows-vs-records).
|
||||
|
||||
For instance, flower species can have multiple colors:
|
||||
|
||||
| Field | Data |
|
||||
|-------------|-----------------------|
|
||||
| Name | Galanthus nivalis |
|
||||
| **Color** | **White** |
|
||||
| **Color** | **Green** |
|
||||
| IUCN ID | 162168 |
|
||||
| Name | Narcissus cyclamineus |
|
||||
| Color | Yellow |
|
||||
| IUCN ID | 161899 |
|
||||
|
||||
This table is transformed by the operation as follows:
|
||||
|
||||
| Name | Color | IUCN ID |
|
||||
|-----------------------|----------|---------|
|
||||
| Galanthus nivalis | White | 162168 |
|
||||
| | Green | |
|
||||
| Narcissus cyclamineus | Yellow | 161899 |
|
||||
|
||||
The first key encountered by the operation serves as the record key.
|
||||
The "Green" value is attached to the "Galanthus nivalis" name because it is the latest record key encountered by the operation as it scans the table. See the [Row order](#row-order) section for more details about the influence of row order on
|
||||
the results of the operation.
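
The behaviour described so far can be approximated in a few lines of Python. This is only a sketch of the idea, not OpenRefine's actual implementation; the function name `columnize` and the list-of-pairs input format are assumptions made for the example.

```python
def columnize(pairs):
    """Reshape (key, value) pairs into rows keyed by column name.

    The first key encountered acts as the record key: each time it appears
    again, a new entry starts. A repeated key inside an entry spills onto an
    extra row of that entry, mimicking a record structure.
    """
    columns = []       # column names, in order of first appearance
    rows = []          # each row is a dict: column name -> value
    entry_start = 0    # index of the first row of the current entry
    record_key = pairs[0][0] if pairs else None

    for key, value in pairs:
        if key not in columns:
            columns.append(key)
        if key == record_key:
            rows.append({})
            entry_start = len(rows) - 1
        # Put the value in the first row of this entry that has room for it.
        for row in rows[entry_start:]:
            if key not in row:
                row[key] = value
                break
        else:
            rows.append({key: value})
    return columns, rows


pairs = [
    ("Name", "Galanthus nivalis"),
    ("Color", "White"),
    ("Color", "Green"),
    ("IUCN ID", "162168"),
    ("Name", "Narcissus cyclamineus"),
    ("Color", "Yellow"),
    ("IUCN ID", "161899"),
]
columns, rows = columnize(pairs)
for row in rows:
    print([row.get(column, "") for column in columns])
```

Each printed list corresponds to one row of the transformed table, with empty strings standing in for blank cells.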
|
||||
|
||||
Notes column
|
||||
------------
|
||||
|
||||
In addition to the key and value columns, a *notes* column can be used optionally. This can be used
|
||||
to store extra metadata associated to a key/value pair.
|
||||
|
||||
Consider the following example:
|
||||
|
||||
| Field | Data | Source |
|
||||
|---------|-----------------------|-----------------------|
|
||||
| Name | Galanthus nivalis | IUCN |
|
||||
| Color | White | Contributed by Martha |
|
||||
| IUCN ID | 162168 | |
|
||||
| Name | Narcissus cyclamineus | Legacy |
|
||||
| Color | Yellow | 2009 survey |
|
||||
| IUCN ID | 161899 | |
|
||||
|
||||
If the "Source" column is selected as notes column, this table is transformed to:
|
||||
|
||||
| Name | Color | IUCN ID | Source : Name | Source : Color |
|
||||
|-----------------------|----------|---------|---------------|-----------------------|
|
||||
| Galanthus nivalis | White | 162168 | IUCN | Contributed by Martha |
|
||||
| Narcissus cyclamineus | Yellow | 161899 | Legacy | 2009 survey |
|
||||
|
||||
Notes columns can therefore be used to preserve provenance or other context about a particular key/value pair.
|
||||
|
||||
Extra columns
|
||||
-------------
|
||||
|
||||
If the table contains extra columns, which are not used as key, value or notes columns, they can be preserved
|
||||
by the operation. For this to work, they must have the same value in all old rows corresponding to a new row.
|
||||
|
||||
Consider for instance the following table, where the "Field" and "Data" columns are used as key and value columns
|
||||
respectively, and the "Wikidata ID" column is not selected:
|
||||
|
||||
| Field | Data | Wikidata ID |
|
||||
|---------|-----------------------|-------------|
|
||||
| Name | Galanthus nivalis | Q109995 |
|
||||
| Color | White | Q109995 |
|
||||
| IUCN ID | 162168 | Q109995 |
|
||||
| Name | Narcissus cyclamineus | Q1727024 |
|
||||
| Color | Yellow | Q1727024 |
|
||||
| IUCN ID | 161899 | Q1727024 |
|
||||
|
||||
This will be transformed to
|
||||
|
||||
| Wikidata ID | Name | Color | IUCN ID |
|
||||
|-------------|-----------------------|----------|---------|
|
||||
| Q109995 | Galanthus nivalis | White | 162168 |
|
||||
| Q1727024 | Narcissus cyclamineus | Yellow | 161899 |
|
||||
|
||||
If extra columns do not contain identical values for all old rows spanning an entry, this can
|
||||
be fixed beforehand by using the [fill down operation](cellediting#fill-down).
|
||||
|
||||
Row order
|
||||
---------
|
||||
|
||||
In the absence of extra columns, it is important to note that the order in which
|
||||
the key/value pairs appear matters. Specifically, the operation will use the first key it encounters as the delimiter for entries:
|
||||
every time it encounters this key again, it will produce a new row and add the following other key/value pairs to that row.
|
||||
|
||||
Consider for instance the following table:
|
||||
|
||||
| Field | Data |
|
||||
|----------|-----------------------|
|
||||
| **Name** | Galanthus nivalis |
|
||||
| Color | White |
|
||||
| IUCN ID | 162168 |
|
||||
| **Name** | Crinum variabile |
|
||||
| **Name** | Narcissus cyclamineus |
|
||||
| Color | Yellow |
|
||||
| IUCN ID | 161899 |
|
||||
|
||||
The occurrences of the "Name" value in the "Field" column define the boundaries of the entries. Because there is
|
||||
no other row between the "Crinum variabile" and the "Narcissus cyclamineus" rows, the "Color" and "IUCN ID" columns
|
||||
for the "Crinum variabile" entry will be empty:
|
||||
|
||||
| Name | Color | IUCN ID |
|
||||
|-----------------------|----------|---------|
|
||||
| Galanthus nivalis | White | 162168 |
|
||||
| Crinum variabile | | |
|
||||
| Narcissus cyclamineus | Yellow | 161899 |
|
||||
|
||||
This sensitivity to order is removed if there are extra columns: in that case, the first extra column will serve as root identifier
|
||||
for the entries.
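
To see why an extra column removes the dependence on row order, consider this simplified Python sketch. It groups key/value pairs by the value of the extra column instead of by position. The function name `columnize_by_id` and the tuple layout are invented for the example, and the sketch ignores repeated keys and notes columns.

```python
def columnize_by_id(rows):
    """Group (key, value, extra_id) triples by extra_id.

    Because entries are identified by extra_id, the order in which the
    key/value pairs arrive does not change the result. A repeated key for the
    same extra_id simply overwrites the earlier value in this simplified version.
    """
    columns = []   # key columns, in order of first appearance
    entries = {}   # extra_id -> {key: value}; dicts keep insertion order
    for key, value, extra_id in rows:
        if key not in columns:
            columns.append(key)
        entries.setdefault(extra_id, {})[key] = value
    return columns, entries


# The same data as the "Wikidata ID" table above, deliberately shuffled.
shuffled = [
    ("Name", "Galanthus nivalis", "Q109995"),
    ("Color", "Yellow", "Q1727024"),
    ("IUCN ID", "162168", "Q109995"),
    ("Name", "Narcissus cyclamineus", "Q1727024"),
    ("Color", "White", "Q109995"),
    ("IUCN ID", "161899", "Q1727024"),
]
key_columns, entries = columnize_by_id(shuffled)
for wikidata_id, values in entries.items():
    print(wikidata_id, [values.get(column, "") for column in key_columns])
```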
|
||||
|
||||
Behaviour in records mode
|
||||
-------------------------
|
||||
|
||||
In records mode, this operation behaves just like in rows mode, except that any facets applied to it will be interpreted in records mode.
|
@ -10,7 +10,7 @@ OpenRefine does not require internet access to run its basic functions. Once you
|
||||
|
||||
You will see a command line window open when you run OpenRefine. Leave that window alone while you work on datasets in your browser.
|
||||
|
||||
No matter how you load OpenRefine, it will load in your computer’s default browser. If you would like to use another browser instead, start OpenRefine and then point your chosen browser at the home screen: [http://127.0.0.1:3333/](http://127.0.0.1:3333/).
|
||||
No matter how you load OpenRefine, it will load in your computer’s default browser. If you would like to use another browser instead, start OpenRefine and then point your chosen browser at the home screen: http://127.0.0.1:3333/.
|
||||
|
||||
OpenRefine works best on browsers based on Webkit, such as:
|
||||
* Google Chrome
|
||||
@ -20,7 +20,7 @@ OpenRefine works best on browsers based on Webkit, such as:
|
||||
|
||||
We are aware of some minor rendering and performance issues on other browsers such as Firefox. We don't support Internet Explorer.
|
||||
|
||||
You can launch multiple projects at the same time by simply having multiple tabs or browser windows open. From the <span class="menuItems">Open Project</span> screen, you can right-click on project names and select <span class="menuItems">Open in new tab</span>.
|
||||
You can launch multiple projects at the same time by simply having multiple tabs or browser windows open. From the <span class="menuItems">Open Project</span> screen, you can right-click on project names and open them in new tabs or windows.
|
||||
|
||||
import Tabs from '@theme/Tabs';
|
||||
import TabItem from '@theme/TabItem';
|
||||
@ -37,7 +37,7 @@ import TabItem from '@theme/TabItem';
|
||||
|
||||
<TabItem value="win">
|
||||
|
||||
To exit OpenRefine, close all the browser tabs, then navigate to the command line window. To close this window and ensure OpenRefine exits properly, hold down `Control` and press `C` on your keyboard.
|
||||
To exit OpenRefine, close all the browser tabs or windows, then navigate to the command line window. To close this window and ensure OpenRefine exits properly, hold down `Control` and press `C` on your keyboard. This will save any last changes to your projects.
|
||||
|
||||
#### With openrefine.exe
|
||||
You can run OpenRefine by double-clicking `openrefine.exe` or calling it from the command line. If you want to [modify the way `openrefine.exe` opens](#starting-with-modifications), you can edit the `openrefine.l4j.ini` file.
|
||||
@ -106,7 +106,9 @@ When you run OpenRefine from a command line, you can change a number of default
|
||||
|
||||
On Windows, use a slash:
|
||||
|
||||
```C:>refine /i 127.0.0.2 /p 3334```
|
||||
```
|
||||
C:>refine /i 127.0.0.2 /p 3334
|
||||
```
|
||||
|
||||
Get a list of all the commands with `refine /?`.
|
||||
|
||||
@ -119,7 +121,6 @@ Get a list of all the commands with `refine /?`.
|
||||
|/d|Enable debugging (on port 8000)|refine /d|
|
||||
|/x|Enable JMX monitoring for Jconsole and JvisualVM|refine /x|
|
||||
|
||||
|
||||
</TabItem>
|
||||
|
||||
<TabItem value="mac">
|
||||
@ -166,9 +167,31 @@ To see the full list of command-line options, run `./refine -h`.
|
||||
|
||||
#### Modifications set within files
|
||||
|
||||
On Windows, you can modify the way `openrefine.exe` runs by editing `openrefine.l4j.ini`; you can modify the way `refine.bat` runs by editing `refine.ini`. You can modify the Mac application by editing `info.plist`. On Linux, you can edit `refine.ini`.
|
||||
On Windows, you can modify the way `openrefine.exe` runs by editing `openrefine.l4j.ini`; you can modify the way `refine.bat` runs by editing `refine.ini`.
|
||||
You can modify the Mac application by editing `Info.plist`.
|
||||
On Linux, you can edit `refine.ini`.
|
||||
|
||||
These JVM preferences are different options and have different syntax than the key/value descriptions above. Some of the most common keys (with their defaults) are:
|
||||
Some settings, such as changing memory allocations, are already set inside these files, and all you have to do is change the values. Some lines need to be un-commented to work.
|
||||
|
||||
For example, inside `refine.ini`, you should see:
|
||||
```
|
||||
no_proxy="localhost,127.0.0.1"
|
||||
#REFINE_PORT=3334
|
||||
#REFINE_HOST=127.0.0.1
|
||||
#REFINE_WEBAPP=main\webapp
|
||||
|
||||
# Memory and max form size allocations
|
||||
#REFINE_MAX_FORM_CONTENT_SIZE=1048576
|
||||
REFINE_MEMORY=1400M
|
||||
|
||||
# Set initial java heap space (default: 256M) for better performance with large datasets
|
||||
REFINE_MIN_MEMORY=1400M
|
||||
...
|
||||
```
|
||||
|
||||
Further modifications can be performed by using JVM preferences.
|
||||
|
||||
These JVM preferences are different options and have different syntax than the key/value descriptions used on the command line. Some of the most common keys (with their defaults) are:
|
||||
* -Drefine.autosave (5 [minutes])
|
||||
* -Drefine.data_dir (/)
|
||||
* -Drefine.development (false)
|
||||
@ -177,7 +200,7 @@ These JVM preferences are different options and have different syntax than the k
|
||||
* -Drefine.port (3333)
|
||||
* -Drefine.webapp (main/webapp)
|
||||
|
||||
The syntax within the `.ini` files is as follows:
|
||||
The syntax is as follows:
|
||||
|
||||
<Tabs
|
||||
groupId="operating-systems"
|
||||
@ -191,14 +214,20 @@ The syntax within the `.ini` files is as follows:
|
||||
|
||||
<TabItem value="win">
|
||||
|
||||
Inside either of the `.ini` files, insert lines in this way:
|
||||
Inside the `openrefine.l4j.ini` file, insert lines in this way:
|
||||
|
||||
```
|
||||
-Drefine.port=3333
|
||||
-Drefine.host=127.0.0.1
|
||||
-Drefine.port=3334
|
||||
-Drefine.host=127.0.0.2
|
||||
-Drefine.webapp=broker/core
|
||||
```
|
||||
|
||||
In `refine.ini`, use a similar syntax, but set multiple parameters within a single line starting with `JAVA_OPTIONS=`:
|
||||
|
||||
```
|
||||
JAVA_OPTIONS=-Drefine.data_dir=C:\Users\user\Documents\OpenRefine\ -Drefine.port=3334
|
||||
|
||||
```
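
If you manage several preferences, it can help to generate the `JAVA_OPTIONS=` line rather than type it by hand. The Python sketch below only joins `-Dkey=value` pairs with spaces, as in the example above; the helper name `java_options_line` is made up, and values containing spaces are not handled.

```python
def java_options_line(preferences: dict) -> str:
    """Build a single JAVA_OPTIONS line from JVM preference names and values."""
    parts = [f"-D{name}={value}" for name, value in preferences.items()]
    return "JAVA_OPTIONS=" + " ".join(parts)


line = java_options_line({"refine.port": 3334, "refine.host": "127.0.0.2"})
print(line)  # JAVA_OPTIONS=-Drefine.port=3334 -Drefine.host=127.0.0.2
```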
|
||||
</TabItem>
|
||||
|
||||
<TabItem value="mac">
|
||||
@ -257,7 +286,7 @@ Refer to the [official Java documentation](https://docs.oracle.com/javase/8/docs
|
||||
|
||||
## The home screen
|
||||
|
||||
When you first launch OpenRefine, you will see a screen with a menu on the left hand side that includes <span class="menuItems">Create Project</span>, <span class="menuItems">Open Project</span>, <span class="menuItems">Import Project</span>, and <span class="menuItems">Language Settings</span>. This is called the "home screen", where you can manage your projects and general settings.
|
||||
When you first launch OpenRefine, you will see a screen with a menu on the left hand side that includes <span class="menuItems">Create Project</span>, <span class="menuItems">Open Project</span>, <span class="menuItems">Import Project</span>, and <span class="menuItems">Language Settings</span>. This is called the “home screen,” where you can manage your projects and general settings.
|
||||
|
||||
### Language settings
|
||||
|
||||
@ -294,7 +323,7 @@ At this time you can set preferences using a key/value pair: that is, selecting
|
||||
|Timeout for Google Drive authorization|googleConnectTimeOut|Number (milliseconds)|180000|500000|
|
||||
|Maximum lag for Wikidata edit retries|wikibase.upload.maxLag|Number (seconds)|5|10|
|
||||
|
||||
To leave the Preferences screen, click on the “OpenRefine” logo.
|
||||
To leave the Preferences screen, click on the diamond “OpenRefine” logo.
|
||||
|
||||
If the preference you’re looking for isn’t here, look at the options you can set from the [command line or in an `.ini` file](#starting-with-modifications).
|
||||
|
||||
@ -316,21 +345,21 @@ Don’t click the “back” button on your browser - it will likely close your
|
||||
|
||||
You can rename a project at any time by clicking inside the project title, which will turn into a text field. Project names don’t have to be unique, as OpenRefine organizes them based on a unique identifier behind the scenes.
|
||||
|
||||
<span class="menuItems">Permalink</span> allows you to return to a project at a specific view state - that is, with facets and filters applied. The permalink can help you pick up where you left off if you have to close your project while working with facets and filters. It puts view-specific information directly into the URL: clicking on it will load this current-view URL in the existing tab. You can right-click and copy the Permalink URL to copy the current view state to your clipboard, without refreshing the tab you’re using.
|
||||
<br/>
|
||||
<span class="menuItems">Open…</span> will open up a new browser tab showing the “Create Project” screen. From here you can change settings, start a new project, or open an existing project.
|
||||
<br/>
|
||||
The <span class="menuItems">Permalink</span> allows you to return to a project at a specific view state - that is, with facets and filters applied. The permalink can help you pick up where you left off if you have to close your project while working with facets and filters. It puts view-specific information directly into the URL: clicking on it will load this current-view URL in the existing tab. You can right-click and copy the Permalink URL to copy the current view state to your clipboard, without refreshing the tab you’re using.
|
||||
|
||||
The <span class="menuItems">Open…</span> button will open up a new browser tab showing the <span class="menuItems">Create Project</span> screen. From here you can change settings, start a new project, or open an existing project.
|
||||
|
||||
<span class="menuItems">Export</span> is a dropdown menu that allows you to pick a format for exporting your current dataset. It will only export rows and records that are currently visible - the currently selected facets and filters, not the total data in the project.
|
||||
<br/>
|
||||
|
||||
<span class="menuItems">Help</span> will open up a new browser tab and bring you to this user manual on the web.
|
||||
|
||||
### The grid header
|
||||
|
||||
The grid header sits below the project bar and above the project grid (the data of your project). The grid header will tell you the total number of rows or records in your project, and indicate whether you are in rows or records mode.
|
||||
|
||||
It will also tell you if you’re currently looking at a select number of rows via facets or filtering, rather than the entire dataset, by displaying either, for example, <span class="menuItems">180 rows</span> or <span class="menuItems">67 matching rows (180 total)</span>.
|
||||
It will also tell you if you’re currently looking at a select number of rows via facets or filtering, rather than the entire dataset, by displaying either, for example, “180 rows” or “67 matching rows (180 total).”
|
||||
|
||||
Directly below the row number, you have the ability to switch between row mode and records mode. OpenRefine stores which projects are in records mode, and displays your data as records by default if you are.
|
||||
Directly below the row number, you have the ability to switch between [row mode and records mode](exploring#rows-vs-records). OpenRefine remembers which mode each project is in, and displays a project’s data as records by default if you last left it in records mode.
|
||||
|
||||
To the right of the rows/records selection is the array of options for how many rows/records to view on screen at one time. At the far right of the screen you can navigate through your entire dataset one page at a time.
|
||||
|
||||
@ -340,23 +369,21 @@ The <span class="menuItems">Extensions</span> dropdown offers you options for ex
|
||||
|
||||
### The grid
|
||||
|
||||
The area of the project screen that displays your dataset is called the "project grid" (or the "data grid", or simply the "grid"). The grid presents data in a tabular format, which may look like a normal spreadsheet program to you.
|
||||
The area of the project screen that displays your dataset is called the “project grid” (or the “data grid,” or simply the “grid”). The grid presents data in a tabular format, which may look like a normal spreadsheet program to you.
|
||||
|
||||
Column widths are automatically set based on their contents; some column headers may be cut off, but can be viewed by mousing over the headers.
|
||||
|
||||
In each column header you will see a small arrow. Clicking on this arrow brings up a dropdown menu containing column-specific data exploration and transformation options. You will learn about each of these options in the [Exploring data](exploring) and [Transforming data](transforming) sections.
|
||||
|
||||
The first column in every project will always be <span class="menuItems">All</span>, which contains options to flag, star, and do non-column-specific operations. The <span class="menuItems">All</span> column is also where rows/records are numbered.
|
||||
The first column in every project will always be “All,” which contains options to flag, star, and do non-column-specific operations. The “All” column is also where rows/records are numbered.
|
||||
|
||||
The project grid may display with both vertical and horizontal scrolling, depending on the number and width of columns, and the number of rows/records displayed. You can control the display of the project grid by using [Sort and View options](exploring#sort-and-view).
|
||||
|
||||
Mousing over individual cells will allow you to [edit cells individually](transforming).
|
||||
Mousing over individual cells will allow you to [edit cells individually](cellediting#edit-one-cell-at-a-time).
|
||||
|
||||
### The sidebar
|
||||
### Facet/Filter
|
||||
|
||||
#### Facet/Filter
|
||||
|
||||
The Facet/Filter tab is one of the main ways of exploring your data: displaying the patterns and trends in your data, and helping you narrow your focus and modify that data. [Facets](exploring#facets) and [filters](exploring#filters) are explained more in [Exploring data](exploring).
|
||||
The Facet/Filter tab is one of the main ways of exploring your data: displaying the patterns and trends in your data, and helping you narrow your focus and modify that data. [Facets](facets) and [filters](facets#text-filter) are explained more in [Exploring data](exploring).
|
||||
|
||||
![A screenshot of facets and filters in action.](/img/facetfilter.png)
|
||||
|
||||
@ -368,7 +395,7 @@ Removing your facets will clear out the sidebar entirely. If you have written cu
|
||||
|
||||
You can preserve your facets and filters for future use by copying a [Permalink](#the-project-bar).
|
||||
|
||||
#### History (Undo/Redo)
|
||||
### History (Undo/Redo)
|
||||
|
||||
In OpenRefine, any activity that changes the data can be undone. Changes are tracked from the very beginning, when a project is first created. The change history of each project is saved with the project's data, so quitting OpenRefine does not erase the steps you've taken. When you restart OpenRefine, you can view and undo changes that you made before you quit OpenRefine.
|
||||
|
||||
@ -376,7 +403,7 @@ Project history gets saved when you export a project archive, and restored when
|
||||
|
||||
![A screenshot of the History (Undo/Redo) tab with 13 steps.](/img/history.png "A screenshot of the History (Undo/Redo) tab with 13 steps.")
|
||||
|
||||
When you click on <span class="menuItems">Undo / Redo</span> in the sidebar of any project, that project’s history is shown as a list of changes in order, with the first change being the action of creating the project itself. (That first change, indexed as step zero, cannot be undone.) Here is a sample history with 3 changes:
|
||||
When you click on <span class="menuItems">Undo / Redo</span> in the sidebar of any project, that project’s history is shown as a list of changes in order, with the first “change” being the action of creating the project itself. (That first change, indexed as step zero, cannot be undone.) Here is a sample history with 3 changes:
|
||||
|
||||
```
|
||||
0. Create project
|
||||
@ -387,22 +414,77 @@ When you click on <span class="menuItems">Undo / Redo</span> in the sidebar of a
|
||||
|
||||
The current state of the project is highlighted with a dark blue background. If you move back and forth on the timeline you will see the current state become highlighted, while the actions that came after that state will be grayed out.
|
||||
|
||||
To revert your data back to an earlier state, simply click on the last action in the timeline you want to keep. In the example above, if we keep the removal of 7 rows but revert everything we did after that, then click on <span class="menuItems">Remove 7 rows</span>. The last 2 changes will be undone, in order to bring the project back to state #1.
|
||||
To revert your data back to an earlier state, simply click on the last action in the timeline you want to keep. In the example above, if you want to keep the removal of 7 rows but revert everything you did after that, click on “Remove 7 rows.” The last 2 changes will be undone, in order to bring the project back to state #1.
|
||||
|
||||
In this example, changes #2 and #3 will now be grayed out. You can redo a change by clicking on it in the history - everything up to and including it will be redone.
|
||||
|
||||
If you have moved back one or more states, and then you perform a new operation on your data, the later actions (everything that’s greyed out) will be erased and cannot be re-applied.
|
||||
|
||||
The <span class="menuItems">Undo/Redo</span> tab will show you which step you’re on, and if you’re about to risk erasing work - by saying something like "4/5" or "1/7" at the end.
|
||||
The Undo/Redo tab will show you which step you’re on, and if you’re about to risk erasing work - by saying something like “4/5” or “1/7” at the end.
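
The rules above (step zero cannot be undone, later steps are grayed out when you go back, and a new operation erases them) can be modelled with a small Python class. This is a toy model for illustration only; the class, its method names, and the step descriptions other than “Remove 7 rows” are invented, and it says nothing about how OpenRefine stores history internally.

```python
class History:
    """Toy model of a linear undo/redo timeline."""

    def __init__(self):
        self.steps = ["Create project"]  # step 0 always exists
        self.current = 0                 # index of the highlighted state

    def apply(self, description: str) -> None:
        """Perform a new operation; any undone (grayed-out) steps are erased."""
        del self.steps[self.current + 1:]
        self.steps.append(description)
        self.current += 1

    def jump_to(self, index: int) -> None:
        """Click on a step: undo or redo everything up to and including it."""
        if not 0 <= index < len(self.steps):
            raise IndexError("no such step")
        self.current = index

    def label(self) -> str:
        """Counter in the style of the tab label, such as '1/3'."""
        return f"{self.current}/{len(self.steps) - 1}"


history = History()
for step in ("Remove 7 rows", "Rename a column", "Split a column"):
    history.apply(step)

history.jump_to(1)              # keep "Remove 7 rows", undo the rest
before = history.label()        # "1/3": two later steps can still be redone
history.apply("Trim whitespace")
after = history.label()         # "2/2": the two undone steps are gone
print(before, after, history.steps)
```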
|
||||
|
||||
##### Reusing operations
|
||||
#### Reusing operations
|
||||
|
||||
Operations that you perform in OpenRefine can be reused. For example, a formula you wrote inside one project can be copied and applied to another project later.
|
||||
|
||||
To reuse one or more operations, you first extract it from the project where it was first applied. Click to the <span class="menuItems">Undo/Redo</span> tab and click <span class="menuItems">Extract…</span>. This brings up a box that lists all operations up to the current state (it does not show undone operations). Select the operation or operations you want to extract using the checkboxes on the left, and they will be encoded as JSON on the right. Copy that JSON off to the clipboard.
|
||||
To reuse one or more operations, first extract them from the project where they were originally applied. Go to the Undo/Redo tab and click <span class="menuItems">Extract…</span>. This brings up a box that lists all operations up to the current state (it does not show undone operations). Select the operation or operations you want to extract using the checkboxes on the left, and they will be encoded as JSON on the right. Copy that JSON to the clipboard.
|
||||
|
||||
Move to the second project, go to the <span class="menuItems">Undo/Redo</span> tab, click <span class="menuItems">Apply…</span> and paste in that JSON.
|
||||
Move to the second project, go to the Undo/Redo tab, click <span class="menuItems">Apply…</span> and paste in that JSON.
|
||||
|
||||
Not all operations can be extracted. Edits to a single cell, for example, can’t be replicated.
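
Because the extracted history is plain JSON, you can also trim it with a script before pasting it into another project. The Python sketch below keeps only the operations that refer to one column. The sample JSON is abbreviated and illustrative: the field names `op`, `columnName`, and `description` are assumptions about what extracted operations look like, so check them against the JSON your own project produces.

```python
import json

# Abbreviated, illustrative stand-in for JSON copied from the Extract dialog.
extracted = """
[
  {"op": "core/text-transform", "columnName": "Name",
   "expression": "value.trim()", "description": "Trim whitespace in Name"},
  {"op": "core/column-removal", "columnName": "Notes",
   "description": "Remove column Notes"}
]
"""


def keep_operations(history_json: str, column: str) -> str:
    """Return JSON containing only the operations that name the given column."""
    operations = json.loads(history_json)
    kept = [op for op in operations if op.get("columnName") == column]
    return json.dumps(kept, indent=2)


to_apply = keep_operations(extracted, "Name")
print(to_apply)
```

The resulting text is what you would paste into the Apply dialog of the second project.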
|
||||
|
||||
### Common extension buttons
|
||||
## Advanced OpenRefine uses
|
||||
|
||||
### Running as a server
|
||||
|
||||
:::caution
|
||||
Please note that if your machine has an external IP (is exposed to the Internet), you should not do this, or should protect it behind a proxy or firewall, such as nginx. Proceed at your own risk.
|
||||
:::
|
||||
|
||||
By default (and for security reasons), OpenRefine only listens to TCP requests coming from localhost (127.0.0.1) on port 3333. If you want to share your OpenRefine instance with colleagues and respond to TCP requests to any IP address of the machine, start it from the command line like this:
|
||||
```
|
||||
./refine -i 0.0.0.0
|
||||
```
|
||||
|
||||
or set this option in `refine.ini`:
|
||||
```
|
||||
REFINE_HOST=0.0.0.0
|
||||
```
|
||||
|
||||
or set this JVM option:
|
||||
```
|
||||
-Drefine.host=0.0.0.0
|
||||
```
|
||||
|
||||
On Mac, you can add a specific entry to the `Info.plist` file located within the app bundle (`/Applications/OpenRefine.app/Contents/Info.plist`):
|
||||
```
|
||||
<key>JVMOptions</key>
|
||||
|
||||
<array>
|
||||
<string>-Drefine.host=0.0.0.0</string>
|
||||
…
|
||||
</array>
|
||||
```
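
The difference between the default host and `0.0.0.0` can be checked with Python's standard `ipaddress` module. The sketch below classifies a host setting as loopback (reachable only from the same machine) or unspecified (listen on every interface). The function name `describe_bind_address` and its return strings are invented for the example.

```python
import ipaddress


def describe_bind_address(host: str) -> str:
    """Classify an IP address used as a listening host."""
    address = ipaddress.ip_address(host)
    if address.is_loopback:
        return "local only"
    if address.is_unspecified:
        return "all interfaces"
    return "one specific interface"


for host in ("127.0.0.1", "0.0.0.0"):
    print(host, "->", describe_bind_address(host))
```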
|
||||
|
||||
:::caution
|
||||
OpenRefine has no built-in security or version control for multi-user scenarios. OpenRefine has a single data model that is not shared, so there is a risk of data operations being overwritten by other users. Care must be taken by users.
|
||||
:::
|
||||
|
||||
### Automating OpenRefine
|
||||
|
||||
Some users may wish to employ OpenRefine for batch processing as part of a larger automated pipeline. Not every OpenRefine feature can run without human supervision and judgment (clustering, for example), but many data transformation tasks can be automated.
|
||||
|
||||
:::info
|
||||
The following are all third-party extensions and code; the OpenRefine team does not maintain them and cannot guarantee that any of them work.
|
||||
:::
|
||||
|
||||
|
||||
Some examples:
|
||||
|
||||
* This project allows OpenRefine to be run from the command line using [operations saved in a JSON file](running#reusing-operations): [OpenRefine batch processing](https://github.com/opencultureconsulting/openrefine-batch)
|
||||
* A Python project for applying a JSON file of operations to a data file, outputting the new file, and deleting the temporary project, written by David Huynh and Max Ogden: [Python client library for Google Refine](https://github.com/maxogden/refine-python)
|
||||
* And the same in Ruby: [Refine-Ruby](https://github.com/maxogden/refine-ruby)
|
||||
* Another Python client library, by Paul Makepeace: [OpenRefine Python Client Library](https://github.com/PaulMakepeace/refine-client-py)
|
||||
|
||||
To look for other instances, search our Google Groups [for users](https://groups.google.com/g/openrefine) and [for developers](https://groups.google.com/g/openrefine-dev), where [these projects were originally posted](https://groups.google.com/g/openrefine/c/GfS1bfCBJow/m/qWYOZo3PKe4J).
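
Client libraries like the ones listed above generally work by sending HTTP requests to a running OpenRefine instance. The sketch below only builds such a request and does not send it. It assumes the `apply-operations` command endpoint and a `csrf_token` parameter as described in the OpenRefine API documentation; the function name, project ID, and token value are hypothetical, so check the details against your OpenRefine version.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_apply_request(base_url, project_id, operations, csrf_token):
    """Build (but do not send) a POST request that applies saved operations.

    Assumption: the endpoint path and parameter names follow the OpenRefine
    API documentation and may differ between versions.
    """
    query = urlencode({"project": project_id, "csrf_token": csrf_token})
    body = urlencode({"operations": json.dumps(operations)}).encode("utf-8")
    url = f"{base_url}/command/core/apply-operations?{query}"
    return Request(url, data=body, method="POST")


if __name__ == "__main__":
    # Hypothetical project ID and token, for illustration only.
    request = build_apply_request(
        "http://127.0.0.1:3333", "1234567890", [], "example-token"
    )
    print(request.get_method(), request.full_url)
```

Keeping request construction separate from sending makes it easier to inspect what a pipeline would do before it touches a real project.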
|
||||
|
||||
|
||||
|
@ -10,13 +10,13 @@ An OpenRefine project is started by importing in some existing data - OpenRefine
|
||||
|
||||
No matter where your data comes from, OpenRefine doesn’t modify your original data source. It copies all the information from your input, creates its own project file, and stores it in your [workspace directory](installing#set-where-data-is-stored).
|
||||
|
||||
The data and all of your edits are automatically saved inside the project file. When you’re finished modifying the data, you can export it back out into the file format of your choice.
|
||||
The data and all of your edits are [automatically saved](#autosaving) inside the project file. When you’re finished modifying the data, you can [export it back out](exporting) into the file format of your choice.
|
||||
|
||||
You can also receive and open other people’s projects, or send them yours, by exporting a project archive and importing it.
|
||||
You can also receive and open other people’s projects, or send them yours, by [exporting a project archive](exporting#export-a-project) and [importing it](#import-a-project).
|
||||
|
||||
## Create project by importing data
|
||||
|
||||
When you start OpenRefine, you’ll be taken to the “Create Project” screen. You’ll see on the left side of the screen that your options are to:
|
||||
When you start OpenRefine, you’ll be taken to the <span class="menuItems">Create Project</span> screen. You’ll see on the left side of the screen that your options are to:
|
||||
|
||||
* import data from a file on your computer
|
||||
* import data from a link to the web
|
||||
@ -49,6 +49,7 @@ If you supply two or more files for one project, the files’ rows will be loade
|
||||
|berries.csv||9|Mulberry|Greece|
|
||||
|berries.csv||2|Blueberry|Canada|
|
||||
|
||||
You cannot combine two datasets into one project by appending data within rows. You can, however, combine two projects later using functions such as [cross()](grelfunctions/#crosscell-s-projectname-s-columnname).
|
||||
|
||||
For whichever method you choose, when you click <span class="menuItems">Next >></span> you will be given a preview and a chance to configure the way OpenRefine interprets the file.
|
||||
|
||||
@ -94,16 +95,17 @@ If your connection is successful, you will see a Query Editor where you can run
|
||||
|
||||
You have two ways to load in data from Google Sheets:
|
||||
* A link to an accessible Google Sheet (that is, one with link-sharing turned on)
|
||||
* Selecting a Google Sheet in your Google Drive
|
||||
|
||||
* Selecting a Google Sheet in your Google Drive.
|
||||
|
||||
#### Google Sheet by URL
|
||||
|
||||
You can import data from any Google Sheet that has link-sharing turned on. Paste in a URL that looks something like
|
||||
|
||||
```https://docs.google.com/spreadsheets/………/edit?usp=sharing```
|
||||
```
|
||||
https://docs.google.com/spreadsheets/………/edit?usp=sharing
|
||||
```
|
||||
|
||||
This will only work with Sheets, not with any other Google Drive file that might have an available link, including `.xls` and other valid files that are hosted in Google Drive. These links will also not work [by URL](#web-addresses-urls), so you need to download the files to your computer.
|
||||
This will only work with Sheets, not with any other Google Drive file that might have an available link, including `.xls` and other valid files that are hosted in Google Drive. These links will not work when attempting to start a project [by URL](#web-addresses-urls) either, so you need to download those files to your computer.
|
||||
|
||||
#### Google Sheet from Drive
|
||||
|
||||
@ -130,6 +132,10 @@ If you imported a spreadsheet with multiple worksheets, they will be listed alon
|
||||
|
||||
Note that OpenRefine does not preserve any formatting, such as cell or text colour, that may have been in the original data file.
|
||||
|
||||
:::info
|
||||
Look for character encoding issues at this stage. You may want to manually select an encoding, such as UTF-8, UTF-16, or ASCII, if OpenRefine does not display some characters correctly in the preview. Once your project is created, you can specify another encoding for specific columns using the [reinterpret() function](grelfunctions#reinterprets-s-encoder).
|
||||
:::
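
To see why a wrong encoding produces odd characters in the preview, here is a small sketch in plain Python (not OpenRefine code) that decodes UTF-8 bytes as Latin-1 and then reverses the mistake. The sample string and function names are made up for illustration.

```python
def garble(text):
    """Simulate reading UTF-8 bytes with the wrong (Latin-1) encoding."""
    return text.encode("utf-8").decode("latin-1")


def repair(garbled):
    """Undo the mistake by re-encoding as Latin-1 and decoding as UTF-8."""
    return garbled.encode("latin-1").decode("utf-8")


if __name__ == "__main__":
    original = "Zoë Müller"
    broken = garble(original)
    print(broken)           # accented letters appear as two-character sequences
    print(repair(broken))   # the original text is recovered
```

Choosing the right encoding in the preview avoids the problem entirely; a repair step like this is only needed once the wrong interpretation is already stored in the project.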
|
||||
|
||||
You should create a project name at this stage. You can also supply tags to keep your projects organized. When you’re happy with the preview, click <span class="menuItems">Create Project</span>.
|
||||
|
||||
|
||||
@ -137,12 +143,10 @@ You should create a project name at this stage. You can also supply tags to keep
|
||||
|
||||
Because OpenRefine only runs locally on your computer, you can’t have a project accessible to more than one person at the same time.
|
||||
|
||||
The best way to collaborate with another person is to export and import projects that save all your changes, so that you can pick up where someone else left off. You can also [export projects](exporting) and import them to new computers of your own, such as for working on the same project from the office and from home.
|
||||
The best way to collaborate with another person is to export and import projects that save all your changes, so that you can pick up where someone else left off. You can also [export projects](exporting#export-a-project) and import them to new computers of your own, such as for working on the same project from the office and from home.
|
||||
|
||||
An exported project will include all of the [history](running#history-undoredo), so you can see (and undo) all the changes from the previous user. It is essentially a point-in-time snapshot of their work. OpenRefine only exports projects as `.tar.gz` files at this time.
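
If you want to check a project archive before importing it, the standard `tarfile` module can list its contents. This is a minimal sketch; the function name and file name are hypothetical, and it makes no assumption about which files OpenRefine places inside the archive.

```python
import tarfile


def list_archive(path):
    """Return the sorted member names of a gzip-compressed tar archive."""
    with tarfile.open(path, "r:gz") as archive:
        return sorted(archive.getnames())


if __name__ == "__main__":
    # Hypothetical file name; replace with the archive you were sent.
    for name in list_archive("my-project.openrefine.tar.gz"):
        print(name)
```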
|
||||
|
||||
### Instructions
|
||||
|
||||
Once someone has sent you a project archive file from their computer, you can save it anywhere, including your Downloads folder.
|
||||
|
||||
In the left-hand menu of the home screen, click <span class="menuItems">Import Project</span>. Click <span class="menuItems">Browse…</span> and navigate to wherever you saved the file you were sent (for example, your Downloads folder).
|
||||
@ -161,13 +165,13 @@ You can access all of your created projects by clicking on <span class="menuItem
|
||||
|
||||
### Naming projects
|
||||
|
||||
You may have multiple projects from the same dataset, or multiple versions from sharing a project with another person. OpenRefine automatically generates a project name from the imported file, or <span class="menuItems">clipboard</span> when you use Clipboard importing. Project names don’t have to be unique, so OpenRefine will create many projects with the same name unless you intervene.
|
||||
You may have multiple projects from the same dataset, or multiple versions from sharing a project with another person. OpenRefine automatically generates a project name from the imported file, or “clipboard” when you use <span class="menuItems">Clipboard</span> importing. Project names don’t have to be unique, so OpenRefine will create many projects with the same name unless you intervene.
|
||||
|
||||
You can name a project when you create it or import it, and you can rename a project by opening it and clicking on the project name at the top of the screen.
|
||||
|
||||
### Autosaving
|
||||
|
||||
OpenRefine saves all of your actions (everything you can see in the <span class="menuItems">Undo/Redo</span> panel). That includes flagging and starring rows.
|
||||
OpenRefine [saves all of your actions](running#history-undoredo) (everything you can see in the <span class="menuItems">Undo/Redo</span> panel). That includes flagging and starring rows.
|
||||
|
||||
It doesn’t, however, save your facets, filters, or any kind of view you may have in place while you work. This includes the number of rows showing, whether you are showing your data as rows or records, and any sorting or column collapsing you may have done. A good rule of thumb is: if it’s not showing in <span class="menuItems">Undo/Redo</span>, you will lose it when you leave the project workspace.
|
||||
|
||||
@ -181,4 +185,4 @@ Go to <span class="menuItems">Open Project</span> and find the project you want
|
||||
|
||||
### Project files
|
||||
|
||||
You can find all of your raw project files in your work directory. They will be named according to the unique Project ID that OpenRefine has assigned them, which you can find on the <span class="menuItems">Open Project</span> screen, under the “About” link for each project.
|
||||
You can find all of your raw project files in your work directory. They will be named according to the unique “Project ID” that OpenRefine has assigned them, which you can find on the <span class="menuItems">Open Project</span> screen, under the “About” link for each project.
|
@ -8,7 +8,7 @@ sidebar_label: Overview
|
||||
|
||||
OpenRefine gives you powerful ways to clean, correct, codify, and extend your data. Without ever needing to type inside a single cell, you can automatically fix typos, convert things to the right format, and add structured categories from trusted sources.
|
||||
|
||||
The following ways to improve data are organized by their appearance in the menu options in OpenRefine. You can:
|
||||
The ways to improve data in this section are organized by their appearance in the menu options in OpenRefine. You can:
|
||||
|
||||
* change the order of rows or columns
|
||||
* edit cell contents within a particular column
|
||||
@ -16,7 +16,7 @@ The following ways to improve data are organized by their appearance in the menu
|
||||
* transform rows into columns, and columns into rows
|
||||
* split or join columns
|
||||
* add new columns based on existing data or through reconciliation
|
||||
* convert your rows of data into multi-row records
|
||||
* convert your rows of data into multi-row records.
|
||||
|
||||
## Edit rows
|
||||
|
||||
|
@ -4,32 +4,26 @@ title: Troubleshooting
|
||||
sidebar_label: Troubleshooting
|
||||
---
|
||||
|
||||
## Frequently Asked Questions
|
||||
## Frequently asked questions
|
||||
|
||||
We collect and share FAQs and responses on GitHub at [https://github.com/OpenRefine/OpenRefine/wiki/FAQ](https://github.com/OpenRefine/OpenRefine/wiki/FAQ). If you don’t find your problem and solution there, continue on to the resources in the Community section to see more conversations and look for solutions.
|
||||
|
||||
|
||||
## Community
|
||||
|
||||
If you’re having a problem:
|
||||
|
||||
|
||||
|
||||
* Search the [User forum](https://groups.google.com/g/openrefine) to see if the problem is already reported
|
||||
* Read [Github issues](https://github.com/OpenRefine/OpenRefine/issues) to see if the problem is already reported
|
||||
* Read [Stack Overflow](https://stackoverflow.com/questions/tagged/openrefine) to see if the problem is already reported
|
||||
* Check [Twitter](https://twitter.com/search?f=tweets&vertical=default&q=OpenRefine%20OR%20%22Open%20Refine%22%20OR%20%23OpenRefine&src=typd) to see if others are discussing the problem
|
||||
* Report an issue:
|
||||
* First as a new thread in the User forum
|
||||
* Then, if you wish, you can create a Github issue
|
||||
* Then, if you wish, you can create a Github issue.
|
||||
|
||||
If you want to contribute:
|
||||
|
||||
|
||||
|
||||
* [We have a guide to contributing here.](https://github.com/OpenRefine/OpenRefine/blob/master/CONTRIBUTING.md)
|
||||
* Contribute your feature requests in the User forum or as Github issues
|
||||
* Share with us your successes and use cases in the User forum
|
||||
* Add your blog posts, guides, tips, tricks, tutorials to our list
|
||||
* Respond to our biennial user survey
|
||||
* Join the User Forum and/or [Developer Forum](https://groups.google.com/g/openrefine-dev)
|
||||
* [Help us translate the tool into more languages](https://docs.openrefine.org/technical-reference/translating), using Weblate
|
||||
* [We have a guide to contributing](https://docs.openrefine.org/technical-reference/contributing) in the Technical Reference section
|
||||
* Contribute your feature requests in the [User forum](https://groups.google.com/g/openrefine) or as [Github issues](https://github.com/OpenRefine/OpenRefine/issues)
|
||||
* Join the User Forum and/or the [Developer Forum](https://groups.google.com/g/openrefine-dev)
|
||||
* Share your successes and use cases with us, in the User forum
|
||||
* Add your [blog posts, guides, tips, tricks, tutorials to our list](https://github.com/OpenRefine/OpenRefine/wiki/External-Resources)
|
||||
* Keep an eye out for and respond to our biennial user survey.
|
@ -8,7 +8,7 @@ sidebar_label: Wikidata
|
||||
|
||||
OpenRefine provides powerful ways to both pull data from Wikidata and add data to it.
|
||||
|
||||
OpenRefine’s connections to Wikidata is supplied by an extension that is available by default in OpenRefine. The Wikidata extension can be removed manually by navigating to your OpenRefine installation folder, and then looking inside `webapp/extensions/` and deleting the `wikidata` folder inside.
|
||||
OpenRefine’s connections to Wikidata were formerly an optional extension, but are now installed automatically with the downloadable package. The Wikidata extension can be removed manually by navigating to your OpenRefine installation folder, and then looking inside `webapp/extensions/` and deleting the `wikidata` folder inside.
|
||||
|
||||
You do not need a Wikidata account to reconcile your local OpenRefine project to Wikidata. If you wish to [upload your cleaned dataset to Wikidata](#editing-wikidata-with-openrefine), you will need an [autoconfirmed](https://www.wikidata.org/wiki/Wikidata:Autoconfirmed_users) account, and you must [authorize OpenRefine with that account](#manage-wikidata-account).
|
||||
|
||||
@ -180,10 +180,4 @@ The best resource is the [Quality assurance page](https://www.wikidata.org/wiki/
|
||||
|
||||
OpenRefine will analyze your schema and make suggestions. It does not check for conflicts in your proposed edits, or tell you about redundancies.
|
||||
|
||||
One of the most common suggestions is to attach [a reference to your edits](https://www.wikidata.org/wiki/Help:Sources) - a citation for where the information can be found. This can be a book or newspaper citation, a URL to an online page, a reference to a physical source in an archival or special collection, or another source. If the source is itself an item on Wikidata, use the relationship [stated in (P248)](https://www.wikidata.org/wiki/Property:P248); otherwise, use [reference URL (P854)](https://www.wikidata.org/wiki/Property:P854) to identify an external source.
|
||||
|
||||
## Wikibases
|
||||
|
||||
Much of the above is also true of other Wikibase instances. You can reconcile your dataset against an available Wikibase reconciliation API.
|
||||
|
||||
Wikibase administrators can configure a reconciliation API using the [instructions here](https://openrefine-wikibase.readthedocs.io/en/latest/index.html).
|
||||
One of the most common suggestions is to attach [a reference to your edits](https://www.wikidata.org/wiki/Help:Sources) - a citation for where the information can be found. This can be a book or newspaper citation, a URL to an online page, a reference to a physical source in an archival or special collection, or another source. If the source is itself an item on Wikidata, use the relationship [stated in (P248)](https://www.wikidata.org/wiki/Property:P248); otherwise, use [reference URL (P854)](https://www.wikidata.org/wiki/Property:P854) to identify an external source.
|
@ -23,7 +23,6 @@ module.exports = {
|
||||
items: ['manual/expressions', 'manual/grelfunctions'],
|
||||
},
|
||||
'manual/exporting',
|
||||
'manual/glossary',
|
||||
'manual/troubleshooting'
|
||||
],
|
||||
'Technical Reference': [
|
||||
|