Merge pull request #71 from opencultureconsulting:felixlohmeier/tutorial-42

getting started tutorial
Felix Lohmeier 2022-10-25 12:45:20 +02:00 committed by GitHub
commit 8885fe89fb
GPG Key ID: 4AEE18F83AFDEB23
4 changed files with 112 additions and 33 deletions


@@ -3,7 +3,7 @@ tasks:
- name: install bashly and OpenRefine
before: gem install --silent bashly
init: |
wget -q -O openrefine.tar.gz https://github.com/OpenRefine/OpenRefine/releases/download/3.5.2/openrefine-linux-3.5.2.tar.gz
wget -q -O openrefine.tar.gz "https://oss.sonatype.org/service/local/artifact/maven/content?r=releases&g=org.openrefine&a=openrefine&v=3.6.2&c=linux&p=tar.gz"
tar --exclude 'licenses' --exclude 'LICENSE.txt' --exclude 'README.md' -xzf openrefine.tar.gz --strip 1
rm openrefine.tar.gz
command: |

README.md (119 changes)

@@ -4,9 +4,11 @@ Bash script to control OpenRefine via [its HTTP API](https://docs.openrefine.org
## Features
* works with latest OpenRefine version (currently 3.5)
* batch processing (import, transform, export)
* works with latest OpenRefine version (currently 3.6)
* run batch processes (import, transform, export)
* orcli takes care of starting and stopping OpenRefine with temporary workspaces
* allows execution of arbitrary bash scripts
* interactive mode for playing around and debugging
* your existing OpenRefine data will not be touched
* import CSV, ~~TSV, line-based TXT, fixed-width TXT, JSON or XML~~ (and specify input options)
* supports stdin, multiple files and URLs
@@ -29,24 +31,101 @@ Bash script to control OpenRefine via [its HTTP API](https://docs.openrefine.org
2. Download bash script there and make it executable
```sh
wget https://github.com/opencultureconsulting/orcli/raw/main/orcli
chmod +x orcli
```
```sh
wget https://github.com/opencultureconsulting/orcli/raw/main/orcli
chmod +x orcli
```
3. Optional: Create a symlink in your $PATH (e.g. to ~/.local/bin)
Optional:
```sh
ln -s "${PWD}/orcli" ~/.local/bin/
```
* Create a symlink in your $PATH (e.g. to ~/.local/bin)
```sh
ln -s "${PWD}/orcli" ~/.local/bin/
```
* Install Bash tab completion
* temporarily
```sh
source <(orcli completions)
```
* permanently
```sh
mkdir -p ~/.bashrc.d
orcli completions > ~/.bashrc.d/orcli
```
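The permanent variant only takes effect if `~/.bashrc` sources the files in `~/.bashrc.d`. On systems where that is not already set up, a loader along the following lines could be added to `~/.bashrc`. This is a minimal sketch assuming a POSIX-compatible shell; the function name `load_bashrc_d` is hypothetical and not part of orcli.

```sh
# Hypothetical helper (not part of orcli): source every regular file
# in a directory, defaulting to ~/.bashrc.d.
load_bashrc_d() {
  dir="${1:-$HOME/.bashrc.d}"
  if [ -d "$dir" ]; then
    for rc in "$dir"/*; do
      # Skip the unexpanded glob (empty directory) and anything
      # that is not a regular file
      if [ -f "$rc" ]; then
        . "$rc"
      fi
    done
  fi
  unset dir rc
}
```

Calling `load_bashrc_d` without arguments from `~/.bashrc` would then pick up `~/.bashrc.d/orcli` in every new shell.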
## Getting Started
1. Launch an interactive playground
```sh
./orcli run --interactive
```
2. Create OpenRefine project `duplicates` from comma-separated-values (CSV) file
```sh
orcli import csv "https://git.io/fj5hF" --projectName "duplicates"
```
3. Show OpenRefine project's metadata
```sh
orcli info "duplicates"
```
4. Remove duplicates (coming soon)
5. Export data from OpenRefine project to tab-separated-values (TSV) file `duplicates.tsv`
```sh
orcli export tsv "duplicates" --output "duplicates.tsv"
```
6. Write out your session history to file `example.sh` (and delete the last line to remove the history command)
```sh
history -a "example.sh"
sed -i '$ d' example.sh
```
7. Exit playground
```sh
exit
```
8. Run batch process
```sh
./orcli run example.sh
```
9. Clean up example files
```sh
rm duplicates.tsv
rm example.sh
```
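Step 6 relies on two details that are easy to miss: `history -a` appends the session's new history lines to the given file, so the `history -a` command itself ends up as the last line, and `sed -i '$ d'` then deletes that last line in place. The sketch below reproduces the trimming on a throwaway file. The function name `drop_last_line` and the sample contents are illustrative only, and the `-i` form shown is GNU sed syntax.

```sh
# Illustrative helper (not part of orcli): delete the last line of a file
# in place. Uses GNU sed syntax; BSD/macOS sed needs: sed -i '' '$ d' FILE
drop_last_line() {
  sed -i '$ d' "$1"
}

# Throwaway stand-in for the file written by `history -a "example.sh"`
demo="$(mktemp)"
printf '%s\n' \
  'orcli import csv "https://git.io/fj5hF" --projectName "duplicates"' \
  'orcli info "duplicates"' \
  'history -a "example.sh"' > "$demo"

drop_last_line "$demo"
cat "$demo"   # the two orcli lines remain, the history line is gone
rm "$demo"
```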
## Usage
Use integrated help screens for available options and examples for each command.
* Use integrated help screens for available options and examples for each command.
```sh
orcli --help
```
```sh
orcli --help
```
* If your OpenRefine is running on a server, then use the environment variable OPENREFINE_URL.
```sh
OPENREFINE_URL="http://localhost:3333" orcli list
```
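The prefix form above sets `OPENREFINE_URL` for a single command only. The following sketch contrasts that with `export`, using `sh -c` as a stand-in for an orcli call so that it runs without OpenRefine. The URL is the one from the example; the `unset` fallback text is purely illustrative.

```sh
unset OPENREFINE_URL

# One-off: the variable exists only in the environment of this one child process
OPENREFINE_URL="http://localhost:3333" sh -c 'echo "one-off: ${OPENREFINE_URL:-unset}"'

# The current shell and later commands do not see it
sh -c 'echo "afterwards: ${OPENREFINE_URL:-unset}"'

# Session-wide: every later command (including orcli) inherits the value
export OPENREFINE_URL="http://localhost:3333"
sh -c 'echo "exported: ${OPENREFINE_URL:-unset}"'
```

The one-off form suits a single call against a remote server, while `export` is more convenient for a whole session or a batch script.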
## Development
@@ -54,14 +133,14 @@ orcli uses [bashly](https://github.com/DannyBen/bashly/) for generating the one-
1. Install bashly (requires ruby)
```sh
gem install bashly
```
```sh
gem install bashly
```
2. Edit code in [src](src) directory
3. Generate script
```sh
bashly generate --upgrade
```
```sh
bashly generate --upgrade
```

orcli (18 changes)

@@ -35,10 +35,10 @@ orcli_usage() {
# :command.usage_commands
printf "Commands:\n"
echo " completions Generate bash completions"
echo " import import commands"
echo " import commands to create OpenRefine projects from files or URLs"
echo " list list projects on OpenRefine server"
echo " info show project metadata"
echo " export export commands"
echo " info show OpenRefine project's metadata"
echo " export commands to export data from OpenRefine projects to files"
echo " run run tmp OpenRefine workspace and execute shell script(s)"
echo
@@ -122,11 +122,11 @@ orcli_completions_usage() {
# :command.usage
orcli_import_usage() {
if [[ -n $long_usage ]]; then
printf "orcli import - import commands\n"
printf "orcli import - commands to create OpenRefine projects from files or URLs\n"
echo
else
printf "orcli import - import commands\n"
printf "orcli import - commands to create OpenRefine projects from files or URLs\n"
echo
fi
@@ -253,11 +253,11 @@ orcli_list_usage() {
# :command.usage
orcli_info_usage() {
if [[ -n $long_usage ]]; then
printf "orcli info - show project metadata\n"
printf "orcli info - show OpenRefine project's metadata\n"
echo
else
printf "orcli info - show project metadata\n"
printf "orcli info - show OpenRefine project's metadata\n"
echo
fi
@@ -296,11 +296,11 @@ orcli_info_usage() {
# :command.usage
orcli_export_usage() {
if [[ -n $long_usage ]]; then
printf "orcli export - export commands\n"
printf "orcli export - commands to export data from OpenRefine projects to files\n"
echo
else
printf "orcli export - export commands\n"
printf "orcli export - commands to export data from OpenRefine projects to files\n"
echo
fi


@@ -38,7 +38,7 @@ commands:
Usage: eval "\$(orcli completions)"
- name: import
help: import commands
help: commands to create OpenRefine projects from files or URLs
commands:
- name: csv
@@ -77,7 +77,7 @@ commands:
help: list projects on OpenRefine server
- name: info
help: show project metadata
help: show OpenRefine project's metadata
args:
- name: project
help: project name or id
@@ -87,7 +87,7 @@ commands:
- info 1234567890123
- name: export
help: export commands
help: commands to export data from OpenRefine projects to files
commands:
- name: tsv