OpenRefine command-line interface written in Bash. Supports batch processing (import, transform, export).
Felix Lohmeier fe2d503f64
Merge pull request #50 from opencultureconsulting:felixlohmeier/batch-should-run-49
set headless=true
2022-04-20 12:52:14 +02:00
src set headless=true 2022-04-20 10:51:51 +00:00
.gitignore setup dev environment 2022-03-25 11:55:57 +01:00
.gitpod.yml create orcli symlink in $PATH 2022-04-20 09:36:03 +02:00
LICENSE Initial commit 2022-03-25 10:34:28 +01:00
README.md first draft batch processing 2022-04-20 10:27:53 +00:00
orcli first draft batch processing 2022-04-20 10:27:53 +00:00

README.md

orcli (💎+🤖)

Bash script to control OpenRefine via its HTTP API.

Features

  • works with latest OpenRefine version (currently 3.5)
  • batch processing (import, transform, export)
    • orcli takes care of starting and stopping OpenRefine with temporary workspaces
    • your existing OpenRefine data will not be touched
  • import CSV, TSV, line-based TXT, fixed-width TXT, JSON or XML (and specify input options)
    • supports stdin, multiple files and URLs
  • transform data by providing an undo/redo JSON file
    • orcli calls specific endpoints for each operation to provide improved error handling and logging
    • supports stdin, multiple files and URLs
  • export to TSV, CSV, HTML, XLS, XLSX, ODS
  • templating export to additional formats like JSON or XML

Requirements

Install

  1. Navigate to the OpenRefine program directory

  2. Download the Bash script there and make it executable

wget https://github.com/opencultureconsulting/orcli/raw/main/orcli
chmod +x orcli
  3. Create a symlink in your $PATH (e.g. in ~/.local/bin)
ln -s "${PWD}/orcli" ~/.local/bin/
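
The symlink only helps if its target directory is actually listed in $PATH. A minimal sketch for checking this, assuming the ~/.local/bin location from the step above; `path_contains` is a hypothetical helper and not part of orcli:

```shell
#!/usr/bin/env bash
# Sketch: report whether a directory is listed in $PATH.
# path_contains is a hypothetical helper, not part of orcli.

path_contains() {
  local dir="${1%/}"   # strip one trailing slash before comparing
  case ":${PATH}:" in
    *":${dir}:"*|*":${dir}/:"*) return 0 ;;
    *) return 1 ;;
  esac
}

link_dir="${HOME}/.local/bin"
if path_contains "${link_dir}"; then
  echo "${link_dir} is in PATH"
else
  echo "${link_dir} is not in PATH; add it in your shell profile"
fi
```

If the directory does not exist yet, create it with `mkdir -p ~/.local/bin` before running the `ln -s` command.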

Usage

Ensure that OpenRefine is running and reachable (at http://localhost:3333 by default, or at another URL set via OPENREFINE_URL), or use the integrated batch command, which starts a temporary OpenRefine workspace for you.
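
Whether a server is reachable can be checked before calling orcli. The following is a minimal sketch, assuming curl is installed; `refine_url` and `refine_is_up` are hypothetical helpers and not orcli commands, and the check queries OpenRefine's `get-version` endpoint directly rather than going through orcli:

```shell
#!/usr/bin/env bash
# Sketch: check that an OpenRefine server answers before calling orcli.
# refine_url and refine_is_up are hypothetical helpers, not orcli commands.

refine_url() {
  # Fall back to the documented default when OPENREFINE_URL is unset or empty
  printf '%s\n' "${OPENREFINE_URL:-http://localhost:3333}"
}

refine_is_up() {
  # get-version is a lightweight OpenRefine endpoint that returns JSON
  curl --silent --fail --max-time 5 \
    "$(refine_url)/command/core/get-version" >/dev/null
}

if refine_is_up; then
  echo "OpenRefine is reachable at $(refine_url)"
else
  echo "OpenRefine is not reachable at $(refine_url)" >&2
fi
```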

Use the integrated help screens to see the available options and examples for each command.

$ orcli --help
orcli - OpenRefine command-line interface written in Bash

Usage:
  orcli [command]
  orcli [command] --help | -h
  orcli --version | -v

Commands:
  batch    start tmp OpenRefine workspace and run multiple orcli commands
  import   import commands
  list     list projects on OpenRefine server
  info     show project metadata
  export   export commands

Options:
  --help, -h
    Show this help

  --version, -v
    Show version number

Environment Variables:
  OPENREFINE_URL
    URL to OpenRefine server
    Default: http://localhost:3333

Examples:
  orcli import csv "https://git.io/fj5hF" --projectName "duplicates"
  orcli list
  orcli info "duplicates"
  orcli export tsv "duplicates"
  orcli export tsv "duplicates" --output "duplicates.tsv"
  orcli batch \
    import csv "https://git.io/fj5hF" --projectName "duplicates" \
    info "duplicates" \
    export tsv "duplicates"

https://github.com/opencultureconsulting/orcli
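
The individual example commands from the help screen can also be chained in a small script against an already running server. This is only a sketch: `run`, `refine_workflow`, and the `DRY_RUN` switch are hypothetical helpers introduced for illustration, and only the orcli calls themselves are taken from the examples above. By default the script just prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch: chain the documented import/info/export steps in one script.
# run(), refine_workflow() and DRY_RUN are hypothetical helpers; only the
# orcli calls themselves come from the help screen.

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

run() {
  if [ "${DRY_RUN}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

refine_workflow() {
  local project="$1" source="$2" output="$3"
  # Stop at the first failing step
  run orcli import csv "${source}" --projectName "${project}" &&
    run orcli info "${project}" &&
    run orcli export tsv "${project}" --output "${output}"
}

refine_workflow "duplicates" "https://git.io/fj5hF" "duplicates.tsv"
```

Set `DRY_RUN=0` to execute the commands. Unlike `orcli batch`, this variant expects an OpenRefine server to be reachable at OPENREFINE_URL already.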

Development

orcli uses bashly to generate the one-file script from the files in the src directory.

  1. Install bashly (requires ruby)
gem install bashly
  2. Edit code in src directory

  3. Generate script

bashly generate
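
After regenerating, a quick syntax check can catch obvious breakage. A minimal sketch, assuming that bashly writes the generated script to a file named orcli in the repository root (as the file listing suggests); `check_syntax` and `regenerate` are hypothetical helpers:

```shell
#!/usr/bin/env bash
# Sketch: regenerate the script and run a basic syntax check on the result.
# check_syntax and regenerate are hypothetical helpers; "orcli" as the
# output file name is an assumption based on the repository layout.

check_syntax() {
  # bash -n parses the file without executing it
  bash -n "$1" && echo "$1: syntax OK"
}

regenerate() {
  local script="${1:-orcli}"
  if ! command -v bashly >/dev/null 2>&1; then
    echo "bashly not found; install it with: gem install bashly" >&2
    return 1
  fi
  bashly generate || return 1
  check_syntax "${script}"
}
```

Run `regenerate` from the repository root after editing files in src.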