orcli/README.md

1.9 KiB

orcli (💎+🤖)

Bash script to control OpenRefine via its HTTP API.

Features

  • works with latest OpenRefine version (currently 3.5)
  • batch processing (import, transform, export)
    • orcli takes care of starting and stopping OpenRefine with temporary workspaces
    • your existing OpenRefine data will not be touched
  • import CSV, TSV, line-based TXT, fixed-width TXT, JSON or XML (and specify input options)
    • supports stdin, multiple files and URLs
  • transform data by providing an undo/redo JSON file
    • orcli calls specific endpoints for each operation to provide improved error handling and logging
    • supports stdin, multiple files and URLs
  • export to TSV, CSV, HTML, XLS, XLSX, ODS
  • templating export to additional formats like JSON or XML

Requirements

Install

  1. Navigate to the OpenRefine program directory

  2. Download bash script there and make it executable

wget https://github.com/opencultureconsulting/orcli/raw/main/orcli
chmod +x orcli
  1. Create a symlink in your $PATH (e.g. to ~/.local/bin)
ln -s "${PWD}/orcli" ~/.local/bin/

Usage

Ensure you have OpenRefine running (i.e. available at http://localhost:3333 or another URL) or use the integrated start command first.

Use integrated help screens for available options and examples for each command.

orcli --help

Development

orcli uses bashly for generating the one-file script from files in the src directory

  1. Install bashly (requires ruby)
gem install bashly
  1. Edit code in src directory

  2. Generate script

bashly generate --upgrade