This fork extends the command line interface (CLI) and is distributed as a convenient one-file-executable (Windows, Linux, Mac). It is also available via Docker Hub, PyPI and Binder.

OpenRefine Python Client with extended command line interface

The OpenRefine Python Client Library from PaulMakepeace provides an interface for communicating with an OpenRefine server. This fork extends the command line interface (CLI) and supports communication between Docker containers.

Download

One-file-executables:

For native Python installation on Windows, Mac or Linux see Installation below.

Peek

A short video loop that demonstrates the basic features (list, create, apply, export)

Usage

Command line interface:

  • list all projects: --list
  • create project from file: --create [FILE]
  • apply rules from json file: --apply [FILE.json] [PROJECTID/PROJECTNAME]
  • export project to file: --export [PROJECTID/PROJECTNAME] --output=FILE.tsv
  • templating export: --export "My Address Book" --template='{ "friend" : {{jsonize(cells["friend"].value)}}, "address" : {{jsonize(cells["address"].value)}} }' --prefix='{ "address" : [' --rowSeparator=',' --suffix='] }' --filterQuery="^mary$"
  • show project metadata: --info [PROJECTID/PROJECTNAME]
  • delete project: --delete [PROJECTID/PROJECTNAME]
  • check --help for further options...
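The templating export takes several quoted arguments, which are easy to get wrong in a shell. When calling the CLI from a Python script, building the arguments as a list avoids shell quoting altogether. The following is a minimal sketch and not part of this client: the helper name `build_template_export_args` is hypothetical, and the executable name `openrefine-client` is an assumption (substitute the path of your one-file-executable or `python refine.py`). The flags are the ones listed above.

```python
def build_template_export_args(project, template, prefix=None, suffix=None,
                               row_separator=None, filter_query=None,
                               executable='openrefine-client'):
    """Return an argument list for a templating export (hypothetical helper).

    The executable name is an assumption; the flags are taken from the
    usage list above.
    """
    args = [executable, '--export', project, '--template=' + template]
    optional = [
        ('--prefix', prefix),
        ('--rowSeparator', row_separator),
        ('--suffix', suffix),
        ('--filterQuery', filter_query),
    ]
    # Append only the options that were actually given.
    for flag, value in optional:
        if value is not None:
            args.append(flag + '=' + value)
    return args


# Same values as the templating example in the usage list.
TEMPLATE = ('{ "friend" : {{jsonize(cells["friend"].value)}}, '
            '"address" : {{jsonize(cells["address"].value)}} }')

ARGS = build_template_export_args(
    'My Address Book', TEMPLATE,
    prefix='{ "address" : [', suffix='] }',
    row_separator=',', filter_query='^mary$')
```

With an OpenRefine server running, the list can be passed to `subprocess.call(ARGS)` from the standard library; no shell is involved, so the braces and quotes in the template need no escaping.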

If you are familiar with Python, you can try all functions interactively (python -i refine.py) or use this library in your own Python scripts. Some examples:

  • show version of OpenRefine server: refine.RefineServer().get_version()
  • show total rows of project 2151545447855: refine.RefineProject(refine.RefineServer(),'2151545447855').do_json('get-rows')['total']
  • compute clusters of project 2151545447855 and column key: refine.RefineProject(refine.RefineServer(),'2151545447855').compute_clusters('key')
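For use in your own scripts, one-liners like the ones above can be wrapped in small functions. This is a minimal sketch: `project_row_count` is a hypothetical helper, and `FakeProject` is a stand-in that only mimics the part of the `get-rows` response used above (a dict with a `total` key), so the snippet runs without a server.

```python
def project_row_count(project):
    """Return the total number of rows reported by a project-like object."""
    return project.do_json('get-rows')['total']


class FakeProject(object):
    """Offline stand-in for refine.RefineProject (illustration only)."""

    def __init__(self, total):
        self._total = total

    def do_json(self, command):
        # Only the command used in the example above is emulated.
        if command != 'get-rows':
            raise ValueError('unsupported command: ' + command)
        return {'total': self._total}


ROW_COUNT = project_row_count(FakeProject(total=42))
```

Against a live server, pass a real project instead of the stand-in, e.g. `project_row_count(refine.RefineProject(refine.RefineServer(), '2151545447855'))`.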

Configuration

By default the OpenRefine server URL is http://127.0.0.1:3333

To override the host and port, set the environment variables OPENREFINE_HOST and OPENREFINE_PORT or use the command line options -H and -P.
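The lookup can be pictured roughly as follows. This is a sketch, not the client's actual code: `resolve_server_url` is a hypothetical helper, its `host` and `port` arguments stand in for the -H and -P options, and the precedence shown (explicit option first, then environment variable, then default) is an assumption.

```python
import os

DEFAULT_HOST = '127.0.0.1'
DEFAULT_PORT = '3333'


def resolve_server_url(host=None, port=None, environ=None):
    """Build the server URL from options, environment and defaults.

    Hypothetical helper; the precedence is an assumption:
    explicit argument, then environment variable, then default.
    """
    if environ is None:
        environ = os.environ
    host = host or environ.get('OPENREFINE_HOST', DEFAULT_HOST)
    port = port or environ.get('OPENREFINE_PORT', DEFAULT_PORT)
    return 'http://{0}:{1}'.format(host, port)
```

With no options and no environment variables set, the helper returns the default URL given above. Passing `environ` explicitly keeps the function easy to test without touching the real environment.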

Installation

pip install openrefine-client

(requires Python 2.x, depends on urllib2_file>=0.2.1)

Tests

Ensure you have an OpenRefine server running somewhere and, if necessary, set the environment variables described under Configuration.

Run tests, build, and install:

python setup.py test # to do a subset, e.g., --test-suite tests.test_facet

python setup.py build

python setup.py install

There is a Makefile that will do this too, and more.

Credits

Paul Makepeace, author

David Huynh, [initial cut](http://markmail.org/message/jsxzlcu3gn6drtb7)

Artfinder, inspiration

Felix Lohmeier, extended the CLI features

Some data used in the test suite has been taken from publicly available sources,