Compare commits


No commits in common. "master" and "v0.3.10" have entirely different histories.

8 changed files with 47 additions and 62 deletions

.gitignore

@ -4,6 +4,7 @@ dist
.*
openrefine_client.egg-info
refine.spec
-openrefine-*
+openrefine-2.*
+openrefine-3.*
openrefine-client_*
tests-cli.log


@ -1,12 +1,12 @@
-# OpenRefine Python Client with extended command line interface (⌨️ for 💎)
+# OpenRefine Python Client with extended command line interface
-[![Codacy Badge](https://app.codacy.com/project/badge/Grade/43ad9bfd707b4627bd45e5c5f912a8e0)](https://www.codacy.com/gh/opencultureconsulting/openrefine-client/dashboard) [![Docker](https://img.shields.io/microbadger/image-size/felixlohmeier/openrefine-client?label=docker)](https://hub.docker.com/r/felixlohmeier/openrefine-client/) [![PyPI](https://img.shields.io/pypi/v/openrefine-client)](https://pypi.org/project/openrefine-client/) [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/felixlohmeier/openrefineder/master)
+[![Codacy Badge](https://api.codacy.com/project/badge/Grade/33129bd15cdc4ece88c8012caab8d347)](https://www.codacy.com/app/felixlohmeier/openrefine-client?utm_source=github.com&utm_medium=referral&utm_content=opencultureconsulting/openrefine-client&utm_campaign=Badge_Grade) [![Docker](https://img.shields.io/microbadger/image-size/felixlohmeier/openrefine-client?label=docker)](https://hub.docker.com/r/felixlohmeier/openrefine-client/) [![PyPI](https://img.shields.io/pypi/v/openrefine-client)](https://pypi.org/project/openrefine-client/) [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/felixlohmeier/openrefineder/master)
The [OpenRefine Python Client from PaulMakepeace](https://github.com/PaulMakepeace/refine-client-py) provides a library for communicating with an [OpenRefine](http://openrefine.org) server.
This fork extends the command line interface (CLI) and is distributed as a convenient one-file-executable (Windows, Linux, macOS).
It is also available via Docker Hub, PyPI and Binder.
-works with OpenRefine 2.7, 2.8, 3.0, 3.1, 3.2, 3.3, 3.4, 3.4.1, 3.5.0
+works with OpenRefine 2.7, 2.8, 3.0, 3.1, 3.2, 3.3, 3.4, 3.4.1
## Download
@ -248,7 +248,7 @@ openrefine-client --create combined.zip --format csv --projectName myproject --i
### See also
- Linux Bash script to run OpenRefine in batch mode (import, transform, export): [openrefine-batch](https://github.com/opencultureconsulting/openrefine-batch)
-- [Jupyter notebook demonstrating usage in Linux Bash](https://nbviewer.jupyter.org/github/felixlohmeier/openrefineder/blob/master/notebooks/openrefine-client-bash.ipynb)
+- [Jupyter notebook demonstrating usage in Linux Bash](https://nbviewer.jupyter.org/github/felixlohmeier/openrefineder/blob/master/openrefine-client-bash.ipynb)
- Use case [HOS-MetadataTransformations](https://github.com/subhh/HOS-MetadataTransformations): Automated workflow for harvesting, transforming and indexing of metadata using metha, OpenRefine and Solr. Part of the Hamburg Open Science "Schaufenster" software stack.
- Use case [Data processing of ILS data to facilitate a new discovery layer for the German Literature Archive (DLA)](https://doi.org/10.5281/zenodo.2678113): Custom data processing pipeline based on Pandas (a Python library) and OpenRefine.
@ -297,7 +297,7 @@ Run openrefine-client linked to a dockerized OpenRefine ([felixlohmeier/openrefi
2. Run server (will be available at http://localhost:3333)
```sh
-docker run -d -p 3333:3333 --network=openrefine --name=openrefine-server felixlohmeier/openrefine:3.5.0
+docker run -d -p 3333:3333 --network=openrefine --name=openrefine-server felixlohmeier/openrefine:3.4.1
```
3. Run client with some [basic commands](#basic-commands): 1. download example files, 2. create project from file, 3. list projects, 4. show metadata, 5. export to terminal, 6. apply transformation rules (deduplication), 7. export again to terminal, 8. export to xls file and 9. delete project
@ -337,7 +337,7 @@ Customize OpenRefine server:
- Example for [allocating more memory](https://github.com/OpenRefine/OpenRefine/wiki/FAQ#out-of-memory-errors---feels-slow---could-not-reserve-enough-space-for-object-heap) to OpenRefine with additional option `-m 4G`
```sh
-docker run -d -p 3333:3333 --network=openrefine --name=openrefine-server felixlohmeier/openrefine:3.5.0 -i 0.0.0.0 -d /data -m 4G
+docker run -d -p 3333:3333 --network=openrefine --name=openrefine-server felixlohmeier/openrefine:3.4.1 -i 0.0.0.0 -d /data -m 4G
```
- The OpenRefine version is defined by the docker tag.
@ -624,8 +624,8 @@ See also:
- free to use on-demand server with Jupyter notebook, OpenRefine and Bash
- no registration needed, will start within a few minutes
- [restricted](https://mybinder.readthedocs.io/en/latest/faq.html#how-much-memory-am-i-given-when-using-binder) to 2 GB RAM and server will be deleted after 10 minutes of inactivity
-- [bash_kernel demo notebook](https://nbviewer.jupyter.org/github/felixlohmeier/openrefineder/blob/master/notebooks/openrefine-client-bash.ipynb) for using the openrefine-client in a Linux Bash environment [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/felixlohmeier/openrefineder/master?urlpath=/tree/notebooks/openrefine-client-bash.ipynb)
-- [python2 demo notebook](https://nbviewer.jupyter.org/github/felixlohmeier/openrefineder/blob/master/notebooks/openrefine-client-python.ipynb) for using the openrefine-client in a Python 2 environment [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/felixlohmeier/openrefineder/master?urlpath=/tree/notebooks/openrefine-client-python.ipynb)
+- [bash_kernel demo notebook](https://nbviewer.jupyter.org/github/felixlohmeier/openrefineder/blob/master/openrefine-client-bash.ipynb) for using the openrefine-client in a Linux Bash environment [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/felixlohmeier/openrefineder/master?urlpath=/tree/openrefine-client-bash.ipynb)
+- [python2 demo notebook](https://nbviewer.jupyter.org/github/felixlohmeier/openrefineder/blob/master/openrefine-client-python.ipynb) for using the openrefine-client in a Python 2 environment [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/felixlohmeier/openrefineder/master?urlpath=/tree/openrefine-client-python.ipynb)
## Development
@ -651,42 +651,42 @@ The Python client library includes several unit tests.
There is also a script that uses docker images to run the unit tests with different versions of OpenRefine.
-- run tests on all OpenRefine versions (from 2.0 up to 3.5.0)
+- run tests on all OpenRefine versions (from 2.0 up to 3.4.1)
```sh
./tests.sh -a
```
-- run tests on tag 3.5.0
+- run tests on tag 3.4.1
```sh
-./tests.sh -t 3.5.0
+./tests.sh -t 3.4.1
```
-- run tests on tag 3.5.0 interactively (pause before and after tests)
+- run tests on tag 3.4.1 interactively (pause before and after tests)
```sh
-./tests.sh -t 3.5.0 -i
+./tests.sh -t 3.4.1 -i
```
-- run tests on tags 3.5.0 and 2.7
+- run tests on tags 3.4.1 and 2.7
```sh
-./tests.sh -t 3.5.0 -t 2.7
+./tests.sh -t 3.4.1 -t 2.7
```
For Linux there are also functional tests for all command line options.
-- run all functional tests on OpenRefine 3.5.0
+- run all functional tests on OpenRefine 3.4
```sh
-./tests-cli.sh 3.5.0
+./tests-cli.sh 3.4.1
```
-- run all functional tests on OpenRefine 3.5.0 with one-file-executable
+- run all functional tests on OpenRefine 3.4 with one-file-executable
```sh
-./tests-cli.sh 3.5.0 openrefine-client_0-3-7_linux
+./tests-cli.sh 3.4.1 openrefine-client_0-3-7_linux
```
### Distributing
@ -696,7 +696,7 @@ Note to myself: When releasing a new version...
1. Run functional tests
```sh
-for v in 2.7 2.8 3.0 3.1 3.2 3.3 3.4 3.4.1 3.5.0; do
+for v in 2.7 2.8 3.0 3.1 3.2 3.3 3.4 3.4.1; do
./tests-cli.sh $v
done
```
@ -728,7 +728,7 @@ Note to myself: When releasing a new version...
4. Run functional tests with Linux executable
```sh
-for v in 2.7 2.8 3.0 3.1 3.2 3.3 3.4 3.4.1 3.5.0; do
+for v in 2.7 2.8 3.0 3.1 3.2 3.3 3.4 3.4.1; do
./tests-cli.sh $v openrefine-client_0-3-7_linux
done
```
@ -752,7 +752,7 @@ Note to myself: When releasing a new version...
8. Bump openrefine-client version in related projects
- openrefine-batch: [openrefine-batch.sh](https://github.com/opencultureconsulting/openrefine-batch/blob/master/openrefine-batch.sh#L7) and [openrefine-batch-docker.sh](https://github.com/opencultureconsulting/openrefine-batch/blob/master/openrefine-batch-docker.sh)
-- openrefineder: [postBuild](https://github.com/felixlohmeier/openrefineder/blob/master/binder/postBuild)
+- openrefineder: [postBuild](https://github.com/felixlohmeier/openrefineder/blob/master/postBuild)
## Credits


@ -97,8 +97,7 @@ class RefineServer(object):
         try:
             response = urllib2.urlopen(req)
         except urllib2.HTTPError as e:
-            raise Exception('HTTP %d "%s" for %s\n\t%s' %
-                            (e.code, e.msg, e.geturl(), data))
+            raise Exception('HTTP %d "%s" for %s\n\t%s' % (e.code, e.msg, e.geturl(), data))
         except urllib2.URLError as e:
             raise urllib2.URLError(
                 '%s for %s. No Refine server reachable/running; ENV set?' %
@ -114,10 +113,6 @@ class RefineServer(object):
"""Open a Refine URL, optionally POST data, and return parsed JSON."""
         response = json.loads(self.urlopen(*args, **kwargs).read())
         if 'code' in response and response['code'] not in ('ok', 'pending'):
-            if 'Missing or invalid csrf_token parameter' == response['message']:
-                self.get_csrf_token()
-                response = json.loads(self.urlopen(*args, **kwargs).read())
-                return response
             error_message = ('server ' + response['code'] + ': ' +
                              response.get('message', response.get('stack', response)))
             raise Exception(error_message)
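
One side of this hunk retries the request once after refreshing the CSRF token when the server answers `Missing or invalid csrf_token parameter`. A minimal standalone sketch of that retry pattern follows. The function name and the `fetch` / `refresh_token` callables are hypothetical stand-ins (for the parsed `self.urlopen(...)` result and `self.get_csrf_token()`), and the use of `.get('message')` instead of `['message']` is an assumption made here to tolerate error responses without a message.

```python
CSRF_ERROR = 'Missing or invalid csrf_token parameter'


def fetch_json_with_csrf_retry(fetch, refresh_token):
    """Return fetch()'s parsed response, retrying once after a CSRF error.

    fetch and refresh_token are hypothetical callables, not client API.
    """
    response = fetch()
    if 'code' in response and response['code'] not in ('ok', 'pending'):
        if response.get('message') == CSRF_ERROR:
            # Token missing or stale: refresh it and repeat the request once.
            refresh_token()
            return fetch()
        # Any other server-side error is raised, as in the surrounding code.
        raise Exception('server %s: %s' % (
            response['code'],
            response.get('message', response.get('stack', response))))
    return response
```

As in the hunk, the retried response is returned without a second error check, so a request that fails twice hands the error payload back to the caller instead of raising.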
@ -418,10 +413,7 @@ class RefineProject:
         for i, column in enumerate(column_model['columns']):
             name = column['name']
             self.column_order[name] = i
-            try:
-                column_index[name] = column['cellIndex']
-            except KeyError:
-                column_index[name] = i
+            column_index[name] = column['cellIndex']
         self.key_column = column_model['keyColumnName']
         self.has_records = response['recordModel'].get('hasRecords', False)
         self.rows_response_factory = RowsResponseFactory(column_index)
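
The two sides of this hunk differ in how they handle a column entry that has no `cellIndex` key: one reads the key directly (raising `KeyError` if it is absent), the other falls back to the column's position in the list. A minimal sketch of the fallback variant, written with `dict.get` instead of `try`/`except`; the helper name `build_column_index` is hypothetical and the sample data is made up for illustration.

```python
def build_column_index(columns):
    """Map each column name to its cell index.

    Falls back to the column's list position when 'cellIndex' is absent.
    Hypothetical helper, not part of the client library.
    """
    column_index = {}
    for i, column in enumerate(columns):
        column_index[column['name']] = column.get('cellIndex', i)
    return column_index


# Made-up column model: the second entry has no 'cellIndex'.
example = [{'name': 'email', 'cellIndex': 3}, {'name': 'name'}]
print(build_column_index(example))
```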


@ -17,8 +17,8 @@
# along with this program. If not, see <http://www.gnu.org/licenses/>
# defaults:
-all=(3.5.0 3.4.1 3.4 3.3 3.2-java12 3.2-java11 3.2-java10 3.2-java9 3.2 3.1-java9 3.1 3.0-java9 3.0 2.8-java9 2.8 2.8-java7 2.7 2.7-java7 2.5-java7 2.5-java6 2.1-java6 2.0-java6)
-main=(3.5.0 3.4.1 3.4 3.3 3.2 3.1 3.0 2.8 2.7 2.5-java6 2.1-java6 2.0-java6)
+all=(3.4.1 3.4 3.3 3.2-java12 3.2-java11 3.2-java10 3.2-java9 3.2 3.1-java9 3.1 3.0-java9 3.0 2.8-java9 2.8 2.8-java7 2.7 2.7-java7 2.5-java7 2.5-java6 2.1-java6 2.0-java6)
+main=(3.4.1 3.4 3.3 3.2 3.1 3.0 2.8 2.7 2.5-java6 2.1-java6 2.0-java6)
interactively=false
port="3333"
@ -31,10 +31,10 @@ Script for running tests with different OpenRefine and Java versions.
It uses docker images from https://hub.docker.com/r/felixlohmeier/openrefine.
Examples:
-./tests.sh -a # run tests on all OpenRefine versions (from 2.0 up to 3.5.0)
-./tests.sh -t 3.5.0 # run tests on tag 3.5.0
-./tests.sh -t 3.5.0 -i # run tests on tag 3.5.0 interactively (pause before and after tests)
-./tests.sh -t 3.5.0 -t 2.7 # run tests on tags 3.5.0 and 2.7
+./tests.sh -a # run tests on all OpenRefine versions (from 2.0 up to 3.4.1)
+./tests.sh -t 3.4.1 # run tests on tag 3.4.1
+./tests.sh -t 3.4.1 -i # run tests on tag 3.4.1 interactively (pause before and after tests)
+./tests.sh -t 3.4.1 -t 2.7 # run tests on tags 3.4.1 and 2.7
Advanced:
./tests.sh -j # run tests on all OpenRefine versions and each with all supported Java versions (requires a lot of docker images to be downloaded!)


@ -32,8 +32,7 @@ DATA
# ================================== ACTION ================================== #
-# OpenRefine 4.x fails without manually set headerLines
-${cmd} --create "tmp/${t}/${t}.csv" --processQuotes "false" --headerLines 1
+${cmd} --create "tmp/${t}/${t}.csv" --processQuotes "false"
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #


@ -38,13 +38,8 @@ DATA
cat << "DATA" > "tmp/${t}/${t}.transform"
[
{
-"op": "core/column-reorder",
-"columnNames": [
-"record - a",
-"record - b",
-"record - c"
-],
-"description": "Reorder columns"
+"op": "core/column-removal",
+"columnName": "record"
},
{
"op": "core/row-removal",


@ -38,13 +38,12 @@ DATA
cat << "DATA" > "tmp/${t}/${t}.transform"
[
{
-"op": "core/column-reorder",
-"columnNames": [
-"root - record - icon",
-"root - record - code",
-"root - record - meaning"
-],
-"description": "Reorder columns"
+"op": "core/column-removal",
+"columnName": "root"
+},
+{
+"op": "core/column-removal",
+"columnName": "root - record"
},
{
"op": "core/row-removal",


@ -38,13 +38,12 @@ DATA
cat << "DATA" > "tmp/${t}/${t}.transform"
[
{
-"op": "core/column-reorder",
-"columnNames": [
-"root - record - a",
-"root - record - b",
-"root - record - c"
-],
-"description": "Reorder columns"
+"op": "core/column-removal",
+"columnName": "root"
+},
+{
+"op": "core/column-removal",
+"columnName": "root - record"
},
{
"op": "core/row-removal",