Compare commits

...

138 Commits

Author SHA1 Message Date
Felix Lohmeier
02cf1192c4
Merge pull request #21 from mohammed78620/master
moved csrf token check into urlopen_json
2022-01-17 23:06:14 +01:00
mohammed akmal miah
e4d52818fc remove csrf check in urlopen 2022-01-17 12:33:05 +00:00
mohammed akmal miah
16541c522e update csrf token in urlopen_json 2022-01-17 12:30:07 +00:00
Felix Lohmeier
965c4e97fd
Merge pull request #20 from mohammed78620/master
updates csrf token when invalid
2022-01-17 10:34:26 +01:00
mohammed akmal miah
0563b54fc6 updates csrf token when invalid 2022-01-15 18:18:37 +00:00
Felix Lohmeier
fa3e352879 OpenRefine 3.5.0 2021-11-09 23:23:33 +01:00
Felix Lohmeier
1dd0cafd4e get-models differs between OpenRefine 4.x and OpenRefine 3.x
https://groups.google.com/g/openrefine-dev/c/N6tRlDBZ05g
2021-07-12 23:04:00 +02:00
Felix Lohmeier
a368147bdf adjust tests for OpenRefine 4.x 2021-07-12 22:51:48 +02:00
Felix Lohmeier
f66c88ee35 adjust tests for OpenRefine 3.5 2021-07-12 22:46:54 +02:00
Felix Lohmeier
2735db3f3f fix codacy badge 2021-02-12 13:39:52 +01:00
Felix Lohmeier
0df6050deb update links to openrefineder notebooks 2021-01-25 12:26:33 +01:00
Felix Lohmeier
65b0500d5e fix test create-json-trimStrings 2021-01-04 14:19:51 +01:00
Felix Lohmeier
a0274f6166 release 0.3.10 2021-01-04 14:01:54 +01:00
Felix Lohmeier
9eeebce47c improve functional tests for delete command 2021-01-04 13:44:03 +01:00
Felix Lohmeier
21bf1f2adc add CSRF token to API calls #7 #13 2021-01-04 13:42:13 +01:00
Felix Lohmeier
8a726d9a30 fix #15 unicode issues when piping stdout 2021-01-02 17:34:33 +01:00
Felix Lohmeier
bc89a98776 fix #14 export --format --output 2021-01-02 17:25:21 +01:00
Felix Lohmeier
82da3f7b4e add functional tests
all CLI options
replaces manual tests with jupyter notebook
2021-01-02 17:02:39 +01:00
Felix Lohmeier
356315bc9e instructions for appending data
as proposed by @axfelix in #12
2020-08-19 22:17:03 +02:00
Felix Lohmeier
41b90c38b3 pin base image to alpine:3.11 2020-08-08 12:46:23 +02:00
Felix Lohmeier
1c812d1253 update notes for distributing 2020-08-07 22:07:49 +02:00
Felix Lohmeier
f0d76b6acd release v0.3.9 2020-08-07 18:21:08 +02:00
Felix Lohmeier
ecf253ca44 fix #11 separator 2020-08-07 18:20:31 +02:00
Felix Lohmeier
c47ce10eba fix #6 templating with splitToFiles and mode record-based 2019-08-25 21:46:54 +02:00
Felix Lohmeier
c80bef3fb1 release v0.3.8 2019-08-22 03:18:36 +02:00
Felix Lohmeier
7b657eb76e workaround for pyinstaller on win32 2019-08-22 03:02:32 +02:00
Felix Lohmeier
d22022e273 tests with jupyter notebooks 2019-08-22 01:43:52 +02:00
Felix Lohmeier
6778c5cf73 workaround for pyinstaller 2019-08-22 01:33:06 +02:00
Felix Lohmeier
97bada2254 refactored info command for pyinstaller 2019-08-22 00:36:18 +02:00
Felix Lohmeier
bd13ffeb50 support unicode in args 2019-08-21 20:18:27 +02:00
Felix Lohmeier
240d0368f5 harmonize internal names for projectName 2019-08-21 19:50:45 +02:00
Felix Lohmeier
33430d5fe2 emoji in column name is really evil 2019-08-21 19:25:38 +02:00
Felix Lohmeier
3c16169767 add encoding option (defaults to UTF-8 for csv/tsv/txt) and fix templating feature suffixById 2019-08-21 19:24:13 +02:00
Felix Lohmeier
4ed6925b25 add example data with unicode chars 2019-08-21 16:49:39 +02:00
Felix Lohmeier
505a62afc2 support unicode chars in project name and column names 2019-08-21 16:48:43 +02:00
Felix Lohmeier
18a4d68b5c support unicode chars in export to stdout 2019-08-21 16:47:21 +02:00
Felix Lohmeier
41c1e618bb update links to openrefineder (new python2 demo notebook) 2019-08-20 16:47:58 +02:00
Felix Lohmeier
cb797fac51 move file back to prior location because it is referenced 2019-08-20 13:37:49 +02:00
Felix Lohmeier
f4f38d02fb fix whitespace (codacy issue) 2019-08-20 06:47:35 +02:00
Felix Lohmeier
2e6507bdf2 added code highlighting and improved pip install command 2019-08-20 06:35:33 +02:00
Felix Lohmeier
75e9a763d1 change python command to alias python2 2019-08-20 04:38:15 +02:00
Felix Lohmeier
375ac42be0 realigned create/new_project to upstream
new feature: xml root element will be discovered if recordPath is not set
bugfix: newly introduced option projectTags was not working in 0.3.7
bugfix: txt defaulted to fixed-width (should be line-based)
bugfix: default recordPath for json was not working in 0.3.7
bugfix: default sheets option was broken (but xls, xlsx, ods is broken in OpenRefine >=2.8 anyway, see #4)
tests: added sample files and an ipython notebook for comprehensive tests of create option
2019-08-20 04:30:50 +02:00
Felix Lohmeier
7ad79af3ca improved usage instructions for templating 2019-08-19 15:34:31 +02:00
Felix Lohmeier
abbef338ff fix typo 2019-08-16 13:37:45 +02:00
Felix Lohmeier
5730150b8c prepare distribution 2019-08-16 13:15:51 +02:00
Felix Lohmeier
caa2ebfde8 Workaround for SSL verification problems in one-file-executables 2019-08-16 13:15:24 +02:00
Felix Lohmeier
062e6960a8 remove unused import (codacy issue) 2019-08-16 12:27:27 +02:00
Felix Lohmeier
d9efd5c61b realign test to upstream (fix typo) 2019-08-16 11:39:03 +02:00
Felix Lohmeier
bfd00b55aa release v0.3.7 - substantially revised code and docs
fixed bug #1 (option columnWidths broken since v0.3.2)
fixed bug #2 (commands create and export templating broken since v0.3.5)
added --download command
extended --info command
improved performance of --export command
improved error handling and user feedback
removed support for legacy docker option --link
added detailed usage instructions with examples
moved and extended instructions from docker/README.md to README.md
added usage instructions for Python library
added chapter on Binder openrefineder
added badges for docker, pypi and binder
added usage instructions for tests script
added note to myself for distributing releases
moved all functions from parser to cli module
separated export and template function
improved code style (PEP8)
2019-08-16 10:58:41 +02:00
Felix Lohmeier
7d66993982 set required python version from >2.6 to >=2.7 (to be more explicit) 2019-08-16 00:23:26 +02:00
Felix Lohmeier
be439c986b templating export format should always be txt 2019-08-16 00:18:59 +02:00
Felix Lohmeier
777d73997c restore project_format default and remove unused code
create project from url has never worked before
new_project_defaults are outdated
2019-08-15 18:06:53 +02:00
Felix Lohmeier
e18b4d04be add json history file for test tutorial duplicates.csv 2019-08-14 13:46:44 +02:00
Felix Lohmeier
aa5b3a4203 realign to upstream 2019-08-14 13:45:35 +02:00
Felix Lohmeier
ad95432fc0 add codacy badge 2019-08-06 22:02:24 +02:00
Felix Lohmeier
6de4399012 improved script tests.sh
default to main versions
option for multiple tags
option for host port
improved usage of sudo
more examples in help screen
2019-08-06 22:01:20 +02:00
Felix Lohmeier
70782d8465 fix templating example 2019-08-06 21:58:28 +02:00
Felix Lohmeier
f0e6fbcd75 removed test for 3.1-java7 (OpenRefine 3.1 does not support java 7) 2019-08-06 00:47:39 +02:00
Felix Lohmeier
5819b15cf3 Script for running tests with different OpenRefine and Java versions based on Docker images. 2019-08-05 23:33:55 +02:00
Felix Lohmeier
b6f20f2e93 renamed module client.py to cli.py 2019-08-05 10:20:09 +02:00
Felix Lohmeier
4a455282f5 fix PyPI error (invalid classifier) 2019-08-04 03:26:57 +02:00
Felix Lohmeier
ece61c1096 prepare distribution on PyPI
moved parser from refine.py to google/refine/__main__.py to allow module execution (python -m google.refine); script refine.py is now just a forwarding to __main__.py
2019-08-04 03:03:49 +02:00
Felix Lohmeier
b8675fc894 fix comment in Dockerfile 2019-08-04 02:15:26 +02:00
Felix Lohmeier
1d0c4f828a moved functions from cli script into new module client.py 2019-08-04 02:15:07 +02:00
Felix Lohmeier
d82c7b28fb Revert "refactor to allow module execution"
This reverts commit a0123e951171cb22eddac6c65a23c2d8a28a88dc.
2019-08-03 13:23:32 +02:00
Felix Lohmeier
16560cb884 Revert "minor refactoring for dist"
This reverts commit 0c26aad39aa6ec6ec8681e76ad838ed2578cf0a1.
2019-08-03 13:23:10 +02:00
Felix Lohmeier
0c26aad39a minor refactoring for dist 2019-08-02 23:28:40 +02:00
Felix Lohmeier
a0123e9511 refactor to allow module execution 2019-08-01 17:23:40 +02:00
Felix Lohmeier
04db513453 restore curl for openrefine-batch 2019-07-29 23:12:09 +02:00
Felix Lohmeier
e03b3633e5 update metadata 2019-07-29 22:36:31 +02:00
Felix Lohmeier
aa844bde99 Update maintainer info to new format 2019-07-29 22:13:47 +02:00
Felix Lohmeier
221b83e805 create docker image with COPY instead of curl 2019-07-29 20:41:32 +02:00
Felix Lohmeier
fce77d8d78 new README with video demo 2019-05-08 23:23:55 +02:00
Felix Lohmeier
491795a30d fix file size in readme 2018-12-02 22:06:13 +01:00
Felix Lohmeier
0c0a0dfc1c added one-file-executable for mac 2018-12-02 22:04:14 +01:00
Felix Lohmeier
fd8f34be39 release v0.3.4 2017-12-11 17:45:26 +01:00
Felix Lohmeier
c896248c8c improved templating option splitToFiles 2017-12-11 17:32:10 +01:00
Felix Lohmeier
f7b33684b3 fix download link for windows 2017-11-23 16:15:05 +01:00
Felix Lohmeier
e31d565194 update version to v0.3.3 2017-11-23 02:00:39 +01:00
Felix Lohmeier
3e8f209a50 release v0.3.3: support for templating export, fixed some bugs 2017-11-23 01:48:29 +01:00
Felix Lohmeier
913e5eda56 Merge branch 'master' of github.com:felixlohmeier/openrefine-client 2017-11-20 12:22:35 +01:00
Felix Lohmeier
53fc89c147 added windows one-file-executable 2017-11-20 12:21:10 +01:00
Felix Lohmeier
9e97aff653 added windows one-file-executable 2017-11-20 12:19:20 +01:00
Felix Lohmeier
8253b8f794 added link to one-file-executable 2017-11-20 06:29:17 +01:00
Felix Lohmeier
1605ac2cab fix docker entrypoint 2017-11-20 05:28:30 +01:00
Felix Lohmeier
ed9e1e2afb update options to CamelCase 2017-11-20 04:55:51 +01:00
Felix Lohmeier
6c65f15363 fixed JSON post arguments 2017-11-20 04:55:11 +01:00
Felix Lohmeier
6262d703d3 improved parameter handling 2017-11-20 04:53:58 +01:00
Felix Lohmeier
28b4c7466b added chapter usage with examples 2017-11-19 23:29:45 +01:00
Felix Lohmeier
2061e804c3 OpenRefine 2.7/2.8 compatibility 2017-11-19 23:28:51 +01:00
Felix Lohmeier
058552aab6 refactored Dockerfile a bit 2017-11-19 23:27:40 +01:00
Felix Lohmeier
947c7510a6 refactored and extended CLI 2017-11-19 23:26:22 +01:00
Felix Lohmeier
31f06b35c4 ignore refine.spec (pyinstaller) 2017-11-19 23:24:52 +01:00
Felix Lohmeier
7a0f405007 Revert "added new function create_project and many options in CLI"
This reverts commit 35963dad38515740e86f6ebff27131863ed4b207.
2017-11-17 16:50:09 +01:00
Felix Lohmeier
b6ce0cf24c Revert "added options for separator (csv,tsv) and projectName (all)"
This reverts commit 6f8badae6a9e37d60ed7d7ea1d5a3a8f54fca4c1.
2017-11-17 16:49:56 +01:00
Felix Lohmeier
f0643b46a0 Revert "included urllib2_file.py in the package to ease installation"
This reverts commit bf91e918df59c0ce4b96f102db2639f0d7ed521b.
2017-11-17 16:47:31 +01:00
Felix Lohmeier
f70fed2966 Revert "removed urllib2 dependencies"
This reverts commit 44fd7c4611765579595cd908d66ceac3cc10cf98.
2017-11-17 16:46:37 +01:00
Felix Lohmeier
37004aadff Revert "minor changes to coding style"
This reverts commit 56e5ee96f5e1c06c0cb2175718b214d6b8b69ff8.
2017-11-17 12:43:48 +01:00
Felix Lohmeier
4e03a1452f added chmod +x in Dockerfile 2017-10-27 23:57:59 +02:00
Felix Lohmeier
56e5ee96f5 minor changes to coding style 2017-10-19 17:01:52 +02:00
Felix Lohmeier
8f2ef1d3e0 fixed typo in Dockerfile 2017-03-14 22:23:20 +01:00
Felix Lohmeier
1d0fb82f07 fixed typo in Dockerfile 2017-03-14 22:21:27 +01:00
Felix Lohmeier
44fd7c4611 removed urllib2 dependencies 2017-03-14 22:16:36 +01:00
Felix Lohmeier
bf91e918df included urllib2_file.py in the package to ease installation 2017-03-14 22:04:06 +01:00
Felix Lohmeier
6f8badae6a added options for separator (csv,tsv) and projectName (all) 2017-03-01 04:25:20 +01:00
Felix Lohmeier
74ec004c8b added link to article on how to extract json code 2017-02-02 13:07:49 +01:00
Felix Lohmeier
b45cda5ad1 fixed typo in docker environment variable 2017-02-02 12:54:06 +01:00
Felix Lohmeier
ad6072b1bb fixed typo in readme file (the last one i hope...) 2017-02-02 12:46:31 +01:00
Felix Lohmeier
115937d447 optimized readme layout for docker hub repo description 2017-02-02 12:38:58 +01:00
Felix Lohmeier
bd73be52ea list basic command line options in readme files 2017-02-02 12:23:16 +01:00
Felix Lohmeier
67b586a734 fixed docker environment variables 2017-02-02 11:05:14 +01:00
Felix Lohmeier
b83245cba9 fix typo in readme 2017-02-02 01:11:26 +01:00
Felix Lohmeier
8716b15d4c update readme files 2017-02-02 01:09:13 +01:00
Felix Lohmeier
78a7a75515 get environment variables in docker network 2017-02-01 23:59:13 +01:00
Felix Lohmeier
92b2316077 move Dockerfile and provide README.md for docker repo 2017-02-01 22:31:07 +01:00
Felix Lohmeier
221d7da379 repair Dockerfile directory 2017-02-01 22:23:50 +01:00
Felix Lohmeier
1b99ff1d62 add package curl to Dockerfile 2017-02-01 22:11:20 +01:00
Felix Lohmeier
26fe214eaf add Dockerfile for automatic build 2017-02-01 18:23:06 +01:00
Felix Lohmeier
35963dad38 added new function create_project and many options in CLI 2017-02-01 16:51:13 +01:00
Paul Makepeace
684dcdf8df Merge pull request #5 from vad/fix-encoding-parameter-in-project-creation
fixed the encoding parameter: it was passed as a Refine old-style argume...
2014-09-04 12:57:43 -07:00
Davide Setti
101a226a4f Merge pull request #1 from armisael/patch-1
allow for extra parameters in new_project
2014-08-12 11:36:37 +02:00
Stefano Parmesan
2d94ac4e36 allow for extra parameters in new_project 2014-08-12 11:34:17 +02:00
Davide Setti
9bd8102b0a fixed the encoding parameter: it was passed as a Refine old-style argument, but it's a new one 2014-05-22 14:18:56 +02:00
Paul Makepeace
1a4f00b3cd Explicitly insist on guessing cell value types (change in 2.6). 2013-10-14 00:30:24 +06:00
Paul Makepeace
ca25a305e0 Use true/false instead of on/''. Update to new new-project params. 2013-10-14 00:28:50 +06:00
Paul Makepeace
717a03a838 For OR 2.5 just rename the default column back to 'Column' (from 'Column 1') 2013-10-13 00:52:49 +06:00
Paul Makepeace
a1ea660ffa Catch HTTP errors and report more diags 2013-10-13 00:34:10 +06:00
Paul Makepeace
b92aa0efd1 Use project_format instead of split-into-columns:false 2013-10-13 00:02:33 +06:00
Paul Makepeace
1a5f7c482d Default column name in OR 2.5 is "Column 1" 2013-10-13 00:01:58 +06:00
Paul Makepeace
08dd425f28 Be explicit server-originated errors are from server. Derive new project defaults.
sudo tcpflow -AH -c -e  -i lo0 src or dst host localhost and port 3335 | ruby -ractive_support/core_ext -ruri -lne 'next unless /^format=([^&]+)&options=(.*)/; format = URI.unescape($1); opts = URI.unescape($2) ; opts.gsub!(/"(\w+)"/) { %Q['\''#{$1.underscore}'\''] }; opts.gsub!(/:/,": "); opts.gsub!(/,/,",\n\t"); opts.gsub!(/(true|false)/) { $1.titleize }; puts "        '\''#{format}'\'': #{opts},"'
2013-10-12 23:38:16 +06:00
Paul Makepeace
4104e58fd5 Refer to the archived message of David Huynh's original refine client 2013-10-10 16:43:16 +05:00
Paul Makepeace
c9fbc66e8e Remove README.txt too 2013-10-10 16:42:29 +05:00
Paul Makepeace
bbd4d84c96 Google Refine -> OpenRefine 2013-10-10 16:41:10 +05:00
Paul Makepeace
eade7dafe0 Use full path to test data 2013-10-10 15:40:23 +05:00
Paul Makepeace
5b701821a9 incorrectly class attributes 2013-10-10 15:32:56 +05:00
Paul Makepeace
6f8db835ab Add a link to pip installer 2013-10-09 23:16:43 +05:00
Paul Makepeace
bc0a8e7c7b Whitespace & minor renaming to bring in line with PEP8 guidelines 2013-10-09 23:16:43 +05:00
Paul Makepeace
e9ef9a6d56 Use less confusing package name 2013-10-09 13:27:30 +05:00
116 changed files with 7263 additions and 351 deletions

7
.gitignore vendored

@ -2,5 +2,8 @@ build
dist
*.pyc
.*
refine_client.egg-info
README.html
openrefine_client.egg-info
refine.spec
openrefine-*
openrefine-client_*
tests-cli.log

View File

@ -1,4 +1,4 @@
include README.rst
include README.md
include COPYING.txt
recursive-include tests/data *.csv
recursive-include tests *.py

View File

@ -25,7 +25,7 @@ install:
clean:
find . -name '*.pyc' | xargs rm -f
# XXX is there some way of having setup.py clean up its junk?
rm -rf README.html build dist refine_client.egg-info distribute-*
rm -rf README.{html,txt} build dist refine_client.egg-info distribute-*
upload: clean
python setup.py sdist upload

773
README.md Normal file

@ -0,0 +1,773 @@
# OpenRefine Python Client with extended command line interface (⌨️ for 💎)
[![Codacy Badge](https://app.codacy.com/project/badge/Grade/43ad9bfd707b4627bd45e5c5f912a8e0)](https://www.codacy.com/gh/opencultureconsulting/openrefine-client/dashboard) [![Docker](https://img.shields.io/microbadger/image-size/felixlohmeier/openrefine-client?label=docker)](https://hub.docker.com/r/felixlohmeier/openrefine-client/) [![PyPI](https://img.shields.io/pypi/v/openrefine-client)](https://pypi.org/project/openrefine-client/) [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/felixlohmeier/openrefineder/master)
The [OpenRefine Python Client from PaulMakepeace](https://github.com/PaulMakepeace/refine-client-py) provides a library for communicating with an [OpenRefine](http://openrefine.org) server.
This fork extends the command line interface (CLI) and is distributed as a convenient one-file-executable (Windows, Linux, macOS).
It is also available via Docker Hub, PyPI and Binder.
Works with OpenRefine 2.7, 2.8, 3.0, 3.1, 3.2, 3.3, 3.4, 3.4.1, 3.5.0
## Download
One-file-executables:
- Windows: [openrefine-client_0-3-10_windows.exe](https://github.com/opencultureconsulting/openrefine-client/releases/download/v0.3.10/openrefine-client_0-3-10_windows.exe) (~5 MB)
- macOS: [openrefine-client_0-3-10_macos](https://github.com/opencultureconsulting/openrefine-client/releases/download/v0.3.10/openrefine-client_0-3-10_macos) (~5 MB)
- Linux: [openrefine-client_0-3-10_linux](https://github.com/opencultureconsulting/openrefine-client/releases/download/v0.3.10/openrefine-client_0-3-10_linux) (~5 MB)
For [Docker](#docker) containers, native [Python](#python) installation and free [Binder](#binder) on-demand server see the corresponding chapters below.
## Peek
A short video loop that demonstrates the basic features (list, create, apply, export):
![video loop that demonstrates basic features](openrefine-client-peek.gif)
## Usage
Ensure you have [OpenRefine](http://openrefine.org) running (e.g. available at http://localhost:3333 or [another URL](#change-url)).
To use the client:
1. Open a terminal pointing to the folder where you have [downloaded](#download) the one-file-executable (e.g. Downloads in your home directory).
- Windows: Open PowerShell and enter the following command
```sh
cd ~\Downloads
```
- macOS: Open Terminal (Finder > Applications > Utilities > Terminal) and enter the following command
```sh
cd ~/Downloads
```
- Linux: Open a terminal app (Terminal, Konsole, xterm, ...) and enter the following command
```sh
cd ~/Downloads
```
2. Make the file executable.
- Windows: not necessary
- macOS:
```sh
chmod +x openrefine-client_0-3-10_macos
```
- Linux:
```sh
chmod +x openrefine-client_0-3-10_linux
```
3. Execute the file.
- Windows:
```sh
.\openrefine-client_0-3-10_windows.exe
```
- macOS:
```sh
./openrefine-client_0-3-10_macos
```
- Linux:
```sh
./openrefine-client_0-3-10_linux
```
Using tab completion and command history is highly recommended:
- autocomplete filenames: enter a few characters and press `↹`
- recall previous command: press `↑`
### Basic commands
Execute the client by entering its filename followed by the desired command.
The following example will download two small files ([duplicates.csv](https://raw.githubusercontent.com/opencultureconsulting/openrefine-client/master/tests/data/duplicates.csv) and [duplicates-deletion.json](https://raw.githubusercontent.com/opencultureconsulting/openrefine-client/master/tests/data/duplicates-deletion.json)) into the current directory and will create a new OpenRefine project from file duplicates.csv.
Download example data (`--download`) and create project from file (`--create`):
- Windows:
```sh
.\openrefine-client_0-3-10_windows.exe --download "https://git.io/fj5hF" --output=duplicates.csv
.\openrefine-client_0-3-10_windows.exe --download "https://git.io/fj5ju" --output=duplicates-deletion.json
.\openrefine-client_0-3-10_windows.exe --create duplicates.csv
```
- macOS:
```sh
./openrefine-client_0-3-10_macos --download "https://git.io/fj5hF" --output=duplicates.csv
./openrefine-client_0-3-10_macos --download "https://git.io/fj5ju" --output=duplicates-deletion.json
./openrefine-client_0-3-10_macos --create duplicates.csv
```
- Linux:
```sh
./openrefine-client_0-3-10_linux --download "https://git.io/fj5hF" --output=duplicates.csv
./openrefine-client_0-3-10_linux --download "https://git.io/fj5ju" --output=duplicates-deletion.json
./openrefine-client_0-3-10_linux --create duplicates.csv
```
Other commands:
- list all projects: `--list`
- show project metadata: `--info "duplicates"`
- export project to terminal: `--export "duplicates"`
- apply [rules from json file](http://kb.refinepro.com/2012/06/google-refine-json-and-my-notepad-or.html): `--apply duplicates-deletion.json "duplicates"`
- export project to file: `--export --output=deduped.xls "duplicates"`
- delete project: `--delete "duplicates"`
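The command sequence above can also be scripted. The following is a minimal sketch in plain Python, assuming the Linux one-file-executable is in the current directory and that `duplicates.csv` and `duplicates-deletion.json` have already been downloaded. The helper names `build_workflow` and `run_workflow` are hypothetical and not part of openrefine-client; only options shown in this README are used.

```python
import subprocess

# Assumption: adjust the filename to your platform (see Download).
CLIENT = "./openrefine-client_0-3-10_linux"


def build_workflow(client=CLIENT, project="duplicates"):
    """Return one argument list per step of the basic workflow (hypothetical helper)."""
    return [
        [client, "--create", project + ".csv"],
        [client, "--list"],
        [client, "--info", project],
        [client, "--apply", "duplicates-deletion.json", project],
        [client, "--export", "--output=deduped.xls", project],
        [client, "--delete", project],
    ]


def run_workflow(commands):
    """Run the commands in order; check_call raises on a non-zero exit status."""
    for argv in commands:
        subprocess.check_call(argv)


# Print the commands for review; call run_workflow(build_workflow())
# explicitly once OpenRefine is running.
for argv in build_workflow():
    print(" ".join(argv))
```

Passing argument lists instead of one shell string avoids quoting problems with project names that contain spaces.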
### Getting help
Check `--help` for further options.
Please file an [issue](https://github.com/opencultureconsulting/openrefine-client/issues) if a feature is missing from the command line interface or if you have found a bug.
Questions are welcome, too!
### Change URL
By default the client connects to the usual URL of OpenRefine [http://localhost:3333](http://localhost:3333).
If your OpenRefine server is running somewhere else then you may set hostname and port with additional command line options (e.g. http://example.com):
- set host: `-H example.com`
- set port: `-P 80`
### Templating
The OpenRefine [Templating](https://github.com/OpenRefine/OpenRefine/wiki/Export-As-YAML) export supports exporting data in any text format (e.g. to construct JSON or XML).
The graphical user interface offers four input fields:
1. prefix
2. row template
- supports [GREL](https://github.com/OpenRefine/OpenRefine/wiki/General-Refine-Expression-Language) inside two curly brackets, e.g. `{{jsonize(cells["name"].value)}}`
3. row separator
4. suffix
This templating functionality is available via the openrefine-client command line interface.
It even provides an additional feature for splitting results into multiple files.
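To make the role of the four fields concrete, here is a minimal sketch in plain Python of how they could combine into one output document. It is an illustration only, not OpenRefine's implementation: `render` and `row_template` are hypothetical names, `json.dumps` stands in for GREL's `jsonize()`, and the sample rows are made up.

```python
import json


def render(rows, prefix, row_template, row_separator, suffix):
    """Concatenate prefix, the rendered rows joined by the separator, and suffix."""
    return prefix + row_separator.join(row_template(r) for r in rows) + suffix


def row_template(row):
    # json.dumps stands in for GREL's jsonize() in this illustration.
    return ' { "name" : %s, "purchase" : %s }' % (
        json.dumps(row["name"]), json.dumps(row["purchase"]))


# Made-up sample rows, not taken from duplicates.csv.
rows = [
    {"name": "Alice", "purchase": "Book"},
    {"name": "Bob", "purchase": "Lamp"},
]

output = render(rows, '{ "events" : [\n', row_template, ',\n', '\n] }')
print(output)
```

Because the separator is only placed between rows, the result stays valid JSON without a trailing comma.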
To try out the functionality create another project from the example file above.
```sh
--create duplicates.csv --projectName=advanced
```
The following example code will export...
- the columns "name" and "purchase" in JSON format
- from the project "advanced"
- for rows matching the regex text filter `^F$` in column "gender"
macOS/Linux Terminal (multi-line input with `\` ):
```sh
"advanced" \
--prefix='{ "events" : [
' \
--template=' { "name" : {{jsonize(cells["name"].value)}}, "purchase" : {{jsonize(cells["purchase"].value)}} }' \
--rowSeparator=',
' \
--suffix='
] }' \
--filterQuery='^F$' \
--filterColumn='gender'
```
Windows PowerShell (multi-line input with `` ` ``; quotes need to be doubled):
```sh
"advanced" `
--prefix='{ ""events"" : [
' `
--template=' { ""name"" : {{jsonize(cells[""name""].value)}}, ""purchase"" : {{jsonize(cells[""purchase""].value)}} }' `
--rowSeparator=',
' `
--suffix='
] }' `
--filterQuery='^F$' `
--filterColumn='gender'
```
Add the following options to the last command (recall with `↑`) to store the results in multiple files.
Each file will contain the prefix, one processed row, and the suffix.
```sh
--output=advanced.json --splitToFiles=true
```
Filenames are suffixed with the row number by default (e.g. `advanced_1.json`, `advanced_2.json` etc.).
There is another option to use the value in the first column instead:
```sh
--output=advanced.json --splitToFiles=true --suffixById=true
```
Because our project "advanced" contains duplicates in the first column "email", this command will overwrite files (e.g. `advanced_melanie.white@example2.edu.json`).
When using this option, the first column should contain unique identifiers.
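The naming scheme and the overwrite problem can be illustrated with a short sketch in plain Python. It assumes the suffix is inserted before the file extension, as the example filenames above suggest; `split_filename` is a hypothetical helper and the identifiers are made up.

```python
import os


def split_filename(output, suffix):
    """Insert `_<suffix>` before the file extension (hypothetical helper)."""
    base, ext = os.path.splitext(output)
    return "%s_%s%s" % (base, suffix, ext)


# Default: the suffix is the row number, so every row gets its own file.
by_row = [split_filename("advanced.json", n) for n in (1, 2, 3)]

# suffixById: the suffix is the first-column value, so duplicates collide.
ids = ["a@example.org", "b@example.org", "a@example.org"]  # made-up values
by_id = [split_filename("advanced.json", i) for i in ids]

print(by_row)
print(sorted(set(by_id)))  # only two distinct filenames for three rows
```

Three rows map to only two filenames in the second case, which is why the first column should hold unique identifiers.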
### Append data to an existing project
OpenRefine does not support appending rows to an existing project.
Until the [feature request](https://github.com/OpenRefine/OpenRefine/issues/715) is implemented, you can use the openrefine-client to script a workaround:
1. export existing project as csv
2. put old and new data into a zip archive
3. create new project by importing the zip archive
Here is an example that replaces the existing project:
```
openrefine-client --export myproject --output old.csv
openrefine-client --delete myproject
zip combined.zip old.csv new.csv
openrefine-client --create combined.zip --format csv --projectName myproject
```
Note that the project id will change.
If you want to distinguish between old and new data, you can use the additional option `--includeFileSources`:
```
openrefine-client --create combined.zip --format csv --projectName myproject --includeFileSources true
```
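Where the `zip` command is not available (e.g. on Windows), step 2 of the workaround could be done with Python's standard library instead. This is a minimal sketch: `bundle_csv` is a hypothetical helper, and the demo files with their column layout are made up.

```python
import os
import tempfile
import zipfile


def bundle_csv(paths, archive_path):
    """Write the given files into a zip archive, stored under their base names."""
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in paths:
            archive.write(path, arcname=os.path.basename(path))
    return archive_path


# Demo with throwaway files in a temporary directory.
workdir = tempfile.mkdtemp()
old_csv = os.path.join(workdir, "old.csv")
new_csv = os.path.join(workdir, "new.csv")
for path, content in ((old_csv, "email,name\na@example.org,A\n"),
                      (new_csv, "email,name\nb@example.org,B\n")):
    with open(path, "w") as handle:
        handle.write(content)

combined = bundle_csv([old_csv, new_csv], os.path.join(workdir, "combined.zip"))
with zipfile.ZipFile(combined) as archive:
    print(archive.namelist())
```

The resulting `combined.zip` can then be passed to `--create` as in the commands above.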
### See also
- Linux Bash script to run OpenRefine in batch mode (import, transform, export): [openrefine-batch](https://github.com/opencultureconsulting/openrefine-batch)
- [Jupyter notebook demonstrating usage in Linux Bash](https://nbviewer.jupyter.org/github/felixlohmeier/openrefineder/blob/master/notebooks/openrefine-client-bash.ipynb)
- Use case [HOS-MetadataTransformations](https://github.com/subhh/HOS-MetadataTransformations): Automated workflow for harvesting, transforming and indexing of metadata using metha, OpenRefine and Solr. Part of the Hamburg Open Science "Schaufenster" software stack.
- Use case [Data processing of ILS data to facilitate a new discovery layer for the German Literature Archive (DLA)](https://doi.org/10.5281/zenodo.2678113): Custom data processing pipeline based on Pandas (a Python library) and OpenRefine.
## Docker
[felixlohmeier/openrefine-client](https://hub.docker.com/r/felixlohmeier/openrefine-client/) [![Docker](https://img.shields.io/microbadger/image-size/felixlohmeier/openrefine-client?label=docker)](https://hub.docker.com/r/felixlohmeier/openrefine-client/)
```sh
docker pull felixlohmeier/openrefine-client:v0.3.10
```
### Option 1: Dockerized client
Run client and mount current directory as workspace:
```sh
docker run --rm --network=host -v ${PWD}:/data:z felixlohmeier/openrefine-client:v0.3.10
```
The docker option `--network=host` allows you to connect to a local or remote OpenRefine via the host network:
- list projects on default URL (http://localhost:3333)
```sh
docker run --rm --network=host -v ${PWD}:/data:z felixlohmeier/openrefine-client:v0.3.10 --list
```
- list projects on a remote server (http://example.com)
```sh
docker run --rm --network=host -v ${PWD}:/data:z felixlohmeier/openrefine-client:v0.3.10 -H example.com -P 80 --list
```
Usage: same commands as explained above (see [Basic commands](#basic-commands) and [Templating](#templating))
### Option 2: Dockerized client and dockerized OpenRefine
Run openrefine-client linked to a dockerized OpenRefine ([felixlohmeier/openrefine](https://hub.docker.com/r/felixlohmeier/openrefine/) [![Docker](https://img.shields.io/microbadger/image-size/felixlohmeier/openrefine?label=docker)](https://hub.docker.com/r/felixlohmeier/openrefine)):
1. Create docker network
```sh
docker network create openrefine
```
2. Run server (will be available at http://localhost:3333)
```sh
docker run -d -p 3333:3333 --network=openrefine --name=openrefine-server felixlohmeier/openrefine:3.5.0
```
3. Run client with some [basic commands](#basic-commands): 1. download example files, 2. create project from file, 3. list projects, 4. show metadata, 5. export to terminal, 6. apply transformation rules (deduplication), 7. export again to terminal, 8. export to xls file and 9. delete project
```sh
docker run --rm --network=openrefine -v ${PWD}:/data:z felixlohmeier/openrefine-client:v0.3.10 --download "https://git.io/fj5hF" --output=duplicates.csv
docker run --rm --network=openrefine -v ${PWD}:/data:z felixlohmeier/openrefine-client:v0.3.10 --download "https://git.io/fj5ju" --output=duplicates-deletion.json
docker run --rm --network=openrefine -v ${PWD}:/data:z felixlohmeier/openrefine-client:v0.3.10 -H openrefine-server --create duplicates.csv
docker run --rm --network=openrefine -v ${PWD}:/data:z felixlohmeier/openrefine-client:v0.3.10 -H openrefine-server --list
docker run --rm --network=openrefine -v ${PWD}:/data:z felixlohmeier/openrefine-client:v0.3.10 -H openrefine-server --info "duplicates"
docker run --rm --network=openrefine -v ${PWD}:/data:z felixlohmeier/openrefine-client:v0.3.10 -H openrefine-server --export "duplicates"
docker run --rm --network=openrefine -v ${PWD}:/data:z felixlohmeier/openrefine-client:v0.3.10 -H openrefine-server --apply duplicates-deletion.json "duplicates"
docker run --rm --network=openrefine -v ${PWD}:/data:z felixlohmeier/openrefine-client:v0.3.10 -H openrefine-server --export "duplicates"
docker run --rm --network=openrefine -v ${PWD}:/data:z felixlohmeier/openrefine-client:v0.3.10 -H openrefine-server --export --output=deduped.xls "duplicates"
docker run --rm --network=openrefine -v ${PWD}:/data:z felixlohmeier/openrefine-client:v0.3.10 -H openrefine-server --delete "duplicates"
```
4. Stop and delete server:
```sh
docker stop openrefine-server
docker rm openrefine-server
```
5. Delete docker network:
```sh
docker network rm openrefine
```
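The client calls in step 3 differ only in their trailing arguments, so the repeated part can be generated. The following is a minimal sketch in plain Python: `docker_client` is a hypothetical helper, and the network and container names are the ones used in the steps above.

```python
import os

IMAGE = "felixlohmeier/openrefine-client:v0.3.10"


def docker_client(client_args, network="openrefine",
                  host="openrefine-server", workdir=None):
    """Build the `docker run` argument list for one client call (hypothetical helper)."""
    workdir = workdir or os.getcwd()  # plays the role of ${PWD} in the shell commands
    argv = ["docker", "run", "--rm", "--network=" + network,
            "-v", workdir + ":/data:z", IMAGE]
    if host:  # pass host=None for calls that do not talk to the server (--download)
        argv += ["-H", host]
    return argv + list(client_args)


for args in (["--create", "duplicates.csv"], ["--list"], ["--delete", "duplicates"]):
    print(" ".join(docker_client(args)))
```

Each returned list can be handed to `subprocess.check_call` once the network and server from steps 1 and 2 exist.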
Customize OpenRefine server:
- If you want to add an OpenRefine startup option you need to repeat the default commands (cf. [Dockerfile](https://hub.docker.com/r/felixlohmeier/openrefine/dockerfile))
- `-i 0.0.0.0` sets OpenRefine to be accessible from outside the container, i.e. from host
- `-d /data` sets OpenRefine workspace
- Example for [allocating more memory](https://github.com/OpenRefine/OpenRefine/wiki/FAQ#out-of-memory-errors---feels-slow---could-not-reserve-enough-space-for-object-heap) to OpenRefine with additional option `-m 4G`
```sh
docker run -d -p 3333:3333 --network=openrefine --name=openrefine-server felixlohmeier/openrefine:3.5.0 -i 0.0.0.0 -d /data -m 4G
```
- The OpenRefine version is defined by the docker tag.
Check the [DockerHub repository](https://hub.docker.com/r/felixlohmeier/openrefine) for available tags.
Example for OpenRefine `2.8` with same options as above:
```sh
docker run -d -p 3333:3333 --network=openrefine --name=openrefine-server felixlohmeier/openrefine:2.8 -i 0.0.0.0 -d /data -m 4G
```
- If you want OpenRefine to read and write persistent data in a host directory (i.e. store projects) you can mount the container path `/data`. Example for host directory `/home/felix/refine`:
```sh
docker run -d -p 3333:3333 -v /home/felix/refine:/data:z --network=openrefine --name=openrefine-server felixlohmeier/openrefine:2.8 -i 0.0.0.0 -d /data -m 4G
```
See also:
- [GitHub Repository](https://github.com/opencultureconsulting/openrefine-docker) for docker container `felixlohmeier/openrefine`
- Linux Bash script to run OpenRefine in batch mode (import, transform, export) with docker containers: [openrefine-batch-docker.sh](https://github.com/opencultureconsulting/openrefine-batch/#docker)
## Python
[openrefine-client](https://pypi.org/project/openrefine-client/) [![PyPI](https://img.shields.io/pypi/v/openrefine-client)](https://pypi.org/project/openrefine-client/) (requires Python 2.x)
```sh
python2 -m pip install openrefine-client --user
```
This will install the package `openrefine-client` containing modules in `google.refine`.
A command line script `openrefine-client` will also be installed.
### Option 1: command line script
```sh
openrefine-client --help
```
Usage: same commands as explained above (see [Basic commands](#basic-commands) and [Templating](#templating))
### Option 2: using cli functions in Python 2.x environment
Import module cli:
```python
from google.refine import cli
```
Change URL (if necessary):
```python
cli.refine.REFINE_HOST = 'localhost'
cli.refine.REFINE_PORT = '3333'
```
Help screen:
```python
help(cli)
```
Commands:
* download (e.g. example data):
```python
cli.download('https://git.io/fj5hF','duplicates.csv')
cli.download('https://git.io/fj5ju','duplicates-deletion.json')
```
* list projects:
```python
cli.ls()
```
* create project:
```python
p1 = cli.create('duplicates.csv')
```
* show metadata:
```python
cli.info(p1.project_id)
```
* apply rules from file to project:
```python
cli.apply(p1.project_id, 'duplicates-deletion.json')
```
* export project to terminal:
```python
cli.export(p1.project_id)
```
* export project to file in xls format:
```python
cli.export(p1.project_id, output_file='deduped.xls')
```
* export templating (see [Advanced Templating](#advanced-templating) above):
```python
cli.templating(
p1.project_id,
prefix='''{ "events" : [
''',template=''' { "name" : {{jsonize(cells["name"].value)}}, "purchase" : {{jsonize(cells["purchase"].value)}} }''',
rowSeparator=''',
''',suffix='''
] }''')
```
* delete project:
```python
cli.delete(p1.project_id)
```
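To get a feeling for what the `cli.templating` call above is expected to return, the assembly of prefix, row template, row separator and suffix can be imitated locally. The following is only a minimal sketch in plain Python that needs no OpenRefine server: the sample rows are made up, `render_rows` is a hypothetical helper, and `json.dumps` stands in for GREL's `jsonize()`. It assumes that the rendered rows are joined with the row separator and then wrapped in prefix and suffix.

```python
import json

# Made-up sample rows; a real project would supply these cells.
rows = [
    {"name": "Ben Tyler", "purchase": "Flashlight"},
    {"name": "Mary Morgan", "purchase": "Copy Machine"},
]

prefix = '{ "events" : [\n'
row_separator = ',\n'
suffix = '\n] }'


def render_rows(rows, prefix, row_separator, suffix):
    """Imitate the export: prefix + rows joined by separator + suffix."""
    rendered = [
        ' { "name" : %s, "purchase" : %s }'
        % (json.dumps(row["name"]), json.dumps(row["purchase"]))
        for row in rows
    ]
    return prefix + row_separator.join(rendered) + suffix


output = render_rows(rows, prefix, row_separator, suffix)
print(output)
```

Because the row separator is only placed between rows, the result parses as valid JSON, which is the reason the separator is passed separately instead of being appended to the row template.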
### Option 3: the upstream way
This fork can be used in the same way as the upstream [Python client library](https://github.com/PaulMakepeace/refine-client-py/).
Some functions in the python client library are not yet compatible with OpenRefine >=3.0 (cf. [issue #19 in refine-client-py](https://github.com/paulmakepeace/refine-client-py/issues/19)).
Import module refine:
```python
from google.refine import refine
```
Server Commands:
* set up connection:
```python
server1 = refine.Refine('http://localhost:3333')
```
- show version:
```python
server1.server.get_version()
server1.server.version
```
- list projects:
```python
server1.list_projects()
```
- pretty print the returned dict with json.dumps:
```python
import json
print(json.dumps(server1.list_projects(), indent=1))
```
- create project:
```python
server1.new_project(project_file='duplicates.csv')
```
* create and open the returned project in one step:
```python
project1 = server1.new_project(project_file='duplicates.csv')
```
Project commands:
* open project:
```python
project1 = server1.open_project('1234567890123')
```
* print full URL to project:
```python
project1.project_url()
```
* list columns:
```python
project1.columns
```
* compute text facet on first column (**fails with OpenRefine >=3.2**):
```python
from google.refine import facet
project1.compute_facets(facet.TextFacet(project1.columns[0]))
```
* print returned object:
```python
facets = project1.compute_facets(facet.TextFacet(project1.columns[0])).facets[0]
for k in sorted(facets.choices, key=lambda k: facets.choices[k].count, reverse=True):
    print(facets.choices[k].count, k)
```
* compute clusters on first column:
```python
project1.compute_clusters(project1.columns[0])
```
* apply rules from file to project:
```python
project1.apply_operations('duplicates-deletion.json')
```
* export project:
```python
project1.export(export_format='tsv')
```
* print the returned fileobject:
```python
print(project1.export(export_format='tsv').read())
```
* save the returned fileobject to file:
```python
with open('export.tsv', 'wb') as f:
    f.write(project1.export(export_format='tsv').read())
```
* templating export (**function was added in this fork**, see [Advanced Templating](#advanced-templating) above):
```python
data = project1.export_templating(
prefix='''{ "events" : [
''',template=''' { "name" : {{jsonize(cells["name"].value)}}, "purchase" : {{jsonize(cells["purchase"].value)}} }''',
rowSeparator=''',
''',suffix='''
] }''')
print(data.read())
```
* print help screen with available commands (many more!):
```python
help(project1)
```
* example for custom commands:
```python
project1.do_json('get-rows')['total']
```
* delete project:
```python
project1.delete()
```
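The loop that prints facet choices sorted by count (see the text facet commands above) can be tried without a server. This is a small sketch with stand-in data: `Choice` is a hypothetical substitute for the choice objects returned by `compute_facets()`, modelling only the `count` attribute, and the e-mail addresses are invented.

```python
from collections import namedtuple

# Stand-in for the choice objects; only `count` is modelled here.
Choice = namedtuple("Choice", ["count"])

# Made-up facet result: cell value -> number of rows with that value.
choices = {
    "ben.tyler@example3.org": Choice(count=1),
    "danny.baron@example1.com": Choice(count=3),
    "melanie.white@example2.edu": Choice(count=2),
}


def top_choices(choices):
    """Return (count, value) pairs sorted by descending count."""
    ordered = sorted(choices, key=lambda k: choices[k].count, reverse=True)
    return [(choices[k].count, k) for k in ordered]


for count, value in top_choices(choices):
    print(count, value)
```

Sorting the keys and looking up the count in the key function mirrors the original loop; the helper merely returns the pairs so that they can be reused or tested.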
See also:
- Jupyter notebook by Trevor Muñoz (2013-08-18): [Programmatic Use of Open Refine to Facet and Cluster Names of 'Dishes' from NYPL's What's on the menu?](https://nbviewer.jupyter.org/gist/trevormunoz/6265360)
- Jupyter notebook by Tony Hirst (2019-01-09): [Notebook demonstrating how to control OpenRefine via a Python client.](https://nbviewer.jupyter.org/github/ouseful-PR/openrefineder/blob/4cef25a4ca6077536c5f49cafb531499fbcad96e/notebooks/OpenRefine%20Demos.ipynb)
- Unittests [test_refine.py](tests/test_refine.py) and [test_tutorial.py](tests/test_tutorial.py) (both importing [refinetest.py](tests/refinetest.py))
- [OpenRefine API](https://github.com/OpenRefine/OpenRefine/wiki/OpenRefine-API) in official OpenRefine wiki
## Binder
[![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/felixlohmeier/openrefineder/master)
- free to use on-demand server with Jupyter notebook, OpenRefine and Bash
- no registration needed, will start within a few minutes
- [restricted](https://mybinder.readthedocs.io/en/latest/faq.html#how-much-memory-am-i-given-when-using-binder) to 2 GB RAM and server will be deleted after 10 minutes of inactivity
- [bash_kernel demo notebook](https://nbviewer.jupyter.org/github/felixlohmeier/openrefineder/blob/master/notebooks/openrefine-client-bash.ipynb) for using the openrefine-client in a Linux Bash environment [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/felixlohmeier/openrefineder/master?urlpath=/tree/notebooks/openrefine-client-bash.ipynb)
- [python2 demo notebook](https://nbviewer.jupyter.org/github/felixlohmeier/openrefineder/blob/master/notebooks/openrefine-client-python.ipynb) for using the openrefine-client in a Python 2 environment [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/felixlohmeier/openrefineder/master?urlpath=/tree/notebooks/openrefine-client-python.ipynb)
## Development
If you would like to contribute to the Python client library please consider a pull request to the upstream repository [refine-client-py](https://github.com/PaulMakepeace/refine-client-py/).
### Tests
Ensure you have OpenRefine running (i.e. available at http://localhost:3333). If necessary set the environment variables `OPENREFINE_HOST` and `OPENREFINE_PORT` to change the URL.
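As an illustration of how the two environment variables could map to a server URL, here is a minimal sketch. `server_url` is a hypothetical helper and not part of the library; the fallback values `127.0.0.1` and `3333` are the defaults named in the CLI help text, and how the library itself reads the variables is an assumption.

```python
import os


def server_url(environ=None):
    """Build the base URL from OPENREFINE_HOST / OPENREFINE_PORT."""
    # Accept a plain dict for testing; fall back to the real environment.
    environ = os.environ if environ is None else environ
    host = environ.get("OPENREFINE_HOST", "127.0.0.1")
    port = environ.get("OPENREFINE_PORT", "3333")
    return "http://%s:%s" % (host, port)


# No variables set: the defaults apply.
print(server_url({}))
# Variables set, e.g. when the server runs in another docker container.
print(server_url({"OPENREFINE_HOST": "openrefine-server",
                  "OPENREFINE_PORT": "3333"}))
```

Before running the tests against a non-default server, something like `OPENREFINE_HOST=openrefine-server python setup.py test` would set the variable for that one command.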
The Python client library includes several unit tests.
- run all tests
```sh
python setup.py test
```
- run subset test_facet
```sh
python setup.py test --test-suite tests.test_facet
```
There is also a script that uses docker images to run the unit tests with different versions of OpenRefine.
- run tests on all OpenRefine versions (from 2.0 up to 3.5.0)
```sh
./tests.sh -a
```
- run tests on tag 3.5.0
```sh
./tests.sh -t 3.5.0
```
- run tests on tag 3.5.0 interactively (pause before and after tests)
```sh
./tests.sh -t 3.5.0 -i
```
- run tests on tags 3.5.0 and 2.7
```sh
./tests.sh -t 3.5.0 -t 2.7
```
For Linux there are also functional tests for all command line options.
- run all functional tests on OpenRefine 3.5.0
```sh
./tests-cli.sh 3.5.0
```
- run all functional tests on OpenRefine 3.5.0 with one-file-executable
```sh
./tests-cli.sh 3.5.0 openrefine-client_0-3-7_linux
```
### Distributing
Note to myself: When releasing a new version...
1. Run functional tests
```sh
for v in 2.7 2.8 3.0 3.1 3.2 3.3 3.4 3.4.1 3.5.0; do
  ./tests-cli.sh $v
done
```
2. Make final changes in Git
- update versions (e.g. 0.3.7 and 0-3-7) in [README.md](https://github.com/opencultureconsulting/openrefine-client/blob/master/README.md#download)
- update version in [setup.py](https://github.com/opencultureconsulting/openrefine-client/blob/master/setup.py)
- check if [Dockerfile](https://github.com/opencultureconsulting/openrefine-client/blob/master/docker/Dockerfile) needs to be changed
3. Build executables with PyInstaller
- Run PyInstaller in Python 2 environments on native Windows, macOS and Linux. Should be "the oldest version of the OS you need to support"! Current release is built with:
- Ubuntu 16.04 LTS (64-bit)
- macOS Sierra 10.12 (64-bit)
- Windows 7 (32-bit)
- One-file-executables will be available in `dist/`.
```sh
git clone https://github.com/opencultureconsulting/openrefine-client.git
cd openrefine-client
python2 -m pip install pyinstaller --user
python2 -m pip install urllib2_file --user
python2 -m PyInstaller --onefile refine.py --hidden-import google.refine.__main__
```
4. Run functional tests with Linux executable
```sh
for v in 2.7 2.8 3.0 3.1 3.2 3.3 3.4 3.4.1 3.5.0; do
  ./tests-cli.sh $v openrefine-client_0-3-7_linux
done
```
5. Create release in GitHub
- draft [release notes](https://github.com/opencultureconsulting/openrefine-client/releases) and attach one-file-executables
6. Build package and upload to PyPI
```sh
python3 setup.py sdist bdist_wheel
python3 -m twine upload dist/*
```
7. Update Docker container
- add new autobuild for release version
- trigger latest build
8. Bump openrefine-client version in related projects
- openrefine-batch: [openrefine-batch.sh](https://github.com/opencultureconsulting/openrefine-batch/blob/master/openrefine-batch.sh#L7) and [openrefine-batch-docker.sh](https://github.com/opencultureconsulting/openrefine-batch/blob/master/openrefine-batch-docker.sh)
- openrefineder: [postBuild](https://github.com/felixlohmeier/openrefineder/blob/master/binder/postBuild)
## Credits
[Paul Makepeace](http://paulm.com), author
David Huynh, [initial cut](http://markmail.org/message/jsxzlcu3gn6drtb7)
[Artfinder](http://www.artfinder.com), inspiration
[Felix Lohmeier](https://felixlohmeier.de), extended the CLI features
Some data used in the test suite has been used from publicly available sources:
- louisiana-elected-officials.csv: from http://www.sos.louisiana.gov/tabid/136/Default.aspx
- us_economic_assistance.csv: ["The Green Book"](http://www.data.gov/raw/1554)
- eli-lilly.csv: [ProPublica's "Docs for Dollars"](http://projects.propublica.org/docdollars) leading to a [Lilly Faculty PDF](http://www.lillyfacultyregistry.com/documents/EliLillyFacultyRegistryQ22010.pdf) processed by [David Huynh's ScraperWiki script](http://scraperwiki.com/scrapers/eli-lilly-dollars-for-docs-scraper/edit/)


@ -1,121 +0,0 @@
===================================
Google Refine Python Client Library
===================================
The Google Refine Python Client Library provides an interface for
communicating with a Google Refine server.
Currently, the following API is supported:
- project creation/import, deletion, export
- facet computation

  - text
  - text filter
  - numeric
  - blank
  - starred & flagged
  - ... extensible class

- 'engine': managing multiple facets and their computation results
- sorting & reordering
- clustering
- transforms
- transposes
- single and mass edits
- annotation (star/flag)
- column

  - move
  - add
  - split
  - rename
  - reorder
  - remove

- reconciliation

  - reconciliation judgment facet
  - guessing column type
  - querying reconciliation services preferences
  - perform reconciliation

Configuration
=============
By default the Google Refine server URL is http://127.0.0.1:3333
The environment variables ``GOOGLE_REFINE_HOST`` and ``GOOGLE_REFINE_PORT``
enable overriding the host & port.
In order to run all tests, a live Refine server is needed. No existing projects
are affected.
Installation
============
(Someone with more familiarity with python's byzantine collection of installation
frameworks is very welcome to improve/"best practice" all this.)
#. Install dependencies, which currently is ``urllib2_file``:
``sudo pip install -r requirements.txt``
#. Ensure you have a Refine server running somewhere and, if necessary, set
the envvars as above.
#. Run tests, build, and install:
``python setup.py test # to do a subset, e.g., --test-suite tests.test_facet``
``python setup.py build``
``python setup.py install``
There is a Makefile that will do this too, and more.
TODO
====
The API so far has been filled out from building a test suite to carry out the
actions in `David Huynh's Refine tutorial <http://davidhuynh.net/spaces/nicar2011/tutorial.pdf>`_ which while certainly showing off a
wide range of Refine features doesn't cover the entire suite. Notable exceptions
currently include:
- reconciliation support is useful but not complete
- undo/redo
- Freebase
- join columns
- columns from URL
Contribute
============
Patches welcome! Source is at https://github.com/PaulMakepeace/refine-client-py
Useful Tools
------------
One aspect of development is watching HTTP transactions. To that end, I found
`Fiddler <http://www.fiddler2.com/>`_ on Windows and `HTTPScoop
<http://www.tuffcode.com/>`_ invaluable. The latter won't URL-decode nor nicely
format JSON but the `Online JavaScript Beautifier <http://jsbeautifier.org/>`_
will.
Credits
=======
Paul Makepeace, author, <paulm@paulm.com>
David Huynh, `initial cut <http://groups.google.com/group/google-refine/msg/ee29cf8d660e66a9>`_
`Artfinder <http://www.artfinder.com/>`_, inspiration
Some data used in the test suite has been used from publicly available sources,
- louisiana-elected-officials.csv: from
http://www.sos.louisiana.gov/tabid/136/Default.aspx
- us_economic_assistance.csv: `"The Green Book" <http://www.data.gov/raw/1554>`_
- eli-lilly.csv: `ProPublica's "Docs for Dollars" <http://projects.propublica.org/docdollars/>`_ leading to a `Lilly Faculty PDF <http://www.lillyfacultyregistry.com/documents/EliLillyFacultyRegistryQ22010.pdf>`_ processed by `David Huynh's ScraperWiki script <http://scraperwiki.com/scrapers/eli-lilly-dollars-for-docs-scraper/edit/>`_

docker/Dockerfile Normal file

@ -0,0 +1,28 @@
FROM alpine:3.11
LABEL maintainer="felixlohmeier@opencultureconsulting.com"
# The OpenRefine Python Client Library from PaulMakepeace provides an interface for communicating with an OpenRefine server. This fork extends the command line interface (CLI) and supports communication between docker containers.
# Source: https://github.com/opencultureconsulting/openrefine-client
# Install python and pip
# ... and curl for https://github.com/opencultureconsulting/openrefine-batch
RUN apk add --no-cache \
python \
py-pip \
curl
# Install dependency urllib2_file
RUN pip install urllib2_file==0.2.1
# Copy python scripts
WORKDIR /app
COPY google google
COPY refine.py .
# Change docker WORKDIR (shall be mounted by user)
WORKDIR /data
# Execute refine.py
ENTRYPOINT ["/app/refine.py"]
# Default command: print help
CMD ["-h"]

google/refine/__main__.py Normal file

@ -0,0 +1,279 @@
#! /usr/bin/env python
"""
Script to provide a command line interface to a Refine server.
"""
# Copyright (c) 2011 Paul Makepeace, Real Programmers. All rights reserved.
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>
import optparse
from google.refine import refine
from google.refine import cli
class myParser(optparse.OptionParser):
    def format_epilog(self, formatter):
        return self.epilog
PARSER = \
myParser(description=('Script to provide a command line interface to an '
'OpenRefine server.'),
usage='usage: %prog [--help | OPTIONS]',
epilog="""
Example data:
--download "https://git.io/fj5hF" --output=duplicates.csv
--download "https://git.io/fj5ju" --output=duplicates-deletion.json
Basic commands:
--list # list all projects
--list -H 127.0.0.1 -P 80 # specify hostname and port
--create duplicates.csv # create new project from file
--info "duplicates" # show project metadata
--apply duplicates-deletion.json "duplicates" # apply rules in file to project
--export "duplicates" # export project to terminal in tsv format
--export --output=deduped.xls "duplicates" # export project to file in xls format
--delete "duplicates" # delete project
Some more examples:
--info 1234567890123 # specify project by id
--create example.tsv --encoding=UTF-8
--create example.xml --recordPath=collection --recordPath=record
--create example.json --recordPath=_ --recordPath=_
--create example.xlsx --sheets=0
--create example.ods --sheets=0
Example for Templating Export:
Cf. https://github.com/opencultureconsulting/openrefine-client#advanced-templating
""")
group1 = optparse.OptionGroup(PARSER, 'Connection options')
group1.add_option('-H', '--host', dest='host',
metavar='127.0.0.1',
help='OpenRefine hostname (default: 127.0.0.1)')
group1.add_option('-P', '--port', dest='port',
metavar='3333',
help='OpenRefine port (default: 3333)')
PARSER.add_option_group(group1)
group2 = optparse.OptionGroup(PARSER, 'Commands')
group2.add_option('-c', '--create', dest='create',
metavar='[FILE]',
help='Create project from file. The filename ending (e.g. .csv) defines the input format (csv,tsv,xml,json,txt,xls,xlsx,ods)')
group2.add_option('-l', '--list', dest='list',
action='store_true',
help='List projects')
group2.add_option('--download', dest='download',
metavar='[URL]',
help='Download file from URL (e.g. example data). Combine with --output to specify a filename.')
PARSER.add_option_group(group2)
group3 = optparse.OptionGroup(
PARSER, 'Commands with argument [PROJECTID/PROJECTNAME]')
group3.add_option('-d', '--delete', dest='delete',
action='store_true',
help='Delete project')
group3.add_option('-f', '--apply', dest='apply',
metavar='[FILE]',
help='Apply JSON rules to OpenRefine project')
group3.add_option('-E', '--export', dest='export',
action='store_true',
help='Export project in tsv format to stdout.')
group3.add_option('-o', '--output', dest='output',
metavar='[FILE]',
help='Export project to file. The filename ending (e.g. .tsv) defines the output format (csv,tsv,xls,xlsx,html)')
group3.add_option('--template', dest='template',
metavar='[STRING]',
help='Export project with templating. Provide (big) text string that you enter in the *row template* textfield in the export/templating menu in the browser app)')
group3.add_option('--info', dest='info',
action='store_true',
help='show project metadata')
PARSER.add_option_group(group3)
group4 = optparse.OptionGroup(PARSER, 'General options')
group4.add_option('--format', dest='file_format',
help='Override file detection (import: csv,tsv,xml,json,line-based,fixed-width,xls,xlsx,ods; export: csv,tsv,html,xls,xlsx,ods)')
PARSER.add_option_group(group4)
group5 = optparse.OptionGroup(PARSER, 'Create options')
group5.add_option('--columnWidths', dest='columnWidths',
action='append',
type='int',
help='(txt/fixed-width), please provide widths in multiple arguments, e.g. --columnWidths=7 --columnWidths=5')
group5.add_option('--encoding', dest='encoding',
help='(csv,tsv,txt), please provide short encoding name (e.g. UTF-8)')
group5.add_option('--guessCellValueTypes', dest='guessCellValueTypes',
metavar='true/false', choices=('true', 'false'),
help='(xml,csv,tsv,txt,json, default: false)')
group5.add_option('--headerLines', dest='headerLines',
type="int",
help='(csv,tsv,txt/fixed-width,xls,xlsx,ods), default: 1, default txt/fixed-width: 0')
group5.add_option('--ignoreLines', dest='ignoreLines',
type="int",
help='(csv,tsv,txt,xls,xlsx,ods), default: -1')
group5.add_option('--includeFileSources', dest='includeFileSources',
metavar='true/false', choices=('true', 'false'),
help='(all formats), default: false')
group5.add_option('--limit', dest='limit',
type="int",
help='(all formats), default: -1')
group5.add_option('--linesPerRow', dest='linesPerRow',
type="int",
help='(txt/line-based), default: 1')
group5.add_option('--processQuotes', dest='processQuotes',
metavar='true/false', choices=('true', 'false'),
help='(csv,tsv), default: true')
group5.add_option('--projectName', dest='projectName',
help='(all formats), default: filename')
group5.add_option('--projectTags', dest='projectTags',
action='append',
help='(all formats), please provide tags in multiple arguments, e.g. --projectTags=beta --projectTags=client1')
group5.add_option('--recordPath', dest='recordPath',
action='append',
help='(xml,json), please provide path in multiple arguments, e.g. /collection/record/ should be entered: --recordPath=collection --recordPath=record, default xml: root element, default json: _ _')
group5.add_option('--separator', dest='separator',
help='(csv,tsv), default csv: , default tsv: \\t')
group5.add_option('--sheets', dest='sheets',
action='append',
type="int",
help='(xls,xlsx,ods), please provide sheets in multiple arguments, e.g. --sheets=0 --sheets=1, default: 0 (first sheet)')
group5.add_option('--skipDataLines', dest='skipDataLines',
type="int",
help='(csv,tsv,txt,xls,xlsx,ods), default: 0, default line-based: -1')
group5.add_option('--storeBlankCellsAsNulls', dest='storeBlankCellsAsNulls',
metavar='true/false', choices=('true', 'false'),
help='(csv,tsv,txt,xls,xlsx,ods), default: true')
group5.add_option('--storeBlankRows', dest='storeBlankRows',
metavar='true/false', choices=('true', 'false'),
help='(csv,tsv,txt,xls,xlsx,ods), default: true')
group5.add_option('--storeEmptyStrings', dest='storeEmptyStrings',
metavar='true/false', choices=('true', 'false'),
help='(xml,json), default: true')
group5.add_option('--trimStrings', dest='trimStrings',
metavar='true/false', choices=('true', 'false'),
help='(xml,json), default: false')
PARSER.add_option_group(group5)
group6 = optparse.OptionGroup(PARSER, 'Templating options')
group6.add_option('--mode', dest='mode',
metavar='row-based/record-based',
choices=('row-based', 'record-based'),
help='engine mode (default: row-based)')
group6.add_option('--prefix', dest='prefix',
help='text string that you enter in the *prefix* textfield in the browser app')
group6.add_option('--rowSeparator', dest='rowSeparator',
help='text string that you enter in the *row separator* textfield in the browser app')
group6.add_option('--suffix', dest='suffix',
help='text string that you enter in the *suffix* textfield in the browser app')
group6.add_option('--filterQuery', dest='filterQuery',
metavar='REGEX',
help='Simple RegEx text filter on filterColumn, e.g. ^12015$')
group6.add_option('--filterColumn', dest='filterColumn',
metavar='COLUMNNAME',
help='column name for filterQuery (default: name of first column)')
group6.add_option('--facets', dest='facets',
help='facets config in json format (may be extracted with browser dev tools in browser app)')
group6.add_option('--splitToFiles', dest='splitToFiles',
metavar='true/false', choices=('true', 'false'),
help='will split each row/record into a single file; it specifies a presumably unique character series for splitting; --prefix and --suffix will be applied to all files; filename-prefix can be specified with --output (default: %Y%m%d)')
group6.add_option('--suffixById', dest='suffixById',
metavar='true/false', choices=('true', 'false'),
help='enhancement option for --splitToFiles; will generate filename-suffix from values in key column')
PARSER.add_option_group(group6)
def main():
    """Command line interface."""
    options, args = PARSER.parse_args()
    # set environment
    if options.host:
        refine.REFINE_HOST = options.host
    if options.port:
        refine.REFINE_PORT = options.port
    # get project_id
    if args and not str.isdigit(args[0]):
        projects = refine.Refine(refine.RefineServer()).list_projects().items()
        idlist = []
        for project_id, project_info in projects:
            if args[0].decode('UTF-8') == project_info['name']:
                idlist.append(str(project_id))
        if len(idlist) > 1:
            print('Error: Found %s projects with name %s.\n'
                  'Please specify project by id.' % (len(idlist), args[0]))
            for i in idlist:
                print('')
                cli.info(i)
            return
        else:
            try:
                project_id = idlist[0]
            except IndexError:
                print('Error: No project found with name %s.\n'
                      'Try command --list' % args[0])
                return
    elif args:
        project_id = args[0]
    # commands without args
    if options.list:
        cli.ls()
    elif options.download:
        cli.download(options.download, output_file=options.output)
    elif options.create:
        group5_dict = {group5_arg.dest: getattr(options, group5_arg.dest)
                       for group5_arg in group5.option_list}
        kwargs = {k: v for k, v in group5_dict.items()
                  if v is not None and v not in ['true', 'false']}
        kwargs.update({k: True for k, v in group5_dict.items()
                       if v == 'true'})
        kwargs.update({k: False for k, v in group5_dict.items()
                       if v == 'false'})
        if options.file_format:
            kwargs.update({'project_format': options.file_format})
        cli.create(options.create, **kwargs)
    # commands with args
    elif args and options.info:
        cli.info(project_id)
    elif args and options.delete:
        cli.delete(project_id)
    elif args and options.apply:
        cli.apply(project_id, options.apply)
    elif args and options.template:
        group6_dict = {group6_arg.dest: getattr(options, group6_arg.dest)
                       for group6_arg in group6.option_list}
        kwargs = {k: v for k, v in group6_dict.items()
                  if v is not None and v not in ['true', 'false']}
        kwargs.update({k: True for k, v in group6_dict.items()
                       if v == 'true'})
        kwargs.update({k: False for k, v in group6_dict.items()
                       if v == 'false'})
        cli.templating(project_id, options.template,
                       output_file=options.output, **kwargs)
    elif args and (options.export or options.output):
        cli.export(project_id, output_file=options.output,
                   export_format=options.file_format)
    else:
        PARSER.print_usage()
if __name__ == "__main__":
    # execute only if run as a script
    main()
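
Note on `main()`: two steps of the function above are easy to try in isolation, namely turning the `'true'`/`'false'` option strings into booleans while dropping unset options, and resolving a project name to an id. The following is a simplified sketch, not the code used by the script: `normalize_options` and `resolve_project_id` are hypothetical helper names, the sample data is invented, the Python 2 `.decode('UTF-8')` step is left out, and an exception is raised where `main()` prints an error and returns.

```python
def normalize_options(raw):
    """Drop unset options and turn 'true'/'false' strings into booleans."""
    kwargs = {}
    for key, value in raw.items():
        if value is None:
            continue  # option was not given on the command line
        if value == 'true':
            kwargs[key] = True
        elif value == 'false':
            kwargs[key] = False
        else:
            kwargs[key] = value  # ints, strings and lists pass through
    return kwargs


def resolve_project_id(arg, projects):
    """Return the id for a numeric argument or a unique project name."""
    if arg.isdigit():
        return arg
    matches = [str(pid) for pid, info in projects.items()
               if info['name'] == arg]
    if len(matches) != 1:
        raise LookupError('Found %d projects with name %s'
                          % (len(matches), arg))
    return matches[0]


# Invented option values and project list for illustration.
raw_options = {'encoding': 'UTF-8', 'trimStrings': 'true',
               'processQuotes': 'false', 'limit': None, 'sheets': [0, 1]}
projects = {'1234567890123': {'name': 'duplicates'}}

print(normalize_options(raw_options))
print(resolve_project_id('duplicates', projects))
```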

google/refine/cli.py Normal file

@ -0,0 +1,335 @@
#! /usr/bin/env python
"""
Functions used by the command line interface (CLI)
"""
# Copyright (c) 2011 Paul Makepeace, Real Programmers. All rights reserved.
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>
import json
import os
import ssl
import sys
import time
import urllib
from xml.etree import ElementTree
from google.refine import refine
def apply(project_id, history_file):
    """Apply OpenRefine history from json file to project."""
    project = refine.RefineProject(project_id)
    response = project.apply_operations(history_file)
    if response != 'ok':
        raise Exception('Failed to apply %s to %s: %s' %
                        (history_file, project_id, response))
    else:
        print('File %s has been successfully applied to project %s' %
              (history_file, project_id))
def create(project_file,
           project_format=None,
           columnWidths=None,
           encoding=None,
           guessCellValueTypes=False,
           headerLines=None,
           ignoreLines=None,
           includeFileSources=False,
           limit=None,
           linesPerRow=None,
           processQuotes=True,
           projectName=None,
           projectTags=None,
           recordPath=None,
           separator=None,
           sheets=None,
           skipDataLines=None,
           storeBlankCellsAsNulls=True,
           storeBlankRows=True,
           storeEmptyStrings=True,
           trimStrings=False
           ):
    """Create a new project from file."""
    # guess format from file extension
    if not project_format:
        project_format = os.path.splitext(project_file)[1][1:].lower()
        if project_format == 'txt':
            try:
                columnWidths[0]
                project_format = 'fixed-width'
            except TypeError:
                project_format = 'line-based'
    # defaults for each file type
    if project_format == 'xml':
        project_format = 'text/xml'
        if not recordPath:
            recordPath = [ElementTree.parse(project_file).getroot().tag]
    elif project_format == 'csv':
        project_format = 'text/line-based/*sv'
    elif project_format == 'tsv':
        project_format = 'text/line-based/*sv'
        if not separator:
            separator = '\t'
    elif project_format == 'line-based':
        project_format = 'text/line-based'
        if not skipDataLines:
            skipDataLines = -1
    elif project_format == 'fixed-width':
        project_format = 'text/line-based/fixed-width'
        if not headerLines:
            headerLines = 0
    elif project_format == 'json':
        project_format = 'text/json'
        if not recordPath:
            recordPath = ['_', '_']
    elif project_format == 'xls':
        project_format = 'binary/text/xml/xls/xlsx'
        if not sheets:
            sheets = [0]
            # TODO: new format for sheets option introduced in OpenRefine 2.8
    elif project_format == 'xlsx':
        project_format = 'binary/text/xml/xls/xlsx'
        if not sheets:
            sheets = [0]
            # TODO: new format for sheets option introduced in OpenRefine 2.8
    elif project_format == 'ods':
        project_format = 'text/xml/ods'
        if not sheets:
            sheets = [0]
            # TODO: new format for sheets option introduced in OpenRefine 2.8
    # execute
    kwargs = {k: v for k, v in vars().items() if v is not None}
    project = refine.Refine(refine.RefineServer()).new_project(
        guess_cell_value_types=guessCellValueTypes,
        ignore_lines=ignoreLines,
        header_lines=headerLines,
        skip_data_lines=skipDataLines,
        store_blank_rows=storeBlankRows,
        process_quotes=processQuotes,
        project_name=projectName,
        store_blank_cells_as_nulls=storeBlankCellsAsNulls,
        include_file_sources=includeFileSources,
        **kwargs)
    rows = project.do_json('get-rows')['total']
    if rows > 0:
        print('{0}: {1}'.format('id', project.project_id))
        print('{0}: {1}'.format('rows', rows))
        return project
    else:
        raise Exception(
            'Project contains 0 rows. Please check --help for mandatory '
            'arguments for xml, json, xlsx and ods')
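
Note on `create()`: the format detection at the top of the function can be read as a lookup from file extension to OpenRefine format string. The sketch below restates it as a table for illustration only; `FORMATS` and `guess_format` are hypothetical names, the per-format defaults (separator, record path, sheets) are left out, and an empty `column_widths` list is treated like a missing one, which is a simplification of the `try`/`except` above.

```python
import os

# Mapping taken from the branches of create() above.
FORMATS = {
    'xml': 'text/xml',
    'csv': 'text/line-based/*sv',
    'tsv': 'text/line-based/*sv',
    'line-based': 'text/line-based',
    'fixed-width': 'text/line-based/fixed-width',
    'json': 'text/json',
    'xls': 'binary/text/xml/xls/xlsx',
    'xlsx': 'binary/text/xml/xls/xlsx',
    'ods': 'text/xml/ods',
}


def guess_format(project_file, column_widths=None):
    """Guess the OpenRefine import format from the file extension."""
    ext = os.path.splitext(project_file)[1][1:].lower()
    if ext == 'txt':
        # .txt is fixed-width only if column widths were supplied
        ext = 'fixed-width' if column_widths else 'line-based'
    # Unknown extensions are passed through unchanged.
    return FORMATS.get(ext, ext)


print(guess_format('duplicates.csv'))
print(guess_format('notes.txt', column_widths=[7, 5]))
```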
def delete(project_id):
    """Delete project."""
    project = refine.RefineProject(project_id)
    response = project.delete()
    if response != True:
        raise Exception('Failed to delete %s: %s' %
                        (project_id, response))
    else:
        print('Project %s has been successfully deleted' % project_id)
def download(url, output_file=None):
    """Integrated download function for your convenience."""
    if not output_file:
        output_file = os.path.basename(url)
    if os.path.exists(output_file):
        print('Error: File %s already exists.\n'
              'Delete existing file or try command --output '
              'to specify a different filename.' % output_file)
        return
    # Workaround for SSL verification problems in one-file-executables
    context = ssl._create_unverified_context()
    urllib.urlretrieve(url, output_file, context=context)
    print('Download to file %s complete' % output_file)
def export(project_id, encoding=None, output_file=None, export_format=None):
    """Dump a project to stdout or file."""
    project = refine.RefineProject(project_id)
    if not output_file:
        if not export_format:
            export_format = 'tsv'
        if export_format in ['csv', 'tsv', 'txt']:
            encoding = 'UTF-8'
        sys.stdout.write(project.export(
            export_format=export_format, encoding=encoding).read())
    else:
        ext = os.path.splitext(output_file)[1][1:]
        if ext and not export_format:
            export_format = ext.lower()
        if not export_format:
            export_format = 'tsv'
        if export_format in ['csv', 'tsv', 'txt']:
            encoding = 'UTF-8'
        with open(output_file, 'wb') as f:
            f.write(project.export(
                export_format=export_format, encoding=encoding).read())
        print('Export to file %s complete' % output_file)
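
Note on `export()`: the rules for picking the export format (explicit `export_format` first, then the extension of the output file, then `tsv`) can be checked without a server. This is a condensed sketch for illustration; `resolve_export_format` is a hypothetical helper, and it returns `None` as encoding for non-text formats, whereas `export()` keeps whatever encoding the caller passed in.

```python
import os


def resolve_export_format(output_file=None, export_format=None):
    """Return (export_format, encoding) following the rules of export()."""
    if output_file and not export_format:
        ext = os.path.splitext(output_file)[1][1:]
        if ext:
            export_format = ext.lower()
    if not export_format:
        export_format = 'tsv'  # fallback for stdout or missing extension
    # Text formats are always exported as UTF-8.
    encoding = 'UTF-8' if export_format in ('csv', 'tsv', 'txt') else None
    return export_format, encoding


print(resolve_export_format())
print(resolve_export_format('deduped.xls'))
print(resolve_export_format('out.dat', export_format='csv'))
```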
def info(project_id):
    """Show project metadata"""
    projects = refine.Refine(refine.RefineServer()).list_projects()
    if project_id in projects.keys():
        print('{0:>20}: {1}'.format('id', project_id))
        print('{0:>20}: {1}'.format('url', 'http://' +
                                    refine.REFINE_HOST + ':' +
                                    refine.REFINE_PORT +
                                    '/project?project=' + project_id))
        for k, v in projects[project_id].items():
            if v:
                print(u'{0:>20}: {1}'.format(k, v))
        project_model = refine.RefineProject(project_id).get_models()
        columns = [c['name'] for c in project_model['columnModel']['columns']]
        for (i, v) in enumerate(columns, start=1):
            print(u'{0:>20}: {1}'.format(u'column ' + str(i).zfill(3), v).encode('utf-8'))
    else:
        print('Error: No project found with id %s.\n'
              'Check existing projects with command --list' % (project_id))
def ls():
"""Query the server and list projects sorted by mtime."""
projects = refine.Refine(refine.RefineServer()).list_projects().items()
def date_to_epoch(json_dt):
"""Convert a JSON date time into seconds-since-epoch."""
return time.mktime(time.strptime(json_dt, '%Y-%m-%dT%H:%M:%SZ'))
projects.sort(key=lambda v: date_to_epoch(v[1]['modified']), reverse=True)
if projects:
for project_id, project_info in projects:
print(u'{0:>14}: {1}'.format(project_id, project_info['name']).encode('utf-8'))
else:
print('Error: No projects found')
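
The sorting step in `ls` relies on `dict.items()` returning a list, which only holds on Python 2. A version-independent sketch could look as follows. Two things here are assumptions rather than the client's code: the helper name `sort_by_modified` is hypothetical, and `calendar.timegm` is swapped in for `time.mktime` on the assumption that the trailing `Z` means the timestamps are UTC.

```python
import calendar
import time


def sort_by_modified(projects):
    """Return (project_id, info) pairs, most recently modified first."""
    def to_epoch(json_dt):
        # timegm treats the parsed struct_time as UTC, matching the 'Z' suffix
        return calendar.timegm(time.strptime(json_dt, '%Y-%m-%dT%H:%M:%SZ'))

    return sorted(projects.items(),
                  key=lambda item: to_epoch(item[1]['modified']),
                  reverse=True)
```

Because only the relative order matters for the listing, the choice between local time and UTC rarely changes the result, but the UTC variant avoids surprises around daylight saving changes.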
def templating(project_id,
template,
encoding='UTF-8',
output_file=None,
mode=None,
prefix='',
rowSeparator='\n',
suffix='',
filterQuery=None,
filterColumn=None,
facets=None,
splitToFiles=False,
suffixById=None
):
"""Dump a project to stdout or file with templating."""
project = refine.RefineProject(project_id)
# basic config
templateconfig = {'prefix': prefix,
'suffix': suffix,
'template': template,
'rowSeparator': rowSeparator,
'encoding': encoding}
# construct the engine config
if mode == 'record-based':
engine = {'facets': [], 'mode': 'record-based'}
else:
engine = {'facets': [], 'mode': 'row-based'}
if facets:
engine['facets'].append(json.loads(facets))
if filterQuery:
if not filterColumn:
filterColumn = project.get_models()['columnModel']['keyColumnName']
textFilter = {'type': 'text',
'name': filterColumn,
'columnName': filterColumn,
'mode': 'regex',
'caseSensitive': False,
'query': filterQuery}
engine['facets'].append(textFilter)
templateconfig.update({'engine': json.dumps(engine)})
if not splitToFiles:
# normal output
if not output_file:
sys.stdout.write(project.export_templating(
**templateconfig).read())
else:
with open(output_file, 'wb') as f:
f.write(project.export_templating(**templateconfig).read())
print('Export to file %s complete' % output_file)
else:
# splitToFiles functionality
prefix = templateconfig['prefix']
suffix = templateconfig['suffix']
split = '===|||THISISTHEBEGINNINGOFANEWRECORD|||==='
# default the output name first so that base and ext are always defined
output_file = output_file or time.strftime('%Y%m%d')
base = os.path.splitext(output_file)[0]
ext = os.path.splitext(output_file)[1][1:]
if not ext:
ext = 'txt'
# generate config for subfeature suffixById
if suffixById:
ids_template = ('{{forNonBlank(' +
'with(row.columnNames[0],cn,cells[cn].value),' +
'v,v,"")}}')
ids_templateconfig = {'engine': json.dumps(engine),
'template': ids_template,
'rowSeparator': '\n',
'encoding': encoding}
ids = [line.rstrip('\n') for line in project.export_templating(
**ids_templateconfig) if line.rstrip('\n')]
# generate common config
if mode == 'record-based':
# record-based: split-character into template
# if key column is not blank (=record)
template = ('{{forNonBlank(' +
'with(row.columnNames[0],cn,cells[cn].value),' +
'v,"' + split + '", "")}}' +
templateconfig['template'])
templateconfig.update({'prefix': '',
'suffix': '',
'template': template,
'rowSeparator': ''})
else:
# row-based: split-character into template
template = split + templateconfig['template']
templateconfig.update({'prefix': '',
'suffix': '',
'template': template,
'rowSeparator': ''})
# execute
records = project.export_templating(
**templateconfig).read().split(split)
del records[0] # skip first blank entry
if suffixById:
for index, record in enumerate(records):
output_file = base + '_' + ids[index] + '.' + ext
with open(output_file, 'wb') as f:
f.writelines([prefix, record, suffix])
print('Export to files complete. Last file: %s' % output_file)
else:
zeros = len(str(len(records)))
for index, record in enumerate(records):
output_file = base + '_' + \
str(index + 1).zfill(zeros) + '.' + ext
with open(output_file, 'wb') as f:
f.writelines([prefix, record, suffix])
print('Export to files complete. Last file: %s' % output_file)
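
The `splitToFiles` branch above boils down to three steps: prefix every record with a marker string, split the exported text on that marker, and number the resulting files with zero padding. The sketch below shows only the splitting and naming part, without any server call. `plan_split_files` is a hypothetical helper introduced for illustration, not a function of the client.

```python
def plan_split_files(exported, marker, base, ext='txt'):
    """Return (filename, record) pairs for an export in which every
    record starts with `marker` (hypothetical helper)."""
    records = exported.split(marker)
    del records[0]  # text before the first marker is an empty entry
    width = len(str(len(records)))  # pad numbers so that file names sort
    return [('%s_%s.%s' % (base, str(i + 1).zfill(width), ext), record)
            for i, record in enumerate(records)]
```

With ten records the counter is padded to two digits (`out_01.txt` up to `out_10.txt`), so the files sort in record order.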

View File

@ -1,6 +1,6 @@
#!/usr/bin/env python
"""
Google Refine Facets, Engine, and Facet Responses.
OpenRefine Facets, Engine, and Facet Responses.
"""
# Copyright (c) 2011 Paul Makepeace, Real Programmers. All rights reserved.
@ -28,6 +28,7 @@ def to_camel(attr):
return (attr[0].lower() +
re.sub(r'_(.)', lambda x: x.group(1).upper(), attr[1:]))
def from_camel(attr):
"""convert thisAttrName to this_attr_name."""
# Don't add an underscore for capitalized first letter
@ -35,8 +36,8 @@ def from_camel(attr):
class Facet(object):
def __init__(self, column, type, **options):
self.type = type
def __init__(self, column, facet_type, **options):
self.type = facet_type
self.name = column
self.column_name = column
for k, v in options.items():
@ -50,17 +51,17 @@ class Facet(object):
class TextFilterFacet(Facet):
def __init__(self, column, query, **options):
super(TextFilterFacet, self).__init__(
column, query=query, case_sensitive=False, type='text',
column, query=query, case_sensitive=False, facet_type='text',
mode='text', **options)
class TextFacet(Facet):
def __init__(self, column, selection=None, expression='value',
omit_blank=False, omit_error=False, select_blank=False,
select_error=False, invert=False, **options):
omit_blank=False, omit_error=False, select_blank=False,
select_error=False, invert=False, **options):
super(TextFacet, self).__init__(
column,
type='list',
facet_type='list',
omit_blank=omit_blank,
omit_error=omit_error,
select_blank=select_blank,
@ -99,37 +100,39 @@ class BoolFacet(TextFacet):
raise ValueError('selection must be True or False.')
if expression is None:
raise ValueError('Missing expression')
super(BoolFacet, self).__init__(column,
expression=expression, selection=selection)
super(BoolFacet, self).__init__(
column, expression=expression, selection=selection)
class StarredFacet(BoolFacet):
def __init__(self, selection=None):
super(StarredFacet, self).__init__('',
expression='row.starred', selection=selection)
super(StarredFacet, self).__init__(
'', expression='row.starred', selection=selection)
class FlaggedFacet(BoolFacet):
def __init__(self, selection=None):
super(FlaggedFacet, self).__init__('',
expression='row.flagged', selection=selection)
super(FlaggedFacet, self).__init__(
'', expression='row.flagged', selection=selection)
class BlankFacet(BoolFacet):
def __init__(self, column, selection=None):
super(BlankFacet, self).__init__(column,
expression='isBlank(value)', selection=selection)
super(BlankFacet, self).__init__(
column, expression='isBlank(value)', selection=selection)
class ReconJudgmentFacet(TextFacet):
def __init__(self, column, **options):
super(ReconJudgmentFacet, self).__init__(column,
super(ReconJudgmentFacet, self).__init__(
column,
expression=('forNonBlank(cell.recon.judgment, v, v, '
'if(isNonBlank(value), "(unreconciled)", "(blank)"))'),
**options)
# Capitalize 'From' to get around python's reserved word.
#noinspection PyPep8Naming
class NumericFacet(Facet):
def __init__(self, column, From=None, to=None, expression='value',
select_blank=True, select_error=True, select_non_numeric=True,
@ -139,7 +142,7 @@ class NumericFacet(Facet):
From=From,
to=to,
expression=expression,
type='range',
facet_type='range',
select_blank=select_blank,
select_error=select_error,
select_non_numeric=select_non_numeric,
@ -155,10 +158,12 @@ class NumericFacet(Facet):
class FacetResponse(object):
"""Class for unpacking an individual facet response."""
def __init__(self, facet):
self.name = None
for k, v in facet.items():
if isinstance(k, bool) or isinstance(k, basestring):
setattr(self, from_camel(k), v)
self.choices = {}
class FacetChoice(object):
def __init__(self, c):
self.count = c['c']
@ -188,11 +193,14 @@ class FacetsResponse(object):
def __init__(self, engine, facets):
class FacetResponseContainer(object):
facets = None
def __init__(self, facet_responses):
self.facets = [FacetResponse(fr) for fr in facet_responses]
def __iter__(self):
for facet in self.facets:
yield facet
def __getitem__(self, index):
if not isinstance(index, int):
index = engine.facet_index_by_id[id(index)]
@ -205,10 +213,10 @@ class FacetsResponse(object):
class Engine(object):
"""An Engine keeps track of Facets, and responses to facet computation."""
facets = []
facet_index_by_id = {} # dict of facets by Facet object id
def __init__(self, *facets, **kwargs):
self.facets = []
self.facet_index_by_id = {} # dict of facets by Facet object id
self.set_facets(*facets)
self.mode = kwargs.get('mode', 'row-based')
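
The last hunk appears to move `facets` and `facet_index_by_id` from class attributes into `__init__`. A minimal sketch of why that matters in general: a mutable class attribute is one object shared by all instances. The class names below are hypothetical, and whether the old `Engine` actually showed this behavior depends on `set_facets`, which is not part of this diff.

```python
class SharedFacets(object):
    facets = []  # class attribute: one list shared by all instances

    def add(self, facet):
        self.facets.append(facet)


class OwnFacets(object):
    def __init__(self):
        self.facets = []  # instance attribute: a fresh list per object

    def add(self, facet):
        self.facets.append(facet)


first, second = SharedFacets(), SharedFacets()
first.add('text')
shared_leak = second.facets  # the untouched instance sees the facet too

third, fourth = OwnFacets(), OwnFacets()
third.add('text')
isolated = fourth.facets  # the untouched instance stays empty
```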

View File

@ -1,6 +1,6 @@
#!/usr/bin/env python
"""
Google Refine history: parsing responses.
OpenRefine history: parsing responses.
"""
# Copyright (c) 2011 Paul Makepeace, Real Programmers. All rights reserved.
@ -18,15 +18,13 @@ Google Refine history: parsing responses.
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>
import json
import re
class HistoryEntry(object):
# N.B. e.g. **response['historyEntry'] won't work as keys are unicode :-/
def __init__(self, id=None, time=None, description=None, **kwargs):
if id is None:
#noinspection PyUnusedLocal
def __init__(self, history_entry_id=None, time=None, description=None, **kwargs):
if history_entry_id is None:
raise ValueError('History entry id must be set')
self.id = id
self.id = history_entry_id
self.description = description
self.time = time

View File

@ -33,8 +33,8 @@ import urlparse
from google.refine import facet
from google.refine import history
REFINE_HOST = os.environ.get('GOOGLE_REFINE_HOST', '127.0.0.1')
REFINE_PORT = os.environ.get('GOOGLE_REFINE_PORT', '3333')
REFINE_HOST = os.environ.get('OPENREFINE_HOST', os.environ.get('GOOGLE_REFINE_HOST', '127.0.0.1'))
REFINE_PORT = os.environ.get('OPENREFINE_PORT', os.environ.get('GOOGLE_REFINE_PORT', '3333'))
class RefineServer(object):
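
The nested `os.environ.get` calls in the hunk above implement a simple precedence: new variable name first, legacy name second, built-in default last. As a sketch, with the hypothetical helper `resolve_setting` standing in for the inline expression:

```python
import os


def resolve_setting(environ, new_name, old_name, default):
    """Prefer the new variable, then the legacy one, then the default."""
    return environ.get(new_name, environ.get(old_name, default))


# Same lookup order as the REFINE_HOST line in the hunk above
REFINE_HOST = resolve_setting(os.environ, 'OPENREFINE_HOST',
                              'GOOGLE_REFINE_HOST', '127.0.0.1')
```

Passing the mapping in explicitly makes the precedence easy to check without touching the real environment.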
@ -50,9 +50,21 @@ class RefineServer(object):
def __init__(self, server=None):
if server is None:
server=self.url()
server = self.url()
self.server = server[:-1] if server.endswith('/') else server
self.__version = None # see version @property below
self.token = None # CSRF token introduced in OpenRefine 3.3
self.get_csrf_token()
def get_csrf_token(self):
"""Return csrf token."""
try:
url = self.server + '/command/core/get-csrf-token'
response = json.loads(urllib2.urlopen(url).read())
self.token = response['token']
return self.token
except Exception:
pass # fail silently to not disturb usage of OpenRefine <3.3
def urlopen(self, command, data=None, params=None, project_id=None):
"""Open a Refine URL with optional query params and POST data.
@ -73,18 +85,24 @@ class RefineServer(object):
data['project'] = project_id
else:
params['project'] = project_id
# be lazy and send the token for each API call (even when not needed)
if self.token:
params['csrf_token'] = self.token
if params:
url += '?' + urllib.urlencode(params)
req = urllib2.Request(url)
if data:
req.add_data(data) # data = urllib.urlencode(data)
req.add_data(data) # data = urllib.urlencode(data)
#req.add_header('Accept-Encoding', 'gzip')
try:
response = urllib2.urlopen(req)
except urllib2.URLError as (url_error,):
except urllib2.HTTPError as e:
raise Exception('HTTP %d "%s" for %s\n\t%s' %
(e.code, e.msg, e.geturl(), data))
except urllib2.URLError as e:
raise urllib2.URLError(
'%s for %s. No Refine server reachable/running; ENV set?' %
(url_error, self.server))
(e.reason, self.server))
if response.info().get('Content-Encoding', None) == 'gzip':
# Need a seekable filestream for gzip
gzip_fp = gzip.GzipFile(fileobj=StringIO.StringIO(response.read()))
@ -96,9 +114,13 @@ class RefineServer(object):
"""Open a Refine URL, optionally POST data, and return parsed JSON."""
response = json.loads(self.urlopen(*args, **kwargs).read())
if 'code' in response and response['code'] not in ('ok', 'pending'):
raise Exception(
response['code'] + ': ' +
response.get('message', response.get('stack', response)))
if response.get('message') == 'Missing or invalid csrf_token parameter':
# token missing or expired: fetch a fresh one and retry once
self.get_csrf_token()
response = json.loads(self.urlopen(*args, **kwargs).read())
return response
error_message = ('server ' + response['code'] + ': ' +
str(response.get('message', response.get('stack', response))))
raise Exception(error_message)
return response
def get_version(self):
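
The retry logic added to `urlopen_json` in the hunk above can be sketched without any network access. In this sketch `send` and `refresh_token` are hypothetical callables that stand in for `json.loads(self.urlopen(...).read())` and `self.get_csrf_token()`. It is an illustration of the control flow, not the client's code.

```python
CSRF_ERROR = 'Missing or invalid csrf_token parameter'


def call_with_csrf_retry(send, refresh_token):
    """Call send() and retry once after refresh_token() on a CSRF error.

    send and refresh_token are hypothetical stand-ins for the HTTP call
    and for get_csrf_token.
    """
    response = send()
    if response.get('code') in (None, 'ok', 'pending'):
        return response
    if response.get('message') == CSRF_ERROR:
        refresh_token()
        return send()  # retried once; the second answer is returned as is
    raise Exception('server %s: %s' % (
        response['code'],
        response.get('message', response.get('stack', response))))
```

As in the diff, the second response is returned unchecked, so a request that fails again after the token refresh does not raise.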
@ -114,6 +136,7 @@ class RefineServer(object):
self.__version = self.get_version()['version']
return self.__version
class Refine:
"""Class representing a connection to a Refine server."""
def __init__(self, server):
@ -144,35 +167,115 @@ class Refine:
"""Open a Refine project."""
return RefineProject(self.server, project_id)
def new_project(self, project_file=None, project_url=None,
project_name=None,
split_into_columns=True,
separator='',
ignore_initial_non_blank_lines=0,
header_lines=1, # use 0 if your data has no header
skip_initial_data_rows=0,
limit=None, # no more than this number of rows
guess_value_type=True, # numbers, dates, etc.
ignore_quotes=False):
# These aren't used yet but are included for reference
new_project_defaults = {
'text/line-based/*sv': {
'encoding': '',
'separator': ',',
'ignore_lines': -1,
'header_lines': 1,
'skip_data_lines': 0,
'limit': -1,
'store_blank_rows': True,
'guess_cell_value_types': True,
'process_quotes': True,
'store_blank_cells_as_nulls': True,
'include_file_sources': False},
'text/line-based': {
'encoding': '',
'lines_per_row': 1,
'ignore_lines': -1,
'limit': -1,
'skip_data_lines': -1,
'store_blank_rows': True,
'store_blank_cells_as_nulls': True,
'include_file_sources': False},
'text/line-based/fixed-width': {
'encoding': '',
'column_widths': [20],
'ignore_lines': -1,
'header_lines': 0,
'skip_data_lines': 0,
'limit': -1,
'guess_cell_value_types': False,
'store_blank_rows': True,
'store_blank_cells_as_nulls': True,
'include_file_sources': False},
'text/line-based/pc-axis': {
'encoding': '',
'limit': -1,
'skip_data_lines': -1,
'include_file_sources': False},
'text/rdf+n3': {'encoding': ''},
'text/xml/ods': {
'sheets': [],
'ignore_lines': -1,
'header_lines': 1,
'skip_data_lines': 0,
'limit': -1,
'store_blank_rows': True,
'store_blank_cells_as_nulls': True,
'include_file_sources': False},
'binary/xls': {
'xml_based': False,
'sheets': [],
'ignore_lines': -1,
'header_lines': 1,
'skip_data_lines': 0,
'limit': -1,
'store_blank_rows': True,
'store_blank_cells_as_nulls': True,
'include_file_sources': False}
}
if ((project_file and project_url) or
(not project_file and not project_url)):
def new_project(self, project_file=None, project_url=None, project_name=None, project_format='text/line-based/*sv',
encoding='',
separator=',',
ignore_lines=-1,
header_lines=1,
skip_data_lines=0,
limit=-1,
store_blank_rows=True,
guess_cell_value_types=False,
process_quotes=True,
store_blank_cells_as_nulls=True,
include_file_sources=False,
**opts):
if (project_file and project_url) or (not project_file and not project_url):
raise ValueError('One (only) of project_file and project_url must be set')
def s(opt):
if isinstance(opt, bool):
return 'on' if opt else ''
return 'true' if opt else 'false'
if opt is None:
return ''
return str(opt)
options = {
'split-into-columns': s(split_into_columns),
'separator': s(separator),
'ignore': s(ignore_initial_non_blank_lines),
'header-lines': s(header_lines),
'skip': s(skip_initial_data_rows), 'limit': s(limit),
'guess-value-type': s(guess_value_type),
'ignore-quotes': s(ignore_quotes),
# the new API requires a JSON object in the 'options' POST or GET argument
# POST is broken at the moment, so we send it in the URL
new_style_options = dict(opts, **{
'encoding': s(encoding),
'separator': s(separator)
})
params = {
'options': json.dumps(new_style_options),
}
# old style options
options = {
'format': project_format,
'ignore-lines': s(ignore_lines),
'header-lines': s(header_lines),
'skip-data-lines': s(skip_data_lines),
'limit': s(limit),
'guess-value-type': s(guess_cell_value_types),
'process-quotes': s(process_quotes),
'store-blank-rows': s(store_blank_rows),
'store-blank-cells-as-nulls': s(store_blank_cells_as_nulls),
'include-file-sources': s(include_file_sources)
}
if project_url is not None:
options['url'] = project_url
elif project_file is not None:
@ -185,7 +288,9 @@ class Refine:
project_name = (project_file or 'New project').rsplit('.', 1)[0]
project_name = os.path.basename(project_name)
options['project-name'] = project_name
response = self.server.urlopen('create-project-from-upload', options)
response = self.server.urlopen(
'create-project-from-upload', options, params
)
# expecting a redirect to the new project containing the id in the url
url_params = urlparse.parse_qs(
urlparse.urlparse(response.geturl()).query)
@ -211,6 +316,7 @@ def RowsResponseFactory(column_index):
self.index = row_response['i']
self.row = [c['v'] if c else None
for c in row_response['cells']]
def __getitem__(self, column):
# Trailing nulls seem to be stripped from row data
try:
@ -220,11 +326,14 @@ def RowsResponseFactory(column_index):
def __init__(self, rows_response):
self.rows_response = rows_response
def __iter__(self):
for row_response in self.rows_response:
yield self.RefineRow(row_response)
def __getitem__(self, index):
return self.RefineRow(self.rows_response[index])
def __len__(self):
return len(self.rows_response)
@ -240,7 +349,7 @@ def RowsResponseFactory(column_index):
class RefineProject:
"""A Google Refine project."""
"""An OpenRefine project."""
def __init__(self, server, project_id=None):
if not isinstance(server, RefineServer):
@ -309,7 +418,10 @@ class RefineProject:
for i, column in enumerate(column_model['columns']):
name = column['name']
self.column_order[name] = i
column_index[name] = column['cellIndex']
try:
column_index[name] = column['cellIndex']
except KeyError:
column_index[name] = i
self.key_column = column_model['keyColumnName']
self.has_records = response['recordModel'].get('hasRecords', False)
self.rows_response_factory = RowsResponseFactory(column_index)
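
The `try`/`except KeyError` in the hunk above falls back to the column's position when the model carries no `cellIndex`. The commit list mentions that `get-models` differs between OpenRefine 3.x and 4.x, which is presumably the reason. A compact sketch of the same fallback, using the hypothetical helper `build_column_index`:

```python
def build_column_index(columns):
    """Map column name to cell index, falling back to list position."""
    column_index = {}
    for i, column in enumerate(columns):
        # dict.get with a default behaves like the try/except KeyError above
        column_index[column['name']] = column.get('cellIndex', i)
    return column_index
```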
@ -331,18 +443,38 @@ class RefineProject:
return
def apply_operations(self, file_path, wait=True):
json = open(file_path).read()
response_json = self.do_json('apply-operations', {'operations': json})
json_data = open(file_path).read()
response_json = self.do_json('apply-operations', {'operations': json_data})
if response_json['code'] == 'pending' and wait:
self.wait_until_idle()
return 'ok'
return response_json['code'] # can be 'ok' or 'pending'
return response_json['code'] # can be 'ok' or 'pending'
def export(self, export_format='tsv'):
def export(self, encoding=None, export_format='tsv'):
"""Return a fileobject of a project's data."""
url = ('export-rows/' + urllib.quote(self.project_name()) + '.' +
export_format)
return self.do_raw(url, data={'format': export_format})
url = ('export-rows/' +
urllib.quote(self.project_name().encode('utf8')) +
'.' + export_format)
data = {'format': export_format}
if encoding:
data['encoding'] = encoding
return self.do_raw(url, data)
def export_templating(self, encoding=None, engine='', prefix='',
template='', rowSeparator='\n', suffix=''):
"""Return a fileobject of a project's data in templating mode."""
url = ('export-rows/' +
urllib.quote(self.project_name().encode('utf8')) +
'.' + 'txt')
data = {'format': 'template',
'template': template,
'engine': engine,
'prefix': prefix,
'suffix': suffix,
'separator': rowSeparator}
if encoding:
data['encoding'] = encoding
return self.do_raw(url, data)
def export_rows(self, **kwargs):
"""Return an iterable of parsed rows of a project's data."""
@ -426,6 +558,7 @@ class RefineProject:
},
},
}
def compute_clusters(self, column, clusterer_type='binning',
function=None, params=None):
"""Returns a list of clusters of {'value': ..., 'count': ...}."""
@ -443,7 +576,7 @@ class RefineProject:
def annotate_one_row(self, row, annotation, state=True):
if annotation not in ('starred', 'flagged'):
raise ValueError('annotation must be one of starred or flagged')
state = 'true' if state == True else 'false'
state = 'true' if state is True else 'false'
return self.do_json('annotate-one-row', {'row': row.index,
annotation: state})
@ -457,18 +590,19 @@ class RefineProject:
column_insert_index=None, on_error='set-to-blank'):
if column_insert_index is None:
column_insert_index = self.column_order[column] + 1
response = self.do_json('add-column', {'baseColumnName': column,
'newColumnName': new_column, 'expression': expression,
'columnInsertIndex': column_insert_index, 'onError': on_error})
response = self.do_json('add-column', {
'baseColumnName': column, 'newColumnName': new_column,
'expression': expression, 'columnInsertIndex': column_insert_index,
'onError': on_error})
self.get_models()
return response
def split_column(self, column, separator=',', mode='separator',
regex=False, guess_cell_type=True,
remove_original_column=True):
response = self.do_json('split-column', {'columnName': column,
'separator': separator, 'mode': mode, 'regex': regex,
'guessCellType': guess_cell_type,
response = self.do_json('split-column', {
'columnName': column, 'separator': separator, 'mode': mode,
'regex': regex, 'guessCellType': guess_cell_type,
'removeOriginalColumn': remove_original_column})
self.get_models()
return response
@ -505,9 +639,11 @@ class RefineProject:
self.get_models()
return response
def transpose_columns_into_rows(self, start_column, column_count,
combined_column_name, separator=':', prepend_column_name=True,
ignore_blank_cells=True):
def transpose_columns_into_rows(
self, start_column, column_count,
combined_column_name, separator=':', prepend_column_name=True,
ignore_blank_cells=True):
response = self.do_json('transpose-columns-into-rows', {
'startColumnName': start_column, 'columnCount': column_count,
'combinedColumnName': combined_column_name,
@ -550,7 +686,8 @@ class RefineProject:
return recon_service
return None
def reconcile(self, column, service, type=None, config=None):
def reconcile(self, column, service, reconciliation_type=None,
reconciliation_config=None):
"""Perform a reconciliation asynchronously.
config: {
@ -570,21 +707,21 @@ class RefineProject:
for reconciliation to complete.
"""
# Create a reconciliation config by looking up recon service info
if config is None:
if reconciliation_config is None:
service = self.get_reconciliation_service_by_name_or_url(service)
if type is None:
if reconciliation_type is None:
raise ValueError('Must have at least one of reconciliation_config or reconciliation_type')
config = {
reconciliation_config = {
'mode': 'standard-service',
'service': service['url'],
'identifierSpace': service['identifierSpace'],
'schemaSpace': service['schemaSpace'],
'type': {
'id': type['id'],
'name': type['name'],
'id': reconciliation_type['id'],
'name': reconciliation_type['name'],
},
'autoMatch': True,
'columnDetails': [],
}
return self.do_json('reconcile', {
'columnName': column, 'config': json.dumps(config)})
'columnName': column, 'config': json.dumps(reconciliation_config)})
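
The reworked `new_project` earlier in this file serializes option values with the small helper `s()` and sends the new-style options as one JSON string in the `options` query parameter. The sketch below isolates that step. `s()` follows the new lines of the diff, while `build_params` is a hypothetical wrapper added for illustration.

```python
import json


def s(opt):
    """Serialize an option value as in the new new_project()."""
    if isinstance(opt, bool):  # checked first, since bool is a subclass of int
        return 'true' if opt else 'false'
    if opt is None:
        return ''
    return str(opt)


def build_params(encoding='', separator=',', **opts):
    """Build the query parameters that carry the JSON 'options' argument."""
    new_style_options = dict(opts, encoding=s(encoding), separator=s(separator))
    return {'options': json.dumps(new_style_options)}
```

Extra keyword arguments such as `sheets=[0]` pass through unchanged, mirroring the `**opts` catch-all in the diff.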

BIN
openrefine-client-peek.gif Normal file

Binary file not shown (size: 1.7 MiB).

View File

@ -1,13 +1,6 @@
#!/usr/bin/env python
"""
Script to provide a command line interface to a Refine server.
Examples,
refine --list # show list of Refine projects, ID: name
refine --export 1234... > project.tsv
refine --export --output=project.xls 1234...
refine --apply trim.json 1234...
"""
# Copyright (c) 2011 Paul Makepeace, Real Programmers. All rights reserved.
@ -25,79 +18,18 @@ refine --apply trim.json 1234...
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>
import optparse
import os
import sys
import time
from google.refine import refine
from google.refine import __main__, cli, refine
PARSER = optparse.OptionParser(
usage='usage: %prog [--help | OPTIONS] [project ID/URL]')
PARSER.add_option('-H', '--host', dest='host',
help='Google Refine hostname')
PARSER.add_option('-P', '--port', dest='port',
help='Google Refine port')
PARSER.add_option('-o', '--output', dest='output',
help='Output filename')
# Options that are more like commands
PARSER.add_option('-l', '--list', dest='list', action='store_true',
help='List projects')
PARSER.add_option('-E', '--export', dest='export', action='store_true',
help='Export project')
PARSER.add_option('-f', '--apply', dest='apply',
help='Apply a JSON commands file to a project')
def list_projects():
"""Query the Refine server and list projects by ID: name."""
projects = refine.Refine(refine.RefineServer()).list_projects().items()
def date_to_epoch(json_dt):
"Convert a JSON date time into seconds-since-epoch."
return time.mktime(time.strptime(json_dt, '%Y-%m-%dT%H:%M:%SZ'))
projects.sort(key=lambda v: date_to_epoch(v[1]['modified']), reverse=True)
for project_id, project_info in projects:
print('{0:>14}: {1}'.format(project_id, project_info['name']))
def export_project(project, options):
"""Dump a project to stdout or options.output file."""
export_format = 'tsv'
if options.output:
ext = os.path.splitext(options.output)[1][1:] # 'xls'
if ext:
export_format = ext.lower()
output = open(options.output, 'wb')
else:
output = sys.stdout
output.writelines(project.export(export_format=export_format))
output.close()
def main():
"Main."
options, args = PARSER.parse_args()
if options.host:
refine.REFINE_HOST = options.host
if options.port:
refine.REFINE_PORT = options.port
if not options.list and len(args) != 1:
PARSER.print_usage()
if options.list:
list_projects()
if args:
project = refine.RefineProject(args[0])
if options.apply:
response = project.apply_operations(options.apply)
if response != 'ok':
print >>sys.stderr, 'Failed to apply %s: %s' % (options.apply,
response)
if options.export:
export_project(project, options)
return project
# workaround for pyinstaller
if getattr(sys, 'frozen', False) and hasattr(sys, '_MEIPASS'):
reload(sys)
sys.setdefaultencoding('utf-8')
if sys.platform == "win32":
import codecs
codecs.register(lambda name: codecs.lookup(
'utf-8') if name == 'cp65001' else None)
if __name__ == '__main__':
# return project so that it's available interactively, python -i refine.py
project = main()
__main__.main()

View File

@ -1 +1 @@
https://github.com/seisen/urllib2_file/tarball/master
urllib2_file>=0.2.1

View File

@ -20,28 +20,40 @@ import os
from setuptools import setup
from setuptools import find_packages
def read(fname):
return open(os.path.join(os.path.dirname(__file__), fname)).read()
setup(name='refine-client',
version='0.2.1',
description=('The Google Refine Python Client Library provides an '
'interface to communicating with a Google Refine server.'),
long_description=read('README.rst'),
author='Paul Makepeace',
author_email='paulm@paulm.com',
url='https://github.com/PaulMakepeace/refine-client-py',
def read(filename):
return open(os.path.join(os.path.dirname(__file__), filename)).read()
setup(name='openrefine-client',
version='0.3.10',
description=('The OpenRefine Python Client Library provides an '
'interface for communicating with an OpenRefine server. '
'This fork extends the command line interface (CLI).'),
long_description=read('README.md'),
long_description_content_type='text/markdown',
author='Felix Lohmeier',
author_email='felix.lohmeier@opencultureconsulting.com',
url='https://github.com/opencultureconsulting/openrefine-client',
packages=find_packages(exclude=['tests']),
install_requires=['urllib2_file'],
python_requires='>=2.7, !=3.*',
entry_points={
'console_scripts': [ 'openrefine-client = google.refine.__main__:main' ]
},
platforms=['Any'],
keywords='openrefine client batch processing docker etl code4lib',
classifiers = [
'Development Status :: 3 - Alpha',
'Intended Audience :: Developers',
'License :: OSI Approved :: GNU General Public License (GPL)',
'Operating System :: OS Independent',
'Programming Language :: Python',
'Topic :: Software Development :: Libraries :: Python Modules',
'Topic :: Text Processing',
'Development Status :: 4 - Beta',
'Intended Audience :: Developers',
'Intended Audience :: System Administrators',
'License :: OSI Approved :: GNU General Public License (GPL)',
'License :: OSI Approved :: GNU General Public License v3 or later (GPLv3+)',
'Operating System :: OS Independent',
'Programming Language :: Python',
'Programming Language :: Python :: 2',
'Programming Language :: Python :: 2.7',
'Topic :: Software Development :: Libraries :: Python Modules',
'Topic :: Text Processing',
],
test_suite='tests',
)

123
tests-cli.sh Executable file
View File

@ -0,0 +1,123 @@
#!/bin/bash
# Script for running functional tests against the CLI
# Copyright (c) 2011 Paul Makepeace, Real Programmers. All rights reserved.
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>
# ================================== CONFIG ================================== #
cd "${BASH_SOURCE%/*}/" || exit 1
port=3334
if [[ ${1} ]]; then
version="${1}"
else
version="3.2"
fi
refine="openrefine-${version}/refine"
if [[ ${2} ]]; then
client="$(readlink -e "${2}")"
else
client="python2 $(readlink -e refine.py)"
fi
cmd="${client} -H localhost -P ${port}"
if [[ ${3} ]]; then
filename="${3%%.*}"
else
filename=""
fi
# =============================== REQUIREMENTS =============================== #
# check existence of java and cURL
if [[ -z "$(command -v java 2> /dev/null)" ]] ; then
echo 1>&2 "ERROR: OpenRefine requires a Java runtime environment (JRE)" \
"https://openjdk.java.net/install/"
exit 1
fi
if [[ -z "$(command -v curl 2> /dev/null)" ]] ; then
echo 1>&2 "ERROR: This shell script requires cURL" \
"https://curl.haxx.se/download.html"
exit 1
fi
# download OpenRefine
if [[ -z "$(readlink -e "${refine}")" ]]; then
echo "Download OpenRefine ${version}..."
mkdir -p "$(dirname "${refine}")"
curl -L --output openrefine.tar.gz \
"https://github.com/OpenRefine/OpenRefine/releases/download/${version}/openrefine-linux-${version}.tar.gz"
echo "Install OpenRefine ${version} in subdirectory $(dirname "${refine}")..."
tar -xzf openrefine.tar.gz -C "$(dirname "${refine}")" --strip 1 --totals
rm -f openrefine.tar.gz
# do not try to open OpenRefine in browser
sed -i '$ a JAVA_OPTIONS=-Drefine.headless=true' \
"$(dirname "${refine}")"/refine.ini
# set autosave period from 5 minutes to 25 hours
sed -i 's/#REFINE_AUTOSAVE_PERIOD=60/REFINE_AUTOSAVE_PERIOD=1500/' \
"$(dirname "${refine}")"/refine.ini
echo
fi
# ================================== SETUP =================================== #
dir="$(readlink -f "tests/tmp")"
mkdir -p "${dir}"
rm -f tests-cli.log
echo "start OpenRefine ${version}..."
${refine} -v warn -p ${port} -d "${dir}" &>> tests-cli.log &
pid_server=${!}
timeout 30s bash -c "until curl -s 'http://localhost:${port}' \
| cat | grep -q -o 'OpenRefine' ; do sleep 1; done" \
|| { echo 1>&2 "ERROR: starting OpenRefine server failed!"; exit 1; }
echo
# ================================== TESTS =================================== #
echo "running tests, please wait..."
tests=()
results=()
for t in tests/*${filename}*.sh; do
tests+=("${t}")
echo "======================= ${t} =======================" &>> tests-cli.log
bash "${t}" "${cmd}" "${version}" &>> tests-cli.log
results+=(${?})
done
echo
# ================================= TEARDOWN ================================= #
echo "cleanup..."
{ kill -9 "${pid_server}" && wait "${pid_server}"; } 2>/dev/null
rm -rf "${dir}"
echo
# ================================= SUMMARY ================================== #
printf "%s\t%s\n" "code" "test"
printf "%s\t%s\n" "----" "----------------"
for i in "${!tests[@]}"; do
printf "%s\t%s\n" "${results[$i]}" "${tests[$i]}"
done
echo
if [[ " ${results[*]} " =~ [1-9] ]]; then
echo "failed tests! check tests-cli.log for debugging"; echo
else
echo "all tests passed!"; echo
fi

130
tests.sh Executable file
View File

@ -0,0 +1,130 @@
#!/bin/bash
# Script for running tests with different OpenRefine and Java versions based on Docker images.
# Copyright (c) 2011 Paul Makepeace, Real Programmers. All rights reserved.
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>
# defaults:
all=(3.5.0 3.4.1 3.4 3.3 3.2-java12 3.2-java11 3.2-java10 3.2-java9 3.2 3.1-java9 3.1 3.0-java9 3.0 2.8-java9 2.8 2.8-java7 2.7 2.7-java7 2.5-java7 2.5-java6 2.1-java6 2.0-java6)
main=(3.5.0 3.4.1 3.4 3.3 3.2 3.1 3.0 2.8 2.7 2.5-java6 2.1-java6 2.0-java6)
interactively=false
port="3333"
# help screen
function usage () {
cat <<EOF
Usage: ./tests.sh [-t TAG] [-p PORT] [-i] [-a] [-j] [-h]
Script for running tests with different OpenRefine and Java versions.
It uses docker images from https://hub.docker.com/r/felixlohmeier/openrefine.
Examples:
./tests.sh -a # run tests on all OpenRefine versions (from 2.0 up to 3.5.0)
./tests.sh -t 3.5.0 # run tests on tag 3.5.0
./tests.sh -t 3.5.0 -i # run tests on tag 3.5.0 interactively (pause before and after tests)
./tests.sh -t 3.5.0 -t 2.7 # run tests on tags 3.5.0 and 2.7
Advanced:
./tests.sh -j # run tests on all OpenRefine versions and each with all supported Java versions (requires a lot of docker images to be downloaded!)
./tests.sh -t 3.1 -i -p 3334 # run tests on tag 3.1 interactively on port 3334
Running tests interactively (-i) allows you to examine the OpenRefine GUI at http://localhost:3333 (or at the port set with -p).
Execute the script concurrently in another terminal on another port (-p 3334) to compare changes in the OpenRefine GUI at http://localhost:3333 and http://localhost:3334.
Available tags (java 8 if java not mentioned in tag):
EOF
for t in ${all[*]} ; do
echo "$t"
done
exit 1
}
# check input
NUMARGS=$#
if [ "$NUMARGS" -eq 0 ]; then
usage
fi
# check system requirements
DOCKER="$(command -v docker 2> /dev/null)"
if [ -z "$DOCKER" ] ; then
echo 1>&2 "This action requires you to have 'docker' installed and present in your PATH. You can download it for free at http://www.docker.com/"
exit 1
fi
DOCKERINFO="$(docker info 2>/dev/null | grep 'Server Version')"
if [ -z "$DOCKERINFO" ]
then
echo "command 'docker info' failed, trying again with sudo..."
DOCKERINFO="$(sudo docker info 2>/dev/null | grep 'Server Version')"
if [ -z "$DOCKERINFO" ] ; then
echo 1>&2 "This action requires you to start the docker daemon. Try 'sudo systemctl start docker' or 'sudo start docker'. If the docker daemon is already running then maybe some security privileges are missing to run docker commands."
exit 1
fi
# report success only after the sudo check has actually passed
echo "OK"
docker=(sudo docker)
else
docker=(docker)
fi
CURLINFO="$(command -v curl 2>/dev/null)"
if [ -z "$CURLINFO" ] ; then
echo 1>&2 "This action requires you to have 'curl' installed and present in your PATH."
exit 1
fi
# get user input
# the leading colon enables silent error reporting, so the \? and : cases below receive OPTARG
options=":t:p:iajh"
while getopts $options opt; do
case $opt in
t ) tags+=("${OPTARG}");;
p ) port="${OPTARG}";export OPENREFINE_PORT="$port";;
i ) interactively=true;;
a ) tags=("${main[@]}");;
j ) tags=("${all[@]}");;
h ) usage ;;
\? ) echo 1>&2 "Unknown option: -$OPTARG"; usage; exit 1;;
: ) echo 1>&2 "Missing option argument for -$OPTARG"; usage; exit 1;;
* ) echo 1>&2 "Unimplemented option: -$OPTARG"; usage; exit 1;;
esac
done
shift $((OPTIND - 1))
# print config
echo "Tags: ${tags[*]}"
echo "Port: $port"
echo ""
# safe cleanup handler
cleanup()
{
echo "cleanup..."
${docker[*]} stop "$t"
}
trap "cleanup;exit" SIGHUP SIGINT SIGQUIT SIGTERM
# run setup.py tests for each docker tag
for t in ${tags[*]} ; do
echo "=== Tests for $t ==="
echo ""
echo "Begin: $(date)"
${docker[*]} run -d -p "$port":3333 --rm --name "$t" felixlohmeier/openrefine:"$t"
until curl --silent -N http://localhost:"$port" | cat | grep -q -o "Refine" ; do sleep 1; done
echo "Refine running at http://localhost:${port}"
if [ $interactively = true ]; then read -r -p "Press [Enter] key to start tests..."; fi
python2 setup.py test
if [ $interactively = true ]; then read -r -p "Press [Enter] key to stop OpenRefine..."; fi
${docker[*]} stop "$t"
echo "End: $(date)"
echo ""
done
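
Note on the startup wait above: the until loop in tests.sh has no upper bound, so it keeps polling forever if the container never comes up (tests-cli.sh wraps its own wait in timeout 30s). A minimal sketch of a bounded wait, with a hypothetical helper name wait_until and an arbitrary attempt count:

```shell
#!/bin/bash
# Hypothetical helper (not part of tests.sh): retry a command once per
# second until it succeeds or the given number of attempts is used up.
wait_until() {
  local attempts="${1}"; shift
  local i
  for ((i = 0; i < attempts; i++)); do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    sleep 1
  done
  return 1
}

# Assumed usage inside the loop over tags (the check mirrors the original one):
# check_refine() { curl --silent -N "http://localhost:${port}" | grep -q "Refine"; }
# wait_until 60 check_refine || { echo 1>&2 "OpenRefine did not start"; exit 1; }
```

Passing the check as a command keeps the helper independent of curl, so it can be tried out with any command.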

tests/apply-utf8.sh Normal file

@ -0,0 +1,57 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
a,b,c
1,2,3
0,0,0
$,\,'
DATA
cat << "DATA" > "tmp/${t}/${t}.transform"
[
{
"op": "core/column-addition",
"engineConfig": {
"mode": "row-based"
},
"newColumnName": "apply",
"columnInsertIndex": 2,
"baseColumnName": "b",
"expression": "grel:value.replace('2','⛲')",
"onError": "set-to-blank"
}
]
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
a b apply c
1 2 ⛲ 3
0 0 0 0
$ \ \ '
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --apply "tmp/${t}/${t}.transform" "${t}"
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"
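
Note on the DATA blocks used throughout these tests: the heredoc delimiter is quoted (<< "DATA"), which is why rows containing $, \ and ' reach the file unchanged. A small sketch of the difference between a quoted and an unquoted delimiter; the file names and the variable name are illustrative only:

```shell
#!/bin/bash
# Sketch only: directory, file and variable names are illustrative.
tmp="$(mktemp -d)"
name="world"

# Quoted delimiter: the body is written literally, nothing is expanded.
cat << "DATA" > "${tmp}/quoted.csv"
$name,\,'
DATA

# Unquoted delimiter: $name is expanded by the shell before writing.
cat << DATA > "${tmp}/unquoted.csv"
$name,x,'
DATA

quoted="$(cat "${tmp}/quoted.csv")"
unquoted="$(cat "${tmp}/unquoted.csv")"
echo "quoted:   ${quoted}"
echo "unquoted: ${unquoted}"
rm -rf "${tmp}"
```

The first file keeps the literal text $name, while the second contains the expanded value.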

tests/apply.sh Normal file

@ -0,0 +1,57 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
a,b,c
1,2,3
0,0,0
$,\,'
DATA
cat << "DATA" > "tmp/${t}/${t}.transform"
[
{
"op": "core/column-addition",
"engineConfig": {
"mode": "row-based"
},
"newColumnName": "apply",
"columnInsertIndex": 2,
"baseColumnName": "b",
"expression": "grel:value.replace('2','TEST')",
"onError": "set-to-blank"
}
]
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
a b apply c
1 2 TEST 3
0 0 0 0
$ \ \ '
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --apply "tmp/${t}/${t}.transform" "${t}"
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"


@ -0,0 +1,41 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}-utf8.csv"
a,b,c
1,2,3
ä,é,ß
$,\,'
DATA
iconv -f UTF-8 -t ISO-8859-1 "tmp/${t}/${t}-utf8.csv" > "tmp/${t}/${t}.csv"
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
a b c
1 2 3
ä é ß
$ \ '
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv" --encoding "ISO-8859-1"
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"


@ -0,0 +1,40 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
a,b,c
1,2,3
01,02,03
$,\,'
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
a b c
1 2 3
1 2 3
$ \ '
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv" --guessCellValueTypes "true"
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"


@ -0,0 +1,41 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
a,b,c
1,2,3
0,0,0
$,\,'
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
Column 1 Column 2 Column 3
a b c
1 2 3
0 0 0
$ \ '
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv" --headerLines "0"
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"


@ -0,0 +1,39 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
a,b,c
1,2,3
0,0,0
$,\,'
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
1 2 3
0 0 0
$ \ '
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv" --ignoreLines "1"
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"

tests/create-csv-limit.sh Normal file

@ -0,0 +1,39 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
a,b,c
1,2,3
0,0,0
$,\,'
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
a b c
1 2 3
0 0 0
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv" --limit "2"
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"


@ -0,0 +1,41 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
a,b,c
1,"2,0",3
0,0,0
$,\,'
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
a b c Column 4
1 2 0 3
0 0 0
$ \ '
DATA
# ================================== ACTION ================================== #
# OpenRefine 4.x fails without manually set headerLines
${cmd} --create "tmp/${t}/${t}.csv" --processQuotes "false" --headerLines 1
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"


@ -0,0 +1,45 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
if [[ ${2} ]]; then
version="${2}"
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
a,b,c
1,2,3
0,0,0
$,\,'
DATA
# ================================= ASSERTION ================================ #
if [[ "${version:0:1}" = "2" ]]; then
echo "projectTags were introduced in OpenRefine 3.0"
exit 200
else
cat << "DATA" > "tmp/${t}/${t}.assert"
tags: [u'beta', u'client1']
DATA
fi
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv" --projectTags "beta" --projectTags "client1"
${cmd} --info "${t}" | grep ' tags: ' > "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"


@ -0,0 +1,40 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
a;b;c
1;2;3
0;0;0
$;\;'
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
a b c
1 2 3
0 0 0
$ \ '
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv" --separator ";"
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"


@ -0,0 +1,39 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
a,b,c
1,2,3
0,0,0
$,\,'
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
a b c
0 0 0
$ \ '
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv" --skipDataLines "1"
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"


@ -0,0 +1,58 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
a,b,c
1,2,3
0,,0
$,\,'
DATA
cat << "DATA" > "tmp/${t}/${t}.transform"
[
{
"op": "core/text-transform",
"engineConfig": {
"facets": [],
"mode": "row-based"
},
"columnName": "b",
"expression": "grel:isNull(value)",
"onError": "keep-original",
"repeat": false,
"repeatCount": 10
}
]
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
a b c
1 false 3
0 false 0
$ false '
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv" --storeBlankCellsAsNulls "false"
${cmd} --apply "tmp/${t}/${t}.transform" "${t}"
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"


@ -0,0 +1,39 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
a,b,c
1,2,3
,,
$,\,'
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
a b c
1 2 3
$ \ '
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv" --storeBlankRows "false"
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"

tests/create-csv-utf8.sh Normal file

@ -0,0 +1,40 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
⌨,code,meaning
⛲,1F347,FOUNTAIN
⛳,1F349,FLAG IN HOLE
⛵,1F352,SAILBOAT
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
⌨ code meaning
⛲ 1F347 FOUNTAIN
⛳ 1F349 FLAG IN HOLE
⛵ 1F352 SAILBOAT
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv" --projectName "${t} biểu tượng cảm xúc ⛲"
${cmd} --export "${t} biểu tượng cảm xúc ⛲" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"

tests/create-csv.sh Normal file

@ -0,0 +1,40 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
a,b,c
1,2,3
0,0,0
$,\,'
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
a b c
1 2 3
0 0 0
$ \ '
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"
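
Note on how these tests report their result: diff -u is the last command of each script, so its exit status becomes the script's exit status, which tests-cli.sh collects via ${?}. A small sketch with made-up files; the tab-separated content is an assumption about the export format:

```shell
#!/bin/bash
# Sketch only: file names and contents are made up for illustration.
tmp="$(mktemp -d)"
printf 'a\tb\tc\n1\t2\t3\n' > "${tmp}/t.assert"
printf 'a\tb\tc\n1\t2\t3\n' > "${tmp}/same.output"
printf 'a\tb\tc\n1\t2\t4\n' > "${tmp}/other.output"

# Identical files: diff prints nothing and exits with 0.
diff -u "${tmp}/t.assert" "${tmp}/same.output"; same_code=${?}
# Different files: diff prints a unified diff (discarded here) and exits with 1.
diff -u "${tmp}/t.assert" "${tmp}/other.output" > /dev/null; other_code=${?}

echo "identical: ${same_code}, different: ${other_code}"
rm -rf "${tmp}"
```

In the real tests the unified diff is not discarded but ends up in tests-cli.log, which shows exactly which cells differ.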


@ -0,0 +1,55 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.json"
{
"rows":[
{
"a":1,
"b":2,
"c":3
},
{
"a":0,
"b":0,
"c":0
},
{
"a":"$",
"b":"\\",
"c":"\""
}
]
}
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
_ - a _ - b _ - c
1 2 3
0 0 0
$ \ """"
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.json" --recordPath "_" --recordPath "rows" --recordPath "_"
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"


@ -0,0 +1,52 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.json"
[
{
"a": 1,
"b": 2,
"c": 3
},
{
"a": "",
"b": "",
"c": ""
},
{
"a": "$",
"b": "\\",
"c": "\""
}
]
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
_ - a _ - b _ - c
1 2 3
$ \ """"
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.json" --storeEmptyStrings "false"
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"


@ -0,0 +1,62 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
if [[ ${2} ]]; then
version="${2}"
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.json"
[
{
"a": 1,
"b": 2,
"c": 3
},
{
"a": "0",
"b": " 0",
"c": "0 "
},
{
"a": "$",
"b": "\\",
"c": "\""
}
]
DATA
# ================================= ASSERTION ================================ #
if [[ "${version:0:1}" = "2" || "${version}" = "3.0" || "${version}" = "3.1" || "${version}" = "3.2" || "${version}" = "3.3" ]]; then
echo "trimStrings option does not work in OpenRefine <=3.3"
echo "https://github.com/OpenRefine/OpenRefine/issues/2409"
exit 200
else
cat << "DATA" > "tmp/${t}/${t}.assert"
_ - a _ - b _ - c
1 2 3
0 0 0
$ \ """"
DATA
fi
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.json" --trimStrings "true"
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"
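
Note on the version gate in the test above: it lists every affected version explicitly, so a tag such as 3.2-java9 would not be skipped. A minimal sketch of a generic comparison, assuming GNU sort with version sort (-V) is available; the helper name version_le is hypothetical:

```shell
#!/bin/bash
# Hypothetical helper: true if version $1 is lower than or equal to $2.
# Assumes GNU sort with -V (version sort) is available.
version_le() {
  local a="${1%%-*}" b="${2%%-*}"   # drop suffixes such as "-java9"
  [[ "$(printf '%s\n' "${a}" "${b}" | sort -V | head -n 1)" == "${a}" ]]
}

version="3.2-java9"   # example value, normally passed in as ${2}
if version_le "${version}" "3.3"; then
  echo "trimStrings option does not work in OpenRefine <=3.3"
fi
```

With such a helper the skip condition would not need to be extended for each additional tag.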

tests/create-json-utf8.sh Normal file

@ -0,0 +1,53 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.json"
[
{
"⌨": "⛲",
"code": "1F347",
"meaning": "FOUNTAIN"
},
{
"⌨": "⛳",
"code": "1F349",
"meaning": "FLAG IN HOLE"
},
{
"⌨": "⛵",
"code": "1F352",
"meaning": "SAILBOAT"
}
]
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
_ - ⌨ _ - code _ - meaning
⛲ 1F347 FOUNTAIN
⛳ 1F349 FLAG IN HOLE
⛵ 1F352 SAILBOAT
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.json"
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"

tests/create-json.sh Normal file

@ -0,0 +1,53 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.json"
[
{
"a": 1,
"b": 2,
"c": 3
},
{
"a": 0,
"b": 0,
"c": 0
},
{
"a": "$",
"b": "\\",
"c": "\""
}
]
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
_ - a _ - b _ - c
1 2 3
0 0 0
$ \ """"
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.json"
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"


@ -0,0 +1,44 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
if [[ ${2} ]]; then
version="${2}"
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cp "data/example.ods" "tmp/${t}/${t}.ods"
# ================================= ASSERTION ================================ #
if [[ "${version}" = "2.7" ]]; then
cat << "DATA" > "tmp/${t}/${t}.assert"
⌨ code meaning Column Column 5 Column 6 Column 7 Column 8
⛲ 1F347 FOUNTAIN
⛳ 1F349 FLAG IN HOLE
⛵ 1F352 SAILBOAT
DATA
else
#TODO
echo "https://github.com/opencultureconsulting/openrefine-client/issues/4"
exit 200
fi
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.ods" --sheets 1
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"

tests/create-ods.sh Normal file

@ -0,0 +1,48 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
if [[ ${2} ]]; then
version="${2}"
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cp "data/example.ods" "tmp/${t}/${t}.ods"
#a b c
#1 2 3
#0 0 0
#$ \ '
# ================================= ASSERTION ================================ #
if [[ "${version}" = "2.7" ]]; then
cat << "DATA" > "tmp/${t}/${t}.assert"
a b c Column Column 5 Column 6 Column 7 Column 8
1.0 2.0 3.0
0.0 0.0 0.0
$ \ '
DATA
else
#TODO
echo "https://github.com/opencultureconsulting/openrefine-client/issues/4"
exit 200
fi
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.ods"
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"

tests/create-tsv-utf8.sh Normal file

@ -0,0 +1,40 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.tsv"
⌨ code meaning
⛲ 1F347 FOUNTAIN
⛳ 1F349 FLAG IN HOLE
⛵ 1F352 SAILBOAT
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
⌨,code,meaning
⛲,1F347,FOUNTAIN
⛳,1F349,FLAG IN HOLE
⛵,1F352,SAILBOAT
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.tsv"
${cmd} --export "${t}" --output "tmp/${t}/${t}.csv"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.csv"

tests/create-tsv.sh Normal file

@ -0,0 +1,40 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.tsv"
a b c
1 2 3
0 0 0
$ \ '
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
a,b,c
1,2,3
0,0,0
$,\,'
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.tsv"
${cmd} --export "${t}" --output "tmp/${t}/${t}.csv"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.csv"


@ -0,0 +1,80 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.txt"
1 2 3
mon tue wed
$2 $300 $1
DATA
cat << "DATA" > "tmp/${t}/${t}.transform"
[
{
"op": "core/text-transform",
"engineConfig": {
"facets": [],
"mode": "row-based"
},
"columnName": "1",
"expression": "grel:value.trim()",
"onError": "keep-original",
"repeat": false,
"repeatCount": 10
},
{
"op": "core/text-transform",
"engineConfig": {
"facets": [],
"mode": "row-based"
},
"columnName": "2",
"expression": "grel:value.trim()",
"onError": "keep-original",
"repeat": false,
"repeatCount": 10
},
{
"op": "core/text-transform",
"engineConfig": {
"facets": [],
"mode": "row-based"
},
"columnName": "3",
"expression": "grel:value.trim()",
"onError": "keep-original",
"repeat": false,
"repeatCount": 10
}
]
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
1 2 3
mon tue wed
$2 $300 $1
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.txt" --columnWidths "6" --columnWidths "6" --columnWidths "6" --headerLines "1"
${cmd} --apply "tmp/${t}/${t}.transform" "${t}"
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"


@ -0,0 +1,81 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.txt"
⛲ 1F347 FOUNTAIN
⛳ 1F349 FLAG IN HOLE
⛵ 1F352 SAILBOAT
DATA
cat << "DATA" > "tmp/${t}/${t}.transform"
[
{
"op": "core/text-transform",
"engineConfig": {
"facets": [],
"mode": "row-based"
},
"columnName": "Column 1",
"expression": "grel:value.trim()",
"onError": "keep-original",
"repeat": false,
"repeatCount": 10
},
{
"op": "core/text-transform",
"engineConfig": {
"facets": [],
"mode": "row-based"
},
"columnName": "Column 2",
"expression": "grel:value.trim()",
"onError": "keep-original",
"repeat": false,
"repeatCount": 10
},
{
"op": "core/text-transform",
"engineConfig": {
"facets": [],
"mode": "row-based"
},
"columnName": "Column 3",
"expression": "grel:value.trim()",
"onError": "keep-original",
"repeat": false,
"repeatCount": 10
}
]
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
Column 1 Column 2 Column 3
⛲ 1F347 FOUNTAIN
⛳ 1F349 FLAG IN HOLE
⛵ 1F352 SAILBOAT
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.txt" --columnWidths "6" --columnWidths "6" --columnWidths "60"
${cmd} --apply "tmp/${t}/${t}.transform" "${t}"
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"

@ -0,0 +1,81 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.txt"
1 2 3
mon tue wed
$2 $300 $1
DATA
cat << "DATA" > "tmp/${t}/${t}.transform"
[
{
"op": "core/text-transform",
"engineConfig": {
"facets": [],
"mode": "row-based"
},
"columnName": "Column 1",
"expression": "grel:value.trim()",
"onError": "keep-original",
"repeat": false,
"repeatCount": 10
},
{
"op": "core/text-transform",
"engineConfig": {
"facets": [],
"mode": "row-based"
},
"columnName": "Column 2",
"expression": "grel:value.trim()",
"onError": "keep-original",
"repeat": false,
"repeatCount": 10
},
{
"op": "core/text-transform",
"engineConfig": {
"facets": [],
"mode": "row-based"
},
"columnName": "Column 3",
"expression": "grel:value.trim()",
"onError": "keep-original",
"repeat": false,
"repeatCount": 10
}
]
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
Column 1 Column 2 Column 3
1 2 3
mon tue wed
$2 $300 $1
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.txt" --columnWidths "6" --columnWidths "6" --columnWidths "6"
${cmd} --apply "tmp/${t}/${t}.transform" "${t}"
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"

@ -0,0 +1,39 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.txt"
mon tue wed
$2 $300 $1
thu fri sat
$70 $20 $50
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
Column 1 Column 2
mon tue wed $2 $300 $1
thu fri sat $70 $20 $50
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.txt" --linesPerRow "2"
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"

tests/create-txt.sh Normal file

@ -0,0 +1,39 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.txt"
1 2 3
mon tue wed
$2 $300 $1
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
Column 1
1 2 3
mon tue wed
$2 $300 $1
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.txt"
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"

@ -0,0 +1,44 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
if [[ ${2} ]]; then
version="${2}"
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cp "data/example.xls" "tmp/${t}/${t}.xls"
# ================================= ASSERTION ================================ #
if [[ "${version}" = "2.7" ]]; then
cat << "DATA" > "tmp/${t}/${t}.assert"
⌨ code meaning
⛲ 1F347 FOUNTAIN
⛳ 1F349 FLAG IN HOLE
⛵ 1F352 SAILBOAT
DATA
else
#TODO
echo "https://github.com/opencultureconsulting/openrefine-client/issues/4"
exit 200
fi
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.xls" --sheets 1
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"
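
The script above exits with status 200 when the OpenRefine version is not 2.7, after printing a link to the open issue. How the runner handles that status is not shown in this part of the diff. The sketch below assumes, as an illustration only, that 200 is reported as "skip"; `classify_exit` is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: map a test script's exit status to a result label.
# Treating 200 as "skip" is an assumption, not confirmed by the diff.
classify_exit() {
  case "$1" in
    0)   echo "pass" ;;
    200) echo "skip" ;;
    *)   echo "fail" ;;
  esac
}

# Simulate three test scripts by exiting subshells with different codes.
for code in 0 200 1; do
  ( exit "${code}" )
  status=$?
  echo "exit ${status}: $(classify_exit "${status}")"
done
```

A dedicated status keeps "not applicable to this version" distinguishable from a genuine failure, which a plain `exit 1` would not.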

tests/create-xls.sh Normal file

@ -0,0 +1,48 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
if [[ ${2} ]]; then
version="${2}"
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cp "data/example.xls" "tmp/${t}/${t}.xls"
#a b c
#1 2 3
#0 0 0
#$ \ '
# ================================= ASSERTION ================================ #
if [[ "${version}" = "2.7" ]]; then
cat << "DATA" > "tmp/${t}/${t}.assert"
a b c
1.0 2.0 3.0
0.0 0.0 0.0
$ \ '
DATA
else
#TODO
echo "https://github.com/opencultureconsulting/openrefine-client/issues/4"
exit 200
fi
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.xls"
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"

@ -0,0 +1,44 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
if [[ ${2} ]]; then
version="${2}"
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cp "data/example.xlsx" "tmp/${t}/${t}.xlsx"
# ================================= ASSERTION ================================ #
if [[ "${version}" = "2.7" ]]; then
cat << "DATA" > "tmp/${t}/${t}.assert"
⌨ code meaning
⛲ 1F347 FOUNTAIN
⛳ 1F349 FLAG IN HOLE
⛵ 1F352 SAILBOAT
DATA
else
#TODO
echo "https://github.com/opencultureconsulting/openrefine-client/issues/4"
exit 200
fi
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.xlsx" --sheets 1
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"

tests/create-xlsx.sh Normal file

@ -0,0 +1,48 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
if [[ ${2} ]]; then
version="${2}"
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cp "data/example.xlsx" "tmp/${t}/${t}.xlsx"
#a b c
#1 2 3
#0 0 0
#$ \ '
# ================================= ASSERTION ================================ #
if [[ "${version}" = "2.7" ]]; then
cat << "DATA" > "tmp/${t}/${t}.assert"
a b c
1.0 2.0 3.0
0.0 0.0 0.0
$ \ '
DATA
else
#TODO
echo "https://github.com/opencultureconsulting/openrefine-client/issues/4"
exit 200
fi
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.xlsx"
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"

@ -0,0 +1,96 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.xml"
<?xml version="1.0" encoding="UTF-8"?>
<root>
<record>
<a>1</a>
<b>2</b>
<c>3</c>
</record>
<record>
<a>0</a>
<b>0</b>
<c>0</c>
</record>
<record>
<a>$</a>
<b>\</b>
<c>'</c>
</record>
</root>
DATA
cat << "DATA" > "tmp/${t}/${t}.transform"
[
{
"op": "core/column-reorder",
"columnNames": [
"record - a",
"record - b",
"record - c"
],
"description": "Reorder columns"
},
{
"op": "core/row-removal",
"engineConfig": {
"facets": [
{
"type": "list",
"name": "Blank Rows",
"expression": "(filter(row.columnNames,cn,isNonBlank(cells[cn].value)).length()==0).toString()",
"columnName": "",
"invert": false,
"omitBlank": false,
"omitError": false,
"selection": [
{
"v": {
"v": "true",
"l": "true"
}
}
],
"selectBlank": false,
"selectError": false
}
],
"mode": "record-based"
}
}
]
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
record - a record - b record - c
1 2 3
0 0 0
$ \ '
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.xml" --recordPath "root" --recordPath "record"
${cmd} --apply "tmp/${t}/${t}.transform" "${t}"
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"

tests/create-xml-utf8.sh Normal file

@ -0,0 +1,96 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.xml"
<?xml version="1.0" encoding="UTF-8"?>
<root>
<record>
<icon>⛲</icon>
<code>1F347</code>
<meaning>FOUNTAIN</meaning>
</record>
<record>
<icon>⛳</icon>
<code>1F349</code>
<meaning>FLAG IN HOLE</meaning>
</record>
<record>
<icon>⛵</icon>
<code>1F352</code>
<meaning>SAILBOAT</meaning>
</record>
</root>
DATA
cat << "DATA" > "tmp/${t}/${t}.transform"
[
{
"op": "core/column-reorder",
"columnNames": [
"root - record - icon",
"root - record - code",
"root - record - meaning"
],
"description": "Reorder columns"
},
{
"op": "core/row-removal",
"engineConfig": {
"facets": [
{
"type": "list",
"name": "Blank Rows",
"expression": "(filter(row.columnNames,cn,isNonBlank(cells[cn].value)).length()==0).toString()",
"columnName": "",
"invert": false,
"omitBlank": false,
"omitError": false,
"selection": [
{
"v": {
"v": "true",
"l": "true"
}
}
],
"selectBlank": false,
"selectError": false
}
],
"mode": "record-based"
}
}
]
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
root - record - icon root - record - code root - record - meaning
⛲ 1F347 FOUNTAIN
⛳ 1F349 FLAG IN HOLE
⛵ 1F352 SAILBOAT
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.xml"
${cmd} --apply "tmp/${t}/${t}.transform" "${t}"
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"

tests/create-xml.sh Normal file

@ -0,0 +1,96 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.xml"
<?xml version="1.0" encoding="UTF-8"?>
<root>
<record>
<a>1</a>
<b>2</b>
<c>3</c>
</record>
<record>
<a>0</a>
<b>0</b>
<c>0</c>
</record>
<record>
<a>$</a>
<b>\</b>
<c>'</c>
</record>
</root>
DATA
cat << "DATA" > "tmp/${t}/${t}.transform"
[
{
"op": "core/column-reorder",
"columnNames": [
"root - record - a",
"root - record - b",
"root - record - c"
],
"description": "Reorder columns"
},
{
"op": "core/row-removal",
"engineConfig": {
"facets": [
{
"type": "list",
"name": "Blank Rows",
"expression": "(filter(row.columnNames,cn,isNonBlank(cells[cn].value)).length()==0).toString()",
"columnName": "",
"invert": false,
"omitBlank": false,
"omitError": false,
"selection": [
{
"v": {
"v": "true",
"l": "true"
}
}
],
"selectBlank": false,
"selectError": false
}
],
"mode": "record-based"
}
}
]
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
root - record - a root - record - b root - record - c
1 2 3
0 0 0
$ \ '
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.xml"
${cmd} --apply "tmp/${t}/${t}.transform" "${t}"
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"

@ -0,0 +1,44 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}-1.csv"
a,b,c
1,2,3
DATA
cat << "DATA" > "tmp/${t}/${t}-2.csv"
a,b,c
4,5,6
DATA
zip "tmp/${t}/${t}.zip" "tmp/${t}/${t}-1.csv" "tmp/${t}/${t}-2.csv"
# ================================= ASSERTION ================================ #
cat << DATA > "tmp/${t}/${t}.assert"
File a b c
tmp/${t}/${t}-1.csv 1 2 3
tmp/${t}/${t}-2.csv 4 5 6
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.zip" --includeFileSources "true"
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"
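
The assertion in the script above uses an unquoted heredoc delimiter (`<< DATA`), so `${t}` is expanded inside the expected file paths, whereas the data blocks use a quoted delimiter (`<< "DATA"`) that keeps the body literal. A small sketch of the difference follows; the function names and the value of `t` are made up for illustration.

```shell
#!/bin/bash
t="demo"   # example value; the test scripts derive t from the script file name

# Quoted delimiter: the body is written literally, ${t} and $2 are not expanded.
print_literal() {
cat << "DATA"
tmp/${t}/${t}-1.csv $2
DATA
}

# Unquoted delimiter: parameter expansion is performed inside the body.
print_expanded() {
cat << DATA
tmp/${t}/${t}-1.csv
DATA
}

print_literal
print_expanded
```

This is why test data containing `$2` or `$300` is written with the quoted form: the dollar signs reach the file unchanged.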

tests/create-zip.sh Normal file

@ -0,0 +1,44 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}-1.csv"
a,b,c
1,2,3
DATA
cat << "DATA" > "tmp/${t}/${t}-2.csv"
a,b,c
4,5,6
DATA
zip "tmp/${t}/${t}.zip" "tmp/${t}/${t}-1.csv" "tmp/${t}/${t}-2.csv"
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
a b c
1 2 3
4 5 6
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.zip"
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"

@ -0,0 +1,11 @@
email,name,state,gender,purchase,count,date
danny.baron@example1.com,Danny Baron,CA,M,TV (UTF-8: 📺),1,"Wed, 4 Jul 2001"
melanie.white@example2.edu,Melanie White,NC,F,<iPhone>,1,2001-07-04T12:08:56
danny.baron@example1.com, D. ("Tab") Baron,CA,M,Winter jacket,1,2001-07-04
ben.tyler@example3.org,Ben Tyler,NV,M,Flashlight,1,2001/07/04
arthur.duff@example4.com,Arthur Duff,OR,M,Dining table,1,2001-07
danny.baron@example1.com,Daniel Baron,,,Bike,1,2001
jean.griffith@example5.org,Jean Griffith,WA,F,Power drill,1,2000
melanie.white@example2.edu,Melanie White,NC,F,'iPad',1,1999
ben.morisson@example6.org,Ben Morisson,FL,M,Amplifier,1,1998
arthur.duff@example4.com,Arthur Duff,OR,M,Night table,1,1997

@ -0,0 +1,92 @@
[
{
"email": "danny.baron@example1.com",
"name": "Danny Baron",
"state": "CA",
"gender": "M",
"purchase": "TV (UTF-8: 📺)",
"count": 1,
"date": "Wed, 4 Jul 2001"
},
{
"email": "melanie.white@example2.edu",
"name": "Melanie White",
"state": "NC",
"gender": "F",
"purchase": "<iPhone>",
"count": 1,
"date": "2001-07-04T12:08:56"
},
{
"email": "danny.baron@example1.com",
"name": " D.\t(\"Tab\") Baron",
"state": "CA",
"gender": "M",
"purchase": "Winter jacket",
"count": 1,
"date": "2001-07-04"
},
{
"email": "ben.tyler@example3.org",
"name": "Ben Tyler",
"state": "NV",
"gender": "M",
"purchase": "Flashlight",
"count": 1,
"date": "2001/07/04"
},
{
"email": "arthur.duff@example4.com",
"name": "Arthur Duff",
"state": "OR",
"gender": "M",
"purchase": "Dining table",
"count": 1,
"date": "2001-07"
},
{
"email": "danny.baron@example1.com",
"name": "Daniel Baron",
"state": "",
"gender": "",
"purchase": "Bike",
"count": 1,
"date": 2001
},
{
"email": "jean.griffith@example5.org",
"name": "Jean Griffith",
"state": "WA",
"gender": "F",
"purchase": "Power drill",
"count": 1,
"date": 2000
},
{
"email": "melanie.white@example2.edu",
"name": "Melanie White",
"state": "NC",
"gender": "F",
"purchase": "'iPad'",
"count": 1,
"date": 1999
},
{
"email": "ben.morisson@example6.org",
"name": "Ben Morisson",
"state": "FL",
"gender": "M",
"purchase": "Amplifier",
"count": 1,
"date": 1998
},
{
"email": "arthur.duff@example4.com",
"name": "Arthur Duff",
"state": "OR",
"gender": "M",
"purchase": "Night table",
"count": 1,
"date": 1997
}
]

Binary file not shown.

@ -0,0 +1,11 @@
email name state gender purchase count date
danny.baron@example1.com Danny Baron CA M TV (UTF-8: 📺) 1 Wed, 4 Jul 2001
melanie.white@example2.edu Melanie White NC F <iPhone> 1 2001-07-04T12:08:56
danny.baron@example1.com "D. (""Tab"") Baron" CA M Winter jacket 1 2001-07-04
ben.tyler@example3.org Ben Tyler NV M Flashlight 1 2001/07/04
arthur.duff@example4.com Arthur Duff OR M Dining table 1 2001-07
danny.baron@example1.com Daniel Baron Bike 1 2001
jean.griffith@example5.org Jean Griffith WA F Power drill 1 2000
melanie.white@example2.edu Melanie White NC F 'iPad' 1 1999
ben.morisson@example6.org Ben Morisson FL M Amplifier 1 1998
arthur.duff@example4.com Arthur Duff OR M Night table 1 1997

@ -0,0 +1,11 @@
email name state gender purchase count date
danny.baron@example1.com Danny Baron CA M TV (UTF-8: 📺) 1 Wed, 4 Jul 2001
melanie.white@example2.edu Melanie White NC F <iPhone> 1 2001-07-04T12:08:5
danny.baron@example1.com D. ("Tab") Baron CA M Winter jacket 1 2001-07-04
ben.tyler@example3.org Ben Tyler NV M Flashlight 1 2001/07/04
arthur.duff@example4.com Arthur Duff OR M Dining table 1 2001-07
danny.baron@example1.com Daniel Baron Bike 1 2001
jean.griffith@example5.org Jean Griffith WA F Power drill 1 2000
melanie.white@example2.edu Melanie White NC F 'iPad' 1 1999
ben.morisson@example6.org Ben Morisson FL M Amplifier 1 1998
arthur.duff@example4.com Arthur Duff OR M Night table 1 1997

Binary file not shown.

Binary file not shown.

@ -0,0 +1,93 @@
<?xml version="1.0" encoding="UTF-8"?>
<root>
<record>
<email>danny.baron@example1.com</email>
<name>Danny Baron</name>
<state>CA</state>
<gender>M</gender>
<purchase>TV (UTF-8: 📺)</purchase>
<count>1</count>
<date>Wed, 4 Jul 2001</date>
</record>
<record>
<email>melanie.white@example2.edu</email>
<name>Melanie White</name>
<state>NC</state>
<gender>F</gender>
<purchase>&lt;iPhone&gt;</purchase>
<count>1</count>
<date>2001-07-04T12:08:56</date>
</record>
<record>
<email>danny.baron@example1.com</email>
<name> D. (&quot;Tab&quot;) Baron</name>
<state>CA</state>
<gender>M</gender>
<purchase>Winter jacket</purchase>
<count>1</count>
<date>2001-07-04</date>
</record>
<record>
<email>ben.tyler@example3.org</email>
<name>Ben Tyler</name>
<state>NV</state>
<gender>M</gender>
<purchase>Flashlight</purchase>
<count>1</count>
<date>2001/07/04</date>
</record>
<record>
<email>arthur.duff@example4.com</email>
<name>Arthur Duff</name>
<state>OR</state>
<gender>M</gender>
<purchase>Dining table</purchase>
<count>1</count>
<date>2001-07</date>
</record>
<record>
<email>danny.baron@example1.com</email>
<name>Daniel Baron</name>
<state></state>
<gender></gender>
<purchase>Bike</purchase>
<count>1</count>
<date>2001</date>
</record>
<record>
<email>jean.griffith@example5.org</email>
<name>Jean Griffith</name>
<state>WA</state>
<gender>F</gender>
<purchase>Power drill</purchase>
<count>1</count>
<date>2000</date>
</record>
<record>
<email>melanie.white@example2.edu</email>
<name>Melanie White</name>
<state>NC</state>
<gender>F</gender>
<purchase>&apos;iPad&apos;</purchase>
<count>1</count>
<date>1999</date>
</record>
<record>
<email>ben.morisson@example6.org</email>
<name>Ben Morisson</name>
<state>FL</state>
<gender>M</gender>
<purchase>Amplifier</purchase>
<count>1</count>
<date>1998</date>
</record>
<record>
<email>arthur.duff@example4.com</email>
<name>Arthur Duff</name>
<state>OR</state>
<gender>M</gender>
<purchase>Night table</purchase>
<count>1</count>
<date>1997</date>
</record>
</root>

Binary file not shown.

@ -0,0 +1,10 @@
<?xml version="1.0" encoding="UTF-8"?>
<record>
<email>danny.baron@example1.com</email>
<name>Danny Baron</name>
<state>CA</state>
<gender>M</gender>
<purchase>TV (UTF-8: 📺)</purchase>
<count>1</count>
<date>Wed, 4 Jul 2001</date>
</record>

Binary file not shown.

Binary file not shown.

Binary file not shown.

File diff suppressed because it is too large

@ -0,0 +1,6 @@
🔣 code meaning
🍇 1F347 GRAPES
🍉 1F349 WATERMELON
🍒 1F352 CHERRIES
🍓 1F353 STRAWBERRY
🍍 1F34D PINEAPPLE

@ -0,0 +1,69 @@
[
{
"op": "core/row-reorder",
"description": "Reorder rows",
"mode": "record-based",
"sorting": {
"criteria": [
{
"errorPosition": 1,
"caseSensitive": false,
"valueType": "string",
"column": "email",
"blankPosition": 2,
"reverse": false
}
]
}
},
{
"op": "core/column-addition",
"description": "Create column count at index 1 based on column email using expression grel:facetCount(value, \"value\", \"email\")",
"engineConfig": {
"mode": "row-based",
"facets": []
},
"newColumnName": "count",
"columnInsertIndex": 1,
"baseColumnName": "email",
"expression": "grel:facetCount(value, \"value\", \"email\")",
"onError": "set-to-blank"
},
{
"op": "core/blank-down",
"description": "Blank down cells in column email",
"engineConfig": {
"mode": "row-based",
"facets": []
},
"columnName": "email"
},
{
"op": "core/row-removal",
"description": "Remove rows",
"engineConfig": {
"mode": "row-based",
"facets": [
{
"omitError": false,
"expression": "isBlank(value)",
"selectBlank": false,
"selection": [
{
"v": {
"v": true,
"l": "true"
}
}
],
"selectError": false,
"invert": false,
"name": "email",
"omitBlank": false,
"type": "list",
"columnName": "email"
}
]
}
}
]

BIN
tests/data/example.ods Normal file

Binary file not shown.

BIN
tests/data/example.xls Normal file

Binary file not shown.

BIN
tests/data/example.xlsx Normal file

Binary file not shown.

tests/delete-utf8.sh Normal file

@ -0,0 +1,36 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
a,b,c
1,2,3
DATA
# ================================= ASSERTION ================================ #
cat << DATA > "tmp/${t}/${t}.assert"
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv" --projectName "${t} biểu tượng cảm xúc ⛲"
${cmd} --list | grep "${t}" || exit 1
${cmd} --delete "${t} biểu tượng cảm xúc ⛲"
${cmd} --list | grep "${t}" | cut -d ':' -f 2 > "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"

tests/delete.sh Normal file

@ -0,0 +1,36 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
a,b,c
1,2,3
DATA
# ================================= ASSERTION ================================ #
cat << DATA > "tmp/${t}/${t}.assert"
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --list | grep "${t}" || exit 1
${cmd} --delete "${t}"
${cmd} --list | grep "${t}" | cut -d ':' -f 2 > "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"

tests/download.sh Normal file

@ -0,0 +1,21 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# ================================== ACTION ================================== #
${cmd} --download "https://git.io/fj5ju" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "data/duplicates-deletion.json" "tmp/${t}/${t}.output"

tests/export-csv-utf8.sh Normal file

@ -0,0 +1,44 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.tsv"
🔣 code meaning
🍇 1F347 GRAPES
🍉 1F349 WATERMELON
🍒 1F352 CHERRIES
🍓 1F353 STRAWBERRY
🍍 1F34D PINEAPPLE
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
🔣,code,meaning
🍇,1F347,GRAPES
🍉,1F349,WATERMELON
🍒,1F352,CHERRIES
🍓,1F353,STRAWBERRY
🍍,1F34D,PINEAPPLE
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.tsv"
${cmd} --export "${t}" --output "tmp/${t}/${t} biểu tượng cảm xúc 🍉.csv"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t} biểu tượng cảm xúc 🍉.csv"

tests/export-csv.sh Normal file

@ -0,0 +1,40 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.tsv"
a b c
1 2 3
0 0 0
$ \ '
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
a,b,c
1,2,3
0,0,0
$,\,'
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.tsv"
${cmd} --export "${t}" --output "tmp/${t}/${t}.csv"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.csv"

tests/export-html-utf8.sh Normal file

@ -0,0 +1,72 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
if [[ ${2} ]]; then
majorversion="${2%%.*}"
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
⌨,code,meaning
⛲,1F347,FOUNTAIN
⛳,1F349,FLAG IN HOLE
⛵,1F352,SAILBOAT
DATA
# ================================= ASSERTION ================================ #
if [[ "$majorversion" = 2 ]]; then
cat << "DATA" > "tmp/${t}/${t}.assert"
<html>
<head>
<title>export-html-utf8</title>
<meta charset="utf-8" />
</head>
<body>
<table>
<tr><th>⌨</th><th>code</th><th>meaning</th></tr>
<tr><td>&#9970;</td><td>1F347</td><td>FOUNTAIN</td></tr>
<tr><td>&#9971;</td><td>1F349</td><td>FLAG IN HOLE</td></tr>
<tr><td>&#9973;</td><td>1F352</td><td>SAILBOAT</td></tr>
</table>
</body>
</html>
DATA
else
cat << "DATA" > "tmp/${t}/${t}.assert"
<html>
<head>
<title>export-html-utf8</title>
<meta charset="utf-8" />
</head>
<body>
<table>
<tr><th>⌨</th><th>code</th><th>meaning</th></tr>
<tr><td>⛲</td><td>1F347</td><td>FOUNTAIN</td></tr>
<tr><td>⛳</td><td>1F349</td><td>FLAG IN HOLE</td></tr>
<tr><td>⛵</td><td>1F352</td><td>SAILBOAT</td></tr>
</table>
</body>
</html>
DATA
fi
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --export "${t}" --output "tmp/${t}/${t}.html"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.html"

50
tests/export-html.sh Normal file

@ -0,0 +1,50 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
a,b,c
1,2,3
0,0,0
$,\,'
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
<html>
<head>
<title>export-html</title>
<meta charset="utf-8" />
</head>
<body>
<table>
<tr><th>a</th><th>b</th><th>c</th></tr>
<tr><td>1</td><td>2</td><td>3</td></tr>
<tr><td>0</td><td>0</td><td>0</td></tr>
<tr><td>$</td><td>\</td><td>&apos;</td></tr>
</table>
</body>
</html>
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --export "${t}" --output "tmp/${t}/${t}.html"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.html"

43
tests/export-ods-utf8.sh Normal file

@ -0,0 +1,43 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
⌨,code,meaning
⛲,1F347,FOUNTAIN
⛳,1F349,FLAG IN HOLE
⛵,1F352,SAILBOAT
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
⌨,code,meaning
⛲,1F347,FOUNTAIN
⛳,1F349,"FLAG IN HOLE"
⛵,1F352,SAILBOAT
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --export "${t}" --output "tmp/${t}/${t}.ods"
(cd tmp/"${t}" &&
ssconvert -S "${t}.ods" "${t}.csv" &&
mv "${t}.csv.1" "${t}.output")
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"

47
tests/export-ods.sh Normal file

@ -0,0 +1,47 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
if [[ -z "$(command -v ssconvert 2> /dev/null)" ]] ; then
echo 1>&2 "ERROR: This test requires ssconvert (gnumeric)"
exit 127
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
a,b,c
1,2,3
0,0,0
$,\,'
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
a,b,c
1,2,3
0,0,0
$,\,'
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --export "${t}" --output "tmp/${t}/${t}.ods"
(cd tmp/"${t}" &&
ssconvert -S "${t}.ods" "${t}.csv" &&
mv "${t}.csv.1" "${t}.output")
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"

44
tests/export-tsv-utf8.sh Normal file

@ -0,0 +1,44 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
🔣,code,meaning
🍇,1F347,GRAPES
🍉,1F349,WATERMELON
🍒,1F352,CHERRIES
🍓,1F353,STRAWBERRY
🍍,1F34D,PINEAPPLE
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
🔣 code meaning
🍇 1F347 GRAPES
🍉 1F349 WATERMELON
🍒 1F352 CHERRIES
🍓 1F353 STRAWBERRY
🍍 1F34D PINEAPPLE
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --export "${t}" --output "tmp/${t}/${t}.tsv"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.tsv"

40
tests/export-tsv.sh Normal file

@ -0,0 +1,40 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
a,b,c
1,2,3
0,0,0
$,\,'
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
a b c
1 2 3
0 0 0
$ \ '
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --export "${t}" --output "tmp/${t}/${t}.tsv"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.tsv"

44
tests/export-utf8.sh Normal file

@ -0,0 +1,44 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
🔣,code,meaning
🍇,1F347,GRAPES
🍉,1F349,WATERMELON
🍒,1F352,CHERRIES
🍓,1F353,STRAWBERRY
🍍,1F34D,PINEAPPLE
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
🔣 code meaning
🍇 1F347 GRAPES
🍉 1F349 WATERMELON
🍒 1F352 CHERRIES
🍓 1F353 STRAWBERRY
🍍 1F34D PINEAPPLE
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --export "${t}" > "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"

43
tests/export-xls-utf8.sh Normal file

@ -0,0 +1,43 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
⌨,code,meaning
⛲,1F347,FOUNTAIN
⛳,1F349,FLAG IN HOLE
⛵,1F352,SAILBOAT
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
⌨,code,meaning
⛲,1F347,FOUNTAIN
⛳,1F349,FLAG IN HOLE
⛵,1F352,SAILBOAT
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --export "${t}" --output "tmp/${t}/${t}.xls"
(cd tmp/"${t}" &&
ssconvert -S "${t}.xls" "${t}.csv" &&
mv "${t}.csv" "${t}.output")
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"

43
tests/export-xls.sh Normal file

@ -0,0 +1,43 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
a,b,c
1,2,3
0,0,0
$,\,'
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
a,b,c
1,2,3
0,0,0
$,\,'
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --export "${t}" --output "tmp/${t}/${t}.xls"
(cd tmp/"${t}" &&
ssconvert -S "${t}.xls" "${t}.csv" &&
mv "${t}.csv" "${t}.output")
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"

45
tests/export-xlsx-utf8.sh Normal file

@ -0,0 +1,45 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
⌨,code,meaning
⛲,1F347,FOUNTAIN
⛳,1F349,FLAG IN HOLE
⛵,1F352,SAILBOAT
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
⌨,code,meaning
⛲,1F347,FOUNTAIN
⛳,1F349,FLAG IN HOLE
⛵,1F352,SAILBOAT
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --export "${t}" --output "tmp/${t}/${t}.xlsx"
(cd tmp/"${t}" &&
ssconvert -S "${t}.xlsx" "${t}.csv" &&
mv "${t}.csv" "${t}.output")
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"

43
tests/export-xlsx.sh Normal file

@ -0,0 +1,43 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
a,b,c
1,2,3
0,0,0
$,\,'
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
a,b,c
1,2,3
0,0,0
$,\,'
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --export "${t}" --output "tmp/${t}/${t}.xlsx"
(cd tmp/"${t}" &&
ssconvert -S "${t}.xlsx" "${t}.csv" &&
mv "${t}.csv" "${t}.output")
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"

40
tests/export.sh Normal file

@ -0,0 +1,40 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
a,b,c
1,2,3
0,0,0
$,\,'
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
a b c
1 2 3
0 0 0
$ \ '
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --export "${t}" > "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"


@ -0,0 +1,40 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.txt"
a;b;c
1;2;3
0;0;0
$;\;'
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
a b c
1 2 3
0 0 0
$ \ '
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.txt" --format "csv" --separator ";"
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"

41
tests/format-create.sh Normal file

@ -0,0 +1,41 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
a,b,c
1,2,3
0,0,0
$,\,'
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
Column 1
a,b,c
1,2,3
0,0,0
$,\,'
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv" --format "line-based"
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"


@ -0,0 +1,40 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
a,b,c
1,2,3
0,0,0
$,\,'
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
a,b,c
1,2,3
0,0,0
$,\,'
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --export "${t}" --format "csv" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"

40
tests/format-export.sh Normal file

@ -0,0 +1,40 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
a,b,c
1,2,3
0,0,0
$,\,'
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
a,b,c
1,2,3
0,0,0
$,\,'
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --export "${t}" --format "csv" > "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"

27
tests/help.sh Normal file

@ -0,0 +1,27 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
Script to provide a command line interface to an OpenRefine server.
DATA
# ================================== ACTION ================================== #
${cmd} --help | sed '3q;d' > "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"

39
tests/info-utf8.sh Normal file

@ -0,0 +1,39 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.tsv"
🔣 code meaning
🍇 1F347 GRAPES
🍉 1F349 WATERMELON
🍒 1F352 CHERRIES
🍓 1F353 STRAWBERRY
🍍 1F34D PINEAPPLE
DATA
# ================================= ASSERTION ================================ #
cat << DATA > "tmp/${t}/${t}.assert"
column 001: 🔣
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.tsv"
${cmd} --info "${t}" | grep 'column 001' > "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"

35
tests/info.sh Normal file

@ -0,0 +1,35 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
a,b,c
1,2,3
DATA
# ================================= ASSERTION ================================ #
cat << DATA > "tmp/${t}/${t}.assert"
column 002: b
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --info "${t}" | grep 'column 002' > "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"

35
tests/list-utf8.sh Normal file

@ -0,0 +1,35 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
a,b,c
1,2,3
DATA
# ================================= ASSERTION ================================ #
cat << DATA > "tmp/${t}/${t}.assert"
${t} biểu tượng cảm xúc ⛲
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv" --projectName "${t} biểu tượng cảm xúc ⛲"
${cmd} --list | grep "${t}" | cut -d ':' -f 2 > "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"

35
tests/list.sh Normal file

@ -0,0 +1,35 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests-cli.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
a,b,c
1,2,3
DATA
# ================================= ASSERTION ================================ #
cat << DATA > "tmp/${t}/${t}.assert"
${t}
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --list | grep "${t}" | cut -d ':' -f 2 > "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"

Some files were not shown because too many files have changed in this diff