Compare commits

..

No commits in common. "f06fc54e123e0cadcd4e58e9345fed53d865c6d3" and "cca067a87cbddc965f1a772fb00dc476b9876a43" have entirely different histories.

94 changed files with 21256 additions and 2792 deletions

5
.gitignore vendored

@ -4,7 +4,4 @@ dist
.*
openrefine_client.egg-info
refine.spec
openrefine-2.*
openrefine-3.*
openrefine-client_*
tests.log
README.html


@ -1,2 +1,4 @@
include README.md
include COPYING.txt
recursive-include tests/data *.csv
recursive-include tests *.py

32
Makefile Normal file

@ -0,0 +1,32 @@
# XXX have a Makefile written by someone that knows Makefiles...
all: test build install
readme:
	# requires docutils, e.g. pip install docutils
	rst2html.py README.rst > README.html
	w3m -dump README.html | unix2dos > README.txt
test:
	python setup.py test
# tests that don't require a Refine server running
smalltest:
	python setup.py test --test-suite tests.test_refine_small
	python setup.py test --test-suite tests.test_facet
	python setup.py test --test-suite tests.test_history
build:
	python setup.py build
install:
	sudo python setup.py install
clean:
	find . -name '*.pyc' | xargs rm -f
	# XXX is there some way of having setup.py clean up its junk?
	rm -rf README.{html,txt} build dist refine_client.egg-info distribute-*
upload: clean
	python setup.py sdist upload

271
README.md

@ -6,15 +6,13 @@ The [OpenRefine Python Client from PaulMakepeace](https://github.com/PaulMakepea
This fork extends the command line interface (CLI) and is distributed as a convenient one-file-executable (Windows, Linux, macOS).
It is also available via Docker Hub, PyPI and Binder.
Works with OpenRefine 2.7, 2.8, 3.0, 3.1 and 3.2.
## Download
One-file-executables:
- Windows: [openrefine-client_0-3-9_windows.exe](https://github.com/opencultureconsulting/openrefine-client/releases/download/v0.3.9/openrefine-client_0-3-9_windows.exe) (~5 MB)
- macOS: [openrefine-client_0-3-9_macos](https://github.com/opencultureconsulting/openrefine-client/releases/download/v0.3.9/openrefine-client_0-3-9_macos) (~5 MB)
- Linux: [openrefine-client_0-3-9_linux](https://github.com/opencultureconsulting/openrefine-client/releases/download/v0.3.9/openrefine-client_0-3-9_linux) (~5 MB)
- Windows: [openrefine-client_0-3-8_windows.exe](https://github.com/opencultureconsulting/openrefine-client/releases/download/v0.3.8/openrefine-client_0-3-8_windows.exe) (~5 MB)
- macOS: [openrefine-client_0-3-8_macos](https://github.com/opencultureconsulting/openrefine-client/releases/download/v0.3.8/openrefine-client_0-3-8_macos) (~5 MB)
- Linux: [openrefine-client_0-3-8_linux](https://github.com/opencultureconsulting/openrefine-client/releases/download/v0.3.8/openrefine-client_0-3-8_linux) (~5 MB)
For [Docker](#docker) containers, native [Python](#python) installation and free [Binder](#binder) on-demand server see the corresponding chapters below.
@ -57,13 +55,13 @@ To use the client:
- macOS:
```sh
chmod +x openrefine-client_0-3-9_macos
chmod +x openrefine-client_0-3-8_macos
```
- Linux:
```sh
chmod +x openrefine-client_0-3-9_linux
chmod +x openrefine-client_0-3-8_linux
```
3. Execute the file.
@ -71,19 +69,19 @@ To use the client:
- Windows:
```sh
.\openrefine-client_0-3-9_windows.exe
.\openrefine-client_0-3-8_windows.exe
```
- macOS:
```sh
./openrefine-client_0-3-9_macos
./openrefine-client_0-3-8_macos
```
- Linux:
```sh
./openrefine-client_0-3-9_linux
./openrefine-client_0-3-8_linux
```
Using tab completion and command history is highly recommended:
@ -102,25 +100,25 @@ Download example data (`--download`) and create project from file (`--create`):
- Windows:
```sh
.\openrefine-client_0-3-9_windows.exe --download "https://git.io/fj5hF" --output=duplicates.csv
.\openrefine-client_0-3-9_windows.exe --download "https://git.io/fj5ju" --output=duplicates-deletion.json
.\openrefine-client_0-3-9_windows.exe --create duplicates.csv
.\openrefine-client_0-3-8_windows.exe --download "https://git.io/fj5hF" --output=duplicates.csv
.\openrefine-client_0-3-8_windows.exe --download "https://git.io/fj5ju" --output=duplicates-deletion.json
.\openrefine-client_0-3-8_windows.exe --create duplicates.csv
```
- macOS:
```sh
./openrefine-client_0-3-9_macos --download "https://git.io/fj5hF" --output=duplicates.csv
./openrefine-client_0-3-9_macos --download "https://git.io/fj5ju" --output=duplicates-deletion.json
./openrefine-client_0-3-9_macos --create duplicates.csv
./openrefine-client_0-3-8_macos --download "https://git.io/fj5hF" --output=duplicates.csv
./openrefine-client_0-3-8_macos --download "https://git.io/fj5ju" --output=duplicates-deletion.json
./openrefine-client_0-3-8_macos --create duplicates.csv
```
- Linux:
```sh
./openrefine-client_0-3-9_linux --download "https://git.io/fj5hF" --output=duplicates.csv
./openrefine-client_0-3-9_linux --download "https://git.io/fj5ju" --output=duplicates-deletion.json
./openrefine-client_0-3-9_linux --create duplicates.csv
./openrefine-client_0-3-8_linux --download "https://git.io/fj5hF" --output=duplicates.csv
./openrefine-client_0-3-8_linux --download "https://git.io/fj5ju" --output=duplicates-deletion.json
./openrefine-client_0-3-8_linux --create duplicates.csv
```
Other commands:
@ -232,7 +230,7 @@ When using this option, the first column should contain unique identifiers.
[felixlohmeier/openrefine-client](https://hub.docker.com/r/felixlohmeier/openrefine-client/) [![Docker](https://img.shields.io/microbadger/image-size/felixlohmeier/openrefine-client?label=docker)](https://hub.docker.com/r/felixlohmeier/openrefine-client/)
```sh
docker pull felixlohmeier/openrefine-client:v0.3.9
docker pull felixlohmeier/openrefine-client:v0.3.8
```
### Option 1: Dockerized client
@ -240,7 +238,7 @@ docker pull felixlohmeier/openrefine-client:v0.3.9
Run client and mount current directory as workspace:
```sh
docker run --rm --network=host -v ${PWD}:/data:z felixlohmeier/openrefine-client:v0.3.9
docker run --rm --network=host -v ${PWD}:/data:z felixlohmeier/openrefine-client:v0.3.8
```
The docker option `--network=host` allows you to connect to a local or remote OpenRefine via the host network:
@ -248,13 +246,13 @@ The docker option `--network=host` allows you to connect to a local or remote Op
- list projects on default URL (http://localhost:3333)
```sh
docker run --rm --network=host -v ${PWD}:/data:z felixlohmeier/openrefine-client:v0.3.9 --list
docker run --rm --network=host -v ${PWD}:/data:z felixlohmeier/openrefine-client:v0.3.8 --list
```
- list projects on a remote server (http://example.com)
```sh
docker run --rm --network=host -v ${PWD}:/data:z felixlohmeier/openrefine-client:v0.3.9 -H example.com -P 80 --list
docker run --rm --network=host -v ${PWD}:/data:z felixlohmeier/openrefine-client:v0.3.8 -H example.com -P 80 --list
```
Usage: same commands as explained above (see [Basic Commands](#basic-commands) and [Advanced Templating](#advanced-templating))
@ -278,16 +276,16 @@ Run openrefine-client linked to a dockerized OpenRefine ([felixlohmeier/openrefi
3. Run client with some [basic commands](#basic-commands): 1. download example files, 2. create project from file, 3. list projects, 4. show metadata, 5. export to terminal, 6. apply transformation rules (deduplication), 7. export again to terminal, 8. export to xls file and 9. delete project
```sh
docker run --rm --network=openrefine -v ${PWD}:/data:z felixlohmeier/openrefine-client:v0.3.9 --download "https://git.io/fj5hF" --output=duplicates.csv
docker run --rm --network=openrefine -v ${PWD}:/data:z felixlohmeier/openrefine-client:v0.3.9 --download "https://git.io/fj5ju" --output=duplicates-deletion.json
docker run --rm --network=openrefine -v ${PWD}:/data:z felixlohmeier/openrefine-client:v0.3.9 -H openrefine-server --create duplicates.csv
docker run --rm --network=openrefine -v ${PWD}:/data:z felixlohmeier/openrefine-client:v0.3.9 -H openrefine-server --list
docker run --rm --network=openrefine -v ${PWD}:/data:z felixlohmeier/openrefine-client:v0.3.9 -H openrefine-server --info "duplicates"
docker run --rm --network=openrefine -v ${PWD}:/data:z felixlohmeier/openrefine-client:v0.3.9 -H openrefine-server --export "duplicates"
docker run --rm --network=openrefine -v ${PWD}:/data:z felixlohmeier/openrefine-client:v0.3.9 -H openrefine-server --apply duplicates-deletion.json "duplicates"
docker run --rm --network=openrefine -v ${PWD}:/data:z felixlohmeier/openrefine-client:v0.3.9 -H openrefine-server --export "duplicates"
docker run --rm --network=openrefine -v ${PWD}:/data:z felixlohmeier/openrefine-client:v0.3.9 -H openrefine-server --export --output=deduped.xls "duplicates"
docker run --rm --network=openrefine -v ${PWD}:/data:z felixlohmeier/openrefine-client:v0.3.9 -H openrefine-server --delete "duplicates"
docker run --rm --network=openrefine -v ${PWD}:/data:z felixlohmeier/openrefine-client:v0.3.8 --download "https://git.io/fj5hF" --output=duplicates.csv
docker run --rm --network=openrefine -v ${PWD}:/data:z felixlohmeier/openrefine-client:v0.3.8 --download "https://git.io/fj5ju" --output=duplicates-deletion.json
docker run --rm --network=openrefine -v ${PWD}:/data:z felixlohmeier/openrefine-client:v0.3.8 -H openrefine-server --create duplicates.csv
docker run --rm --network=openrefine -v ${PWD}:/data:z felixlohmeier/openrefine-client:v0.3.8 -H openrefine-server --list
docker run --rm --network=openrefine -v ${PWD}:/data:z felixlohmeier/openrefine-client:v0.3.8 -H openrefine-server --info "duplicates"
docker run --rm --network=openrefine -v ${PWD}:/data:z felixlohmeier/openrefine-client:v0.3.8 -H openrefine-server --export "duplicates"
docker run --rm --network=openrefine -v ${PWD}:/data:z felixlohmeier/openrefine-client:v0.3.8 -H openrefine-server --apply duplicates-deletion.json "duplicates"
docker run --rm --network=openrefine -v ${PWD}:/data:z felixlohmeier/openrefine-client:v0.3.8 -H openrefine-server --export "duplicates"
docker run --rm --network=openrefine -v ${PWD}:/data:z felixlohmeier/openrefine-client:v0.3.8 -H openrefine-server --export --output=deduped.xls "duplicates"
docker run --rm --network=openrefine -v ${PWD}:/data:z felixlohmeier/openrefine-client:v0.3.8 -H openrefine-server --delete "duplicates"
```
4. Stop and delete server:
@ -336,10 +334,10 @@ See also:
## Python
[openrefine-client](https://pypi.org/project/openrefine-client/) [![PyPI](https://img.shields.io/pypi/v/openrefine-client)](https://pypi.org/project/openrefine-client/) (requires Python 3.x)
[openrefine-client](https://pypi.org/project/openrefine-client/) [![PyPI](https://img.shields.io/pypi/v/openrefine-client)](https://pypi.org/project/openrefine-client/) (requires Python 2.x)
```sh
python3 -m pip install openrefine-client --user
python2 -m pip install openrefine-client --user
```
This will install the package `openrefine-client` containing modules in `google.refine`.
@ -354,7 +352,7 @@ openrefine-client --help
Usage: same commands as explained above (see [Basic Commands](#basic-commands) and [Advanced Templating](#advanced-templating))
### Option 2: using cli functions in Python 3.x environment
### Option 2: using cli functions in Python 2.x environment
Import module cli:
@ -438,9 +436,158 @@ Commands:
cli.delete(p1.project_id)
```
### Option 3: the upstream way
This fork can be used in the same way as the upstream [Python client library](https://github.com/PaulMakepeace/refine-client-py/).
Some functions in the python client library are not yet compatible with OpenRefine >=3.0 (cf. [issue #19 in refine-client-py](https://github.com/paulmakepeace/refine-client-py/issues/19)).
Import module refine:
```python
from google.refine import refine
```
Server Commands:
* set up connection:
```python
server1 = refine.Refine('http://localhost:3333')
```
- show version:
```python
server1.server.get_version()
server1.server.version
```
- list projects:
```python
server1.list_projects()
```
- pretty print the returned dict with json.dumps:
```python
import json
print(json.dumps(server1.list_projects(), indent=1))
```
- create project:
```python
server1.new_project(project_file='duplicates.csv')
```
* create and open the returned project in one step:
```python
project1 = server1.new_project(project_file='duplicates.csv')
```
Project commands:
* open project:
```python
project1 = server1.open_project('1234567890123')
```
* print full URL to project:
```python
project1.project_url()
```
* list columns:
```python
project1.columns
```
* compute text facet on first column (**fails with OpenRefine >=3.2**):
```python
from google.refine import facet  # TextFacet lives in the facet module
project1.compute_facets(facet.TextFacet(project1.columns[0]))
```
* print returned object
```python
facets = project1.compute_facets(facet.TextFacet(project1.columns[0])).facets[0]
for k in sorted(facets.choices, key=lambda k: facets.choices[k].count, reverse=True):
    print(facets.choices[k].count, k)
```
* compute clusters on first column:
```python
project1.compute_clusters(project1.columns[0])
```
* apply rules from file to project:
```python
project1.apply_operations('duplicates-deletion.json')
```
* export project:
```python
project1.export(export_format='tsv')
```
* print the returned fileobject:
```python
print(project1.export(export_format='tsv').read())
```
* save the returned fileobject to file:
```python
with open('export.tsv', 'wb') as f:
    f.write(project1.export(export_format='tsv').read())
```
* templating export (**function was added in this fork**, see [Advanced Templating](#advanced-templating) above):
```python
data = project1.export_templating(
prefix='''{ "events" : [
''',template=''' { "name" : {{jsonize(cells["name"].value)}}, "purchase" : {{jsonize(cells["purchase"].value)}} }''',
rowSeparator=''',
''',suffix='''
] }''')
print(data.read())
```
* print help screen with available commands (many more!):
```python
help(project1)
```
* example for custom commands:
```python
project1.do_json('get-rows')['total']
```
* delete project:
```python
project1.delete()
```
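The templating export shown above takes four strings (`prefix`, `template`, `rowSeparator`, `suffix`). As a rough sketch of how these parts are assumed to combine into one document, the following standalone snippet imitates the assembly without contacting a server. The helper `render_template_export`, the sample rows and the use of `json.dumps` as a stand-in for GREL's `jsonize()` are illustrative assumptions, not part of the client library.

```python
import json


def render_template_export(rows, prefix, template, row_separator, suffix):
    """Assumed assembly order: prefix, rendered rows joined by the separator, suffix."""
    rendered = [template(row) for row in rows]
    return prefix + row_separator.join(rendered) + suffix


# Hypothetical sample rows; the column names follow the README example.
rows = [
    {"name": "Alice", "purchase": "book"},
    {"name": "Bob", "purchase": "pen"},
]


def row_template(row):
    # json.dumps stands in for GREL's jsonize() in this offline sketch.
    return ' { "name" : %s, "purchase" : %s }' % (
        json.dumps(row["name"]),
        json.dumps(row["purchase"]),
    )


output = render_template_export(
    rows,
    prefix='{ "events" : [\n',
    template=row_template,
    row_separator=',\n',
    suffix='\n] }',
)
print(output)
```

Because the separator is only placed between rows, the result stays valid JSON as long as the prefix and suffix open and close the same structure.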
See also:
- Jupyter notebook by Trevor Muñoz (2013-08-18): [Programmatic Use of Open Refine to Facet and Cluster Names of 'Dishes' from NYPL's What's on the menu?](https://nbviewer.jupyter.org/gist/trevormunoz/6265360)
- Jupyter notebook by Tony Hirst (2019-01-09): [Notebook demonstrating how to control OpenRefine via a Python client.](https://nbviewer.jupyter.org/github/ouseful-PR/openrefineder/blob/4cef25a4ca6077536c5f49cafb531499fbcad96e/notebooks/OpenRefine%20Demos.ipynb)
- Unittests [test_refine.py](tests/test_refine.py) and [test_tutorial.py](tests/test_tutorial.py) (both importing [refinetest.py](tests/refinetest.py))
- [OpenRefine API](https://github.com/OpenRefine/OpenRefine/wiki/OpenRefine-API) in official OpenRefine wiki
## Binder
@ -451,12 +598,29 @@ See also:
- no registration needed, will start within a few minutes
- [restricted](https://mybinder.readthedocs.io/en/latest/faq.html#how-much-memory-am-i-given-when-using-binder) to 2 GB RAM and server will be deleted after 10 minutes of inactivity
- [bash_kernel demo notebook](https://nbviewer.jupyter.org/github/felixlohmeier/openrefineder/blob/master/openrefine-client-bash.ipynb) for using the openrefine-client in a Linux Bash environment [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/felixlohmeier/openrefineder/master?urlpath=/tree/openrefine-client-bash.ipynb)
- [python2 demo notebook](https://nbviewer.jupyter.org/github/felixlohmeier/openrefineder/blob/master/openrefine-client-python.ipynb) for using the openrefine-client in a Python 2 environment [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/felixlohmeier/openrefineder/master?urlpath=/tree/openrefine-client-python.ipynb)
## Development
If you would like to contribute to the Python client library please consider a pull request to the upstream repository [refine-client-py](https://github.com/PaulMakepeace/refine-client-py/).
### Tests
TODO
Ensure you have OpenRefine running (i.e. available at http://localhost:3333). If necessary set the environment variables `OPENREFINE_HOST` and `OPENREFINE_PORT` to change the URL.
The Python client library includes several unit tests.
- run all tests
```sh
python setup.py test
```
- run subset test_facet
```sh
python setup.py test --test-suite tests.test_facet
```
There is also a script that uses docker images to run the unit tests with different versions of OpenRefine.
@ -492,8 +656,9 @@ Note to myself: When releasing a new version...
```sh
./tests.sh -a
jupyter notebook tests/cli_python2.ipynb
```
2. Make final changes in Git
- update versions (e.g. 0.3.7 and 0-3-7) in [README.md](https://github.com/opencultureconsulting/openrefine-client/blob/master/README.md#download)
@ -502,28 +667,29 @@ Note to myself: When releasing a new version...
3. Build executables with PyInstaller
- Run PyInstaller in Python 3 environments on native Windows, macOS and Linux. Should be "the oldest version of the OS you need to support"! Current release is built with:
- Run PyInstaller in Python 2 environments on native Windows, macOS and Linux. Should be "the oldest version of the OS you need to support"! Current release is built with:
- Ubuntu 16.04 LTS (64-bit)
- macOS Sierra 10.12 (64-bit)
- Windows 7 (32-bit)
- macOS Sierra 10.12
- Windows 10
- One-file-executables will be available in `dist/`.
```sh
git clone https://github.com/opencultureconsulting/openrefine-client.git
cd openrefine-client
python3 -m pip install . --user
python3 -m pip install pyinstaller --user
python3 -m PyInstaller --onefile refine.py --hidden-import google.refine.__main__
python -m pip install . --user
python -m pip install pyinstaller --user
pyinstaller --onefile refine.py --hidden-import google.refine.__main__
```
4. Run test with Linux executable
```sh
./tests.sh -a
jupyter notebook tests/cli_bash.ipynb
```
5. Create release in GitHub
- draft [release notes](https://github.com/opencultureconsulting/openrefine-client/releases) and attach one-file-executables
@ -531,7 +697,6 @@ Note to myself: When releasing a new version...
6. Build package and upload to PyPI
```sh
TODO
python3 setup.py sdist bdist_wheel
python3 -m twine upload dist/*
```
@ -544,7 +709,7 @@ Note to myself: When releasing a new version...
8. Bump openrefine-client version in related projects
- openrefine-batch: [openrefine-batch.sh](https://github.com/opencultureconsulting/openrefine-batch/blob/master/openrefine-batch.sh#L7) and [openrefine-batch-docker.sh](https://github.com/opencultureconsulting/openrefine-batch/blob/master/openrefine-batch-docker.sh)
- openrefineder: [postBuild](https://github.com/felixlohmeier/openrefineder/blob/master/postBuild) and [openrefine-client-bash.ipynb](https://github.com/felixlohmeier/openrefineder/blob/master/openrefine-client-bash.ipynb)
- openrefineder: [postBuild](https://github.com/felixlohmeier/openrefineder/blob/master/postBuild)
## Credits
@ -553,6 +718,14 @@ Note to myself: When releasing a new version...
David Huynh, [initial cut](http://markmail.org/message/jsxzlcu3gn6drtb7)
[Felix Lohmeier](https://felixlohmeier.de), CLI features
[Artfinder](http://www.artfinder.com), inspiration
[Wolf Vollprecht](https://github.com/wolfv), port to python 3
[Felix Lohmeier](https://felixlohmeier.de), extended the CLI features
Some data used in the test suite has been taken from publicly available sources:
- louisiana-elected-officials.csv: from http://www.sos.louisiana.gov/tabid/136/Default.aspx
- us_economic_assistance.csv: ["The Green Book"](http://www.data.gov/raw/1554)
- eli-lilly.csv: [ProPublica's "Docs for Dollars"](http://projects.propublica.org/docdollars) leading to a [Lilly Faculty PDF](http://www.lillyfacultyregistry.com/documents/EliLillyFacultyRegistryQ22010.pdf) processed by [David Huynh's ScraperWiki script](http://scraperwiki.com/scrapers/eli-lilly-dollars-for-docs-scraper/edit/)

30
google/refine/history.py Normal file

@ -0,0 +1,30 @@
#!/usr/bin/env python3
"""
OpenRefine history: parsing responses.
"""
# Copyright (c) 2011 Paul Makepeace, Real Programmers. All rights reserved.
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>
class HistoryEntry(object):
    # N.B. e.g. **response['historyEntry'] won't work as keys are unicode :-/
    #noinspection PyUnusedLocal
    def __init__(self, history_entry_id=None, time=None, description=None, **kwargs):
        if history_entry_id is None:
            raise ValueError('History entry id must be set')
        self.id = history_entry_id
        self.description = description
        self.time = time
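
Note on usage: the class above is populated from the `historyEntry` key of a server response (see the `do_json` hunk in the refine module further down). A minimal sketch of that flow follows, using a stand-in copy of the class so the snippet runs on its own. The response payload and its values are invented for illustration and are not captured from a real server.

```python
class HistoryEntry(object):
    """Stand-in mirroring the class in the diff, so the sketch is self-contained."""

    def __init__(self, history_entry_id=None, time=None, description=None, **kwargs):
        if history_entry_id is None:
            raise ValueError('History entry id must be set')
        self.id = history_entry_id
        self.description = description
        self.time = time


# Hypothetical response payload; the values are made up for this example.
response = {
    'code': 'ok',
    'historyEntry': {
        'id': 1234567890,
        'time': '2019-08-22T01:39:40Z',
        'description': 'Remove 2 rows',
    },
}

entry = None
if 'historyEntry' in response:
    he = response['historyEntry']
    # Fields are passed positionally, as in the diff, rather than via **he.
    entry = HistoryEntry(he['id'], he['time'], he['description'])

print(entry.id, entry.time, entry.description)
```

A response without an id is rejected early with `ValueError`, so a half-filled entry never ends up on the project object.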


@ -32,6 +32,7 @@ import requests
import urllib.request, urllib.parse, urllib.error
from google.refine import facet
from google.refine import history
REFINE_HOST = os.environ.get('OPENREFINE_HOST', os.environ.get('GOOGLE_REFINE_HOST', '127.0.0.1'))
REFINE_PORT = os.environ.get('OPENREFINE_PORT', os.environ.get('GOOGLE_REFINE_PORT', '3333'))
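
Note on the lookup order: the two assignments above prefer the `OPENREFINE_*` variables, fall back to the legacy `GOOGLE_REFINE_*` names, and only then use the built-in defaults. The sketch below reproduces that order against a plain dictionary so it can be tried without touching the real environment. The helper names `resolve_setting` and `server_url` and the example host names are assumptions made for this illustration.

```python
import os


def resolve_setting(primary, legacy, default, environ=None):
    """Return the primary variable, else the legacy one, else the default."""
    environ = os.environ if environ is None else environ
    return environ.get(primary, environ.get(legacy, default))


def server_url(environ=None):
    host = resolve_setting('OPENREFINE_HOST', 'GOOGLE_REFINE_HOST', '127.0.0.1', environ)
    port = resolve_setting('OPENREFINE_PORT', 'GOOGLE_REFINE_PORT', '3333', environ)
    return 'http://%s:%s' % (host, port)


# No variables set: built-in defaults apply.
print(server_url({}))
# Only the legacy name is set: it is still honoured.
print(server_url({'GOOGLE_REFINE_HOST': 'legacy.example'}))
# Both names are set: the OPENREFINE_* variable wins.
print(server_url({
    'OPENREFINE_HOST': 'new.example',
    'GOOGLE_REFINE_HOST': 'legacy.example',
    'OPENREFINE_PORT': '3334',
}))
```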
@ -248,7 +249,6 @@ class Refine:
# POST is broken at the moment, so we send it in the URL
new_style_options = dict(opts, **{
'encoding': s(encoding),
'separator': s(separator)
})
params = {
'options': json.dumps(new_style_options),
@ -257,6 +257,7 @@ class Refine:
# old style options
options = {
'format': project_format,
'separator': s(separator),
'ignore-lines': s(ignore_lines),
'header-lines': s(header_lines),
'skip-data-lines': s(skip_data_lines),
@ -265,7 +266,7 @@ class Refine:
'process-quotes': s(process_quotes),
'store-blank-rows': s(store_blank_rows),
'store-blank-cells-as-nulls': s(store_blank_cells_as_nulls),
'include-file-sources': s(include_file_sources)
'include-file-sources': s(include_file_sources),
}
files = None
@ -359,6 +360,7 @@ class RefineProject:
self.project_id = project_id
self.engine = facet.Engine()
self.sorting = facet.Sorting()
self.history_entry = None
# following filled in by get_models()
self.key_column = None
self.has_records = False
@ -390,6 +392,11 @@ class RefineProject:
response = self.server.urlopen_json(command,
project_id=self.project_id,
data=data)
if 'historyEntry' in response:
# **response['historyEntry'] won't work as keys are unicode :-/
he = response['historyEntry']
self.history_entry = history.HistoryEntry(he['id'], he['time'],
he['description'])
return response
def get_models(self):


@ -25,7 +25,7 @@ def read(filename):
return open(os.path.join(os.path.dirname(__file__), filename)).read()
setup(name='openrefine-client',
version='0.3.9',
version='0.3.8',
description=('The OpenRefine Python Client Library provides an '
'interface to communicating with an OpenRefine server. '
'This fork extends the command line interface (CLI).'),

188
tests.sh

@ -1,5 +1,5 @@
#!/bin/bash
# Script for running functional tests against the CLI
# Script for running tests with different OpenRefine and Java versions based on Docker images.
# Copyright (c) 2011 Paul Makepeace, Real Programmers. All rights reserved.
@ -16,101 +16,115 @@
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>
# ================================== CONFIG ================================== #
# defaults:
all=(3.2-java12 3.2-java11 3.2-java10 3.2-java9 3.2 3.1-java9 3.1 3.0-java9 3.0 2.8-java9 2.8 2.8-java7 2.7 2.7-java7 2.5-java7 2.5-java6 2.1-java6 2.0-java6)
main=(3.2 3.1 3.0 2.8 2.7 2.5-java6 2.1-java6 2.0-java6)
interactively=false
port="3333"
cd "${BASH_SOURCE%/*}/" || exit 1
# help screen
function usage () {
cat <<EOF
Usage: ./tests.sh [-t TAG] [-i] [-p] [-a] [-h]
port=3334
Script for running tests with different OpenRefine and Java versions.
It uses docker images from https://hub.docker.com/r/felixlohmeier/openrefine.
if [[ ${1} ]]; then
version="${1}"
Examples:
./tests.sh -a # run tests on all OpenRefine versions (from 2.0 up to 3.2)
./tests.sh -t 3.2 # run tests on tag 3.2
./tests.sh -t 3.2 -i # run tests on tag 3.2 interactively (pause before and after tests)
./tests.sh -t 3.2 -t 2.7 # run tests on tags 3.2 and 2.7
Advanced:
./tests.sh -j # run tests on all OpenRefine versions and each with all supported Java versions (requires a lot of docker images to be downloaded!)
./tests.sh -t 3.1 -i -p 3334 # run tests on tag 3.1 interactively on port 3334
Running tests interactively (-i) allows you to examine OpenRefine GUI at http://localhost:3333.
Execute the script concurrently in another terminal on another port (-p 3334) to compare changes in the OpenRefine GUI at http://localhost:3333 and http://localhost:3334.
Available tags (java 8 if java not mentioned in tag):
EOF
for t in ${all[*]} ; do
echo "$t"
done
exit 1
}
# check input
NUMARGS=$#
if [ "$NUMARGS" -eq 0 ]; then
usage
fi
# check system requirements
DOCKER="$(command -v docker 2> /dev/null)"
if [ -z "$DOCKER" ] ; then
echo 1>&2 "This action requires you to have 'docker' installed and present in your PATH. You can download it for free at http://www.docker.com/"
exit 1
fi
DOCKERINFO="$(docker info 2>/dev/null | grep 'Server Version')"
if [ -z "$DOCKERINFO" ]
then
echo "command 'docker info' failed, trying again with sudo..."
DOCKERINFO="$(sudo docker info 2>/dev/null | grep 'Server Version')"
echo "OK"
docker=(sudo docker)
if [ -z "$DOCKERINFO" ] ; then
echo 1>&2 "This action requires you to start the docker daemon. Try 'sudo systemctl start docker' or 'sudo start docker'. If the docker daemon is already running then maybe some security privileges are missing to run docker commands."
exit 1
fi
else
version="3.2"
docker=(docker)
fi
refine="openrefine-${version}/refine"
if [[ ${2} ]]; then
client="$(readlink -e "${2}")"
else
client="python3 $(readlink -e refine.py)"
fi
cmd="${client} -H localhost -P ${port}"
# =============================== REQUIREMENTS =============================== #
# check existence of java and cURL
if [[ -z "$(command -v java 2> /dev/null)" ]] ; then
echo 1>&2 "ERROR: OpenRefine requires JAVA runtime environment (jre)" \
"https://openjdk.java.net/install/"
exit 1
fi
if [[ -z "$(command -v curl 2> /dev/null)" ]] ; then
echo 1>&2 "ERROR: This shell script requires cURL" \
"https://curl.haxx.se/download.html"
exit 1
fi
# download OpenRefine
if [[ -z "$(readlink -e "${refine}")" ]]; then
echo "Download OpenRefine..."
mkdir -p "$(dirname "${refine}")"
curl -L --output openrefine.tar.gz \
"https://github.com/OpenRefine/OpenRefine/releases/download/${version}/openrefine-linux-${version}.tar.gz"
echo "Install OpenRefine in subdirectory $(dirname "${refine}")..."
tar -xzf openrefine.tar.gz -C "$(dirname "${refine}")" --strip 1 --totals
rm -f openrefine.tar.gz
# do not try to open OpenRefine in browser
sed -i '$ a JAVA_OPTIONS=-Drefine.headless=true' \
"$(dirname "${refine}")"/refine.ini
# set autosave period from 5 minutes to 25 hours
sed -i 's/#REFINE_AUTOSAVE_PERIOD=60/REFINE_AUTOSAVE_PERIOD=1500/' \
"$(dirname "${refine}")"/refine.ini
echo
CURLINFO="$(command -v curl 2>/dev/null)"
if [ -z "$CURLINFO" ] ; then
echo 1>&2 "This action requires you to have 'curl' installed and present in your PATH."
exit 1
fi
# ================================== SETUP =================================== #
dir="$(readlink -f "tests/tmp")"
mkdir -p "${dir}"
rm -f tests.log
echo "start OpenRefine server..."
${refine} -v warn -p ${port} -d "${dir}" &>> tests.log &
pid_server=${!}
timeout 30s bash -c "until curl -s 'http://localhost:${port}' \
| cat | grep -q -o 'OpenRefine' ; do sleep 1; done" \
|| error "starting OpenRefine server failed!"
echo
# ================================== TESTS =================================== #
echo "running tests, please wait..."
tests=()
results=()
for t in tests/*.sh; do
tests+=("${t}")
echo "========================= ${t} =========================" &>> tests.log
bash "${t}" "${cmd}" "${version}" &>> tests.log
results+=(${?})
# get user input
options="t:p:iajh"
while getopts $options opt; do
case $opt in
t ) tags+=("${OPTARG}");;
p ) port="${OPTARG}";export OPENREFINE_PORT="$port";;
i ) interactively=true;;
a ) tags=("${main[*]}");;
j ) tags=("${all[*]}");;
h ) usage ;;
\? ) echo 1>&2 "Unknown option: -$OPTARG"; usage; exit 1;;
: ) echo 1>&2 "Missing option argument for -$OPTARG"; usage; exit 1;;
* ) echo 1>&2 "Unimplemented option: -$OPTARG"; usage; exit 1;;
esac
done
echo
shift $((OPTIND - 1))
# ================================= TEARDOWN ================================= #
# print config
echo "Tags: ${tags[*]}"
echo "Port: $port"
echo ""
echo "cleanup..."
{ kill -9 "${pid_server}" && wait "${pid_server}"; } 2>/dev/null
rm -rf "${dir}"
echo
# safe cleanup handler
cleanup()
{
echo "cleanup..."
${docker[*]} stop "$t"
}
trap "cleanup;exit" SIGHUP SIGINT SIGQUIT SIGTERM
# ================================= SUMMARY ================================== #
printf "%s\t%s\n" "code" "test"
printf "%s\t%s\n" "----" "----------------"
for i in "${!tests[@]}"; do
printf "%s\t%s\n" "${results[$i]}" "${tests[$i]}"
# run setup.py tests for each docker tag
for t in ${tags[*]} ; do
echo "=== Tests for $t ==="
echo ""
echo "Begin: $(date)"
${docker[*]} run -d -p "$port":3333 --rm --name "$t" felixlohmeier/openrefine:"$t"
until curl --silent -N http://localhost:"$port" | cat | grep -q -o "Refine" ; do sleep 1; done
echo "Refine running at http://localhost:${port}"
if [ $interactively = true ]; then read -r -p "Press [Enter] key to start tests..."; fi
python2 setup.py test
if [ $interactively = true ]; then read -r -p "Press [Enter] key to stop OpenRefine..."; fi
${docker[*]} stop "$t"
echo "End: $(date)"
echo ""
done
echo
if [[ " ${results[*]} " =~ [1-9] ]]; then
echo "failed tests! check tests.log for debugging"; echo
else
echo "all tests passed!"; echo
fi
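
Note on the readiness check: both versions of the script poll the server with `curl` until the start page mentions "Refine". For test code written in Python, the same idea could look roughly like the sketch below. The function name `wait_for_server`, its parameters and the default values are assumptions for illustration; only standard-library calls are used.

```python
import http.client
import time
import urllib.request


def wait_for_server(url, marker="Refine", timeout=30.0, interval=1.0):
    """Poll url until the response body contains marker or timeout seconds pass."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval) as resp:
                body = resp.read().decode("utf-8", errors="replace")
            if marker in body:
                return True
        except (OSError, http.client.HTTPException):
            # Server not reachable yet (refused, timed out, bad response): retry.
            pass
        time.sleep(interval)
    return False
```

A call such as `wait_for_server("http://localhost:3333")` returns `True` once the page is served and `False` if the deadline passes first, which gives the caller a place to report a failed start instead of looping forever.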

0
tests/__init__.py Normal file


@ -1,57 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
a,b,c
1,2,3
0,0,0
$,\,'
DATA
cat << "DATA" > "tmp/${t}/${t}.transform"
[
{
"op": "core/column-addition",
"engineConfig": {
"mode": "row-based"
},
"newColumnName": "apply",
"columnInsertIndex": 2,
"baseColumnName": "b",
"expression": "grel:value.replace('2','⛲')",
"onError": "set-to-blank"
}
]
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
a b apply c
1 2 ⛲ 3
0 0 0 0
$ \ \ '
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --apply "tmp/${t}/${t}.transform" "${t}"
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"


@ -1,57 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
a,b,c
1,2,3
0,0,0
$,\,'
DATA
cat << "DATA" > "tmp/${t}/${t}.transform"
[
{
"op": "core/column-addition",
"engineConfig": {
"mode": "row-based"
},
"newColumnName": "apply",
"columnInsertIndex": 2,
"baseColumnName": "b",
"expression": "grel:value.replace('2','TEST')",
"onError": "set-to-blank"
}
]
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
a b apply c
1 2 TEST 3
0 0 0 0
$ \ \ '
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --apply "tmp/${t}/${t}.transform" "${t}"
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"

819
tests/cli_bash.ipynb Normal file
View File

@ -0,0 +1,819 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Test executable in a Linux Bash environment"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Install\n",
"\n",
"This notebook requires a [Bash kernel](https://github.com/takluyver/bash_kernel) environment and an OpenRefine server running at http://127.0.0.1:3333."
]
},
{
"cell_type": "code",
"execution_count": 49,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"/tmp/20190822_013937\n"
]
}
],
"source": [
"workspace=$(date +%Y%m%d_%H%M%S)\n",
"mkdir -p /tmp/$workspace\n",
"cp -r data /tmp/$workspace\n",
"cd /tmp/$workspace && pwd"
]
},
{
"cell_type": "code",
"execution_count": 50,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"https://github.com/opencultureconsulting/openrefine-client/releases/download/v0.3.8/openrefine-client_0-3-8_linux:\n",
"2019-08-22 01:39:40 ERROR 404: Not Found.\n"
]
}
],
"source": [
"wget -nv https://github.com/opencultureconsulting/openrefine-client/releases/download/v0.3.8/openrefine-client_0-3-8_linux -O openrefine-client\n",
"chmod +x openrefine-client"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## README.MD"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Download"
]
},
{
"cell_type": "code",
"execution_count": 51,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Download to file duplicates.csv complete\n"
]
}
],
"source": [
"./openrefine-client --download \"https://git.io/fj5hF\" --output=duplicates.csv"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create"
]
},
{
"cell_type": "code",
"execution_count": 52,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"id: 2334935475634\n",
"rows: 10\n"
]
}
],
"source": [
"./openrefine-client --create duplicates.csv"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### List"
]
},
{
"cell_type": "code",
"execution_count": 53,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" 2334935475634: duplicates\n"
]
}
],
"source": [
"./openrefine-client --list"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Info"
]
},
{
"cell_type": "code",
"execution_count": 54,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" id: 2334935475634\n",
" url: http://127.0.0.1:3333/project?project=2334935475634\n",
" name: duplicates\n",
" modified: 2019-08-21T23:40:30Z\n",
" created: 2019-08-21T23:40:30Z\n",
" rowCount: 10\n",
"importOptionMetadata: [{u'storeEmptyStrings': True, u'fileSource': u'duplicates.csv', u'storeBlankRows': True, u'encoding': u'', u'projectName': u'duplicates', u'processQuotes': True, u'separator': u',', u'trimStrings': False, u'limit': -1, u'storeBlankCellsAsNulls': True, u'guessCellValueTypes': False, u'includeFileSources': False}]\n",
" column 001: email\n",
" column 002: name\n",
" column 003: state\n",
" column 004: gender\n",
" column 005: purchase\n"
]
}
],
"source": [
"./openrefine-client --info \"duplicates\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Export"
]
},
{
"cell_type": "code",
"execution_count": 55,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"email\tname\tstate\tgender\tpurchase\n",
"danny.baron@example1.com\tDanny Baron\tCA\tM\tTV\n",
"melanie.white@example2.edu\tMelanie White\tNC\tF\tiPhone\n",
"danny.baron@example1.com\tD. Baron\tCA\tM\tWinter jacket\n",
"ben.tyler@example3.org\tBen Tyler\tNV\tM\tFlashlight\n",
"arthur.duff@example4.com\tArthur Duff\tOR\tM\tDining table\n",
"danny.baron@example1.com\tDaniel Baron\tCA\tM\tBike\n",
"jean.griffith@example5.org\tJean Griffith\tWA\tF\tPower drill\n",
"melanie.white@example2.edu\tMelanie White\tNC\tF\tiPad\n",
"ben.morisson@example6.org\tBen Morisson\tFL\tM\tAmplifier\n",
"arthur.duff@example4.com\tArthur Duff\tOR\tM\tNight table\n"
]
}
],
"source": [
"./openrefine-client --export \"duplicates\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Apply"
]
},
{
"cell_type": "code",
"execution_count": 56,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Download to file duplicates-deletion.json complete\n"
]
}
],
"source": [
"./openrefine-client --download \"https://git.io/fj5ju\" --output=duplicates-deletion.json"
]
},
{
"cell_type": "code",
"execution_count": 57,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"File duplicates-deletion.json has been successfully applied to project 2334935475634\n"
]
}
],
"source": [
"./openrefine-client --apply duplicates-deletion.json \"duplicates\""
]
},
{
"cell_type": "code",
"execution_count": 58,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"email\tcount\tname\tstate\tgender\tpurchase\n",
"arthur.duff@example4.com\t2\tArthur Duff\tOR\tM\tDining table\n",
"ben.morisson@example6.org\t1\tBen Morisson\tFL\tM\tAmplifier\n",
"ben.tyler@example3.org\t1\tBen Tyler\tNV\tM\tFlashlight\n",
"danny.baron@example1.com\t3\tDanny Baron\tCA\tM\tTV\n",
"jean.griffith@example5.org\t1\tJean Griffith\tWA\tF\tPower drill\n",
"melanie.white@example2.edu\t2\tMelanie White\tNC\tF\tiPhone\n"
]
}
],
"source": [
"./openrefine-client --export \"duplicates\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Export XLS"
]
},
{
"cell_type": "code",
"execution_count": 59,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Export to file deduped.xls complete\n"
]
}
],
"source": [
"./openrefine-client --export \"duplicates\" --output deduped.xls"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Delete"
]
},
{
"cell_type": "code",
"execution_count": 60,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Project 2334935475634 has been successfully deleted\n"
]
}
],
"source": [
"./openrefine-client --delete \"duplicates\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Templating"
]
},
{
"cell_type": "code",
"execution_count": 61,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"id: 1633409429491\n",
"rows: 10\n"
]
}
],
"source": [
"./openrefine-client --create duplicates.csv --projectName=advanced"
]
},
{
"cell_type": "code",
"execution_count": 62,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{ \"events\" : [\n",
" { \"name\" : \"Melanie White\", \"purchase\" : \"iPhone\" },\n",
" { \"name\" : \"Jean Griffith\", \"purchase\" : \"Power drill\" },\n",
" { \"name\" : \"Melanie White\", \"purchase\" : \"iPad\" }\n",
"] }"
]
}
],
"source": [
"./openrefine-client --export \"advanced\" \\\n",
"--prefix='{ \"events\" : [\n",
"' \\\n",
"--template=' { \"name\" : {{jsonize(cells[\"name\"].value)}}, \"purchase\" : {{jsonize(cells[\"purchase\"].value)}} }' \\\n",
"--rowSeparator=',\n",
"' \\\n",
"--suffix='\n",
"] }' \\\n",
"--filterQuery='^F$' \\\n",
"--filterColumn='gender'"
]
},
{
"cell_type": "code",
"execution_count": 63,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Export to files complete. Last file: advanced_3.json\n"
]
}
],
"source": [
"./openrefine-client --export \"advanced\" \\\n",
"--prefix='{ \"events\" : [\n",
"' \\\n",
"--template=' { \"name\" : {{jsonize(cells[\"name\"].value)}}, \"purchase\" : {{jsonize(cells[\"purchase\"].value)}} }' \\\n",
"--rowSeparator=',\n",
"' \\\n",
"--suffix='\n",
"] }' \\\n",
"--filterQuery='^F$' \\\n",
"--filterColumn='gender' \\\n",
"--output=advanced.json \\\n",
"--splitToFiles=true"
]
},
{
"cell_type": "code",
"execution_count": 64,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Export to files complete. Last file: advanced_melanie.white@example2.edu.json\n"
]
}
],
"source": [
"./openrefine-client --export \"advanced\" \\\n",
"--prefix='{ \"events\" : [\n",
"' \\\n",
"--template=' { \"name\" : {{jsonize(cells[\"name\"].value)}}, \"purchase\" : {{jsonize(cells[\"purchase\"].value)}} }' \\\n",
"--rowSeparator=',\n",
"' \\\n",
"--suffix='\n",
"] }' \\\n",
"--filterQuery='^F$' \\\n",
"--filterColumn='gender' \\\n",
"--output=advanced.json \\\n",
"--splitToFiles=true \\\n",
"--suffixById=true"
]
},
{
"cell_type": "code",
"execution_count": 65,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"advanced_1.json \u001b[0m\u001b[38;5;33mdata\u001b[0m\n",
"advanced_2.json deduped.xls\n",
"advanced_3.json duplicates.csv\n",
"advanced_jean.griffith@example5.org.json duplicates-deletion.json\n",
"advanced_melanie.white@example2.edu.json \u001b[38;5;40mopenrefine-client\u001b[0m\n"
]
}
],
"source": [
"ls"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Delete"
]
},
{
"cell_type": "code",
"execution_count": 66,
"metadata": {
"scrolled": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Project 1633409429491 has been successfully deleted\n"
]
}
],
"source": [
"./openrefine-client --delete \"advanced\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Unicode"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### fruits"
]
},
{
"cell_type": "code",
"execution_count": 67,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"id: 2280962953279\n",
"rows: 5\n",
" id: 2280962953279\n",
" url: http://127.0.0.1:3333/project?project=2280962953279\n",
" name: evil-fruits\n",
" modified: 2019-08-21T23:40:43Z\n",
" created: 2019-08-21T23:40:43Z\n",
" rowCount: 5\n",
"importOptionMetadata: [{u'storeEmptyStrings': True, u'fileSource': u'data/cli/evil-fruits.tsv', u'storeBlankRows': True, u'encoding': u'', u'projectName': u'evil-fruits', u'processQuotes': True, u'limit': -1, u'trimStrings': False, u'storeBlankCellsAsNulls': True, u'guessCellValueTypes': False, u'includeFileSources': False}]\n",
" column 001: 🔣\n",
" column 002: code\n",
" column 003: meaning\n",
"🔣\tcode\tmeaning\n",
"🍇\t1F347\tGRAPES\n",
"🍉\t1F349\tWATERMELON\n",
"🍒\t1F352\tCHERRIES\n",
"🍓\t1F353\tSTRAWBERRY\n",
"🍍\t1F34D\tPINEAPPLE\n"
]
}
],
"source": [
"./openrefine-client --create data/cli/evil-fruits.tsv\n",
"./openrefine-client --info \"evil-fruits\"\n",
"./openrefine-client --export \"evil-fruits\""
]
},
{
"cell_type": "code",
"execution_count": 68,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Export to file emojis.csv complete\n",
"🔣,code,meaning\n",
"🍇,1F347,GRAPES\n",
"🍉,1F349,WATERMELON\n",
"🍒,1F352,CHERRIES\n",
"🍓,1F353,STRAWBERRY\n",
"🍍,1F34D,PINEAPPLE\n"
]
}
],
"source": [
"./openrefine-client --export \"evil-fruits\" --output emojis.csv\n",
"cat emojis.csv"
]
},
{
"cell_type": "code",
"execution_count": 69,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{ \"emojis\" : [\n",
" { \"symbol\" : \"🍇\", \"meaning\" : \"GRAPES\" },\n",
" { \"symbol\" : \"🍉\", \"meaning\" : \"WATERMELON\" },\n",
" { \"symbol\" : \"🍍\", \"meaning\" : \"PINEAPPLE\" }\n",
"] }"
]
}
],
"source": [
"./openrefine-client --export \"evil-fruits\" \\\n",
"--prefix='{ \"emojis\" : [\n",
"' \\\n",
"--template=' { \"symbol\" : {{jsonize(with(row.columnNames[0],cn,cells[cn].value))}}, \"meaning\" : {{jsonize(cells[\"meaning\"].value)}} }' \\\n",
"--rowSeparator=',\n",
"' \\\n",
"--suffix='\n",
"] }' \\\n",
"--filterQuery='^1F34' \\\n",
"--filterColumn='code'"
]
},
{
"cell_type": "code",
"execution_count": 70,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Export to files complete. Last file: trái cây_3.json\n"
]
}
],
"source": [
"./openrefine-client --export \"evil-fruits\" \\\n",
"--prefix='{ \"emojis\" : [\n",
"' \\\n",
"--template=' { \"symbol\" : {{jsonize(with(row.columnNames[0],cn,cells[cn].value))}}, \"meaning\" : {{jsonize(cells[\"meaning\"].value)}} }' \\\n",
"--rowSeparator=',\n",
"' \\\n",
"--suffix='\n",
"] }' \\\n",
"--filterQuery='^1F34' \\\n",
"--filterColumn='code' \\\n",
"--output='trái cây.json' \\\n",
"--splitToFiles=true"
]
},
{
"cell_type": "code",
"execution_count": 71,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Export to files complete. Last file: trái cây_🍍.json\n"
]
}
],
"source": [
"./openrefine-client --export \"evil-fruits\" \\\n",
"--prefix='{ \"emojis\" : [\n",
"' \\\n",
"--template=' { \"symbol\" : {{jsonize(with(row.columnNames[0],cn,cells[cn].value))}}, \"meaning\" : {{jsonize(cells[\"meaning\"].value)}} }' \\\n",
"--rowSeparator=',\n",
"' \\\n",
"--suffix='\n",
"] }' \\\n",
"--filterQuery='^1F34' \\\n",
"--filterColumn='code' \\\n",
"--output='trái cây.json' \\\n",
"--splitToFiles=true \\\n",
"--suffixById=true"
]
},
{
"cell_type": "code",
"execution_count": 72,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" advanced_1.json emojis.csv\n",
" advanced_2.json \u001b[0m\u001b[38;5;40mopenrefine-client\u001b[0m\n",
" advanced_3.json 'trái cây_1.json'\n",
" advanced_jean.griffith@example5.org.json 'trái cây_2.json'\n",
" advanced_melanie.white@example2.edu.json 'trái cây_3.json'\n",
" \u001b[38;5;33mdata\u001b[0m 'trái cây_🍇.json'\n",
" deduped.xls 'trái cây_🍉.json'\n",
" duplicates.csv 'trái cây_🍍.json'\n",
" duplicates-deletion.json\n"
]
}
],
"source": [
"ls"
]
},
{
"cell_type": "code",
"execution_count": 73,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Project 2280962953279 has been successfully deleted\n"
]
}
],
"source": [
"./openrefine-client --delete \"evil-fruits\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### emoji-data"
]
},
{
"cell_type": "code",
"execution_count": 74,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"id: 2019865211741\n",
"rows: 20\n",
" id: 2019865211741\n",
" url: http://127.0.0.1:3333/project?project=2019865211741\n",
" name: dữ liệu biểu tượng cảm xúc\n",
" modified: 2019-08-21T23:41:06Z\n",
" created: 2019-08-21T23:41:06Z\n",
" rowCount: 20\n",
"importOptionMetadata: [{u'storeEmptyStrings': True, u'fileSource': u'data/cli/d\\u1eef li\\u1ec7u bi\\u1ec3u t\\u01b0\\u1ee3ng c\\u1ea3m x\\xfac.txt', u'storeBlankRows': True, u'encoding': u'', u'projectName': u'd\\u1eef li\\u1ec7u bi\\u1ec3u t\\u01b0\\u1ee3ng c\\u1ea3m x\\xfac', u'processQuotes': True, u'skipDataLines': 34, u'limit': 20, u'trimStrings': False, u'storeBlankCellsAsNulls': True, u'guessCellValueTypes': False, u'includeFileSources': False, u'headerLines': 0}]\n",
" column 001: Column 1\n",
" column 002: Column 2\n",
" column 003: Column 3\n",
" column 004: Column 4\n",
" column 005: Column 5\n",
" column 006: Column 6\n",
"Column 1\tColumn 2\tColumn 3\tColumn 4\tColumn 5\tColumn 6\n",
"00A9 ;\ttext ;\tL1 ;\tnone ;\tj\t# V1.1 (©) COPYRIGHT SIGN\n",
"00AE ;\ttext ;\tL1 ;\tnone ;\tj\t# V1.1 (®) REGISTERED SIGN\n",
"203C ;\ttext ;\tL1 ;\tnone ;\ta j\t# V1.1 (‼) DOUBLE EXCLAMATION MARK\n",
"2049 ;\ttext ;\tL1 ;\tnone ;\ta j\t# V3.0 (⁉) EXCLAMATION QUESTION MARK\n",
"2122 ;\ttext ;\tL1 ;\tnone ;\tj\t# V1.1 (™) TRADE MARK SIGN\n",
"2139 ;\ttext ;\tL1 ;\tnone ;\tj\t# V3.0 () INFORMATION SOURCE\n",
"2194 ;\ttext ;\tL1 ;\tnone ;\tz j\t# V1.1 (↔) LEFT RIGHT ARROW\n",
"2195 ;\ttext ;\tL1 ;\tnone ;\tz j\t# V1.1 (↕) UP DOWN ARROW\n",
"2196 ;\ttext ;\tL1 ;\tnone ;\tj\t# V1.1 (↖) NORTH WEST ARROW\n",
"2197 ;\ttext ;\tL1 ;\tnone ;\tj\t# V1.1 (↗) NORTH EAST ARROW\n",
"2198 ;\ttext ;\tL1 ;\tnone ;\tj\t# V1.1 (↘) SOUTH EAST ARROW\n",
"2199 ;\ttext ;\tL1 ;\tnone ;\tj\t# V1.1 (↙) SOUTH WEST ARROW\n",
"21A9 ;\ttext ;\tL1 ;\tnone ;\tj\t# V1.1 (↩) LEFTWARDS ARROW WITH HOOK\n",
"21AA ;\ttext ;\tL1 ;\tnone ;\tj\t# V1.1 (↪) RIGHTWARDS ARROW WITH HOOK\n",
"231A ;\temoji ;\tL1 ;\tnone ;\tj\t# V1.1 (⌚) WATCH\n",
"231B ;\temoji ;\tL1 ;\tnone ;\tj\t# V1.1 (⌛) HOURGLASS\n",
"2328 ;\ttext ;\tL2 ;\tnone ;\tx\t# V1.1 (⌨) KEYBOARD\n",
"23CF ;\ttext ;\tL2 ;\tnone ;\tx\t# V4.0 (⏏) EJECT SYMBOL\n",
"23E9 ;\temoji ;\tL1 ;\tnone ;\tj w\t# V6.0 (⏩) BLACK RIGHT-POINTING DOUBLE TRIANGLE\n",
"23EA ;\temoji ;\tL1 ;\tnone ;\tj w\t# V6.0 (⏪) BLACK LEFT-POINTING DOUBLE TRIANGLE\n"
]
}
],
"source": [
"./openrefine-client --create \"data/cli/dữ liệu biểu tượng cảm xúc.txt\" \\\n",
"--format=tsv \\\n",
"--headerLines=0 \\\n",
"--skipDataLines=34 \\\n",
"--limit=20\n",
"./openrefine-client --info \"dữ liệu biểu tượng cảm xúc\"\n",
"./openrefine-client --export \"dữ liệu biểu tượng cảm xúc\""
]
},
{
"cell_type": "code",
"execution_count": 75,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" 2019865211741: dữ liệu biểu tượng cảm xúc\n"
]
}
],
"source": [
"./openrefine-client --list"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Delete"
]
},
{
"cell_type": "code",
"execution_count": 76,
"metadata": {
"scrolled": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Project 2019865211741 has been successfully deleted\n"
]
}
],
"source": [
"./openrefine-client --delete \"dữ liệu biểu tượng cảm xúc\""
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Bash",
"language": "bash",
"name": "bash"
},
"language_info": {
"codemirror_mode": "shell",
"file_extension": ".sh",
"mimetype": "text/x-sh",
"name": "bash"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
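
Review note: the templating cells in this notebook pass `--prefix`, `--template`, `--rowSeparator` and `--suffix` together. Judging only from the recorded output, the result looks like the prefix, then the rendered rows joined by the row separator, then the suffix. The sketch below mimics that assembly in plain Bash. The `render` function is hypothetical and is not part of openrefine-client, and the row strings are hard-coded where the client would evaluate the template once per row.

```bash
#!/bin/bash
# Hypothetical helper, not part of openrefine-client.
# Usage: render PREFIX SEPARATOR SUFFIX ROW...
render() {
  local prefix="${1}" separator="${2}" suffix="${3}"
  shift 3
  local out="${prefix}" first=1 row
  for row in "$@"; do
    # Insert the separator between rows, never before the first one.
    if (( first )); then
      first=0
    else
      out+="${separator}"
    fi
    out+="${row}"
  done
  printf '%s' "${out}${suffix}"
}

# Rows copied from the recorded output of the gender filter (^F$).
render $'{ "events" : [\n' $',\n' $'\n] }' \
  ' { "name" : "Melanie White", "purchase" : "iPhone" }' \
  ' { "name" : "Jean Griffith", "purchase" : "Power drill" }' \
  ' { "name" : "Melanie White", "purchase" : "iPad" }'
```

Because the separator is only written between rows, the last row carries no trailing comma, which keeps the assembled text valid JSON.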

2824
tests/cli_python2.ipynb Normal file

File diff suppressed because it is too large

View File

@ -1,40 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
⌨,code,meaning
⛲,1F347,FOUNTAIN
⛳,1F349,FLAG IN HOLE
⛵,1F352,SAILBOAT
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
⌨ code meaning
⛲ 1F347 FOUNTAIN
⛳ 1F349 FLAG IN HOLE
⛵ 1F352 SAILBOAT
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv" --projectName "${t} biểu tượng cảm xúc ⛲"
${cmd} --export "${t} biểu tượng cảm xúc ⛲" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"

View File

@ -1,40 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
a,b,c
1,2,3
0,0,0
$,\,'
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
a b c
1 2 3
0 0 0
$ \ '
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"
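
Review note: the DATA and ASSERTION blocks rely on a quoted here-document delimiter (`<< "DATA"`). With a quoted delimiter Bash performs no parameter expansion, command substitution or backslash processing in the body, which is why a row such as `$,\,'` reaches the file unchanged. A small sketch of that behaviour, using a temporary directory created only for this illustration:

```bash
#!/bin/bash
# Temporary directory used only for this illustration.
tmpdir="$(mktemp -d)"

# Quoted delimiter: the body is written literally, so $, \ and ' survive.
cat << "DATA" > "${tmpdir}/quoted.csv"
a,b,c
$,\,'
DATA

# Read back the last line to show that nothing was expanded.
literal="$(tail -n 1 "${tmpdir}/quoted.csv")"
printf '%s\n' "${literal}"

rm -rf "${tmpdir}"
```

With an unquoted delimiter the same body would be subject to expansion, so test data containing `$` could change silently.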

View File

@ -1,53 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.json"
[
{
"⌨": "⛲",
"code": "1F347",
"meaning": "FOUNTAIN"
},
{
"⌨": "⛳",
"code": "1F349",
"meaning": "FLAG IN HOLE"
},
{
"⌨": "⛵",
"code": "1F352",
"meaning": "SAILBOAT"
}
]
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
_ - ⌨ _ - code _ - meaning
⛲ 1F347 FOUNTAIN
⛳ 1F349 FLAG IN HOLE
⛵ 1F352 SAILBOAT
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.json"
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"

View File

@ -1,53 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.json"
[
{
"a": 1,
"b": 2,
"c": 3
},
{
"a": 0,
"b": 0,
"c": 0
},
{
"a": "$",
"b": "\\",
"c": "\""
}
]
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
_ - a _ - b _ - c
1 2 3
0 0 0
$ \ """"
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.json"
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"

View File

@ -1,44 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
if [[ ${2} ]]; then
version="${2}"
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cp "data/example.ods" "tmp/${t}/${t}.ods"
# ================================= ASSERTION ================================ #
if [[ "${version}" = "2.7" ]]; then
cat << "DATA" > "tmp/${t}/${t}.assert"
⌨ code meaning Column Column 5 Column 6 Column 7 Column 8
⛲ 1F347 FOUNTAIN
⛳ 1F349 FLAG IN HOLE
⛵ 1F352 SAILBOAT
DATA
else
#TODO
echo "https://github.com/opencultureconsulting/openrefine-client/issues/4"
exit 200
fi
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.ods" --sheets 1
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"
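
Review note: for OpenRefine versions other than 2.7 this script prints the issue URL and exits with status 200 instead of running the comparison. The runner `tests.sh` is not part of this excerpt, so how it treats that status is unknown. The sketch below shows one way a runner could tell a skipped test from a failed one. Both `classify` and `run_case` are hypothetical names, and treating 200 as "skip" is an assumption.

```bash
#!/bin/bash
# Hypothetical mapping: 0 = pass, 200 = skipped (known issue), anything else = fail.
classify() {
  case "${1}" in
    0)   echo "pass" ;;
    200) echo "skip" ;;
    *)   echo "fail" ;;
  esac
}

# Run a command in a subshell, discard its output, and classify its exit status.
run_case() {
  ( "$@" ) > /dev/null 2>&1
  classify "$?"
}

run_case true                  # a test that succeeds
run_case bash -c 'exit 200'    # a test that reports a known issue
run_case false                 # a test that fails
```

A dedicated status keeps the known ODS/XLS import differences visible in a report without marking the whole run as failed.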

View File

@ -1,48 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
if [[ ${2} ]]; then
version="${2}"
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cp "data/example.ods" "tmp/${t}/${t}.ods"
#a b c
#1 2 3
#0 0 0
#$ \ '
# ================================= ASSERTION ================================ #
if [[ "${version}" = "2.7" ]]; then
cat << "DATA" > "tmp/${t}/${t}.assert"
a b c Column Column 5 Column 6 Column 7 Column 8
1.0 2.0 3.0
0.0 0.0 0.0
$ \ '
DATA
else
#TODO
echo "https://github.com/opencultureconsulting/openrefine-client/issues/4"
exit 200
fi
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.ods"
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"

View File

@ -1,40 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.tsv"
⌨ code meaning
⛲ 1F347 FOUNTAIN
⛳ 1F349 FLAG IN HOLE
⛵ 1F352 SAILBOAT
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
⌨,code,meaning
⛲,1F347,FOUNTAIN
⛳,1F349,FLAG IN HOLE
⛵,1F352,SAILBOAT
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.tsv"
${cmd} --export "${t}" --output "tmp/${t}/${t}.csv"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.csv"

View File

@ -1,40 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.tsv"
a b c
1 2 3
0 0 0
$ \ '
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
a,b,c
1,2,3
0,0,0
$,\,'
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.tsv"
${cmd} --export "${t}" --output "tmp/${t}/${t}.csv"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.csv"
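
Review note: every script in this set ends with the same assertion, a `diff -u` between an expected file and the exported file, so the exit status of `diff` is the test result. The sketch below reproduces that pattern without an OpenRefine server. The `tr` call is only a stand-in for the `--create` and `--export` round trip and is not how the client converts data.

```bash
#!/bin/bash
# Temporary directory used only for this illustration.
workdir="$(mktemp -d)"

# Input and expected output, mirroring the TSV to CSV case.
printf 'a\tb\tc\n1\t2\t3\n' > "${workdir}/input.tsv"
printf 'a,b,c\n1,2,3\n' > "${workdir}/expected.csv"

# Stand-in for the create/export round trip (assumption, not the real client).
tr '\t' ',' < "${workdir}/input.tsv" > "${workdir}/actual.csv"

# diff exits 0 when the files match and prints a unified diff otherwise.
if diff -u "${workdir}/expected.csv" "${workdir}/actual.csv"; then
  result="pass"
else
  result="fail"
fi
echo "${result}"

rm -rf "${workdir}"
```

On a mismatch the unified diff shows the differing rows directly in the test log.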

View File

@ -1,81 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.txt"
⛲ 1F347 FOUNTAIN
⛳ 1F349 FLAG IN HOLE
⛵ 1F352 SAILBOAT
DATA
cat << "DATA" > "tmp/${t}/${t}.transform"
[
{
"op": "core/text-transform",
"engineConfig": {
"facets": [],
"mode": "row-based"
},
"columnName": "Column 1",
"expression": "grel:value.trim()",
"onError": "keep-original",
"repeat": false,
"repeatCount": 10
},
{
"op": "core/text-transform",
"engineConfig": {
"facets": [],
"mode": "row-based"
},
"columnName": "Column 2",
"expression": "grel:value.trim()",
"onError": "keep-original",
"repeat": false,
"repeatCount": 10
},
{
"op": "core/text-transform",
"engineConfig": {
"facets": [],
"mode": "row-based"
},
"columnName": "Column 3",
"expression": "grel:value.trim()",
"onError": "keep-original",
"repeat": false,
"repeatCount": 10
}
]
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
Column 1 Column 2 Column 3
⛲ 1F347 FOUNTAIN
⛳ 1F349 FLAG IN HOLE
⛵ 1F352 SAILBOAT
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.txt" --columnWidths "6" --columnWidths "6" --columnWidths "60"
${cmd} --apply "tmp/${t}/${t}.transform" "${t}"
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"

View File

@ -1,81 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.txt"
1 2 3
mon tue wed
$2 $300 $1
DATA
cat << "DATA" > "tmp/${t}/${t}.transform"
[
{
"op": "core/text-transform",
"engineConfig": {
"facets": [],
"mode": "row-based"
},
"columnName": "Column 1",
"expression": "grel:value.trim()",
"onError": "keep-original",
"repeat": false,
"repeatCount": 10
},
{
"op": "core/text-transform",
"engineConfig": {
"facets": [],
"mode": "row-based"
},
"columnName": "Column 2",
"expression": "grel:value.trim()",
"onError": "keep-original",
"repeat": false,
"repeatCount": 10
},
{
"op": "core/text-transform",
"engineConfig": {
"facets": [],
"mode": "row-based"
},
"columnName": "Column 3",
"expression": "grel:value.trim()",
"onError": "keep-original",
"repeat": false,
"repeatCount": 10
}
]
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
Column 1 Column 2 Column 3
1 2 3
mon tue wed
$2 $300 $1
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.txt" --columnWidths "6" --columnWidths "6" --columnWidths "6"
${cmd} --apply "tmp/${t}/${t}.transform" "${t}"
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"
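
Review note: the fixed-width tests pass one `--columnWidths` value per column and then apply `grel:value.trim()` to each resulting column. The sketch below illustrates the idea of cutting a line at fixed offsets and trimming each field. `split_fixed` is a hypothetical helper written for this note and says nothing about how OpenRefine implements its importer. The input row is built with `printf` padding because the original column padding is not recoverable from this view.

```bash
#!/bin/bash
# Hypothetical helper: split one line into fields of the given widths,
# trim each field, and print the fields separated by tabs.
split_fixed() {
  local line="${1}"
  shift
  local offset=0 width field out=()
  for width in "$@"; do
    field="${line:offset:width}"
    # Trim leading whitespace, then trailing whitespace.
    field="${field#"${field%%[![:space:]]*}"}"
    field="${field%"${field##*[![:space:]]}"}"
    out+=("${field}")
    offset=$((offset + width))
  done
  (IFS=$'\t'; printf '%s\n' "${out[*]}")
}

# Build a row padded to three columns of width 6, then split it again.
row="$(printf '%-6s%-6s%-6s' 'mon' 'tue' 'wed')"
split_fixed "${row}" 6 6 6
```

Bash substring offsets count characters only in a UTF-8 locale, so the Unicode variant of this test would additionally depend on the locale settings of the shell.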

View File

@ -1,39 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.txt"
1 2 3
mon tue wed
$2 $300 $1
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
Column 1
1 2 3
mon tue wed
$2 $300 $1
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.txt"
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"

View File

@ -1,44 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
if [[ ${2} ]]; then
version="${2}"
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cp "data/example.xls" "tmp/${t}/${t}.xls"
# ================================= ASSERTION ================================ #
if [[ "${version}" = "2.7" ]]; then
cat << "DATA" > "tmp/${t}/${t}.assert"
⌨ code meaning
⛲ 1F347 FOUNTAIN
⛳ 1F349 FLAG IN HOLE
⛵ 1F352 SAILBOAT
DATA
else
#TODO
echo "https://github.com/opencultureconsulting/openrefine-client/issues/4"
exit 200
fi
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.xls" --sheets 1
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"


@ -1,48 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
if [[ ${2} ]]; then
version="${2}"
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cp "data/example.xls" "tmp/${t}/${t}.xls"
#a b c
#1 2 3
#0 0 0
#$ \ '
# ================================= ASSERTION ================================ #
if [[ "${version}" = "2.7" ]]; then
cat << "DATA" > "tmp/${t}/${t}.assert"
a b c
1.0 2.0 3.0
0.0 0.0 0.0
$ \ '
DATA
else
#TODO
echo "https://github.com/opencultureconsulting/openrefine-client/issues/4"
exit 200
fi
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.xls"
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"


@ -1,44 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
if [[ ${2} ]]; then
version="${2}"
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cp "data/example.xlsx" "tmp/${t}/${t}.xlsx"
# ================================= ASSERTION ================================ #
if [[ "${version}" = "2.7" ]]; then
cat << "DATA" > "tmp/${t}/${t}.assert"
⌨ code meaning
⛲ 1F347 FOUNTAIN
⛳ 1F349 FLAG IN HOLE
⛵ 1F352 SAILBOAT
DATA
else
#TODO
echo "https://github.com/opencultureconsulting/openrefine-client/issues/4"
exit 200
fi
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.xlsx" --sheets 1
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"


@ -1,48 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
if [[ ${2} ]]; then
version="${2}"
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cp "data/example.xlsx" "tmp/${t}/${t}.xlsx"
#a b c
#1 2 3
#0 0 0
#$ \ '
# ================================= ASSERTION ================================ #
if [[ "${version}" = "2.7" ]]; then
cat << "DATA" > "tmp/${t}/${t}.assert"
a b c
1.0 2.0 3.0
0.0 0.0 0.0
$ \ '
DATA
else
#TODO
echo "https://github.com/opencultureconsulting/openrefine-client/issues/4"
exit 200
fi
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.xlsx"
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"


@ -1,95 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.xml"
<?xml version="1.0" encoding="UTF-8"?>
<root>
<record>
<icon>⛲</icon>
<code>1F347</code>
<meaning>FOUNTAIN</meaning>
</record>
<record>
<icon>⛳</icon>
<code>1F349</code>
<meaning>FLAG IN HOLE</meaning>
</record>
<record>
<icon>⛵</icon>
<code>1F352</code>
<meaning>SAILBOAT</meaning>
</record>
</root>
DATA
cat << "DATA" > "tmp/${t}/${t}.transform"
[
{
"op": "core/column-removal",
"columnName": "root"
},
{
"op": "core/column-removal",
"columnName": "root - record"
},
{
"op": "core/row-removal",
"engineConfig": {
"facets": [
{
"type": "list",
"name": "Blank Rows",
"expression": "(filter(row.columnNames,cn,isNonBlank(cells[cn].value)).length()==0).toString()",
"columnName": "",
"invert": false,
"omitBlank": false,
"omitError": false,
"selection": [
{
"v": {
"v": "true",
"l": "true"
}
}
],
"selectBlank": false,
"selectError": false
}
],
"mode": "record-based"
}
}
]
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
root - record - icon root - record - code root - record - meaning
⛲ 1F347 FOUNTAIN
⛳ 1F349 FLAG IN HOLE
⛵ 1F352 SAILBOAT
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.xml"
${cmd} --apply "tmp/${t}/${t}.transform" "${t}"
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"


@ -1,95 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.xml"
<?xml version="1.0" encoding="UTF-8"?>
<root>
<record>
<a>1</a>
<b>2</b>
<c>3</c>
</record>
<record>
<a>0</a>
<b>0</b>
<c>0</c>
</record>
<record>
<a>$</a>
<b>\</b>
<c>'</c>
</record>
</root>
DATA
cat << "DATA" > "tmp/${t}/${t}.transform"
[
{
"op": "core/column-removal",
"columnName": "root"
},
{
"op": "core/column-removal",
"columnName": "root - record"
},
{
"op": "core/row-removal",
"engineConfig": {
"facets": [
{
"type": "list",
"name": "Blank Rows",
"expression": "(filter(row.columnNames,cn,isNonBlank(cells[cn].value)).length()==0).toString()",
"columnName": "",
"invert": false,
"omitBlank": false,
"omitError": false,
"selection": [
{
"v": {
"v": "true",
"l": "true"
}
}
],
"selectBlank": false,
"selectError": false
}
],
"mode": "record-based"
}
}
]
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
root - record - a root - record - b root - record - c
1 2 3
0 0 0
$ \ '
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.xml"
${cmd} --apply "tmp/${t}/${t}.transform" "${t}"
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"


@ -0,0 +1,11 @@
email,name,state,gender,purchase,count,date
danny.baron@example1.com,Danny Baron,CA,M,TV (UTF-8: 📺),1,"Wed, 4 Jul 2001"
melanie.white@example2.edu,Melanie White,NC,F,<iPhone>,1,2001-07-04T12:08:56
danny.baron@example1.com, D. ("Tab") Baron,CA,M,Winter jacket,1,2001-07-04
ben.tyler@example3.org,Ben Tyler,NV,M,Flashlight,1,2001/07/04
arthur.duff@example4.com,Arthur Duff,OR,M,Dining table,1,2001-07
danny.baron@example1.com,Daniel Baron,,,Bike,1,2001
jean.griffith@example5.org,Jean Griffith,WA,F,Power drill,1,2000
melanie.white@example2.edu,Melanie White,NC,F,'iPad',1,1999
ben.morisson@example6.org,Ben Morisson,FL,M,Amplifier,1,1998
arthur.duff@example4.com,Arthur Duff,OR,M,Night table,1,1997


@ -0,0 +1,92 @@
[
{
"email": "danny.baron@example1.com",
"name": "Danny Baron",
"state": "CA",
"gender": "M",
"purchase": "TV (UTF-8: 📺)",
"count": 1,
"date": "Wed, 4 Jul 2001"
},
{
"email": "melanie.white@example2.edu",
"name": "Melanie White",
"state": "NC",
"gender": "F",
"purchase": "<iPhone>",
"count": 1,
"date": "2001-07-04T12:08:56"
},
{
"email": "danny.baron@example1.com",
"name": " D.\t(\"Tab\") Baron",
"state": "CA",
"gender": "M",
"purchase": "Winter jacket",
"count": 1,
"date": "2001-07-04"
},
{
"email": "ben.tyler@example3.org",
"name": "Ben Tyler",
"state": "NV",
"gender": "M",
"purchase": "Flashlight",
"count": 1,
"date": "2001/07/04"
},
{
"email": "arthur.duff@example4.com",
"name": "Arthur Duff",
"state": "OR",
"gender": "M",
"purchase": "Dining table",
"count": 1,
"date": "2001-07"
},
{
"email": "danny.baron@example1.com",
"name": "Daniel Baron",
"state": "",
"gender": "",
"purchase": "Bike",
"count": 1,
"date": 2001
},
{
"email": "jean.griffith@example5.org",
"name": "Jean Griffith",
"state": "WA",
"gender": "F",
"purchase": "Power drill",
"count": 1,
"date": 2000
},
{
"email": "melanie.white@example2.edu",
"name": "Melanie White",
"state": "NC",
"gender": "F",
"purchase": "'iPad'",
"count": 1,
"date": 1999
},
{
"email": "ben.morisson@example6.org",
"name": "Ben Morisson",
"state": "FL",
"gender": "M",
"purchase": "Amplifier",
"count": 1,
"date": 1998
},
{
"email": "arthur.duff@example4.com",
"name": "Arthur Duff",
"state": "OR",
"gender": "M",
"purchase": "Night table",
"count": 1,
"date": 1997
}
]

Binary file not shown.


@ -0,0 +1,11 @@
email name state gender purchase count date
danny.baron@example1.com Danny Baron CA M TV (UTF-8: 📺) 1 Wed, 4 Jul 2001
melanie.white@example2.edu Melanie White NC F <iPhone> 1 2001-07-04T12:08:56
danny.baron@example1.com "D. (""Tab"") Baron" CA M Winter jacket 1 2001-07-04
ben.tyler@example3.org Ben Tyler NV M Flashlight 1 2001/07/04
arthur.duff@example4.com Arthur Duff OR M Dining table 1 2001-07
danny.baron@example1.com Daniel Baron Bike 1 2001
jean.griffith@example5.org Jean Griffith WA F Power drill 1 2000
melanie.white@example2.edu Melanie White NC F 'iPad' 1 1999
ben.morisson@example6.org Ben Morisson FL M Amplifier 1 1998
arthur.duff@example4.com Arthur Duff OR M Night table 1 1997


@ -0,0 +1,11 @@
email name state gender purchase count date
danny.baron@example1.com Danny Baron CA M TV (UTF-8: 📺) 1 Wed, 4 Jul 2001
melanie.white@example2.edu Melanie White NC F <iPhone> 1 2001-07-04T12:08:5
danny.baron@example1.com D. ("Tab") Baron CA M Winter jacket 1 2001-07-04
ben.tyler@example3.org Ben Tyler NV M Flashlight 1 2001/07/04
arthur.duff@example4.com Arthur Duff OR M Dining table 1 2001-07
danny.baron@example1.com Daniel Baron Bike 1 2001
jean.griffith@example5.org Jean Griffith WA F Power drill 1 2000
melanie.white@example2.edu Melanie White NC F 'iPad' 1 1999
ben.morisson@example6.org Ben Morisson FL M Amplifier 1 1998
arthur.duff@example4.com Arthur Duff OR M Night table 1 1997

Binary file not shown.

Binary file not shown.


@ -0,0 +1,93 @@
<?xml version="1.0" encoding="UTF-8"?>
<root>
<record>
<email>danny.baron@example1.com</email>
<name>Danny Baron</name>
<state>CA</state>
<gender>M</gender>
<purchase>TV (UTF-8: 📺)</purchase>
<count>1</count>
<date>Wed, 4 Jul 2001</date>
</record>
<record>
<email>melanie.white@example2.edu</email>
<name>Melanie White</name>
<state>NC</state>
<gender>F</gender>
<purchase>&lt;iPhone&gt;</purchase>
<count>1</count>
<date>2001-07-04T12:08:56</date>
</record>
<record>
<email>danny.baron@example1.com</email>
<name> D. (&quot;Tab&quot;) Baron</name>
<state>CA</state>
<gender>M</gender>
<purchase>Winter jacket</purchase>
<count>1</count>
<date>2001-07-04</date>
</record>
<record>
<email>ben.tyler@example3.org</email>
<name>Ben Tyler</name>
<state>NV</state>
<gender>M</gender>
<purchase>Flashlight</purchase>
<count>1</count>
<date>2001/07/04</date>
</record>
<record>
<email>arthur.duff@example4.com</email>
<name>Arthur Duff</name>
<state>OR</state>
<gender>M</gender>
<purchase>Dining table</purchase>
<count>1</count>
<date>2001-07</date>
</record>
<record>
<email>danny.baron@example1.com</email>
<name>Daniel Baron</name>
<state></state>
<gender></gender>
<purchase>Bike</purchase>
<count>1</count>
<date>2001</date>
</record>
<record>
<email>jean.griffith@example5.org</email>
<name>Jean Griffith</name>
<state>WA</state>
<gender>F</gender>
<purchase>Power drill</purchase>
<count>1</count>
<date>2000</date>
</record>
<record>
<email>melanie.white@example2.edu</email>
<name>Melanie White</name>
<state>NC</state>
<gender>F</gender>
<purchase>&apos;iPad&apos;</purchase>
<count>1</count>
<date>1999</date>
</record>
<record>
<email>ben.morisson@example6.org</email>
<name>Ben Morisson</name>
<state>FL</state>
<gender>M</gender>
<purchase>Amplifier</purchase>
<count>1</count>
<date>1998</date>
</record>
<record>
<email>arthur.duff@example4.com</email>
<name>Arthur Duff</name>
<state>OR</state>
<gender>M</gender>
<purchase>Night table</purchase>
<count>1</count>
<date>1997</date>
</record>
</root>

Binary file not shown.


@ -0,0 +1,10 @@
<?xml version="1.0" encoding="UTF-8"?>
<record>
<email>danny.baron@example1.com</email>
<name>Danny Baron</name>
<state>CA</state>
<gender>M</gender>
<purchase>TV (UTF-8: 📺)</purchase>
<count>1</count>
<date>Wed, 4 Jul 2001</date>
</record>

Binary file not shown.

Binary file not shown.

Binary file not shown.

File diff suppressed because it is too large


@ -0,0 +1,6 @@
🔣 code meaning
🍇 1F347 GRAPES
🍉 1F349 WATERMELON
🍒 1F352 CHERRIES
🍓 1F353 STRAWBERRY
🍍 1F34D PINEAPPLE

5410
tests/data/eli-lilly.csv Normal file

File diff suppressed because it is too large

Binary file not shown.

Binary file not shown.

Binary file not shown.

16
tests/data/fixed-rows.csv Normal file

@ -0,0 +1,16 @@
Tom Dalton sent $3700 to Betty Whitehead on 01/17/2009
377 El Camino Real
"San Jose, CA"
Status: received
Morgan Lawless received $10500 from Bob Henselman on 02/05/2009
2798 Lancaster Dr.
"New York, NY"
Status: deposited
Eric Bateman sent $22000 to Liz Benedict on 03/02/2009
89 Deerfield Cr.
"Springfield, WA"
Status: received
Robert Hartfort received $20000 from Ron Ingleman on 03/28/2009
198 Broadway Ave.
"Saratoga, CA"
Status: unknown

File diff suppressed because it is too large

File diff suppressed because it is too large


@ -0,0 +1,18 @@
Tom Dalton sent $3700 to Betty Whitehead on 01/17/2009
377 El Camino Real
"San Jose, CA"
Status: received
Morgan Lawless received $10500 from Bob Henselman on 02/05/2009
2798 Lancaster Dr.
"New York, NY"
(000) 555-6717
Status: deposited
Eric Bateman sent $22000 to Liz Benedict on 03/02/2009
89 Deerfield Cr.
"Springfield, WA"
(000) 555-1411
Status: received
Robert Hartfort received $20000 from Ron Ingleman on 03/28/2009
198 Broadway Ave.
"Saratoga, CA"
Status: unknown


@ -1,35 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
a,b,c
1,2,3
DATA
# ================================= ASSERTION ================================ #
cat << DATA > "tmp/${t}/${t}.assert"
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv" --projectName "${t} biểu tượng cảm xúc ⛲"
${cmd} --delete "${t} biểu tượng cảm xúc ⛲"
${cmd} --list | grep "${t}" | cut -d ':' -f 2 > "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"


@ -1,35 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
a,b,c
1,2,3
DATA
# ================================= ASSERTION ================================ #
cat << DATA > "tmp/${t}/${t}.assert"
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --delete "${t}"
${cmd} --list | grep "${t}" | cut -d ':' -f 2 > "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"


@ -1,21 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# ================================== ACTION ================================== #
${cmd} --download "https://git.io/fj5ju" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "data/duplicates-deletion.json" "tmp/${t}/${t}.output"


@ -1,44 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.tsv"
🔣 code meaning
🍇 1F347 GRAPES
🍉 1F349 WATERMELON
🍒 1F352 CHERRIES
🍓 1F353 STRAWBERRY
🍍 1F34D PINEAPPLE
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
🔣,code,meaning
🍇,1F347,GRAPES
🍉,1F349,WATERMELON
🍒,1F352,CHERRIES
🍓,1F353,STRAWBERRY
🍍,1F34D,PINEAPPLE
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.tsv"
${cmd} --export "${t}" --output "tmp/${t}/${t} biểu tượng cảm xúc 🍉.csv"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t} biểu tượng cảm xúc 🍉.csv"


@ -1,40 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.tsv"
a b c
1 2 3
0 0 0
$ \ '
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
a,b,c
1,2,3
0,0,0
$,\,'
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.tsv"
${cmd} --export "${t}" --output "tmp/${t}/${t}.csv"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.csv"


@ -1,72 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
if [[ ${2} ]]; then
majorversion="${2%%.*}"
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
⌨,code,meaning
⛲,1F347,FOUNTAIN
⛳,1F349,FLAG IN HOLE
⛵,1F352,SAILBOAT
DATA
# ================================= ASSERTION ================================ #
if [[ "$majorversion" = 2 ]]; then
cat << "DATA" > "tmp/${t}/${t}.assert"
<html>
<head>
<title>export-html-utf8</title>
<meta charset="utf-8" />
</head>
<body>
<table>
<tr><th>⌨</th><th>code</th><th>meaning</th></tr>
<tr><td>&#9970;</td><td>1F347</td><td>FOUNTAIN</td></tr>
<tr><td>&#9971;</td><td>1F349</td><td>FLAG IN HOLE</td></tr>
<tr><td>&#9973;</td><td>1F352</td><td>SAILBOAT</td></tr>
</table>
</body>
</html>
DATA
else
cat << "DATA" > "tmp/${t}/${t}.assert"
<html>
<head>
<title>export-html-utf8</title>
<meta charset="utf-8" />
</head>
<body>
<table>
<tr><th>⌨</th><th>code</th><th>meaning</th></tr>
<tr><td>⛲</td><td>1F347</td><td>FOUNTAIN</td></tr>
<tr><td>⛳</td><td>1F349</td><td>FLAG IN HOLE</td></tr>
<tr><td>⛵</td><td>1F352</td><td>SAILBOAT</td></tr>
</table>
</body>
</html>
DATA
fi
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --export "${t}" --output "tmp/${t}/${t}.html"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.html"
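
The two assertion branches in the script above differ only in how non-ASCII cell values are expected in the HTML export: the branch for major version 2 expects decimal numeric character references, the other expects the literal characters. A minimal sketch, independent of the test suite, of the version check and of how those references map to code points (the helper names `major_of` and `ncr` are hypothetical):

```bash
#!/bin/bash
# Hypothetical helpers for illustration; neither is part of the test suite.

# Reduce a version string such as "2.7" to its major component, using the
# same ${var%%.*} expansion as the script above.
major_of() {
  local version="${1}"
  echo "${version%%.*}"
}

# Print the decimal numeric character reference for a hex code point.
ncr() {
  printf '&#%d;\n' "0x${1}"
}

major_of "2.7"
major_of "3.2"
ncr 26F2   # U+26F2 FOUNTAIN
ncr 26F3   # U+26F3 FLAG IN HOLE
ncr 26F5   # U+26F5 SAILBOAT
```

The three `ncr` calls print the same references that appear in the version 2 assertion, so both branches describe the same table content in different encodings.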


@ -1,50 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
a,b,c
1,2,3
0,0,0
$,\,'
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
<html>
<head>
<title>export-html</title>
<meta charset="utf-8" />
</head>
<body>
<table>
<tr><th>a</th><th>b</th><th>c</th></tr>
<tr><td>1</td><td>2</td><td>3</td></tr>
<tr><td>0</td><td>0</td><td>0</td></tr>
<tr><td>$</td><td>\</td><td>&apos;</td></tr>
</table>
</body>
</html>
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --export "${t}" --output "tmp/${t}/${t}.html"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.html"


@ -1,43 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
⌨,code,meaning
⛲,1F347,FOUNTAIN
⛳,1F349,FLAG IN HOLE
⛵,1F352,SAILBOAT
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
⌨,code,meaning
⛲,1F347,FOUNTAIN
⛳,1F349,"FLAG IN HOLE"
⛵,1F352,SAILBOAT
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --export "${t}" --output "tmp/${t}/${t}.ods"
(cd tmp/"${t}" &&
ssconvert -S "${t}.ods" "${t}.csv" &&
mv "${t}.csv.1" "${t}.output")
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"


@ -1,47 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
if [[ -z "$(command -v ssconvert 2> /dev/null)" ]] ; then
echo 1>&2 "ERROR: This test requires ssconvert (gnumeric)"
exit 127
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
a,b,c
1,2,3
0,0,0
$,\,'
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
a,b,c
1,2,3
0,0,0
$,\,'
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --export "${t}" --output "tmp/${t}/${t}.ods"
(cd tmp/"${t}" &&
ssconvert -S "${t}.ods" "${t}.csv" &&
mv "${t}.csv.1" "${t}.output")
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"


@ -1,44 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
🔣,code,meaning
🍇,1F347,GRAPES
🍉,1F349,WATERMELON
🍒,1F352,CHERRIES
🍓,1F353,STRAWBERRY
🍍,1F34D,PINEAPPLE
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
🔣 code meaning
🍇 1F347 GRAPES
🍉 1F349 WATERMELON
🍒 1F352 CHERRIES
🍓 1F353 STRAWBERRY
🍍 1F34D PINEAPPLE
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --export "${t}" --output "tmp/${t}/${t}.tsv"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.tsv"


@ -1,40 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
a,b,c
1,2,3
0,0,0
$,\,'
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
a b c
1 2 3
0 0 0
$ \ '
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --export "${t}" --output "tmp/${t}/${t}.tsv"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.tsv"


@ -1,44 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
🔣,code,meaning
🍇,1F347,GRAPES
🍉,1F349,WATERMELON
🍒,1F352,CHERRIES
🍓,1F353,STRAWBERRY
🍍,1F34D,PINEAPPLE
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
🔣 code meaning
🍇 1F347 GRAPES
🍉 1F349 WATERMELON
🍒 1F352 CHERRIES
🍓 1F353 STRAWBERRY
🍍 1F34D PINEAPPLE
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --export "${t}" > "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"


@ -1,43 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
⌨,code,meaning
⛲,1F347,FOUNTAIN
⛳,1F349,FLAG IN HOLE
⛵,1F352,SAILBOAT
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
⌨,code,meaning
⛲,1F347,FOUNTAIN
⛳,1F349,FLAG IN HOLE
⛵,1F352,SAILBOAT
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --export "${t}" --output "tmp/${t}/${t}.xls"
(cd tmp/"${t}" &&
ssconvert -S "${t}.xls" "${t}.csv" &&
mv "${t}.csv" "${t}.output")
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"


@ -1,43 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
a,b,c
1,2,3
0,0,0
$,\,'
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
a,b,c
1,2,3
0,0,0
$,\,'
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --export "${t}" --output "tmp/${t}/${t}.xls"
(cd tmp/"${t}" &&
ssconvert -S "${t}.xls" "${t}.csv" &&
mv "${t}.csv" "${t}.output")
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"


@ -1,45 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
⌨,code,meaning
⛲,1F347,FOUNTAIN
⛳,1F349,FLAG IN HOLE
⛵,1F352,SAILBOAT
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
⌨,code,meaning
⛲,1F347,FOUNTAIN
⛳,1F349,FLAG IN HOLE
⛵,1F352,SAILBOAT
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --export "${t}" --output "tmp/${t}/${t}.xlsx"
(cd tmp/"${t}" &&
ssconvert -S "${t}.xlsx" "${t}.csv" &&
mv "${t}.csv" "${t}.output")
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"


@ -1,43 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
a,b,c
1,2,3
0,0,0
$,\,'
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
a,b,c
1,2,3
0,0,0
$,\,'
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --export "${t}" --output "tmp/${t}/${t}.xlsx"
(cd tmp/"${t}" &&
ssconvert -S "${t}.xlsx" "${t}.csv" &&
mv "${t}.csv" "${t}.output")
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"


@ -1,40 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
a,b,c
1,2,3
0,0,0
$,\,'
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
a b c
1 2 3
0 0 0
$ \ '
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --export "${t}" > "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"


@ -1,41 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
a,b,c
1,2,3
0,0,0
$,\,'
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
Column 1
a,b,c
1,2,3
0,0,0
$,\,'
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv" --format "line-based"
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"


@ -1,40 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
a,b,c
1,2,3
0,0,0
$,\,'
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
a,b,c
1,2,3
0,0,0
$,\,'
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --export "${t}" --format "csv" > "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"


@ -1,27 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
Script to provide a command line interface to an OpenRefine server.
DATA
# ================================== ACTION ================================== #
${cmd} --help | sed '3q;d' > "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"


@ -1,39 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.tsv"
🔣 code meaning
🍇 1F347 GRAPES
🍉 1F349 WATERMELON
🍒 1F352 CHERRIES
🍓 1F353 STRAWBERRY
🍍 1F34D PINEAPPLE
DATA
# ================================= ASSERTION ================================ #
cat << DATA > "tmp/${t}/${t}.assert"
column 001: 🔣
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.tsv"
${cmd} --info "${t}" | grep 'column 001' > "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"


@ -1,35 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
a,b,c
1,2,3
DATA
# ================================= ASSERTION ================================ #
cat << DATA > "tmp/${t}/${t}.assert"
column 002: b
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --info "${t}" | grep 'column 002' > "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"


@ -1,35 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
a,b,c
1,2,3
DATA
# ================================= ASSERTION ================================ #
cat << DATA > "tmp/${t}/${t}.assert"
${t}
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --list | grep "${t}" | cut -d ':' -f 2 > "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"


@ -1,64 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
#if [[ ${2} ]]; then
# version="${2}"
# majorversion="${2%%.*}"
#fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
a,b,c
1,2,3
0,0,0
$,\,'
DATA
cat << "DATA" > "tmp/${t}/${t}.transform"
[
{
"op": "core/column-addition",
"engineConfig": {
"mode": "row-based"
},
"newColumnName": "apply",
"columnInsertIndex": 2,
"baseColumnName": "b",
"expression": "grel:value.replace('2','TEST')",
"onError": "set-to-blank"
}
]
DATA
# ================================= ASSERTION ================================ #
#if [[ "${version}" = "2.7" ]]; then
cat << "DATA" > "tmp/${t}/${t}.assert"
a b apply c
1 2 TEST 3
0 0 0 0
$ \ \ '
DATA
#else
#fi
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --apply "tmp/${t}/${t}.transform" "${t}"
${cmd} --export "${t}" --output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"

52
tests/refinetest.py Normal file

@ -0,0 +1,52 @@
#!/usr/bin/env python
"""
refinetest.py

RefineTestCase is a base class that loads Refine projects specified by
the class's 'project_file' attribute and provides a 'project' object.

These tests require a connection to a Refine server either at
http://127.0.0.1:3333/ or by specifying environment variables
OPENREFINE_HOST and OPENREFINE_PORT.
"""

# Copyright (c) 2011 Paul Makepeace, Real Programmers. All rights reserved.

import os
import unittest

from google.refine import refine

PATH_TO_TEST_DATA = os.path.join(os.path.dirname(__file__), 'data')


#noinspection PyPep8Naming
class RefineTestCase(unittest.TestCase):
    project_file = None
    project_format = 'text/line-based/*sv'
    project_options = {}
    project = None
    # Section "2. Exploration using Facets": {1}, {2}

    def project_path(self):
        return os.path.join(PATH_TO_TEST_DATA, self.project_file)

    def setUp(self):
        self.server = refine.RefineServer()
        self.refine = refine.Refine(self.server)
        if self.project_file:
            self.project = self.refine.new_project(
                project_file=self.project_path(), project_format=self.project_format, **self.project_options)

    def tearDown(self):
        if self.project:
            self.project.delete()
            self.project = None

    def assertInResponse(self, expect):
        desc = None
        try:
            desc = self.project.history_entry.description
            self.assertTrue(expect in desc)
        except AssertionError:
            raise AssertionError('Expecting "%s" in "%s"' % (expect, desc))
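
The refinetest.py docstring says the tests talk to a Refine server at http://127.0.0.1:3333/ unless environment variables point elsewhere. A minimal sketch of that lookup is shown here. It assumes the variable names OPENREFINE_HOST and OPENREFINE_PORT (the names used in the test_refine.py docstring) and omits port 80 from the URL, as test_init in test_refine.py expects. The helper `resolve_server_url` is hypothetical and is not part of google.refine.

```python
import os


def resolve_server_url(environ=None):
    """Hypothetical helper: build the Refine server URL from the environment.

    Assumes OPENREFINE_HOST / OPENREFINE_PORT, defaulting to 127.0.0.1 / 3333.
    """
    environ = os.environ if environ is None else environ
    host = environ.get('OPENREFINE_HOST', '127.0.0.1')
    port = environ.get('OPENREFINE_PORT', '3333')
    url = 'http://' + host
    # Port 80 is the HTTP default, so it is left out of the URL.
    if port != '80':
        url += ':' + port
    return url


# With no overrides the defaults from the docstring apply.
print(resolve_server_url({}))
```

Passing a plain dict instead of os.environ keeps the sketch testable without changing the real environment.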


@ -1,58 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
email,name,state,gender,purchase
danny.baron@example1.com,Danny Baron,CA,M,TV
melanie.white@example2.edu,Melanie White,NC,F,iPhone
danny.baron@example1.com,D. Baron,CA,M,Winter jacket
ben.tyler@example3.org,Ben Tyler,NV,M,Flashlight
arthur.duff@example4.com,Arthur Duff,OR,M,Dining table
danny.baron@example1.com,Daniel Baron,CA,M,Bike
jean.griffith@example5.org,Jean Griffith,WA,F,Power drill
melanie.white@example2.edu,Melanie White,NC,F,iPad
ben.morisson@example6.org,Ben Morisson,FL,M,Amplifier
arthur.duff@example4.com,Arthur Duff,OR,M,Night table
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
{ "events" : [
{ "name" : "Melanie White", "purchase" : "iPhone" },
{ "name" : "Jean Griffith", "purchase" : "Power drill" },
{ "name" : "Melanie White", "purchase" : "iPad" }
] }
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --export "${t}" \
--prefix '{ "events" : [
' \
--template ' { "name" : {{jsonize(cells["name"].value)}}, "purchase" : {{jsonize(cells["purchase"].value)}} }' \
--rowSeparator ',
' \
--suffix '
] }
' \
--facets '{"type":"list","name":"gender","columnName":"gender","expression":"value","omitBlank":false,"omitError":false,"selection":[{"v":{"v":"F","l":"F"}}],"selectBlank":false,"selectError":false,"invert":false}' \
--output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"


@ -1,54 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
🔣,code,meaning
🍇,1F347,GRAPES
🍉,1F349,WATERMELON
🍒,1F352,CHERRIES
🍓,1F353,STRAWBERRY
🍍,1F34D,PINEAPPLE
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
{ "emojis" : [
{ "symbol" : "🍇", "meaning" : "GRAPES" },
{ "symbol" : "🍉", "meaning" : "WATERMELON" },
{ "symbol" : "🍍", "meaning" : "PINEAPPLE" }
] }
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --export "${t}" \
--prefix '{ "emojis" : [
' \
--template ' { "symbol" : {{jsonize(with(row.columnNames[0],cn,cells[cn].value))}}, "meaning" : {{jsonize(cells["meaning"].value)}} }' \
--rowSeparator ',
' \
--suffix '
] }
' \
--filterQuery '^1F34' \
--filterColumn 'code' \
--output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"


@ -1,59 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
email,name,state,gender,purchase
danny.baron@example1.com,Danny Baron,CA,M,TV
melanie.white@example2.edu,Melanie White,NC,F,iPhone
danny.baron@example1.com,D. Baron,CA,M,Winter jacket
ben.tyler@example3.org,Ben Tyler,NV,M,Flashlight
arthur.duff@example4.com,Arthur Duff,OR,M,Dining table
danny.baron@example1.com,Daniel Baron,CA,M,Bike
jean.griffith@example5.org,Jean Griffith,WA,F,Power drill
melanie.white@example2.edu,Melanie White,NC,F,iPad
ben.morisson@example6.org,Ben Morisson,FL,M,Amplifier
arthur.duff@example4.com,Arthur Duff,OR,M,Night table
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
{ "events" : [
{ "name" : "Melanie White", "purchase" : "iPhone" },
{ "name" : "Jean Griffith", "purchase" : "Power drill" },
{ "name" : "Melanie White", "purchase" : "iPad" }
] }
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --export "${t}" \
--prefix '{ "events" : [
' \
--template ' { "name" : {{jsonize(cells["name"].value)}}, "purchase" : {{jsonize(cells["purchase"].value)}} }' \
--rowSeparator ',
' \
--suffix '
] }
' \
--filterQuery '^F$' \
--filterColumn 'gender' \
--output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"


@ -1,58 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
email,name,state,gender,purchase
arthur.duff@example4.com,Arthur Duff,OR,M,Dining table
,Arthur Duff,OR,M,Night table
ben.morisson@example6.org,Ben Morisson,FL,M,Amplifier
ben.tyler@example3.org,Ben Tyler,NV,M,Flashlight
danny.baron@example1.com,Daniel Baron,CA,M,Bike
,Danny Baron,CA,M,TV
,D. Baron,CA,M,Winter jacket
jean.griffith@example5.org,Jean Griffith,WA,F,Power drill
melanie.white@example2.edu,Melanie White,NC,F,iPad
,Melanie White,NC,F,iPhone
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
{ "events" : [
{ "name" : "Melanie White", "purchase" : "iPad" } { "name" : "Melanie White", "purchase" : "iPhone" }
] }
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --export "${t}" \
--prefix '{ "events" : [
' \
--template ' { "name" : {{jsonize(cells["name"].value)}}, "purchase" : {{jsonize(cells["purchase"].value)}} }' \
--rowSeparator ',
' \
--suffix '
] }
' \
--mode "record-based" \
--splitToFiles true \
--output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
ls "tmp/${t}"
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}_6.output"


@ -1,52 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
🔣,code,meaning
🍇,1F347,GRAPES
🍉,1F349,WATERMELON
🍒,1F352,CHERRIES
🍓,1F353,STRAWBERRY
🍍,1F34D,PINEAPPLE
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
{ "emojis" : [
{ "symbol" : "🍍", "meaning" : "PINEAPPLE" }
] }
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --export "${t}" \
--prefix '{ "emojis" : [
' \
--template ' { "symbol" : {{jsonize(with(row.columnNames[0],cn,cells[cn].value))}}, "meaning" : {{jsonize(cells["meaning"].value)}} }' \
--rowSeparator ',
' \
--suffix '
] }
' \
--splitToFiles true \
--suffixById true \
--output "tmp/${t}/trái cây.json"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/trái cây_🍍.json"


@ -1,58 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
email,name,state,gender,purchase
danny.baron@example1.com,Danny Baron,CA,M,TV
melanie.white@example2.edu,Melanie White,NC,F,iPhone
danny.baron@example1.com,D. Baron,CA,M,Winter jacket
ben.tyler@example3.org,Ben Tyler,NV,M,Flashlight
arthur.duff@example4.com,Arthur Duff,OR,M,Dining table
danny.baron@example1.com,Daniel Baron,CA,M,Bike
jean.griffith@example5.org,Jean Griffith,WA,F,Power drill
melanie.white@example2.edu,Melanie White,NC,F,iPad
ben.morisson@example6.org,Ben Morisson,FL,M,Amplifier
arthur.duff@example4.com,Arthur Duff,OR,M,Night table
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
{ "events" : [
{ "name" : "Arthur Duff", "purchase" : "Night table" }
] }
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --export "${t}" \
--prefix '{ "events" : [
' \
--template ' { "name" : {{jsonize(cells["name"].value)}}, "purchase" : {{jsonize(cells["purchase"].value)}} }' \
--rowSeparator ',
' \
--suffix '
] }
' \
--splitToFiles true \
--suffixById true \
--output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
ls "tmp/${t}"
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}_arthur.duff@example4.com.output"


@ -1,51 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
🔣,code,meaning
🍇,1F347,GRAPES
🍉,1F349,WATERMELON
🍒,1F352,CHERRIES
🍓,1F353,STRAWBERRY
🍍,1F34D,PINEAPPLE
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
{ "emojis" : [
{ "symbol" : "🍍", "meaning" : "PINEAPPLE" }
] }
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --export "${t}" \
--prefix '{ "emojis" : [
' \
--template ' { "symbol" : {{jsonize(with(row.columnNames[0],cn,cells[cn].value))}}, "meaning" : {{jsonize(cells["meaning"].value)}} }' \
--rowSeparator ',
' \
--suffix '
] }
' \
--splitToFiles true \
--output "tmp/${t}/trái cây.json"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/trái cây_5.json"


@ -1,57 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
email,name,state,gender,purchase
danny.baron@example1.com,Danny Baron,CA,M,TV
melanie.white@example2.edu,Melanie White,NC,F,iPhone
danny.baron@example1.com,D. Baron,CA,M,Winter jacket
ben.tyler@example3.org,Ben Tyler,NV,M,Flashlight
arthur.duff@example4.com,Arthur Duff,OR,M,Dining table
danny.baron@example1.com,Daniel Baron,CA,M,Bike
jean.griffith@example5.org,Jean Griffith,WA,F,Power drill
melanie.white@example2.edu,Melanie White,NC,F,iPad
ben.morisson@example6.org,Ben Morisson,FL,M,Amplifier
arthur.duff@example4.com,Arthur Duff,OR,M,Night table
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
{ "events" : [
{ "name" : "Arthur Duff", "purchase" : "Night table" }
] }
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --export "${t}" \
--prefix '{ "events" : [
' \
--template ' { "name" : {{jsonize(cells["name"].value)}}, "purchase" : {{jsonize(cells["purchase"].value)}} }' \
--rowSeparator ',
' \
--suffix '
] }
' \
--splitToFiles true \
--output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
ls "tmp/${t}"
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}_10.output"


@ -1,54 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
🔣,code,meaning
🍇,1F347,GRAPES
🍉,1F349,WATERMELON
🍒,1F352,CHERRIES
🍓,1F353,STRAWBERRY
🍍,1F34D,PINEAPPLE
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
{ "emojis" : [
{ "symbol" : "🍇", "meaning" : "GRAPES" },
{ "symbol" : "🍉", "meaning" : "WATERMELON" },
{ "symbol" : "🍒", "meaning" : "CHERRIES" },
{ "symbol" : "🍓", "meaning" : "STRAWBERRY" },
{ "symbol" : "🍍", "meaning" : "PINEAPPLE" }
] }
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --export "${t}" \
--prefix '{ "emojis" : [
' \
--template ' { "symbol" : {{jsonize(with(row.columnNames[0],cn,cells[cn].value))}}, "meaning" : {{jsonize(cells["meaning"].value)}} }' \
--rowSeparator ',
' \
--suffix '
] }
' \
--output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"


@ -1,64 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# =================================== DATA =================================== #
cat << "DATA" > "tmp/${t}/${t}.csv"
email,name,state,gender,purchase
danny.baron@example1.com,Danny Baron,CA,M,TV
melanie.white@example2.edu,Melanie White,NC,F,iPhone
danny.baron@example1.com,D. Baron,CA,M,Winter jacket
ben.tyler@example3.org,Ben Tyler,NV,M,Flashlight
arthur.duff@example4.com,Arthur Duff,OR,M,Dining table
danny.baron@example1.com,Daniel Baron,CA,M,Bike
jean.griffith@example5.org,Jean Griffith,WA,F,Power drill
melanie.white@example2.edu,Melanie White,NC,F,iPad
ben.morisson@example6.org,Ben Morisson,FL,M,Amplifier
arthur.duff@example4.com,Arthur Duff,OR,M,Night table
DATA
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
{ "events" : [
{ "name" : "Danny Baron", "purchase" : "TV" },
{ "name" : "Melanie White", "purchase" : "iPhone" },
{ "name" : "D. Baron", "purchase" : "Winter jacket" },
{ "name" : "Ben Tyler", "purchase" : "Flashlight" },
{ "name" : "Arthur Duff", "purchase" : "Dining table" },
{ "name" : "Daniel Baron", "purchase" : "Bike" },
{ "name" : "Jean Griffith", "purchase" : "Power drill" },
{ "name" : "Melanie White", "purchase" : "iPad" },
{ "name" : "Ben Morisson", "purchase" : "Amplifier" },
{ "name" : "Arthur Duff", "purchase" : "Night table" }
] }
DATA
# ================================== ACTION ================================== #
${cmd} --create "tmp/${t}/${t}.csv"
${cmd} --export "${t}" \
--prefix '{ "events" : [
' \
--template ' { "name" : {{jsonize(cells["name"].value)}}, "purchase" : {{jsonize(cells["purchase"].value)}} }' \
--rowSeparator ',
' \
--suffix '
] }
' \
--output "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"
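
The template export tests above pass --prefix, --template, --rowSeparator and --suffix and compare the result with a fixed assertion file. Judging from those assertion files, the output is the prefix, then the rendered rows joined by the row separator, then the suffix. The sketch below reproduces that assembly under this assumption. `render_template_export` is a hypothetical name, `json.dumps` stands in for GREL's `jsonize()`, and Python `str.format` placeholders replace the `{{...}}` template syntax; none of this is the client's actual implementation.

```python
import json


def render_template_export(rows, prefix, row_template, row_separator, suffix):
    """Hypothetical stand-in: prefix + rows joined by separator + suffix."""
    rendered = []
    for row in rows:
        # json.dumps plays the role of jsonize(): it quotes and escapes values.
        quoted = {key: json.dumps(value, ensure_ascii=False) for key, value in row.items()}
        rendered.append(row_template.format(**quoted))
    return prefix + row_separator.join(rendered) + suffix


rows = [
    {'name': 'Melanie White', 'purchase': 'iPhone'},
    {'name': 'Jean Griffith', 'purchase': 'Power drill'},
]
output = render_template_export(
    rows,
    prefix='{ "events" : [\n',
    # Doubled braces are literal braces in str.format.
    row_template='{{ "name" : {name}, "purchase" : {purchase} }}',
    row_separator=',\n',
    suffix='\n] }\n',
)
print(output, end='')
```

Because the separator is only placed between rows, the last row carries no trailing comma, which matches the shape of the assertion files.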

148
tests/test_facet.py Normal file

@ -0,0 +1,148 @@
#!/usr/bin/env python
"""
test_facet.py
"""

# Copyright (c) 2011 Paul Makepeace, Real Programmers. All rights reserved.

import json
import unittest

from google.refine.facet import *


class CamelTest(unittest.TestCase):
    def test_to_camel(self):
        pairs = (
            ('this', 'this'),
            ('this_attr', 'thisAttr'),
            ('From', 'from'),
        )
        for attr, camel_attr in pairs:
            self.assertEqual(to_camel(attr), camel_attr)

    def test_from_camel(self):
        pairs = (
            ('this', 'this'),
            ('This', 'this'),
            ('thisAttr', 'this_attr'),
            ('ThisAttr', 'this_attr'),
            ('From', 'from'),
        )
        for camel_attr, attr in pairs:
            self.assertEqual(from_camel(camel_attr), attr)


class FacetTest(unittest.TestCase):
    def test_init(self):
        facet = TextFacet('column name')
        engine = Engine(facet)
        self.assertEqual(facet.selection, [])
        self.assertTrue(str(engine))
        facet = NumericFacet('column name', From=1, to=5)
        self.assertEqual(facet.to, 5)
        self.assertEqual(facet.From, 1)
        facet = StarredFacet()
        self.assertEqual(facet.expression, 'row.starred')
        facet = StarredFacet(True)
        self.assertEqual(facet.selection[0]['v']['v'], True)
        facet = FlaggedFacet(False)
        self.assertEqual(facet.selection[0]['v']['v'], False)
        self.assertRaises(ValueError, FlaggedFacet, 'false')    # no strings
        facet = TextFilterFacet('column name', 'query')
        self.assertEqual(facet.query, 'query')

    def test_selections(self):
        facet = TextFacet('column name')
        facet.include('element')
        self.assertEqual(len(facet.selection), 1)
        facet.include('element 2')
        self.assertEqual(len(facet.selection), 2)
        facet.exclude('element')
        self.assertEqual(len(facet.selection), 1)
        facet.reset()
        self.assertEqual(len(facet.selection), 0)
        facet.include('element').include('element 2')
        self.assertEqual(len(facet.selection), 2)


class EngineTest(unittest.TestCase):
    def test_init(self):
        engine = Engine()
        self.assertEqual(engine.mode, 'row-based')
        engine.mode = 'record-based'
        self.assertEqual(engine.mode, 'record-based')
        engine.set_facets(BlankFacet)
        self.assertEqual(engine.mode, 'record-based')
        engine.set_facets(BlankFacet, BlankFacet)
        self.assertEqual(len(engine), 2)

    def test_serialize(self):
        engine = Engine()
        engine_json = engine.as_json()
        self.assertEqual(engine_json, '{"facets": [], "mode": "row-based"}')
        facet = TextFacet(column='column')
        self.assertEqual(facet.as_dict(), {'selectError': False, 'name': 'column', 'selection': [], 'expression': 'value', 'invert': False, 'columnName': 'column', 'selectBlank': False, 'omitBlank': False, 'type': 'list', 'omitError': False})
        facet = NumericFacet(column='column', From=1, to=5)
        self.assertEqual(facet.as_dict(), {'from': 1, 'to': 5, 'selectBlank': True, 'name': 'column', 'selectError': True, 'expression': 'value', 'selectNumeric': True, 'columnName': 'column', 'selectNonNumeric': True, 'type': 'range'})

    def test_add_facet(self):
        facet = TextFacet(column='Party Code')
        engine = Engine(facet)
        engine.add_facet(TextFacet(column='Ethnicity'))
        self.assertEqual(len(engine.facets), 2)
        self.assertEqual(len(engine), 2)

    def test_reset_remove(self):
        text_facet1 = TextFacet('column name')
        text_facet1.include('element')
        text_facet2 = TextFacet('column name 2')
        text_facet2.include('element 2')
        engine = Engine(text_facet1, text_facet2)
        self.assertEqual(len(engine), 2)
        self.assertEqual(len(text_facet1.selection), 1)
        engine.reset_all()
        self.assertEqual(len(text_facet1.selection), 0)
        self.assertEqual(len(text_facet2.selection), 0)
        engine.remove_all()
        self.assertEqual(len(engine), 0)


class SortingTest(unittest.TestCase):
    def test_sorting(self):
        sorting = Sorting()
        self.assertEqual(sorting.as_json(), '{"criteria": []}')
        sorting = Sorting('email')
        c = sorting.criteria[0]
        self.assertEqual(c['column'], 'email')
        self.assertEqual(c['valueType'], 'string')
        self.assertEqual(c['reverse'], False)
        self.assertEqual(c['caseSensitive'], False)
        self.assertEqual(c['errorPosition'], 1)
        self.assertEqual(c['blankPosition'], 2)
        sorting = Sorting(['email', 'gender'])
        self.assertEqual(len(sorting), 2)
        sorting = Sorting(['email', {'column': 'date', 'valueType': 'date'}])
        self.assertEqual(len(sorting), 2)
        c = sorting.criteria[1]
        self.assertEqual(c['column'], 'date')
        self.assertEqual(c['valueType'], 'date')


class FacetsResponseTest(unittest.TestCase):
    response = """{"facets":[{"name":"Party Code","expression":"value","columnName":"Party Code","invert":false,"choices":[{"v":{"v":"D","l":"D"},"c":3700,"s":false},{"v":{"v":"R","l":"R"},"c":1613,"s":false},{"v":{"v":"N","l":"N"},"c":15,"s":false},{"v":{"v":"O","l":"O"},"c":184,"s":false}],"blankChoice":{"s":false,"c":1446}}],"mode":"row-based"}"""

    def test_facet_response(self):
        party_code_facet = TextFacet('Party Code')
        engine = Engine(party_code_facet)
        facets = engine.facets_response(json.loads(self.response)).facets
        self.assertEqual(facets[0].choices['D'].count, 3700)
        self.assertEqual(facets[0].blank_choice.count, 1446)
        self.assertEqual(facets[party_code_facet], facets[0])
        # test iteration
        facet = [f for f in facets][0]
        self.assertEqual(facet, facets[0])


if __name__ == '__main__':
    unittest.main()
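
CamelTest in test_facet.py pins down the name conversions only through example pairs. One way to satisfy exactly those pairs is sketched here. The functions `sketch_to_camel` and `sketch_from_camel` are hypothetical illustrations written for this note; they are not the `to_camel` and `from_camel` implementations in google.refine.facet, which are not shown in this diff.

```python
import re


def sketch_to_camel(attr):
    """Hypothetical: 'this_attr' -> 'thisAttr', with a lower-case first letter."""
    head, *tail = attr.split('_')
    camel = head + ''.join(part[:1].upper() + part[1:] for part in tail)
    return camel[:1].lower() + camel[1:]


def sketch_from_camel(attr):
    """Hypothetical: 'thisAttr' or 'ThisAttr' -> 'this_attr'."""
    # Put '_' in front of every upper-case letter except one at the start.
    return re.sub(r'(?<!^)(?=[A-Z])', '_', attr).lower()


# The same pairs that CamelTest checks.
for attr, camel_attr in (('this', 'this'), ('this_attr', 'thisAttr'), ('From', 'from')):
    assert sketch_to_camel(attr) == camel_attr
for camel_attr, attr in (('This', 'this'), ('thisAttr', 'this_attr'), ('ThisAttr', 'this_attr')):
    assert sketch_from_camel(camel_attr) == attr
```

The 'From' pair matters because `from` is a Python keyword, so facet attributes such as NumericFacet's `From` are capitalized in Python while the serialized key is `from`.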

31
tests/test_history.py Normal file

@ -0,0 +1,31 @@
#!/usr/bin/env python
"""
test_history.py
"""
# Copyright (c) 2011 Paul Makepeace, Real Programmers. All rights reserved.
import unittest
from google.refine.history import *
class HistoryTest(unittest.TestCase):
def test_init(self):
response = {
u"code": "ok",
u"historyEntry": {
u"id": 1303851435223,
u"description": "Split 4 cells",
u"time": "2011-04-26T16:45:08Z"
}
}
he = response['historyEntry']
entry = HistoryEntry(he['id'], he['time'], he['description'])
self.assertEqual(entry.id, 1303851435223)
self.assertEqual(entry.description, 'Split 4 cells')
self.assertEqual(entry.time, '2011-04-26T16:45:08Z')
if __name__ == '__main__':
unittest.main()
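
`HistoryTest` keeps the `time` field as a plain string. If a caller needs a real timestamp, it could be parsed as in the sketch below. This is an illustration only: nothing here is part of `google.refine.history`, and treating the trailing `Z` as UTC is an assumption based on the usual ISO 8601 convention.

```python
from datetime import datetime, timezone

# Same sample payload as in HistoryTest.test_init.
response = {
    "code": "ok",
    "historyEntry": {
        "id": 1303851435223,
        "description": "Split 4 cells",
        "time": "2011-04-26T16:45:08Z",
    },
}

he = response["historyEntry"]
# Parse the timestamp and mark it as UTC (assumed from the 'Z' suffix).
when = datetime.strptime(he["time"], "%Y-%m-%dT%H:%M:%SZ").replace(
    tzinfo=timezone.utc)
```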

tests/test_refine.py Normal file

@ -0,0 +1,80 @@
#!/usr/bin/env python
"""
test_refine.py
These tests require a connection to a Refine server either at
http://127.0.0.1:3333/ or by specifying environment variables
OPENREFINE_HOST and OPENREFINE_PORT.
"""
# Copyright (c) 2011 Paul Makepeace, Real Programmers. All rights reserved.
import csv
import unittest
from google.refine import refine
import refinetest
from io import StringIO
class RefineServerTest(refinetest.RefineTestCase):
def test_init(self):
server_url = 'http://' + refine.REFINE_HOST
if refine.REFINE_PORT != '80':
server_url += ':' + refine.REFINE_PORT
self.assertEqual(self.server.server, server_url)
self.assertEqual(refine.RefineServer.url(), server_url)
# strip trailing /
server = refine.RefineServer('http://refine.example/')
self.assertEqual(server.server, 'http://refine.example')
def test_list_projects(self):
projects = self.refine.list_projects()
self.assertTrue(isinstance(projects, dict))
def test_get_version(self):
version_info = self.server.get_version()
for item in ('revision', 'version', 'full_version', 'full_name'):
self.assertTrue(item in version_info)
def test_version(self):
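# ('3.2') is just the string '3.2', so `in` would do a substring match;
# the trailing comma makes it a one-element tuple.
self.assertTrue(self.server.version in ('3.2',))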
class RefineTest(refinetest.RefineTestCase):
project_file = 'duplicates.csv'
def test_new_project(self):
self.assertTrue(isinstance(self.project, refine.RefineProject))
def test_wait_until_idle(self):
self.project.wait_until_idle() # should just return
def test_get_models(self):
self.assertEqual(self.project.key_column, 'email')
self.assertTrue('email' in self.project.columns)
self.assertTrue('email' in self.project.column_order)
self.assertEqual(self.project.column_order['name'], 1)
def test_delete_project(self):
self.assertTrue(self.project.delete())
def test_open_export(self):
response = refine.RefineProject(self.project.project_url()).export()
lines = response.text.splitlines()
self.assertTrue('email' in lines[0])
for line in lines[1:]:
self.assertTrue('M' in line or 'F' in line)
def test_open_export_csv(self):
response = refine.RefineProject(self.project.project_url()).export()
csv_fp = csv.reader(StringIO(response.text), dialect='excel-tab')
row = next(csv_fp)
self.assertTrue(row[0] == 'email')
for row in csv_fp:
self.assertTrue(row[3] == 'F' or row[3] == 'M')
if __name__ == '__main__':
unittest.main()
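
`test_open_export_csv` reads the export as tab-separated text. The same parsing can be tried offline with a stand-in string, as sketched below. `EXPORT_TEXT` is a hypothetical two-line export assembled from the column names and sample row in the small tests. A real export comes from the Refine server and may differ.

```python
import csv
from io import StringIO

# Hypothetical export text; a real one comes from RefineProject.export().
EXPORT_TEXT = (
    "email\tname\tstate\tgender\tpurchase\n"
    "danny.baron@example1.com\tDanny Baron\tCA\tM\tTV\n"
)

reader = csv.reader(StringIO(EXPORT_TEXT), dialect="excel-tab")
header = next(reader)   # first row holds the column names
rows = list(reader)     # remaining rows are data
genders = [row[header.index("gender")] for row in rows]
```

Looking the column up by name through `header.index` avoids hard-coding the position `3` that the test uses.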


@ -0,0 +1,81 @@
#!/usr/bin/env python3
"""
test_refine_small.py
"""
# Copyright (c) 2011 Paul Makepeace, Real Programmers. All rights reserved.
import unittest
from google.refine import refine
class RefineRowsTest(unittest.TestCase):
def test_rows_response(self):
rr = refine.RowsResponseFactory({
u'gender': 3, u'state': 2, u'purchase': 4, u'email': 0,
u'name': 1})
response = rr({
u'rows': [{
u'i': 0,
u'cells': [
{u'v': u'danny.baron@example1.com'},
{u'v': u'Danny Baron'},
{u'v': u'CA'},
{u'v': u'M'},
{u'v': u'TV'}
],
u'starred': False,
u'flagged': False
}],
u'start': 0,
u'limit': 1,
u'mode': u'row-based',
u'filtered': 10,
u'total': 10,
})
self.assertEqual(len(response.rows), 1)
# test iteration
rows = [row for row in response.rows]
self.assertEqual(rows[0]['name'], 'Danny Baron')
# test indexing
self.assertEqual(response.rows[0]['name'], 'Danny Baron')
class RefineProjectTest(unittest.TestCase):
def setUp(self):
# Mock out get_models so it doesn't attempt to connect to a server
self._get_models = refine.RefineProject.get_models
refine.RefineProject.get_models = lambda me: me
# Save REFINE_{HOST,PORT} as the tests overwrite them
self._refine_host_port = refine.REFINE_HOST, refine.REFINE_PORT
refine.REFINE_HOST, refine.REFINE_PORT = '127.0.0.1', '3333'
def test_server_init(self):
RP = refine.RefineProject
p = RP('http://127.0.0.1:3333/project?project=1658955153749')
self.assertEqual(p.server.server, 'http://127.0.0.1:3333')
self.assertEqual(p.project_id, '1658955153749')
p = RP('http://127.0.0.1:3333', '1658955153749')
self.assertEqual(p.server.server, 'http://127.0.0.1:3333')
self.assertEqual(p.project_id, '1658955153749')
p = RP('http://server/varnish/project?project=1658955153749')
self.assertEqual(p.server.server, 'http://server/varnish')
self.assertEqual(p.project_id, '1658955153749')
p = RP('1658955153749')
self.assertEqual(p.server.server, 'http://127.0.0.1:3333')
self.assertEqual(p.project_id, '1658955153749')
refine.REFINE_HOST = '10.0.0.1'
refine.REFINE_PORT = '80'
p = RP('1658955153749')
self.assertEqual(p.server.server, 'http://10.0.0.1')
def tearDown(self):
# Restore mocked get_models
refine.RefineProject.get_models = self._get_models
# Restore values for REFINE_{HOST,PORT}
refine.REFINE_HOST, refine.REFINE_PORT = self._refine_host_port
if __name__ == '__main__':
unittest.main()
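
`test_server_init` expects a project URL to be split into a server base and a project id, including the case where the server sits under a path prefix such as `/varnish`. One way to get that behaviour is sketched below. `split_project_url` is a hypothetical helper written for illustration, not the code `RefineProject` actually uses.

```python
from urllib.parse import urlsplit, parse_qs


def split_project_url(url):
    """Return (server, project_id) for a '.../project?project=ID' URL.

    Hypothetical helper, for illustration only.
    """
    parts = urlsplit(url)
    project_id = parse_qs(parts.query)["project"][0]
    # Drop the trailing '/project' segment, keeping any path prefix.
    base_path = parts.path.rsplit("/project", 1)[0]
    server = f"{parts.scheme}://{parts.netloc}{base_path}"
    return server, project_id


plain = split_project_url(
    "http://127.0.0.1:3333/project?project=1658955153749")
prefixed = split_project_url(
    "http://server/varnish/project?project=1658955153749")
```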

tests/test_tutorial.py Normal file

@ -0,0 +1,490 @@
#!/usr/bin/env python
"""
test_tutorial.py
The tests here are based on David Huynh's Refine tutorial at
http://davidhuynh.net/spaces/nicar2011/tutorial.pdf. They perform all the
Refine actions given in the tutorial (except the web scraping) and verify the
changes the tutorial says to expect.
These tests require a connection to a Refine server either at
http://127.0.0.1:3333/ or by specifying environment variables
OPENREFINE_HOST and OPENREFINE_PORT.
"""
# Copyright (c) 2011 Paul Makepeace, Real Programmers. All rights reserved.
import unittest
from google.refine import facet
import refinetest
class TutorialTestFacets(refinetest.RefineTestCase):
project_file = 'louisiana-elected-officials.csv'
project_options = {'guess_cell_value_types': True}
def test_get_rows(self):
# Section "2. Exploration using Facets": {3}
response = self.project.get_rows(limit=10)
self.assertEqual(len(response.rows), 10)
self.assertEqual(response.limit, 10)
self.assertEqual(response.total, 6958)
self.assertEqual(response.filtered, 6958)
for row in response.rows:
self.assertFalse(row.flagged)
self.assertFalse(row.starred)
def test_facet(self):
# Section "2. Exploration using Facets": {4}
party_code_facet = facet.TextFacet(column='Party Code')
response = self.project.compute_facets(party_code_facet)
pc = response.facets[0]
# test that lookup by index matches lookup by facet object
self.assertEqual(pc, response.facets[party_code_facet])
self.assertEqual(pc.name, 'Party Code')
self.assertEqual(pc.choices['D'].count, 3700)
self.assertEqual(pc.choices['N'].count, 15)
self.assertEqual(pc.blank_choice.count, 1446)
# {5}, {6}
engine = facet.Engine(party_code_facet)
ethnicity_facet = facet.TextFacet(column='Ethnicity')
engine.add_facet(ethnicity_facet)
self.project.engine = engine
response = self.project.compute_facets()
e = response.facets[ethnicity_facet]
self.assertEqual(e.choices['B'].count, 1255)
self.assertEqual(e.choices['W'].count, 4469)
# {7}
ethnicity_facet.include('B')
response = self.project.get_rows()
self.assertEqual(response.filtered, 1255)
indexes = [row.index for row in response.rows]
self.assertEqual(indexes, [1, 2, 3, 4, 6, 12, 18, 26, 28, 32])
# {8}
response = self.project.compute_facets()
pc = response.facets[party_code_facet]
self.assertEqual(pc.name, 'Party Code')
self.assertEqual(pc.choices['D'].count, 1179)
self.assertEqual(pc.choices['R'].count, 11)
self.assertEqual(pc.blank_choice.count, 46)
# {9}
party_code_facet.include('R')
response = self.project.compute_facets()
e = response.facets[ethnicity_facet]
self.assertEqual(e.choices['B'].count, 11)
# {10}
party_code_facet.reset()
ethnicity_facet.reset()
response = self.project.get_rows()
self.assertEqual(response.filtered, 6958)
# {11}
office_title_facet = facet.TextFacet('Office Title')
self.project.engine.add_facet(office_title_facet)
response = self.project.compute_facets()
self.assertEqual(len(response.facets[2].choices), 76)
# {12} - XXX not sure how to interpret bins & baseBins yet
office_level_facet = facet.NumericFacet('Office Level')
self.project.engine.add_facet(office_level_facet)
# {13}
office_level_facet.From = 300 # capitalised because 'from' is a reserved word
office_level_facet.to = 320
response = self.project.get_rows()
self.assertEqual(response.filtered, 1907)
response = self.project.compute_facets()
ot = response.facets[office_title_facet]
self.assertEqual(len(ot.choices), 21)
self.assertEqual(ot.choices['Chief of Police'].count, 2)
self.assertEqual(ot.choices['Chief of Police '].count, 211)
# {14}
self.project.engine.remove_all()
response = self.project.get_rows()
self.assertEqual(response.filtered, 6958)
# {15}
phone_facet = facet.TextFacet('Phone', expression='value[0, 3]')
self.project.engine.add_facet(phone_facet)
response = self.project.compute_facets()
p = response.facets[phone_facet]
self.assertEqual(p.expression, 'value[0, 3]')
self.assertEqual(p.choices['318'].count, 2331)
# {16}
commissioned_date_facet = facet.NumericFacet(
'Commissioned Date',
expression='value.toDate().datePart("year")')
self.project.engine.add_facet(commissioned_date_facet)
response = self.project.compute_facets()
cd = response.facets[commissioned_date_facet]
self.assertEqual(cd.error_count, 959)
self.assertEqual(cd.numeric_count, 5999)
# {17}
office_description_facet = facet.NumericFacet(
'Office Description',
expression=r'value.match(/\D*(\d+)\w\w Rep.*/)[0].toNumber()')
self.project.engine.add_facet(office_description_facet)
response = self.project.compute_facets()
od = response.facets[office_description_facet]
self.assertEqual(od.min, 0)
self.assertEqual(od.max, 110)
self.assertEqual(od.numeric_count, 548)
class TutorialTestEditing(refinetest.RefineTestCase):
project_file = 'louisiana-elected-officials.csv'
project_options = {'guess_cell_value_types': True}
def test_editing(self):
# Section "3. Cell Editing": {1}
self.project.engine.remove_all() # redundant due to setUp
# {2}
self.project.text_transform(column='Zip Code 2',
expression='value.toString()[0, 5]')
self.assertInResponse('transform on 6958 cells in column Zip Code 2')
# {3} - XXX history
# {4}
office_title_facet = facet.TextFacet('Office Title')
self.project.engine.add_facet(office_title_facet)
response = self.project.compute_facets()
self.assertEqual(len(response.facets[office_title_facet].choices), 76)
self.project.text_transform('Office Title', 'value.trim()')
self.assertInResponse('6895')
response = self.project.compute_facets()
self.assertEqual(len(response.facets[office_title_facet].choices), 67)
# {5}
self.project.edit('Office Title', 'Councilmen', 'Councilman')
self.assertInResponse('13')
response = self.project.compute_facets()
self.assertEqual(len(response.facets[office_title_facet].choices), 66)
# {6}
response = self.project.compute_clusters('Office Title')
self.assertTrue(response)
# {7}
clusters = self.project.compute_clusters('Office Title', 'knn')
self.assertEqual(len(clusters), 7)
first_cluster = clusters[0]
self.assertEqual(len(first_cluster), 2)
self.assertEqual(first_cluster[0]['value'], 'DPEC Member at Large')
self.assertEqual(first_cluster[0]['count'], 6)
# Not strictly necessary to repeat 'Council Member' but a test
# of mass_edit, and it's also what the front end sends.
self.project.mass_edit('Office Title', [{
'from': ['Council Member', 'Councilmember'],
'to': 'Council Member'
}])
self.assertInResponse('372')
response = self.project.compute_facets()
self.assertEqual(len(response.facets[office_title_facet].choices), 65)
# Section "4. Row and Column Editing, Batched Row Deletion"
# Test doesn't strictly follow the tutorial as the "Browse this
# cluster" performs a text facet which the server can't complete
# as it busts its max facet count. The useful work is done with
# get_rows(). Also, we can facet & select in one; the UI can't.
# {1}, {2}, {3}, {4}
clusters = self.project.compute_clusters('Candidate Name')
for cluster in clusters[0:3]: # just do a few
for match in cluster:
# {2}
if match['value'].endswith(', '):
response = self.project.get_rows(
facet.TextFacet('Candidate Name', match['value']))
self.assertEqual(len(response.rows), 1)
for row in response.rows:
self.project.star_row(row)
self.assertInResponse(str(row.index + 1))
# {5}, {6}, {7}
response = self.project.compute_facets(facet.StarredFacet(True))
self.assertEqual(len(response.facets[0].choices), 2) # true & false
self.assertEqual(response.facets[0].choices[True].count, 2)
self.project.remove_rows()
self.assertInResponse('2 rows')
class TutorialTestDuplicateDetection(refinetest.RefineTestCase):
project_file = 'duplicates.csv'
def test_duplicate_detection(self):
# Section "4. Row and Column Editing,
# Duplicate Row Detection and Deletion"
# {7}, {8}
response = self.project.get_rows(sort_by='email')
indexes = [row.index for row in response.rows]
self.assertEqual(indexes, [4, 9, 8, 3, 0, 2, 5, 6, 1, 7])
# {9}
self.project.reorder_rows()
self.assertInResponse('Reorder rows')
response = self.project.get_rows()
indexes = [row.index for row in response.rows]
self.assertEqual(indexes, list(range(10)))
# {10}
self.project.add_column(
'email', 'count', 'facetCount(value, "value", "email")')
self.assertInResponse('column email by filling 10 rows')
response = self.project.get_rows()
self.assertEqual(self.project.column_order['email'], 0) # i.e. 1st
self.assertEqual(self.project.column_order['count'], 1) # i.e. 2nd
counts = [row['count'] for row in response.rows]
self.assertEqual(counts, [2, 2, 1, 1, 3, 3, 3, 1, 2, 2])
# {11}
self.assertFalse(self.project.has_records)
self.project.blank_down('email')
self.assertInResponse('Blank down 4 cells')
self.assertTrue(self.project.has_records)
response = self.project.get_rows()
emails = [1 if row['email'] else 0 for row in response.rows]
self.assertEqual(emails, [1, 0, 1, 1, 1, 0, 0, 1, 1, 0])
# {12}
blank_facet = facet.BlankFacet('email', selection=True)
# {13}
self.project.remove_rows(blank_facet)
self.assertInResponse('Remove 4 rows')
self.project.engine.remove_all()
response = self.project.get_rows()
email_counts = [(row['email'], row['count']) for row in response.rows]
self.assertEqual(email_counts, [
(u'arthur.duff@example4.com', 2),
(u'ben.morisson@example6.org', 1),
(u'ben.tyler@example3.org', 1),
(u'danny.baron@example1.com', 3),
(u'jean.griffith@example5.org', 1),
(u'melanie.white@example2.edu', 2)
])
class TutorialTestTransposeColumnsIntoRows(refinetest.RefineTestCase):
project_file = 'us_economic_assistance.csv'
def test_transpose_columns_into_rows(self):
# Section "5. Structural Editing, Transpose Columns into Rows"
# {1}, {2}, {3}
self.project.transpose_columns_into_rows('FY1946', 64, 'pair')
self.assertInResponse('64 column(s) starting with FY1946')
# {4}
self.project.add_column('pair', 'year', 'value[2,6].toNumber()')
self.assertInResponse('filling 26185 rows')
# {5}
self.project.text_transform(
column='pair', expression='value.substring(7).toNumber()')
self.assertInResponse('transform on 26185 cells')
# {6}
self.project.rename_column('pair', 'amount')
self.assertInResponse('Rename column pair to amount')
# {7}
self.project.fill_down('country_name')
self.assertInResponse('Fill down 23805 cells')
self.project.fill_down('program_name')
self.assertInResponse('Fill down 23805 cells')
# spot check of last row for transforms and fill down
response = self.project.get_rows()
row10 = response.rows[9]
self.assertEqual(row10['country_name'], 'Afghanistan')
self.assertEqual(row10['program_name'],
'Department of Defense Security Assistance')
self.assertEqual(row10['amount'], 113777303)
class TutorialTestTransposeFixedNumberOfRowsIntoColumns(
refinetest.RefineTestCase):
project_file = 'fixed-rows.csv'
project_format = 'text/line-based'
project_options = {'header_lines': 0}
def test_transpose_fixed_number_of_rows_into_columns(self):
if self.server.version not in ('2.0', '2.1'):
self.project.rename_column('Column 1', 'Column')
# Section "5. Structural Editing,
# Transpose Fixed Number of Rows into Columns"
# {1}
self.assertTrue('Column' in self.project.column_order)
# {8}
self.project.transpose_rows_into_columns('Column', 4)
self.assertInResponse('Transpose every 4 cells in column Column')
# {9} - renaming column triggers a bug in Refine <= 2.1
if self.server.version not in ('2.0', '2.1'):
self.project.rename_column('Column 2', 'Address')
self.project.rename_column('Column 3', 'Address 2')
self.project.rename_column('Column 4', 'Status')
# {10}
self.project.add_column(
'Column 1', 'Transaction',
'if(value.contains(" sent "), "send", "receive")')
self.assertInResponse('Column 1 by filling 4 rows')
# {11}
transaction_facet = facet.TextFacet(column='Transaction',
selection='send')
self.project.engine.add_facet(transaction_facet)
self.project.compute_facets()
# {12}, {13}, {14}
self.project.add_column(
'Column 1', 'Sender',
'value.partition(" sent ")[0]')
# XXX resetting the facet shows data in rows with Transaction=receive
# which shouldn't have been possible with the facet.
self.project.add_column(
'Column 1', 'Recipient',
'value.partition(" to ")[2].partition(" on ")[0]')
self.project.add_column(
'Column 1', 'Amount',
'value.partition(" sent ")[2].partition(" to ")[0]')
# {15}
transaction_facet.reset().include('receive')
self.project.get_rows()
# XXX there seems to be some kind of bug where the model doesn't
# match get_rows() output - cellIndex being returned that are
# out of range.
#self.assertTrue(a_row['Sender'] is None)
#self.assertTrue(a_row['Recipient'] is None)
#self.assertTrue(a_row['Amount'] is None)
# {16}
for column, expression in (
('Sender',
'cells["Column 1"].value.partition(" from ")[2].partition(" on ")[0]'),
('Recipient',
'cells["Column 1"].value.partition(" received ")[0]'),
('Amount',
'cells["Column 1"].value.partition(" received ")[2].partition(" from ")[0]')
):
self.project.text_transform(column, expression)
self.assertInResponse('2 cells')
# {17}
transaction_facet.reset()
# {18}
self.project.text_transform('Column 1', 'value.partition(" on ")[2]')
self.assertInResponse('4 cells')
# {19}
self.project.reorder_columns(['Transaction', 'Amount', 'Sender',
'Recipient'])
self.assertInResponse('Reorder columns')
class TutorialTestTransposeVariableNumberOfRowsIntoColumns(
refinetest.RefineTestCase):
project_file = 'variable-rows.csv'
project_format = 'text/line-based'
project_options = {'header_lines': 0}
def test_transpose_variable_number_of_rows_into_columns(self):
# {20}, {21}
if self.server.version not in ('2.0', '2.1'):
self.project.rename_column('Column 1', 'Column')
self.project.add_column(
'Column', 'First Line', 'if(value.contains(" on "), value, null)')
self.assertInResponse('Column by filling 4 rows')
response = self.project.get_rows()
first_names = [row['First Line'][0:10] if row['First Line'] else None
for row in response.rows]
self.assertEqual(first_names, [
'Tom Dalton', None, None, None,
'Morgan Law', None, None, None, None, 'Eric Batem'])
# {22}
self.project.move_column('First Line', 0)
self.assertInResponse('Move column First Line to position 0')
self.assertEqual(self.project.column_order['First Line'], 0)
# {23}
self.project.engine.mode = 'record-based'
response = self.project.get_rows()
self.assertEqual(response.mode, 'record-based')
self.assertEqual(response.filtered, 4)
# {24}
self.project.add_column(
'Column', 'Status', 'row.record.cells["Column"].value[-1]')
self.assertInResponse('filling 18 rows')
# {25}
self.project.text_transform(
'Column', 'row.record.cells["Column"].value[1, -1].join("|")')
self.assertInResponse('18 cells')
# {26}
self.project.engine.mode = 'row-based'
# {27}
blank_facet = facet.BlankFacet('First Line', selection=True)
self.project.remove_rows(blank_facet)
self.assertInResponse('Remove 14 rows')
self.project.engine.remove_all()
# {28}
self.project.split_column('Column', separator='|')
self.assertInResponse('Split 4 cell(s) in column Column')
class TutorialTestWebScraping(refinetest.RefineTestCase):
project_file = 'eli-lilly.csv'
filter_expr_1 = """
forEach(
value[2,-2].replace("&#160;", " ").split("), ("),
v,
v[0,-1].partition(", '", true).join(":")
).join("|")
"""
filter_expr_2 = """
filter(
value.split("|"), p, p.partition(":")[0].toNumber() == %d
)[0].partition(":")[2]
"""
def test_web_scraping(self):
# Section "6. Web Scraping"
# {1}, {2}
self.project.split_column('key', separator=':')
self.assertInResponse('Split 5409 cell(s) in column key')
self.project.rename_column('key 1', 'page')
self.assertInResponse('Rename column key 1 to page')
self.project.rename_column('key 2', 'top')
self.assertInResponse('Rename column key 2 to top')
self.project.move_column('line', 'end')
self.assertInResponse('Move column line to position 2')
# {3}
self.project.sorting = facet.Sorting([
{'column': 'page', 'valueType': 'number'},
{'column': 'top', 'valueType': 'number'},
])
self.project.reorder_rows()
self.assertInResponse('Reorder rows')
first_row = self.project.get_rows(limit=1).rows[0]
self.assertEqual(first_row['page'], 1)
self.assertEqual(first_row['top'], 24)
# {4}
filter_facet = facet.TextFilterFacet('line', 'ahman')
rows = self.project.get_rows(filter_facet).rows
self.assertEqual(len(rows), 1)
self.assertEqual(rows[0]['top'], 106)
filter_facet.query = 'alvarez'
rows = self.project.get_rows().rows
self.assertEqual(len(rows), 2)
self.assertEqual(rows[-1]['top'], 567)
self.project.engine.remove_all()
# {5} - tutorial says 'line'; it means 'top'
line_facet = facet.NumericFacet('top')
line_facet.to = 100
self.project.remove_rows(line_facet)
self.assertInResponse('Remove 775 rows')
line_facet.From = 570
line_facet.to = 600
self.project.remove_rows(line_facet)
self.assertInResponse('Remove 71 rows')
line_facet.reset()
response = self.project.get_rows()
self.assertEqual(response.filtered, 4563)
# {6}
page_facet = facet.TextFacet('page', 1) # 1 not '1'
self.project.engine.add_facet(page_facet)
# {7}
rows = self.project.get_rows().rows
# Look for a row with a name in it by skipping HTML
name_row = [row for row in rows if '<b>' not in row['line']][0]
self.assertTrue('WELLNESS' in name_row['line'])
self.assertEqual(name_row['top'], 161)
line_facet.From = 20
line_facet.to = 160
self.project.remove_rows()
self.assertInResponse('Remove 9 rows')
self.project.engine.remove_all()
# {8}
self.project.text_transform('line', expression=self.filter_expr_1)
self.assertInResponse('Text transform on 4554 cells in column line')
# {9} - XXX following is generating Java exceptions
#filter_expr = self.filter_expr_2 % 16
#self.project.add_column('line', 'Name', expression=filter_expr)
# {10} to the final {19} - nothing new in terms of exercising the API.
if __name__ == '__main__':
unittest.main()
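
The duplicate-detection steps in `TutorialTestDuplicateDetection` (count each value, blank down, remove blank rows) can be followed in plain Python without a server. This is only a sketch of the logic, not of what Refine does internally. The email list is an assumption, chosen to match the sorted order and the counts asserted in the test.

```python
from collections import Counter

# Emails in sorted order (assumed data mirroring the test's assertions).
emails = [
    "arthur.duff@example4.com", "arthur.duff@example4.com",
    "ben.morisson@example6.org",
    "ben.tyler@example3.org",
    "danny.baron@example1.com", "danny.baron@example1.com",
    "danny.baron@example1.com",
    "jean.griffith@example5.org",
    "melanie.white@example2.edu", "melanie.white@example2.edu",
]

# Step 1: count occurrences of each value (the role of facetCount).
totals = Counter(emails)
counts = [totals[e] for e in emails]

# Step 2: blank down, i.e. blank any cell that repeats the one above it.
blanked = [e if i == 0 or e != emails[i - 1] else None
           for i, e in enumerate(emails)]

# Step 3: remove the rows whose email is now blank.
deduplicated = [(e, c) for e, c in zip(blanked, counts) if e is not None]
```

Blank down only catches duplicates that sit next to each other, which is why the test sorts by `email` and reorders the rows first.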


@ -1,27 +0,0 @@
#!/bin/bash
# =============================== ENVIRONMENT ================================ #
if [[ ${1} ]]; then
cmd="${1}"
else
echo 1>&2 "execute tests.sh to run all tests"; exit 1
fi
t="$(basename "${BASH_SOURCE[0]}" .sh)"
cd "${BASH_SOURCE%/*}/" || exit 1
mkdir -p "tmp/${t}"
# ================================= ASSERTION ================================ #
cat << "DATA" > "tmp/${t}/${t}.assert"
Usage:
DATA
# ================================== ACTION ================================== #
${cmd} | head -n 1 | cut -c 1-6 > "tmp/${t}/${t}.output"
# =================================== TEST =================================== #
diff -u "tmp/${t}/${t}.assert" "tmp/${t}/${t}.output"