move Dockerfile and provide README.md for docker repo
This commit is contained in:
parent
221d7da379
commit
92b2316077
|
@ -0,0 +1,58 @@
|
||||||
|
## batch processing with python-client
|
||||||
|
|
||||||
|
There are some client libraries for OpenRefine that communicate with the [OpenRefine API](https://github.com/OpenRefine/OpenRefine/wiki/OpenRefine-API). I have prepared a docker container on top of the [Python Library from PaulMakepeace](https://github.com/PaulMakepeace/refine-client-py/) and extended the CLI with some options to create new OpenRefine projects from files.
|
||||||
|
|
||||||
|
### basic usage
|
||||||
|
|
||||||
|
1) start server:
|
||||||
|
```docker run -d --name=openrefine felixlohmeier/openrefine```
|
||||||
|
|
||||||
|
2) start client (prints help screen):
|
||||||
|
```docker run --rm --link openrefine felixlohmeier/openrefine-client -H openrefine```
|
||||||
|
|
||||||
|
### example for customized run commands in interactive mode (e.g. for usage in terminals)
|
||||||
|
|
||||||
|
1) start server in terminal A:
|
||||||
|
```docker run --rm --name=openrefine -p 80:3333 -v /home/felix/refine:/data:z felixlohmeier/openrefine -i 0.0.0.0 -m 4G -d /data```
|
||||||
|
* automatically remove docker container when it exits
|
||||||
|
* set name "openrefine" for docker container
|
||||||
|
* publish internal port 3333 to host port 80
|
||||||
|
* mount host directory /home/felix/refine as working directory
|
||||||
|
* make openrefine available in the network
|
||||||
|
* increase java heap size to 4 GB
|
||||||
|
* set refine workspace to /data
|
||||||
|
* OpenRefine should be available at http://localhost
|
||||||
|
|
||||||
|
2) start client in terminal B (prints help screen):
|
||||||
|
```docker run --rm --link openrefine -v /home/felix/refine:/data:z felixlohmeier/openrefine-client -H openrefine```
|
||||||
|
* automatically remove docker container when it exits
|
||||||
|
* build up network connection with docker container "openrefine"
|
||||||
|
* mount host directory /home/felix/refine as working directory
|
||||||
|
* apply history in file /home/felix/refine/history.json to project with id 1234567890123
|
||||||
|
|
||||||
|
### example for customized run commands in detached mode (e.g. for usage in shell scripts)
|
||||||
|
|
||||||
|
1) define variables
|
||||||
|
* ```workingdir=/home/felix/refine```
|
||||||
|
* ```inputfile=example.csv```
|
||||||
|
* ```jsonfile=test.json```
|
||||||
|
|
||||||
|
|
||||||
|
2) start server
|
||||||
|
```docker run --d --name=openrefine -v ${workingdir}:/data:z felixlohmeier/openrefine -i 0.0.0.0 -m 4G -d /data```
|
||||||
|
|
||||||
|
3) create project (import file)
|
||||||
|
```docker run --rm --link openrefine -v ${workingdir}:/data:z felixlohmeier/openrefine-client -H openrefine -c $inputfile```
|
||||||
|
|
||||||
|
4) get project id
|
||||||
|
```project=($(docker run --rm --link openrefine -v ${workingdir}:/data felixlohmeier/openrefine-client -H openrefine --list | cut -c 2-14))```
|
||||||
|
|
||||||
|
5) apply transformations from json file
|
||||||
|
```docker run --rm --link -v ${workingdir}:/data felixlohmeier/openrefine-client -H openrefine -f ${jsonfile} ${project}```
|
||||||
|
|
||||||
|
6) export project to file
|
||||||
|
```docker run --rm --link openrefine -v ${workingdir}:/data felixlohmeier/openrefine-client -E --output=${project}.tsv ${project}```
|
||||||
|
|
||||||
|
7) cleanup
|
||||||
|
* ```docker stop -t=500 openrefine```
|
||||||
|
* ```docker rm openrefine```
|
Loading…
Reference in New Issue