shortened tutorial and added simple stats

felixlohmeier 2022-11-01 20:48:26 +00:00
parent 8885fe89fb
commit 04fa7883cb
4 changed files with 13 additions and 20 deletions


@ -4,7 +4,7 @@ tasks:
before: gem install --silent bashly
init: |
wget -q -O openrefine.tar.gz "https://oss.sonatype.org/service/local/artifact/maven/content?r=releases&g=org.openrefine&a=openrefine&v=3.6.2&c=linux&p=tar.gz"
-tar --exclude 'licenses' --exclude 'LICENSE.txt' --exclude 'README.md' -xzf openrefine.tar.gz --strip 1
+tar --exclude 'licenses' --exclude 'LICENSE.txt' --exclude 'licenses.xml' --exclude 'README.md' -xzf openrefine.tar.gz --strip 1
rm openrefine.tar.gz
command: |
sudo ln -s "${PWD}/orcli" /usr/local/bin/


@ -73,46 +73,33 @@ Optional:
orcli import csv "https://git.io/fj5hF" --projectName "duplicates"
```
-3. Show OpenRefine project's metadata
+3. Remove duplicates (coming soon)
-```sh
-orcli info "duplicates"
-```
-4. Remove duplicates (coming soon)
-5. Export data from OpenRefine project to tab-separated-values (TSV) file `duplicates.tsv`
+4. Export data from OpenRefine project to tab-separated-values (TSV) file `duplicates.tsv`
```sh
orcli export tsv "duplicates" --output "duplicates.tsv"
```
-6. Write out your session history to file `example.sh` (and delete the last line to remove the history command)
+5. Write out your session history to file `example.sh` (and delete the last line to remove the history command)
```sh
history -a "example.sh"
sed -i '$ d' example.sh
```
-7. Exit playground
+6. Exit playground
```sh
exit
```
-8. Run batch process
+7. Run whole process again
```sh
./orcli run example.sh
```
-9. Cleanup example files
-```sh
-rm duplicates.tsv
-rm example.sh
-```
## Usage
* Use integrated help screens for available options and examples for each command.
@ -127,6 +114,8 @@ Optional:
OPENREFINE_URL="http://localhost:3333" orcli list
```
* If OpenRefine does not have enough memory to process the data, it becomes slow and may even crash. Check the message after the run command finishes to see how much memory was used and adjust the memory allocated to OpenRefine accordingly with the `--memory` flag (default: 2048M).
## Development
orcli uses [bashly](https://github.com/DannyBen/bashly/) for generating the one-file script from files in the `src` directory

orcli

@ -1,5 +1,5 @@
#!/usr/bin/env bash
-# This script was generated by bashly 0.8.9 (https://bashly.dannyb.co)
+# This script was generated by bashly 0.8.10 (https://bashly.dannyb.co)
# Modifying it manually is not recommended
# :wrapper.bash3_bouncer
@ -930,6 +930,8 @@ orcli_run_command() {
awk 1 "${files[$i]}"
)
done
# print stats
log "used $(($(ps --no-headers -o rss -p "$OPENREFINE_PID") / 1024)) MB RAM and $(ps --no-headers -o cputime -p "$OPENREFINE_PID") CPU time"
fi
}


@ -88,4 +88,6 @@ else
awk 1 "${files[$i]}"
)
done
# print stats
log "used $(($(ps --no-headers -o rss -p "$OPENREFINE_PID") / 1024)) MB RAM and $(ps --no-headers -o cputime -p "$OPENREFINE_PID") CPU time"
fi
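
The stats line added in this commit asks `ps` for the resident set size (in kilobytes) and the accumulated CPU time of the OpenRefine process, then converts the memory figure to megabytes. Below is a minimal standalone sketch of the same technique, assuming the procps `ps` found on Linux. The function name `print_stats` is hypothetical and not part of orcli, and `echo` stands in for orcli's `log` helper.

```sh
#!/usr/bin/env bash
# Hypothetical helper (not part of orcli): report memory and CPU time of a PID.
print_stats() {
  local pid="$1" rss_kb cpu
  # procps ps reports rss in kilobytes; fail if the PID does not exist
  rss_kb="$(ps --no-headers -o rss -p "$pid")" || return 1
  cpu="$(ps --no-headers -o cputime -p "$pid")" || return 1
  # integer division turns kilobytes into whole megabytes
  echo "used $((rss_kb / 1024)) MB RAM and ${cpu} CPU time"
}

# Example: stats for the current shell process
print_stats "$$"
```

Because `rss` is a snapshot taken at the moment `ps` runs, the reported value reflects memory in use at the end of the run, not necessarily the peak during processing.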