diff --git a/SUMMARY.md b/SUMMARY.md index 73e494d..09a3965 100644 --- a/SUMMARY.md +++ b/SUMMARY.md @@ -33,8 +33,6 @@ * [Metadaten eines Anbieters im CSV-Format](weitere-anwendungsfalle/metadaten-eines-anbieters-im-csv-format.md) * [Nachlässe aus Kalliope via SRU](weitere-anwendungsfalle/nachlasse-aus-kalliope-via-sru.md) -* [CSV zu MIDAS-XML und MIDAS-XML zu Excel](weitere-anwendungsfalle/csv-zu-midas-xml-und-midas-xml-zu-excel.md) -* [Dublin Core XML manipulieren](weitere-anwendungsfalle/dublin-core-xml-manipulieren.md) * [RDF vom AV-Portal der TIB](weitere-anwendungsfalle/rdf-vom-av-portal-der-tib.md) ## Suchindex Solr diff --git a/openrefine/marc_alternativ.json b/anwendungsfall-marc21/marc-vorverarbeitung.json similarity index 100% rename from openrefine/marc_alternativ.json rename to anwendungsfall-marc21/marc-vorverarbeitung.json diff --git a/anwendungsfall-marc21/transformation-mit-openrefine-in-finc-schema.md b/anwendungsfall-marc21/transformation-mit-openrefine-in-finc-schema.md index 465c356..dc0cbfe 100644 --- a/anwendungsfall-marc21/transformation-mit-openrefine-in-finc-schema.md +++ b/anwendungsfall-marc21/transformation-mit-openrefine-in-finc-schema.md @@ -4,33 +4,75 @@ Ziel: Daten für den Import in den Suchindex vorbereiten MARC21 ist sehr komplex und das [finc-Schema](https://github.com/finc/index/blob/master/schema.xml) hat ebenfalls etliche Felder, die teilweise kompliziert zu bilden sind. In dieser Summerschool können wir daher nur einen Teil erproben. -Arbeitstabelle \(in Summerschool erstellt\): [openrefine/wiley.xls](/openrefine/wiley.xls) - -## Felder definieren - -Neue Spalte anlegen: +## Neue Spalte anlegen für Definition der Felder (gemäß finc Schema) * column Subfields / Edit column / Add column based on this column... 
/ Column Name: finc / Expression: "" -Felder definieren \(Beispiel für Titel = MARC 245\): +## Felder definieren + +Arbeitstabelle \(in Summerschool erstellt\): [wiley-mapping.xls](https://felixlohmeier.gitbooks.io/summerschool-openrefine/content/anwendungsfall-marc21/wiley-mapping.xls) + +Die Zuordnung der MARC-Daten zum finc-Schema kann in der Regel in einer der drei folgenden Varianten erfolgen. + +### Variante A: Felder direkt aus einem MARC-Tag/Subfield definieren + +Beispiel für author = MARC 100 * show as: rows -* column Tags / Facet / text facet / "245" auswählen -* column Subfields / Facet / text facet / "a" und "b" auswählen -* column Indicators / Facet / text facet / "00", "02" und "04" auswählen -* column finc / edit cells / transform... / "title" +* column Tags / Facet / text facet / "100" auswählen +* column Subfields / Facet / text facet / "a" auswählen +* column Indicators / Facet / text facet / "1\" auswählen +* column finc / edit cells / transform... / "author" * close facets -Neue Zeile einfügen \(ist in OpenRefine nur mit einem Trick möglich\): +Beispiel für id = MARC 001 + +* column Tags / Facet / text facet / "001" auswählen +* column finc / edit cells / transform... / "id" +* close facets + +### Variante B: Felder aus mehreren MARC-Tags/Subfields zusammensetzen + +Wenn Felder sich aus mehreren Tag-Subfield-Indicator-Kombinationen zusammensetzen sollen, ist es am einfachsten, wenn die Teilbestandteile zunächst einzeln definiert werden. Beispiel für Titel = MARC 245 + +* column Tags / Facet / text facet / "245" auswählen +* column Subfields / Facet / text facet / "a" auswählen +* column Indicators / Facet / text facet / "00", "02" und "04" auswählen +* column finc / edit cells / transform... / "title1" +* close facets +* column Tags / Facet / text facet / "245" auswählen +* column Subfields / Facet / text facet / "a" auswählen +* column Indicators / Facet / text facet / "10", "12", "13" und "14" auswählen +* column finc / edit cells / transform... 
/ "title2" +* close facets +* column Tags / Facet / text facet / "245" auswählen +* column Subfields / Facet / text facet / "b" auswählen +* column Indicators / Facet / text facet / "00" und "04" auswählen +* column finc / edit cells / transform... / "title3" +* close facets +* column Tags / Facet / text facet / "245" auswählen +* column Subfields / Facet / text facet / "b" auswählen +* column Indicators / Facet / text facet / "10", "12", "13" und "14" auswählen +* column finc / edit cells / transform... / "title4" +* close facets + +### Variante C: Zusätzliche Felder definieren + +Um zusätzliche Felder mit Werten zu belegen, die in den Ausgangsdaten nicht vorkommen, müssen zunächst mit einem Trick neue Zeilen eingefügt werden. Beispiel für die Facette "Zugang" im SLUB-Katalog. * column Tags / Facet / text facet / "LDR" auswählen * column Content / edit cells / transform... / Expression: value + "NEUEZEILE" * column Content / edit cells / Split multi-valued cells / Separator: NEUEZEILE * close facet -* Die neuen Zeilen kann nun über facet by blank ausgewählt werden: column Tags / Facet / Customized facets / facet by blank / true +* column Tags / Facet / Customized facets / facet by blank / true +* column finc / edit cells / transform... / "access_facet" +* column Content / edit cells / transform... / "Electronic Resources" +* close facet ## Transponieren +Nachdem alle Felder (bzw. in Variante B deren Teilbestandteile) definiert wurden, müssen die Gesamtdaten transponiert werden, um die für den Import in den Suchindex benötigte Struktur zu erhalten. + * All / Edit columns / Re-order / remove columns ... / Spalten "RecordNumber", "Tags", "Indicators", "Subfields" nach rechts bewegen \(d.h. 
löschen\) * column finc / Facet / Customized facets / Facet by blank / true * All / Edit rows / Remove all matching rows @@ -39,9 +81,23 @@ Neue Zeile einfügen \(ist in OpenRefine nur mit einem Trick möglich\): * column finc / Facet / Customized facets / Facet by blank / true * All / Edit rows / Remove all matching rows * close facet -* column finc / transpose / columnize by key/value columns ... / ok -* Bei Bedarf Spalten manuell alphabetisch sortieren: All / Edit columns / Re-order / remove columns ... / +* column finc / transpose / columnize by key/value columns ... / key column: finc; value column: Content +* Spalten manuell sortieren: All / Edit columns / Re-order / remove columns ... / alphabetisch sortieren, Ausnahme: Spalte "id" soll als erstes stehen + +## Teilbestandteile zusammenfassen + +Beispiel für Titel + +* column title1 / Edit column / Add column based on this column... / Column Name: title / forNonBlank(cells["title1"].value,v,v,"") + forNonBlank(cells["title2"].value,v," "+v,"") + forNonBlank(cells["title3"].value,v," "+v,"") + forNonBlank(cells["title4"].value,v," "+v,"") +* column title / edit cells / transform... / value.trim().slice(0,-1) +* All / Edit columns / Re-order / remove columns ... / Spalten "title1", "title2", "title3", "title4" nach rechts bewegen \(d.h. 
löschen\) + +## Optional: Transformationsschritte als JSON-Konfiguration + +* Alle Transformationsschritte oben als JSON-Konfiguration: [wiley-minimal.json](https://felixlohmeier.gitbooks.io/summerschool-openrefine/content/anwendungsfall-marc21/wiley-minimal.json) ## Export Wählen Sie oben rechts im Menü Export den Menüpunkt `Tab-separated-value` + +Ergebnis: [wiley.tsv](https://www.felixlohmeier.de/slub/wiley/wiley.tsv) (aus Lizenzgründen zugriffsgeschützt) diff --git a/anwendungsfall-marc21/vorverarbeitung-mit-marcedit-und-openrefine.md b/anwendungsfall-marc21/vorverarbeitung-mit-marcedit-und-openrefine.md index 8a5938b..be1c39e 100644 --- a/anwendungsfall-marc21/vorverarbeitung-mit-marcedit-und-openrefine.md +++ b/anwendungsfall-marc21/vorverarbeitung-mit-marcedit-und-openrefine.md @@ -1,13 +1,23 @@ # Vorverarbeitung mit MarcEdit und OpenRefine -## Beispieldaten von MARC21 in TSV konvertieren +Ausgangsdaten: MARC21-Daten im Binärformat: [2016-2017_Wiley_UBCM_Auswahl-Kauf.mrc](https://www.felixlohmeier.de/slub/wiley/2016-2017_Wiley_UBCM_Auswahl-Kauf.mrc) (aus Lizenzgründen zugriffsgeschützt) -vgl. Anleitung im vorigen Kapitel +## Daten mit MarcEdit von MARC21 in TSV konvertieren + +Starten Sie MarcEdit, öffnen Sie den Bildschirm "OpenRefine Data Transfer" und geben Sie die folgenden Daten in die Maske ein: + +* Source File: Ausgangsdatei im MARC21-Format auswählen +* Save File: Ordner auswählen, Dateiname vergeben und bei "save as type" Tabbed Delimited Files (*.tsv) auswählen +* Export to OpenRefine auswählen und Button "Process" drücken + +Ergebnis: [wiley-marcedit-export.tsv](https://www.felixlohmeier.de/slub/wiley/wiley-marcedit-export.tsv) (aus Lizenzgründen zugriffsgeschützt) + +Achtung: MarcEdit ersetzt Dollarzeichen im Inhalt durch ```{dollar}```, damit das Dollarzeichen eindeutig als Steuerzeichen erkannt werden kann. 
## Daten in OpenRefine laden * Menü Create Project -* TSV-Datei hochladen +* Im vorigen Schritt mit MarcEdit erstellte TSV-Datei hochladen * In den Optionen "store blank rows" deaktivieren ## Subfields aufteilen @@ -16,29 +26,32 @@ Führen Sie folgende Transformationsschritte in OpenRefine durch: * column Column / Edit column / Remove this column * column Content / Text filter: $ -* column Content / add column based on this column / Subfields / forEach\(value.split\("$"\),v,get\(v,0\)\).join\("$"\) -* column Content / edit cells / transform... / forEach\(value.split\("$"\),v,slice\(v,1\)\).join\("$"\) +* All / Edit rows / Star rows +* column Content / Edit cells / Transform... / value.slice(1) * close text filter -* column Subfields / edit cells / split multi-valued cells... / $ * column Content / edit cells / split multi-valued cells... / $ +* column RecordNumber / Facet / Customized facets / Facet by blank / true +* All / Edit rows / Star rows +* close facet +* All / Facet / Facet by star / true +* column Content / add column based on this column / Subfields / value.get(0) +* column Content / Edit cells / Transform... 
/ value.slice(1) +* close facet ## Records bilden Führen Sie folgende Transformationsschritte in OpenRefine durch: -* column Subfields / Facet / customized facets / Facet by blank / false +* All / Facet / Facet by star / true * column RecordNumber / edit cells / Fill down * column Tags / edit cells / Fill down * column Indicators / edit cells / Fill down * close facet +* All / Edit rows / Unstar rows * column RecordNumber / edit cells / Blank down * Show: 5 rows * Show as: records ## Optional: Transformationsschritte als JSON-Konfiguration -* Alle Transformationsschritte oben als JSON-Konfiguration: [openrefine/marc.json](/openrefine/marc.json) -* In der Summerschool erarbeitete Alternativlösung: [openrefine/marc\_alternativ.json](/openrefine/marc_alternativ.json) - - - +* Alle Transformationsschritte oben als JSON-Konfiguration: [marc-vorverarbeitung.json](https://felixlohmeier.gitbooks.io/summerschool-openrefine/content/anwendungsfall-marc21/marc-vorverarbeitung.json) diff --git a/anwendungsfall-marc21/wiley-mapping.xls b/anwendungsfall-marc21/wiley-mapping.xls new file mode 100644 index 0000000..c52595e Binary files /dev/null and b/anwendungsfall-marc21/wiley-mapping.xls differ diff --git a/anwendungsfall-marc21/wiley-minimal.json b/anwendungsfall-marc21/wiley-minimal.json new file mode 100644 index 0000000..d6f6915 --- /dev/null +++ b/anwendungsfall-marc21/wiley-minimal.json @@ -0,0 +1,931 @@ +[ + { + "op": "core/column-removal", + "description": "Remove column Column", + "columnName": "Column" + }, + { + "op": "core/row-star", + "description": "Star rows", + "engineConfig": { + "mode": "row-based", + "facets": [ + { + "mode": "text", + "caseSensitive": false, + "query": "$", + "name": "Content", + "type": "text", + "columnName": "Content" + } + ] + }, + "starred": true + }, + { + "op": "core/text-transform", + "description": "Text transform on cells in column Content using expression grel:value.slice(1)", + "engineConfig": { + "mode": "row-based", + "facets": 
[ + { + "mode": "text", + "caseSensitive": false, + "query": "$", + "name": "Content", + "type": "text", + "columnName": "Content" + } + ] + }, + "columnName": "Content", + "expression": "grel:value.slice(1)", + "onError": "keep-original", + "repeat": false, + "repeatCount": 10 + }, + { + "op": "core/multivalued-cell-split", + "description": "Split multi-valued cells in column Content", + "columnName": "Content", + "keyColumnName": "RecordNumber", + "separator": "$", + "mode": "plain" + }, + { + "op": "core/row-star", + "description": "Star rows", + "engineConfig": { + "mode": "row-based", + "facets": [ + { + "omitError": false, + "expression": "isBlank(value)", + "selectBlank": false, + "selection": [ + { + "v": { + "v": true, + "l": "true" + } + } + ], + "selectError": false, + "invert": false, + "name": "RecordNumber", + "omitBlank": false, + "type": "list", + "columnName": "RecordNumber" + } + ] + }, + "starred": true + }, + { + "op": "core/column-addition", + "description": "Create column Subfields at index 4 based on column Content using expression grel:value.get(0)", + "engineConfig": { + "mode": "row-based", + "facets": [ + { + "omitError": false, + "expression": "row.starred", + "selectBlank": false, + "selection": [ + { + "v": { + "v": true, + "l": "true" + } + } + ], + "selectError": false, + "invert": false, + "name": "Starred Rows", + "omitBlank": false, + "type": "list", + "columnName": "" + } + ] + }, + "newColumnName": "Subfields", + "columnInsertIndex": 4, + "baseColumnName": "Content", + "expression": "grel:value.get(0)", + "onError": "set-to-blank" + }, + { + "op": "core/text-transform", + "description": "Text transform on cells in column Content using expression grel:value.slice(1)", + "engineConfig": { + "mode": "row-based", + "facets": [ + { + "omitError": false, + "expression": "row.starred", + "selectBlank": false, + "selection": [ + { + "v": { + "v": true, + "l": "true" + } + } + ], + "selectError": false, + "invert": false, + "name": 
"Starred Rows", + "omitBlank": false, + "type": "list", + "columnName": "" + } + ] + }, + "columnName": "Content", + "expression": "grel:value.slice(1)", + "onError": "keep-original", + "repeat": false, + "repeatCount": 10 + }, + { + "op": "core/fill-down", + "description": "Fill down cells in column RecordNumber", + "engineConfig": { + "mode": "row-based", + "facets": [ + { + "omitError": false, + "expression": "row.starred", + "selectBlank": false, + "selection": [ + { + "v": { + "v": true, + "l": "true" + } + } + ], + "selectError": false, + "invert": false, + "name": "Starred Rows", + "omitBlank": false, + "type": "list", + "columnName": "" + } + ] + }, + "columnName": "RecordNumber" + }, + { + "op": "core/fill-down", + "description": "Fill down cells in column Tags", + "engineConfig": { + "mode": "row-based", + "facets": [ + { + "omitError": false, + "expression": "row.starred", + "selectBlank": false, + "selection": [ + { + "v": { + "v": true, + "l": "true" + } + } + ], + "selectError": false, + "invert": false, + "name": "Starred Rows", + "omitBlank": false, + "type": "list", + "columnName": "" + } + ] + }, + "columnName": "Tags" + }, + { + "op": "core/fill-down", + "description": "Fill down cells in column Indicators", + "engineConfig": { + "mode": "row-based", + "facets": [ + { + "omitError": false, + "expression": "row.starred", + "selectBlank": false, + "selection": [ + { + "v": { + "v": true, + "l": "true" + } + } + ], + "selectError": false, + "invert": false, + "name": "Starred Rows", + "omitBlank": false, + "type": "list", + "columnName": "" + } + ] + }, + "columnName": "Indicators" + }, + { + "op": "core/row-star", + "description": "Unstar rows", + "engineConfig": { + "mode": "row-based", + "facets": [] + }, + "starred": false + }, + { + "op": "core/blank-down", + "description": "Blank down cells in column RecordNumber", + "engineConfig": { + "mode": "row-based", + "facets": [] + }, + "columnName": "RecordNumber" + }, + { + "op": 
"core/column-addition", + "description": "Create column finc at index 5 based on column Subfields using expression grel:\"\"", + "engineConfig": { + "mode": "row-based", + "facets": [] + }, + "newColumnName": "finc", + "columnInsertIndex": 5, + "baseColumnName": "Subfields", + "expression": "grel:\"\"", + "onError": "set-to-blank" + }, + { + "op": "core/text-transform", + "description": "Text transform on cells in column finc using expression grel:\"author\"", + "engineConfig": { + "mode": "row-based", + "facets": [ + { + "omitError": false, + "expression": "value", + "selectBlank": false, + "selection": [ + { + "v": { + "v": "100", + "l": "100" + } + } + ], + "selectError": false, + "invert": false, + "name": "Tags", + "omitBlank": false, + "type": "list", + "columnName": "Tags" + }, + { + "omitError": false, + "expression": "value", + "selectBlank": false, + "selection": [ + { + "v": { + "v": "a", + "l": "a" + } + } + ], + "selectError": false, + "invert": false, + "name": "Subfields", + "omitBlank": false, + "type": "list", + "columnName": "Subfields" + }, + { + "omitError": false, + "expression": "value", + "selectBlank": false, + "selection": [ + { + "v": { + "v": "1\\", + "l": "1\\" + } + } + ], + "selectError": false, + "invert": false, + "name": "Indicators", + "omitBlank": false, + "type": "list", + "columnName": "Indicators" + } + ] + }, + "columnName": "finc", + "expression": "grel:\"author\"", + "onError": "keep-original", + "repeat": false, + "repeatCount": 10 + }, + { + "op": "core/text-transform", + "description": "Text transform on cells in column finc using expression grel:\"id\"", + "engineConfig": { + "mode": "row-based", + "facets": [ + { + "omitError": false, + "expression": "value", + "selectBlank": false, + "selection": [ + { + "v": { + "v": "001", + "l": "001" + } + } + ], + "selectError": false, + "invert": false, + "name": "Tags", + "omitBlank": false, + "type": "list", + "columnName": "Tags" + } + ] + }, + "columnName": "finc", + 
"expression": "grel:\"id\"", + "onError": "keep-original", + "repeat": false, + "repeatCount": 10 + }, + { + "op": "core/text-transform", + "description": "Text transform on cells in column finc using expression grel:\"title1\"", + "engineConfig": { + "mode": "row-based", + "facets": [ + { + "omitError": false, + "expression": "value", + "selectBlank": false, + "selection": [ + { + "v": { + "v": "245", + "l": "245" + } + } + ], + "selectError": false, + "invert": false, + "name": "Tags", + "omitBlank": false, + "type": "list", + "columnName": "Tags" + }, + { + "omitError": false, + "expression": "value", + "selectBlank": false, + "selection": [ + { + "v": { + "v": "a", + "l": "a" + } + } + ], + "selectError": false, + "invert": false, + "name": "Subfields", + "omitBlank": false, + "type": "list", + "columnName": "Subfields" + }, + { + "omitError": false, + "expression": "value", + "selectBlank": false, + "selection": [ + { + "v": { + "v": "00", + "l": "00" + } + }, + { + "v": { + "v": "02", + "l": "02" + } + }, + { + "v": { + "v": "04", + "l": "04" + } + } + ], + "selectError": false, + "invert": false, + "name": "Indicators", + "omitBlank": false, + "type": "list", + "columnName": "Indicators" + } + ] + }, + "columnName": "finc", + "expression": "grel:\"title1\"", + "onError": "keep-original", + "repeat": false, + "repeatCount": 10 + }, + { + "op": "core/text-transform", + "description": "Text transform on cells in column finc using expression grel:\"title3\"", + "engineConfig": { + "mode": "row-based", + "facets": [ + { + "omitError": false, + "expression": "value", + "selectBlank": false, + "selection": [ + { + "v": { + "v": "245", + "l": "245" + } + } + ], + "selectError": false, + "invert": false, + "name": "Tags", + "omitBlank": false, + "type": "list", + "columnName": "Tags" + }, + { + "omitError": false, + "expression": "value", + "selectBlank": false, + "selection": [ + { + "v": { + "v": "b", + "l": "b" + } + } + ], + "selectError": false, + "invert": 
false, + "name": "Subfields", + "omitBlank": false, + "type": "list", + "columnName": "Subfields" + }, + { + "omitError": false, + "expression": "value", + "selectBlank": false, + "selection": [ + { + "v": { + "v": "00", + "l": "00" + } + }, + { + "v": { + "v": "04", + "l": "04" + } + }, + { + "v": { + "v": "02", + "l": "02" + } + } + ], + "selectError": false, + "invert": false, + "name": "Indicators", + "omitBlank": false, + "type": "list", + "columnName": "Indicators" + } + ] + }, + "columnName": "finc", + "expression": "grel:\"title3\"", + "onError": "keep-original", + "repeat": false, + "repeatCount": 10 + }, + { + "op": "core/text-transform", + "description": "Text transform on cells in column finc using expression grel:\"title2\"", + "engineConfig": { + "mode": "row-based", + "facets": [ + { + "omitError": false, + "expression": "value", + "selectBlank": false, + "selection": [ + { + "v": { + "v": "245", + "l": "245" + } + } + ], + "selectError": false, + "invert": false, + "name": "Tags", + "omitBlank": false, + "type": "list", + "columnName": "Tags" + }, + { + "omitError": false, + "expression": "value", + "selectBlank": false, + "selection": [ + { + "v": { + "v": "a", + "l": "a" + } + } + ], + "selectError": false, + "invert": false, + "name": "Subfields", + "omitBlank": false, + "type": "list", + "columnName": "Subfields" + }, + { + "omitError": false, + "expression": "value", + "selectBlank": false, + "selection": [ + { + "v": { + "v": "12", + "l": "12" + } + }, + { + "v": { + "v": "13", + "l": "13" + } + }, + { + "v": { + "v": "14", + "l": "14" + } + }, + { + "v": { + "v": "10", + "l": "10" + } + } + ], + "selectError": false, + "invert": false, + "name": "Indicators", + "omitBlank": false, + "type": "list", + "columnName": "Indicators" + } + ] + }, + "columnName": "finc", + "expression": "grel:\"title2\"", + "onError": "keep-original", + "repeat": false, + "repeatCount": 10 + }, + { + "op": "core/text-transform", + "description": "Text transform on 
cells in column finc using expression grel:\"title4\"", + "engineConfig": { + "mode": "row-based", + "facets": [ + { + "omitError": false, + "expression": "value", + "selectBlank": false, + "selection": [ + { + "v": { + "v": "245", + "l": "245" + } + } + ], + "selectError": false, + "invert": false, + "name": "Tags", + "omitBlank": false, + "type": "list", + "columnName": "Tags" + }, + { + "omitError": false, + "expression": "value", + "selectBlank": false, + "selection": [ + { + "v": { + "v": "b", + "l": "b" + } + } + ], + "selectError": false, + "invert": false, + "name": "Subfields", + "omitBlank": false, + "type": "list", + "columnName": "Subfields" + }, + { + "omitError": false, + "expression": "value", + "selectBlank": false, + "selection": [ + { + "v": { + "v": "14", + "l": "14" + } + }, + { + "v": { + "v": "10", + "l": "10" + } + }, + { + "v": { + "v": "12", + "l": "12" + } + }, + { + "v": { + "v": "13", + "l": "13" + } + } + ], + "selectError": false, + "invert": false, + "name": "Indicators", + "omitBlank": false, + "type": "list", + "columnName": "Indicators" + } + ] + }, + "columnName": "finc", + "expression": "grel:\"title4\"", + "onError": "keep-original", + "repeat": false, + "repeatCount": 10 + }, + { + "op": "core/text-transform", + "description": "Text transform on cells in column Content using expression grel:value + \"NEUEZEILE\"", + "engineConfig": { + "mode": "row-based", + "facets": [ + { + "omitError": false, + "expression": "value", + "selectBlank": false, + "selection": [ + { + "v": { + "v": "LDR", + "l": "LDR" + } + } + ], + "selectError": false, + "invert": false, + "name": "Tags", + "omitBlank": false, + "type": "list", + "columnName": "Tags" + } + ] + }, + "columnName": "Content", + "expression": "grel:value + \"NEUEZEILE\"", + "onError": "keep-original", + "repeat": false, + "repeatCount": 10 + }, + { + "op": "core/multivalued-cell-split", + "description": "Split multi-valued cells in column Content", + "columnName": "Content", + 
"keyColumnName": "RecordNumber", + "separator": "NEUEZEILE", + "mode": "plain" + }, + { + "op": "core/text-transform", + "description": "Text transform on cells in column finc using expression grel:\"access_facet\"", + "engineConfig": { + "mode": "row-based", + "facets": [ + { + "omitError": false, + "expression": "isBlank(value)", + "selectBlank": false, + "selection": [ + { + "v": { + "v": true, + "l": "true" + } + } + ], + "selectError": false, + "invert": false, + "name": "Tags", + "omitBlank": false, + "type": "list", + "columnName": "Tags" + } + ] + }, + "columnName": "finc", + "expression": "grel:\"access_facet\"", + "onError": "keep-original", + "repeat": false, + "repeatCount": 10 + }, + { + "op": "core/text-transform", + "description": "Text transform on cells in column Content using expression grel:\"Electronic Resources\"", + "engineConfig": { + "mode": "row-based", + "facets": [ + { + "omitError": false, + "expression": "isBlank(value)", + "selectBlank": false, + "selection": [ + { + "v": { + "v": true, + "l": "true" + } + } + ], + "selectError": false, + "invert": false, + "name": "Tags", + "omitBlank": false, + "type": "list", + "columnName": "Tags" + } + ] + }, + "columnName": "Content", + "expression": "grel:\"Electronic Resources\"", + "onError": "keep-original", + "repeat": false, + "repeatCount": 10 + }, + { + "op": "core/column-reorder", + "description": "Reorder columns", + "columnNames": [ + "Content", + "finc" + ] + }, + { + "op": "core/row-removal", + "description": "Remove rows", + "engineConfig": { + "mode": "row-based", + "facets": [ + { + "omitError": false, + "expression": "isBlank(value)", + "selectBlank": false, + "selection": [ + { + "v": { + "v": true, + "l": "true" + } + } + ], + "selectError": false, + "invert": false, + "name": "finc", + "omitBlank": false, + "type": "list", + "columnName": "finc" + } + ] + } + }, + { + "op": "core/blank-down", + "description": "Blank down cells in column finc", + "engineConfig": { + "mode": 
"row-based", + "facets": [] + }, + "columnName": "finc" + }, + { + "op": "core/multivalued-cell-join", + "description": "Join multi-valued cells in column Content", + "columnName": "Content", + "keyColumnName": "Content", + "separator": "$" + }, + { + "op": "core/key-value-columnize", + "description": "Columnize by key column finc and value column Content with note column ", + "keyColumnName": "finc", + "valueColumnName": "Content", + "noteColumnName": "" + }, + { + "op": "core/column-reorder", + "description": "Reorder columns", + "columnNames": [ + "id", + "access_facet", + "author", + "title1", + "title2", + "title3", + "title4" + ] + }, + { + "op": "core/column-addition", + "description": "Create column title at index 4 based on column title1 using expression grel:forNonBlank(cells[\"title1\"].value,v,v,\"\") + forNonBlank(cells[\"title2\"].value,v,\" \"+v,\"\") + forNonBlank(cells[\"title3\"].value,v,\" \"+v,\"\") + forNonBlank(cells[\"title4\"].value,v,\" \"+v,\"\")", + "engineConfig": { + "mode": "row-based", + "facets": [] + }, + "newColumnName": "title", + "columnInsertIndex": 4, + "baseColumnName": "title1", + "expression": "grel:forNonBlank(cells[\"title1\"].value,v,v,\"\") + forNonBlank(cells[\"title2\"].value,v,\" \"+v,\"\") + forNonBlank(cells[\"title3\"].value,v,\" \"+v,\"\") + forNonBlank(cells[\"title4\"].value,v,\" \"+v,\"\")", + "onError": "set-to-blank" + }, + { + "op": "core/text-transform", + "description": "Text transform on cells in column title using expression grel:value.trim().slice(0,-1)", + "engineConfig": { + "mode": "row-based", + "facets": [] + }, + "columnName": "title", + "expression": "grel:value.trim().slice(0,-1)", + "onError": "keep-original", + "repeat": false, + "repeatCount": 10 + }, + { + "op": "core/column-reorder", + "description": "Reorder columns", + "columnNames": [ + "id", + "access_facet", + "author", + "title" + ] + } +] diff --git a/openrefine/marc.json b/openrefine/marc.json deleted file mode 100644 index 
9d9907f..0000000 --- a/openrefine/marc.json +++ /dev/null @@ -1,163 +0,0 @@ -[ - { - "op": "core/column-removal", - "description": "Remove column Column", - "columnName": "Column" - }, - { - "op": "core/column-addition", - "description": "Create column Subfields at index 4 based on column Content using expression grel:forEach(value.split(\"$\"),v,get(v,0)).join(\"$\")", - "engineConfig": { - "mode": "row-based", - "facets": [ - { - "mode": "text", - "caseSensitive": false, - "query": "$", - "name": "Content", - "type": "text", - "columnName": "Content" - } - ] - }, - "newColumnName": "Subfields", - "columnInsertIndex": 4, - "baseColumnName": "Content", - "expression": "grel:forEach(value.split(\"$\"),v,get(v,0)).join(\"$\")", - "onError": "set-to-blank" - }, - { - "op": "core/text-transform", - "description": "Text transform on cells in column Content using expression grel:forEach(value.split(\"$\"),v,slice(v,1)).join(\"$\")", - "engineConfig": { - "mode": "row-based", - "facets": [ - { - "mode": "text", - "caseSensitive": false, - "query": "$", - "name": "Content", - "type": "text", - "columnName": "Content" - } - ] - }, - "columnName": "Content", - "expression": "grel:forEach(value.split(\"$\"),v,slice(v,1)).join(\"$\")", - "onError": "keep-original", - "repeat": false, - "repeatCount": 10 - }, - { - "op": "core/multivalued-cell-split", - "description": "Split multi-valued cells in column Content", - "columnName": "Content", - "keyColumnName": "RecordNumber", - "separator": "$", - "mode": "plain" - }, - { - "op": "core/multivalued-cell-split", - "description": "Split multi-valued cells in column Subfields", - "columnName": "Subfields", - "keyColumnName": "RecordNumber", - "separator": "$", - "mode": "plain" - }, - { - "op": "core/fill-down", - "description": "Fill down cells in column RecordNumber", - "engineConfig": { - "mode": "row-based", - "facets": [ - { - "omitError": false, - "expression": "isBlank(value)", - "selectBlank": false, - "selection": [ - { - 
"v": { - "v": false, - "l": "false" - } - } - ], - "selectError": false, - "invert": false, - "name": "Subfields", - "omitBlank": false, - "type": "list", - "columnName": "Subfields" - } - ] - }, - "columnName": "RecordNumber" - }, - { - "op": "core/fill-down", - "description": "Fill down cells in column Tags", - "engineConfig": { - "mode": "row-based", - "facets": [ - { - "omitError": false, - "expression": "isBlank(value)", - "selectBlank": false, - "selection": [ - { - "v": { - "v": false, - "l": "false" - } - } - ], - "selectError": false, - "invert": false, - "name": "Subfields", - "omitBlank": false, - "type": "list", - "columnName": "Subfields" - } - ] - }, - "columnName": "Tags" - }, - { - "op": "core/fill-down", - "description": "Fill down cells in column Indicators", - "engineConfig": { - "mode": "row-based", - "facets": [ - { - "omitError": false, - "expression": "isBlank(value)", - "selectBlank": false, - "selection": [ - { - "v": { - "v": false, - "l": "false" - } - } - ], - "selectError": false, - "invert": false, - "name": "Subfields", - "omitBlank": false, - "type": "list", - "columnName": "Subfields" - } - ] - }, - "columnName": "Indicators" - }, - { - "op": "core/blank-down", - "description": "Blank down cells in column RecordNumber", - "engineConfig": { - "mode": "row-based", - "facets": [] - }, - "columnName": "RecordNumber" - } -] diff --git a/openrefine/wiley.xls b/openrefine/wiley.xls deleted file mode 100644 index 4e345ea..0000000 Binary files a/openrefine/wiley.xls and /dev/null differ diff --git a/weitere-anwendungsfalle/csv-zu-midas-xml-und-midas-xml-zu-excel.md b/weitere-anwendungsfalle/csv-zu-midas-xml-und-midas-xml-zu-excel.md deleted file mode 100644 index e69de29..0000000 diff --git a/weitere-anwendungsfalle/dublin-core-xml-manipulieren.md b/weitere-anwendungsfalle/dublin-core-xml-manipulieren.md deleted file mode 100644 index e69de29..0000000