This page describes all the production steps for the ontology of nouns and verbs of the Gallo-Italic variety spoken in Nicosia and Sperlinga, along with the software tools used and the intermediate products.
nicosiaesperlinga-base.ttl (available for download in Turtle format) is the base OWL ontology providing the lexicon metadata. It contains just the ontology individual, the lexicon individual and some entries, and has to be populated with all the remaining lexical entries of the Gallo-Italic variety spoken in Nicosia and Sperlinga.
pdfimporter is a tool that extracts lemmas from the Vocabolario del dialetto galloitalico di Nicosia e Sperlinga and places the corresponding entries into nicosiaesperlinga-base.ttl.
Thus, running the following command in the same directory as nicosiaesperlinga-base.ttl produces nicosiaesperlinga-lemmas.ttl, i.e., an ontology with all the nouns and verbs of the Gallo-Italic variety spoken in Nicosia and Sperlinga.
java -jar pdfimporter.jar nicosiaesperlinga.pdf
sicilian-derivationbuilder uses a brute-force approach to find all the possible derivations that, by means of Gallo-Sicilian Features, transform Sicilian etymons into lemmas in nicosiaesperlinga-lemmas.ttl.
java -jar sicilian-derivationbuilder.jar nicosiaesperlinga-lemmas.ttl
It produces the file derivations-bf.csv, enumerating one derivation per row.
These derivations have the form:
lemma <--feature_label_1--intermediate_form_1<-- ... intermediate_form_n<--feature_label_n<--sicilian_etymon
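For readers who want to process rows of derivations-bf.csv programmatically, the following is a minimal sketch of how one derivation string could be split into its steps. It assumes that every step follows the `<--feature_label--` pattern of the first step shown above; the function name `parse_derivation` and the sample string are hypothetical and not part of the tools described on this page.

```python
def parse_derivation(text):
    """Split a derivation string into (target, feature_label, source) steps.

    Assumption: each step is written as target<--feature_label--source,
    and neither forms nor labels contain the sequences '<--' or '--'.
    """
    head, *rest = text.split("<--")
    steps = []
    target = head.strip()
    for piece in rest:
        label, sep, source = piece.partition("--")
        if not sep:
            raise ValueError(f"step without closing '--': {piece!r}")
        source = source.strip()
        steps.append((target, label.strip(), source))
        # The source of this step is the target of the next one.
        target = source
    return steps


# Placeholder names only, not real lemmas or feature labels.
for step in parse_derivation("lemma<--f1--form1<--f2--etymon"):
    print(step)
```

Each tuple reads from right to left in the original string: the feature applied to the source form yields the target form, and the last source is the Sicilian etymon.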
This file is then revised and reworked by lexicographers, who eliminate multiple derivations for the same lemma, remove implausible ones, and add further derivations produced using the online derivation tool.
derivations-revised.csv is the file resulting from this manual revision.
These derivations are then imported into the final ontology of nouns and verbs of the Gallo-Italic variety spoken in Nicosia and Sperlinga using gs-derivationsimporter. This tool takes three input arguments.
When run without arguments, it uses derivations-revised.csv as the derivations file, copies nicosiaesperlinga-lemmas.ttl into nicosiaesperlinga.ttl and places the derivations in it, using `sic` as the language tag for etymons.
java -jar gs-derivationsimporter.jar
The imported derivations can be verified by means of liph-validator, a tool that checks that all the derivation steps occurring in an ontology comply with the definition of the linguistic phenomena they refer to. In more detail, liph-validator takes as arguments the ontology file to validate and the address of the ontology defining the linguistic phenomena.
In our context:
java -jar liph-validator.jar nicosiaesperlinga.ttl https://gallosiciliani.unict.it/ns/gs-features?ttl
gs-derivationsextractor produces a CSV file, useful for statistical purposes, enumerating all the derivations in an OWL ontology in Turtle format, provided that the linguistic phenomena in the derivations belong to the GalloSicilian Features Ontology.
The first argument of gs-derivationsextractor is the ontology file in Turtle format, whereas the second is the name of the CSV file to be produced.
java -jar gs-derivationsextractor.jar nicosiaesperlinga.ttl derivations-ext.csv
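The commands listed so far can be collected into a small driver script. The sketch below is only illustrative: the command lines are copied from this page, while the split into two stages (before and after the manual revision by lexicographers), the names `BEFORE_REVISION`, `AFTER_REVISION` and `run_stage`, and the dry-run default are assumptions made here.

```python
import shlex
import subprocess

# Automated steps that precede the manual revision of derivations-bf.csv.
BEFORE_REVISION = [
    ["java", "-jar", "pdfimporter.jar", "nicosiaesperlinga.pdf"],
    ["java", "-jar", "sicilian-derivationbuilder.jar", "nicosiaesperlinga-lemmas.ttl"],
]

# Automated steps that follow it, once derivations-revised.csv exists.
AFTER_REVISION = [
    ["java", "-jar", "gs-derivationsimporter.jar"],
    ["java", "-jar", "liph-validator.jar", "nicosiaesperlinga.ttl",
     "https://gallosiciliani.unict.it/ns/gs-features?ttl"],
    ["java", "-jar", "gs-derivationsextractor.jar", "nicosiaesperlinga.ttl",
     "derivations-ext.csv"],
]


def run_stage(commands, dry_run=True):
    """Return the shell-quoted command lines; execute them unless dry_run."""
    lines = [shlex.join(cmd) for cmd in commands]
    if not dry_run:
        for cmd in commands:
            # check=True stops the stage at the first failing tool.
            subprocess.run(cmd, check=True)
    return lines


for line in run_stage(BEFORE_REVISION) + run_stage(AFTER_REVISION):
    print(line)
```

Passing each command as a list of arguments means that no shell is involved, so the `?` in the features URL is handed to the tool unchanged rather than being treated as a wildcard.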
The generated file derivations-ext.csv has the following columns:
Column | Description |
---|---|
id | a unique identifier for the row |
lemma vnisp | indicating the Gallo-Sicilian lemma ending the derivation |
derivazione | containing the derivation |
tratti disattesi | enumerating all the Gallo-Sicilian features which could have affected the etymon, but are not in the derivation |
nuovo indice di galloitalicità | the ratio between the total number of the Gallo-Sicilian features which could have affected the etymon and those that occurred in the derivation |
Afer | `sì` if there is some feature belonging to the category Apheresis, `no` otherwise |
Assib | `sì` if there is some feature belonging to the category Assibilation, `no` otherwise |
Degem | `sì` if there is some feature belonging to the category Degemination, `no` otherwise |
Deretr | `sì` if there is some feature belonging to the category Deretroflexion, `no` otherwise |
Dissim | `sì` if there is some feature belonging to the category Dissimilation, `no` otherwise |
Ditt | `sì` if there is some feature belonging to the category Diphthongization, `no` otherwise |
Leniz | `sì` if there is some feature belonging to the category Lenition, `no` otherwise |
Palat | `sì` if there is some feature belonging to the category Palatalization, `no` otherwise |
Vocal | `sì` if there is some feature belonging to the category Vocalization, `no` otherwise |
microtratto 1 | the first feature in the derivation, if any |
microtratto 2 | the second feature in the derivation, if any |
microtratto 3 | the third feature in the derivation, if any |
microtratto 4 | the fourth feature in the derivation, if any |
microtratto 5 | the fifth feature in the derivation, if any |
microtratto 6 | the sixth feature in the derivation, if any |
microtratto 7 | the seventh feature in the derivation, if any |
microtratto 8 | the eighth feature in the derivation, if any |
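As an illustration of the statistical use of derivations-ext.csv, the sketch below counts, for each category column, the derivations marked with `sì`. It assumes that the file is comma-separated, has a header row with the column names listed above, and is encoded in UTF-8; none of this is stated on this page. The function name `count_categories` and the two-row sample are hypothetical.

```python
import csv
import io
from collections import Counter

CATEGORY_COLUMNS = ["Afer", "Assib", "Degem", "Deretr", "Dissim",
                    "Ditt", "Leniz", "Palat", "Vocal"]


def count_categories(csv_file):
    """Count, per category column, the rows whose value is 'sì'."""
    counts = Counter({name: 0 for name in CATEGORY_COLUMNS})
    for row in csv.DictReader(csv_file):
        for name in CATEGORY_COLUMNS:
            # Missing columns or empty cells are treated as 'no'.
            if (row.get(name) or "").strip() == "sì":
                counts[name] += 1
    return counts


# Invented two-row sample, only to exercise the function.
SAMPLE = "id,lemma vnisp,Afer,Leniz\n1,lemma_a,no,sì\n2,lemma_b,sì,sì\n"
counts = count_categories(io.StringIO(SAMPLE))
print(counts["Leniz"], counts["Afer"], counts["Palat"])  # prints: 2 1 0
```

With the real file, the same function could be called on `open("derivations-ext.csv", newline="", encoding="utf-8")`, adjusting the delimiter and encoding if they differ from the assumptions above.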