Help:- Reference Data development

Reference Data is the PLCSlib mechanism for extending the semantics of the underlying PLCS PSM. This extension mechanism can be used within a particular business context, in which case it is documented within the PLCSlib environment and is available for use within DEXs defined in that context. In addition, a project or organization may want to further extend the context specific reference data.

An overview of Reference Data and its use within PLCSlib is provided in Reference Data overview.

This section outlines how to develop reference data for use within a context. However, the approaches defined here can also be applied within projects or organizations extending PLCSlib reference data for their own particular exchange requirements.

PLCSlib Reference Data Libraries (RDLs) are developed at the Context level. This significantly reduces the number of RDLs required to support DEXs whilst ensuring the compatibility of the reference data between DEXs in the same context.

Figure 1 shows the architecture of the different Reference Data Libraries.

Each ontology shown in the diagram is stored in an OWL file. The file is identified by a URI that resolves to the OWL file. This URI is used to import the OWL ontology into other ontologies.

The ontology itself is identified by an IRI.

      <owl:Ontology rdf:about="http://www.plcs.org/Exemplar/Exemplar-rdl">
        <owl:imports rdf:resource="http://docs.oasis-open.org/plcs/oasis-rdl-en.owl"/>
      </owl:Ontology>
images/rdarch.png
Figure 1 -  Reference Data Library Architecture

A major feature of this environment is the clear support for multiple natural language versions of an RDL. Language specific extension files just add language tagged annotations and do not add new classes or individuals. Notice the language specific files do not declare a new namespace for the concepts.

In many cases developers will only be interested in adding Reference data within a context, and so will only need to edit the file:

plcslib/data/contexts/<Cntx>/refdata/<cntx>-rdl-en.owl

This assumes the context specific extensions are developed using English as the default language. This assumption does not always hold; for example, in the context of Swedish Defence the default language will be Swedish. The new approach can handle this situation, as shown in Figure 2. Here the RDL master in this context is defined in a non-English language and extends the same natural language version of the Standard PLCS Reference Data. If an English version of the RDL is required for wider compatibility, it can be specified as a language extension of the master; it only has to import the master RDL and provide English annotations.

images/rdarch-lang.png
Figure 2 -  Reference Data Library Architecture - alternative language context

The only new Reference Data Libraries (RDLs) to be developed are context specific RDLs, since the core RDLs (i.e. the PSM RDL and the PLCS RDL) have already been developed. Context specific RDLs should be named according to the following naming scheme:

File location
The Context specific RDL file shall be created in
plcslib/data/contexts/<Cntx>/refdata/
Where <Cntx> is the context for which this RDL is being developed.
NOTE    The context in this usage can use appropriate capitalization, such as "SwedishDefense".
File name
The Context specific RDL file shall be called
<cntx>-rdl-<lang>.owl
Where <cntx> is the context for which this RDL is being developed and <lang> is the ISO 639-1 Language Code for the default language for the context.
NOTE    Here the context should be written in all lowercase to be consistent with the naming conventions.
IRI
The Context specific RDL identifier (IRI) shall be
http://www.plcs.org/<Cntx>/<cntx>-rdl
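Where <Cntx> is the context for which this RDL is being developed, written with appropriate capitalization as in the file location, and <cntx> is the same context written in all lowercase as in the file name.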
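
As an illustration of the naming scheme, the sketch below applies the three rules to a hypothetical context. The context name SwedishDefense is borrowed from the NOTE above and the language code sv (Swedish) is assumed as its default language; neither the resulting path nor the IRI is taken from an existing PLCSlib context.

```xml
<?xml version="1.0"?>
<!-- Hypothetical values: <Cntx> = SwedishDefense, <cntx> = swedishdefense, <lang> = sv -->
<!-- File location: plcslib/data/contexts/SwedishDefense/refdata/ -->
<!-- File name:     swedishdefense-rdl-sv.owl -->
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <!-- IRI built as http://www.plcs.org/<Cntx>/<cntx>-rdl (no language code in the IRI) -->
  <owl:Ontology rdf:about="http://www.plcs.org/SwedishDefense/swedishdefense-rdl"/>
</rdf:RDF>
```

Note that the language code appears only in the file name, so the ontology IRI stays the same whichever language is the default for the context.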

When creating a context specific RDL the RDL (the owl:Ontology) itself shall have the following annotations applied:

Scope (dc:coverage)
The RDL for a specific context shall identify the scope or applicability for the context using the dc:coverage annotation applied to the ontology itself. This shall be further qualified by a language code to allow other language extensions of this library to declare language specific translations of the scope statement.
Language (dc:language)
A language declaration that is used to verify that all annotations created in this RDL file are valid. This is represented as a dc:language annotation on the library itself.
Source (dc:source)
The RDL may identify a source for the reference data specified in this RDL. This is represented using dc:source annotation on the library. This shall be further qualified by a language code to allow other language extensions of this library to declare language specific sources.
Version (owl:versionInfo) mandatory property
The RDL version information is to be represented using owl:versionInfo annotation on the library. This shall be set to the value v#.## for use by human users of the RDL.
NOTE    This should not be qualified by a language code since this is language independent. It should also not be specified in language extensions since this is a property of the master RDL.
Revision (owl:versionInfo) mandatory property
Revision information for when the file was last edited is represented using the owl:versionInfo annotation on the library. This shall be set to the value $Revision: $ for use by the CVS system. This shall be further qualified by a language code to allow it to be identified with the language specification and not the RDL as a whole.
Ontology Status (dc:type) mandatory property
The publication status of the RDL shall be represented with the dc:type annotation.
NOTE    This should not be qualified by a language code since this is language independent. It should also not be specified in language extensions since this is a property of the master RDL.
Last editor (dc:creator) mandatory property
The record of the last editor of the file shall be specified using the dc:creator annotation on the library. This shall be set to the value $Author: $ for use by the CVS system. This shall be further qualified by a language code to allow it to be identified with the language specification and not the RDL as a whole.
Last edit date (dc:date) mandatory property
A date value for when the file was last edited is represented using the dc:date annotation on the library. This shall be set to the value $Date: $ for use by the CVS system. This shall be further qualified by a language code to allow it to be identified with the language specification and not the RDL as a whole.
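
The annotations listed above might be combined on the ontology header roughly as follows. This is a minimal sketch that reuses the Exemplar IRI and import from the earlier snippet; the scope and source texts, the version number and the dc:type value are placeholders assumed for illustration, since the permitted status values for an RDL are not listed here.

```xml
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:owl="http://www.w3.org/2002/07/owl#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <owl:Ontology rdf:about="http://www.plcs.org/Exemplar/Exemplar-rdl">
    <owl:imports rdf:resource="http://docs.oasis-open.org/plcs/oasis-rdl-en.owl"/>

    <!-- Scope and source: language qualified so that extensions can translate them -->
    <dc:coverage xml:lang="en">Placeholder scope statement for the context.</dc:coverage>
    <dc:source xml:lang="en">Placeholder source of the reference data.</dc:source>

    <!-- Language of the annotations in this file: not language qualified -->
    <dc:language>en</dc:language>

    <!-- Version (v#.##) and status are properties of the master RDL: no language code -->
    <owl:versionInfo>v1.00</owl:versionInfo>
    <dc:type>in_work</dc:type> <!-- assumed status value -->

    <!-- CVS keywords: language qualified so that they are tied to this language file -->
    <owl:versionInfo xml:lang="en">$Revision: $</owl:versionInfo>
    <dc:creator xml:lang="en">$Author: $</dc:creator>
    <dc:date xml:lang="en">$Date: $</dc:date>
  </owl:Ontology>
</rdf:RDF>
```

The two owl:versionInfo annotations are distinguished only by the presence of the language code: the unqualified one carries the human-readable version and the qualified one carries the CVS revision.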

Classes and individuals that have meaning in this context shall be declared within this RDL following the rules and guidelines identified below.

The development of Reference Data classes follows the process described in Figure 3.

images/rd_development_process.png
Figure 3 -  Process model describing Reference Data development.

When creating a context specific RD class (the owl:Class), each class shall have the following elements and annotations applied:

Identifier (rdf:about) mandatory property
The class must be given an identifier that allows it to be uniquely identified within the context of the RDL, including the contexts of any imported RDLs. The identifier for the class is combined with the URI for the RDL in which it is defined to create the URI for the class. Most OWL tools will raise an error if this identification is not unique in the current context. The identifier does not need to be interpretable in any natural language and could be an arbitrary string value, including a globally unique identification code (GUID). However, the practice in the ontology community is to give classes meaningful identifiers (i.e. URI fragment identifiers), and the PLCS community follows that practice. It is therefore recommended that the identifier be a meaningful name in the default language of the RDL being defined and start with a capital letter. Identifiers shall contain no spaces or other special characters.
Subclass (rdfs:subClassOf) mandatory property
For each new Class to be added to the RDL the parent class (or superclass) must be identified. Every class shall be a subclass of one of the PLCS classes (defined in http://docs.oasis-open.org/ns/plcs/plcs) or their subclasses. A class may have more than one superclass; in other words, multiple inheritance is allowed. The new class is added as a subclass of the appropriate superclass, and other superclasses are added as necessary.
NOTE    Members of the new class are, by definition, members of all the identified superclasses. If this is not the case then the class hierarchy should be re-considered.
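
A class declaration that follows the two rules above could look like the sketch below. The class name Technician and the superclass URI are hypothetical: the superclass stands in for whichever PLCS class is appropriate, and the use of "#" to join the RDL URI and the identifier is an assumption based on the reference to URI fragment identifiers.

```xml
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <!-- Class URI = RDL URI + "#" + identifier (capitalised, no spaces or special characters) -->
  <owl:Class rdf:about="http://www.plcs.org/Exemplar/Exemplar-rdl#Technician">
    <!-- Hypothetical superclass: replace with a real PLCS class or one of its subclasses -->
    <rdfs:subClassOf rdf:resource="http://docs.oasis-open.org/ns/plcs/plcs#Hypothetical_plcs_class"/>
    <!-- Further rdfs:subClassOf elements may be added for multiple inheritance -->
  </owl:Class>
</rdf:RDF>
```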

Labels are used to provide human readable identification in the desired language. The provided label shall be written all lower case, and may consist of more than one word separated by spaces. SKOS elements are used to provide this capability and the following are allowed:

Preferred label (skos:prefLabel) mandatory property
A preferred label is the label that is the desired one and must always be provided.
Alternative label (skos:altLabel)
One or more alternative labels can be specified. These are synonyms of the preferred label.
Deprecated label (skos:hiddenLabel)
This element provides the capability of listing labels that, for one reason or another, should not be used for the class.
NOTE    The use of rdfs:label is not permitted.

All chosen SKOS label related elements should be attributed with a language identifier using the ISO 639-1 language codes.
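
The labelling rules can be sketched as follows for a hypothetical Technician class in the Exemplar RDL. The label texts are invented for illustration; only the annotation properties and the use of xml:lang follow from the rules above.

```xml
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:owl="http://www.w3.org/2002/07/owl#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <owl:Class rdf:about="http://www.plcs.org/Exemplar/Exemplar-rdl#Technician">
    <!-- Exactly one preferred label per language, written in lower case -->
    <skos:prefLabel xml:lang="en">technician</skos:prefLabel>
    <!-- Synonyms of the preferred label; several are allowed -->
    <skos:altLabel xml:lang="en">maintenance technician</skos:altLabel>
    <!-- Label that should no longer be used for the class -->
    <skos:hiddenLabel xml:lang="en">mechanic</skos:hiddenLabel>
    <!-- rdfs:label is deliberately absent: its use is not permitted -->
  </owl:Class>
</rdf:RDF>
```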

A class might represent either an information model related concept or a terminological concept, or both.

Definition (skos:definition)
A definition should be added to all classes capturing terminological concepts. A terminological concept is a commonly existing one, e.g. a technician.
Description (rdfs:comment) mandatory property
A description should be added for all classes capturing information model related concepts. An information model related concept is a concept created in order to capture a specific aspect within the scope of the model, e.g. a female technician.

If both a description and a definition are provided, they should be identical; this means that the class represents both an information model related concept and a terminological concept. There can be at most one description and one definition for each class. They should be formatted according to the following rules:

Source (dc:source)
If the description or the definition has been copied from elsewhere, the source should be specified. If the source is a standard, the source should be expressed as a combination of the identifier of the standard, its publication year and its name in quotation marks, e.g. ISO 10303-239:2005 - "Product LifeCycle Support"

Comments and examples may be provided for the class to add further clarity to the descriptions and definitions.

Note (skos:note)
Comments add information about the class that was not included in the description or definition.
Example (skos:example)
Examples are valuable for adding clarity to the meaning of the class.

There may be multiple comments and examples provided for each class. Both comments and examples should preferably be single sentences, starting with a capital letter and ending with a punctuation mark.

Both the comments and examples, as well as the descriptions and the definitions should be attributed with a language identifier using the ISO 639-1 language codes specifying the language used in the text.
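
A sketch of these textual annotations on a hypothetical Technician class is given below. All texts are placeholders. The dc:source value only illustrates the required format for a standard and does not claim that the placeholder definition comes from that standard; leaving dc:source without a language code is also an assumption, since only the other four annotations are required above to carry one.

```xml
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <owl:Class rdf:about="http://www.plcs.org/Exemplar/Exemplar-rdl#Technician">
    <!-- Definition and description are both present here, so their texts are identical -->
    <skos:definition xml:lang="en">Placeholder text defining a technician.</skos:definition>
    <rdfs:comment xml:lang="en">Placeholder text defining a technician.</rdfs:comment>
    <!-- Format illustration only: identifier, publication year and quoted name -->
    <dc:source>ISO 10303-239:2005 - "Product LifeCycle Support"</dc:source>
    <!-- Single sentences, starting with a capital letter and ending with punctuation -->
    <skos:note xml:lang="en">Placeholder comment about the class.</skos:note>
    <skos:example xml:lang="en">Placeholder example of the class.</skos:example>
  </owl:Class>
</rdf:RDF>
```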

In addition to providing information about the class/term, meta-data regarding the class should be entered as follows:

Last editor (dc:creator) mandatory property
The creator is the person who last edited the class.
Contributor (dc:contributor)
A contributor is anyone who at any time has edited the class. When a change is made, the person making the change becomes the editor, and the person who was previously the editor becomes a contributor, if they were not already listed as such. This enables the tracking of each person editing the class.

The creator and each contributor should be specified with their full name and company affiliation.

Last edit date (dc:date) mandatory property
The date YYYY-MM-DD when the last edit was made.
Class Status (dc:type) mandatory property

The status of the class, described in Figure 3, being one of:

created
in_work
ready_for_review
passed_review
approved
in_use
cancelled
Version introduction (owl:versionInfo) mandatory property
When a class is introduced to the ontology, the version of the ontology at the time that the class was introduced should be added. This allows tracking of which classes were added from one version to the next.
Change Note (skos:changeNote) mandatory property

When a class is added to the ontology, it should be recorded in a change note using the format:
[YYYY-MM-DD] <editor name>, <organization>: Initial definition
E.g.

[2012-05-15] Mats Nilsson, FMV: Initial definition

Every time a class is subsequently changed the change to the class should be documented as an additional change note using the same format.
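
The meta-data rules above might be applied as in the following sketch, which shows a hypothetical Technician class after one subsequent edit. The first change note repeats the example given above; the second editor, the dates of the later edit and the version number are invented for illustration. The annotations are left without language codes because no requirement is stated for them here.

```xml
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:owl="http://www.w3.org/2002/07/owl#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <owl:Class rdf:about="http://www.plcs.org/Exemplar/Exemplar-rdl#Technician">
    <!-- Last editor (hypothetical person and organization) -->
    <dc:creator>Anna Example, Example Organization</dc:creator>
    <!-- The previous editor becomes a contributor -->
    <dc:contributor>Mats Nilsson, FMV</dc:contributor>
    <!-- Date of the last edit, YYYY-MM-DD -->
    <dc:date>2012-06-01</dc:date>
    <!-- One of the class status values listed above -->
    <dc:type>in_work</dc:type>
    <!-- Version of the RDL in which the class was introduced (assumed value) -->
    <owl:versionInfo>v1.00</owl:versionInfo>
    <!-- One change note per change, oldest first -->
    <skos:changeNote>[2012-05-15] Mats Nilsson, FMV: Initial definition</skos:changeNote>
    <skos:changeNote>[2012-06-01] Anna Example, Example Organization: Clarified the description</skos:changeNote>
  </owl:Class>
</rdf:RDF>
```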

Figure 4 shows an example of the annotation properties for a class shown in Protege.

images/rd_class_annotations.png
Figure 4 -  RD class annotations presented in the Protege annotation template

For each new Individual to be added to the RDL the Class that it is a member of must be identified. It is possible that an individual may be a member of more than one Class. The new individual is added as a member of the appropriate class and other classes that it is a member of are added as necessary.

NOTE    Members of a class are, by definition, members of all the identified superclasses. If this is not the case then the class hierarchy should be re-considered.

The individual must be given an identifier that allows it to be identified within the context of the RDL (this includes the context of any imported RDL). The identifier for the individual is used as part of the URI for the individual. Most OWL tools will raise an error if this identification is not unique in the current context. The identifier has no language interpretation and could be an arbitrary string value, including a globally unique identification code (GUID). However, the practice in the ontology community is to give meaningful identifiers (i.e. URI fragment identifiers), and the PLCS community follows that practice. It is therefore recommended that the identifier be a meaningful name in the default language of the RDL being defined. Identifiers shall contain no spaces or other special characters.

Labels are used to provide human readable identification, possibly in multiple languages. The SKOS annotations identified below are allowed:

In all cases the SKOS label should be attributed with a language identification using the ISO 639-1 Language Code identified for the RDL.

NOTE    The use of rdfs:label is not permitted.

A description of the individual should be given using the rdfs:comment annotation. This should be attributed with the language code for the RDL.

Comments and examples may be provided for the individual using the skos:note and skos:example annotations respectively; again, these should be attributed with the RDL language code.

If the individual represents a terminological entry (i.e. is a formally defined term that would appear in a dictionary etc.) then the definition of the term may be provided using the skos:definition annotation.

NOTE    Definitions should only be provided by language experts, if in doubt leave out!
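
The rules for individuals can be sketched as follows. The class Maintenance_level and the individual Depot_level are hypothetical names chosen for illustration, as are all annotation texts; the sketch assumes an RDL whose language is English and omits the optional skos:definition.

```xml
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <owl:NamedIndividual rdf:about="http://www.plcs.org/Exemplar/Exemplar-rdl#Depot_level">
    <!-- Class membership; further rdf:type elements may be added for other classes -->
    <rdf:type rdf:resource="http://www.plcs.org/Exemplar/Exemplar-rdl#Maintenance_level"/>
    <!-- SKOS label with the RDL language code; rdfs:label is not permitted -->
    <skos:prefLabel xml:lang="en">depot level</skos:prefLabel>
    <!-- Description of the individual -->
    <rdfs:comment xml:lang="en">Placeholder description of the individual.</rdfs:comment>
    <!-- Optional comment and example -->
    <skos:note xml:lang="en">Placeholder comment about the individual.</skos:note>
    <skos:example xml:lang="en">Placeholder example of the individual.</skos:example>
  </owl:NamedIndividual>
</rdf:RDF>
```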

In addition to providing information about the individual the developer should also provide meta-data as follows:

Each of these annotations shall include the default language identification to allow search and display using local languages.

The approach in PLCSlib to support different languages is to create separate RDL files for the different languages. This implies that the different languages are defined within different namespaces. Although this is true from a syntactic point of view, there shall be no new classes or individuals declared within these namespaces. The RDL files shall only include new language specific annotations for classes or individuals declared in the RDL being extended. To support this concept, quality checks will be written to enforce these rules.

The language extension should only provide annotations that are language qualified except the dc:language annotation on the RDL itself. This is used to validate all other annotations in the language specific RDL file.

Classes and individuals imported from the master RDL file may have language specific annotations added. This should result in usage of rdf:about that refers to the original class/individual and applies additional language annotations.
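
A language extension file might therefore look like the sketch below, which adds a Swedish label to a hypothetical Technician class of the Exemplar RDL. The IRI of the extension ontology and the URI used in owl:imports are assumptions; in practice the import must use the URI that resolves to the master OWL file.

```xml
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:owl="http://www.w3.org/2002/07/owl#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <!-- Hypothetical IRI for the Swedish language extension -->
  <owl:Ontology rdf:about="http://www.plcs.org/Exemplar/Exemplar-rdl-sv">
    <!-- Assumed import URI for the master RDL -->
    <owl:imports rdf:resource="http://www.plcs.org/Exemplar/Exemplar-rdl"/>
    <!-- The only annotation that is not language qualified -->
    <dc:language>sv</dc:language>
  </owl:Ontology>

  <!-- rdf:about points at the class declared in the master RDL; no new class is declared -->
  <rdf:Description rdf:about="http://www.plcs.org/Exemplar/Exemplar-rdl#Technician">
    <skos:prefLabel xml:lang="sv">tekniker</skos:prefLabel>
  </rdf:Description>
</rdf:RDF>
```

Using rdf:Description rather than owl:Class here is one way to make it visible that the extension only annotates an existing resource; whether PLCSlib tooling expects this form is not stated in this section.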

The following links provide detailed information on how to develop and extend the OASIS PLCS RDL using specific tools:
