[term discussion] Should our current data type become a relation?

See other views and start of this discussion here.

My take and suggestion on the matter:

Our definition of data type is difficult to use in actual knowledge graph data.

In knowledge graphs, data is stored as subject-predicate-object triples. For example, here is how I think numeric data would be stored in a knowledge graph that uses HDO as one of its mid-level ontologies:

<tree x> <is a> <arb:tree>
<arb:tree> <owl:subclassOf> <bfo:object>

<arb:tree height> <owl:subclassOf> <bfo:quality>
<arb:tree> <bfo:has quality> <arb:tree height>

<arb:tree height measurement> <owl:subclassOf> <hdo:measurement process>
<arb:tree height> <obi:is_specified_input_of> <arb:tree height measurement>

<arb:tree height datum> <owl:subclassOf> <hdo:datum>
<arb:tree height datum> <obi:is_specified_output_of> <arb:tree height measurement>
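
To check that this chain really connects a tree instance to its datum class, it can be sketched with plain Python tuples. This is only an illustration: the identifiers are copied from the triples above, while the helpers `objects`, `subjects` and `datum_classes_for` are hypothetical names. A real graph would live in an RDF store and would spell the subclass relation `rdfs:subClassOf`, not the informal `owl:subclassOf` shorthand used here.

```python
# Triples copied from the example above, stored as (subject, predicate, object).
TRIPLES = {
    ("tree x", "is a", "arb:tree"),
    ("arb:tree", "owl:subclassOf", "bfo:object"),
    ("arb:tree height", "owl:subclassOf", "bfo:quality"),
    ("arb:tree", "bfo:has quality", "arb:tree height"),
    ("arb:tree height measurement", "owl:subclassOf", "hdo:measurement process"),
    ("arb:tree height", "obi:is_specified_input_of", "arb:tree height measurement"),
    ("arb:tree height datum", "owl:subclassOf", "hdo:datum"),
    ("arb:tree height datum", "obi:is_specified_output_of", "arb:tree height measurement"),
}


def objects(subject, predicate):
    """Return all objects o such that (subject, predicate, o) is in TRIPLES."""
    return {o for s, p, o in TRIPLES if s == subject and p == predicate}


def subjects(predicate, obj):
    """Return all subjects s such that (s, predicate, obj) is in TRIPLES."""
    return {s for s, p, o in TRIPLES if p == predicate and o == obj}


def datum_classes_for(instance):
    """Walk instance -> class -> quality -> measurement process -> datum."""
    found = set()
    for cls in objects(instance, "is a"):
        for quality in objects(cls, "bfo:has quality"):
            for process in objects(quality, "obi:is_specified_input_of"):
                found |= subjects("obi:is_specified_output_of", process)
    return found


print(datum_classes_for("tree x"))
```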

Side note: I couldn't find a "measurement process" in OBO. Should we provide one (perhaps together with a "simulation process") as a subclass of planned process?

Now, how do we include the data type here? If our data type were a relation instead, it would be easy:

<arb:tree height datum> <rdf:value> <"13.37">
<arb:tree height datum> <hdo:data type> <hdo:decimal>
<hdo:decimal> <owl:subclassOf> <hdo:abstract data type>
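
A consumer of such data could then use the relation directly to decide how to parse the value. The following is a minimal sketch under the assumption that data type is a relation as proposed here. The `PARSERS` mapping and the helpers `lookup` and `typed_value` are made up for illustration and are not part of HDO.

```python
from decimal import Decimal

# The three triples from above, stored as (subject, predicate, object).
TRIPLES = {
    ("arb:tree height datum", "rdf:value", "13.37"),
    ("arb:tree height datum", "hdo:data type", "hdo:decimal"),
    ("hdo:decimal", "owl:subclassOf", "hdo:abstract data type"),
}

# Hypothetical mapping from abstract data types to Python parsers.
PARSERS = {"hdo:decimal": Decimal}


def lookup(subject, predicate):
    """Return one object for (subject, predicate), or None if there is none."""
    matches = [o for s, p, o in TRIPLES if s == subject and p == predicate]
    return matches[0] if matches else None


def typed_value(datum):
    """Parse the datum's rdf:value with the parser named by hdo:data type."""
    raw = lookup(datum, "rdf:value")
    data_type = lookup(datum, "hdo:data type")
    return PARSERS[data_type](raw)


print(typed_value("arb:tree height datum"))
```

One lookup on the datum is enough to find its type, which is the practical benefit of modelling data type as a relation.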

In this scenario we should maybe rename data type to "has data type" and abstract data type to "data type" to make it simpler for other ontology users. CSVW is another example where it is modelled this way.
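
For comparison, a CSVW column description attaches the type to a column through a `datatype` property. The fragment below is a minimal sketch written as a Python dict and printed as JSON. The file name `trees.csv` and the column name `height` are invented for this example.

```python
import json

# Minimal CSVW-style table description.
# "trees.csv" and "height" are hypothetical names used only for illustration.
metadata = {
    "@context": "http://www.w3.org/ns/csvw",
    "url": "trees.csv",
    "tableSchema": {
        "columns": [
            {"name": "height", "datatype": "decimal"},
        ]
    },
}

print(json.dumps(metadata, indent=2))
```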

If we keep our current data type, one would, as far as I can see, still need a relation to connect the value(s) to an abstract data type:

<arb:tree height datum> <hdo:has abstract data type> <hdo:decimal>

But what, then, is the data type? I think that, if we take our definition literally ("Information which identifies the abstract data type..."), the triple above as a whole is the data type. And RDF-star allows triples to be subjects of other triples (https://www.w3.org/2021/12/rdf-star.html#introduction):

<< <arb:tree height datum> <hdo:has abstract data type> <hdo:decimal> >> <is a> <hdo:data type>
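
The idea of a triple as the subject of another triple can be imitated with nested tuples. This is a rough sketch only: it loosely mirrors the `<< ... >>` notation and is not an RDF-star implementation, and the helper `quoted_subjects_of_class` is a hypothetical name. In this sketch the inner triple is only quoted; it is not separately stored as an asserted triple.

```python
# A plain triple is a 3-tuple of strings. A quoted triple is a 3-tuple that
# appears in the subject position of another triple.
typed_datum = (
    "arb:tree height datum",
    "hdo:has abstract data type",
    "hdo:decimal",
)

TRIPLES = {
    (typed_datum, "is a", "hdo:data type"),
}


def quoted_subjects_of_class(cls):
    """Return every quoted triple that is stated to be an instance of cls."""
    return {
        s for s, p, o in TRIPLES
        if isinstance(s, tuple) and p == "is a" and o == cls
    }


print(quoted_subjects_of_class("hdo:data type"))
```

Compared with the relation-based variant, a consumer has to unpack the quoted triple before reaching `hdo:decimal`, which shows the extra indirection this modelling introduces.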

Side note: Since a data type can also be misinformation according to our comment on that term, data type should maybe rather be a subclass of signifier than of information.

What do you think of this? Am I missing something here?