This originates from the last edits in !18 (merged).
The term data structure was created as a sub-class of structured data.
ToDo:
discuss: do we need this class?
There are several classes (e.g. JSON-Schema) that could be moved as instances to structured data.
The class is not well annotated (synonyms, contributors, DE) and should be completed (or made obsolete).
current proposition:
note: When we talk about data structures we can mean two very different things! E.g. when we point at a variable in computer code and say "That's a list.", we talk about both the variable being an individual list data structure and the general data structure category of lists.
- specification (def: see HDO)
  - data structure specification (def: a specification which pertains to the structuredness of some data.)
    - list specification (def: a specification which requires a portion of data to consist of, at the top-level/primarily, an opening element, possibly multiple contained elements in sequence, and a closing element in this order.)
    - dictionary specification (def: ...)
    - object specification (def: ...)
    - ...

...

- structural quality (def: see HDO)
  - structured (def: see HDO)
    - list structured (def: a structured quality(?) which inheres in data by virtue of that data being structured as specified in a list specification.)
    - dictionary structured
    - object structured
    - ...

...

- structured data (def: see HDO)
  - list (def: structured data which has the quality list structured.)
  - dictionary
  - object
  - ...
As you can see, we don't have the term "data structure" itself, because it does not fit with the rest of HDO. In order to help people who are doing a Ctrl+F on "data structure" we can add it as a related synonym to structured data.
The current definition of data structure is inaccurate and, if we want to keep the term, we would need to reconsider the definition.
We have an intuitive idea of what a data structure is as something closer to a "specification" that is followed by other (structured) data. This other data is "contained" in the data structure.
We consider that, because of this, data structure could be moved to a subclass of specification. But we need to consider if data always follows a specification or if there could be data that doesn't follow any specification.
When thinking about an example of a data structure, such as a list, there are two sides to what the term means. On one side there's the specification that describes what a list is; on the other side, there's the entity out of which instances of specific lists are created. E.g., the list [1, 2, 3] is an instance of a class list, but is not an instance of the list specification.
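To make the two sides a bit more tangible, here is a minimal Python sketch. The names `ListSpecification` and `conforms` are hypothetical and only serve as an illustration; they are not HDO terms, and "conforming" is crudely approximated by "is a Python list".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ListSpecification:
    """Hypothetical stand-in for the 'specification side' of a list."""

    description: str = "a sequence of elements"

    def conforms(self, data: object) -> bool:
        # Assumption: conformance is approximated by being a Python list.
        return isinstance(data, list)


list_spec = ListSpecification()
numbers = [1, 2, 3]

# [1, 2, 3] is an instance of the class `list` ...
is_instance_of_class = isinstance(numbers, list)
# ... but it is not an instance of the specification.
is_instance_of_spec = isinstance(numbers, ListSpecification)
# It only conforms to the specification.
conforms_to_spec = list_spec.conforms(numbers)
```

In this sketch the specification is a separate object that data can be checked against, which is the side that "data structure" would cover if it were moved under specification.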
We think that the term data structure could reflect the specification side of this.
The class "list" could be a subclass of structured data and would be defined as structured data which conforms to the class "list" which is a subclass of "data structure".
If we removed the term data structure, we should add it as synonym for "structured data" or some other term. This is because we expect users to look for the term "data structure" in HDO.
We should also consider how data structure overlaps with data type and abstract data type.
This may be related to what was discussed in #25 (closed).
So a possible class hierarchy would look something like the following:
- specification
  - data structure
    - list (i.e. the rules which a specific portion of data has to adhere to, to be considered a list)

...

- structured data
  - list (i.e. a specific portion of data which adheres to the list specification)
obvious disadvantage: things like "list", "dictionary", "map", etc. would each appear two times
Among the previous discussions there is one where Pier and I talk about structure specifications: !18 (comment 2751517)
Based on this, we could try to move the above closer to our current "structural quality pattern":
- specification
  - data structure
    - list (i.e. the rules which a specific portion of data has to adhere to, to be considered a list)

...

- structural quality
  - structured
    - list structured (i.e. a quality which inheres in data which is structured according to the "list specification")

...

- structured data
  - list (i.e. data which has the quality "list structured")
This still doesn't solve the issue of having multiple classes with the same name.
Also, do we have to make the "list specification" an instance here?
While I'm writing this, my opinion on how we should model this is somewhat moving back to:
- specification
  - data structure specification (def: a specification which pertains to the structuredness of some data.)
    - list specification (def: a specification which requires a portion of data to consist of, at the top-level/primarily, an opening element, possibly multiple contained elements in sequence, and a closing element in this order.)

...

- structural quality
  - structured
    - list structured (def: a structured quality(?) which inheres in data by virtue of that data being structured as specified in a *list specification*.)

...

- structured data
  - list (def: structured data which has the quality *list structured*.)
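As a rough illustration of how the three branches relate (specification, quality, structured data), here is a small Python sketch. All class and attribute names (`Specification`, `StructuredQuality`, `StructuredData`, `specified_in`, `qualities`) are made up for this example. Modelling the list specification and the quality as instances is an assumption of the sketch, not a decision on the open instance-vs-class question.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Specification:
    label: str


@dataclass(frozen=True)
class StructuredQuality:
    label: str
    # The specification that describes this kind of structuredness.
    specified_in: Specification


@dataclass
class StructuredData:
    content: object
    # Qualities that inhere in this portion of data.
    qualities: list = field(default_factory=list)


list_specification = Specification("list specification")
list_structured = StructuredQuality("list structured", list_specification)


def is_list(data: StructuredData) -> bool:
    """A list is structured data which has the quality 'list structured'."""
    return list_structured in data.qualities


shopping = StructuredData(content=["milk", "eggs"], qualities=[list_structured])
blob = StructuredData(content=b"\x00\x01")
```

Here `is_list(shopping)` holds because the quality is attached to the data, while `blob` carries no such quality. The definition of "list" never mentions the specification directly; it only goes through the quality.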
I guess what I am trying to do here, is not having data structure as a subclass of specification because the sentence "A data structure is a specification." is making me feel very uneasy. However, the proposed definitions are independent of that.
If most of you think it's better, we can use the label "data structure" instead of "data structure specification". Or we use synonyms.
I like the general approach. I think some definitions could be improved or made clearer. E.g. currently we have
data structure
def. Structured data which is intended to contain or facilitate the organization of other data.
If I understand you correctly, this would be moved in the hierarchy under specification and become data structure specification. Your definition uses structuredness, which we currently don't have in the ontology (we have the quality structured, which should probably be renamed then).
structural quality -> structuredness
structured
unstructured
Your definition (def: a specification which pertains to the structuredness of some data) would then either need to be adjusted (i.e. "...pertains to the structured structuredness") or we just leave it at the top class.
We should resolve the problems with the DE defs when implementing this in an MR.
Gerrit and Leon suggest making this more concise by including the individual structured qualities directly in the definition of the various structured data categories:
- specification (def: see HDO)
  - data structure specification (def: a specification which pertains to the structured quality of some data.)
    - list specification (def: a specification which requires a portion of data to consist of, at the top-level/primarily, an opening element, possibly multiple contained elements in sequence, and a closing element in this order.)
    - dictionary specification (def: ...)
    - object specification (def: ...)
    - ...

...

- structured data (def: see HDO)
  - list (def: structured data, of which the structured quality conforms to a list specification.)
  - dictionary
  - object
  - ...
note: we could also mention the "list/dictionary/etc. structured" qualities in a comment if you think that's meaningful.
question: Do we have a "complies with / conforms to specification" property in HDO? If not, do we need one?
I would remove the meaningless 'some' from the definition. Moreover, I wonder if data structure differs from data structure specification: could we have some structure which is not specified? For example, a CSV-like file using an unusual character as separator. If yes, I would further suggest making 'data structure' a broad synonym.
- specification (def: see HDO)
  - data structure specification (def: a specification which pertains to the structured quality of data.), broad synonym: data structure
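The CSV-like example can be made concrete with a short Python sketch. The sample data and the separator `§` are invented for illustration. The point is only that the data is clearly structured, although no written specification names `§` as the separator; whether an implicit specification exists in such a case is the open question.

```python
import csv
import io

# Hypothetical data: no written specification says that "§" is the separator.
raw = "name§unit§value\ntemperature§K§293.15\n"

# Reading with the default separator (",") does not recover the structure:
# every line ends up as one single field.
rows_default = list(csv.reader(io.StringIO(raw)))

# Once the reader assumes "§" as separator, the columns become visible.
rows_custom = list(csv.reader(io.StringIO(raw), delimiter="§"))
```

With the default separator each row has one field; with `delimiter="§"` each row has three. The structure is in the data either way, but a reader needs to know (or guess) the rule to use it.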
We need the "some". Without it, the definition would implicitly read "a specification which pertains to the structured quality of all data.".
Data structures and data structure specifications are different things, as discussed in the meeting today and in the issue description. I don't think that there are structures without a specification. A specification might not be written down, but as long as something is structured, a specification for that structure exists at least on an abstract level IMO.
- specification (def: see HDO)
  - data structure specification (def: a specification which pertains to the structured quality of a data set.)
    - list specification (def: a specification which requires a data set to consist of a sequence of elements.)
    - key-value pair specification (def: a specification which requires a data set to consist of two elements, the first of which (the key) identifies the other (the value). | exact syn: field specification)
    - dictionary specification (def: A specification which requires a data set to consist of a set of key-value pairs with unique keys. | exact syn: map specification | related syn: object specification)
    - graph specification (def: A specification which requires a data set to consist of at least two elements that are connected to each other.)
    - Individual: JSON specification
    - ...

...

- data set (def see HDO)
  - data structure (def: A data set which is structured. | exact syn: structured data set | related syn: structured data)
    - list (def: A data structure, of which the structured quality conforms to a list specification.)
    - key-value pair (def: A data structure, of which the structured quality conforms to a key-value pair specification. | exact syn: field)
    - dictionary (def: A data structure, of which the structured quality conforms to a dictionary specification. | exact syn: map | related syn: object)
    - graph (def: A data structure, of which the structured quality conforms to a graph specification. | exact syn: map | related syn: object)
    - Individual: JSON (def: A data structure, of which the structured quality conforms to the JSON specification.)
    - ...
obsolete unstructured and integrate structured into structuredness (whether we call that structured in the future will be decided in another issue) (don't forget to maybe keep informative annotations of unstructured)
a bit of renaming and a few other tweaks:
- specification (def: see HDO)
  - data structure specification (def: a specification which pertains to the structured quality of a data set. | related syn: data structure)
    - list specification (def: a specification which requires a data set to consist of a sequence of elements.)
    - key-value pair specification (def: a specification which requires a data set to consist of two elements, the first of which realizes a key role in order to find and access the other element (the value). | exact syn: field specification)
    - dictionary specification (def: A specification which requires a data set to consist of a set of key-value pairs with unique keys. | exact syn: map specification | related syn: object specification)
    - graph specification (def: A specification which requires a data set to consist of at least two elements that are connected to each other.)
    - Individual: JSON specification
    - ...

...

- data set (def see HDO)
  - structured data (def: A data set which is structured according to a data structure specification. | exact syn: structured data set | related syn: data structure)
    - list (def: A data structure, of which the structured quality conforms to a list specification.)
    - key-value pair (def: A data structure, of which the structured quality conforms to a key-value pair specification. | exact syn: field)
    - dictionary (def: A data structure, of which the structured quality conforms to a dictionary specification. | exact syn: map | related syn: object)
    - graph (def: A data structure, of which the structured quality conforms to a graph specification. | exact syn: map | related syn: object)
    - Individual: JSON (def: A data structure, of which the structured quality conforms to the JSON specification.)
    - ...
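To sanity-check the proposed definitions, they can be read as conformance checks. The following Python sketch is only one possible reading: the function names are hypothetical, and the way data sets are represented (tuples for pairs, an iterable of pairs for dictionaries, nodes plus edges for graphs) is an assumption of the sketch and not part of the proposal.

```python
from typing import Any, Hashable, Iterable, Tuple


def conforms_to_list_specification(data: Any) -> bool:
    """A sequence of elements (approximated as a Python list or tuple)."""
    return isinstance(data, (list, tuple))


def conforms_to_key_value_pair_specification(data: Any) -> bool:
    """Two elements, the first of which plays the key role."""
    return isinstance(data, tuple) and len(data) == 2


def conforms_to_dictionary_specification(pairs: Iterable[Any]) -> bool:
    """A set of key-value pairs with unique (hashable) keys."""
    pairs = list(pairs)
    if not all(conforms_to_key_value_pair_specification(p) for p in pairs):
        return False
    keys = [key for key, _value in pairs]
    return len(keys) == len(set(keys))


def conforms_to_graph_specification(
    nodes: Iterable[Hashable],
    edges: Iterable[Tuple[Hashable, Hashable]],
) -> bool:
    """At least two elements that are connected to each other."""
    node_set = set(nodes)
    return any(a in node_set and b in node_set and a != b for a, b in edges)


unique_keys = [("name", "HDO"), ("version", 1)]
duplicate_keys = [("name", "HDO"), ("name", "other")]
```

One thing this makes visible: the tuple `("name", "HDO")` passes both the list check and the key-value pair check, so under these definitions one data set can conform to several specifications at once.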
For the "conforms to" relation that we discussed earlier (i.e. structuredness xyz conforms to data structure specification): http://purl.org/dc/terms/conformsTo
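A minimal sketch of how such a statement could look as a triple, using plain Python tuples instead of an RDF library. Only the property URI is taken from the comment above; the `https://example.org/hdo-sketch/` identifiers are placeholders invented for this example and are not HDO IRIs.

```python
# Property URI as given in the comment above.
DCTERMS_CONFORMS_TO = "http://purl.org/dc/terms/conformsTo"

# Hypothetical namespace and identifiers, for illustration only.
EX = "https://example.org/hdo-sketch/"

triples = {
    (EX + "my_list_structuredness", DCTERMS_CONFORMS_TO, EX + "list_specification"),
}


def specifications_of(subject: str) -> list:
    """Return everything the given subject is stated to conform to."""
    return sorted(
        obj
        for subj, pred, obj in triples
        if subj == subject and pred == DCTERMS_CONFORMS_TO
    )


found = specifications_of(EX + "my_list_structuredness")
```

The subject here is the structured quality rather than the data set itself, following the wording "structuredness xyz conforms to data structure specification".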
Thanks for the hint! I like this term, but we should briefly think about whether its definition ("An established standard to which the described resource conforms.") is okay for us since it talks about an established standard and our potential use cases for this relation might also cover non-established and implicit specifications.