Closed
Description
Hi there,
rdfpandas is an interesting approach. I only used the RDF to CSV functionality so far.
I tried it on a smaller dataset and it worked. When I tried a bigger Turtle file (> 600 MB), it failed. I have had a similar experience with rdflib when dealing with bigger data sources (it is primarily slow).
However, here are some questions / observations:
- Is there a way to configure the prefixes for the properties (independent of those used in the source)?
- Sometimes I got more than a hundred columns for the same property, with indices ranging from [0] to [n]. I think this is the same problem one has in relational DB design when dealing with multiple atomic values for a field without normalisation. Would it make sense to create more than one table for a given RDF source and use foreign keys to relate to other tables? Or would this overcomplicate things, as one would have to deal with m:n tables?
- I think keeping different types like literals and IRIs separate for the same property makes sense.
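
To make the multi-table idea from the second point concrete, here is a minimal sketch in plain Python. It is not how rdfpandas works internally; the sample triples, the `split_tables` function, and the `main_table` / `link_tables` names are all hypothetical and only illustrate the kind of normalisation I have in mind. Single-valued properties stay in one main table, and each multi-valued property gets its own link table keyed by the subject.

```python
from collections import defaultdict

# Hypothetical sample triples (subject, predicate, object); not from a real dataset.
TRIPLES = [
    ("ex:book1", "dc:title", "RDF Primer"),
    ("ex:book1", "dc:creator", "ex:alice"),
    ("ex:book1", "dc:creator", "ex:bob"),
    ("ex:book2", "dc:title", "Turtle Basics"),
    ("ex:book2", "dc:creator", "ex:alice"),
]


def split_tables(triples):
    """Split triples into a main table plus one link table per multi-valued property."""
    # subject -> predicate -> list of objects
    values = defaultdict(lambda: defaultdict(list))
    for s, p, o in triples:
        values[s][p].append(o)

    # A predicate counts as multi-valued if any subject has more than one object for it.
    multi = {
        p
        for props in values.values()
        for p, objs in props.items()
        if len(objs) > 1
    }

    main = {}                  # subject -> {predicate: single value}
    links = defaultdict(list)  # predicate -> [(subject, object)]; subject is the foreign key
    for s, props in values.items():
        main[s] = {}
        for p, objs in props.items():
            if p in multi:
                links[p].extend((s, o) for o in objs)
            else:
                main[s][p] = objs[0]
    return main, dict(links)


main_table, link_tables = split_tables(TRIPLES)
```

With this layout, `dc:creator` no longer expands into `dc:creator[0]` ... `dc:creator[n]` columns; it becomes a two-column table of (subject, object) rows that can be joined back to the main table when needed. The trade-off is the one mentioned above: consumers have to handle several tables instead of one wide CSV.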
I hope this feedback is useful.