
Feedback on RDF to CSV #20

@tobiasschweizer


Hi there,

rdfpandas is an interesting approach. So far I have only used the RDF to CSV functionality.
I tried it on a smaller dataset and it worked. When I tried it with a bigger Turtle file (> 600 MB), it failed. I have had a similar experience with rdflib when dealing with bigger data sources (it is primarily slow).

However, here are some questions / observations:

  • Is there a way to configure the prefixes for the properties (independent of those used in the source)?
  • Sometimes I got more than a hundred columns for the same property, ranging from index [0] to [n]. I think this is the same problem one has in relational DB design when storing multiple atomic values for a field without normalisation. Would it make sense to create more than one table for a given RDF source and use foreign keys to relate to other tables? Or would this overcomplicate things, as one would have to deal with m:n tables?
  • I think keeping different types like literals and IRIs separate for the same property makes sense.
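
To make the multi-table idea from the second point concrete, here is a minimal sketch in plain Python. It does not use the rdfpandas or rdflib API; the sample triples, the prefixes, and the `normalise` helper are all hypothetical and only illustrate the layout: single-valued properties stay in a main table, and each multi-valued property gets its own child table that points back to the subject.

```python
from collections import defaultdict

# Hypothetical sample triples (subject, predicate, object); not from a real dataset.
TRIPLES = [
    ("ex:book1", "dc:title", "RDF Basics"),
    ("ex:book1", "dc:creator", "ex:alice"),
    ("ex:book1", "dc:creator", "ex:bob"),
    ("ex:book2", "dc:title", "CSV in Practice"),
    ("ex:book2", "dc:creator", "ex:alice"),
]


def normalise(triples):
    """Split triples into a main table and one child table per multi-valued property."""
    # Group objects by subject and predicate, preserving input order.
    values = defaultdict(lambda: defaultdict(list))
    for s, p, o in triples:
        values[s][p].append(o)

    # A property counts as multi-valued if any subject has more than one value for it.
    multi = {
        p
        for props in values.values()
        for p, objs in props.items()
        if len(objs) > 1
    }

    main = []
    children = defaultdict(list)
    for s, props in values.items():
        row = {"id": s}
        for p, objs in props.items():
            if p in multi:
                # One row per value; "subject_id" acts as the foreign key.
                children[p].extend({"subject_id": s, "value": o} for o in objs)
            else:
                row[p] = objs[0]
        main.append(row)
    return main, dict(children)


main_table, child_tables = normalise(TRIPLES)
```

With this layout the number of columns no longer depends on how many values a property has; the cost is a join between the main table and each child table when reading the data back.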

I hope this feedback is useful.
