
Migrate code to the new graph structure #15

Closed
ichiriac opened this issue Feb 5, 2017 · 3 comments
ichiriac commented Feb 5, 2017

Rewrite every node with the new graph code.

This task will make the following tasks possible:


ichiriac commented Feb 12, 2017

How-to:

The repository is a graph database. The main problem here is that it is hard to separate node relations in order to serialize the data. Let's say:

```php
<?php // file1.php
class foo { /* .... */ }
```

And another file:

```php
<?php // file2.php
class bar extends foo { /* .... */ }
```

Here is the structure of the nodes:

```
REPOSITORY [
  [ FILE1.PHP ]
  [ FOO : CHILD OF FILE1.PHP; EXTENDED BY BAR ]
  [ FILE2.PHP ]
  [ BAR : CHILD OF FILE2.PHP; EXTENDS FOO ]
]
```

Let's say we want to serialize FILE1.PHP and FILE2.PHP into separate structures. We could implement a node traversal in order to also extract the children of each node. The main problem is references: pointers are stored in an array based on positions, so the loading order may break the logic.
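The position-based pointer problem can be sketched in a few lines. This is a hypothetical illustration (assuming a Node.js implementation; `nodes`, `partial`, and the field names are illustrative, not the library's API), not the actual storage code:

```javascript
// Hypothetical sketch: node relations stored as array positions.

// Full load: positions match the original graph.
const nodes = [
  { name: 'file1.php' },
  { name: 'foo', parent: 0 },            // "parent" = index of file1.php
  { name: 'file2.php' },
  { name: 'bar', parent: 2, extends: 1 } // "extends" = index of foo
];
console.log(nodes[nodes[3].extends].name); // 'foo' — correct

// Partial (lazy) load: only file2.php's shard is in memory,
// so the stored indexes now point at the wrong slots.
const partial = [
  { name: 'file2.php' },
  { name: 'bar', parent: 2, extends: 1 } // index 1 is now bar itself!
];
console.log(partial[partial[1].extends].name); // 'bar' — silently broken
```

The reference does not fail loudly; it just resolves to whatever node happens to sit at that position after a partial load.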

As the lazy loading is based on file entries, each relation should also be tied to them. Another problem is that a class definition could be moved, renamed, or copied/pasted, so a class may change files. The relation is weak: we can't locate a related node with precision, so we must go through an intermediate lookup system.
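One way to picture that intermediate lookup system: relations store only a symbol name, and a separate index maps each name to its current location. A minimal JavaScript sketch (assuming a Node.js implementation; `index`, `resolve`, and the file names are illustrative):

```javascript
// Hypothetical sketch of an intermediate lookup: relations store
// symbol names; an index maps each name to its current location.
const index = new Map([
  ['foo', { file: 'file1.php' }],
  ['bar', { file: 'file2.php' }]
]);

// bar's relation is weak: just the name 'foo', resolved on demand.
const bar = { name: 'bar', extends: 'foo' };
function resolve(name) {
  return index.get(name) || null; // null if the symbol is gone
}
console.log(resolve(bar.extends).file); // 'file1.php'

// Moving foo to another file only touches the index entry;
// bar's stored relation stays valid.
index.set('foo', { file: 'file3.php' });
console.log(resolve(bar.extends).file); // 'file3.php'
```

The weak relation survives moves and renames of the target file because nothing but the index ever records where a symbol physically lives.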

Why lazy loading is a must-have: projects can be huge, with tons of symbols, so putting everything in memory is not the best option. The way to go is to rebuild memory from a caching structure and load shards of data when the system requires them.

  • Weak node relations may pass through indexes; the same goes for reverse lookups
  • The index must be loaded at start and attach files to their symbols
  • Each node may have a UUID value so that references can be stateless

I feel like a generic graph solution is not the best bet 👎

@ichiriac

Good reading: http://highscalability.com/unorthodox-approach-database-design-coming-shard

Data are denormalized. Traditionally we normalize data. Data are splayed out into anomaly-less tables and then joined back together again when they need to be used. In sharding the data are denormalized. You store together data that are used together.

This doesn't mean you don't also segregate data by type. You can keep a user's profile data separate from their comments, blogs, email, media, etc, but the user profile data would be stored and retrieved as a whole. This is a very fast approach. You just get a blob and store a blob. No joins are needed and it can be written with one disk write.
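Applied to this repository, the quoted blob-per-entity idea would mean serializing a file together with all of its child symbols as one unit. A minimal sketch of that denormalized layout (hypothetical, assuming a Node.js implementation; `store` stands in for the on-disk cache):

```javascript
// Hypothetical sketch of the denormalized approach: a file and all
// of its child symbols are serialized together as one blob, so
// reading a file back is a single fetch — no joins, one write.
const store = new Map(); // stands in for the on-disk cache

function saveFile(file, symbols) {
  store.set(file, JSON.stringify({ file, symbols })); // one write
}
function loadFile(file) {
  return JSON.parse(store.get(file));                 // one read
}

saveFile('file1.php', [{ name: 'foo', kind: 'class' }]);
const blob = loadFile('file1.php');
console.log(blob.symbols[0].name); // 'foo'
```

Cross-file relations (like `bar extends foo`) would still need the weak, index-mediated references discussed earlier; only the file-local data is stored together.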

My approach is bad because I want to keep the data normalized; I need to try something else 😄

@ichiriac

done 😸
