Histories File Optimisation of Cache-Based Services

From RetroShare


The main aim is to implement a histories file system that avoids unnecessarily loading all of the user's cache into memory during RetroShare start-up. A histories file is a simple way of loading cache files on demand: a single file per cache service maintains handles to the locations of those cache files.

It would preferably be implemented in the cache system, with appropriate callbacks (through inheritance) for cache service implementations. This work would target the distributed messaging service (p3Distrib), which generates many cache files. The history file would therefore likely take a tree form, in which each node contains sufficient detail to determine which subnodes (i.e. cache files) should be loaded.
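To make the tree idea concrete, here is a minimal sketch of how such a history tree might be walked to decide which cache files to load. All names (`HistoryNode`, `collectCacheFiles`, the timestamp fields) are hypothetical and are not taken from the RetroShare code base. It assumes each node records the p3Distrib group it covers and the time range of the messages beneath it.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical node of a per-service history tree (illustrative only).
struct HistoryNode {
    std::string groupId;               // group covered by this subtree ("" = all groups)
    std::int64_t oldestTs;             // timestamp of the oldest message below this node
    std::int64_t newestTs;             // timestamp of the newest message below this node
    std::string cacheFile;             // non-empty only for leaves that point at a cache file
    std::vector<HistoryNode> children; // subnodes
};

// Collect the cache files of one group that may hold messages at or after
// `since`. Subtrees that cannot match are skipped without being visited,
// so their cache files are never opened.
void collectCacheFiles(const HistoryNode& node,
                       const std::string& groupId,
                       std::int64_t since,
                       std::vector<std::string>& out)
{
    if (!node.groupId.empty() && node.groupId != groupId)
        return;                        // subtree belongs to another group
    if (node.newestTs < since)
        return;                        // everything below is too old
    if (!node.cacheFile.empty())
        out.push_back(node.cacheFile); // leaf: this file should be loaded
    for (const HistoryNode& child : node.children)
        collectCacheFiles(child, groupId, since, out);
}
```

Only the small tree needs to be in memory at start-up. The cache files themselves are read when a caller asks for a group and time range.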

Please use the discussion tab at the top to contribute.

Requirements / Design Questions:

* Consolidated Cache Files: The existing caches have thousands of very small files, with multiple instances of messages. The history should consolidate these files so that we have only a handful of history files, with no duplicates.
* Sorted Messages: How do we break up the messages? Perhaps a set of history files per p3Distrib group?
* Incremental Additions (new messages) / Removal (old messages).
    We want to minimise the amount of rewriting of history files. How is this handled?
    Do we split the history into weekly / monthly sections?
    If we receive an old, out-of-date file, do we rewrite that file or add to the current active history file?
* Uniqueness: We only want one copy of each message in the history. How do we ensure uniqueness without loading the whole history into memory? Do we need an index file, [msgId] -> [history file idx], which is quick to load?
* On-demand Loading: We only want to load history when the user asks for it.
* Only for subscribed forums? Seems sensible.
* Easy Validation: We don't want to have to revalidate each signature.
  Some file hash + signature per history file should be used (like the p3config stuff).
  We want to minimise the number of file rewrites and hash calculations.
* Reliability: We don't want to ever lose the history.
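
As one possible answer to the uniqueness question above, the [msgId] -> [history file idx] index could be sketched as follows. This is only an illustration under stated assumptions: `HistoryIndex` and its methods are hypothetical names, message IDs are assumed to be whitespace-free strings (e.g. hex hashes), and the on-disk format is assumed to be one "msgId fileIdx" pair per line.

```cpp
#include <cstddef>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <unordered_map>

// Hypothetical in-memory index: message ID -> index of the history file
// that stores the message. Not part of the RetroShare code base.
class HistoryIndex {
public:
    // Record that msgId lives in history file fileIdx.
    // Returns false, and changes nothing, if msgId is already indexed.
    bool add(const std::string& msgId, std::uint32_t fileIdx) {
        return entries_.emplace(msgId, fileIdx).second;
    }

    // Look up the history file holding msgId; returns false if unknown.
    bool lookup(const std::string& msgId, std::uint32_t& fileIdx) const {
        auto it = entries_.find(msgId);
        if (it == entries_.end())
            return false;
        fileIdx = it->second;
        return true;
    }

    // Write one "msgId fileIdx" pair per line (assumed format).
    void save(std::ostream& out) const {
        for (const auto& entry : entries_)
            out << entry.first << ' ' << entry.second << '\n';
    }

    // Read pairs written by save(); assumes IDs contain no whitespace.
    void load(std::istream& in) {
        std::string msgId;
        std::uint32_t fileIdx = 0;
        while (in >> msgId >> fileIdx)
            entries_[msgId] = fileIdx;
    }

    std::size_t size() const { return entries_.size(); }

private:
    std::unordered_map<std::string, std::uint32_t> entries_;
};
```

With such an index, an incoming message is appended to the active history file only when `add` returns true, so duplicates can be rejected without opening any history file. The sketch does not address validation (file hash + signature) or crash safety, both of which would still need a separate design.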