Documentation:design services histories
Design Document for History File Optimisation of Distrib Service.
Introduction.
The aim of this optimisation is to load cache data on demand from the p3GroupDistrib service. Currently all cache data is loaded into memory. Loading can instead be delayed until the user requests the data, which has the potential to reduce start-up compute and data-movement cost as well as system memory use.
High Level Design Overview
Data Structures
XML History file (not necessarily XML):
This stores configuration information on which cache data was present in the last Retroshare session. It can be used to populate a cache table (which contains the group-to-message cache dependency) without having to load the cache data.
Types of nodes:
<group> = group
<messages> = multiple messages from a group
<msg> = msg
<cacheid> = cache id
<sign> = signature for data (probably not needed, as tabled data are already signed)
Cache Table:
Records whether a group's messages have been loaded or not. Note that more than one group may belong to a single cache ID.
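As an illustration only, the cache table could be sketched along the following lines. The names `GrpCacheEntry`, `GrpCacheTable` and their methods are hypothetical and are not part of p3GroupDistrib; ids are assumed to be plain strings.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical entry: which cache holds the group's messages, and whether
// those messages have been loaded into memory yet.
struct GrpCacheEntry {
    std::string cacheId;
    bool loaded;
};

// Hypothetical cache table keyed by group id.
class GrpCacheTable {
public:
    // Register (or re-register) a group as known but not yet loaded.
    void add(const std::string& grpId, const std::string& cacheId) {
        mEntries[grpId] = GrpCacheEntry{cacheId, false};
    }

    // Unknown groups are reported as not loaded.
    bool isLoaded(const std::string& grpId) const {
        auto it = mEntries.find(grpId);
        return it != mEntries.end() && it->second.loaded;
    }

    // Returns false if the group is not in the table.
    bool markLoaded(const std::string& grpId) {
        auto it = mEntries.find(grpId);
        if (it == mEntries.end()) return false;
        it->second.loaded = true;
        return true;
    }

    // All groups stored under one cache id (there may be several).
    std::vector<std::string> groupsInCache(const std::string& cacheId) const {
        std::vector<std::string> out;
        for (const auto& kv : mEntries)
            if (kv.second.cacheId == cacheId) out.push_back(kv.first);
        return out;
    }

private:
    std::map<std::string, GrpCacheEntry> mEntries;
};
```

Keying by group id keeps the "has this group been loaded" query cheap, while `groupsInCache()` covers the case where several groups share one cache ID.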
Functionality
This covers the parsing of the history file and the infrastructure changes to the p3GroupDistrib service needed to support this optimisation.
XML Parser
Currently using pugixml 1.0
Design Considerations.
Composition, Aggregation or simple inheritance?
Actually none: it will be implemented by the cache service itself (p3GroupDistrib). The cache infrastructure is exposed adequately to p3GroupDistrib for it to be able to implement histories.
As mentioned previously, an important consideration is that a cache ID may be related to one or more messages or groups (instances of data).
Architectural Strategies.
The main architectural strategy is to use inheritance to give CacheService the ability to act as the history file service. By tying the data service closely together with history files one can maintain an easily extensible cache system. For example, one can envisage deleting data once a given memory (cache) limit is exceeded.
Model / Design.
Interface
The interface is invisible to the GUI. The documentation will note that distributed service messages should be loaded as late as possible.
Internal Resources
History file: Need to have a way of tracking whether a given piece of cache data has been loaded, based on group id. Most likely maps will be used.
Configuration.
Need to ensure that newly arriving data is handled appropriately, especially comment or forum messages. This is where the XML document structure is useful for updating the document.
Use Cases.
Request for messages
- User requests messages from a group
- p3Group queries cache table to see if group has been loaded
  - not loaded
    - data is loaded
    - cache table updated
  - loaded, goes through normal request flow
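The request flow above can be sketched as follows. This is a minimal illustration under stated assumptions: `LazyMsgStore` and its `Loader` callback are hypothetical stand-ins for the service and its cache loading code, and messages are represented as plain strings. Only the name `cached()` is borrowed from this document.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the part of the service that answers message
// requests. The loader callback represents reading a group's cache data.
class LazyMsgStore {
public:
    using Msgs = std::vector<std::string>;
    using Loader = std::function<Msgs(const std::string& grpId)>;

    explicit LazyMsgStore(Loader loader) : mLoader(std::move(loader)) {}

    // True once the group's messages have been loaded.
    bool cached(const std::string& grpId) const {
        return mLoaded.count(grpId) != 0;
    }

    // Not loaded: load the data and record it. Loaded: normal request flow.
    const Msgs& requestMsgs(const std::string& grpId) {
        auto it = mLoaded.find(grpId);
        if (it == mLoaded.end()) {
            it = mLoaded.emplace(grpId, mLoader(grpId)).first;
        }
        return it->second;
    }

private:
    Loader mLoader;
    std::map<std::string, Msgs> mLoaded;
};
```

In this sketch the map of loaded groups doubles as the cache table, so the loader runs at most once per group.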
Shutdown Saving History File
- history file encrypted
- then saved by saveConfig through p3Config
Start-up Loading History File
- history file loaded through loadConfig() via p3Config and decrypted using the GPG key ( p3GroupDistrib::encrypt() )
- cache table is built using history file
- old Groups/Messages that are not found in cache table are loaded immediately
  - and cache document updated, as with cache table
- it should be noted that historical caches are loaded first, then historical groups from these caches and then messages
note: it is important that history file is loaded before cache data
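The start-up steps above might look roughly like this. It is a sketch only: `CacheRecord`, `StartupPlan` and `planStartup()` are hypothetical names, and decryption and file parsing are assumed to have already happened.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical record: one group and the cache that holds its data.
struct CacheRecord {
    std::string grpId;
    std::string cacheId;
};

// Hypothetical result of start-up planning.
struct StartupPlan {
    std::map<std::string, std::string> cacheTable; // grp id -> cache id
    std::vector<std::string> loadNow;              // grp ids to load immediately
};

// Build the cache table from the history file first, then check the cache
// data actually found. Anything missing from the history is added to the
// table and scheduled for immediate loading.
StartupPlan planStartup(const std::vector<CacheRecord>& history,
                        const std::vector<CacheRecord>& found) {
    StartupPlan plan;
    for (const auto& rec : history) {
        plan.cacheTable[rec.grpId] = rec.cacheId;
    }
    for (const auto& rec : found) {
        if (plan.cacheTable.count(rec.grpId) == 0) {
            plan.cacheTable[rec.grpId] = rec.cacheId;
            plan.loadNow.push_back(rec.grpId);
        }
    }
    return plan;
}
```

Processing the history records before the found cache data mirrors the note that the history file must be loaded before cache data.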
Receiving new Message/Group
- history document is updated with new message or group
- message or group is loaded immediately and cache table is updated
Low Level Design.
p3GroupDistrib
Introduction / Usage
This is an entity which implements Retroshare's Cache System.
Interface / Exports.
<source lang="cpp">
class p3GroupDistrib {
private:
    void updateCacheDoc();
    void updateCacheTable();
    bool cached(const std::string& someId);
};
</source>
cacheDoc: contains the information needed to create the cache table
cachetable: records whether a given group's cache data has been loaded or not
updateCacheTable(): updates cache information
Resources
The cache service needs to maintain a cache table that links its own identifiers for cache data to cache ids. An entry should be added to this table whenever cache data is received, so that the XML document can be updated. Once the update is made, the table entries relating to it are removed.
For the p3GroupDistrib cache service one would maintain a table for groups and messages received.
<source lang="cpp">
// tells which group's msgs have been loaded or not
std::map p3GroupDistrib::grpCacheTable; // (grp id, cache id)

// holds info needed to determine what cache data to load and not load
pugi::xml_document p3GroupDistrib::cacheDoc;
</source>
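The "record on receipt, remove once written" behaviour described above could be sketched as below. `PendingCacheUpdates` is a hypothetical helper, and a plain vector of (grp id, cache id) pairs stands in for the XML document so that the example has no dependency on pugixml.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper that remembers cache data received since the history
// document was last updated.
class PendingCacheUpdates {
public:
    using Entry = std::pair<std::string, std::string>; // (grp id, cache id)

    // Called whenever cache data is received.
    void onCacheDataReceived(const std::string& grpId,
                             const std::string& cacheId) {
        mPending.emplace_back(grpId, cacheId);
    }

    // Append all pending entries to the document stand-in, then remove them
    // from the pending table. Returns the number of entries written.
    std::size_t flushInto(std::vector<Entry>& doc) {
        const std::size_t n = mPending.size();
        doc.insert(doc.end(), mPending.begin(), mPending.end());
        mPending.clear();
        return n;
    }

    std::size_t size() const { return mPending.size(); }

private:
    std::vector<Entry> mPending;
};
```

A call such as `updateCacheDoc()` would presumably play the role of `flushInto()` here, although how it is implemented is not stated in this document.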
Design / Model
History Document
This contains enough information to build the cache table without loading the cache data.
<source lang="xml">
<group>
  <grpId> id </grpId>
  <rs_peer_id> peerid </rs_peer_id>
  <subId> subId </subId>
  <message>
    <rs_peer_id> peerid </rs_peer_id>
    <subId> subId </subId>
  </message>
</group>
</source>
The values of the XML elements (the ids) are the information actually sent in the callback that loads uncached data, which enables the service to build the cache XML document.
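For illustration, a group record with the structure shown above could be written out as follows. This sketch uses only the standard library rather than pugixml. `CacheRef`, `GroupHistory` and `toXml()` are hypothetical names, and ids are assumed to contain no characters that need XML escaping.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical pair of values identifying one cache.
struct CacheRef {
    std::string peerId;
    std::string subId;
};

// Hypothetical in-memory form of one <group> element.
struct GroupHistory {
    std::string grpId;
    CacheRef grpCache;               // cache holding the group itself
    std::vector<CacheRef> msgCaches; // caches holding the group's messages
};

// Serialise one group to the element layout shown above.
// No escaping is done: ids are assumed to be XML-safe.
std::string toXml(const GroupHistory& g) {
    std::ostringstream out;
    out << "<group>\n"
        << "  <grpId>" << g.grpId << "</grpId>\n"
        << "  <rs_peer_id>" << g.grpCache.peerId << "</rs_peer_id>\n"
        << "  <subId>" << g.grpCache.subId << "</subId>\n";
    for (const auto& m : g.msgCaches) {
        out << "  <message>\n"
            << "    <rs_peer_id>" << m.peerId << "</rs_peer_id>\n"
            << "    <subId>" << m.subId << "</subId>\n"
            << "  </message>\n";
    }
    out << "</group>\n";
    return out.str();
}
```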
Configuration
This is the more complicated part. A few points on keeping things tidy (these can be clarified in more detail with the use cases):
- New messages and groups will keep arriving, and this has to be dealt with
- Newly arrived messages are loaded into memory, as the history system needs further metadata from the cache file
- On load, the cache does not load any old cache data (i.e. messages) until requested
- On start-up the service loads all group cache data
- Requests for messages update the cache table to say the data has been loaded
- The cache document maintains consistency by being updated before shut-down to finish adding any newly received cache data (i.e. messages)
- When new cache data is received, the service should determine whether to load it or not (cache/load groups, not msgs)
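One possible reading of the load rules in the list above is sketched here. `CacheKind` and `shouldLoadNow()` are hypothetical, and the exact policy is an assumption drawn from those points rather than a confirmed part of the design.

```cpp
// Hypothetical classification of a piece of cache data.
enum class CacheKind { Group, Message };

// Decide whether cache data should be loaded straight away.
//   historical: the data was listed in the history file from a past session
//   requested:  the user has asked for this group's messages
bool shouldLoadNow(CacheKind kind, bool historical, bool requested) {
    if (kind == CacheKind::Group) return true; // groups are always loaded
    if (!historical) return true;              // new messages load immediately
    return requested;                          // old messages wait for a request
}
```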
Use Cases / Interactions
Flow diagrams of how it interacts are useful here.
Initial Loading of data/config
On load of data, if the XML file is available, historical data (data listed in the file) is not loaded.
Runtime
When new data arrives, it is marked as 'not historical' and loaded immediately. If it requires a group that is not loaded, or whose messages are not loaded, this dependency is resolved first (the group and its messages are loaded first).
Shutdown
The XML document, which represents the current state of the cache data, is saved.
References
Is this Entity described in other documents (Interactions or More Details),
Benefits, assumptions, risks/issues
Benefit
May save around 30% of current (nominal) system memory usage
Assumptions
The actual benefit of this implementation can only be realised if the GUI loads cache service data as it needs it
Risks/Issues
None identified. The design seems plausible but is not confirmed yet (does the current architecture need more functionality aside from what is mentioned here?)
TODO list
- be able to handle new msg notices
- handle case of cached subscribed messages which should not be marked as newly subscribed (and cache forwarded)
- GUI interaction: load on demand with current GUI