Distributed Large Installation Framework
Concept 1.1

author: Dariusz 'tdi' Dwornikowski

History:
  22.09.2005 - The document was created (tdi)
  23.09.2005 - Added failover section and Wave term

1. Introduction
2. General idea
  2.1 DLIF assumptions
  2.2 Model
    2.2.1 Simple model
    2.2.2 Simple model with PBH
  2.3 Communication proposal
    2.3.1 Protocol
    2.3.2 Example scenario
    2.3.3 Failover proposal
  2.4 Security
3. Summary


1. Introduction

DLIF, in its rough concept, is a framework for maintaining large numbers
of identical or nearly identical hosts. It is intended mostly for system
administrators. It is not supposed to be easy to use - it is supposed to
be functional and very reliable. DLIF provides only the method of
communication between the machines, leaving the way of syncing to the
modules.


2. General idea

DLIF is based on the idea of propagation and gossiping. Below are the
basic assumptions that DLIF wants to achieve.

2.1 DLIF assumptions:

  * high reliability
  * low network traffic
  * high level of security
  * scalability
  * server-client architecture
  * modularization
  * self updating

2.2 Model

"Queen Bee Host" (QBH) is the initial image, the ancestor of all hosts
in all networks.
"Host" is a machine which gets its image from the QBH.
"Network" is the medium and all layer 1 and 2 devices.
"Princess Bee Host" (PBH) is the initial image for a particular network,
similar to a router in networking terms.
"Wave" is a successful sync of all hosts.

2.2.1 Simple model

      --- Direction of Gossip propagation --- >

 _______                   __ Host 1
|       |                 /
| Queen |-----| Network | --- Host ..
|  Bee  |                 \
| Host  |                  \__ Host N
|_______|

2.2.2 Simple model with PBH

      --- Direction of Gossip propagation --- >

                                         ______
 _______                     __ Host 1  |      |                   ___ Host 1
|       |                   /           | PBH  |                  /
| Queen |-----| Network A | ------------|      |--- | Network B | --- Host ...
|  Bee  |                   \           |      |                  \
| Host  |                    \__ Host N |______|                   `--- Host M
|_______|

2.3 Communication proposal

Communication between DLIF hosts is based on gossiping that originates
from the QBH, triggered either manually by the administrator or by a
cron job. The general mathematical idea of gossiping can be found here:

  * http://mathworld.wolfram.com/Gossiping.html

(MORE)

2.3.1 Protocol

DLIF communication can be based on its own protocol (probably similar
to SIP).

TODO

2.3.2 Example scenario

  a) QBH triggers an event by propagating gossip to the hosts it knows,
     along with the hosts list.
  b) host N receives the READY event from the QBH or from one of the
     other hosts.
  c) host N starts initiation of syncing ( TODO )
  d) syncing begins (rsync, netcat, cvs, svn, own module)
  e) host N stores some CLOCK, so that it does not sync a second time
  f) host N propagates information about the finished sync to others
     ( those that are not on the list host N received from QBH )
  g) host N now serves as an image host to others
  h) the whole procedure is finished

2.3.3 Failover proposal

  * Newly connected host

    It may happen that a host connected after the "Wave" does not have
    the latest image. In that situation the newly connected host can
    trigger a sync by itself, by sending a message to the QBH or PBH.

  * Certainty of unification

    Robert Nowak proposed a modified communication scenario in which
    the QBH first syncs with the PBHs, and only after that do the PBHs
    propagate the new image within their networks. This can
    significantly reduce the traffic within one network and gives a
    more reliable way of spreading the image. The disadvantage of this
    method is that the QBH would have to know about all the PBHs in its
    network.

2.4 Security

Proposal:

  * built-in ACLs
  * coded messages
  * PAM
  * chroot
  * Xen
  * SSL
  * self-control (by MD5 sums stored outside)

TODO


3. Summary

This is only a draft concept. Feel free to submit ideas.
Write to: tdi@pozman.pl
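
As an illustration of the example scenario in section 2.3.2, the steps can be simulated in a few lines of Python. This is only a sketch under assumptions, not a DLIF protocol: the names `Host`, `run_wave` and `peers`, the use of an integer image version as the CLOCK, and the idea that the hosts list grows as it is forwarded with the gossip are all hypothetical, and the actual sync (rsync, svn, own module) is reduced to a comment.

```python
from collections import deque


class Host:
    """A host in the sketch; `clock` is the last image version it synced."""

    def __init__(self, name):
        self.name = name
        self.peers = []            # hosts this host can gossip to (assumed)
        self.clock = 0             # step e: remembered to avoid a second sync
        self.serves_image = False  # step g: becomes an image host after sync

    def receive_ready(self, version, notified):
        """Handle a READY gossip; return (synced?, peers to notify next)."""
        if self.clock >= version:
            return False, []       # already synced in this wave, ignore
        # steps c-d: a real module (rsync, svn, ...) would copy the image here
        self.clock = version       # step e
        self.serves_image = True   # step g
        # step f: pass the gossip on to peers missing from the received list
        return True, [p for p in self.peers if p.name not in notified]


def run_wave(known_to_qbh, version):
    """Step a: the QBH gossips READY to the hosts it knows, with their names.

    Returns the names of the hosts that synced, in the order they did so.
    """
    notified = {h.name for h in known_to_qbh}  # the hosts list sent along
    queue = deque(known_to_qbh)
    synced = []
    while queue:
        host = queue.popleft()                 # step b: host receives READY
        did_sync, to_notify = host.receive_ready(version, notified)
        if did_sync:
            synced.append(host.name)
        for peer in to_notify:
            notified.add(peer.name)            # the list grows as gossip spreads
            queue.append(peer)
    return synced                              # step h: the wave is finished


if __name__ == "__main__":
    h1, h2, h3, h4 = (Host(n) for n in ("h1", "h2", "h3", "h4"))
    h1.peers = [h2, h3]
    h2.peers = [h1]
    h3.peers = [h4]
    # The QBH only knows h1 and h2; h3 and h4 are reached through gossip.
    print(run_wave([h1, h2], version=1))
```

Running a second wave with the same version syncs nothing, because every host compares the incoming version with its stored CLOCK before doing any work.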