Posted by mheydt on 11/24/2007 4:52 PM | Comments (0)
While thinking about how to distribute content to other members of a community, it became clear to me that handling distribution at the file level would probably be too complicated.  So I decided to introduce the concept of a package: loosely, a set of one or more files that can be published by a particular member of the community.

This does several things to make publishing content easier.  First, multiple files can be published as a single group, which allows versioning to be handled in the aggregate instead of at the micro level.  Second, published content is often just a number of files that are associated with each other anyway (such as an application or a set of songs from an album).
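As a rough sketch of what such a package might look like, assuming one version number for the whole set and a per-file digest (the names `Package`, `PackageFile`, `make_file`, and `republish`, and the use of SHA-1, are illustrative assumptions and not anything p2pSB defines):

```python
from dataclasses import dataclass
import hashlib


@dataclass(frozen=True)
class PackageFile:
    path: str   # path relative to the package root
    sha1: str   # content digest of this one file


@dataclass(frozen=True)
class Package:
    publisher_id: str   # node ID of the publishing member (hypothetical field)
    name: str
    version: int        # one version number for the whole set of files
    files: tuple        # tuple of PackageFile

    def package_id(self) -> str:
        """Derive an ID from the publisher, name, version and file digests."""
        h = hashlib.sha1()
        h.update(f"{self.publisher_id}/{self.name}/{self.version}".encode())
        # Sort by path so the ID does not depend on the order files were added.
        for f in sorted(self.files, key=lambda f: f.path):
            h.update(f.path.encode())
            h.update(f.sha1.encode())
        return h.hexdigest()


def make_file(path: str, data: bytes) -> PackageFile:
    """Describe one file by its path and the digest of its contents."""
    return PackageFile(path, hashlib.sha1(data).hexdigest())


def republish(pkg: Package, files: tuple) -> Package:
    """Publish a new file set as the next version of the same package."""
    return Package(pkg.publisher_id, pkg.name, pkg.version + 1, files)
```

Adding a song to an album would then be a single `republish` call that bumps the version of the group as a whole, rather than tracking a version for each file.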

Third, it gives identity in the network to the data instead of to the actual users.  Conceptually, for others to find data it must be identifiable in the network, and storing information throughout the network about which system is hosting which packages will make it easier to find and share content.
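One way to picture the "which system is hosting which packages" information is as a small index that any node could keep. This is only a sketch under my own assumptions: `PackageIndex` and its methods are hypothetical names, and how the entries would actually be spread through the network is left open.

```python
from collections import defaultdict


class PackageIndex:
    """Maps a package ID to the set of node IDs known to host it."""

    def __init__(self):
        self._hosts = defaultdict(set)  # package_id -> set of node IDs

    def announce(self, node_id, package_id):
        """Record that node_id is hosting package_id."""
        self._hosts[package_id].add(node_id)

    def withdraw(self, node_id):
        """Forget everything a node hosted, e.g. when it goes offline."""
        for package_id in list(self._hosts):
            self._hosts[package_id].discard(node_id)
            if not self._hosts[package_id]:
                del self._hosts[package_id]

    def hosts_of(self, package_id):
        """Return the known hosts of a package, sorted for stable output."""
        return sorted(self._hosts.get(package_id, ()))
```

The point is that a lookup starts from the package's identity, not from knowing which user has it.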

Categories: p2pSB

Posted by mheydt on 11/13/2007 4:54 PM | Comments (0)
One question that always comes up is how a peer finds other peers in the network. The first time the system is run, the local node will not know of any other peers. Assuming that a broadcast on the network is not possible, the node must instead make a directed connection to one other peer at a predetermined address, which the client must be configured to know about. Many refer to this particular type of peer as a "rendezvous" server, whose responsibility is to be located at a static IP address (or a dynamic one that is reachable through DNS) and to handle initial requests from peers to connect to the peer network. I also believe that other types of peer networks refer to these rendezvous servers as "super nodes".

To accomplish this in p2pSB, a node can be configured with a static set of peer nodes in the configuration file, and those peers can be identified as having a rendezvous service available. Upon initialization, a p2pSB node will attempt to connect to all nodes known to have rendezvous services. Upon receiving a connect message from another node, the rendezvous service returns an acknowledgment message to the sender, which the originator can use to know that it is now connected to the network and that it can start further peer discovery through those nodes. It is also through this reply that a node can identify its external IP address and pass it to other nodes so that they can communicate with it directly.
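The connect and acknowledge exchange can be sketched in memory, without any real sockets. Everything named here is an assumption for illustration (`PeerEntry`, `ConnectAck`, `RendezvousService`, `bootstrap`, and the idea that the acknowledgment carries the sender's address as the rendezvous node saw it); the actual p2pSB configuration format and message layout are not shown in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeerEntry:
    """One statically configured peer (hypothetical config shape)."""
    node_id: str
    address: str       # "host:port"
    rendezvous: bool   # True if this peer offers a rendezvous service


@dataclass(frozen=True)
class ConnectAck:
    rendezvous_id: str
    observed_address: str  # the sender's address as seen from outside


class RendezvousService:
    def __init__(self, node_id):
        self.node_id = node_id
        self.online = {}  # node_id -> external address

    def handle_connect(self, sender_id, source_address):
        """Remember the sender and echo back the address it connected from."""
        self.online[sender_id] = source_address
        return ConnectAck(self.node_id, source_address)


def bootstrap(node_id, source_address, config, services):
    """Connect to every configured rendezvous peer; return the acks received.

    `services` maps an address to a reachable RendezvousService and stands in
    for the network; a missing entry models a peer that cannot be reached.
    """
    acks = []
    for peer in config:
        if not peer.rendezvous:
            continue
        service = services.get(peer.address)
        if service is not None:
            acks.append(service.handle_connect(node_id, source_address))
    return acks
```

A node that gets at least one acknowledgment back knows it is connected, and `observed_address` is what it would hand to other nodes as its external connection point.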

If you want to think of this from another perspective, a rendezvous server in p2pSB can be a stripped-down peer node with no other services that operates very much like a tracker in a BitTorrent network: it keeps track of nodes as they come online and go offline, and provides a place for nodes to find others in the network and begin exchanging information. One difference is that you can designate your own node as providing this service, or put other nodes out in the network to handle it for you and to spread the load among peers connecting to the network.
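A tracker-like registry of this kind might look roughly as follows. `NodeRegistry` is a hypothetical name, and the timeout for nodes that vanish without saying goodbye is my own assumption, not something described above. The current time is passed in explicitly so the behaviour is easy to follow.

```python
class NodeRegistry:
    """Tracks which nodes are online and where they can be reached."""

    def __init__(self, ttl_seconds=60.0):
        self.ttl = ttl_seconds
        self._seen = {}  # node_id -> (address, last_seen)

    def online(self, node_id, address, now):
        """Record (or refresh) a node that has announced itself."""
        self._seen[node_id] = (address, now)

    def offline(self, node_id):
        """Remove a node that has announced it is leaving."""
        self._seen.pop(node_id, None)

    def lookup(self, node_id, now):
        """Return the node's address, or None if unknown or stale."""
        entry = self._seen.get(node_id)
        if entry is None:
            return None
        address, last_seen = entry
        if now - last_seen > self.ttl:
            # The node never said goodbye; treat it as gone.
            del self._seen[node_id]
            return None
        return address
```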

An additional benefit of this model is that in p2pSB, rendezvous nodes can be configured to bridge information to other rendezvous nodes, forming a virtual backbone that helps node discovery scale as the network grows. For example, the peer discovery service in p2pSB has a special message that it sends to a rendezvous node to ask for the locations of other rendezvous nodes, which the client can then use to search for other peers.
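Walking the backbone from the configured rendezvous nodes could be sketched as a breadth-first search. Here `ask_for_rendezvous` is a hypothetical stand-in for sending that special message to one address and getting back the addresses it reports; the function name and the address strings are assumptions.

```python
from collections import deque


def discover_rendezvous(seeds, ask_for_rendezvous):
    """Return every rendezvous address reachable from the seed addresses.

    Addresses are returned in the order they were first discovered, and each
    rendezvous node is asked only once.
    """
    known = list(dict.fromkeys(seeds))  # keeps order, drops duplicates
    seen = set(known)
    queue = deque(known)
    while queue:
        address = queue.popleft()
        for other in ask_for_rendezvous(address):
            if other not in seen:
                seen.add(other)
                known.append(other)
                queue.append(other)
    return known
```

With a backbone where `a` knows `b`, and `b` knows `a` and `c`, starting from `a` alone would yield `a`, `b`, `c`.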

Given this architecture, assume you are trying to find a particular node that provides a particular service in the network. You know of this node and its services because you previously became a buddy with it and remembered its node ID so that you could locate it again later. You can't store the IP address, as it may (and most likely will) change, so you just remember the ID of the buddy node. When connecting to the network, peer discovery services can identify all of the rendezvous nodes and then broadcast to them a peer location request for that particular node by ID. If any of the rendezvous servers knows that the peer is online, it will reply with the external connection point for that node so that communications can begin.

If discovery through the rendezvous servers fails, the client can then optionally broadcast to other known peers (those without a rendezvous service) to see if they know of that specific peer. Note that this is more limited in nature: it is the job of the rendezvous services / nodes to track any and all rendezvous announcements and report them to others upon request, whereas a normal peer only tracks nodes that it is currently (or was recently) in communication with, whether it established the connection itself or the other node did.
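Put together, locating a buddy by ID might be sketched like this: rendezvous nodes first, ordinary peers only as an optional second step. `locate_peer` and `ask` are hypothetical names; `ask(node, target_id)` stands in for sending a peer location request and returns the external connection point, or `None` if that node does not know the target.

```python
def locate_peer(target_id, rendezvous_nodes, known_peers, ask,
                allow_fallback=True):
    """Return (address, source) for target_id, or None if nobody knows it.

    `source` is "rendezvous" or "peer", showing which step found the node.
    """
    # Step 1: rendezvous nodes track every announcement, so ask them first.
    for node in rendezvous_nodes:
        address = ask(node, target_id)
        if address is not None:
            return address, "rendezvous"

    # Step 2 (optional): ordinary peers only know nodes they talked to
    # recently, so this is a weaker, best-effort lookup.
    if allow_fallback:
        for peer in known_peers:
            address = ask(peer, target_id)
            if address is not None:
                return address, "peer"

    return None
```

This version asks nodes one at a time and stops at the first answer, which is a simplification of a real broadcast where all requests would go out together.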