Cascading Distributed Processing Network

Cascading Distributed Processing Network

Postby John Payne on 30 Aug 2010, 02:26

Some years ago, I happened upon an idea for a network addressing scheme, inspired by the then-new Inmos Transputer, which combined a 32-bit RISC core and 4 fast (for that time) bidirectional serial ports on a single chip, allowing each Transputer to connect directly to as many as four others. My scheme placed the address to which a packet should be sent at the very front, with the bits for navigating through the first node in a path coming first, followed by the bits for navigating through the next node, and so forth.

As the packet passes through each successive node, the bits needed for determining where to send the packet next would be stripped off, and their converse added to a return address at the tail end of the packet. If a node's processor was idle at the time it received a packet (or, with dedicated routing circuitry and no contention for the outgoing port), it could already be sending the packet on to the next node before having received much more than those first few bits. This arrangement should result in packet forwarding so fast the word "cascading" is apt.

The Transputer, with four connections, would have needed a mere two bits per node, using source-relative addressing and assuming that messages would never be sent back out the same port upon which they arrived. This would have provided one code each for the other three connections, and one to indicate that the message was directed to a port within that node and needn't be sent further. I don't actually favor this approach. A couple of extra bits, four per node, will cover most situations without the need for decoding an address relative to the source of the message. Also, maintaining the option to send a message back out the port on which it arrived provides something analogous to a ping, for measuring transmission latency, and having a standard size address component would simplify determining the alignment of the port number and message within the packet. In the event that four bits isn't enough, a second nibble could be added; each node would know how many nibbles it used. This would still be very low overhead.

Not having the resources of a corporation behind me, and not being acquainted with a patent attorney, I did nothing with this idea aside from posting it on a conferencing system. It now occurs to me that such a network would be ideal for the sort of distributed processing likely to be used in any complex robotic system.

While data can and, in the case of a robot, frequently does originate in hardware, for transmission through the network it would be handed over to software. The target of any message passed through the network would also be software, residing on a port of a node, either a running process with run-loop and one or more message queues (or stacks, or a structure which combines queue and stack), or else some dormant reentrant code waiting to be invoked. And again, in the case of a robot, this receiving code would frequently send the unpacked message on to hardware, but the network address would refer to the code not the hardware, the handler for a stepper motor for instance, not the motor itself. Likewise, the return address would lead to a message handler of some sort.

Such a network might also be used to distribute processing, by breaking algorithms up into parts and distributing the parts along the path between a node where data originates and some destination node where processed data is needed. This approach should work particularly well for adding components to vectors as data is passed up the chain from a digit through an arm-joint to the core structure of a machine, or from a camera with zoom, pan, and tilt mounted on a head with several degrees of freedom.

A specific use for such distributed processing would be the case where a message needs to be sent to more than a single destination. In that event, a fork process could be installed in the node where the paths diverge, avoiding the need to send duplicate messages in succession through part of the network, thereby allowing copies to arrive in a manner as timely as the original. (If it isn't already, much the same idea could be used on the internet for media distribution, saving huge amounts of bandwidth.)

The processing nodes composing such a network would not need to all use the same architecture, but they would all need to implement a runtime providing what might be termed a network instruction set.

There would need to be administrative services, perhaps also distributed, for example to repair a broken path if some node included in it went down, or to determine whether a more efficient path might be available with the recovery of a node or the addition of new hardware. At the minimum each node should have a standard process capable of responding to a request for a list of active ports (ports that actually have software installed on them), as well as a list of address components leading to other active nodes, and each process should be able to respond to a standard request by identifying both the messages it sends and those it is capable of handling.

The Transputer is old hat. Most modern processors have on-chip serial ports these days, as well as other ways in which processors might communicate (shared RAM, or QuickPath Interconnect or HyperTransport), meaning both that the implementation of such a network should require little if any custom hardware, and also that it would need to accommodate and make good use of radically heterogenous connectivity.

To the extent that I understand ROS, I believe it's similar enough to benefit from this approach, and has advantages of its own to bring to the table, not least the master nameserver.
John Payne
Posts: 17
Joined: 24 Apr 2010, 18:46
Location: Boulder, Colorado

Return to Podcast

Who is online

Users browsing this forum: No registered users and 8 guests