Design and Analysis of a Fully-Distributed Parallel Packet Switch with Buffered Demultiplexers
Abstract
A Parallel Packet Switch (PPS) is a multistage switch that builds a very high-speed switch out of much slower devices. A PPS generally has three stages. The center stage consists of several packet switches that operate at a rate lower than the external line rate. At the input stage, demultiplexers spread incoming packets over the center-stage switches. At the output stage, packets destined for each output port are collected and, if necessary, reordered. The initially proposed PPS architecture relied on a centralized, high-complexity mechanism to distribute incoming packets over the center-stage switches [1]. To reduce this complexity, a distributed algorithm was proposed in [2] that performs the packet distribution at each demultiplexer independently. The algorithmic complexity of that scheme is on the order of K^2, where K is the number of center-stage switches, which poses a scalability problem at high speeds as K grows. In addition, each demultiplexer requires a high-speed buffer operating at the external line rate. In this paper, we propose a fully distributed algorithm (operating at the level of each input line) with a minimal complexity of O(1). Moreover, the demultiplexer buffer in the proposed architecture operates at the low internal link rate. We show that the performance of our architecture is comparable to that of [2]. In particular, we prove that it is stable without any speedup, that is, a bounded delay is guaranteed. The resulting PPS architecture is simpler and more practical to implement.
Keywords
system design, parallel packet switch, load balancing, synchronization, multistage switch