Kumar, Shashi
Publications (10 of 48)
Badri, S., Holsmark, R. & Kumar, S. (2012). Junction Based Routing: A Scalable Technique To Support Source Routing in Large NoC Platforms. In: Proceedings of Network on Chip Architectures 2012. Paper presented at Fifth International Workshop on Network on Chip Architectures (NoCArc 2012), 1st Dec. 2012, Vancouver, BC, Canada (pp. 45-50). ACM Digital Library
2012 (English) In: Proceedings of Network on Chip Architectures 2012, ACM Digital Library, 2012, p. 45-50. Conference paper, Published paper (Refereed)
Abstract [en]

To support communication among hundreds of cores on a chip, on-chip communication must be well organized. In embedded systems using such a chip, the communication patterns can be profiled off-line and routing can be well planned. Source routing has been shown to be suitable in such contexts. However, source routing has one serious drawback: the overhead of storing the path information in the header of every packet. This disadvantage becomes worse as the size of the network grows. In this paper we propose a technique, called Junction Based Routing (JBR), to remove this limitation. In the proposed technique, path information for only a few hops is stored in the packet header.

With this information, the packet either reaches the destination or reaches a junction from which the path information for the onward path is picked up. There are many interesting issues related to this approach. We discuss and solve two important issues related to JBR, namely, the required number of junctions and their positions, and path computation for efficient deadlock-free routing. A simulator has been developed to evaluate the performance of JBR and compare it with simple source routing. We observe that JBR has slightly worse performance than pure source routing for packets with large payloads, but JBR has the potential for higher performance for packets with small payloads.
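
The hop-by-hop idea in the abstract above can be illustrated with a small sketch. This is not the authors' implementation: the function `deliver`, the table layout `junction_tables`, the port letters, and the header limit are all hypothetical names chosen for illustration, and the mesh coordinates are an assumption. The packet header holds only a short segment of the path; when the segment is used up before the destination is reached, the packet must be at a junction, which supplies the next segment.

```python
from typing import Dict, List, Tuple

Node = Tuple[int, int]

# Hypothetical per-hop moves on a 2D mesh (assumption, not from the paper).
MOVES = {"N": (0, 1), "S": (0, -1), "E": (1, 0), "W": (-1, 0)}


def deliver(src: Node, dst: Node, first_segment: List[str],
            junction_tables: Dict[Node, Dict[Node, List[str]]],
            max_hops_in_header: int) -> List[Node]:
    """Follow the hops in the header; reload the header at a junction."""
    pos = src
    header = list(first_segment)
    visited = [pos]
    while pos != dst:
        if not header:
            # Header exhausted before the destination: the current node
            # must be a junction that stores the onward path segment.
            header = list(junction_tables[pos][dst])
        # The header never carries more than the allowed number of hops.
        assert len(header) <= max_hops_in_header
        dx, dy = MOVES[header.pop(0)]
        pos = (pos[0] + dx, pos[1] + dy)
        visited.append(pos)
    return visited


# Illustrative set-up: node (2, 0) acts as a junction for destination (4, 0).
tables = {(2, 0): {(4, 0): ["E", "E"]}}
route = deliver((0, 0), (4, 0), ["E", "E"], tables, max_hops_in_header=2)
```

In this toy set-up a four-hop route is covered by a header that never holds more than two hops, at the cost of one table lookup at the junction.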

Place, publisher, year, edition, pages
ACM Digital Library, 2012
Keywords
Network on Chip, Source Routing, Deadlock Free Routing, Junction, Router Architecture
National Category
Engineering and Technology
Identifiers
urn:nbn:se:hj:diva-20283 (URN) 10.1145/2401716.2401727 (DOI) 978-1-4503-1540-1 (ISBN)
Conference
Fifth International Workshop on Network on Chip Architectures (NoCArc 2012), 1st Dec. 2012, Vancouver, BC, Canada
Available from: 2013-01-18 Created: 2013-01-18 Last updated: 2018-09-14 Bibliographically approved
Holsmark, R., Kumar, S. & Palesi, M. (2010). A Multi-Level Routing Scheme and Router Architecture to support Hierarchical Routing in Large Network on Chip Platforms. In: 4th Workshop on Highly Parallel Processing on a Chip (HPPC 2010). Paper presented at 4th Workshop on Highly Parallel Processing on a Chip (HPPC 2010).
2010 (English) In: 4th Workshop on Highly Parallel Processing on a Chip (HPPC 2010), 2010. Conference paper, Published paper (Refereed)
Abstract [en]

The concept of hierarchical networks is useful for designing a large heterogeneous NoC by reusing predesigned small NoCs as subnets. It can also be helpful when analyzing and designing a large NoC as an interconnection of subnets at a higher level of abstraction. Hierarchical deadlock-free routing is required to enable deadlock-free interconnection of sub-networks with different internal routing algorithms. In this paper we show that multi-level addressing is a cost-effective implementation option for hierarchical deadlock-free routing. We propose a two-level routing scheme, which is not only efficient, but also enables co-existence of algorithmic and table-based implementation in one router. A hierarchical view of the network simplifies addressing of network nodes and address decoding in the router. Synthesis results show that a 2-level hierarchical router design for an 8x8 NoC can reduce area and power requirements by up to ~20%, as compared to a router for the flat network. This work also proposes a new possibility for increasing the number of nodes available for subnet-to-subnet interfaces, while keeping the properties of hierarchical deadlock-freedom. We evaluate and discuss the communication performance in a 2-level hierarchical network for various subnet interface set-ups and traffic situations. A cycle accurate simulator has been developed and used for this purpose.

Keywords
Networks on Chip, Hierarchical Networks, Deadlock Free Routing, Router Architecture
National Category
Computer Engineering Other Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
urn:nbn:se:hj:diva-13124 (URN)
Conference
4th Workshop on Highly Parallel Processing on a Chip (HPPC 2010)
Available from: 2010-09-17 Created: 2010-09-14 Last updated: 2018-01-12 Bibliographically approved
Palesi, M., Holsmark, R., Wang, X., Kumar, S., Yang, M., Jiang, Y. & Catania, V. (2010). A Novel Mechanism to Guarantee In-Order Packet Delivery with Adaptive Routing Algorithms in Networks on Chip. In: 13th Euromicro Conference On Digital System Design Architectures, Methods and Tools. Paper presented at 13th Euromicro Conference On Digital System Design Architectures, Methods and Tools, Lille, France, 1-3 September, 2010.
2010 (English) In: 13th Euromicro Conference On Digital System Design Architectures, Methods and Tools, 2010. Conference paper, Published paper (Refereed)
Abstract [en]

Although adaptive routing algorithms promise higher communication performance, as compared to deterministic routing algorithms, they suffer from the out-of-order packet delivery problem. In the context of Network on Chip, the area and computational overhead of ordering packets at the destination is high and may reverse any gain achieved through the use of adaptivity of the routing algorithm. In this paper, we describe a novel scheme for ensuring in-order packet delivery while retaining the performance advantages of adaptive routing. The hardware architecture of a router that supports the proposed scheme is described. Although the basic idea in our proposal is topology independent, we evaluate and compare the performance of our scheme with both deterministic and adaptive routing algorithms for 2D mesh NoC. As compared to the XY routing algorithm, our technique significantly reduces the packet delay and improves the saturation point. The impact on router area and power dissipation is also discussed. Although the power consumption of routers increases, the energy consumption per flit increases less than 2% on average, since the higher performance allows for draining more traffic during a certain time window.

Keywords
Network on Chip, Routing Algorithm, Router Design, Performance Analysis, Adaptivity, In-order packet delivery
National Category
Computer Engineering Other Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
urn:nbn:se:hj:diva-13127 (URN)
Conference
13th Euromicro Conference On Digital System Design Architectures, Methods and Tools Lille, France, 1-3 September, 2010
Available from: 2010-09-14 Created: 2010-09-14 Last updated: 2018-01-12
Mubeen, S. & Kumar, S. (2010). Designing Efficient Source Routing for Mesh Topology Network on Chip Platforms. In: IEEE Euro-Micro Digital System Design 2010. Paper presented at IEEE Euro-Micro Digital System Design 2010. Los Alamitos, California: IEEE Computer Society
2010 (English) In: IEEE Euro-Micro Digital System Design 2010, Los Alamitos, California: IEEE Computer Society, 2010. Conference paper, Published paper (Refereed)
Abstract [en]

Efficient on-chip communication is very important for exploiting the enormous computing power available on a multi-core chip. Network on Chip (NoC) has emerged as a competitive candidate for implementing on-chip communication. Routing algorithms significantly affect the performance of a NoC. Most of the existing NoC architectural proposals advocate distributed routing algorithms for building NoC platforms. Although source routing offers many advantages, researchers have avoided it due to its apparent disadvantage of a larger header size requirement that results in lower bandwidth utilization. In this paper we make a strong case for the use of source routing for NoCs, especially for platforms with small sizes and regular topologies. We present a methodology to compute application specific efficient paths for communication among cores with a high degree of load balancing. The methodology first selects the most appropriate deadlock free routing algorithm, from a set of routing algorithms, based on the application’s traffic patterns. Then the selected (possibly adaptive) routing algorithm is used to compute efficient static paths with the goal of link load balancing. We demonstrate through simulation based evaluation that source routing has the potential to achieve higher performance, for example up to 28% lower latency even at medium load, as compared to distributed routing. A simple scheme is proposed for encoding of router ports to reduce the header overhead. A generic simulator was developed for evaluation and performance comparison between source routing and distributed routing. We also designed a router to support source routing for mesh topology NoC platforms.
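
The abstract above does not spell out the port-encoding scheme, so the following is only a generic sketch of how a source route can be packed into a header, under the assumption of a 2D mesh router with four output directions. The code table `PORT_CODE` and the functions `encode_path` and `decode_path` are hypothetical; with four directions, each hop costs two header bits.

```python
# Assumed 2-bit codes for the four mesh directions (illustrative only).
PORT_CODE = {"N": 0b00, "E": 0b01, "S": 0b10, "W": 0b11}
CODE_PORT = {code: port for port, code in PORT_CODE.items()}


def encode_path(ports):
    """Pack a list of output ports into an integer header, 2 bits per hop.

    Returns the header value and the number of header bits used.
    """
    header = 0
    for hop, port in enumerate(ports):
        header |= PORT_CODE[port] << (2 * hop)
    return header, 2 * len(ports)


def decode_path(header, nbits):
    """Recover the list of output ports, first hop in the lowest bits."""
    return [CODE_PORT[(header >> shift) & 0b11] for shift in range(0, nbits, 2)]


header, nbits = encode_path(["E", "E", "N", "W"])
ports = decode_path(header, nbits)
```

The sketch makes the trade-off in the abstract concrete: the header grows linearly with the path length, which is why source routing is argued to suit platforms with small sizes.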

Place, publisher, year, edition, pages
Los Alamitos, California: IEEE Computer Society, 2010
Keywords
Network on Chip (NoC); Distributed Routing; Source Routing; Routing Algorithms; Performance Analysis
National Category
Other Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
urn:nbn:se:hj:diva-12358 (URN)
Conference
IEEE Euro-Micro Digital System Design 2010
Note
The conference will be held in Sept. 2010 in France
Available from: 2010-06-02 Created: 2010-06-02 Last updated: 2010-09-09 Bibliographically approved
Palesi, M., Kumar, S. & Catania, V. (2010). Leveraging Partially Faulty Links Usage for Enhancing Yield and Performance in Network-on-Chip. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Vol. 29(3), 426-440
2010 (English) In: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, ISSN 0278-0070, E-ISSN 1937-4151, Vol. 29, no 3, p. 426-440. Article in journal (Refereed) Published
Abstract [en]

The communication infrastructure of a complex multicore system-on-a-chip is occupying an increasing fraction of the overall chip area. According to the International Technology Roadmap for Semiconductors, killer defect density does not decrease over successive technology generations. For this reason, the probability that a manufacturing defect affects the communication system is predicted to increase. In this paper, we deal with manufacturing defects which affect the links in a network-on-chip-based interconnection system. The goal of this paper is to show that by using effective routing functions, supported by appropriate selection policies and with a limited amount of extra logic in the router, it is easy to exploit partially faulty links to improve the performance of the system. We show that, instead of discarding partially faulty links, they can be used at reduced capacity to improve the distribution of the traffic over the network, yielding performance and power improvements. We couple an application-specific routing function with a set of selection policies which are aware of link fault distribution and evaluate them on both synthetic traffic and a real complex multimedia application. We also present an implementation of the router, augmented with the extra logic, to support both the proposed selection functions and the transmission of messages over partially faulty links. We analyze the router in terms of silicon area, timing, and power dissipation.

Place, publisher, year, edition, pages
IEEE Publisher, 2010
Keywords
Application Specific routing, congestion, fault tolerance, network-on-chip, performance analysis, router design, routing algorithm
National Category
Other Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
urn:nbn:se:hj:diva-11582 (URN) 10.1109/TCAD.2010.2041851 (DOI)
Available from: 2010-02-05 Created: 2010-02-05 Last updated: 2017-12-12 Bibliographically approved
Palesi, M., Holsmark, R., Kumar, S. & Catania, V. (2009). Application Specific Routing Algorithms for Networks on Chip. IEEE Transactions on Parallel and Distributed Systems, 20(3), 316-330
2009 (English) In: IEEE Transactions on Parallel and Distributed Systems, ISSN 1045-9219, E-ISSN 1558-2183, Vol. 20, no 3, p. 316-330. Article in journal (Refereed) Published
Abstract [en]

In this paper we present a methodology to develop efficient and deadlock free routing algorithms for Network-on-Chip (NoC) platforms which are specialized for an application or a set of concurrent applications. The proposed methodology, called application specific routing algorithm (APSRA), exploits the application specific information regarding pairs of cores which communicate and other pairs which never communicate in the NoC platform to maximize communication adaptivity and performance. The methodology also exploits the known information regarding concurrency/non-concurrency of communication transactions among cores for the same purpose. We demonstrate, through analysis of adaptivity as well as simulation based evaluation of latency and throughput, that algorithms produced by the proposed methodology give significantly higher performance as compared to other deadlock free algorithms for both homogeneous as well as heterogeneous 2D mesh topology NoC systems. For example, for homogeneous mesh NoC, APSRA results in approximately 30% less average delay as compared to odd-even algorithm just below saturation load. Similarly the saturation load point for APSRA is significantly higher as compared to other adaptive routing algorithms for both homogeneous and non-homogeneous mesh networks.

Place, publisher, year, edition, pages
New York: IEEE Computer Society, 2009
Keywords
2D mesh topology, adaptive routing algorithms, application specific routing algorithms, deadlock free routing algorithms, network-on-chip platforms
National Category
Other Electrical Engineering, Electronic Engineering, Information Engineering Computer Sciences
Identifiers
urn:nbn:se:hj:diva-10927 (URN) 10.1109/TPDS.2008.106 (DOI)
Available from: 2009-11-26 Created: 2009-11-26 Last updated: 2018-01-12 Bibliographically approved
Palesi, M., Kumar, S. & Catania, V. (2009). Bandwidth-Aware Routing Algorithms for Networks-on-Chip Platforms. IET Computers and Digital Techniques, 3(5), 413-429
2009 (English) In: IET Computers and Digital Techniques, ISSN 1751-8601, Vol. 3, no 5, p. 413-429. Article in journal (Refereed) Published
Abstract [en]

General purpose routing algorithms for a network-on-chip (NoC) platform may not be able to provide sufficient performance for some communication intensive applications. This may be because of the low adaptivity offered by a general purpose routing algorithm, resulting in some links getting highly congested. In this study the authors demonstrate that it is possible to design highly efficient application-specific routing algorithms which distribute traffic more uniformly by using information regarding the application's communication behaviour (communication topology and communication bandwidth). The authors use off-line analysis to estimate the expected load on various links in the network. The result of this analysis is used along with the available routing adaptivity in each router to distribute less traffic to links and paths which are expected to be congested. The methodology for application-specific routing algorithms is extended to incorporate these features to design highly adaptive deadlock-free routing algorithms which also distribute traffic more uniformly and reduce network congestion. The authors discuss architectural implications and analyse area and power overheads of the proposed approach on the design of a table-based NoC router.

National Category
Computer Engineering
Identifiers
urn:nbn:se:hj:diva-11413 (URN) 10.1049/iet-cdt.2008.0082 (DOI)
Available from: 2010-01-21 Created: 2010-01-21 Last updated: 2018-01-12 Bibliographically approved
Tornero, R., Kumar, S. & Mubeen, S. (2009). Distance Constrained Mapping to Support NoC Platforms based on Source Routing. In: Martti Forsell and Jesper Larsson Träff (Ed.), 3rd Highly Parallel Processing on Chip (HPPC 09) workshop, August 2009, Delft, Netherlands. Paper presented at EuroPar 2009 (pp. 8-17).
2009 (English) In: 3rd Highly Parallel Processing on Chip (HPPC 09) workshop, August 2009, Delft, Netherlands / [ed] Martti Forsell and Jesper Larsson Träff, 2009, p. 8-17. Conference paper, Published paper (Refereed)
Identifiers
urn:nbn:se:hj:diva-11416 (URN)
Conference
EuroPar 2009
Available from: 2010-01-21 Created: 2010-01-21 Last updated: 2010-02-05 Bibliographically approved
Holsmark, R., Kumar, S., Palesi, M. & Mejia, A. (2009). HiRA: A methodology for deadlock free routing in hierarchical networks on chip. In: Networks-on-Chip, 2009. NoCS 2009. 3rd ACM/IEEE International Symposium on (pp. 2-11). IEEE Computer Society
2009 (English) In: Networks-on-Chip, 2009. NoCS 2009. 3rd ACM/IEEE International Symposium on, IEEE Computer Society, 2009, p. 2-11. Conference paper, Published paper (Refereed)
Abstract [en]

The complexity of designing large and complex NoCs can be reduced and managed by using the concept of hierarchical networks. In this paper, we propose a methodology for the design of deadlock free routing algorithms for hierarchical networks, by combining routing algorithms of component subnets. Specifically, our methodology ensures reachability and deadlock freedom for the complete network if the routing algorithms for the subnets are deadlock free. We evaluate and compare the performance of hierarchical routing algorithms designed using our methodology with routing algorithms for corresponding flat networks. We show that hierarchical routing, combining the best routing algorithm for each subnet, has the potential to provide better performance than using any single routing algorithm. This is observed for both synthetic traffic and traffic from real applications. We also demonstrate, by measuring jitter in throughput, that hierarchical routing algorithms lead to a smoother flow of network traffic. A router architecture that supports scalable table-based routing is briefly outlined.

Place, publisher, year, edition, pages
IEEE Computer Society, 2009
Keywords
deadlock free routing, hierarchical networks, hierarchical routing algorithms, network-on-chip
National Category
Other Electrical Engineering, Electronic Engineering, Information Engineering Computer Sciences
Identifiers
urn:nbn:se:hj:diva-10928 (URN) 10.1109/NOCS.2009.5071439 (DOI) 978-1-4244-4142-6 (ISBN)
Available from: 2009-11-26 Created: 2009-11-26 Last updated: 2018-01-12 Bibliographically approved
Mejia, A., Palesi, M., Flich, J., Kumar, S., Lopez, P., Holsmark, R. & Duato, J. (2009). Region-Based Routing: A Mechanism to Support Efficient Routing Algorithms in NoCs. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 17(3), 356-369
2009 (English) In: IEEE Transactions on Very Large Scale Integration (VLSI) Systems, ISSN 1063-8210, E-ISSN 1557-9999, Vol. 17, no 3, p. 356-369. Article in journal (Refereed) Published
Abstract [en]

An efficient routing algorithm is important for large on-chip networks [network-on-chip (NoC)] to provide the required communication performance to applications. Implementing a NoC using table-based switches provides many advantages, including the possibility of changing routing algorithms and fault tolerance, due to the option of table reconfiguration. However, table-based switches have been considered unsuitable for NoCs due to their perceived high area and power consumption. In this paper, we describe the region-based routing (RBR) mechanism, which groups destinations into network regions, allowing an efficient implementation with logic blocks. RBR can also be viewed as a mechanism to reduce the number of entries in routing tables. RBR is general and can be used in conjunction with any adaptive routing algorithm. In particular, we have evaluated the proposed scheme in conjunction with a general routing algorithm, namely segment-based routing (SR), and an application specific routing algorithm (APSRA) using regular and irregular mesh topologies. Our study shows that the number of entries in the table is significantly reduced, especially for large networks. Evaluation results show that RBR requires only four regions to support several routing algorithms in a 2-D mesh with no performance degradation. Considering link failures, our results indicate that RBR combined with SR is able to tolerate up to 7 link failures in an 8x8 mesh. RBR also reduces area and power dissipation of an equivalent table-based implementation by factors of 8 and 10, respectively. Moreover, the degradation in performance of the network is insignificant when using APSRA combined with RBR.
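
The grouping of destinations into regions described in the abstract above can be pictured with a short sketch. It is a simplification under stated assumptions, not the mechanism as published: regions are taken to be axis-aligned rectangles of mesh coordinates, each tagged with the output ports it permits, and the names `Region` and `allowed_ports` are hypothetical. A router then holds a handful of regions instead of one table entry per destination.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set, Tuple


@dataclass
class Region:
    """A rectangle of destination coordinates and the ports allowed for it."""
    x_min: int
    y_min: int
    x_max: int
    y_max: int
    ports: FrozenSet[str]

    def contains(self, dst: Tuple[int, int]) -> bool:
        x, y = dst
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max


def allowed_ports(regions: List[Region], dst: Tuple[int, int]) -> Set[str]:
    """Union of the output ports of every region that contains dst."""
    ports: Set[str] = set()
    for region in regions:
        if region.contains(dst):
            ports |= region.ports
    return ports


# Illustrative regions for a router at (1, 1) in a 4x4 mesh (assumed layout).
regions_at_1_1 = [
    Region(2, 0, 3, 3, frozenset({"E"})),  # everything to the east
    Region(0, 0, 0, 3, frozenset({"W"})),  # everything to the west
    Region(1, 2, 1, 3, frozenset({"N"})),  # same column, above
    Region(1, 0, 1, 0, frozenset({"S"})),  # same column, below
]
```

In this assumed layout, four region entries cover all fifteen other destinations of the 4x4 mesh, which shows the kind of table reduction the abstract refers to.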

Keywords
adaptive routing algorithm, application specific routing algorithm, fault tolerance, large on-chip networks, network-on-chip, region-based routing mechanism, segment-based routing, table-based switches
National Category
Other Electrical Engineering, Electronic Engineering, Information Engineering Computer Sciences
Identifiers
urn:nbn:se:hj:diva-10930 (URN) 10.1109/TVLSI.2008.2012010 (DOI)
Available from: 2009-11-26 Created: 2009-11-26 Last updated: 2018-01-12 Bibliographically approved