Feature Articles: Initiatives for the Widespread Adoption of NetroSphere

MAGONIA (DPB: Distributed Processing Base) Applied to a Traffic Congestion Prediction and Signal Control System

Hiroaki Kobayashi, Takehiro Kitano, Mitsuhiro Okamoto, and Takeshi Fukumoto

Abstract

High computational capacity and high fault tolerance are required in advanced social infrastructures such as future intelligent transportation systems and smart grids that collect and analyze large amounts of data in real time and then provide feedback. This article introduces application examples of the distributed processing technology of MAGONIA, a new server architecture (part of NetroSphere), in a traffic congestion prediction and signal control system, one of the social infrastructures currently under development at NTT DATA.

Keywords: MAGONIA, distributed processing, intelligent transport system (ITS)

1. Introduction

NTT Network Service Systems Laboratories is currently carrying out research and development on MAGONIA, a new server architecture and part of the NetroSphere concept, to quickly create new services and drastically reduce CAPEX/OPEX (capital expenditures and operating expenses) [1, 2]. The distributed processing base (DPB), the core technology of MAGONIA, provides the high fault tolerance, scalability, and short response time that telecom systems need. Because these properties are also required in systems other than telecom systems, we aim to create new services and reduce development time by broadening the application range of MAGONIA. To this end, we have been collaborating with NTT DATA in experiments [3] since September 2015 to assess the applicability of the DPB to a traffic congestion prediction and signal control system [4].

2. Overview of joint experiments

The traffic congestion prediction and signal control system analyzes vehicle behavior in a target area from a large amount of sensor data obtained in real time (referred to below as traffic simulation) in order to control traffic signals based on the results (Fig. 1). The system requires high fault tolerance for commercialization. Since the computational power required by the traffic simulation changes with the traffic volume, the system must be scalable to flexibly cope with increases and decreases in load. In addition, a short response time is necessary to feed back the analysis results to traffic signals within the time limit.


Fig. 1. Traffic congestion prediction and signal control system overview.

These issues must be solved to enable commercialization of the traffic congestion prediction and signal control system. Therefore, the existing traffic simulation logic was ported to the MAGONIA DPB, and the resulting system was tested to assess its feasibility in terms of fault tolerance, scalability, and response time.

3. Computational model for traffic simulation

In traffic simulation, processing must be completed within the feedback cycle required for traffic signal control. To satisfy this requirement, the target area for simulation is divided into small areas (referred to as cells), and multiple computational nodes process the cells in parallel. Because vehicles move between cells during the simulation, the computations for each cell are not completely independent and require coordination between cells.

A model called bulk synchronous parallel (BSP) [5] is used to enable effective processing. BSP repeats a unit of processing called a super step, which consists of three phases: local computation (LC), communication (Com), and synchronization (Sync). The processing in each phase of traffic simulation is summarized in Fig. 2.


Fig. 2. Model for processing traffic simulation.
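As a rough illustration of the super step structure, the following Python sketch runs the three phases over a ring of cells in a single process. The ring layout, the rule that half of each cell's vehicles move to the next cell per step, and the function names are assumptions made for this example; they are not taken from the actual traffic simulation logic.

```python
def local_computation(vehicles):
    """LC: split a cell's vehicles into (staying, leaving).

    Hypothetical rule for illustration: half of the vehicles leave.
    """
    leaving = vehicles // 2
    return vehicles - leaving, leaving


def run_supersteps(cells, steps):
    """Run `steps` BSP super steps over a ring where cell i feeds cell i + 1."""
    n = len(cells)
    cells = list(cells)
    for _ in range(steps):
        # LC phase: every cell is updated using only its own state.
        results = [local_computation(v) for v in cells]

        # Com phase: each cell sends its leaving vehicles to the adjacent cell.
        inbox = [0] * n
        for i, (_, leaving) in enumerate(results):
            inbox[(i + 1) % n] += leaving

        # Sync phase: received data becomes visible only after all cells
        # have finished LC and Com, and then the next super step starts.
        cells = [staying + inbox[i] for i, (staying, _) in enumerate(results)]
    return cells
```

With these assumed rules, `run_supersteps([8, 0, 0, 0], 2)` returns `[2, 4, 2, 0]`, and the total number of vehicles is conserved across super steps.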

4. Applying MAGONIA DPB

The effectiveness of the MAGONIA DPB has been demonstrated in telecom services, where load is distributed in units of sessions (for example, calls). Because sessions are independent of one another and remain so during distributed processing, the operating characteristics of such services differ considerably from those of the BSP model described above, in which cells must exchange data and synchronize. To bridge this gap, new adapters supporting Com and Sync were added to the system (Fig. 3). With this approach, we successfully implemented the traffic simulation on top of the DPB while minimizing modifications to the existing traffic simulation logic.


Fig. 3. Traffic congestion prediction and signal control system on the DPB.

4.1 System overview

The traffic system implemented on top of the DPB (Fig. 3) performs the following procedure. First, the system distributes processing across the computational node cluster in units of cells. Each computational node then executes LC for each cell assigned to it; upon completing LC, it proceeds to the Com phase and sends data to the adjacent cells. When the Com phase is completed, the node proceeds to the Sync phase, after which it starts the next super step.
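The procedure above can be sketched with threads standing in for computational nodes. This is only an illustration under stated assumptions: the `assignment` list, the half-of-vehicles movement rule, and the use of `threading.Barrier` for the Sync phase are choices made for this example and do not describe the DPB adapters themselves.

```python
import threading


def run_on_nodes(cells, assignment, steps):
    """Simulate a ring of cells on several 'nodes' (threads).

    cells:      initial vehicle count per cell
    assignment: assignment[c] is the node id that owns cell c (hypothetical)
    """
    n = len(cells)
    state = list(cells)      # state[c] is written only by the owner of cell c
    inbox = [0] * n          # vehicles received by each cell in this super step
    inbox_lock = threading.Lock()
    num_nodes = max(assignment) + 1
    barrier = threading.Barrier(num_nodes)

    def node_worker(node_id):
        my_cells = [c for c in range(n) if assignment[c] == node_id]
        for _ in range(steps):
            for c in my_cells:
                # LC: update the cell from its own state.
                leaving = state[c] // 2
                state[c] -= leaving
                # Com: send the leaving vehicles to the adjacent cell.
                with inbox_lock:
                    inbox[(c + 1) % n] += leaving
            # Sync: wait until every node has finished LC and Com.
            barrier.wait()
            for c in my_cells:
                state[c] += inbox[c]
                inbox[c] = 0
            # Wait until all inboxes are consumed before the next super step.
            barrier.wait()

    threads = [threading.Thread(target=node_worker, args=(i,))
               for i in range(num_nodes)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state
```

Because each cell's state is written only by its owning node and received data is applied only after the barrier, the result does not depend on thread scheduling.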

4.2 Improved fault tolerance

To guarantee fault tolerance, the DPB supports checkpointing. The checkpoint of each cell is replicated to multiple computational nodes; should a node fail, the nodes holding the replicated checkpoints of its cells automatically fail over (take over processing). Because checkpoints are replicated and failover is distributed on a cell-by-cell basis, the failed node's load is spread over several nodes, which limits the increase in load on any one of them.

In this experiment, it was verified that instant failover made it possible to complete simulation within the feedback cycle, even when a computational node failure occurred during traffic simulation. The capability to minimize the impact of faults to this high degree is a major characteristic of the MAGONIA DPB. Because checkpointing and failover are completed in the computational node cluster, fault tolerance can be achieved without relying on external data storage.
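The cell-by-cell replication and failover described in this section can be sketched as follows. The placement rule, the single backup copy per cell, and all names are assumptions for illustration; the article does not specify how the DPB chooses replica locations.

```python
def place_cells(num_cells, num_nodes):
    """Return {cell: (primary_node, backup_node)}.

    Hypothetical placement: the backup offset varies per cell so that the
    cells owned by one node are backed up on different nodes.
    Requires num_nodes >= 2.
    """
    placement = {}
    for cell in range(num_cells):
        primary = cell % num_nodes
        offset = 1 + (cell // num_nodes) % (num_nodes - 1)  # 1 .. num_nodes - 1
        placement[cell] = (primary, (primary + offset) % num_nodes)
    return placement


def checkpoint(stores, placement, cell, state):
    """Copy a cell's state into the in-memory store of its backup node."""
    _, backup = placement[cell]
    stores[backup][cell] = state


def fail_over(stores, placement, failed_node):
    """Return {cell: (new_owner, restored_state)} for the failed node's cells."""
    recovered = {}
    for cell, (primary, backup) in placement.items():
        if primary == failed_node:
            recovered[cell] = (backup, stores[backup][cell])
    return recovered
```

With 12 cells on 4 nodes, the three cells owned by node 0 are taken over by nodes 1, 2, and 3, so each surviving node gains one cell rather than one node absorbing all three. A fuller design would also re-replicate the cells whose backup copy was on the failed node; that step is omitted here.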

4.3 Improved scalability

The DPB supports the addition and removal of computational nodes without interrupting the simulation. To cope with load fluctuations due to changes in traffic volume, the number of computational nodes and the mapping of cells to computational nodes are changed dynamically to ensure that sufficient resources are available. Through experiments, it was verified that computational nodes can be added and removed dynamically without interrupting the traffic simulation.
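The article does not state how the DPB remaps cells when the node set changes. As one possible illustration, the sketch below uses rendezvous (highest random weight) hashing, a general technique that is swapped in here as an assumption: when a node is added, only the cells that move to the new node change owner, and when a node is removed, only its cells are reassigned. The node names are hypothetical.

```python
import hashlib


def owner(cell_id, nodes):
    """Pick the node with the highest hash score for this cell."""
    def score(node):
        digest = hashlib.sha256(f"{node}:{cell_id}".encode()).digest()
        return int.from_bytes(digest[:8], "big")
    return max(nodes, key=score)


def map_cells(cell_ids, nodes):
    """Return {cell_id: node} for the current set of nodes."""
    return {c: owner(c, nodes) for c in cell_ids}


def moved_cells(before, after):
    """Cells whose owner differs between two mappings."""
    return [c for c in before if before[c] != after[c]]
```

Comparing `map_cells(range(100), ["n1", "n2", "n3"])` with the mapping after adding `"n4"` shows that every cell in `moved_cells` is now owned by `"n4"`, so the cells on the existing nodes that did not move need no state transfer.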

4.4 Reduced response time

Placing data in the memory of a computational node that executes LC reduces network and disk input/output. Additionally, overlapping LC and Com by assigning different threads to LC and Com minimizes time overheads. We were able to verify in experiments that the high speed of the new infrastructure made it possible to complete the traffic simulation within the feedback cycle to traffic signals.
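The overlap of LC and Com can be sketched with a producer and consumer pair: one thread performs LC cell by cell and hands each result to a second thread that sends it. The `compute` and `send` callables and the queue-based hand-off are assumptions for this example, not the DPB's actual interface.

```python
import queue
import threading


def run_overlapped(cells, compute, send):
    """Run LC in the calling thread while a second thread performs Com.

    cells:   {cell_id: state}
    compute: hypothetical LC function, state -> result
    send:    hypothetical Com function, (cell_id, result) -> None
    """
    outbox = queue.Queue()
    done = object()  # sentinel that tells the Com thread to stop

    def com_worker():
        while True:
            item = outbox.get()
            if item is done:
                break
            cell_id, result = item
            send(cell_id, result)

    com_thread = threading.Thread(target=com_worker)
    com_thread.start()
    for cell_id, state in cells.items():
        # While this result is being sent, LC for the next cell can proceed.
        outbox.put((cell_id, compute(state)))
    outbox.put(done)
    com_thread.join()  # all sends are finished before the Sync phase
```

In CPython, threads mainly overlap computation with time spent waiting on the network rather than running computation in parallel, so this sketch shows the structure of the overlap rather than its achievable speedup.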

5. Future outlook

In the future, we will use actual sensing data to assess the system. In addition to the telecom systems and ITS (intelligent transport systems) described here, MAGONIA will also be extended to a wide variety of services that demand fault tolerance and scalability.

References

[1] H. Shina, “MAGONIA: A New Server Architecture to Quick-start Services and Drastically Cut Development and Operating Costs: Towards the NetroSphere Concept,” IEICE Tech. Rep., Vol. 115, No. 404, NS2015-156, pp. 59–63, 2016 (in Japanese).
[2] K. Ono, H. Yoshioka, M. Kaneko, S. Kondoh, M. Miyasaka, Y. Soejima, T. Moriya, K. Kanishima, A. Masuda, J. Koga, T. Tsuchiya, N. Yamashita, K. Tsuchikawa, and T. Yamada, “Implementing the NetroSphere Concept at NTT,” NTT Technical Review, Vol. 13, No. 10, 2015.
https://www.ntt-review.jp/archive/ntttechnical.php?contents=ntr201510fa2.html
[3] Press release issued by NTT on April 12, 2016.
http://www.ntt.co.jp/news2016/1604e/160412a.html
[4] Press release issued by NTT DATA on May 31, 2016 (in Japanese).
http://www.nttdata.com/jp/ja/news/release/2016/053101.html
[5] L. G. Valiant, “A Bridging Model for Parallel Computation,” Comm. ACM, Vol. 33, No. 8, pp. 103–111, 1990.
Hiroaki Kobayashi
Research Engineer, Server Network Innovation Project, NTT Network Service Systems Laboratories.
He received a B.E. and M.E. in computer and information sciences from Tokyo University of Agriculture and Technology in 2011 and 2013. He joined NTT Network Service Systems Laboratories in 2013. He is currently studying distributed computing technology. He is a member of the Institute of Electronics, Information and Communication Engineers (IEICE).
Takehiro Kitano
Research Engineer, Server Network Innovation Project, NTT Network Service Systems Laboratories.
He received a B.E. and M.E. in computer science from Keio University, Tokyo, in 2007 and 2009. He joined NTT Network Service Systems Laboratories in 2009 and developed Next Generation Network (NGN) session control servers. He was especially active in improving performance and developing the Network Access Subsystem, which has DHCP (dynamic host configuration protocol) and authentication functions. He is currently studying distributed computing technology and distributed software architecture. During 2013–2016, he worked as an assistant secretary on the IEICE Steering Committee on Network Software. He is a member of IEICE.
Mitsuhiro Okamoto
Senior Research Engineer, Server Network Innovation Project, NTT Network Service Systems Laboratories.
He received a B.E. and M.E. in electrical engineering from Osaka Prefecture University in 1987 and 1989. He joined NTT in 1989 and carried out research and development of network service systems, including the Intelligent Network and IMT-2000. He is currently researching distributed computing technology. He is a member of IEICE.
Takeshi Fukumoto
Senior Research Engineer, Server Network Innovation Project, NTT Network Service Systems Laboratories.
He received a B.E. and M.E. in computer science and systems engineering from Kyushu Institute of Technology, Fukuoka, in 1994 and 1996. He joined NTT Network Service Systems Laboratories in 1996, where he was engaged in the research and development of communication node system software. He also worked on the development of the application software of the switchboard of the public backbone of NTT and was with the development team of NTT-NGN. He is a member of IEICE.
