Feature Articles: The NetroSphere Concept—Breathing New Life into Carrier Networks
Implementing the NetroSphere Concept at NTT
NTT announced the NetroSphere concept in February 2015 as a new research and development (R&D) vision that aims to provide a variety of services in a prompt, low-cost, and reliable manner to customers and service providers who use the network. To implement the NetroSphere concept, NTT Information Network Laboratory Group is conducting R&D in diverse areas ranging from network architecture to virtualization, hardware, and operations. This article introduces these R&D activities.
Keywords: NetroSphere, network architecture, operations
1. MAGONIA server architecture
NTT Network Service Systems Laboratories is developing a new server architecture called MAGONIA*1 to enable the early creation of network services and greatly reduce system development and operating costs as part of the NetroSphere concept. The conventional approach to developing network systems is to use hardware and software optimized for each network function while meeting stringent requirements in reliability and performance that are demanded of a carrier. However, a system having such a separately optimized silo-type equipment configuration increases the quantity and types of necessary equipment while requiring operators to be located separately. This type of configuration consequently drives up operating costs and keeps maintenance and management costs over the entire system high, and as such, it has hindered the development and rollout of new services.
MAGONIA configures network functions as software separate from hardware and provides a common platform for efficient development of those functions as applications (Fig. 1). The development of network functions using MAGONIA makes it easy to ensure scalability and high reliability, which in the past could not be done without driving up development costs. In particular, using MAGONIA to achieve scalability enables small starts for services and facilitates the early creation of new services.
1.2 Platform functions
MAGONIA features two platform functions: a distributed processing base for achieving an n-active (N-ACT) server cluster and a virtualization base for achieving a resource pool (Fig. 2).
The distributed processing base enables the construction of an efficient N-ACT cluster in which all servers are up and running by having the active system (in which processing is in progress) replicate the data of other servers, thereby doubling as a standby system. This eliminates the need for dedicated standby servers as in the active and standby (ACT-SBY) architecture used in conventional network systems. In an N-ACT cluster, servers are made to store replicated data based on a data placement algorithm called consistent hashing, which enables the number of servers making up a cluster to be increased or decreased on the fly without service downtime. In this way, system operation can always be achieved with the smallest number of servers needed to meet demand.
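The following is a minimal sketch of how consistent hashing can place primary and replicated data on the servers of an N-ACT cluster. It is not the MAGONIA implementation: the class name `HashRing`, the parameters `copies` and `vnodes`, and the server and key names are all assumptions made for illustration.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a point on the ring (stable across runs)."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    """Hypothetical consistent-hash ring; each key lives on `copies` distinct servers."""

    def __init__(self, servers, copies=2, vnodes=64):
        self.copies = copies      # primary plus replicas per key
        self.vnodes = vnodes      # ring points per server, to even out the load
        self._ring = []           # sorted list of (point, server)
        for server in servers:
            self.add(server)

    def add(self, server):
        """Add a server without touching the points of existing servers."""
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_hash(f"{server}#{i}"), server))

    def remove(self, server):
        """Remove a server; its keys fall through to the next servers on the ring."""
        self._ring = [(p, s) for p, s in self._ring if s != server]

    def owners(self, key):
        """Return the primary server followed by the servers holding replicas."""
        if not self._ring:
            return []
        start = bisect.bisect(self._ring, (_hash(key), ""))
        found = []
        for i in range(len(self._ring)):
            server = self._ring[(start + i) % len(self._ring)][1]
            if server not in found:
                found.append(server)
                if len(found) == self.copies:
                    break
        return found


ring = HashRing(["srv-a", "srv-b", "srv-c"])
print(ring.owners("session-42"))  # primary first, then the replica holder
```

In this sketch, adding a server moves only the keys that the new server takes over, and removing a server promotes the server that already holds the replica, which is the property that lets the cluster grow or shrink without stopping the service.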
A resource pool, meanwhile, acts as a common source of resources (e.g., the processing power of central processing units (CPUs), memory capacity, and storage on physical servers) shared by multiple systems. The resource pool in MAGONIA is achieved by the virtualization base, and it allocates resources in the form of virtual machines (VMs) according to requests issued by applications making up each system. This virtualization base conforms to the architecture framework of the European Telecommunications Standards Institute’s Industry Specification Group on Network Functions Virtualization (ETSI NFV ISG)*2 and consists mainly of the network function virtualization infrastructure (NFVI) based on VMs and a virtual network function manager (VNFM) that provides a management mechanism for autonomous control of those VMs (Fig. 3).
A system applying MAGONIA using these platform functions can increase its facility usage efficiency by a significant amount compared to existing systems. Furthermore, in the event of a server failure, it can incorporate an alternate VM into its N-ACT cluster from the resource pool, thereby enabling autonomous recovery without having to dispatch maintenance personnel to the site where the failure occurred. A reduction in system operating costs can therefore be expected.
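The interplay between the resource pool and the cluster can be pictured with the short sketch below. It is a simplified model under stated assumptions: the classes `ResourcePool` and `Cluster`, their methods, and the use of CPU cores as the only resource are hypothetical and do not describe the actual NFVI or VNFM interfaces.

```python
from dataclasses import dataclass, field


@dataclass
class ResourcePool:
    """Hypothetical shared pool that hands out VMs; only CPU cores are modeled."""
    free_cores: int
    _next_id: int = 0

    def allocate_vm(self, cores: int) -> str:
        if cores > self.free_cores:
            raise RuntimeError("resource pool exhausted")
        self.free_cores -= cores
        self._next_id += 1
        return f"vm-{self._next_id}"

    def release(self, cores: int) -> None:
        self.free_cores += cores


@dataclass
class Cluster:
    """Hypothetical N-ACT cluster that draws its members from the pool."""
    pool: ResourcePool
    cores_per_vm: int
    members: set = field(default_factory=set)

    def scale_to(self, size: int) -> None:
        """Grow or shrink the cluster to follow demand."""
        while len(self.members) < size:
            self.members.add(self.pool.allocate_vm(self.cores_per_vm))
        while len(self.members) > size:
            self.members.pop()
            self.pool.release(self.cores_per_vm)

    def handle_failure(self, failed_vm: str) -> str:
        """Drop the failed VM and pull a replacement from the pool.

        The failed VM's cores are not returned, on the assumption that the
        underlying hardware is out of service until it is repaired.
        """
        self.members.discard(failed_vm)
        replacement = self.pool.allocate_vm(self.cores_per_vm)
        self.members.add(replacement)
        return replacement


pool = ResourcePool(free_cores=16)
cluster = Cluster(pool, cores_per_vm=4)
cluster.scale_to(3)
print(cluster.handle_failure(next(iter(cluster.members))))
```

The point of the sketch is that recovery is a software operation against the pool, so no one has to visit the site before the cluster is back to full strength.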
1.3 Release of API specifications
At NTT, we are promoting more open activities so that a variety of companies and enterprises can participate in the implementation of MAGONIA.
In this regard, we envision the distributed processing base to be provided as middleware. In February 2015, we released the specifications for the application programming interface (API) of the distributed processing base so that anyone will be able to design middleware and applications conforming to those specifications.
As for the virtualization base, there has been rapid growth in cloud-related commercially available virtualization techniques in recent years, and at the same time, NFV-related products have been regularly launched on the market with the progress in standardization of NFV as virtualization architecture for network systems. We can expect competition in NFV-compliant products among a variety of vendors to intensify in the years to come. Our plan is to expand the use of the distributed processing base in combination with the virtualization base with the aim of implementing MAGONIA and achieving early implementation of the NetroSphere concept.
2. Proactive control and resource optimization technology
In the near future, the increase in 4K/8K high-definition video traffic and the introduction of co-creation services with customers through the wholesaling of fiber access are expected to make communications traffic all the more diverse and unpredictable. Moreover, as the modularization and combination of network functions based on the NetroSphere concept proceed, traffic flow will become increasingly complex, making it difficult to determine where traffic associated with particular services will actually flow in the network. NTT Network Technology Laboratories is addressing this complexity in the network and in traffic quality by conducting interdisciplinary research in the area of network science, making extensive use of technologies from different fields. We discuss here resource optimization technology using proactive control.
2.1 Optimized use of resources
The NetroSphere concept envisions a low-cost network made possible by combining common components. This will require technology for making effective use of pre-existing resources, which differs from the conventional approach to facility design and operation in which the service to be provided is known in advance and the facilities required for the service are deployed appropriately. Resource optimization technology aims to meet this need for using pre-existing resources by determining an optimal combination of software and hardware functions unevenly distributed in the network and implementing them on the physical network via integrated control functions (Fig. 4).
2.2 Key technologies
The introduction of virtualization technology typified by software-defined networking (SDN) and NFV will enable the definition and construction of services on the logical network independent of the actual configuration of the physical network. The use of virtualization technology and the technologies summarized below will enable individual services to be mapped to the physical network in a flexible manner.
(1) Traffic prediction technology considering traffic-generation mechanisms
In addition to conventional techniques that uncover trends based on traffic observations, this technology will use human-flow analysis in physical space to reflect people—the source of traffic—and terminal mobility in traffic predictions to improve accuracy.
(2) Resource allocation technology considering availability in the face of future demand
This technology aims to appropriately allocate resources to meet present demand while taking into account the need to allocate resources to meet the demand for new services that may be launched in the future. Although it is impossible to predict future demand with perfect accuracy, we can enhance resource availability by setting aside a sufficient amount of allocable resources (bandwidth, CPU, memory, etc.).
(3) Control technology considering prediction error
Traffic engineering aims to dynamically control traffic pathways according to traffic fluctuations, but the risk exists of a drop in communications quality due to excessive control caused by prediction error or to delay-time fluctuation caused by a change in path length. This technology is aimed at achieving robust control by minimizing the effects of prediction error through sequential and stepwise control based on model predictive control technology.
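The idea in item (2) of setting aside allocable resources for future demand can be illustrated with a small admission sketch. The function `allocate`, the `reserve_ratio` parameter, and the example figures are assumptions chosen for illustration, not the allocation method under study.

```python
def allocate(capacity, reserve_ratio, requests):
    """Admit requests in order while keeping `reserve_ratio` of every resource free.

    capacity:  dict of resource name -> total amount (e.g. CPU units, Mbit/s)
    requests:  list of (service name, dict of resource name -> amount needed)
    Returns the admitted service names and the amount left of each resource.
    """
    usable = {res: cap * (1 - reserve_ratio) for res, cap in capacity.items()}
    used = {res: 0.0 for res in capacity}
    admitted = []
    for name, need in requests:
        fits = all(used[r] + need.get(r, 0) <= usable[r] for r in capacity)
        if fits:
            for r in capacity:
                used[r] += need.get(r, 0)
            admitted.append(name)
    remaining = {r: capacity[r] - used[r] for r in capacity}
    return admitted, remaining


admitted, remaining = allocate(
    capacity={"cpu": 100, "bandwidth": 1000},
    reserve_ratio=0.2,  # hypothetical headroom kept for future services
    requests=[
        ("video", {"cpu": 50, "bandwidth": 500}),
        ("voice", {"cpu": 20, "bandwidth": 100}),
        ("bulk", {"cpu": 20, "bandwidth": 100}),
    ],
)
print(admitted, remaining)
```

Here the last request is turned away even though raw capacity remains, because admitting it would eat into the headroom held back for services that do not exist yet.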
Each of the above technologies is substantially different from conventional control technologies that respond in a reactive manner when some kind of problem (congestion, quality degradation, etc.) occurs in the network. That is to say, each can be ranked as a proactive control technology that is intended to mitigate beforehand the risk of a problem occurring.
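The sequential and stepwise control described in item (3) can be sketched as a loop that refreshes the prediction at every step and applies only a bounded change. This is a deliberately simplified receding-horizon loop, not a full model predictive controller; the function `stepwise_control`, the `max_step` bound, and the sample predictions are assumptions.

```python
def stepwise_control(current, predict, max_step, steps):
    """Move a traffic split toward the predicted optimum in bounded steps.

    current:  fraction of traffic currently sent over path A (0.0 to 1.0)
    predict:  callable taking the step index and returning the target fraction;
              it is re-evaluated every step, so a bad prediction is corrected
              before it has been fully acted on
    max_step: largest change in the split allowed per control step
    """
    history = [current]
    for t in range(steps):
        target = predict(t)
        delta = max(-max_step, min(max_step, target - current))
        current += delta
        history.append(current)
    return history


# Hypothetical predictions: the first one overshoots, later ones settle at 0.6.
predictions = [0.9, 0.5, 0.6, 0.6]
trace = stepwise_control(0.5, lambda t: predictions[t], max_step=0.1, steps=4)
print(trace)
```

Because each step is capped, the overshooting first prediction moves the split only slightly, which limits both over-control and the delay variation caused by abrupt path changes.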
Resource optimization enables the low-cost provision of services through efficient use of software and hardware functions across the entire network. Proactive control, meanwhile, can improve network ease-of-use for customers by mitigating exposure to problems occurring in network functions. In addition, research conducted in areas throughout network science, including disaster avoidance and quality of experience (QoE), will drive the realization of the NetroSphere concept supporting customer and operator needs.
3. Multi-Service Fabric (MSF)
3.1 Flexible configuration
NTT Network Service Systems Laboratories is researching and developing Multi-Service Fabric (MSF). This technology divides the network into transport functions, service functions, and control functions (controller), in contrast to a conventional network configured with specialized equipment. It enables flexible network configuration through the formation of equipment clusters that combine general-purpose products.
MSF contributes to the transport layer that routes communications and transfers data by optical means. MSF is a technology that provides extensive and flexible functions, and its name is a reference to the many ways that fabric can be used according to how it is woven and cut.
3.2 Application to networks with special requirements
This technology can even be applied to carrier networks that have special high-reliability and high-scalability requirements. MSF aims for the deployment of new networks leading to flexible operations and drastic reductions in cost. In addition, by modeling the network configuration so that modularized server and switch pools can be managed as virtual resources and dynamically controlled, NTT can clarify the requirements for switches and other functional components. In this way, we aim to encourage the participation of many and diverse vendors and promote the commoditization of network equipment.
Conventional networks have been configured using specialized equipment developed on a function-by-function basis. There has consequently been a tendency to develop advanced and large-scale devices in which high reliability and large capacity are achieved in each piece of equipment.
In recent years, however, progress has been made in increasing the speed, capacity, and performance of general-purpose products with an emphasis on datacenter applications. In addition, advances have been made in researching and developing dynamic control technologies such as NFV and SDN and in standardizing interfaces between functions.
A conventional network that tends to consist of advanced and large-scale equipment as described above lacks flexibility in terms of adding or modifying functions or capacity according to business requirements. Such a network can also drive up costs since it is not possible for low-usage portions of certain equipment to be used by other equipment. Thus, to simultaneously achieve cost reductions and flexibility in network services, there is a need not only for higher levels of flexibility and efficiency, but also for a new type of high-function and high-reliability carrier network that can transfer and transport data in a network-wide manner without being limited to conventional router functions.
Meeting this need will require new network technology that can bring existing network technology at datacenters up to carrier grade and that can make maximum use of general-purpose products.
3.3 Separated functions
MSF architecture separates the network conventionally configured with specialized equipment into three sections: (1) transport functions, (2) service functions, and (3) control functions (MSF controller), as shown in Fig. 5. The transport functions section comprises simple general-purpose equipment (general-purpose switch cluster). The service functions section is achieved on the cloud separate from the transport functions section. Finally, the control functions section (MSF controller) controls the general-purpose switch cluster. The elemental technologies supporting MSF are configuration technology, which is aimed at making maximum use of general-purpose products in a low-cost L3SW (Layer-3 switch) cluster having a high degree of freedom, and control technology implemented in the controller.
3.4 Aim and effects of MSF
Our aim in using MSF is to ensure extendibility, improve maintainability and robustness, and promptly provide low-priced services. We seek the following effects by applying a network modeled on MSF technology.
(1) Simultaneous improvements in costs and flexibility
From a transport perspective, MSF is a means of providing essential carrier functions. It will enable the application of optical network technologies at all times and facilitate the low-cost configuration and flexible control of network elements.
(2) Maintenance flexibility through hardware commoditization
MSF will facilitate the commoditization of application technologies by simplifying hardware functions and by unifying specifications. It will enable faulty hardware to be replaced by equipment made by a different vendor, thereby adding flexibility to maintenance work.
(3) Flexible scalability with no boundaries
MSF will enable any type of network to be achieved using as much common architecture as possible. It can efficiently meet the specifications of individual networks and locations with flexible configurations ranging in size from small scale to large scale. At the same time, ongoing advances in general-purpose technologies will enable MSF to provide scalability with no boundaries or limits.
(4) Modularization of network equipment in the transport layer
By modularizing and simplifying equipment and using common architecture, MSF will transform large-scale black-box transport equipment into small-scale white-box equipment. With MSF, configuring a network with only simple hardware will make it easier to visualize operating conditions at the time of a system fault and reduce operation load by enabling multiple pieces of equipment to be abstracted and managed as a single unit.
The above effects mean that a network that applies MSF will be able to reduce facility costs and power consumption, improve maintainability and operability, and provide services in a prompt and uninterrupted manner.
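The notion of abstracting multiple pieces of simple equipment and managing them as a single unit, including replacement across vendors, might look like the sketch below. The classes `Switch` and `SwitchCluster`, their fields, and the vendor names are hypothetical and are not part of the MSF controller.

```python
from dataclasses import dataclass


@dataclass
class Switch:
    """One general-purpose switch (hypothetical, minimal set of attributes)."""
    name: str
    vendor: str
    ports_up: int
    ports_total: int
    healthy: bool = True


class SwitchCluster:
    """Presents several general-purpose switches as one logical unit."""

    def __init__(self, name, switches):
        self.name = name
        self.switches = {s.name: s for s in switches}

    def faulty(self):
        """Names of the members that need attention."""
        return [s.name for s in self.switches.values() if not s.healthy]

    def replace(self, old_name, new_switch):
        """Swap a member; the replacement may come from a different vendor."""
        del self.switches[old_name]
        self.switches[new_switch.name] = new_switch

    def summary(self):
        """Single view an operator would see instead of one view per device."""
        members = list(self.switches.values())
        healthy = sum(s.healthy for s in members)
        if healthy == len(members):
            status = "ok"
        elif healthy > 0:
            status = "degraded"
        else:
            status = "down"
        return {
            "cluster": self.name,
            "switches": len(members),
            "healthy": healthy,
            "ports_up": sum(s.ports_up for s in members),
            "ports_total": sum(s.ports_total for s in members),
            "status": status,
        }


cluster = SwitchCluster("edge-1", [
    Switch("sw1", "vendor-x", 46, 48),
    Switch("sw2", "vendor-y", 48, 48),
    Switch("sw3", "vendor-x", 0, 48, healthy=False),
])
print(cluster.summary(), cluster.faulty())
```

An operator works with one summary per cluster and drills down only when `faulty()` is non-empty, which is one way the operation load could be reduced.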
3.5 Plans toward expanded use of MSF
Although still in the research stage, a prototype version of MSF was constructed in 2015. The plan is to use it in experiments conducted within the NTT Group to assess and improve the feasibility of this technology. The first step in these experiments will be to cluster general-purpose switches from multiple vendors with the aim of achieving a practical technology that can replace large-scale routers. The next step will be to establish network-wide, multi-layer coordination technology for transmission equipment and other peripheral devices.
In addition, NTT is engaging in joint studies with both domestic and overseas vendors in such fields as controllers, general-purpose switches, and virtualization technology, and is participating in technical discussions with overseas carriers, all to promote the development and spread of MSF-related technology.
There are also plans to hold proof of concept (POC) demonstrations with vendors in 2015 using open technologies (related to switch/controller products) to promote the spread of MSF as a viable technology. The effective use of such POC demonstrations should help spread MSF technology to vendors and carriers in other countries.
4. MOOS and OaaS
Operations cover a variety of tasks such as fulfillment, quality assurance, and billing. Under the NetroSphere concept, it is important that operations for managing and maintaining services be provided in a one-stop manner. In this regard, NTT Network Service Systems Laboratories is researching and developing two key technologies.
(1) MOOS (management, orchestration, and operation system), which performs end-to-end management and control of the service components
(2) OaaS (Operations as a Service), which provides operation functions to service providers
Conventional network services are developed on a silo structure in which specialized equipment is tied to individual services/networks, and the operation systems of these services/networks automate related tasks. When an equipment failure occurs in such a structure, it may be clear how to switch over equipment and recover from the failure. However, the diversification of services in recent years has made network configurations and operations increasingly complicated. Therefore, finding ways of reducing capital expenditure (CAPEX) and operating expenditure (OPEX) has become a more critical issue than ever before.
In contrast, the NetroSphere concept separates functions from equipment and resources from equipment in order to simplify the network, as described before. This kind of architecture and its operations are depicted in Fig. 6. This architecture is virtually composed of the network (2) and server (3) layers on top of the infrastructure (1) layer; the service (4) layer is configured at the very top. MOOS performs end-to-end collective management of layers (1) to (4). For example, MOOS makes all the necessary network settings and shortens the lead time to service provision (Fig. 6(a)). The automatic detection of equipment faults and autonomous resetting of the network and servers improve service continuity (Fig. 6(b)).
In studying MOOS, we assume the equipment configuration, quality requirements, and operational workflow. In designing MOOS in detail, we investigate some actual use cases. For example, when some equipment on the infrastructure layer has physically failed and the operator has been notified of an alarm occurring on the network layer, how can we determine what physical servers the applications are running on, and what services are being provided? In this case, we need to quickly figure out such complicated relations between the applications and servers, determine the area/users affected by that failure, identify failure factors, and initiate a recovery process. Our aim is to establish the technologies that MOOS needs as functions, taking the service level agreements of the provided services into consideration.
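The use case above, tracing a physical failure up to the affected applications and services, amounts to walking a dependency graph across the layers. The sketch below assumes such a graph is available; the function `impacted`, the `hosts` mapping, and every component name are hypothetical.

```python
from collections import deque


def impacted(hosts, failed):
    """Return every component that directly or indirectly runs on `failed`.

    hosts maps each component to the components that sit on top of it,
    for example a physical server to its VMs and a VM to its applications.
    """
    seen = set()
    queue = deque([failed])
    while queue:
        node = queue.popleft()
        for upper in hosts.get(node, ()):
            if upper not in seen:
                seen.add(upper)
                queue.append(upper)
    return seen


# Hypothetical topology spanning the infrastructure, server, and service layers.
hosts = {
    "server-1": ["vm-a", "vm-b"],
    "server-2": ["vm-c"],
    "vm-a": ["app-sip"],
    "vm-b": ["app-dns"],
    "vm-c": ["app-video"],
    "app-sip": ["svc-voice"],
    "app-dns": ["svc-voice", "svc-internet"],
    "app-video": ["svc-tv"],
}

affected = impacted(hosts, "server-1")
services = sorted(n for n in affected if n.startswith("svc-"))
print(services)
```

The same traversal run in the opposite direction (from a degraded service down to the equipment beneath it) would support the identification of failure factors.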
The Management and Orchestration (MANO) Working Group in ETSI NFV ISG completed Phase 1 studies on high-level architecture and functional requirements in December 2014. They commenced Phase 2 studies in January 2015. At NTT, we are researching and developing MOOS while keeping in mind various factors such as NFV standardization trends.
Thus, with the aim of providing carrier services in a short time and reducing OPEX, we see MOOS as a means of constructing services quickly, adjusting the scale as needed, and controlling operations remotely. In addition to MOOS, we are researching/developing OaaS that would release those operation functions via an external interface to service providers. To promote the creation of new services, operation functions provided via an OaaS API assume the use of up-and-coming technologies in diverse fields such as the IoT (Internet of Things), cloud computing, virtualization technologies (NFV, SDN, etc.), and big data, as well as the ability to combine those technologies with services in a flexible and straightforward manner.
In this regard, the TeleManagement Forum (TM Forum)*3 is studying an architecture that is based on service requirements in a wholesale format. In general, such an interface must be designed and evaluated from various viewpoints including usability, productivity, flexibility, and information granularity and must also be provided considering the needs of various types of service providers. For these reasons, our aim is not just to specify a common API but also to establish base technology that would allow for the combination of multiple APIs or the simplification of an API (e.g., removing certain options).
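As a rough illustration of simplifying an API by removing options and of combining multiple APIs into one call, consider the sketch below. No OaaS API is defined in this article, so every function name, parameter, and return field here is invented for the example.

```python
def order_line(customer, bandwidth_mbps, redundancy="none", region="default"):
    """Hypothetical full-featured fulfillment call exposed by the carrier."""
    return {
        "line_id": f"{customer}-{bandwidth_mbps}M",
        "redundancy": redundancy,
        "region": region,
    }


def start_monitoring(line_id, interval_s=60):
    """Hypothetical quality-assurance call exposed by the carrier."""
    return {"line_id": line_id, "interval_s": interval_s}


def order_basic_line(customer, bandwidth_mbps):
    """Simplified API: the redundancy and region options are removed."""
    return order_line(customer, bandwidth_mbps)


def order_monitored_line(customer, bandwidth_mbps):
    """Combined API: one call that both orders a line and starts monitoring it."""
    line = order_basic_line(customer, bandwidth_mbps)
    monitoring = start_monitoring(line["line_id"])
    return {**line, "monitoring": monitoring}


print(order_monitored_line("acme", 100))
```

A service provider with simple needs would see only `order_monitored_line`, while a provider that needs fine control could still be offered the underlying calls.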
Thus, NTT aims to achieve unprecedented operations that can produce added value both quickly and at low cost through our R&D of MOOS and OaaS.
5. Operation process navigation technology
Operational efficiency is a universal theme required in every business domain. In this section, we introduce our operation process navigation technology that supports terminal operations in office work, for example, by simplifying complicated operations or reducing the number of simple repetitive operations, and thus improves operational efficiency.
This technology is also required in the NetroSphere concept where various network services need to be provided accurately, quickly, and efficiently.
Companies use various operation support systems (OSSs) to improve operational efficiency and ultimately reduce OPEX. System operators work at system terminals by following operation processes defined in advance. The OSSs need to be modified in accordance with changes in operation processes caused by, for example, the launch of a new service or product. However, such modification typically involves a very high expense and considerable time, so it is difficult to modify OSSs in a timely manner. Consequently, some companies may decide to abandon the idea of modifying the OSSs at all. In such cases, operators have to add extra operation processes that the OSSs cannot handle, resulting in complicated or lengthy processes. This causes heavy workloads and lengthy operation times. Moreover, it frequently leads to human errors.
To address this issue, we are studying navigation technology that supports terminal operations without modifying existing OSSs. This technology consists of three essential technologies.
(1) Information-gathering technology that gathers information from operators and OSSs
(2) Analysis technology that derives useful knowledge from the gathered information
(3) Instruction technology that provides operators with that knowledge
Annotation technology, which can display any information anywhere on a terminal screen without modifying the OSSs (Fig. 7), is one of the instruction technologies. This technology can be used, for example, to show correct operation instructions and helpful warnings/notes for the operators. In this way, the operators can carry out accurate and prompt operations without having to refer to manuals or seek support from expert personnel.
Apart from the annotation technology, we are also working on (i) log analysis technology for gathering operation logs and extracting useful knowledge and (ii) system integration technology that simplifies complicated operation processes provided by the different OSSs.
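One simple form of the log analysis mentioned above is to count operation sequences that recur across sessions, since frequent repetitive sequences are natural candidates for simplification or automation. The sketch below assumes operation logs are available as ordered lists of action names; the function `frequent_sequences` and the sample actions are hypothetical.

```python
from collections import Counter


def frequent_sequences(logs, length=2, min_count=2):
    """Find runs of `length` consecutive operations seen at least `min_count` times.

    logs: list of sessions, each an ordered list of operation names
    Returns (sequence, count) pairs, most frequent first.
    """
    counts = Counter()
    for session in logs:
        for i in range(len(session) - length + 1):
            counts[tuple(session[i:i + length])] += 1
    return [(seq, n) for seq, n in counts.most_common() if n >= min_count]


# Hypothetical operation logs gathered from three terminal sessions.
logs = [
    ["open_order", "copy_id", "paste_id", "submit"],
    ["open_order", "copy_id", "paste_id", "cancel"],
    ["open_order", "submit"],
]
for sequence, count in frequent_sequences(logs):
    print(count, " -> ".join(sequence))
```

In this example the copy-and-paste steps stand out as a repeated pattern, which is the kind of knowledge that could then be fed to the instruction technology.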
The navigation technology is a general-purpose technology. It is therefore expected to contribute greatly to reducing OPEX in all business domains. Our next step is to enhance and extend the navigation technology as part of an ongoing effort to make operations more efficient.
6. Environmentally conscious technology (HVDC power supply system)
Implementing the NetroSphere concept requires a telecommunications infrastructure consisting of various facilities such as network equipment, air conditioning equipment, power supply equipment, and outdoor structures. Meanwhile, the environmental load across the entire telecommunications infrastructure must be reduced to achieve a sustainable society. For this reason, the NTT Information Network Laboratory Group has established a strategy coordination organization spanning all facility-managing research laboratories and NTT Group companies to research and develop elemental technologies for reducing the environmental load of information and communication technology (ICT) services (Fig. 8). These technologies fall into the following six areas.
(1) Power-supply related technology for raising the level of self-sufficiency and supplying power to network equipment
(2) Air-conditioning related technology for raising the energy efficiency of air conditioning systems for telecom buildings
(3) Technology for integrated operation of network equipment and power-supply and air-conditioning systems
(4) Technology for reducing power consumption through network architecture and network equipment designed to contribute to energy savings across the entire network
(5) Resource conservation technology for creating a green telecom infrastructure
(6) Technology for dealing with electromagnetic radiation, lightning, and other disturbances from the external environment
At present, NTT is involved in ongoing efforts to reduce power consumption in network equipment, introduce high-voltage direct current (HVDC) power supply systems, and develop electromagnetic compatibility technologies.
Saving energy in network equipment is directly related to reducing electric power usage, that is, to reducing CO2 emissions. In addition, transforming the systems that supply power to network equipment can further promote energy savings and cost reductions. Specifically, this means the introduction of HVDC power supply systems that supply power to network equipment at 380 VDC (volts direct current) instead of the conventional 48 VDC or 100/200 VAC (volts alternating current).
Introducing an HVDC power supply system provides specific advantages. In comparison with conventional 48 VDC, it can reduce costs by enabling the use of thinner cables and improving power conversion efficiency, and in comparison with 100/200 VAC, it can dramatically improve reliability and power conversion efficiency.
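The reason a higher feed voltage allows thinner cables follows from basic circuit relations: for a fixed load power P, the feed current is I = P / V, and the resistive loss in the cable is I²R. The short calculation below compares 48 VDC with 380 VDC; the 10 kW load and 0.01 ohm cable resistance are assumed figures for illustration only, not NTT measurements.

```python
def feed_current(power_w, voltage_v):
    """Current drawn from a DC feed for a given load power: I = P / V."""
    return power_w / voltage_v


def cable_loss(power_w, voltage_v, resistance_ohm):
    """Resistive loss in the feed cable: I^2 * R."""
    return feed_current(power_w, voltage_v) ** 2 * resistance_ohm


LOAD_W = 10_000     # assumed equipment load
CABLE_OHM = 0.01    # assumed round-trip cable resistance

for volts in (48, 380):
    amps = feed_current(LOAD_W, volts)
    loss = cable_loss(LOAD_W, volts, CABLE_OHM)
    print(f"{volts} VDC: {amps:.1f} A, cable loss {loss:.1f} W")
```

For the same load, the current at 380 VDC is lower by a factor of 380/48 (roughly 7.9), and the loss in an identical cable is lower by the square of that factor, so a thinner cable can carry the same power.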
To facilitate the introduction of equipment with new specifications different from those in the past, NTT is developing new power supply systems as described above while also focusing on the standardization of HVDC specifications. Moreover, to expand the lineup of HVDC-compatible network equipment, NTT has formulated and released “Technical Requirements for High-voltage DC Power Feeding Interfaces of ICT Equipment,” which specifies the technical requirements at the connection point between an HVDC power supply system and network equipment. In addition, the NTT Group has declared its intention to formulate a strategy for introducing HVDC power supply systems and promoting their use. NTT aims to take the lead in promoting energy savings across the entire ICT field.
In the above ways, the NTT Information Network Laboratory Group is engaged in a variety of activities to reduce environmental load and put the NetroSphere concept into practice.
7. Collaborative system for implementing the NetroSphere concept
In the R&D of the NetroSphere concept, NTT favors an open collaborative system that progresses from an initial stage of concept study and technology development to joint efforts with many and varied players. That is to say, we have no desire to force our proposed NetroSphere concept on others; rather, we wish to work with others having the same sense of direction to improve our technical expertise and to brainstorm new ways of using the network. In this way, we aim to create an even better network that could not previously be imagined.
With its eye on the implementation of the NetroSphere concept, NTT will continue its efforts in not just technology development but also in expanding this collaborative approach.