Thoughts on code development, access technologies, and telecommunication networks from developers within ADTRAN’s R&D organization.
ARCHIVED BLOG POSTS
Written by: Eric Lang
Published: 8 November 2016
Developers at ADTRAN cover a lot of ground, from developing embedded networking devices to developing microservices that run in datacenters.
ADTRAN Mosaic Cloud Platform (Mosaic CP) is a product that gives users a bird’s-eye view of their network and is where all of these development layers come together. A typical high-level integration test for Mosaic CP pulls together components from across the org: a collection of Mosaic CP microservices and a mix of real hardware devices and simulated hardware devices. Some tests also incorporate virtualized ADTRAN products, where embedded product code has been cross-compiled to run on commodity x86 hosts.
Pulling these pieces together for integration testing, while testing them with the various hypervisors that our customers prefer to use, is a fun orchestration challenge.
Today I want to talk about a service we've developed in-house to make creating these types of integration test environments convenient for everyone inside the org: testers, developers, salespeople, etc.
We call the system TestBed as a Service (TBaaS). Nathan, a developer here, touched on TBaaS in a previous blog post. Today I will share some of our design decisions and motivations for building TBaaS, and discuss how TBaaS is used within ADTRAN.
In times past, many development teams at ADTRAN maintained their own mini datacenters with fixed testing hardware. This approach has scaling issues and high maintenance costs, and as ADTRAN and telecom have begun shifting to SDN, it has become easier for teams to share commodity testing hardware.
The modern way of tackling testing environments is to use an orchestrator to build on-demand environments for you. When you build the environment from scratch on every run, there is no chance of interfering with another test or picking up stale state from a previous test. Hooray determinism! In the earlier stages of testing, this determinism is nice to have. Of course, in the later production-like stages of testing, you have to be careful with "throw it away every time."
There are many orchestration systems available. Docker Swarm, for instance, is Docker Inc.'s orchestrator and specializes in containers. OpenStack is another popular orchestrator, which generally uses virtual machines as the unit of virtualization. OpenStack is extremely powerful, though significantly more complex to manage than Swarm.
We evaluated both OpenStack and Swarm in detail and selected Swarm based on our need to host a large number of containers in a private cloud. We added an API layer on top of Swarm that gives us a multi-user interface and abstracts away details of the backend. While the backend was an important decision, we think of TBaaS as being a generic virtual resource manager, or an orchestrator of orchestrators. There are costs and benefits of adding your own API layer, and I'll touch back on this in a minute.
Swarm has proven remarkably flexible, flexible enough that we even orchestrate virtual machines with it! OpenStack does not support all of the hypervisors we need out of the box, but we eventually added support for OpenStack to take advantage of its VM expertise. There is room to consolidate our backends, since the Magnum project from OpenStack now makes it easy to host Swarm within OpenStack.
TBaaS abstracts a testbed into a list of resources, each of which has a type and arguments. This style allows you to model nearly anything, and is a common way to express resources in orchestration systems.
Here is a typical scenario with TBaaS, a testbed containing a network element simulator and a Mosaic CP VirtualBox VM:
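A minimal sketch of this type-plus-arguments model might look like the following. The resource type names (`ne-simulator`, `virtualbox-vm`), the argument keys, and the `parse_testbed` helper are assumptions made for illustration, not the actual TBaaS schema.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """One entry in a testbed: a resource type plus its arguments."""
    type: str
    args: dict = field(default_factory=dict)


def parse_testbed(raw):
    """Turn a list of plain dicts into Resource objects, rejecting entries with no type."""
    resources = []
    for entry in raw:
        if "type" not in entry:
            raise ValueError(f"resource is missing a type: {entry!r}")
        resources.append(Resource(entry["type"], dict(entry.get("args", {}))))
    return resources


# Hypothetical testbed: a network element simulator plus a Mosaic CP VirtualBox VM.
testbed = parse_testbed([
    {"type": "ne-simulator", "args": {"count": 4}},
    {"type": "virtualbox-vm", "args": {"image": "mosaic-cp", "memory_mb": 4096}},
])

for resource in testbed:
    print(resource.type, resource.args)
```

Because each entry is only a type and a bag of arguments, adding a new kind of resource does not change the shape of the request.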
TBaaS provides a couple of classes of resource types to users: atomic types, like those in the example above, and higher-level resources. An instance of the Firefly platform, for example:
This is similar to how a cloud host can give you access to a managed database instance. To you, it looks like a single entity, but behind the scenes there is an unknown amount of orchestration going on to make it available. Doing this same thing for instances of Firefly makes it easy for developers to test their applications.
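One way to picture a higher-level resource is as a type that expands into atomic resources before anything reaches the backend. The sketch below assumes that design; the `firefly` expansion, the image names, and the `expand` helper are all hypothetical, since what actually makes up a Firefly instance is not described here.

```python
def expand_firefly(args):
    """Hypothetical expansion of one 'firefly' resource into atomic resources."""
    version = args.get("version", "latest")
    return [
        {"type": "container", "args": {"image": f"example/firefly-core:{version}"}},
        {"type": "container", "args": {"image": f"example/firefly-store:{version}"}},
    ]


# Registry of higher-level types and the functions that expand them (assumed design).
COMPOSITES = {"firefly": expand_firefly}


def expand(resources):
    """Replace every higher-level resource with the atomic resources behind it."""
    atomic = []
    for resource in resources:
        expander = COMPOSITES.get(resource["type"])
        if expander is None:
            atomic.append(resource)  # already atomic, pass it through
        else:
            # Recurse in case an expansion itself contains higher-level resources.
            atomic.extend(expand(expander(resource.get("args", {}))))
    return atomic


requested = [
    {"type": "firefly", "args": {"version": "1.0"}},
    {"type": "virtualbox-vm", "args": {"image": "mosaic-cp"}},
]
for resource in expand(requested):
    print(resource["type"], resource["args"])
```

The user asks for one `firefly` entry and never sees the individual pieces, which is the same experience a managed database gives you.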
In general, we add new resource plugins as they're requested or as we identify something that would have wide appeal across the org.
A risk of abstracting resources in this way is sinking to the lowest-common-denominator feature set. Docker supports unique operations that don't directly translate to VMs, and vice versa. Our approach has been to remind ourselves that TBaaS is a thin convenience layer, and to remain focused on providing a streamlined interface for creating throwaway testbeds. We suggest that users with more advanced needs build on the backend technologies directly. Thus, the operations our TBaaS API supports are indeed very generic.
Over time we have built up various libraries and tools for TBaaS that are tuned for interacting with temporary environments.
For example, we have a succinct way to create an environment with Python:
with TestBed(config) as testbed:
This two-line sample helped us grab the attention of other development teams early on. The with statement is a Python construct for managing the creation and destruction of a resource. The testbed will be created and available by the time the inner scope executes, and destroyed when the scope exits.
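To make the create-on-enter, destroy-on-exit behavior concrete, here is a minimal sketch of how such a context manager could be written. It is not the TBaaS client library: the `FakeBackend` class and the extra `backend` parameter are assumptions that stand in for the real service so the example runs on its own.

```python
class FakeBackend:
    """In-memory stand-in for an orchestrator; tracks which testbeds are live."""

    def __init__(self):
        self.live = set()
        self._next_id = 0

    def create(self, config):
        self._next_id += 1
        self.live.add(self._next_id)
        return self._next_id

    def destroy(self, testbed_id):
        self.live.discard(testbed_id)


class TestBed:
    """Context manager that creates a testbed on entry and destroys it on exit."""

    def __init__(self, config, backend):
        self.config = config
        self.backend = backend
        self.id = None

    def __enter__(self):
        self.id = self.backend.create(self.config)
        return self

    def __exit__(self, exc_type, exc, tb):
        # Runs whether the body finished normally or raised.
        self.backend.destroy(self.id)
        return False  # do not swallow exceptions from the with body


backend = FakeBackend()
with TestBed({"resources": []}, backend) as testbed:
    print("live inside the block:", testbed.id in backend.live)
print("live after the block:", testbed.id in backend.live)
```

Because `__exit__` also runs when the body raises, a failing test still releases its testbed.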
We wrote a similar integration for py.test, a Python test framework that many teams here use. A testbed configuration is defined externally (not shown) and one or more testbed instances will be created and handed to your test cases. You can choose whether you want test cases to share the same instance or use separate instances:
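The shared-versus-separate choice maps naturally onto pytest fixture scopes. The sketch below is not our plugin; it assumes a hypothetical `provision` helper and an inline `CONFIG`, and uses only standard pytest features (`@pytest.fixture` and its `scope` argument).

```python
import itertools
from contextlib import contextmanager

import pytest

_ids = itertools.count(1)
CONFIG = {"resources": []}  # placeholder for an externally defined configuration


@contextmanager
def provision(config):
    """Hypothetical stand-in for creating and destroying a testbed."""
    testbed = {"id": next(_ids), "config": config, "alive": True}
    try:
        yield testbed
    finally:
        testbed["alive"] = False


@pytest.fixture(scope="module")
def shared_testbed():
    # One instance is created for the whole module and reused by every test in it.
    with provision(CONFIG) as testbed:
        yield testbed


@pytest.fixture  # default scope is "function"
def fresh_testbed():
    # A new instance is created for each test and destroyed afterwards.
    with provision(CONFIG) as testbed:
        yield testbed


def test_uses_shared_instance(shared_testbed):
    assert shared_testbed["alive"]


def test_uses_fresh_instance(fresh_testbed):
    assert fresh_testbed["alive"]
```

A test case picks its isolation level simply by naming the fixture it wants as an argument.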
The admin side of TBaaS is optimized for temporary environments as well. For example, we added an expiration system for testbeds, which nags users to renew their testbeds and automatically destroys testbeds that have not been renewed. This system helps keep utilization low while allowing people to hold testbeds as long as necessary.
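The nag-then-destroy policy can be sketched in a few lines. The lease length, the warning window, and the function names below are assumptions chosen for illustration; the values TBaaS actually uses are not given here.

```python
from datetime import datetime, timedelta

LEASE = timedelta(days=7)       # assumed lease length
NAG_WINDOW = timedelta(days=2)  # assumed warning period before expiry


def next_action(expires_at, now):
    """Decide what the expiration job should do with one testbed."""
    if now >= expires_at:
        return "destroy"
    if expires_at - now <= NAG_WINDOW:
        return "nag"
    return "keep"


def renew(now):
    """Renewing pushes the expiry a full lease into the future."""
    return now + LEASE


created = datetime(2016, 11, 8, 12, 0)
expires_at = renew(created)
print(next_action(expires_at, created))                      # keep
print(next_action(expires_at, created + timedelta(days=6)))  # nag
print(next_action(expires_at, created + timedelta(days=8)))  # destroy
```

Renewal is just another call to `renew`, so holding a testbed indefinitely costs the user one action per lease period.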
Finally, these tools are nice for automation, but for human usage, we have a web GUI.
A couple years into the project we can look back and evaluate the impact of TBaaS on ADTRAN.
Our need to host containers and VMs with multiple hypervisors makes it difficult for a single orchestration system to meet all of our needs. TBaaS has shone here, providing a convenience layer and filling in the gaps of other orchestration systems we use.
Having your own convenience API layer has certain advantages:
Organizations with a narrower stack might not find these reasons sufficient to maintain an API. For example, an alternative implementation that provides the same essential "testbed as a service" benefits at the cost of less flexibility would be OpenStack + plugins (maybe) + admin scripting.
All in all, TBaaS has worked well for us over the past couple of years, allowing us to hit the ground running with Firefly development and adapt to new needs as they arose. I am excited to see how ADTRAN's infrastructure will continue evolving over the coming years!
ADTRAN, Inc. is a leading global provider of networking and communications equipment. ADTRAN’s products enable voice, data, video and Internet communications across a variety of network infrastructures. ADTRAN solutions are currently in use by service providers, private enterprises, government organizations, and millions of individual users worldwide. For more information, please visit www.adtran.com.