Cloud Journey — Part 5

Chris Shayan
Dec 11, 2021

We are leading a set of major transformational projects that are already live and serving hundreds of thousands of customers. You can read more about the transformation's value streams in “Transformation at Techcombank”, where I explain our DevSecOps practices in detail. In this post, I will share more about using Platform Ops to accelerate DevSecOps adoption.

The platform ops team is responsible for providing a self-service development, deployment and operational platform that enables multiple software delivery teams to build and operate their own products.

What is Platform Ops?

Platform ops is an approach to scaling DevSecOps that involves dedicating a team to the delivery of a shared self-service app platform. Technical professionals focused on DevSecOps should use a product mindset to establish a platform that helps agile development teams deliver higher quality faster. DevSecOps is a customer-value-driven approach for delivering solutions that uses agile methods, collaboration and automation to meet shared goals.

You might ask, “Why do we need yet another form of ‘Ops’? We have NetOps, SecOps, DevSecOps, and even FinOps!” But Platform Ops really is different from the other “Ops”: it increasingly serves as the glue that holds together all the different organizations and use cases required to meet the needs of technology organizations building modern, distributed, and cloud‑native applications. Platform ops leaders and team members must be able to code, configure and deploy apps, plus all the other things that engineers do. They advise and collaborate, but they can also dive into the trenches to get stuff done. This is also, by the way, how they learn what it really feels like to use the platform they have built for the dev teams.

My Learnings

After successfully launching all of our transformational projects (retail omni-channel banking, business omni-channel banking, DevSecOps, private cloud and others; you can read more in “Transformation at Techcombank”), I have learned a few key lessons about Platform Ops that might be worth sharing:

  • Platform ops is the team that delivers, maintains and improves a platform as a service (PaaS), including the continuous integration/continuous delivery (CI/CD) toolchain, for multiple agile application teams delivering custom-built software.
  • The goal of platform ops is to achieve efficiencies and economies of scale in a DevSecOps environment, where application product delivery teams are responsible for deploying and operating their own applications. A shared consistent platform reduces duplication of technology, enables automation and focuses expertise.
  • The platform is itself a product used by developers. Collaboration between platform ops and development teams is essential to create a software-defined platform that operates to the standard required to protect the organization while allowing the development teams to move fast.
  • The platform ops approach is effective whether the underlying infrastructure is in a public cloud, a private cloud, or a virtualized and automated environment on-premises. The details are different, but the high-level functions are common, including security, access control, compliance, cost management and performance management.
  • Treat your platform as a product. The developers using it are your customers, and their requirements should define the platform and your priorities. Your goal is to enable your internal customers (development teams) to deliver services to your organization’s customers.
  • Collaborate with application delivery teams to establish a cross-functional team, platform and operating model that meet business needs. Pushing a platform on application developers has a high risk of failure in capability and adoption.
  • Build teams, processes and culture that will continually improve — not just sustain — the platform through collaboration with platform users. The technologies you deliver to meet developer and platform needs will change and, thus, shouldn’t overshadow the nontechnical aspects of the initiative. Treat evangelism, support and training for your internal customers as seriously as a commercial platform provider would for its paying customers.
  • Use an agile approach by targeting an “anchor” application team and delivering a platform to meet their needs first. Often, this will mean taking over operational responsibility for a platform that the anchor team members were running themselves. Their success will create more demand and diverse requirements.

Platform ops teams will not always create or build a shared platform on-premises. The responsibility of platform ops is to deliver and operate a platform, whether it is:

  • Built on-premises from multiple components (such as open-source Kubernetes or a virtual machine environment highly automated with Red Hat Ansible)
  • Based on commercial platform products (such as Red Hat OpenShift or VMware Tanzu)
  • Composed with governance from public cloud services (such as AWS Elastic Beanstalk or Azure Kubernetes Service)

To apply product thinking to platform requirements gathering, start with customer goals, from developer and administrator personas, in specific scenarios, leading to a set of requirements. It is important to recruit willing, motivated and collaborative team members with valuable technical skills when forming this team. Simply assigning your best, or merely available, people to the team will undermine the culture from the start.

In a DevSecOps organization, you will relocate I&O professionals from traditional I&O silos into product development teams, supporting automation teams, cloud ops teams or SRE-style teams that sit alongside the platform ops team you are creating. These possibilities are shown in the figure above. “Analyzing the Role and Skills of the I&O Professional in DevOps” provides context on this organization and skills transformation for I&O professionals.

Deployed and operational microservice applications encompass two architectural viewpoints:

  • The inner architecture, which describes the software architecture of an individual microservice and what it exposes to the outside world.
  • The outer architecture, which describes the operating environment, platform and distributed management ecosystem in which your microservices will be built, deployed, executed and supported.

The following figure shows a microservice infrastructure, including the relationship between these two architecture views and the key components of the outer architecture as well as Platform Ops.

Continuous delivery (CD) has emerged as the preferred architecture for modern, agile software development, and organizations using agile methodologies should adopt continuous delivery without reservation. Platform ops should prioritize delivering a rock-solid continuous application delivery pipeline because:

  • All product teams will benefit from a reliable CD pipeline, regardless of their deployment requirements, whether they are using containers, aPaaS or more traditional software packaging.
  • It will improve product teams’ ability to work in small batches and incorporate customer feedback along the way.
  • It greatly lowers the risk of errors during the release process and creates a level of predictability not found in manual deployments.
  • “How to Architect Continuous Delivery Pipelines for Cloud-Native Applications” discusses how today’s businesses demand rapid software delivery to stay ahead of the competition, adopt new business models and move into new markets.

The following figure shows the full pipeline, which combines continuous integration with continuous delivery and deployment. Continuous delivery of software extends CI to include the automated build, test and non-production deployment of the application, followed by subsequent delivery to an operations or deployment function for production deployment.
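To make the stage ordering concrete, here is a minimal sketch of such a pipeline in Python. It is an illustration only, not our actual toolchain: the stage names, the `Stage` and `PipelineResult` types and the `run_pipeline` and `build_cd_pipeline` helpers are all hypothetical, and each stage is stubbed with a function that simply reports success or failure.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Stage:
    name: str                  # hypothetical stage label
    run: Callable[[], bool]    # returns True when the stage succeeds
    automated: bool = True     # False marks a hand-off to another function


@dataclass
class PipelineResult:
    completed: List[str] = field(default_factory=list)
    stopped_at: str = ""       # stays empty when every stage passed


def run_pipeline(stages: List[Stage]) -> PipelineResult:
    """Run the stages in order and stop at the first failing stage."""
    result = PipelineResult()
    for stage in stages:
        if not stage.run():
            result.stopped_at = stage.name
            break
        result.completed.append(stage.name)
    return result


def build_cd_pipeline(tests_pass: bool = True) -> List[Stage]:
    """Assumed stage layout: CI (build, test) extended with a
    non-production deployment, then a hand-off for production."""
    return [
        Stage("build", lambda: True),
        Stage("test", lambda: tests_pass),
        Stage("deploy-nonprod", lambda: True),
        Stage("handoff-to-production", lambda: True, automated=False),
    ]


green = run_pipeline(build_cd_pipeline())
red = run_pipeline(build_cd_pipeline(tests_pass=False))
```

In this sketch a failing test stage stops the run before anything is deployed, which is the predictability that manual deployments lack.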

Though a platform does not have to be based on containers or Kubernetes, those options offer a good example of the separation of concerns achievable in the platform ops model. The platform ops team is responsible for the care and feeding of the platform as a whole, and the product team creates and operates an application end to end by itself. The figure below summarizes the relative operations responsibilities for both the product team consuming the platform and the platform ops team. The role of platform ops is to enable the application teams to concentrate on delivering their application and forget about Kubernetes’ intricate infrastructure details. The model works when each type of operations team can do its work independently, occasionally collaborating on the deployment manifests (that is, files that define the relationship between the containers and the infrastructure).

Robust continuous delivery solutions require developing agile fluency and maturity, refactoring application architecture, and automating infrastructure to achieve a suitable delivery cadence. Continuous delivery (CD) makes it possible to regularly adapt software to incorporate user feedback, shifts in the market and changes to business strategy. Building a CD culture requires technical professionals to enhance development skills; change processes, practices and architecture; and automate everything. Engineering discipline is required to facilitate the complete automation of the delivery pipeline from the point where developers commit their code to the actual release to the user. Adoption of agile isn’t enough to enable CD. Agile teams are able to speed the development process, but without DevOps, they are not able to deliver any faster.

Meet Demand for Rapid, Continuous Delivery With a Fluid Release Schedule

Organizations need a better understanding of how microservices architecture (MSA) fulfills new business demands for rapid, continuous change. I am proud to say that we have achieved this at Techcombank as well.

They also need a firmer grasp of the complexity trade-offs involved in achieving these benefits. This understanding can be gained by considering the analogy of a “train service” versus a “taxi service,” as shown in Figure 5:

  • Train service: The release schedule of traditional development approaches and architectures can be likened to a railroad schedule. The planning and delivery of different pieces of functionality are linked and delivered in specific, scheduled time intervals, much like regularly scheduled trains. This approach provides benefits in efficiency and dependency validation, but it constrains the speed and cadence of functional delivery. Passengers on a train cannot depart until the scheduled time. Similarly, any application functionality in a traditional architecture approach must wait to be delivered in a scheduled release — or wait for other pieces it depends on to be completed — before it can reach the users who need it. Agile Release Trains in the Scaled Agile Framework (SAFe) are an example of this model.
  • Taxi service: Microservices, by contrast, are more akin to a taxi-dispatching service. Taxi passengers depart as soon as they’re ready to go, rather than at a scheduled departure time. Likewise, microservices can be used to deliver smaller chunks of functionality fluidly, quickly and dynamically. Because pieces of new functionality don’t have to wait on a deployment schedule, users don’t have to wait as long to take advantage of new functionality. Of course, independently changing the components of your system means you have the additional complexity of managing dependencies. Managing dependencies requires disciplined interface design and versioning. It also requires late binding between services at invocation time using service discovery and request routing.
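
The last point in the taxi analogy, late binding through service discovery, can be sketched in a few lines of Python. This is a toy in-memory registry under stated assumptions: the `ServiceRegistry` class, the `quote` service and the `checkout` caller are hypothetical names, and a real platform would delegate the lookup to its discovery and routing layer.

```python
from typing import Callable, Dict, Tuple

Handler = Callable[[dict], dict]


class ServiceRegistry:
    """Toy in-memory registry keyed by service name and major version."""

    def __init__(self) -> None:
        self._handlers: Dict[Tuple[str, int], Handler] = {}

    def register(self, name: str, major_version: int, handler: Handler) -> None:
        # Re-registering the same key models an independent redeployment.
        self._handlers[(name, major_version)] = handler

    def call(self, name: str, major_version: int, request: dict) -> dict:
        # Late binding: the provider is looked up on every invocation,
        # so callers never hold a direct reference to an implementation.
        try:
            handler = self._handlers[(name, major_version)]
        except KeyError:
            raise LookupError(f"no provider for {name} v{major_version}") from None
        return handler(request)


registry = ServiceRegistry()
registry.register("quote", 1, lambda req: {"price": req["amount"] * 2})


def checkout(amount: int) -> int:
    """Hypothetical consumer that depends only on the v1 interface."""
    return registry.call("quote", 1, {"amount": amount})["price"]


price_before = checkout(10)

# The quote team ships a new v1 implementation and adds a v2 interface
# on its own schedule; the consumer code does not change.
registry.register("quote", 1, lambda req: {"price": req["amount"] * 3})
registry.register("quote", 2, lambda req: {"total": req["amount"] * 3})

price_after = checkout(10)
```

Because the consumer binds to a name and a major version rather than to an implementation, the provider can leave the station whenever it is ready, as long as it keeps the versioned interface stable.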

Adopting microservices architecture means embracing new team structures to match the new responsibilities and processes needed to deliver the new architecture. If your IT leaders are supportive of microservices architecture, they must also understand, support and help to implement this change in focus for you and your peers. The figure below shows the difference between organizational structures suitable for traditional application delivery (on the left) and teams organized to deliver microservices (on the right).

Mesh app and service architecture, shown in the figure below, allows you to optimize for agility through modularity at all levels of a system. It promotes clear definitions of the components involved, the data and functional requirements of those components, and the optimal communication channel requirements between components. Mesh App & Service Architecture (MASA) enables the composable business.

The following are some key principles that Mesh App & Service Architecture (MASA) prescribes in building such an architecture:

  • Each multi-experience app, API and multi-grained service should be independently built, tested and deployed.
  • APIs connecting apps to back-end services (outer APIs) should be mediated to promote consistency, simplify management and provide abstraction.
  • APIs should apply a consumer-centric design principle, where the developers/applications using those APIs are the consumers.
  • A domain-driven design should be used to identify strong boundaries, and to establish integration and composition requirements between those domains.
  • Domain interfaces should be designed to enable the consumption, composition and integration necessary to meet the flexibility goals of the solution, using the most appropriate API styles.
  • For each component of the architecture, ownership for implementation, versioning and operations must be established.
  • Adoption of MASA is gradual. Organizations must start small, make the necessary adjustments and evolve the architecture over time. MASA is never “done.”

The key “hot” application architecture patterns that we see continuing to dominate architecture discussions and planning are:

  • Microservices architecture — Use MSA when you are creating applications that demand a fast pace of change. Ensure that the increased cost of operation is justified by the ability to deliver business-driven changes quickly, and that the demand for change will remain over time. Do not use MSA for applications with a slow pace of change.
  • Event-driven architecture — Use EDA when you need to decouple data providers from consumers, and when you need to enable multiple consumers of a single stream of events. Do not use EDA for cases where the sender needs to know whether a message has been received.
  • Mesh app and service architecture — Use MASA when you need to create multiple user experiences from shared services. Do not use it for cases where a single N-tiered application will suffice.
  • Headless/API-centric architecture — Use headless architecture when you need to decouple user interfaces from back-end services in order to modify existing capabilities or integrate new capabilities. Do not use it when operational simplicity is a higher priority than agility and flexibility.
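
As a small illustration of the event-driven pattern in the list above, here is a minimal in-process publish/subscribe sketch in Python. The `EventBus` class, the `payment.completed` topic and the two consumers are hypothetical; a production system would use a message broker rather than an in-memory dictionary.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Consumer = Callable[[dict], None]


class EventBus:
    """Toy in-process bus that fans each event out to all subscribers."""

    def __init__(self) -> None:
        self._consumers: DefaultDict[str, List[Consumer]] = defaultdict(list)

    def subscribe(self, topic: str, consumer: Consumer) -> None:
        self._consumers[topic].append(consumer)

    def publish(self, topic: str, event: dict) -> None:
        # The producer gets nothing back: it does not know who,
        # if anyone, consumed the event.
        for consumer in list(self._consumers.get(topic, [])):
            consumer(event)


bus = EventBus()
ledger: List[dict] = []
notifications: List[str] = []

# Two independent consumers of the same stream of events.
bus.subscribe("payment.completed", ledger.append)
bus.subscribe(
    "payment.completed",
    lambda event: notifications.append(f"paid {event['amount']}"),
)

bus.publish("payment.completed", {"amount": 50})
bus.publish("topic.without.subscribers", {"amount": 1})
```

Note that `publish` returns nothing and silently accepts an event that nobody subscribes to, which is exactly why EDA is a poor fit when the sender needs to know whether a message has been received.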

