platform engineering
Platform engineering is a specialized discipline within software development that focuses on designing, building, maintaining and improving the toolchains and workflows software developers use. Platform engineering provides comprehensive and consistent tools and processes, which enables developers to focus on software development instead of managing underlying toolchains.
A platform engineering team can build a common suite of tools, services and workflows that provides all development teams with an internal developer platform to streamline development, eliminate waste and enhance efficiency.
A brief history of platform engineering
Software development lifecycles are accelerating as organizations push to gain market share, build competitive advantage, and drive innovation. Shorter development cycles mean more time and resources spent on the underlying toolchains needed for new software iterations. At the same time, development and deployment environments have become considerably more complex with factors such as security, data protection, regulatory considerations and varying cloud alternatives.
These toolchains have long been the direct responsibility of developers and project teams. Developers select, deploy, manage and maintain the tools, and then rely on separate operations staff to provision and configure the resources needed to deploy each build for testing and production.
This paradigm offers flexibility, but it also leads to inconsistencies, bottlenecks and waste. A development team that can barely keep pace with accelerating release schedules might struggle to maintain the supporting toolchain. Setting up tools and troubleshooting their issues takes time away from development. Each project team could adopt different tools and workflows, and development and operations remain separate silos that are difficult to bridge.
Rising platform complexity and the issues that come with it have led to platform engineering. Instead of leaving developers to the task, platform engineering provides a specialized role to manage the underlying platform of development tools, services and environments.
Platform engineering goals
Today, platform engineers implement toolchains and workflows as integrated software development platforms. To create effective platforms, platform engineering teams should keep in mind a number of goals:
- Developer productivity. A platform should support and enhance developer productivity. A meaningful platform doesn't just relieve developers of platform tasks -- it actively accelerates development efforts.
- Self-service. A platform should be a frictionless resource. Developers should be free to use the tools and services needed with little (if any) intervention from the platform engineering team.
- Security. A platform must be secure -- not only in controlling user access, but also in protecting the intellectual property reflected in the codebase and its associated data.
- Resilience. Developers cannot work without a platform, so platforms must provide both scalability and availability to meet the uptime and performance needs of developers.
- Observability. Platforms require monitoring to ensure performance and availability, so tools are typically fitted with instrumentation for health and performance monitoring. When a tool fails or crashes, platform engineers must act quickly to remediate the issue and prevent extended disruptions.
- Collaboration. Multiple teams, which can number dozens or even hundreds of people, might use the platform. This makes collaboration tools and capabilities a vital part of any platform. Capabilities might include collaborative repositories, interactive editing and testing tools, and other creative means of sharing ideas, work and results.
- Improvement. Platform development should enable developer success. Improvement can include regular troubleshooting and support, along with periodic updates and enhancements to tools, APIs, data stores, workflows and policies.
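The observability goal above can be illustrated with a small sketch. The Python below is only an illustration under stated assumptions: `HealthResult`, `run_health_checks` and `unhealthy_tools` are hypothetical names, not part of any monitoring product, and each check is assumed to be a zero-argument function that returns True when its tool is healthy. A real platform would more likely rely on a dedicated monitoring system.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class HealthResult:
    """Outcome of one health check (hypothetical structure)."""
    tool: str
    healthy: bool
    detail: str


def run_health_checks(checks: Dict[str, Callable[[], bool]]) -> List[HealthResult]:
    """Run every check and record the outcome.

    A check that raises is recorded as unhealthy rather than
    aborting the remaining checks.
    """
    results: List[HealthResult] = []
    for tool, check in checks.items():
        try:
            ok = bool(check())
            detail = "ok" if ok else "check returned False"
        except Exception as exc:
            ok = False
            detail = f"check raised {type(exc).__name__}: {exc}"
        results.append(HealthResult(tool, ok, detail))
    return results


def unhealthy_tools(results: List[HealthResult]) -> List[str]:
    """Return the names of tools that need attention, in check order."""
    return [r.tool for r in results if not r.healthy]
```

Catching exceptions per check is a deliberate choice in this sketch: one crashed tool should not hide the status of the others from the platform team.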
Platform engineer job responsibilities and skills
Platform engineers are responsible for handling all platform tasks, but specific responsibilities vary with the unique needs and size of each business. Common platform engineering responsibilities include the following:
- Design, implement and maintain the platform. Platform engineers are infrastructure experts, and their principal role is to provide the tools, services and physical infrastructure developers use. This often extends to creating workflows and policies needed to facilitate the platform and its use.
- Update and upgrade the platform. Platform engineers must evaluate and test new tools, and update existing tools as new patches and upgrades become available. They also document the platform and provide training to developers and other stakeholders.
- Monitor infrastructure. The platform must meet clear and well-defined metrics for performance and availability. Platform engineers use tools and technologies to monitor the many elements of the platform to check health, ensure performance and maintain security.
- Support platform and applications. Platform engineers are the helpdesk for the platform. Developers and other users will turn to platform engineers when they have problems with tools, services and application deployments. In turn, platform engineers perform troubleshooting and remediation of the platform and applications running on the platform.
- Implement self-service and automation. Modern software development platforms often utilize (or are closely modeled after) the public cloud, with a strong emphasis on automation and self-service. Platform engineers develop scripts, APIs and other resources to automate the platform for developers.
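As one way to picture the self-service and automation responsibility, the sketch below validates a developer's environment request against platform guardrails without a human in the loop. Everything named here is an assumption made for illustration: `EnvironmentRequest`, `ALLOWED_SIZES`, `MAX_TTL_HOURS` and the returned plan format are hypothetical and do not correspond to any specific platform or cloud API.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical guardrails a platform team might publish.
ALLOWED_SIZES: Dict[str, int] = {"small": 2, "medium": 4, "large": 8}  # size -> vCPUs
MAX_TTL_HOURS = 72  # assumed upper bound on environment lifetime


@dataclass(frozen=True)
class EnvironmentRequest:
    """A developer's request for a temporary environment (hypothetical)."""
    team: str
    size: str
    ttl_hours: int


def validate_request(req: EnvironmentRequest) -> List[str]:
    """Return a list of guardrail violations; an empty list means valid."""
    errors: List[str] = []
    if not req.team.strip():
        errors.append("team must not be empty")
    if req.size not in ALLOWED_SIZES:
        errors.append(f"size must be one of {sorted(ALLOWED_SIZES)}")
    if not 1 <= req.ttl_hours <= MAX_TTL_HOURS:
        errors.append(f"ttl_hours must be between 1 and {MAX_TTL_HOURS}")
    return errors


def handle_request(req: EnvironmentRequest) -> dict:
    """Approve a valid request with a provisioning plan, or explain the rejection."""
    errors = validate_request(req)
    if errors:
        return {"approved": False, "errors": errors}
    plan = {
        "team": req.team,
        "vcpus": ALLOWED_SIZES[req.size],
        "expires_in_hours": req.ttl_hours,
    }
    return {"approved": True, "plan": plan}
```

Returning every violation at once, rather than stopping at the first, lets a developer correct a request in a single pass, which fits the goal of minimal intervention from the platform team.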
Platform engineers must also keep up with new technologies to ensure the platform is competitive. This often involves finding new opportunities to reduce costs and improve application performance and availability. In addition, platform engineers must be effective communicators who can work closely with platform stakeholders to discuss issues and develop a meaningful roadmap for platform development.
Successful platform engineers require a mix of hard and soft skills:
- Continuous integration/continuous delivery pipeline expertise. Platform engineers develop platforms with the primary purpose of supporting the software development process, so they must possess detailed knowledge of the CI/CD pipeline and the software development lifecycle (SDLC) to build a suitable platform for developers.
- Coding. Even though they aren't part of software product teams, platform engineers must possess software development skills. This typically involves coding proficiency in major languages such as Python for scripting and automation; Java for building platform tools and services; and C++ for building major applications, operating systems, database applications and other vital custom software for the platform. Many platform engineers also require detailed knowledge of scripting languages and frameworks.
- Debugging. As specialized developers, platform engineers must be experts at troubleshooting and debugging the code used to create and run the platform. This demands expertise in analyzing logs and examining error messages, and then tracing the flow of code across each software and hardware element of the platform.
- Network expertise. Modern platforms depend on proper network operation, so platform engineers require working knowledge of networking concepts including TCP/IP, the Domain Name System (DNS) and HTTP. They must know how to both set up and secure a network.
- Cloud computing expertise. Platforms are typically modeled after clouds, and are increasingly deployed on public clouds such as AWS, Azure or Google Cloud Platform. Platform engineers should be able to deploy, monitor and manage tools and services deployed to public cloud infrastructures.
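The debugging skill described above, analyzing logs and examining error messages, can be sketched in a few lines of Python. The log line format used here (`<timestamp> <LEVEL> <component>: <message>`) is an assumption made for this example, as is the function name `count_errors_by_component`; real platform logs vary widely and would need their own pattern.

```python
import re
from collections import Counter
from typing import Iterable

# Assumed line format: "<timestamp> <LEVEL> <component>: <message>"
LOG_LINE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<level>[A-Z]+)\s+(?P<component>[\w.-]+):\s+(?P<message>.*)$"
)

# Levels treated as failures in this sketch.
ERROR_LEVELS = {"ERROR", "CRITICAL"}


def count_errors_by_component(lines: Iterable[str]) -> Counter:
    """Count ERROR and CRITICAL lines per component.

    Lines that do not match the assumed format are skipped.
    """
    counts: Counter = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and match.group("level") in ERROR_LEVELS:
            counts[match.group("component")] += 1
    return counts
```

Calling `most_common(1)` on the returned counter gives the component with the most failures, which is often a reasonable place to begin tracing a problem across the platform.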
Platform engineering benefits and drawbacks
Platform engineering promises several important benefits:
- Improved project quality. More developer time and skills are available to work on projects instead of platforms. This helps speed delivery schedules, bolsters overall software quality, and leaves more time for testing.
- Improved development consistency. A well-developed platform provides all project teams with a uniform set of tools and processes. This creates a well-understood and well-supported development environment. Every developer and team uses the same tools and processes, building consistency in product development and deployment.
- Improved versatility and efficiency. When all developers use the same platform, members from one team can shift to another team or project and already be familiar with the tools and workflows. This can greatly shorten learning curves and make cross-team and cross-project collaboration far more efficient than when teams have different tools or processes.
- Improved platform uptime. Platform engineers specialize in building, maintaining and improving the platform. Developers call the platform team when a tool or task encounters problems, leading to faster and more decisive remediations of platform problems.
Despite the benefits of platform engineering, there are several potential drawbacks organizations should consider:
- Buy-in. Platforms are costly and can carry some risk. Platform initiatives require strong involvement and support from senior management and project leaders.
- Larger staff. Creating a platform engineering team often means hiring more employees with platform expertise. This increases staff and overall costs for the business.
- Greater complexity. A software development platform represents yet another infrastructure that the business needs to control. Further, the business can become dependent on a single platform, which can sometimes constrain developers' creativity and potentially limit the ways that developers solve problems.
- Change can be difficult. Building a single platform can solve many problems for the business, but dependencies and performance expectations can make changes to that platform more time-consuming and difficult -- though managing such change is a core responsibility of platform engineers.
- Possible compatibility issues. A platform should be uniform, but no two projects are the same. A platform designed for one type of project might not suit all types of projects. Consequently, a platform might experience limitations in the types of projects that it can adequately support.