Principal Platform Architect
WHAT MAKES US A GREAT PLACE TO WORK
We are proud to be consistently recognized as one of the world’s best places to work. We are currently the top-ranked consulting firm on Glassdoor’s Best Places to Work list and have earned the #1 overall spot a record seven times.
Extraordinary teams are at the heart of our business strategy, but these don’t happen by chance. They require intentional focus on bringing together a broad set of backgrounds, cultures, experiences, perspectives, and skills in a supportive and inclusive work environment. We hire people with exceptional talent and create an environment in which every individual can thrive professionally and personally.
WHO YOU’LL WORK WITH
As the premier consulting partner for the private equity industry, Bain's Private Equity Group (PEG) boasts a global practice that is over three times larger than any competitor. Our network of over 1,000 professionals supports private equity and institutional investor clients through every stage of the investment life cycle, from deal generation and due diligence to portfolio value creation and exit planning.
Bain & Company is developing a suite of cutting-edge data and software solutions designed to revolutionize how the private equity industry uses data for investment insights and decision-making.
The PEG Innovation team's mission is to create analytical solutions for Bain clients, teams, and the broader institutional investor space using proprietary software and data products. This includes the development, commercialization, and daily management of Bain's proprietary datasets, data, and software businesses.
WHERE YOU’LL FIT WITHIN THE TEAM
The Principal Platform Architect is the technical authority for the PE due diligence platform’s core infrastructure and shared services. You set architectural direction across Platform Engineering, author and steward Architecture Decision Records (ADRs), and ensure the decisions made today do not become the liabilities of tomorrow. This is a hands-on role: you write code, prototype solutions, and review PRs regularly. You are comfortable working across the full stack — from Kubernetes network policy and service mesh controls to API contract design, event-driven integration patterns, and data security boundaries — and you lead technical decision-making across squads when problems are ambiguous, complex, or high-risk.
WHAT YOU'LL DO
Platform Architecture, Standards, and Technical Direction (80%)
- Set and maintain architectural standards across all platform services and infrastructure, including service decomposition, API design, event-driven integration patterns, and failure-mode analysis.
- Author, review, and steward Architecture Decision Records (ADRs) for significant technical decisions; ensure decisions are documented, discoverable, and enforced through implementation.
- Lead technical design reviews for new services, major features, and cross-cutting changes; identify risks early and define viable, secure, operable paths forward.
- Own the platform’s non-functional requirements (performance, scalability, reliability, security): define target outcomes, guide implementation, and validate through evidence (metrics, tests, load results, incident learnings).
- Define and enforce engineering standards across the estate via reusable CI workflows, templates, and shared tooling; reduce variance across teams while enabling delivery velocity.
- Design and govern platform security architecture: zero-trust networking, mTLS policy, Vault secret management, supply chain security controls, and strict data boundary enforcement.
- Provide technical leadership on Kubernetes and cloud platform architecture: cluster architecture, namespace isolation, RBAC, admission controls, autoscaling, GitOps delivery, and safe multi-tenant patterns.
- Guide event-driven platform design using Kafka, including schema evolution practices, CloudEvents envelope conventions, and reliable consumer patterns.
- Define observability and reliability practices: OpenTelemetry instrumentation standards, structured logging, distributed tracing, Prometheus metrics, and SLO definition/measurement.
Other (20%)
- Mentor senior engineers and technical leads; raise the technical bar across the team through code review, pairing, and coaching on architectural thinking and production engineering practices.
- Identify and address technical debt proactively; propose and drive remediation plans that balance delivery needs with long-term platform health.
- Collaborate with AI Engineering leadership on Agent Gateway architecture, safe AI workload patterns, and ephemeral agent compute requirements.
- Represent Platform Engineering in cross-functional technical discussions with Data Platform, Product Engineering, and product squads; align on shared contracts, ownership, and delivery plans.
- Use AI tooling to accelerate research, prototype options, and draft ADRs/runbooks; apply rigorous judgement and validation before any output influences production decisions.
ABOUT YOU
- Bachelor’s degree in Computer Science, Engineering, Information Systems, or a related field (or equivalent practical experience).
- 10+ years of experience in platform engineering, backend engineering, SRE/DevOps, or infrastructure engineering roles operating production systems at scale.
- Demonstrated experience serving as technical authority for distributed systems: architecture, service boundaries, API contracts, and event-driven design.
- Proven track record designing and operating Kubernetes-based platforms in production, including cluster architecture, workload isolation, and GitOps delivery patterns.
- Experience defining and enforcing engineering standards across multiple services/teams (CI/CD quality gates, templates, shared libraries, and coding standards).
- Experience designing security architecture for production platforms (zero-trust, mTLS, secrets management, supply chain security, and data boundary enforcement).
- Demonstrated ability to influence cross-functional stakeholders and drive decisions across multiple squads without relying on formal authority.
Platform Engineering/Architecture
- Deep expertise in distributed systems design: service decomposition, event-driven architecture, API design patterns, resilience strategies, and failure mode analysis.
- Strong Python proficiency; able to set and enforce standards across a multi-service Python codebase (FastAPI, Pydantic v2, SQLAlchemy 2.0, async patterns).
- Expert-level Kubernetes knowledge: cluster architecture, namespace isolation, RBAC, admission controllers, service mesh (Istio or equivalent), autoscaling (Karpenter/KEDA or equivalent), and GitOps (ArgoCD).
- Deep familiarity with infrastructure as code (Terraform) and cloud platform architecture (AWS or GCP) at production scale.
- Security architecture: zero-trust networking, mTLS, Vault secret management, supply chain security controls, and data boundary enforcement.
- Event-driven platforms: Kafka design and operations, schema evolution practices, consumer reliability patterns, and CloudEvents standards.
- Observability and reliability: Prometheus, Grafana, OpenTelemetry, structured logging, distributed tracing, alerting strategy, and SLO definition/measurement.
- Familiarity with durable execution patterns (Temporal or equivalent) and ephemeral workload management patterns (KEDA, Spot capacity strategies).
Generative AI and agentic systems
- Uses AI coding assistants (Cursor, GitHub Copilot, or equivalent) as a first-class part of the development workflow; critically evaluates, refactors, and validates agent-generated code rather than accepting it uncritically.
- Designs platform architectures that safely accommodate AI workloads: Agent Gateway patterns, egress controls, prompt/version governance, auditability, and ephemeral agent compute constraints.
- Familiar with agentic CI/CD patterns: LLM-assisted code review, automated architecture drift detection, and AI-generated runbook suggestions as pipeline quality gates.
- Capable of evaluating AI-generated infrastructure code (Terraform, Kubernetes manifests, Helm charts) for correctness, operability, and security before it enters the estate.
General
- Operates with a long time horizon; optimises for platform health in years, not just the next sprint.
- Writes code and reviews PRs regularly; remains deeply connected to the codebase and production realities.
- Treats documentation as a deliverable: ADRs, runbooks, and architectural diagrams are kept current and actionable.
- Raises concerns early and constructively; does not block progress without proposing viable alternatives.
- Uses AI tooling to move faster, but applies rigorous judgement and validation before any output influences production decisions.
- This role follows a hybrid model, requiring in-office presence at least 1 day per week.
U.S. COMPENSATION INFORMATION
Compensation for this role includes base salary, an annual discretionary performance bonus, a 401(k) plan with an annual employer contribution based on years of service, and Bain’s best-in-class benefits package (details listed below).
Some local governments in the United States require that a good-faith, reasonable salary range be included in job postings for open roles. The estimated annualized compensation for this role is as follows:
In Atlanta, the good-faith, reasonable annualized full-time salary range for this role is $164,500 to $179,500.
In Boston, the good-faith, reasonable annualized full-time salary range for this role is $189,250 to $206,500.
In Dallas, the good-faith, reasonable annualized full-time salary range for this role is $172,750 to $188,500.
In Chicago, the good-faith, reasonable annualized full-time salary range for this role is $181,000 to $197,500.
In New York, the good-faith, reasonable annualized full-time salary range for this role is $205,750 to $224,500.
Placement within these ranges will vary based on factors such as experience, education, training, and skill level.
Compensation also includes a discretionary annual performance bonus, 401(k) plan with employer contribution, and Bain’s best-in-class benefits—including full premium coverage for medical, dental, and vision, generous paid time off, and more.
- Annual discretionary performance bonus.
- This role may also be eligible for other elements of discretionary compensation.
- 4.5% 401(k) company contribution, which increases after 3 years of service and is 100% vested upon start date.
Bain & Company's comprehensive benefits and wellness program is designed to help you achieve personal independence, protection, and stability in the areas most important to you and your family.
- Bain pays 100% of individual employee premiums for medical, dental, and vision programs, offering one of the most comprehensive medical plans for employees without impacting your paycheck.
- Generous paid time off, including parental leave, sick leave, and paid holidays.
- Fully vested 401(k) company contribution.
- Paid Life and Long-Term Disability insurance.
- Annual fitness reimbursements.