Inevitably, when discussing what we do at Included Health (formerly Grand Rounds), the Tech Stack Question comes up:

“So what tech stack are you on?”

I usually crack a joke about Erlang and brace myself for a lecture about how you can code Fibonacci in just two lines of code. (Rest assured, for the moment, we don’t have any Erlang code.)

It can be tempting to dismiss the Stack Question as irrelevant. We’re redefining healthcare! It doesn’t matter if we write code in Python or Go. We don’t hire engineers based on whether they know React vs. Angular.

But it does matter. While there may be a lot of tools we could use to get the job done, the tools we actually use can have profound implications for how we serve our customers and members. We want the tools to be able to do the job, but we also expect them to be mature, widely supported, liked by engineers, performant, and so on.

Also, if you were being hired to be a race driver, wouldn’t you want to know whether you’re signing up for the 24 Hours of Le Mans or the 24 Hours of Lemons? They can both be a lot of fun, but only one of them has real prize money.

So, if you’re considering joining us, or you are technologically curious, let’s talk tech stack.

Architecture

Our architecture is best described as a “hybrid”. We have several dozen services, or microservices, although there are a few we affectionately consider monoliths.

We’re working on ways to break the monoliths into more manageable chunks, but we’re not fully drinking the microservices Kool-Aid. We don’t want a proliferation of absurdly tiny services.

Instead, we see microservices as a tool that gives us finer-grained control over certain parameters, which, in turn, improves our engineering quality and speed.

Here are two articles that align with some of our philosophies at Included Health and are well worth the read:

We’ve been using GraphQL, REST, and REST/FHIR as our communication protocols, and while REST and FHIR are not going away, we’re definitely making a push to standardize around Federated GraphQL. There are advantages in being able to encapsulate parts of the graph in specific domains, manage schema evolution, and more.

Many companies are heading in the Federated GraphQL direction, the tooling is maturing, and best practices are emerging. Still, not everything has been figured out, so it’s an exciting area to work on!
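To make the idea of encapsulating parts of the graph in specific domains a little more concrete, here is a minimal sketch of what a federation gateway does conceptually: each top-level field is owned by one domain service, and an incoming query is split by owner. The service and field names are hypothetical, and this is not how any particular gateway is implemented.

```python
from collections import defaultdict

# Hypothetical ownership map: each top-level field belongs to exactly one
# domain service (subgraph). These names are illustrative only.
FIELD_OWNERS = {
    "member": "members-service",
    "appointments": "scheduling-service",
    "careTeam": "care-platform-service",
}


def plan_query(requested_fields):
    """Group the requested top-level fields by the subgraph that owns them."""
    plan = defaultdict(list)
    for field in requested_fields:
        owner = FIELD_OWNERS.get(field)
        if owner is None:
            # No domain has claimed this field, so the gateway cannot resolve it.
            raise KeyError(f"no subgraph owns field {field!r}")
        plan[owner].append(field)
    return dict(plan)


if __name__ == "__main__":
    print(plan_query(["member", "careTeam", "appointments"]))
```

Because ownership lives in one map, a domain team can evolve its own fields without touching the others, which is the main appeal of the federated approach.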

Applications Back-End

Historically, Included Health has run mostly on Ruby/Rails. We have many production services running on Ruby, including one of our bigger monoliths. We also have some Scala services.

Our applications were vertically integrated and relatively small, so the monolith worked quite well. When we had needs for new applications that did not fit neatly in the monolith, we were not shy about creating them separately.

Eventually, however, the vertically integrated approach reached its limits, and in 2020, we started separating the front-ends from the back-ends.

The back-ends started migrating to Golang to support scaling both our member base and our engineering teams. All new back-end services are being created in Go, and we are gradually migrating our legacy services as well. There are lots of opportunities in this area to work on projects that get leveraged across our entire set of applications.

Applications Front-End

Most of our web front-end was React/JavaScript. It was being served via our Rails infrastructure in one of our bigger monoliths. This made it harder to debug and fix performance issues, as well as do ongoing maintenance.

Since late 2020, we’ve been carving the front-end out as a standalone service. As part of this upgrade, we’re also using Next.js and fully converting everything over to TypeScript, which will help us avoid some of the obvious type-safety bugs. Our first migrated service reduced our user-facing latencies by a factor of ~20!

On the mobile side, we use Swift for iOS and Kotlin for Android. We have a number of webviews in the mobile apps, so we’re migrating those to native, as well as re-aligning the native apps to be a lot more server-configurable. We’re trying to keep the flexibility of server-driven configuration while also creating a fast, fluid, low-latency user experience.
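As a rough illustration of server-driven configuration, the sketch below has the server describe a screen as JSON while the client maps each component type to something it knows how to draw. The payload shape, component types, and renderer names are all assumptions made for this example, not our actual schema, and a real client would do this in Swift or Kotlin rather than Python.

```python
import json

# Hypothetical registry of components this client version can draw natively.
RENDERERS = {
    "header": lambda props: f"[HEADER] {props['title']}",
    "button": lambda props: f"[BUTTON] {props['label']}",
}


def render_screen(payload):
    """Render a server-supplied screen description, skipping unknown components."""
    screen = json.loads(payload)
    lines = []
    for component in screen["components"]:
        renderer = RENDERERS.get(component["type"])
        if renderer is None:
            # An older client may not know a newer component type; skip it
            # rather than crash, so the server can roll out new layouts safely.
            continue
        lines.append(renderer(component.get("props", {})))
    return lines


if __name__ == "__main__":
    example = json.dumps({
        "components": [
            {"type": "header", "props": {"title": "Welcome"}},
            {"type": "button", "props": {"label": "Book a visit"}},
        ]
    })
    print("\n".join(render_screen(example)))
```

The design trade-off is visible even at this size: the server controls layout and ordering, while the rendering itself stays native and fast.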

Care Platform Team

The Care Platform provides our care and clinical team with the tools and aggregate data views they need to fulfil member needs.

We integrate a Salesforce product called Health Cloud with the Included Health ecosystem. Our Health Cloud instance is heavily customized to enable the care team to access medical data in streamlined views. The Care Platform team also owns care team-driven service lines end to end and creates the member-facing React and TypeScript client code that helps members interface with the care team.

The Care Platform exposes care team-generated data to other internal systems through our GraphQL gateway via a Golang translation layer service. We’ve also integrated with the EHR athenahealth to deliver our telemedicine offering.

Included Health supports real-time communication with our care and clinical team through a Ruby chat service backed by Sendbird. Chat has become an essential communication channel and is preferred by many members.

Data Engineering & Data Science

Our data platform is at the heart of Included Health. To drive better health outcomes for our members, we are focused on making the best use of data. For example, using this data we can proactively reach out to members who are at greater risk for certain conditions and work with them to develop a plan that works for them.

Our data platform stores data in S3, catalogs the data with Glue, and orchestrates Spark job executions with Airflow. Our Spark data pipelines are language agnostic, and various teams write in Java, Scala, Python, and Spark SQL.
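At its core, orchestration means running jobs in an order that respects their dependencies. The sketch below shows that idea using only the Python standard library. It is not Airflow code, and the job names are hypothetical placeholders.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each job maps to the set of jobs it depends on.
PIPELINE = {
    "extract": set(),
    "clean": {"extract"},
    "aggregate": {"clean"},
    "publish": {"aggregate", "clean"},
}


def execution_order(pipeline):
    """Return one valid run order in which every job follows its dependencies."""
    return list(TopologicalSorter(pipeline).static_order())


if __name__ == "__main__":
    for job in execution_order(PIPELINE):
        print("run", job)
```

A real orchestrator adds scheduling, retries, and monitoring on top of this ordering, but the dependency graph is the part that pipeline authors actually declare.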

Humans interacting with the data platform use Querybook, our data collaboration environment. It supports data discovery, data notebooks, collaboration, and publishing new datasets back to the data platform.

Data scientists primarily work in Python, and our significant data science platform spans both our data platform and BigQuery. By the end of 2021, we expect to have moved off of BigQuery entirely to the data platform.

Over the next year, we’re exploring areas such as fields as first-class objects, a unified collaborative data model/mesh, reporting and presentation layers built to support our collaborative data model, and much more.

To learn more about our data platform, check out this post.

Developer Platform & Infrastructure

To meet the needs of our growing engineering teams, we’re heavily investing in providing the correct tooling to make sure they’re happy and productive. Our Developer Platform team collaborates closely with the rest of engineering to create those tools.

All our services run in AWS using Docker, Istio and Kubernetes. We use CircleCI for CI/CD and have a fairly solid process for creating and deploying across staging and production environments.

The newest work we are doing here is on observability and alerting. We’re consolidating our various existing systems around a Prometheus/Grafana solution to enable all services to get logs, exception management, dashboards, and alerts without any effort on their part.

Collecting metrics helps us understand the health of individual services, as well as the entire ecosystem, which enables us to quickly pinpoint problems and direct resources where they’re needed most.
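For readers who have not worked with Prometheus, the sketch below shows roughly what a scraped counter looks like in the Prometheus text exposition format. It uses only the standard library to stay self-contained. A real service would normally rely on an official Prometheus client library instead, and the metric and label names here are made up for illustration.

```python
from collections import Counter


class RequestCounter:
    """A tiny labeled counter rendered in Prometheus text exposition format."""

    def __init__(self, name, help_text):
        self.name = name
        self.help_text = help_text
        self._counts = Counter()

    def inc(self, service, status):
        # Each distinct (service, status) pair becomes its own time series.
        self._counts[(service, status)] += 1

    def render(self):
        lines = [
            f"# HELP {self.name} {self.help_text}",
            f"# TYPE {self.name} counter",
        ]
        for (service, status), value in sorted(self._counts.items()):
            lines.append(
                f'{self.name}{{service="{service}",status="{status}"}} {value}'
            )
        return "\n".join(lines) + "\n"


if __name__ == "__main__":
    counter = RequestCounter("http_requests_total", "Total HTTP requests.")
    counter.inc("chat", "200")
    counter.inc("chat", "500")
    print(counter.render(), end="")
```

Labels such as the service name are what let a dashboard show both the health of one service and a roll-up across the whole ecosystem.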

Time to wrap.

Erlang jokes aside, this sums up the Tech Stack Question at Included Health.

In a future post, we’ll explore how we make technology choices, which are often messy and complicated. You rarely start from scratch with a blank canvas and fresh paint, but that’s what makes engineering life interesting.

We are always looking for talented engineers who want to make a difference. If this sounds interesting and exciting to you, we invite you to browse open positions.

(Thanks to Nick Gorski and Andy Ray for their contributions in answering the Stack Question.)
