RubyConf 2019 – Principles of Awesome APIs and How to Build Them by Keavy McMinn


Good morning, everyone! 
[ Applause ] Don’t do that! Thanks. Several years ago,
I spent some time training on the island of Lanzarote. It’s a place many triathletes
love to hit. You can ride for hours without seeing a single car, but the weather and landscape
can be brutal.  When I first rode there, I had to stop a few
moments into the ride, because I couldn’t believe how strong the wind was. I thought
I was going to be blown clean off my bike. But I’m an idiot and I decided to train for the
Ironman race there. I am kind of small, and on a triathlon bike, the wind can push me
around. It was gruelling. Some days the headwind was so strong that I had to work
really hard to inch forward at all. And I finished fast descents in such a state of
shock to still be upright that my legs trembled for minutes. 
I remember at the end of my first training day, I said to my friends, this course and
its weather was too tough for me. I don’t think I can do it. A more experienced friend
offered me great advice. She said try to work with the wind. Don’t fight it all the
time. Otherwise I’d just exhaust myself. Initially I wasn’t too sure how to physically do that
when you have to go in a certain direction. But through practice, I got better at getting
a sense for what the wind was doing and assessing the risk and learning to work with my bike
and my body. I gradually learned how to judge the crosswinds, how to ride at a 30 degree
angle, literally leaning into it. And I shifted my mental attitude. I stopped trying to ignore
or fight it. And I developed a respect for the wind. I used to talk out loud to it,
as you do.  The wind was just a fact, and it was a waste
of my energy to wish it was any different. Accepting and embracing that made all the difference. 
A good API needs to be consistent, stable, and well‑documented. We know this. As
consumers ourselves, these are the qualities we want from APIs that we have to use. As
consumers, we know how annoying a poor API experience can be. We might be totally dependent
on an API for our product, but yet we don’t have any control over it. We have to react
to it. Perhaps sometimes scrambling to get it updated just in time with producers’ changes. And
that’s when we even know about the change in advance. 
Or we might use a handful of APIs to help with smaller tasks, maybe someone set up a
cron job that runs a script that calls an API and produces a monthly report. We’re
not monitoring it because it’s just a little script somewhere. And it works just fine
like that for years, and then the department that reads that monthly report starts to notice
that the numbers just look a bit off. So, you dig and eventually discover that the API
is not returning numbers the same way. So, now half the data in that report is just wrong
or missing.  If an API change is handled poorly, it can
be really painful for us as consumers. Then from our experience as producers, there’s
a certain type of developer who doesn’t like things to be wrong. You may know this type
or be one yourself. The type of developer who writes scripts to catch the inconsistencies
in your code base, refactors the whole test suite to make sure every scenario is covered
efficiently. They notice all the details and give the most thoughtful and considered
code review. This is not me, by the way.  But this is someone you want on your API team. But
the struggle for them is sometimes when things are wrong with an API, they have to stay wrong. Sometimes
you can’t fix things because your users are accustomed to the wrong thing, so fixing it
would actually break it. And that can be really hard for some developers to accept. 
Being able to change the API or not, as the case often is, is probably the biggest pain
point we feel as developers of APIs. I want to just pause for a second to remind ourselves
what an API is, and why that ability to change is so hard. 
For many of our use cases, an API is the interface between our product and its users. It’s interacted
with by code on the user’s side to perform something for their product. When we’re developing
software for humans to use, if we change how it looks or works slightly, the humans might
get confused, but they’ll figure it out. When it’s computers on the other side, that’s not
gonna fly. Code can’t just figure it out.  So because our users rely on our API’s code
interacting in the moment with their code, by the nature of that territory, it needs
to be a very conservative space for development. When we’re developing APIs, we’re bound by many
constraints, centered around the communication and expectations between both sides of that
interface. That’s the nature of developing APIs. Similar to the realities of riding
your bike in the wind, we’re developing in a landscape of constant constraints. 
If we want to produce an awesome API, which we know means being consistent, stable, and
well‑documented, we need to work with those constraints. Trying to fight the reality
of APIs, the nature of that interface is doomed to failure for the product and misery for
us as developers.  From other areas of development, we might
be used to imposing our will on the product, so it might be frustrating for us to not be
able to make changes how and when we want, but that’s the nature of building APIs. So,
we can either be miserable, fighting those constraints, or we can embrace them. 
One thing that I think is useful for us to remember is that these constraints are good
for us, too. If you’re writing an API for yourself, for a hobby project where you’re
the producer and consumer, you can do whatever you want where you want. But this talk is
for building APIs for other people to use. You want people to use your product, service,
tool.  The API might be the foundation of your entire
service, or it might enable your product to grow and evolve through the creativity and
power of all the tools that build on top of it. 
The ability to build an ecosystem around your product through the API is hugely powerful
for a business. It increases the value exponentially. You’re brokering relationships between parties that
otherwise might not be able to connect. But each time you break the API, that’s one less
tool that can work with your service.  So these constraints, when respected, offer
both sides some protection. They enable our users to lend their trust to build their software
on top of ours, and they also enable us to grow and maintain a wide user base that enables
us to be successful.  So here are some ways that I’ve found useful
for embracing and working with those constraints, to help achieve those main principles of being
consistent, stable, and well‑documented. First, to be consistent, we need to be rigorously
consistent in how we represent our data. For example, if a user has a property of Admin
that is returned as a Boolean in one end point, it shouldn’t be returned as a string or integer
in another. And you might think, well, of course. We wouldn’t do that. And no one
would intentionally. But as an API code base or team grows, these inconsistencies can slip
through to the public interface really easily. And they slip through particularly easily when
there are multiple apps that make up a public API. 
So to help avoid those mistakes, we can use tooling to help maintain the consistency of
our API’s data. And there are a lot of choices for this. These are just a few. These libraries
or specs provide a specific way to describe what objects exist in your API, what they
look like, and how people can work with them.  My preferred option is JSON schema, because
it’s relatively simple, but flexible and powerful. For example, you can use it to validate user input
as well as your output.  And in this conservative nature of API development,
it’s usually best to approach any shiny, new thing with caution. So, it’s reassuring to
know JSON schema has been around for years, and it’s battle‑tested in production by
several APIs that many of us know and use like Heroku, GitHub, and soon, Fastly. 
A schema can be the one source of truth for your API. Knowing that your source for data
is accurate means you can confidently use it to do things like generate documentation
examples, test the representation of objects in your requests and responses, and validate
user input. I’m going to talk about JSON schema, but you might prefer to choose a different
tool. The important thing is that you use something as your one source of truth. Select
the tool that best suits your context, your organization, your code base. 
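To make the one-source-of-truth idea a bit more concrete, here is a minimal sketch in plain Ruby. It is not JSON Schema itself: `USER_SCHEMA` and `schema_errors` are hypothetical names, and the hand-rolled type check only stands in for what a real schema validator would do. The point is just that a single definition can catch an `admin` value that comes back as a string instead of a Boolean.

```ruby
require "json"

# Hypothetical single definition of what a user object looks like.
USER_SCHEMA = {
  "id"    => Integer,
  "login" => String,
  "admin" => [TrueClass, FalseClass]
}.freeze

# Returns a list of human-readable problems; an empty list means the payload matches.
def schema_errors(payload, schema)
  errors = []
  schema.each do |key, allowed|
    unless payload.key?(key)
      errors << "missing key: #{key}"
      next
    end
    value = payload[key]
    unless Array(allowed).any? { |klass| value.is_a?(klass) }
      errors << "#{key}: expected #{Array(allowed).join(' or ')}, got #{value.class}"
    end
  end
  (payload.keys - schema.keys).each { |key| errors << "unexpected key: #{key}" }
  errors
end

# Two made-up responses from different endpoints for the same user.
GOOD = JSON.parse('{"id": 1, "login": "keavy", "admin": true}')
BAD  = JSON.parse('{"id": 1, "login": "keavy", "admin": "true"}')

puts schema_errors(GOOD, USER_SCHEMA).inspect
puts schema_errors(BAD, USER_SCHEMA).inspect
```

The same definition could then feed tests and documentation examples, which is where a real schema tool earns its keep.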
So there are some particularly handy tools for working with JSON schema in Ruby. The
committee gem enables you to perform validation of user requests of that user‑supplied input
against your schema, which makes it really easy to provide the user with immediate feedback
if, say, they supply a parameter that’s the wrong type or not quite formatted correctly. And
you can do that validation centrally before a request even hits specific end point code. 
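As a rough picture of what validating centrally means, here is a sketch of a Rack-style middleware written in plain Ruby. It does not use the committee gem or its API. `ValidateParams`, `PARAM_RULES`, and the `"api.params"` env key are all assumptions made up for this illustration; a real setup would derive the rules from the schema.

```ruby
require "json"

# Hypothetical per-endpoint rules; in practice these would come from the schema.
PARAM_RULES = {
  ["POST", "/users"] => { "login" => String, "admin" => [TrueClass, FalseClass] }
}.freeze

# Rack-style middleware: responds to call(env) and returns [status, headers, body].
class ValidateParams
  def initialize(app, rules)
    @app = app
    @rules = rules
  end

  def call(env)
    rules = @rules[[env["REQUEST_METHOD"], env["PATH_INFO"]]]
    return @app.call(env) unless rules

    # Assumption: something upstream has already parsed the params into this key.
    params = env.fetch("api.params", {})
    errors = rules.filter_map do |name, allowed|
      next unless params.key?(name)
      next if Array(allowed).any? { |klass| params[name].is_a?(klass) }
      "#{name} must be #{Array(allowed).join(' or ')}"
    end
    return @app.call(env) if errors.empty?

    # Reject before any endpoint-specific code runs.
    [422, { "content-type" => "application/json" }, [JSON.generate({ "errors" => errors })]]
  end
end

# Stand-in for the real endpoint code.
ENDPOINT = ->(_env) { [201, {}, ["created"]] }
APP = ValidateParams.new(ENDPOINT, PARAM_RULES)
```

Calling `APP.call` with `"admin" => "yes"` returns a 422 with an explanation, and the endpoint lambda is never reached.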
The PRMD gem enables you to combine and verify multiple schema files, which is super useful
once you’re beyond needing to describe one or two objects. It’s much easier to manage
those in a smaller chunk of JSON at a time. So, you might have one schema file for users
and another for teams, say.  You can see the schema for some APIs online. For
example, Heroku have an end point to read the schema for their API. And that’s a super‑useful
way to learn through real‑life examples of seeing the constraints that someone else
applies to their data as a complement to the examples you’ll find in the docs. 
And if you’re thinking this all sounds fantastic, let’s go back to work and make a schema to
enforce consistency. It’ll be amazing. A word of warning. Building a schema file like
this is the ideal if you have a greenfield project or just have the luxury of a fresh
start. If you have an existing API, you probably can’t apply a schema in all of those ways,
because you probably have a whole bunch of inconsistencies in your API. You might not
know about them all yet, but if it was left up to humans until this point, you probably have some. And
that’s okay. Everyone has skeletons in their API closet. 
So where you don’t have a schema, a safer starting point would be first to get a measure
of your inconsistencies. So, you could take a close look at your documented responses,
run some comparisons between the response examples in your docs and the responses you
get when you actually make a call to the API. It might take a bit of time to set up a test
with sample data, and you might need to parse the examples you give in your documentation. But
then you can write Ruby scripts to make real requests to your APIs for each of those and
compare the results against the documentation.  You could note where and how the responses
are different. Are there any additional keys or missing keys in the hashes? Are any of
the attributes’ classes different than you said? Are any of those Booleans actually
strings? I did this recently. And even the exercise of reading your docs and forming
all of those requests is highly beneficial. It’s a really good way to familiarize yourself
with the public interface and see things from the consumer’s position and build up that
empathy.  You could create a schema for a small portion
of the API. Say in your app you have a user object and some end points for users. You
could try writing a schema just for those, and then run your tests for the user end points
against that schema and see how many of your tests fail and how they fail. You could even
do that in production and just log where the mismatches are. And if the output is actually
what you really want, then you can correct the schema to reflect that. But if the schema
was correct, and the output doesn’t match it, then you can’t change that output immediately,
because that’s the output your users have gotten used to. So, it would break things
on their side if you fixed it.  Then, it can be really interesting and pretty
effective to get a better understanding of the inconsistencies by taking a deep look
through the code by parsing it, using code to analyze the code. I’ve used
the whitequark parser for this job. It’s a parser for Ruby written in Ruby. 
You can use this gem to parse your code and form abstract syntax trees, ASTs, which,
as the name suggests, give you an object that you can then crawl around in to find
the branches and leaves that you’re most interested in. This doesn’t need to be pretty or polished. My
own scripts that do this are definitely not. But this kind of tooling isn’t going into production. It’s
meant to help you explore the code and tell you things that you’re curious about. 
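As a rough illustration of crawling a tree like that, here is a sketch that counts which methods some endpoint code calls. To stay dependency-free it uses Ripper from Ruby's standard library rather than the whitequark parser gem, so the node shapes here are Ripper's, not the parser gem's. The endpoint source, `SOURCE`, and `collect_calls` are all made up for the example.

```ruby
require "ripper"

# Made-up endpoint code to explore.
SOURCE = <<~RUBY
  get "/users" do
    authenticate!(current_user)
    render_json(User.all)
  end

  get "/teams" do
    require_token
    authenticate!(current_user, :admin)
  end
RUBY

# Walk Ripper's nested-array tree and count method names by call site.
def collect_calls(node, found = Hash.new(0))
  return found unless node.is_a?(Array)

  case node.first
  when :fcall, :vcall, :command
    # Receiver-less calls: node[1] is [:@ident, name, position]
    found[node[1][1]] += 1
  when :call, :command_call
    # Calls with a receiver: node[3] is [:@ident, name, position]
    name = node[3]
    found[name[1]] += 1 if name.is_a?(Array)
  end

  node.each { |child| collect_calls(child, found) }
  found
end

CALLS = collect_calls(Ripper.sexp(SOURCE))
CALLS.sort.each { |name, count| puts "#{name}: #{count}" }
```

It is deliberately not pretty. From here it is a small step to also record the arguments passed at each call site.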
For example, you could use a parser to peer into end point code and see what’s being called. Maybe
you would monitor and measure the calls around authentication. You can note what methods
are being used, what arguments are passed to those, what classes the arguments are. Parsing
the code with code is hugely powerful, but just exploring this at all can give you really
useful insights.  Seeing your inconsistencies can help you make
decisions about them. Maybe you have done a thing three ways, but now you can pick the
one way that you want to deliberately choose for the future. I’ve done this recently for
the code base at Fastly where the public API is formed across multiple apps, and it’s given
us insights and measures into a ton of useful information that would be otherwise pretty
difficult to do.  And I think as a bonus, looking at and measuring
these things can set you up nicely to basically have a custom linter that would help you monitor
your code in the future.  So getting to consistent data might be a methodical
process with painstakingly slow work, but then it will be amazing. 
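Going back to the idea of comparing documented examples with real responses, a first script could be as small as this sketch. Everything in it is hypothetical: the two JSON documents are invented, and in a real script `ACTUAL` would come from a live request to the API rather than an inline string.

```ruby
require "json"

# Describe a JSON value's type, treating true and false as one kind.
def kind(value)
  case value
  when true, false then "Boolean"
  when nil then "null"
  else value.class.name
  end
end

# Compare a documented example with a real response, key by key.
def response_diff(documented, actual)
  shared = documented.keys & actual.keys
  mismatches = shared.filter_map do |key|
    doc_kind = kind(documented[key])
    real_kind = kind(actual[key])
    [key, doc_kind, real_kind] unless doc_kind == real_kind
  end

  {
    missing_keys: documented.keys - actual.keys,
    extra_keys: actual.keys - documented.keys,
    type_mismatches: mismatches
  }
end

# Invented example from the docs, and an invented "real" response.
DOCUMENTED = JSON.parse('{"id": 1, "login": "keavy", "admin": false, "email": "k@example.com"}')
ACTUAL     = JSON.parse('{"id": 1, "login": "keavy", "admin": "false", "created_at": "2019-11-18"}')

puts response_diff(DOCUMENTED, ACTUAL).inspect
```

Run across every documented endpoint, the collected diffs give you that first measure of where the inconsistencies are.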
The second big principle of a good API is being stable. If
you do one thing, I recommend simply to think deeply and write down what the API is for. What
each end point is for. This sounds so simple that many people don’t do it, preferring instead
to dig into the more technical work of writing code or specs. But even with API‑specific
specs, I think you get distracted by the syntax and the opinions of the tooling. And all
of that can come later. But as a starting point, I believe it’s way more beneficial
to write down just in plain language what the API is for. That seemingly simple thing
is hard to do, but it’s important. So, in high‑level design for an API, I recommend
including the usage examples for any new end points. What would users do with it? What
variations might they do? What workflows would it need to support? What does your
business need from it? What are the potential performance issues? What would happen if there
were 10,000 of the objects that could potentially be returned? What does it look like? What
does the path look like? Are there any parameters in the path? What do they look like? And
what will you call it? Keeping in mind that if you can’t think of a neat title for an
end point, that’s probably a sign that your design is a bit off. 
And remember, of course, there’s a cost in adding new end points. More code to write,
test, maintain, document, and do code review for. So, you want to consider, do we really
need this addition? Do you REALLY want to walk that dog every morning when you’re tired
and there’s a blizzard blowing outside? And write those high‑level design docs before
you write any code, before you have invested in a particular approach, while the cost of
change is low. The more thought and care you put into the API design up‑front, the
more chance of shipping the right thing in the first place and reducing the risk of change
later.  So we need the API to be a calm, predictable,
and reliable space. But that interface is surrounded by things which are not calm. Everything
is changing. Our product, the technology choices to build and document our API, our
users’ needs and workflows, our assumptions about our users’ needs, our legal obligations,
security vulnerabilities, bugs in the code, our teammates, our company’s leadership, their
priorities. The API is at the center of a constant state of flux. On top of that, our
API also has to change in order to stay relevant to that wider context. So, the stable center
has to change with everything else, but that change has to be very controlled, if indeed
it’s even possible.  So we have this constant friction between
the need to change and our constraints. As developers or shepherds of the API, it’s good
for us to be aware of that friction and the demands for change, but not to react to all
of those demands all of the time. We need to optimize for walking the API along a very
stable, thoughtful, and kind path through all of that. Sometimes that will mean pushing
back against some of those forces that might disturb the calm center. That might involve
asking a lot of questions, helping to find creative ways to absorb the problem on your
side rather than inflict it on the consumer. Or sometimes saying no or not yet. 
Shepherds of the API often aren’t the most popular people in the engineering org. So,
if you don’t mind being unpopular, come join your local API team. Some literature on API
development will say that change is bad and therefore you should simply never make breaking
changes to your API. I don’t buy that approach. To me, it’s not working with the constraints,
it’s more like punting on them. And don’t get me wrong. Breaking changes should be
very rare and a last resort. But in reality, the need will arise. We need to be able to
evolve our API along with our product and respond to emergency situations like a major
availability or security problem.  And if you really think about it, change itself
is not necessarily bad. It’s negative impact from change that is bad. So, the goal is
to minimize the negative impact. A great way to help with that is to provide transitions. Provide
a transition workflow or a way for users to adapt to the new thing. And that could take
different forms. One would be previews, for example, the GitHub API releases new or changed
end points under a preview, which is a specific accept header that you can opt into using. Versioning,
for example, the Stripe API offers very granular versions on a date basis and commits to supporting
all previous versions, which must be a huge undertaking behind the scenes. With either
of those approaches, you want to be sure that users can use multiple versions, say one on
staging as they try out and prepare for new behavior, but continue to use a different
version on production.  Another simple way to provide a transition,
say, for example, if you really need to change the name of an attribute, you could provide
both old and the new names for some transition period before you remove the old one. Whatever
form you choose or combination thereof, the key is to provide a way for users to transition
to the new thing gracefully.  But accept that the reality is that we may
not be able to reach everyone, and that some users may not take any action. So, we’re highly
unlikely to achieve zero impact. The only way to get to zero is probably if you
have zero users. In which case, cool… The goal is to minimize the negative impact
as much as you can reasonably do. And lastly, an awesome API needs to be well‑documented. Provide
clear information to users on how you intend to change or break the API through things
like documenting your deprecation policy, for which I recommend you provide generous,
conservative time frames. Set expectations around any kind of preview periods, for which
I recommend you provide short time frames; otherwise people forget it’s not really
the final version and not really suitable for production. And do provide a change log
with specific information on all the changes and how developers can adapt to use those. 
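To tie the renamed-attribute transition mentioned earlier to a documented removal date, here is a small sketch. The rename, the date, and the names `RENAMES`, `REMOVE_OLD_ON`, and `serialize_user` are all hypothetical; the idea is only that the old and new names are both served until the date you published.

```ruby
require "date"

# Hypothetical rename: "name" becomes "display_name".
RENAMES = { "name" => "display_name" }.freeze
# Hypothetical removal date, as it would appear in the change log.
REMOVE_OLD_ON = Date.new(2020, 6, 1)

# Serve the new key always, and the old key too while the transition period lasts.
def serialize_user(user, today:)
  payload = { "id" => user[:id], "display_name" => user[:display_name] }
  if today < REMOVE_OLD_ON
    RENAMES.each { |old_key, new_key| payload[old_key] = payload[new_key] }
  end
  payload
end

USER = { id: 1, display_name: "Keavy" }.freeze

puts serialize_user(USER, today: Date.new(2020, 1, 15)).inspect
puts serialize_user(USER, today: Date.new(2020, 6, 1)).inspect
```

Passing the date in, rather than reading the clock inside the method, makes it easy to test what users will see on either side of the removal.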
Being explicit and transparent about these things gives users some confidence in the
process and builds trust, which is a crucial ingredient to sustain an awesome API. And
then remember that people don’t read.  [ Laughter ]
So even if you’ve given all the notices and warnings, news of upcoming changes won’t reach
all your users, even with the best will in the world, even with great monitoring and
usage metrics, you may not be able to reach your users. Maybe the people that wrote the
code using your API are no longer the people who maintain it or care. The first time some
people will notice a change is when their thing that depended on it stops working. To
help compensate, you can also make use of brownouts, where after all the regularly timed
notices, you would schedule a deploy of the change temporarily, maybe just for an hour,
but just enough time for people to freak out. And then you withdraw that and give people more
time to adapt to the upcoming change on their side before you make the switch for real. 
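One way to picture that schedule is the sketch below. It assumes the change can be toggled at request time; the window, the switch date, and the names `BROWNOUTS`, `SWITCH_AT`, and `new_behavior?` are invented for the example.

```ruby
# Hypothetical schedule: one hour-long temporary window, then the real switch.
BROWNOUTS = [
  [Time.utc(2020, 3, 1, 14, 0), Time.utc(2020, 3, 1, 15, 0)]
].freeze
SWITCH_AT = Time.utc(2020, 4, 1)

# True when a request at `now` should get the new (breaking) behavior.
def new_behavior?(now)
  return true if now >= SWITCH_AT

  BROWNOUTS.any? { |starts_at, ends_at| now >= starts_at && now < ends_at }
end

puts new_behavior?(Time.utc(2020, 3, 1, 14, 30)) # inside the temporary window
puts new_behavior?(Time.utc(2020, 3, 1, 16, 0))  # window over, old behavior is back
puts new_behavior?(Time.utc(2020, 4, 2))         # after the switch for real
```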
There’s so much detail to get right on an end point‑by‑end point basis, that it’s
easy to not look up and see the overall picture. But it’s highly beneficial to provide a way for
some people to see a holistic view of the API. And I mean inside your company, I should
say. But even having scripts that can capture a measure of key data points, which maybe
you stick in a spreadsheet could give your API or product team some useful insights. The
beauty of this work is that it doesn’t go near production. So, it doesn’t have to be
pretty or polished. Just something that enables you to take a holistic view of that data whenever
you need to.  For example, you may capture a note of things
like which endpoints are undocumented, which endpoints are in some state of not public, maybe
a private beta or preview, and when they entered that phase. Anything that’s earmarked to
be deprecated. And when those are due to be removed all together. 
In order to collectively build and maintain the consistency and stability, you need to
govern choices and decisions. Having written down internal guidelines on how your company
builds APIs would make it easier for everyone, both current and future teammates to build
a consistent, stable API. Make sure you have clear guidelines on how to make changes. Any
ambiguity on how to make changes can be a major source of stress and frustration for
engineering teams. So, set clear expectations and share guidance internally on what’s okay
or not okay and how to go about making changes.  It’s super‑useful to have an internal API
style guide where, if it’s a RESTful API, you might include things like what patterns
for paths are okay or not okay, or that URLs with a verb or adjective in them are a smell. So,
you could encourage a default design that works for that 95% use case and allows you to fine‑tune
some results. Of course, there will be exceptions to any rule. One example from the GitHub
API: there’s an endpoint to fetch the latest release, which clearly has an adjective in the path.
That was a conscious design decision, because users almost always want just the latest release. So,
for that exceptional type of case, it’s a better interface if it optimizes for the common
use case, even if it’s not textbook RESTful.  Some choices will be uncontroversial, where
there’s fairly clear good design and bad design. With some choices around how you implement an API,
there is no correct answer. And people will have valid arguments for different approaches. But
here’s the thing: Most of this, the specific choice doesn’t really matter. What’s important
is that you make a choice and apply it consistently.  Your internal guidelines will always be guidelines
and decisions will need to be made by humans weighing up the trade‑offs. So lastly,
in your internal documentation, I highly recommend writing down who is responsible for what. To
build and maintain an awesome API, your group needs to have a clear understanding of ownership
and the decision‑making process. When that’s ambiguous, it creates frustration. And just
wastes time and energy. Also, it encourages poor quality code as endpoints become orphaned
if no one specifically owns them. No one fixes the flakey tests or bugs that arise,
because hey, it’s not their code.  So I recommend carefully deciding and writing
down things like who’s responsible for building APIs for new features. Who’s responsible
for the underlying concepts, the plumbing. And with documentation, who is responsible for
which parts of that work?  Everything needs an owner, a responsible party. And
that information should be written down and easily discoverable. And then who gets a
say in the decision‑making process? I find many people get uncomfortable setting up a
decision‑making process up front, maybe preferring to try to attain consensus. If
that’s the case in your group, I suggest it’s worth asking what happens when we can’t? And
maybe it’s worth writing down that answer just in case you don’t have consensus. 
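Two of the guidelines above lend themselves to a mechanical check: flagging paths that contain a verb or adjective, with a list of conscious exceptions, and flagging endpoints that have no owner written down. This is only a sketch. The word list, the team names, and the names `SMELLY_WORDS`, `ALLOWED_PATHS`, and `guideline_warnings` are assumptions for the example.

```ruby
# Hypothetical list of words that the style guide treats as a smell in a path.
SMELLY_WORDS = %w[get fetch create update delete latest].freeze
# Conscious exceptions, like the latest-release endpoint.
ALLOWED_PATHS = ["/repos/:owner/:repo/releases/latest"].freeze

# Takes a hash of path => owning team (nil when nobody owns it).
def guideline_warnings(owners_by_path)
  owners_by_path.flat_map do |path, owner|
    warnings = []
    words = path.split("/") & SMELLY_WORDS
    if words.any? && !ALLOWED_PATHS.include?(path)
      warnings << "#{path}: contains #{words.join(', ')}"
    end
    warnings << "#{path}: no owner recorded" if owner.nil?
    warnings
  end
end

# Made-up routes and owners.
ROUTES = {
  "/users" => "identity-team",
  "/users/:id/delete" => "identity-team",
  "/repos/:owner/:repo/releases/latest" => "releases-team",
  "/reports" => nil
}.freeze

puts guideline_warnings(ROUTES)
```

The specific rules matter less than having them written down somewhere a script, and a new teammate, can find them.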
So that island became one of my favorite places to train and race. I spent weeks and weeks
out there. I got to know the roads like the back of my hand. It made me a better athlete,
not just the technical skills, but with my mental toughness. And sure, I loved to ride
my bike on a calm, sunny day, but I know I can adjust my mindset and embrace the day
when the conditions get tough. I love working with APIs. And not even despite but because
of the winds that come with that territory. I love making something that helps other developers
make something. That people would choose to build on top of what I build. That’s a
pretty special thing to be part of, and I take that privilege seriously. So, I hope
you find something useful in some of these ways that I find helpful to work with the
constraints to build awesome APIs. But the important bit is that you don’t fight your
constraints. Know what they are. Plan for them. Work with your constraints. And I
want to leave you with this quote.  It doesn’t have to be fucking hard. It just
has to be consistent. Which is fucking hard.  [ Laughter ]
He’s talking about training for Ironman triathlons, but I think it sums up API development pretty
well. Thank you.  [ Applause ]
