APIs: Howto

by Robert Cooper

Related link: http://www.artima.com/forums/flat.jsp?forum=106&thread=142428



Wow. Just Wow.


Artima has an article up about do's and don'ts for designing an API, and it now goes into my short list of Biblical writings on programming in general. A lot of this is stuff that I have been ranting about for years -- including several recent discussions with my coworkers about *cough* violating said rules.



Some highlights:

API design goals

What should the design goals of your API be? Apart from compatibility, the following goals from Elliotte's
presentation seem like an excellent set:




  • It must be absolutely correct. In the case of XOM, this meant that the API could never produce
    malformed XML documents no matter what the caller did. For the JMX API, for example, it means that you can
    never get the MBean Server into an inconsistent state by registering strange MBeans in it or using funny
    ObjectNames or performing several operations concurrently.

  • It must be easy to use. This is hard to quantify. A good way to get an idea is to write lots of
    example code. Are there groups of operations that you keep having to repeat? Do you have to keep looking up
    your own API because you forget what things are called? Are there cases where the API doesn't do what you might
    expect?

  • It must be easy to learn. This overlaps considerably with ease of use. But there are some obvious
    principles to make learning easier. The smaller the API, the less there is to learn. Documentation should
    include examples. Where appropriate, the API should look like familiar APIs.

  • It must be fast enough. Elliotte was careful to put this in the list after the above items.
    Make sure the API is simple and correct. Then think about performance. You might be inclined to make
    API changes because the original API could only be implemented in an inefficient way. By all means change it to
    allow a more efficient implementation, provided you don't compromise correctness or simplicity. Don't
    rely on your intuition to know what performs well. Measure. Then tweak the API if you've determined
    that it really matters.

  • It must be small enough. This covers the size of the compiled code and especially the amount of memory
    it needs as it runs. The same principles as for speed apply. Make it simple and correct first; measure; and
    only then think about tweaking the API.


Be minimalist



Because of the compatibility requirement, it's much easier to put things in than to take them out. So
don't add anything to the API that you're not sure you need.



There's an approach to API design which you see depressingly often. Think of everything a user could possibly
want to do with the API and add a method for it. Toss in protected methods so users can subclass to tweak every
aspect of your implementation. Why is this bad?





  • The more stuff there is in the API, the harder it is to learn. Which classes and methods are the
    important ones? Which of the five different ways to do what I need is the best?



    The situation is exacerbated by the Javadoc tool, which dumps all the classes in a package, and all the
    methods in a class, in an undifferentiated lump. We can expect that JSR 260 will update the Javadoc tool to allow you to produce
    "views" of the API, and in that case fatter APIs will not be so overwhelming.



  • The bigger the API, the more things can go wrong. The implementation isn't going to be perfect, but
    the same investment in coding and testing will yield better results for a smaller API.



  • If your API has more methods than it needs, then it's taking up more space than it needs.




The right approach is to base the API on example code. Think of problems a user might want to solve with
the API. Add just enough classes and methods to solve those problems. Code the solutions. Remove anything
from the API that your examples don't need. This allows you to check that the API is useful. As a happy
side-effect, it gives you some basic tests. And you can (and should) share the examples with your users.



This really goes back to a lot of things I was trying to get at with Ruby the Rival. Swing is unapproachable. JNDI is so over-inclusive and configuration-reliant as to be problematic. JavaMail was based on a dare.



I think the other thing I would add to this segment, however, is this:

Higher-level objects should be the ones people want to use, and don't be afraid of building APIs on top of APIs.

Think about all the J2EE elements that go unused because they are useless on their own (ServletRequest anyone?) or get effectively re-implemented to provide similar functionality in a slightly different environment.


Still, every time I look at JavaMail or Swing, I can't help but think that there should be a set of minimalist top-level classes that provide a simple, clean and obvious way to meet the 80:20 requirement, which I can then cast to more complicated objects if I need to get real fancy.
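To make that concrete, here is a minimal sketch of the layering I have in mind. Everything in it is hypothetical (SimpleMessage, DetailedMessage, and newMessage are made-up names, not JavaMail classes): the entry point hands back the small type, and the caller casts down only when the extra knobs are needed.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class LayeredApiDemo {
    // Hypothetical simple layer: the three things most callers need.
    static class SimpleMessage {
        final String to;
        final String subject;
        final String body;

        SimpleMessage(String to, String subject, String body) {
            this.to = to;
            this.subject = subject;
            this.body = body;
        }
    }

    // Hypothetical detailed layer: adds the knobs only a few callers want.
    static class DetailedMessage extends SimpleMessage {
        private final Map<String, String> headers = new LinkedHashMap<>();

        DetailedMessage(String to, String subject, String body) {
            super(to, subject, body);
        }

        void setHeader(String name, String value) { headers.put(name, value); }

        String getHeader(String name) { return headers.get(name); }
    }

    // The entry point returns the simple type, backed by the detailed one.
    static SimpleMessage newMessage(String to, String subject, String body) {
        return new DetailedMessage(to, subject, body);
    }

    public static void main(String[] args) {
        SimpleMessage message = newMessage("someone@example.com", "Hi", "Hello there");
        System.out.println(message.subject);

        // Only the caller who needs to get fancy pays for the cast.
        DetailedMessage detailed = (DetailedMessage) message;
        detailed.setHeader("X-Priority", "1");
        System.out.println(detailed.getHeader("X-Priority"));
    }
}
```

Whether the cast is documented as part of the contract or offered through an explicit accessor is a design choice; the sketch only shows the shape of the idea.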


The next section of this I have to admit I agree with, but there are times...


There's a certain style of API design that's very popular in the Java world, where everything is expressed in
terms of Java interfaces (as opposed to classes). Interfaces have their place, but it is basically never a good
idea for an entire API to be expressed in terms of them. A type should only be an interface if you have a good
reason for it to be.
Here's why:




  • Interfaces can be implemented by anybody. Suppose String were an interface.
    Then you could never be sure that a String you got from somewhere obeyed the semantics you
    expect: it is immutable; its hashCode() is computed in a certain way; its length is never
    negative; and so on. Code that used String, whether user code or code from the rest of the J2SE
    platform, would have to go to enormous lengths to ensure it was robust in the face of String
    implementations that were accidentally incorrect. And to even further lengths to ensure that its security
    could not be compromised by deliberately evil String implementations.



    In practice, implementations of APIs that are defined entirely in terms of interfaces often end up cheating
    and casting objects to the non-public implementation class. DOM typically does this for example. So you
    can't give your own implementation of the DocumentType interface as a parameter to
    DOMImplementation.createDocument and expect it to work. Then what's the point in having interfaces?



  • Interfaces cannot have constructors or static methods. If you need an instance of an interface, you
    either have to implement it yourself, or you have to ask some other object for it. If Integer were an interface,
    then to get the Integer for a given int you could no longer use the obvious new Integer(n)
    (or, less obvious but still documented inside Integer, Integer.valueOf(n)). You would have to use
    IntegerFactory.newInteger(n) or whatever.
    This makes your API harder to understand and use.



  • Interfaces cannot evolve. Suppose you add a new method to an interface in version 2 of your API.
    Then user code that implemented the interface in version 1 will no longer compile because it doesn't implement
    the new method. You can still preserve binary compatibility by catching AbstractMethodError
    around calls to the new method but that is clunky. If you use an abstract class instead of an interface you
    don't have this problem. If you tell users not to implement the interface then you don't have this problem
    either, but then why is it an interface?



  • Interfaces cannot be serialized. Java serialization has its problems, but you can't always get away
    from it. The JMX API relies heavily on serialization, for example. For better or worse, the way serialization
    works is that the name of the actual implementation class is serialized, and an instance of that exact
    same class is reconstructed at deserialization. If the implementation class is not a public class in your API,
    then you won't interoperate with other implementations of your API, and it will be very hard for you to ensure
    that you even interoperate between different versions of your own implementation. If the implementation class
    is a public class in your API, then do you really need the interface as well?
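A minimal sketch of the alternatives the list above points toward, using made-up types (Celsius and TemperatureFormatter are hypothetical, not from any real API). The final class guards its own invariant and offers a static factory; the abstract class gains a method in "version 2" without breaking subclasses written against version 1. Java 8 later added default and static methods to interfaces, which softens the second and third points, but the sketch sticks to the approach the article describes.

```java
// Hypothetical value type: final, so nobody can supply a misbehaving subtype.
final class Celsius {
    private final double degrees;

    private Celsius(double degrees) {
        if (degrees < -273.15) {
            throw new IllegalArgumentException("below absolute zero: " + degrees);
        }
        this.degrees = degrees;
    }

    // Static factory living on the type itself; no separate factory class needed.
    static Celsius valueOf(double degrees) {
        return new Celsius(degrees);
    }

    double degrees() {
        return degrees;
    }
}

// Hypothetical extension point expressed as an abstract class.
abstract class TemperatureFormatter {
    // Present since "version 1".
    abstract String format(Celsius value);

    // Added in "version 2" with a body, so version 1 subclasses still compile.
    String formatRounded(Celsius value) {
        return Math.round(value.degrees()) + " C";
    }
}

class InterfaceAlternativesDemo {
    public static void main(String[] args) {
        // A subclass that only knows about the version 1 method.
        TemperatureFormatter formatter = new TemperatureFormatter() {
            @Override
            String format(Celsius value) {
                return value.degrees() + " C";
            }
        };
        Celsius reading = Celsius.valueOf(21.5);
        System.out.println(formatter.format(reading));        // prints 21.5 C
        System.out.println(formatter.formatRounded(reading)); // prints 22 C
    }
}
```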



A lot of really good things here. I know I, for one, get frustrated with this sometimes. Even the ROME project, to which I am a contributor, suffers from this, and I found myself changing some of my code so that it matches the, um, unfavorable overuse of interfaces that was preexisting.


HOWEVER, until Sun decides to let us use an Object as an Interface for the purposes of Dynamic Proxies, you are kind of stuck with this. (Please don't mention CGLIB. Thanks.) Now, I consider this to be one of the things that is getting in the way of Java having the same kind of utility we see in Ruby that everyone loves so much, and I think it needs to be addressed. Until then, however, we need to code around it.
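For anyone who hasn't run into it, the constraint is that java.lang.reflect.Proxy will only build a proxy for interface types. A minimal sketch, with made-up names (Greeter, PlainGreeter, logging, and proxyRejects are all hypothetical):

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

class ProxyNeedsInterfaceDemo {
    // Hypothetical service type, declared as an interface so it can be proxied.
    interface Greeter {
        String greet(String name);
    }

    static final class PlainGreeter implements Greeter {
        @Override
        public String greet(String name) {
            return "Hello, " + name;
        }
    }

    // Wraps a Greeter in a JDK dynamic proxy that records each method name called.
    static Greeter logging(Greeter target, List<String> log) {
        InvocationHandler handler = (proxy, method, args) -> {
            log.add(method.getName());
            return method.invoke(target, args);
        };
        return (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(),
                new Class<?>[] { Greeter.class },
                handler);
    }

    // Returns true if java.lang.reflect.Proxy refuses to proxy the given type.
    static boolean proxyRejects(Class<?> type) {
        try {
            Proxy.newProxyInstance(
                    type.getClassLoader(),
                    new Class<?>[] { type },
                    (proxy, method, args) -> null);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        Greeter greeter = logging(new PlainGreeter(), log);
        System.out.println(greeter.greet("Ada"));            // prints Hello, Ada
        System.out.println(log);                             // prints [greet]
        System.out.println(proxyRejects(Greeter.class));      // prints false
        System.out.println(proxyRejects(PlainGreeter.class)); // prints true
    }
}
```

So if you want interception through the JDK alone, the type you hand out has to be an interface, whatever the arguments against that are.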


Lastly...
Don't implement Cloneable. It is usually less useful than you might think to create a copy of an object. If you do need this functionality, rather than having a clone() method it's generally a better idea to define a "copy constructor" or static factory method.

This, I agree with. I, however, have always seen more things along the lines of the CopyFrom interface (using ROME as an example -- alternatively CopyTo, which I personally think is better). I find this has several advantages over a constructor: 1) It allows constructors for extended objects to use their natural flow while mapping bean properties from a different class properly. 2) It can be cascaded up and down objects with super.* calls more easily than with super() calls. But hey, that's just me.
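A rough sketch of the two styles side by side. The classes are made up for illustration (Entry and DatedEntry are hypothetical, and this copyFrom is a plain method, not ROME's actual interface):

```java
class CopyStylesDemo {
    static class Entry {
        private String title;

        Entry() { }

        // Copy-constructor style, as the quoted article suggests.
        Entry(Entry other) {
            this.title = other.title;
        }

        // copyFrom style: an ordinary method, so subclasses can chain to it.
        void copyFrom(Entry other) {
            this.title = other.title;
        }

        String getTitle() { return title; }

        void setTitle(String title) { this.title = title; }
    }

    static class DatedEntry extends Entry {
        private long timestamp;

        // The constructor keeps its natural, no-argument flow.
        DatedEntry() { }

        @Override
        void copyFrom(Entry other) {
            super.copyFrom(other); // base-class properties first
            if (other instanceof DatedEntry) {
                this.timestamp = ((DatedEntry) other).timestamp;
            }
        }

        long getTimestamp() { return timestamp; }

        void setTimestamp(long timestamp) { this.timestamp = timestamp; }
    }

    public static void main(String[] args) {
        DatedEntry source = new DatedEntry();
        source.setTitle("API design");
        source.setTimestamp(42L);

        DatedEntry target = new DatedEntry();
        target.copyFrom(source);
        System.out.println(target.getTitle() + " @ " + target.getTimestamp());

        // Copying from a plain Entry maps only the properties the source has.
        Entry plain = new Entry(source);
        DatedEntry partial = new DatedEntry();
        partial.copyFrom(plain);
        System.out.println(partial.getTitle() + " @ " + partial.getTimestamp());
    }
}
```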

What are your rules?