The Line Between Code and What You See

Don made a very interesting point in today’s talk about distinguishing who owns the rights to some computer code (the developers), versus who owns the rights to the actual look-and-feel / expression it creates (the studio).

I just wanted to add that the line between the two is not always clearly drawn.  There was a very significant case decided early this summer by the Federal Circuit in the US that I’m sure we’ll hear much more of in the future (think: SCOTUS).  That case is Oracle v. Google.

There were two main issues in Oracle: is there copyright in an application programming interface (API), and how much code has to be copied for literal copying of code to be an infringement?

APIs

In a nutshell, an API would be like the “interface” to, say, a game engine.  So I could write a game engine called RyanPeopleMover which has an interface that looks, in grossly (!) simplified form, like:

createPerson(name, initialXPosition, initialYPosition)
movePerson(name, amountX, amountY)

You could create and move people using this game engine by writing things like:

createPerson(Alice, 10, 15)
createPerson(Bob, 50, 60)
movePerson(Alice, +15, -5)
movePerson(Bob, -30, -20)

But, what if Jon came along and wrote his own game engine, JonMakeAndMovePeople, that used the exact same interface, but had different code behind it? Maybe his code has a better implementation of movePerson that, for example, performs the requisite mathematical computations more efficiently. Developers love when people create and utilize standardized APIs. If you were using my people-moving game engine before, you could swap mine out and Jon’s in, improving efficiency, without having to change any of your code. If I later change RyanPeopleMover to make it even better, you can swap my new version back in with essentially zero effort.
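For the programmers in the audience, the swap described above can be sketched in Java. This is only an illustration under stated assumptions: the PeopleMover interface name, the positionOf method (added purely so the result can be inspected), and the way each engine stores its data are all made up for this example. The point is that runGame is written once against the interface and works unchanged with either engine.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical interface: this is the "API" that game code is written against.
interface PeopleMover {
    void createPerson(String name, int initialXPosition, int initialYPosition);
    void movePerson(String name, int amountX, int amountY);
    int[] positionOf(String name); // assumption: added only so results can be inspected
}

// One implementation: keeps each person's position as an {x, y} pair.
class RyanPeopleMover implements PeopleMover {
    private final Map<String, int[]> positions = new HashMap<>();

    public void createPerson(String name, int x, int y) {
        positions.put(name, new int[] {x, y});
    }

    public void movePerson(String name, int amountX, int amountY) {
        int[] p = positions.get(name); // assumes the person was created first
        p[0] += amountX;
        p[1] += amountY;
    }

    public int[] positionOf(String name) {
        return positions.get(name).clone();
    }
}

// Same interface, different code behind it: x and y live in separate maps.
class JonMakeAndMovePeople implements PeopleMover {
    private final Map<String, Integer> xs = new HashMap<>();
    private final Map<String, Integer> ys = new HashMap<>();

    public void createPerson(String name, int x, int y) {
        xs.put(name, x);
        ys.put(name, y);
    }

    public void movePerson(String name, int amountX, int amountY) {
        xs.merge(name, amountX, Integer::sum);
        ys.merge(name, amountY, Integer::sum);
    }

    public int[] positionOf(String name) {
        return new int[] {xs.get(name), ys.get(name)};
    }
}

class GameDemo {
    // The "game" only knows about the interface, never about a specific engine.
    static int[] runGame(PeopleMover engine) {
        engine.createPerson("Alice", 10, 15);
        engine.createPerson("Bob", 50, 60);
        engine.movePerson("Alice", 15, -5);
        engine.movePerson("Bob", -30, -20);
        return engine.positionOf("Alice");
    }

    public static void main(String[] args) {
        int[] withRyan = runGame(new RyanPeopleMover());
        int[] withJon = runGame(new JonMakeAndMovePeople());
        System.out.println(withRyan[0] + "," + withRyan[1]); // 25,10
        System.out.println(withJon[0] + "," + withJon[1]);   // 25,10
    }
}
```

Swapping engines means changing only the object passed to runGame; the four create/move calls stay exactly as they were.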

But, is Jon infringing on any of my intellectual property rights by using the API I created, even if the code behind the interface is different?

The facts of this case are that Oracle owns Java — a programming language with a large library of “pre-coded” functions for programmers to use. For example, I could write something like:

Math.round(userInput)

to take the user’s input and round it to the nearest whole number. Math.round(number) is part of Java’s standard library API. And, to be clear, Java’s standard API is massive, and performs tasks infinitely more complex than rounding numbers. Google, on the other hand, owns Android (the cell phones). Google wanted developers to be able to write apps for Android in a language they already knew, and allow existing Java programs (in general) to run (near-)effortlessly on Android. So Google copied large portions of the Java API to use as the Android App API.

The Federal Circuit held that there was copyright in an API, and that Google had violated Oracle’s copyright. I’m a software developer, so I’m sure everyone can guess my thoughts here. But, what are yours?

How Much Copying is Copying

The second key issue in Oracle is how much code (behind the API — the actual implementation of the interface) needs to be copied for it to be copying?

The facts of this case are that there are nine lines of code in Google’s implementation of one function in its Java-ish language that are identical to the nine lines of code in Oracle’s (open-source) Java implementation.

Google had attempted to make sure there were no copies of implementation code in its Java-like language. They had performed a “clean-room implementation” of the Java functions (i.e., having the Google developers write the code to implement the API without looking at Oracle’s openly available implementations). But one Google developer copied the code for a simple little function directly from the Oracle version (as a quick stopgap), and forgot to reimplement it another way.

Now, copying nine lines of text from a book, I think we can all agree, counts as “actual” copying. But how about lines of code?

I’ll try to keep the computer-babble to a bare minimum (one paragraph only, I swear). But, to describe the potential impact of the ruling, I need to babble briefly. One of the most common ways to store data in a computer program is contiguously in memory. So, if I had to store the numbers 10, 15, 20, and 30 in memory (say, as the X and Y coordinates for Player 1 and Player 2), I would often place them back-to-back in memory. This is called storing the numbers in an “array”. Then, I “index” the numbers in the array, so that I can retrieve or modify them. In Java, arrays are always and automatically indexed from 0 to one less than the size of the array. So, array[0] is 10, array[1] is 15, array[2] is 20, and array[3] is 30 (i.e., an array of length 4 is indexed from 0 to 3). A really simple programming bug to make is to accidentally access an array too high or too low, i.e., try to access array[-1] or array[4]. In languages that don’t check for this, such an access will read and possibly modify memory that contains who knows what (leading to bad, bad bugs). The Java programming language is nice: programs written in Java automatically detect if they attempt to access too low or too high in an array, and crash on detecting this happening. (For those of you not from a computer background, trust me when I say that crashing is infinitely preferable to just allowing the illegal array access to occur.)
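For anyone who wants to see that behaviour first-hand, here is a small Java sketch using the same four numbers. The class and helper names (ArrayBoundsDemo, accessFails) are made up for this illustration. In Java terms, the "crash" is an ArrayIndexOutOfBoundsException, which stops the program unless something catches it; the helper catches it only so that it can report what happened.

```java
class ArrayBoundsDemo {
    // Hypothetical helper: returns true if reading values[index] is rejected by Java.
    static boolean accessFails(int[] values, int index) {
        try {
            int ignored = values[index]; // Java checks the index before reading
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        int[] coordinates = {10, 15, 20, 30}; // length 4, so legal indices are 0 to 3

        System.out.println(coordinates[0]); // 10
        System.out.println(coordinates[3]); // 30

        System.out.println(accessFails(coordinates, -1)); // true: one too low
        System.out.println(accessFails(coordinates, 4));  // true: one too high
    }
}
```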

Now, with that explained, there’s a very simple bit of code in the Java implementation that has one task: to take a range of indices, and ensure the entire range is legal. If it is, do nothing; if not, make the program crash. So, from the above example, if I asked if array[1] to array[3] is a legal range, nothing should happen. If I asked if array[1] to array[4] is a legal range, the program should simply choke, die, and spew error messages that are incomprehensible to anyone but programmers.

The implementation in Oracle’s Java, which took nine lines of code, looked (in interpreted fashion) like this:

rangeCheck(arrayLength, lowIndex, highIndex):
if (lowIndex is greater than highIndex) - throw errors and die
if (lowIndex is less than 0) - throw errors and die
if (highIndex is greater than or equal to arrayLength) - throw errors and die

That much code was all it took for the Federal Circuit to find that Google’s copying was, in fact, not de minimis copying. The case has been remanded to the district court on Google’s potential fair use defence.
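To give a sense of scale, the interpreted rangeCheck above could be written in real Java roughly as follows. To be clear, this is a sketch that follows the simplified description in this post (with highIndex treated as the last index in the range), not a reproduction of Oracle’s or Google’s source. The class name, the parameter names, the choice of exception types, and the wording of the message are assumptions made for illustration.

```java
class RangeCheckSketch {
    // Does nothing if lowIndex..highIndex is a legal range for an array of
    // the given length; otherwise throws, which stops the program unless caught.
    static void rangeCheck(int arrayLength, int lowIndex, int highIndex) {
        if (lowIndex > highIndex) {
            throw new IllegalArgumentException(
                    "lowIndex(" + lowIndex + ") > highIndex(" + highIndex + ")");
        }
        if (lowIndex < 0) {
            throw new ArrayIndexOutOfBoundsException(lowIndex);
        }
        if (highIndex >= arrayLength) {
            throw new ArrayIndexOutOfBoundsException(highIndex);
        }
    }

    public static void main(String[] args) {
        rangeCheck(4, 1, 3); // legal range: nothing happens

        try {
            rangeCheck(4, 1, 4); // one too high for an array of length 4
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```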

In my opinion, if you put 1000 experienced Java programmers in a room and asked them to implement the rangeCheck(arrayLength, lowIndex, highIndex) function, about 400 would implement it as above, about 400 would implement it as:

rangeCheck(arrayLength, lowIndex, highIndex):
if (lowIndex is less than 0) - throw errors and die
if (highIndex is greater than or equal to arrayLength) - throw errors and die
if (lowIndex is greater than highIndex) - throw errors and die

and the remaining 100 would ask why you brought them all the way to this room, just to implement such a simple piece of code that should only take nine lines to write.

(Edit: Haha, oh dear. I was going to edit the above paragraph, because I’m embarrassed by my stupidly simple math error. But, I decided to leave it as-is, because it’s a perfect demonstration of an off-by-one error — meaning accessing one-too-low or one-too-high in an array. 4+4+1 = 10, uhh, right?).
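The claim that the two orderings are interchangeable is easy to check by brute force. The sketch below is a simplification made for this illustration: each check returns false instead of throwing an error, and the class and method names are invented. It tries every combination of small array lengths and indices and counts how often the two orderings disagree about whether a range is legal.

```java
class OrderingComparison {
    // First ordering from the post: low > high, then low < 0, then high too big.
    static boolean acceptsFirstOrdering(int arrayLength, int lowIndex, int highIndex) {
        if (lowIndex > highIndex) return false;
        if (lowIndex < 0) return false;
        if (highIndex >= arrayLength) return false;
        return true;
    }

    // Second ordering from the post: low < 0, then high too big, then low > high.
    static boolean acceptsSecondOrdering(int arrayLength, int lowIndex, int highIndex) {
        if (lowIndex < 0) return false;
        if (highIndex >= arrayLength) return false;
        if (lowIndex > highIndex) return false;
        return true;
    }

    // Counts inputs on which the two orderings give different answers.
    static int countDisagreements(int maxLength) {
        int disagreements = 0;
        for (int length = 0; length <= maxLength; length++) {
            for (int low = -2; low <= maxLength + 1; low++) {
                for (int high = -2; high <= maxLength + 1; high++) {
                    if (acceptsFirstOrdering(length, low, high)
                            != acceptsSecondOrdering(length, low, high)) {
                        disagreements++;
                    }
                }
            }
        }
        return disagreements;
    }

    public static void main(String[] args) {
        System.out.println(countDisagreements(8)); // 0
    }
}
```

The two orderings accept and reject exactly the same ranges. The only place they can differ is in which error gets reported first when a range breaks more than one rule at once.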

Now, to play devil’s advocate, there is still the text of the error messages spewed by the Java implementation, but the amount of text is incredibly minimal. One of the error messages, for example, would read, “lowIndex(3) > highIndex(2)” if lowIndex was 3, which is greater than highIndex which is 2 — the kind of brief error message most any programmer would use.

So, minimal or not?