OSCON Java 2011: Josh Bloch, “Java: The Good, Bad, and Ugly Parts”


Josh Bloch: These days people are
writing a lot of books with titles like ‘JavaScript: The Good Parts’
and ‘Java: The Good Parts’. But I owe you more. There’s always the yin and the yang. So I’m gonna give you
the “full monty” today. I am going to tell you about the bad
and the ugly parts as well. A few preliminaries that
I’ve gotta get out of the way. This is a highly opinionated talk. This is not pure technical fact. These are my opinions. They are not those of my employer or Kermit the Frog or Dr. Ruth
Westheimer or anyone else. The talk contains some criticism by nature but I’m trying to be constructive here
and I still love Java. So this afternoon, I’m gonna be giving a much longer talk in which I discuss and critique every
one of the 18 language changes between Java 1.0 and Java 7. It’s chock full of code examples.
Because this is a keynote, A, it’s short and B, I tried to have no code at all. I slipped and there’s one example. But what I’m gonna do,
this kind of meshes with the talk because this is about where it all started. I’m gonna restrict my attention to Java 1.0 basically the language
as it originally shipped and I’m gonna try to tell you
what the good parts were, what the bad parts were
and in my opinion, why it succeeded. Both talks, this morning’s
and this afternoon’s, are limited in scope to the language because I just don’t have time
to cover the libraries. So first the good parts. These ones on this slide are,
I think, fairly well known. Java is a safe language
with a managed runtime. And that means you’ve got no
seg faults, no scribble bugs, none of the things that C and C++
programmers coped with for so many years. That also facilitates program portability, because with a managed runtime you are writing
to a virtual machine and that virtual machine of course can
be implemented on various hardware, various operating systems. They also tightly specified
all the primitive types in Java. How long is an int [in C]? I don’t know. It depends on the underlying machine. It could be 32 bits, could be 16 bits. And that’s a problem. So I think once bitten twice shy. James and the James Gang
got that one right. And that was critical. It’s also a natural accompaniment
to the managed runtime because you define all
those types in the runtime and the runtime acts as a buffer between the realities of whatever hardware
you’re running on and the program. Dynamic linking was very important. Anyone who spent a lot of time
programming in C or C++ knows that when you change a library, you have to recompile
every single client program which meant clean builds were essential. And if you didn’t do that, you got strange bugs and you
spent a week chasing them down. So Java, by contrast, loads the libraries dynamically
and if you change the library, you don’t have to touch the clients. Everything just works with one small but important exception that
I’ll discuss later in the talk. And finally I think a key basic
attribute responsible for the success of the Java language was its superficial syntactic
similarity to C and C++. It didn’t scare the C++ programmers
and the C programmers. In fact, they could take a look
at a Java programmer, a Java program, I’m sorry, and say,
‘Yeah. I know what that does.’ They didn’t have to read
the language manual. They didn’t have to study.
They just looked at it. So in essence it was
an act of subversion. The Java language kind of snuck
the essence of languages like Lisp and Smalltalk
past the people who were used to programming
languages like C and C++. And I think all of those things
are important and none of them should come
as a surprise to you. Now let’s look at the type system. It’s object oriented
which means two things. It means it supports encapsulation
and that was absolutely critical because you cannot prove the correctness
of components in isolation unless the components
can isolate their internals. Then there’s inheritance and
that was a marketing necessity. We could argue whether you really need
implementation inheritance or not but in 1995, if you tried to introduce a new language
that wasn’t object oriented, you would have been laughed
off the face of the earth. There’s multiple interface inheritance
which I think was a great idea. The Java team basically
looked at C++ and said, ‘It’s great to be able to support
multiple protocols but multiple implementation
inheritance is too gnarly. There’s just too much difficulty
that comes with it.’ So they kept the multiple
interface inheritance and discarded the multiple
implementation inheritance. And then there’s static typing. I know these days it’s kind of popular
to diss static typing but I think it was critical for a couple reasons. The first reason is that, and there actually may be three reasons,
amongst our weaponry, it enables the IDEs to generate
high quality code with very little effort on your part. Basically the auto completion says, ‘Ah yes. The type of this variable is such and such
so these are the methods you can call.’ And that’s great. It was also necessary from a performance
perspective especially in 1995, if they tried to do a dynamically
typed language, they never could have achieved
the sort of performance in the sort of time frame
that they did. And then another reason
it’s important is that in order to get big business
to take the language seriously, you know, they had to sort of
be able to offer the kind of safety that you get with static typing. If you can compile it, it is unlikely to have a certain
class of bugs at run time. What about random features? Well, you’ve got threads. Threads are critical. So in 1995, it was the twilight
of the uniprocessor era. There weren’t a lot of MPs for sale
but the writing was on the wall. But more importantly, computing
had changed from the days when you just had a big batch computation, you fed it into the computer, it did what it did. Computers were being used
with microphones and speakers and all manner of sensors and computer programs were talking
to each other on the network. So there you have concurrency whether or not you have real parallelism
with multiprocessors. You have concurrency and you need a
language that can handle that concurrency. And many people had tried to add threading
to languages that didn’t have it, pthreads and so forth, and they found
that it was fraught with peril. And in fact, even as early as 1995, academics were writing papers kind
of proving that it could not be done and Hans Boehm wrote
another paper a decade later. But I think if you talk to
the concurrency elite, they will tell you it’s simply impossible
to add threading to a language after the fact. So it was very fortunate that they decided
to put threading into Java from Day One. Garbage collection eliminates
all the pain, heartache and bugs associated with manual
memory management. And then exceptions: error codes
had been shown to be error prone. If you look at C programs, people tend to ignore the error codes. And the other thing is that
when you don’t have seg faults, you’re way down
in the execution of a program, something bad happens, and you have to do something,
so you throw an exception. And those were the key features. It turns out that what you leave out
can be as important as what you put in. James left out a bunch of things
that had been assumed critical. Exhibit A is lexical macros. If you look at a C program,
it’s all about macros. But macros have problems
especially lexical macros like C’s. And that decision turned out
to be a great one. First of all, it makes all Java programs
somewhat similar to one another which means I can take my program, give it to you and you can debug it
without having to learn all of my macros. It enables programmer portability
in that way, and it’s also important for toolability. We were talking about all the IDEs,
IntelliJ and Eclipse and NetBeans and so forth. Once you have macros, it’s really hard to do auto completion. Multiple implementation inheritance
was another thing that it was kind of ballsy to leave out and it turned out to be a great
decision in retrospect. And finally, operator overloading. Operator overloading isn’t
inherently a bad thing but untrammeled operator overloading,
as practiced in C++, is. I mean as soon as you start using
the left shift operator to do I/O, your program loses a lot
in the way of intelligibility. And the Oak Team just decided
they didn’t want to do that. Finally, I want to discuss a potent
pair of design decisions that are often overlooked. First of all, Java omitted
support for header files. Header files in C and C++
are kind of a nightmare because you have to keep them
in sync with the program. They’re in separate places.
And then they added Javadoc. I think Javadoc is the unsung
hero in all of this. Javadoc takes the documentation
and puts it with the code. And everyone knows that it’s sort
of made good documentation a part of the Java culture
from Day One. That’s very important. But here’s the other thing it did. Once you take those two
design decisions together, you’ve co-located the
interface declaration, its documentation
and its implementation, they’re all together in one place. Now if you change anything, you’re almost forced
to change everything else. So things do not go out of sync. And I think that made Java
a much better, a much more productive and a much more bug free
language to program in. So I think those are pretty much
the main things that made Java succeed. Now we’re gonna get on to
the bad and the ugly parts. And in keeping with the Western
theme of today’s talk, you can see that Duke
there is holding a shootin’ iron and he appears to be
shootin’ off his own foot. So these are the cases
where he shot off his own foot. First of all, we have silent
widening conversions from int to float
and long to double. So basically you can have
a variable of type long and if you try to store its contents
in a double, the language will say,
‘Sure. No problem.’ But it’s lost information. That should generate probably an error
or at least a warning at compile time. It should require a cast. So these are things that are supposed
to be lossless but they aren’t. And then a related one. This is the only code in this talk, but it turns out that what are called compound
assignment operators have implicit narrowing casts. If you look at this code, it looks like the loop should
iterate sixteen times, right. We put in something with sixteen
ones and while we’re not zero, we shift it to the right once. Bang, bang, bang.
Knock off all 16 bits. You’re done.
Right? No.
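[Transcriber's note: the slide is not reproduced in this transcript. What follows is a minimal sketch of the kind of loop being described, not the original slide code. It assumes a short holding sixteen one bits and the unsigned right-shift compound assignment >>>=. The class name, the method names and the iteration cap are hypothetical; the cap is there only so the example terminates.]

```java
class CompoundShiftDemo {
    // Shifts a short that starts as -1 (sixteen one bits) right until it is zero,
    // giving up after `cap` iterations. Returns the number of shifts performed.
    static int shiftShort(int cap) {
        short s = -1;
        int shifts = 0;
        while (s != 0 && shifts < cap) {
            s >>>= 1;   // compiles; behaves like s = (short) (s >>> 1)
            shifts++;
        }
        return shifts;
    }

    // The same loop on an int, where no narrowing cast is involved.
    static int shiftInt(int cap) {
        int i = -1;
        int shifts = 0;
        while (i != 0 && shifts < cap) {
            i >>>= 1;
            shifts++;
        }
        return shifts;
    }

    public static void main(String[] args) {
        System.out.println(shiftShort(1_000));  // 1000: the cap was hit, s never reached zero
        System.out.println(shiftInt(1_000));    // 32
    }
}
```

[Written out as s = s >>> 1, the short version would not compile without an explicit cast. The compound form inserts that cast silently.]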
It’s an infinite loop. Why? ‘Cause actually it turns into this. It turns out the shorts turn into ints and you just keep putting back
negative one into that variable. So that was a mistake. The operators == and
!= are reference operators. They should be value operators. That is, they should call
equals [unclear: “if it is a sign”]. It was a mistake to take the nice syntax and waste it on the thing
that you rarely want to do and it’s a cause of frequent bugs
for beginners. They compare strings
using double equals and then they don’t know
why it doesn’t work. Now, remember when I said
there’s a chink in the armor of dynamic linking of libraries? If you have a constant variable,
like a ‘public static final’ field in a library, that field’s value is actually copied
into the client and if you forget
to recompile the client, you don’t get the new value. So that was a mistake,
once again. Lots of subtle bugs. What about constructors? Well, default constructors are bad. You forget all about
constructors and the compiler, in its infinite wisdom and mercy, gives you a public constructor. What if you didn’t want a constructor? What if it’s a static utility class or what if you didn’t want anybody
to construct a copy of the thing? You wanted to keep it private. So that was a mistake. And invoking overridden methods
from constructors should be illegal because it’s always wrong. Here’s a bunch of miscellaneous things. Lack of unsigned int and long
was a big mistake. And worse, bytes are signed. When do you use bytes? Byte manipulation, packing,
packets on networks, doing graphics or whatever. The sign extension always gets in the way. That code is buggy. It’s error prone. It’s filled with nasty & 0xFF masks. So that was a mistake. The switch statement is not structured. It has fall through. Java is the newest language not to
have fixed that particular error. There was no good reason for that. Arrays should have overridden toString
so that when you print an array, you don’t get garbage. That’s another one that
nails every CS 101 student. Exceptions obliterate pending exceptions. If you have exceptions on the
stack and another exception is thrown, you lose the first one. That’s bad and it wasn’t necessary. And finally, Cloneable
lacks a clone method, that makes no sense at all, and it shouldn’t even exist
in the first place. Cloneable is a waste. If you want to be able
to create a clone, just put a method or
a constructor to do so. So in summary, the good parts
are the key design decisions. The basics of a language
James and his team got right. The bad and the ugly parts are
largely confined to the details. A market window opened up
in 1995 for a new language because people were pretty
much sick of the ones that existed at the time
and Java jumped through it. Some people have said that
this was all hype and marketing. That’s not true. Java’s success was the result
of the Oak Team making all of the right design decisions. Well, not all of the
right design decisions. Most of the right design decisions. And if you come back at 2:20, I’ll give you a much longer talk
with a lot more code discussing how we have built on this
legacy over the past decade and a half. And where we, where we did it proud
and where we dishonored it. Thank you very much. [applause]
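
[Transcriber's note: several of the pitfalls listed in the talk are easy to reproduce. The following is a minimal sketch, not code from the talk. The class and method names are hypothetical and the values are chosen only for illustration. It shows the silent long-to-double widening, == comparing string references, sign extension of bytes, the default array toString, and an exception thrown from a finally block replacing the pending one.]

```java
import java.util.Arrays;

class BadPartsDemo {
    // Silent widening: the assignment needs no cast, yet the round trip can change the value.
    static boolean survivesWidening(long value) {
        double widened = value;
        return (long) widened == value;
    }

    // == on strings compares references, not contents.
    static boolean sameReference(String a, String b) {
        return a == b;
    }

    // Bytes are signed, so a mask is needed to recover the unsigned value.
    static int unsignedValue(byte b) {
        return b & 0xFF;
    }

    // An exception thrown from finally replaces the one already in flight.
    static String survivingMessage() {
        try {
            try {
                throw new IllegalStateException("first");
            } finally {
                throw new IllegalStateException("second");
            }
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        long big = (1L << 53) + 1;  // odd value just above 2^53, not exactly representable as a double
        System.out.println(survivesWidening(big));                      // false
        System.out.println(sameReference("java", new String("java")));  // false
        System.out.println("java".equals(new String("java")));          // true
        byte raw = (byte) 0xF0;
        System.out.println(raw + " vs " + unsignedValue(raw));          // -16 vs 240
        int[] nums = {1, 2, 3};
        System.out.println(nums);                   // a type tag and hash code, not the contents
        System.out.println(Arrays.toString(nums));  // [1, 2, 3]
        System.out.println(survivingMessage());     // second
    }
}
```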
