Bartek Andrzejczak

Musings on software development

Why Not Scala?

This is a response to the article I read this morning: Why Scala? by @clintmiller1

TL;DR version

I created this blog some time ago, but couldn’t find a topic good enough for a first post, plus my free time is very limited right now. When I read Why Scala? I knew that I had to respond. It’s not that I don’t like Scala. I think it’s an awesome language with huge potential. It could even be the next big thing. This post is not about how Scala sucks, because it doesn’t. It’s about a dishonest comparison.

The two things that irritated me the most while reading Clint’s article were:

  • The disservice done to Java by ignoring good practices like switching from nulls to Optional (from Guava or, better, from Java 8) or Null Objects, and by ignoring the advantages of good language verbosity,
  • Ignoring the subject of source code maintainability, which is kinda crucial for applications that are going to be used and changed for more than just a few months.

Full blown version

First I’d like to share my opinion on how the discussed article looks as a whole. It reminds me a bit of some chapters of my engineer’s thesis: its conclusion seems to have been drawn long before the article was written. I’m using GitHub as the repository for my thesis application for a few reasons. The main one is that I just wanted to get better at using git, and GitHub seemed like a reliable remote repo. I know it’s not an engineer’s way of thinking, but that’s how I thought about it. In my thesis I’ve written three A4 pages in 10pt font about why it was git and not Subversion or Mercurial. Of course all the arguments were pretty much chosen so that the final choice would be git. Maybe it’s just my opinion, but that’s how Clint’s article reads to me.

But let’s leave the whole picture behind and focus on the little details that annoyed me.

Is code expressiveness the most important thing?

This idea that expressivity leads to higher productivity goes back to the 1960’s and Fred Brooks’ seminal book The Mythical Man Month. Brooks argues that the average number of lines of code a programmer writes per day is constant no matter what language is used. So, if a program takes 10,000 lines in one language and 20,000 lines in another, Brooks asserts that it would take twice as long to write the program in the second language as it would in the first.

When I was reading this paragraph I didn’t know what was coming. It seemed pretty fair, although it alarmed me a little bit: what is this “expressivity” that he’s talking about? Is he saying that, generally, the less you write to accomplish the goal, the better? What about reading the code? Should code be created on a write-once, read-never basis as long as it works fine? That didn’t seem wise, but hold on.

First, in non-expressive languages important logic sometimes gets hidden in a big mess of boilerplate code. You have to read through a lot of junk to get to the key points in the code. Or worse, you may not notice some of the key logic points in the code at all because they’re so buried. With expressive languages, the key logic is much easier to spot. Second, expressive languages allow you to view more code at once on your monitor without having to scroll around as much.

I can get the “mess of boilerplate code”, but you can always pack it into some well-named methods. You can always hide it. The difference between languages with high expressiveness and those without it is that the first group hides its boilerplate inside the language core and optionally exposes some API to mess with it, while the other group makes you prepare the boilerplate yourself. Yes, preparing boilerplate yourself takes time, but doesn’t it give you additional control over your solution? The second argument is fairly simple and I can agree with it, but there’s a hitch. If your code fits on the screen, nothing encourages you to make it better by refactoring. You could have a 50 LOC method which fits on the screen and looks fine, while in a less expressive language it would take 100 LOC and call for a refactor. It’s a wild example, but you get the point, right?

The code

Point

And there’s the code… Here comes the disservice to Java. Just observe:

class Point {
    private int x;
    private int y;

    public Point(int x, int y) {
      setX(x);
      setY(y);
    }

    public int getX() {
      return x;
    }

    public void setX(int x) {
      this.x = x;
    }

    public int getY() {
      return y;
    }

    public void setY(int y) {
      this.y = y;
    }

    @Override
    public boolean equals(Object other) {
      if (other instanceof Point) {
        Point otherPoint = (Point) other;
        return otherPoint.getX() == getX() &&
            otherPoint.getY() == getY();
      } else {
        return false;
      }
    }

    @Override
    public int hashCode() {
      return (new Integer[] {getX(), getY()}).hashCode();
    }
}

First let’s deal with equals and hashCode. In modern IDEs they’re easily generated. Yes, they’re still there, crowding the screen, but it’s not like they decrease your productivity! Secondly, this is not a class. This is just a data structure. Remember C? It’s almost like the class keyword wants to be a struct. It’s not really OOP. If you really want to have structs in Java, why not just create a class with public properties? In the further notes, the author points out that with public properties you can’t simply add behavior to the setter, like throwing an exception if x is greater than some limit. But is the setter a good place for that logic anyway? You could put the check in the constructor, but if you want to change the object during its lifetime from the outside (by using a setter method) then it’s pointless. This class doesn’t encapsulate anything, even though the properties are private. It’s inherently badly designed.

By the way, if you really want to have those getters and setters, they can be easily generated too… Whoever created that functionality in IDEs should be punished somehow. Severely punished.
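
To make the “struct” idea a bit more concrete, here is a minimal sketch of what such a Point could look like in plain Java. It is my own illustration, not code from Clint’s article: the name ImmutablePoint, the public final fields and the withX helper are all assumptions, and it is only one of several reasonable designs.

```java
import java.util.Objects;

// Hypothetical alternative to the mutable bean above: a small immutable value type.
final class ImmutablePoint {
    // Public final fields: readable like a struct, but never reassigned.
    public final int x;
    public final int y;

    ImmutablePoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // "Changing" a coordinate returns a new point instead of mutating this one.
    ImmutablePoint withX(int newX) {
        return new ImmutablePoint(newX, y);
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof ImmutablePoint)) return false;
        ImmutablePoint that = (ImmutablePoint) other;
        return x == that.x && y == that.y;
    }

    @Override
    public int hashCode() {
        // Objects.hash combines the field values, so equal points get equal hash codes.
        return Objects.hash(x, y);
    }
}
```

Because nothing can change after construction, any validation can live in the constructor and stay valid for the whole life of the object. The sketch also hashes the field values rather than an array object, since Java arrays inherit the identity-based hashCode from Object.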

On the opposite side he gives an example from Scala:

case class Point(var x: Int, var y: Int)

It’s cool that it’s so short, but something died inside of me when I saw those vars in Scala. Maybe var in Scala isn’t a billion-dollar mistake like null in Java, but still, it should be used only after carefully considering the consequences.

Word map

Another code example he gives maps a list of sentences into a map from each word to the sentences it appears in. Here it goes:

public Map<String, List<String>> makeWordMap(
    List<String> sentences) {

  Map<String, List<String>> result =
    new HashMap<String, List<String>>();

  for (String sentence: sentences) {
    for (String word: words(sentence)) {
      List<String> sentencesForWord = result.get(word);
      if (sentencesForWord == null) {
        sentencesForWord = new ArrayList<String>();
        result.put(word, sentencesForWord);
      }
      sentencesForWord.add(sentence);
    }
  }

  return result;
}

I have two big problems with this. The biggest one is that it’s procedural programming all over again: a method operating on data, not an object that lets us use its behavior through methods. He says that the Map.get(Object key) method can give you null and cannot give you a default value instead, but in fact Java 8 has a method called getOrDefault. You can also always use Guava’s firstNonNull or Optional.fromNullable. The other problem is the code structure itself. He could have used streams and lambdas, he could have extracted some methods, but he didn’t. In Scala his code looks as follows:

def makeWordMap(sentences: List[String]):
    Map[String, List[String]] = {

  val initMap = Map.empty[String, List[String]]

  sentences.foldLeft(initMap) { (map1, sentence) =>
    words(sentence).foldLeft(map1) { (map2, word) =>
      map2 +
        (word -> (sentence :: map2.getOrElse(word, Nil)))
    }
  }
}

That’s cool, but how much less verbose is it? foldLeft tells you absolutely nothing if you don’t know the method. Another concern is Scala operators like -> and ::, which are enigmatic unless you know the API. The Scala code above fails the Ivona test [Polish], which tells you something about its readability. It’s not like I hate Scala, but a huge codebase of legacy Scala code is much harder to comprehend than legacy Java code. More on that later.
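
For comparison, here is a rough sketch of how the Java loop could shrink with nothing more than Java 8’s Map.computeIfAbsent. The WordMaps class name is mine, and the words helper is a hypothetical stand-in (Clint’s example never shows it), so treat this as an illustration rather than a drop-in replacement.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class WordMaps {
    // Hypothetical stand-in for the words(...) helper, which the original example
    // does not show: it simply splits the sentence on whitespace.
    static List<String> words(String sentence) {
        return Arrays.asList(sentence.trim().split("\\s+"));
    }

    static Map<String, List<String>> makeWordMap(List<String> sentences) {
        Map<String, List<String>> result = new HashMap<>();
        for (String sentence : sentences) {
            for (String word : words(sentence)) {
                // computeIfAbsent (Java 8) creates the list on first use,
                // which replaces the explicit null check and put.
                result.computeIfAbsent(word, w -> new ArrayList<>()).add(sentence);
            }
        }
        return result;
    }
}
```

The loops are still there, but the null handling is gone, and every name in the body says what it does.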

Plotter

Another code example shows how much he hates interfaces. Seriously. The best code sample here is written in Python:

def plotSquares(n, plotter):
  for x in range(0, n + 1):
      plotter.plot(Point(x, x * x))

It’s supposed to be the best because there’s no interface behind plotter, and you can just call the plot method without creating some obsolete code. Obsolete? An interface? Some people say that every class with public methods should implement some interface. An interface is a contract, metaphorically signed by the class that implements it. Of course an implementation can just throw UnsupportedOperationException, but leaving that in your code is just silly. In Python you never know whether the object behind the plotter variable will have a plot method. Do you know when you’ll find out? When you get a runtime exception. And that is not the exception you want to get on a Friday evening.
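
Here is a small sketch of what that contract could look like in Java. All the names (Plotter, RecordingPlotter, Squares) are hypothetical, and I pass plain int coordinates instead of a Point only to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical contract: anything passed as a plotter must provide plot(x, y).
interface Plotter {
    void plot(int x, int y);
}

// One possible implementation that just records the points it is asked to plot.
final class RecordingPlotter implements Plotter {
    final List<String> plotted = new ArrayList<>();

    @Override
    public void plot(int x, int y) {
        plotted.add(x + "," + y);
    }
}

final class Squares {
    // Mirrors the Python loop: plots (x, x * x) for x = 0..n inclusive.
    static void plotSquares(int n, Plotter plotter) {
        for (int x = 0; x <= n; x++) {
            plotter.plot(x, x * x);
        }
    }
}
```

A call like Squares.plotSquares(3, "not a plotter") doesn’t even compile, so the missing plot method is found at your desk and not in production.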

Songs

The last code example shows some ignorance about good software practices, Java 8 and additional libraries like Guava. This is his Java code:

Integer getSongLength(
    Artist artist,
    String albumName,
    String songName) {

  Album album = artist.getAlbum(albumName);
  if (album != null) {
    Song song = album.getSong(songName);
    if (song != null) {
      return song.getLength();
    } else {
      return null;
    }
  } else {
    return null;
  }
}

Yes, this code is definitely not the best you can write. But why is that? It’s not like Java makes you write bad code. Yes, maybe it doesn’t make it easy to write great code from day one, because of its old API which throws nulls at you like a fire hose, but the choice to put some heart into your code is absolutely yours. All of those methods could have returned Optional, either from Java 8 or from Guava, and it would look much better. Java 8’s Optional has some awesome stream-like methods like map, which make the code even better. With all that I managed to write the code below (it would be better if it were truly OOP, but I wanted to stick to the method signature that Clint provided).

Integer getSongLength(Artist artist, String albumName, String songName) {
    return artist
            .getAlbum(albumName)
            // flatMap rather than map, assuming getSong also returns an Optional
            .flatMap(a -> a.getSong(songName))
            .map(Song::getLength)
            .orElseThrow(() -> new NoSuchSongException(artist, albumName, songName));
}
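
If you want to try the Optional-based approach yourself, here is a self-contained sketch. The Artist, Album and Song classes are hypothetical minimal versions (neither article shows the real ones), and this variant returns an Optional instead of throwing, which is a design choice of the sketch and not something Clint’s signature requires.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical minimal domain classes; the originals are not shown in either article.
final class Song {
    private final int length;
    Song(int length) { this.length = length; }
    int getLength() { return length; }
}

final class Album {
    private final Map<String, Song> songs = new HashMap<>();
    void addSong(String name, Song song) { songs.put(name, song); }
    // Optional.ofNullable turns a missing map entry into an empty Optional.
    Optional<Song> getSong(String name) { return Optional.ofNullable(songs.get(name)); }
}

final class Artist {
    private final Map<String, Album> albums = new HashMap<>();
    void addAlbum(String name, Album album) { albums.put(name, album); }
    Optional<Album> getAlbum(String name) { return Optional.ofNullable(albums.get(name)); }
}

final class Songs {
    static Optional<Integer> getSongLength(Artist artist, String albumName, String songName) {
        return artist.getAlbum(albumName)
                // flatMap, because getSong already returns an Optional
                .flatMap(album -> album.getSong(songName))
                .map(Song::getLength);
    }
}
```

A missing album or a missing song both end up as an empty Optional, so the caller decides whether that deserves an exception, a default value, or nothing at all.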

Is his Scala code better? In my opinion yes! Definitely! Just look at it:

def getSongLength(
    artist: Artist,
    albumName: String,
    songName: String): Option[Int] = {

    for {
        album <- artist.getAlbum(albumName)
        song  <- album.getSong(songName)
    } yield song.getLength
}

It’s pretty, but it doesn’t undo the disservice made to Java.

Maintainability

I think I might have written enough for my first blog post, but there is still the untackled issue of code maintainability versus its expressiveness. I’ll try to keep it short, I promise!

If code expressiveness and the number of lines written are the only measurements you take in a project, you are on a highway to hell. During the first months of a project’s lifetime it’s much faster to skip writing tests, doing TDD and refactoring. If you ignore those things you’ll produce one feature after another at rocket speed. Just look at the diagram right below:

[Diagram; stolen from Patrick Wilson-Welsh]

While working on the first three releases it’s so much easier to just skip refactoring. Heck, you can skip all good practices altogether! The trouble comes only after the third release. By the fifth release it’s already twice as expensive to build on an unrefactored system as on a continuously refactored one. By that logic it really doesn’t matter so much whether the language is very expressive or not. What matters in the end is whether you can easily maintain your pace of delivering new features. In my opinion, even though Scala is a great language, it’s harder to keep it readable after a few years than Java. Of course Scala is a young language, so it’s hard to tell. And even though it should be easier to write maintainable software in Java, we have so many crappy systems written in it that I can’t honestly say it’s all about the language and its expressiveness.

Conclusion

So what’s my conclusion? Really, it all depends on the software engineers. There’s (probably) no project that failed because of the choice of programming language, but there are many that failed because of the skills and practices of the development team. Just try not to be shortsighted, that’s all I’m asking.
