Wednesday, January 27, 2016

Java vs Node/JavaScript, Mind Storm

I just watched on YouTube an interesting discussion between Java and Node supporters; the videos are in Spanish, here and here. This subject matters to me, and soon my brain started to bring up ideas, past facts, even traumatic experiences.

This article is an explosion of unordered ideas about Java, Node, JavaScript, and explicitly typed versus weakly typed languages.

The root of evil, origins of JavaScript

Although I was very young, I lived through the birth of JavaScript. The W3C page about this historic fact is really scary.

JavaScript, not to be confused with Java, was created in 10 days in May 1995 by Brendan Eich, then working at Netscape

I don't remember the first release of the Netscape browser with JavaScript, but I remember the reasons for introducing JS:
  1. Provide a simple scripting language "inspired by" Java to control Java applets embedded in web pages.
  2. Provide a simple scripting language to control forms.

Later Netscape introduced a public DOM representation of the page (partial at first), visible to JavaScript, to support page changes driven by user actions beyond forms.

As you can see, the initial motivation WAS NOT to create a complete and powerful language for developing WEB CLIENT APPLICATIONS.

The JavaScript creators were just thinking of minor form manipulation (validations and similar), color changes and some text added to the web page on user demand, no more than 100 lines of code (as an example).

Of course developers are ambitious, they want more and more, browsers could become more sophisticated and JavaScript could be improved... And yes, it happened, UNTIL Microsoft DESTROYED Netscape, both the product and the company: MS Internet Explorer 5 won the browser war. Microsoft later released MSIE 6 AND FROZE web technology for MANY years, being the de facto browser with no competition, because Netscape 4 (the loser) was crap, with enormous technical debt. The code was finally open sourced and the Mozilla Foundation was created to build a new generation of W3C compliant browsers (almost no code of the old Netscape was reused).

This reinvention took many years. Meanwhile, the absolute success of Windows XP + MSIE 6 and Microsoft's lack of innovation in the web area stalled web technology, including JavaScript, until the Mozilla suite first, and Firefox later, started to become popular (you know the rest of the story). Today MSIE 8 (not very different from MSIE 6) still has a significant market share (yes, really), and this share is probably greater in companies.

Browsers have improved at rendering pages and executing JS code, but JavaScript, the language, is mainly the same, thanks to Microsoft.

JavaScript is what it is as the result of a lack of innovation; there is no "transcendental reason", there is no "deep thinking" about how to make a language good enough for a long future. For instance, the "prototype" feature was just a quick idea to create a mediocre object system, a mediocre version of Java and OOP, but good enough for a scripting language of the past.

Single-threaded asynchronous programming is more performant than thread-based programming

Surely you have read MANY TIMES how bad and costly thread context switching is, and how good single-threaded programming is when it delegates blocking operations to a thread pool (yes, again a thread pool, the dark secret of Node).

But the real facts do not agree: take a look at Paul Tyma's experiments, my own insights or the famous TechEmpower benchmarks. Look at how the venerable raw Java servlet, blocking threads on I/O, performs, and at the position of the async I/O Java alternatives; of course Node is in an even lower position.

Java is tedious and verbose. Is programming with JavaScript quicker? 

It depends. If you are doing a very simple web application with Node, yes, you are going to deliver some result before the Java developers, but when the code becomes bigger and bigger the missing types are more of a problem for productivity than an advantage.

The Java type system, with its optional generics, is not an evil invention to make the life of Java developers sadder; it is a tool to make your code more robust, and the cost in verbosity is repaid with better code. Any Java developer understands that "List<User> hello" is a list of User (readability) and can only contain User objects (robustness) in spite of the stupid variable name. Is that bad?
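As a small illustration of that point, here is a minimal sketch in Java. The User class and the namesOf method are hypothetical, invented only for this example; the badly named "hello" list is kept on purpose.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical domain class, used only for illustration.
final class User {
    private final String name;

    User(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }
}

final class TypedListSketch {
    // The parameter type alone says that every element is a User,
    // whatever the variable is called.
    static List<String> namesOf(List<User> hello) {
        List<String> names = new ArrayList<>();
        for (User user : hello) {
            names.add(user.getName()); // no cast and no runtime type check needed
        }
        return names;
    }

    public static void main(String[] args) {
        List<User> hello = new ArrayList<>();
        hello.add(new User("ana"));
        // hello.add("not a user"); // rejected by the compiler: a String is not a User
        System.out.println(namesOf(hello)); // prints [ana]
    }
}
```

The commented-out line is the interesting one: the mistake is caught before any test is executed.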

Can you make big web applications in Node? Yes, of course, everything is possible with effort and discipline (in weakly typed languages the discipline must be "military", even stricter than in an explicitly typed one), but in the long term a dynamic, weakly typed language becomes a nightmare in projects with a lot of code, even in projects with a single developer (that is my experience).

The problem of arguments based on an inferiority complex

I remember the times when Java was invented and was becoming more and more popular. Java was initially a simplification of C++, and many C++ developers received Java with some skepticism. To win more Java developers from C++, Java supporters praised the "simplification" as a feature. For instance, the Java class system lacked the powerful template system of C++: "you don't need it, everything is simpler in Java". This kind of argument, trying to turn an inferiority into an advantage, is unfair and a bad trick, but it usually works. Years later Java 1.5 introduced generics, yes, conceptually similar to the venerable C++ templates.

This is an example of the art of turning a limitation into a feature. Although I have been a Java guy for many years (and a C++ guy before), Java marketing was not honest.

A Node instance has only one working thread, so no concurrency problems; on the other hand, developers are forced to execute ALL the business logic of ALL concurrent users in that same single thread. This introduces a severe limitation, the problem of "CPU bound applications": your business logic must be very simple and quick, otherwise all users will be stalled.
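The stall can be sketched in Java, under a loud assumption: a single-thread executor is used here only as a stand-in for the one thread that runs user code in a Node process, it is not how Node is implemented. The class and method names are hypothetical, and the 200 ms busy loop stands in for heavy business logic.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class SingleThreadStallSketch {
    // Simulates CPU-bound business logic by spinning for roughly the given time.
    static void burnCpu(long millis) {
        long end = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(millis);
        while (System.nanoTime() < end) {
            // busy loop standing in for heavy computation
        }
    }

    // Submits a slow request and then a quick one to the given executor
    // and returns the order in which they finished.
    static List<String> completionOrder(ExecutorService executor) {
        List<String> finished = new CopyOnWriteArrayList<>();
        executor.submit(() -> { burnCpu(200); finished.add("slow"); });
        executor.submit(() -> { finished.add("quick"); });
        executor.shutdown();
        try {
            executor.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the interrupt flag for the caller
        }
        return finished;
    }

    public static void main(String[] args) {
        // One worker thread: the quick request must wait behind the slow one.
        System.out.println(completionOrder(Executors.newSingleThreadExecutor()));
        // Two worker threads: the quick request no longer has to wait.
        System.out.println(completionOrder(Executors.newFixedThreadPool(2)));
    }
}
```

With the single worker the first line is always [slow, quick]: the quick user is stalled by somebody else's computation. With two workers the quick request usually finishes first, although the exact order depends on the scheduler.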

Related to the previous limitation, do you know the anecdote "Why npm's progress bar slows down install time by ~20%"? The explanation is simple: updating the progress bar is time consuming and is executed in the main thread instead of in a new thread. I'm sure it will be fixed, maybe by executing the log operation in an extension coded in C/C++ in a different thread, or similar. This makes me think, are we in 1970? Are we coding on Windows 3.1/95 (the latter with a crappy non-preemptive thread system)?

In practice any Node application is multithreaded, because blocking I/O operations are executed through a thread pool; if thread context switching is costly, Node has the SAME problem as a thread-per-request pool.

Node is trying to sell us how good it is to lack thread programming: "the missing feature is the best feature". Something similar happens in the JavaScript space: the inherent lack of features to structure code (that is, the lack of decent OOP support) is sold as "freedom" and simplification.

JavaScript is mediocre, period (ES6 is a step forward; it would be even better if it had optional types like Dart). Do you like JavaScript/Node? All right, but formally it is a mediocre language, hard to defend outside of the browser; Java is by far superior.

Is Java the best OOP/functional language in the world? Of course it is NOT. In the OOP space I'm sure most people agree with me that Scala, Kotlin, Ceylon, typed Groovy or C# are SUPERIOR languages, no problem, and in no way is it the best functional language.

"The language is not important, the developer is"

Yes, you have heard/read this polite phrase often, and yes, a good developer can be good in ANY language... BUT...
  • A good surgeon is a good surgeon in a shelter tent and in a conventional operating room of a modern hospital. Try to ask him/her what is the best place for surgery...
  • Can you do surgery with a kitchen knife? Yes you can but I'm sure you prefer a scalpel.
  • Do you think a veterinary surgeon is appropriate to do surgery on humans?

The myth of the complexity of concurrency management in "conventional" thread-per-request web apps

This is a very bad argument. A very large percentage of conventional web apps have NO problem with thread concurrency because in most of them all the code is effectively single-threaded: a request has no need to deal with other parallel requests (threads). The database is in practice the synchronization point; all decent RDBMSs ensure consistency in column writes, and if you need more sophisticated synchronization you have transactions.

Node, JavaScript and foundation libraries

Java is not only the language; a large library is part of the Java platform, which doesn't prevent smart developers from picking alternatives (for instance Guava, Apache HttpClient, Apache Commons etc).

The same does not apply to Node: something as basic as a decent collection system (beyond the crazy native arrays of JavaScript) must be downloaded externally.

An explicit type system is not only for reducing the number of tests

Yes, you know how useful a good type system is: if you do something wrong, usually some other place breaks and the compiler warns you with no need to execute tests.

However, an explicit type system provides other very important benefits:
  • Readability: read the code of a Java class from a decent open source product, I'm sure you're going to understand the function of the class just by reading the types.
  • Robust code: MyClass obj can only reference objects of MyClass or of classes inheriting from it; List<MyClass> list is a list and can only contain objects of MyClass or of inheriting classes (a final class is also possible).
  • Reliable code navigation: any decent Java IDE offers a "Find Usages" ("References" in Eclipse) feature. If you use this feature on "any" element of your code, the IDE is going to show with precision all uses of this element (an attribute, param, local variable, method, constructor, class name etc).
  • Reliable code refactoring: any decent Java IDE offers many options for easy and safe refactoring, for instance renaming (the simplest case, IDEs offer more complex refactoring options). Brutal refactorings can be performed with almost no risk.

Paradigms, paradigms, paradigms

At the end of the day, the important characteristic of your preferred language is the paradigms it supports, for instance structured (e.g. C), OOP, OOP+functional (Java, Scala...), pure functional (Lisp, Haskell, Clojure...).

In Node/JS you can do OOP, but the approach is tedious and cumbersome.

I suspect that in practice OOP is not mainstream in JavaScript, maybe "objects" are. I suspect mainstream JavaScript code is like C with functions capturing context variables, or something similar to Visual Basic 6, pre .NET (it had classes but not inheritance and polymorphism).

I'm not kidding, take a look at this C-like code in Mozilla based products (for instance Firefox).

The myth of inheritance abuse

Some people say that "people" usually abuse inheritance, one reason for bashing OOP (the typical all-or-nothing argument, a bad argument in my opinion).

I don't agree; in the real world it is the contrary. Many people inherit ONLY when there is no other option (when the API being used provides an abstract class or interface to implement), but you rarely see fully user-defined inheritance trees not forced by frameworks. Of course this is a personal view and isn't scientifically backed.

Many people don't know how to model a class system including inheritance, encapsulation and polymorphism where appropriate; others avoid it maybe because they are following the mantra of some "gurus" saying "inheritance is bad".

A long time ago, when OOP became popular, many people abused inheritance, trying to fit too many things in the same class tree, for instance a class User inheriting visual code, persistence code, networking etc. Fortunately that time is gone; however, some people today invite us to go back to this old crazy approach because they don't like the modern layering/services approaches (and usually they are the same people saying inheritance is bad).

Fortunately good OOP developers use inheritance (the complete OOP features in general) when needed.

Are you able to imagine?

  • A Big Data tool made in Node.js
Hadoop, HBase, Spark, Storm, Kafka, Cassandra, Kubernetes, Docker? Most Big Data tools are made in Java or a similar JVM language (Docker and Kubernetes are written in Go).
  • Replacing Java Android core on top of C++ by Node.js
Maybe really a good idea? I remember years ago the leaked statement of a Google executive about the language on Android, something like "Java is the only option, anything else is crap" (modern statically typed JVM languages had not been created yet).
  • A desktop application in Node.js 
Firefox, Thunderbird etc. do not count because most of the code is C++; the JavaScript/HTML/XUL is the external layer. In the case of Java, just look at the three main IDEs: NetBeans, Eclipse and IntelliJ (Visual Studio, as far as I know, is C++ and .NET).

Does Node.js have a reasonable place in the world?

Yes, of course, Node is useful for conventional web sites with small and relatively simple business code.

Oh yes, some important parts of PayPal are built with Node, at least the public web site. Right, OK; however, PayPal is a relatively easy service, yes really, PayPal is far, far away from the enormous complexity of the Google, Amazon or Apple ecosystems. I remember the article about the "competition" between the options, Java and Node, for replacing the old C++ based tech of PayPal; I'm sure you know Node won.

In my opinion it is quicker to make simple spaghetti than a solid Java version, and as far as I remember the Java developers tried to include every Java framework invented in SpringSource and GitHub (the competition was to make a quick prototype of a service, not a production service) and also imported the kitchen sink. Yes, Node is fine in the short term; the LONG term is different... Remember the case of Twitter and Ruby. I respect Ruby; Twitter is what it is thanks to the flexibility of Ruby in the beginning, before becoming the current monster, by the way, today based mainly on Java.


No, this is not the end, stay tuned for the next entry about Java and JavaScript, much more personal....


Monday, January 11, 2016

Why Useful TDD is a Unicorn

Some description of the scene

First of all we must explain what TDD is. TDD = Test Driven Design, that is, the design is driven by tests, and the design can only be driven by tests if the tests are coded FIRST, that is, BEFORE THE IMPLEMENTATION DETAILS.

Obviously the method or methods to test must be coded first, in theory with a simple and minimal implementation, just enough to compile (in compiled languages of course) and run so that the test fails.
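A minimal sketch of that first step in plain Java, with no test framework so that it stays self-contained. The StringCalculator stub, its add method and the test class are hypothetical names chosen only for illustration.

```java
// Hypothetical class under design: a stub with just enough code to compile.
final class StringCalculator {
    static int add(String numbers) {
        return 0; // deliberately minimal, there is no real implementation yet
    }
}

final class StringCalculatorTestFirst {
    // The test is coded first and states the behavior we expect.
    static boolean addsTwoCommaSeparatedNumbers() {
        return StringCalculator.add("1,2") == 3;
    }

    public static void main(String[] args) {
        // Against the stub the test fails: this is the "red" step of test-first.
        System.out.println(addsTwoCommaSeparatedNumbers() ? "GREEN" : "RED"); // prints RED
    }
}
```

Only after seeing the test fail would the implementation details of add be written.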

When I talk about TDD I want to underline the primary concept: TEST CODE IS CODED FIRST. Underlining this is absolutely necessary because many people think TDD = TESTING. Many articles about TDD are just articles about testing, many praises of TDD are in practice praises of testing "in general", and when TDD is criticized or questioned, it is automatically read as a rant against TESTING. The need for testing is as old as software development.

In this article I'm going to say what I think about TDD. I'm no longer a young guy, and fortunately I'm not a "consulting seller" and I'm not forced to publicly defend something I don't "believe" in. This is hard in a software world/industry full of people anxious to be "trendy", something we could call the "me too syndrome".

In this context it is very hard to "contradict" the "industry"; the industry is ALWAYS trying to sell unicorns as real stuff, and in my opinion TDD is one of these unicorns. Many (most?) people say "I've got TDD, I use TDD to make software, TDD is great". The problem with TDD is like teen sex: everybody talks a lot, but in practice...

This article is brief, I have no interest in ranting about TDD. TDD wants to be an honest effort to make software better and more predictable; there are a lot of articles about how great TDD is, and of course some rants. This is just my view.

Is TDD possible? The TDD paradox

Yes OF COURSE TDD is possible, the problem is in the details.

Has someone told you he/she is doing TDD? Don't believe a word until you see him/her coding.

My thesis is simple, TDD is possible of course but only in trivial code!!! USEFUL TDD IS NEAR IMPOSSIBLE OR AN ABSURD WASTE OF TIME!!

Yes, this provocative conclusion is presented first to build expectation for the explanation :)

Try to remember how many articles introducing TDD you have read; yes, most of them are trivial examples, where the implementation code is so simple that most of it must be written before the test because otherwise it is not possible to execute any test.

Take a look at the Java code of this example: the article teaches you to avoid testing each method and instead invites you to test behaviors... Behaviors? What behaviors? The example is so extremely simple that it is hard to find a "behavior" use case.

Do you remember the "revolutionary" Ian Cooper's talk TDD: Where Did It All Go Wrong?

The key phrases:

Avoid testing implementation details, test behaviors

– A test-case per class approach fails to capture the ethos for TDD. Adding a new class is not the trigger for writing tests. The trigger is implementing a requirement.

– Test outside-in, (though I would recommend using ports and adapters and making the ‘outside’ the port), writing tests to cover then use cases (scenarios, examples, GWTs etc.)

– Only writing tests to cover the implementation details when you need to better understand the refactoring of the simple implementation we start with.

Ian invites us to avoid the classic low-level testing apparently accepted by everybody for TDD (test every class, test every method). The rationale is simple: the lower the level, the more trivial the code we are testing, and the lower the level, the lower the value of the test. Consequently the number of tests is extremely high and at the same time the value of the tests is extremely low, a lot of work for nearly nothing in return.

OK, we accept we must practice TDD (remember, test-first, before implementation details) with behaviors. A "behavior" beyond a simple class or method implies, in OOP, a class system design, that is, several top-level classes collaborating to get a result or action (unless you are coding a kind of StringCalculator parser), ignoring the necessary internal classes (implementation details). Identifying the top-level classes is not a trivial task; it involves trial and error, redesign, external API conceptualization...

In the best case you are going to get a bunch of nearly empty top-level classes and their public methods; THEN perhaps you are ready to start coding tests (obviously red tests in TDD). When implementing the details you are frequently going to bang your head against the wall of premature API optimism: the implementation details usually influence the public API of the module we are developing (the industry knows a lot about the problem of defining APIs on paper), because unless you have a clear pre-designed public API for end users imposed as a requirement, your API must be designed up front and you must predefine the scope if you really want to practice TDD.

The result is several iterations of implementation/refactoring and API changes until the final API is stable enough; meanwhile your first ambitious "test first" code is in practice just a draft being absurdly modified again and again and again... Are we really doing Test Driven Design? Really? Is the test-first obsession worth it? It is up to you, but perhaps this is the time to think about the sense of the hype: