Out of Context

28 Sep

Lots of things can be taken out of context. It is, however, the author’s burden to clarify. Consider this redemption from a technologist who has fought on various battlefields.

1. Protobuf vs. JSON, XML or Avro. It is no secret that I “hate” protobuf. The context is exchangeable data, where a rigid RPC solution doesn’t make much sense.
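As a rough illustration of what "exchangeable" means here, the sketch below uses only Python's standard json module. The record and its field names are made up for the example. It shows the JSON side only and is not a benchmark or a fair comparison: the payload carries its own field names, so an older reader can consume it without a shared schema file or generated stubs.

```python
import json

# Hypothetical record written by a newer producer; all field names are illustrative.
payload = json.dumps(
    {"id": 42, "name": "widget", "tags": ["a", "b"], "added_in_v2": True}
)

def read_record(text):
    """Older consumer: picks the fields it knows about and ignores the rest."""
    data = json.loads(text)
    return {"id": data["id"], "name": data.get("name", "")}

record = read_record(payload)
```

The extra fields simply pass through unnoticed, which is the property that matters when data is exchanged between loosely coupled systems.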

2. Java vs. C#, Python or C(++)/PHP/Perl, Ruby, JavaScript …

Performance and platform zealots use C(++). PHP is at home with them for scripting their C(++) solutions for the web, and Perl serves the backend. If we need the highest performance and a fully open source stack, this is the best solution. It is the choice of Facebook and, to some extent, Google.

Java’s speed is not bad, and it is more approachable: there are lots of libraries and fewer things to worry about. But it is always memory hungry and starts slowly. Its syntax and basic data type support are also not the most desirable. I have no problem using it for projects where it fits and the customer wants it. It is certainly cheaper than C(++) to develop in. It is also easy to set up clusters of middleware for business-critical systems.

C# is comparable to Java. But for some crazy reason Microsoft decided that only the Windows platform was worth supporting, leaving server and mobile to Java. When Java (Android) came to rule mobile and server, even the staunchest Windows fans had to decamp to where the jobs were. Windows became the old standard desktop-only solution, and C# went with it. Small businesses still use Microsoft for everything, but large enterprises use Microsoft only for the desktop. MonoTouch/Unity attempted to piggyback on the skills of the few die-hard Microsoft developers left, carving out a little survival space in the corner. If one doesn’t deliver what customers want, they will leave and not come back. That said, hacking out quick client-side solutions in C# is still much easier and less frustrating than in Java.

Python is a very pleasant language that also has a lot of libraries and frameworks. The widespread Python 2.x is not new either, and the language dates from before OOP went mainstream. But it is fun, practical and useful. Networking, data analysis and visualization are among its strengths.

Ruby rose, as I see it, because Jython was never really good enough to do everything CPython can do. A new language without the huge libraries, which are both an asset and a formidable task to support, fits the bill. It is smaller than Python to install and, like Python, more maintainable than Perl for new users, so it fits rapid prototyping and system administration.

JavaScript started life in the browser. It is more familiar to web developers and, like Ruby, has had implementations in C and on the JVM from early on. It is also sometimes used for data queries and server-side development (node.js).

We also have other notable languages like Scala, Go and F#, and the old SQL, LISP…

In the end, it is moot to judge these languages by a simple shootout. Use the right tool for the right task. If the algorithm is chosen well and the IO bottlenecks are resolved, they can all do a lot of different jobs. If we need to do simple things in nanoseconds, we can do it in C or hand-tuned assembly. For general server programming, Java, PHP and sometimes Python are the choices.
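To make the point about algorithm choice concrete, here is a small Python sketch. The two functions are illustrative, not taken from any real project. Both answer the same question, but the first compares every pair of items while the second remembers what it has seen in a set. On large inputs that difference will usually outweigh the choice of language.

```python
def has_duplicates_quadratic(items):
    """Compare every pair of items: O(n^2) comparisons in the worst case."""
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                return True
    return False

def has_duplicates_linear(items):
    """Track seen items in a set: O(n) expected time, O(n) extra memory."""
    seen = set()
    for item in items:
        if item in seen:
            return True
        seen.add(item)
    return False
```

The linear version trades memory for time, which is exactly the kind of decision that matters more than a language shootout.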

3. Revolution/evolution development

Depending on the case, sometimes it is easier to burn the old code and redo it almost from scratch. Most of the time evolution makes more sense. The choice depends on whether there is a big requirement change or misfit, and on how maintainable the current code base is. It really sucks to have to maintain bug-for-bug compatibility. It is wrong and crazy, and a typical symptom of lousy product management / business.

4. High level and low level programming

Most of the time we should all develop at the highest possible level, since computer hours are cheap and development effort is expensive. In real life we often have to plumb the details to process the amount of data fast enough. So we go from tools like Hive, Pig, MATLAB, R, Python and SQL down to Java and C. We also use columnar stores, bitmap indexes, Bloom filters and estimation for OLAP, plus SSDs and in-memory data grids, to improve performance in general. These days a lot of things are achieved with massive data and computing power (like Google translation). Since we never have enough hardware resources because of the cost, it often doesn’t hurt to be as efficient as possible.
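Of the techniques listed above, the Bloom filter is small enough to sketch in a few lines of Python. This is a minimal illustration, not a production implementation: the class name, the default sizes and the use of SHA-256 with a seed prefix are all assumptions made for the example. The filter answers "definitely not present" or "possibly present", which is the estimation trade-off that lets a system skip expensive lookups.

```python
import hashlib

class BloomFilter:
    """Illustrative Bloom filter; sizes and hashing scheme are assumptions."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions by hashing the key with a seed prefix.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means definitely absent; True means possibly present.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key)
        )
```

A filter like this never gives a false negative, so it can sit in front of a slower store and rule out keys that were never added. False positives are possible and become more likely as the filter fills up.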

5. MySQL vs. PostgreSQL or such

It is really simple: MySQL does simple things faster, with good replication/clustering support. It is good for OLTP-style workloads (when the data is not critical) and for sharding. PostgreSQL (including products in the family like Greenplum, Vertica) works better for OLAP-style analytics. We have Hadoop for even larger and less structured data, and specialized document or graph databases.
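Sharding, mentioned above, comes down to routing each key to one of several databases. The Python sketch below shows the simplest form, hash-modulo routing. The host names and the helper names are hypothetical, and the example is independent of any particular database driver.

```python
import hashlib

# Hypothetical shard hosts; replace with real connection settings.
SHARDS = [
    "db0.example.internal",
    "db1.example.internal",
    "db2.example.internal",
    "db3.example.internal",
]

def shard_for(key, num_shards):
    """Map a string key to a shard index with a stable hash."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_shards

def shard_host(key):
    """Return the host that should hold the row for this key."""
    return SHARDS[shard_for(key, len(SHARDS))]
```

A stable hash is used instead of Python's built-in hash(), which is randomized per process for strings. With plain modulo routing, changing the number of shards remaps most keys, so real deployments often plan the shard count up front or use a scheme such as consistent hashing.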

6. Strong Consistency vs. Eventual Consistency

In the majority of cases, strong consistency of data is more important. For cloud-based services, though, eventual consistency can be cheaper and more scalable, and thus more desirable. We can architect and implement systems for either situation.

7. Synchronous vs. Asynchronous

Synchronous is simpler and more predictable. Asynchronous scales much better and is often a MUST for performance. There are also thread pools, fibers (green threads, greenlets) and pure event-based asynchronous programming models. Use each where it fits, and keep it simple.
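The difference can be sketched with Python's standard asyncio module. The fetch function below is a stand-in for an IO-bound call, with asyncio.sleep simulating the wait, and the job names and delays are made up. The sequential version waits for each call in turn, while the concurrent version overlaps the waits on a single thread.

```python
import asyncio

async def fetch(name, delay):
    """Stand-in for an IO-bound call; asyncio.sleep simulates the wait."""
    await asyncio.sleep(delay)
    return name

async def run_sequential(jobs):
    # Each call finishes before the next one starts: total time is the sum of delays.
    return [await fetch(name, delay) for name, delay in jobs]

async def run_concurrent(jobs):
    # All calls wait at the same time: total time is roughly the longest delay.
    return await asyncio.gather(*(fetch(name, delay) for name, delay in jobs))

jobs = [("a", 0.05), ("b", 0.05), ("c", 0.05)]
sequential = asyncio.run(run_sequential(jobs))
concurrent = asyncio.run(run_concurrent(jobs))
```

Both versions return the results in the same order. The sequential one is easier to reason about, which is why it is worth keeping unless the waiting time actually hurts.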

