ReactiveCocoa
GitHub releases ReactiveCocoa, the “magic behind GitHub for Mac” and “essentially an Objective-C version of .NET’s Reactive Extensions (Rx)”.
Composable operations on futures: are they here to stay?
Wake Up and Smell the Wasabi
Five and a half years ago, Joel Spolsky announced that they had indeed graduated writing FogBugz from VBScript to an in-house language defined as a superset of VBScript called Wasabi, reimplemented using the semantics that suited them, and that they did this instead of porting the code to, say, Ruby. (The day after, the rest of the Internet announced that Joel Spolsky had indeed graduated from occasionally correct to stark raving mad. That link is from Jeff Atwood, who would naturally go on to start a company with Joel.)
Two years ago, Facebook announced that they had indeed been working on switching over from whatever standard-ish (but likely tuned) PHP machinery they were using to their own HipHop cross-compiler, which would spit out C++ that would then be compiled and work faster.
Joel may well have been right in that Wasabi, having been written, could smooth the transition to a non-caveman language by introducing smarter features, not breaking the current code base and finally allowing a high-fidelity cross-compile to another language where it may be kept from there on out. (I have no idea how they got there, but I know that they are now on C# and ASP.NET. And let’s ignore for a minute the labor that went into maintaining the compiler, training the programmers and so on.)
But what to make of HipHop? Google has never had problems with scaling because it started out focused on solving a scaling problem and the culture never left them. For the first few years of Google’s existence, being able to fit, index and work over many more pages than any other search engine was part of Google’s advantage.
Facebook might not be able to rewrite everything over the weekend, but they pride themselves on rapid iteration. Surely they could figure out how to map everything onto a new architecture with different tools (possibly by inventing a new language, maybe channeling people like Andrei Alexandrescu) and perform the migration over a non-disastrous period of time.
Eevee is right: PHP is a monster of a bad language. (For the rest of my life, barring something even better, I will carry around that link to use when people ask why I hate PHP. A third of the text could be slander and a third out of date and it would still be enough to keep me away. It may run the Internet, but it does so in spite of itself, and not because of itself.) It doesn’t matter to Facebook’s customers that Facebook is written in a sub-standard language, but it matters to Facebook that Facebook is written in a sub-standard language because it gates their productivity and multiplies their maintenance and support.
It seems that half of my posts here end with hopes that there will be a new perfect language which will solve everything. While I would love that, I don’t think it’s coming.
What I do see are opportunities. Rather, thanks to Bret Victor, I see pain and suffering where needs are not being met. There’s a place for a few languages to be excellent in their field. (The most vivid image is of a proto-C# to replace C++ that would be efficient and not assume you were the world’s foremost expert on memory management, but actively take care of it itself. I don’t just mean garbage collection (or maybe not at all), but a virtual machine or JIT that would recognize patterns and say “hey, I know how to implement this pattern super effectively!”, beyond just solving general cases.)
When companies do things like Wasabi or HipHop, it’s a sign of something more. It is the first inkling of a desire to not accept the limitations of what they have before them. At their best, it’s also a sign that they don’t accept the limitations of any other method; at their worst, it means that they are so set in their ways or so cheap that seeking out something else doesn’t turn into an option no matter what. Sometimes, you can’t tell except by doing it and looking back.
This is not a scale from point Ew to point Ah, and the pot of gold at the end is not “inventing a new language”. But being willing to transcend every readily available option is the spark that is necessary to change the world, improve the state of the art and make life better. I wish we’d do it more often.
On LINQ, Standards, Databases and Fruit
Around ten years ago, development started on X# at Microsoft Research, later renamed Cω (C-omega). Cω attempted to unify data stores like XML documents and databases with the C# language.
Cω eventually became the LINQ project, standing for Language Integrated Query. Behind only .NET itself, the new Metro-style/WinRT infrastructure and some early Internet work, it may be the broadest-reaching endeavor ever undertaken within Microsoft. LINQ became a way to write SQL-like queries, translated by a series of paper-thin syntactic transformations into calls to extension methods (“query operators”), either library methods or your own, for producing or manipulating a sequence. LINQ on its own forced VB.NET and C# to adopt a slew of features, because version 2 of those languages was too meager to express such queries.
But it didn’t end at letting C# programmers stop hiding their heads in their laps whenever they were asked why Map was missing from a standard library with thousands of classes. Part of the supporting infrastructure was a way to turn the lambdas passed to the higher-order functions (“query operators”) into expression trees, a reasonably language-neutral sort of syntax tree, together with the concept of a query provider. A query provider can, provided the expression tree versions of the queries are used, recompile or reinterpret the query into another query format. Most commonly, this is used to translate the query to various SQL dialects, but there are also query providers for XML and LDAP.
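To make the expression tree and query provider idea concrete, here is a minimal sketch in Python rather than C#. It is not the .NET API: the `Column`, `Constant` and `Binary` node types and the `to_sql`, `evaluate` and `where_query` helpers are all invented for illustration. The point is only that the same tree can either be run in memory or be reinterpreted as a parameterized SQL string.

```python
import operator
from dataclasses import dataclass
from typing import Union


# Hypothetical node types standing in for an expression tree.
@dataclass(frozen=True)
class Column:
    name: str


@dataclass(frozen=True)
class Constant:
    value: Union[int, str]


@dataclass(frozen=True)
class Binary:
    op: str
    left: "Expr"
    right: "Expr"


Expr = Union[Column, Constant, Binary]

SQL_OPS = {"==": "=", ">": ">", "<": "<", "and": "AND", "or": "OR"}
PY_OPS = {
    "==": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
    "and": lambda a, b: a and b,
    "or": lambda a, b: a or b,
}


def to_sql(expr, params):
    """The 'query provider' half: walk the tree and emit SQL text.

    Constants become '?' placeholders and are collected in `params`.
    """
    if isinstance(expr, Column):
        return expr.name
    if isinstance(expr, Constant):
        params.append(expr.value)
        return "?"
    if isinstance(expr, Binary):
        left = to_sql(expr.left, params)
        right = to_sql(expr.right, params)
        return f"({left} {SQL_OPS[expr.op]} {right})"
    raise TypeError(f"unsupported node: {expr!r}")


def evaluate(expr, row):
    """The in-memory half: interpret the same tree against a dict."""
    if isinstance(expr, Column):
        return row[expr.name]
    if isinstance(expr, Constant):
        return expr.value
    if isinstance(expr, Binary):
        return PY_OPS[expr.op](evaluate(expr.left, row), evaluate(expr.right, row))
    raise TypeError(f"unsupported node: {expr!r}")


def where_query(table, predicate):
    """Build a SELECT with the predicate translated to a WHERE clause."""
    params = []
    sql = f"SELECT * FROM {table} WHERE {to_sql(predicate, params)}"
    return sql, params


predicate = Binary(
    "and",
    Binary(">", Column("age"), Constant(30)),
    Binary("==", Column("city"), Constant("Oslo")),
)
sql, params = where_query("people", predicate)
```

A real provider also has to deal with joins, projections, method calls and dialect quirks, which is where the buggy-provider complaints tend to come from.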
The first SQL query provider came from LINQ to SQL, a trailblazing ORM originally meant as a proof of concept and which only worked for Microsoft SQL Server. It took two years for Microsoft to deliver Entity Framework, “EF”, the real ORM which also works for every database with the correct “provider”, and one more major version for Entity Framework to grow into its role. EF is now updated a lot more frequently and slowly pleases more and more developers.
Which brings us to today. LINQ won’t do much for key-value databases that just model a dictionary (even if, and especially if, smart such databases, like Redis, take the opportunity to model useful operations on the values like set sorting or atomic increments). And while it works for traditional SQL databases, it hasn’t eliminated the difference between objects and database concepts; in a sense, it has highlighted the differences that exist even more. I may not be shuttling data on or off DataTables, but I am more aware than ever of what’s happening where. I have to be slower about adopting new features in my favorite database to make sure the entire chain supports it, and some databases might not have providers or might have buggy providers. In some ways, LINQ and Entity Framework have driven me into the uncanny valley of database connectivity.
This isn’t purely a problem of using “Microsoft’s solutions”. To a large degree, it’s the cost of picking this sort of approach. Traveling less audaciously might be calmer and more consistent. But the box has been opened and the bobcat is clawing wildly at the portrait of gramps on the mantelpiece. You won’t get the painting back, but you’re going to anger a whole lot of bob-er, developers if you try to put them back in their boxes. LINQ-like principles have been replicated in Scala, gleefully (to some) without having to add language features. Rails’ ActiveRecord has, for a while now, used an SQL builder approach of constructing the query and translating it into SQL at the last minute for the best performance. The principle is sound, and although it’s not perfect, it also hasn’t been carried to its logical conclusion.
What’s left seems like low-hanging fruit. I have heard nearly nothing about it, but I’ll be damned if twenty people aren’t sitting around working on it somewhere right now: There needs to be an object/node/graph database built explicitly on LINQ-like principles and in a way as to optimize for LINQ-based usage patterns. The data conforms to object patterns instead of its own designs. The kind of queries you send it informs a dynamic query analyzer, rebuilding and maintaining indexes for optimal access. You don’t have to worry, really, about primary keys, but more about references to other things.
Yeah, it’ll take a new approach and we’ll have to toss what we have. Good. It doesn’t look to me as if we’re in the best state we could be, nor as if there’s been a ton of recent research in this direction. There have been object databases, but only to the point where there’s barely a new one every couple of years and they ironically don’t smell of being open to fresh approaches. And Microsoft Research (again) is working on Trinity, a distributed graph database, but it’s not externally available and seems disproportionately involved with serialization to and from binary messages (and that’s ignoring the references to the cast of Friends in the manual).
Fundamentally, this isn’t that involved with LINQ. But LINQ is a nice handle for what this new database should be – assuming code access instead of database operator access, eliminating the divide between the database and the accessing system and finally giving us a new set of tradeoffs for a new decade.
Monostable
Mono 2.11 ships, including C# 5.0 await/async, compiler-as-a-service (different in scope and approach from Roslyn) and a significant part of the .NET 4.5 API, before Microsoft’s stable versions of any of those things.
Say what you will of Miguel de Icaza, but I think he gets things done.
Schedule Your Recording Devices
Microsoft hosts Lang.NEXT early April. Features, among other things, the minds behind Scala, D, Dart and C#. Listen to the inspiring, mind-opening rhetoric from captains of the industry; to fresh, mould-breaking approaches from bright researchers; or to Herb Sutter.
In Isolation
“Firefox’s JavaScript engine becomes single-threaded”, extols Slashdot, and it’s a sign of the times as much as anything. It’s not a coincidence that most of the languages or concurrency libraries over the past few years have focused on message passing, on separate memory spaces, on isolation and on some sort of remoting.
The future is a small device where you want to keep power use low, where keeping the thread count down drops the responsibilities of the kernel as well as the mindless work necessary to coordinate.
The future is, potentially, a vast network of computers where your programs can start running in an instant on more computers, or even better, where they will always run in an undefined place.
The future is a desktop, laptop or tablet where power mostly isn’t an issue, but where it is simpler to think in queues, where nothing tramples on anything else.
No matter what precisely the future is, it is time to start hitting people in the head with lead pipes — figuratively! — when they try to attack pedestrian problems with anything that demands that you understand threading in detail to get it right.
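As a rough illustration of thinking in queues, here is a small Python sketch in which a worker owns its state outright and the only way in or out is a message. The names `counter_worker` and `run_counter` are made up for this example. Python threads do share a process, so the isolation here is by convention rather than enforced the way a separate memory space would enforce it.

```python
import queue
import threading


def counter_worker(inbox, outbox):
    """Owns `totals` exclusively; nothing else ever touches it."""
    totals = {}
    while True:
        message = inbox.get()
        if message is None:
            # Sentinel: hand back the final state as a message and stop.
            outbox.put(totals)
            return
        word, amount = message
        totals[word] = totals.get(word, 0) + amount


def run_counter(messages):
    """Send messages to the worker and wait for its final reply."""
    inbox, outbox = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=counter_worker, args=(inbox, outbox))
    worker.start()
    for message in messages:
        inbox.put(message)
    inbox.put(None)
    result = outbox.get()
    worker.join()
    return result


totals = run_counter([("a", 1), ("b", 2), ("a", 3)])
```

There are no locks in sight: the queue is the only synchronization point, and the caller never needs to reason about interleavings inside the worker.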
Rustworthy
Roslyn
A CTP — publicly available, contents-may-settle, pre-beta preview — of the aforementioned Roslyn C#/VB structurally open compilers has been released.
So far, syntax trees, flow analysis and beyond-compilation reasoning such as “what type or thing would this be if I placed it into this location” seem to be almost fully available, although code generation is still missing in chunks. Comes with some interesting documentation. The project overview is so far a real page-turner and it’s interesting how they handle whitespace/comments round-tripping and partial compile error recovery and reporting.
It’s exciting to see Microsoft take the step of repaving Visual Studio’s tooling on top of this. Here’s hoping they start a trend.
Dart
Dart looks promising; something that, thanks to Gilad Bracha, no doubt, learns from JavaScript without just looking like a new syntax for it.
For example, the single-event-loop side effect of JavaScript web browser implementation is carried into Dart, through the workings of isolates which are once again eddies of sequentiality in a concurrent world, without access to shared state. Heavy initialization is explicitly ruled out by only allowing constants on some static constructs. Constructors can each have a “method name”, solving the overload problem for them. Factory constructors unify factory methods and actual constructors. Optional types allow stapling on types and let prototyped functionality turn into something solid. And so on.
The documentation is, to put it gently, “present”, but the language spec may be the most thought-out document.
The big question is “do we need another JavaScript alternative”? Maybe we don’t, but the actual warts in JavaScript (and I do mean warts, not just the way it’s constructed) are ugly enough that I want something more than a preprocessor to paper them over with. We’ll see how much of an alternative Dart turns out, but it’s got enough good ideas to get off the ground and I trust Gilad Bracha to not just toss in everything that looks exciting without thinking over how it’ll upset the rest of the language.
Sharper
I’m spending much of my day nose-deep in C#, the language that initially got a bad reputation for looking a lot like Java and that still lives by that image, despite all the things that have happened in the interim. Each new version since has had a feature with a marked lasting impact (maybe except for C# 4’s dynamic). The feature for C# 5 was telegraphed in advance to be await / async for bringing a sane asynchrony model to .NET and making continuation-passing style look like sequential code. With this in mind, I had thought that even bigger things were in store at the recent BUILD conference.
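The “continuation-passing style that looks like sequential code” point can be sketched outside C#. The following uses Python’s asyncio as an analogy. The `fetch` functions are stand-ins invented for this example and do no real I/O. The first half is written with explicit callbacks, the second does the same two steps with await.

```python
import asyncio


def fetch_cps(key, on_done):
    """Callback style: the result is handed to a continuation."""
    on_done(f"value-for-{key}")


def load_profile_cps(user, on_done):
    # Each dependent step nests one level deeper.
    def got_name(name):
        def got_city(city):
            on_done((name, city))

        fetch_cps(user + ":city", got_city)

    fetch_cps(user + ":name", got_name)


async def fetch(key):
    await asyncio.sleep(0)  # stand-in for real asynchronous I/O
    return f"value-for-{key}"


async def load_profile(user):
    # Same two steps, but reading top to bottom like ordinary code.
    name = await fetch(user + ":name")
    city = await fetch(user + ":city")
    return (name, city)


callback_results = []
load_profile_cps("ada", callback_results.append)
awaited_result = asyncio.run(load_profile("ada"))
```

Both versions produce the same tuple. The difference is that in the second one the language carves the function into continuations for you, so loops and error handling keep their usual shape.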
Imagine my surprise when Anders Hejlsberg took the stage and went deeper into the project codenamed Roslyn than ever before, and announced that a first CTP, Microsoft-speak for limited commitment-free public alpha, is upon us (mid-October). Better yet, Anders stopped short of saying that the source will eventually be available.
Roslyn has been discussed since back in 2008 under the murky premise of “opening up the C# compiler while rewriting it in C#”. The new information was that Roslyn opens up nearly everything at every stage in the compiler to be a public, supported API, and that Microsoft is retooling Visual Studio to consume only these APIs in a future version (beyond the coming Visual Studio 11/2012).
C# is going to go overnight from not having eval to essentially having something much more interesting. You can’t execute dynamically generated code willy-nilly as in most dynamically interpreted/compiled languages, but you can grab the compiler’s opinion on and analysis of any piece of code. You can get syntax trees, you can figure out binding and find out which types are involved, you can get information about where the breakpoints can go in a piece of code, you can get the IL and you can actually compile and emit the code in situ.
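For a feel of what “grabbing the compiler’s opinion” on a piece of code can look like, here is a loose analogy using Python’s standard `ast` module and the built-in `compile`. This is emphatically not the Roslyn API, and Python offers far less binding and type information than is described here. It only shows the general shape: parse source into a syntax tree, inspect the tree, then compile and run it in place. The `SOURCE` snippet is made up for the example.

```python
import ast

SOURCE = "def area(w, h):\n    return w * h\n"

# Parse the source text into a syntax tree without running it.
tree = ast.parse(SOURCE)

# Ask questions of the tree: which functions are defined, which operators used?
function_names = [
    node.name for node in ast.walk(tree) if isinstance(node, ast.FunctionDef)
]
operators = [
    type(node.op).__name__ for node in ast.walk(tree) if isinstance(node, ast.BinOp)
]

# Compile the same tree and execute it in an isolated namespace.
code = compile(tree, filename="<sketch>", mode="exec")
namespace = {}
exec(code, namespace)
result = namespace["area"](3, 4)
```

What the analogy leaves out is the statically typed half: resolving what a name binds to and which types are involved, which is the part that makes refactoring tools possible.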
In addition, there are two more pieces to this. First, there’s an API model for writing dynamic read-eval-print loops where lines of code can be interpreted in a script-like fashion and kept intact as you go; running code, importing namespaces, referencing new assemblies, redefining methods or properties and so on. This would otherwise have to be reinvented for every such implementation and is a nice nod.
Secondly, there’s an object model for the language, something showcased by being able to translate between C# and VB (for which equivalent work is being done) and writing smart, custom refactorings, which was previously only in the hands of people who were already basically writing their own C# compiler, like ReSharper or CodeRush.
The Mono effort has laudably been supporting a C# REPL for years now, but they haven’t gone all out and figured out all of this. (They have diverted their resources to other efforts, and I think it was the right decision for Mono users, but I just wanted to highlight that their REPL and Roslyn are not equal in scope.)
When Roslyn is said and done, its team is going to have set a new bar for statically-compiled languages. C# and VB.NET will out of the box be able to do all of this. The onus must now be on the rest of the statically-compiled languages to provide the same thing, or risk being passed by, at least in programmer satisfaction.
Update: I mentioned that “Anders stopped short of saying that the source will eventually be available”. In the Channel 9 Live segment with Anders Hejlsberg from BUILD, at 25:50 in, there’s this: “It’s my hope that we can share as much of the Roslyn project with the community as is at all possible. For sure we’ll share the APIs. I’d love to share the source as well so people can do their own things with it, like understand how the compiler works and build extensions to it and experiment with new stuff in the language. The more people are doing that then the better we’re all going to do, you know, with the community.”
What this sounds like isn’t just “we’re going to drop a zip somewhere every few months” (although I’m sure they won’t develop everything in a public repository), but “we’re building this to encourage your participation and help in evolving the language and using the language outside of where it’s been, and I’d love to outright promise source to you except for the occurrence of lawyers”, which is a rather fantastic attitude.