Thursday, November 25, 2010

How free software/open source means never being unsupported

The strangest aspect of promoting free software (software that has been freed, as opposed to software available for no money), also known as open source, is that many people are concerned about the lack of support. "If we switch to Ubuntu", argues the skeptic, "we don't have the support of a giant corporation like Microsoft if things go wrong."

I'm going to give a contrary view, and use my experience with Android, the popular free software operating system for touchscreen smartphones, as a great example of how wrong this view is in practice.

My conversion to GNU/Linux

But before I do, let me explain my history a little. My first computers ran proprietary operating systems. I bought a computer from Sinclair (called a QL), and then a computer from Commodore (called an Amiga), both of which suffered from a substantial lack of support. The issues were different in each case: Sinclair went bust and was bought out by a company, Amstrad, that had no interest in the computer I'd bought. It was discontinued, as was all support. Commodore suffered from financing and mismanagement issues, and while its technical people put together an extraordinary device, it became clear after a while that the platform itself - despite being decades ahead of the competition technically - had no future. What support there was was third party.

After Commodore went bankrupt, I faced a choice: jump ship yet again to another proprietary product vendor - probably Microsoft, whose software I considered inferior to what I already had - or go in a different direction. I chose the fledgling GNU/Linux platform, buying some cheap hardware to experiment with, and eventually using it as my primary system.

My needs were relatively modest and non-mainstream in the beginning, so the lack of support from hardware manufacturers and big software companies wasn't an issue. Over time, the community around GNU/Linux filled in the gaps. Drivers for even relatively obscure devices started to be written, and the software infrastructure gaps - the need to open the occasional Word document or Excel spreadsheet - were filled by free software/open source enthusiasts who reverse-engineered those file formats and created full applications able to handle them.

Even so, in the early part of the 21st century, GNU/Linux seemed awkward and difficult to work with, and my own needs required something more integrated with the rest of the world. So, with Mac OS X looking increasingly good, I bought a Mac and used Macs for a few years.

And then I switched back to GNU/Linux. Why? Because the Mac was constricting. Because support wasn't what I thought it would be. To do something as simple as using the "latest" version of Java (which in Apple's case was always at least a year behind), I'd have to upgrade to the latest version of Mac OS X, which in turn would only run on some of my hardware, and which, to be quite frank with you, I didn't even like much. I could not run the operating system on a Thinkpad, despite disliking the Powerbook hardware and pointing device. I was reliant on Apple for certain core software and for all hardware, and there was only a limited amount third parties could do to fill the gaps.

In the meantime, the major issues with GNU/Linux were resolved, because they could be! The system was entirely open, anyone could fix the issues, and so people did. A community of developers, including some provided by major corporations like Novell, IBM, and Red Hat, worked together on a graphical user interface called GNOME 2 that made the system slick and easy to use. Canonical worked on putting together a package that could install on virtually any hardware and provide a default environment that was full-featured and easy to use, and called it Ubuntu. All of these organizations were able to do this because the system was free and entirely open: nobody had to ask anyone else's permission to work on it, and as long as people wanted support, the support would be there.

And now we're at the stage where Ubuntu, the most popular desktop variant of GNU/Linux, is arguably a better overall package, for the majority of people, than either Mac OS X or Windows: more functional than either, and close to the former in ease of use.

Android

I bought my first Android phone a whole six months ago. The phone is a T-Mobile myTouch 3G Slide, which I bought because I was buying my wife one, because it had exceptional reviews, and because I didn't know the G2 was just around the corner. The Slide is a fairly capable phone, but there are a number of issues with the operating system on it. Certain apps, such as the phone app itself, crash regularly. The Bluetooth headset support is virtually unusable: you can't use voice dialing through the headset, for example, and even if you could, you'd still have to interact with the phone's touchscreen to "confirm" the operation you requested. (Oh, and just to add insult to injury, the voice dialer "recognizes" commands like "Turn off Bluetooth", which it obeys without asking for confirmation!) The version of Android running is 2.1, a recent and respectable release, but nothing to write home about.

HTC released an update in August that fixed some bugs in the phone, but left the headline issues above very much alive. The next step was to hope that HTC would get around to fixing them in the next operating system update, if and when it released Android 2.2.

Well, that's the option I would have been left with had Android been proprietary. After all, that's what I'd have to put up with if I had an iPhone, or webOS phone, or Blackberry, or Windows Mobile phone. In all of these cases, I'd be dependent upon three things:

  • The maker of the operating system (Apple, Palm/HP, RIM, or Microsoft) fixing the bug if it's their bug.
  • The maker of the phone (Apple, RIM, or one of a swathe of third parties) fixing the bug if it's a bug they themselves introduced when customizing the operating system to work with their product.
  • The maker of the phone actually planning to release an update, and getting the update out.
But this is Android, and Android is free software, and quite frankly, there are a lot of people from the free software/open source worlds who simply aren't willing to rely upon third parties to support them. They're like me - people who have relied upon companies like Sinclair, Commodore, Apple, and even Microsoft, to provide them with what they need, only to be let down again and again and again and again - and these people have said "You know, let's take Android, let's fix the issues that bother us, and share our fixes with the rest of the world."

My phone hasn't been running the stock HTC Android operating system for several months now. Instead, I've been running something called CyanogenMod. CyanogenMod is a third party variant of Android in which a developer called Steve Kondik (he uses the online nick "Cyanogen", hence the name), together with other like-minded developers, has taken Android, fixed the headline issues they consider important, and ported the result to as many devices as they can.

What's the result? Well, with this latest version (CyanogenMod 6.1RC2), every major issue I had with the Slide's operating system is fixed. The system is stable, all the apps work, Bluetooth headsets work properly (including voice dialing), and I also have the benefit of running a more recent version of Android, 2.2, that includes some nice new features like easy "tethering" to a laptop or similar device.

If that were all there was to it, that would be very positive all by itself: I've had issues fixed that those reliant upon HTC are still waiting for. But the benefits are greater still. The first Android phone was the G1 (also known as the HTC Dream), which is barely two years old. HTC stopped supporting it around a year ago, which means owners reliant upon HTC for updates are out of luck if they want a more recent version of the operating system, or any severe bugs fixed.

The same will, ultimately, happen to most of our devices. Now, I'm drooling over various Android phones, and wouldn't mind an upgrade, but I've only had this one for six months, and like most real people, I don't want to spend $400-500 on a new phone every few months. Most people keep their devices for at least two years. So the knowledge that, in all seriousness, it's very unlikely HTC will be releasing updates for my phone six months from now is somewhat disconcerting. Or it would be, if I were reliant upon HTC.

And again, if I were using a proprietary operating system, I would be reliant upon HTC. Those buying a Windows phone right now will find it impossible to get updates a year from now - not without buying new hardware.

But plenty of people with G1s are running Android 2.2, despite HTC cutting off support. They're able to do this because Cyanogen, and others, have released versions of their variants of Android for the G1. As long as the hardware is capable of running it, and people are still using the phones in question, you will see ports of the latest versions of Android for these phones, because the phones are supportable.

And that's the key word: supportable. Free software makes hardware supportable. Proprietary software means that only a small group of people can support something, and only companies with unlimited resources (and no agendas!) will ever provide unlimited support.

Interested? You can find information about CyanogenMod here. The CyanogenMod operating system is only available for a subset of Android devices; you can check the Wiki for the full list. But if it runs on a device now, then that device is, by definition, supportable.

If support is one of your priorities, you need to ask yourself what kind of support you need: the support of people with the same concerns as you? Or the support of someone who benefits from putting you on an upgrade treadmill, and who may, in time, decide it's just not worth the effort?

Sunday, November 14, 2010

Choosing a web technology

When I first started web development, which would have been around 1997 or 1998, writing web-based applications was not pleasant, and the choices were slim. You generally had to use CGI, written in C or Perl, although some alternatives were being developed at that time, many of which have become major frameworks today.

For the most part, the successful frameworks, with one exception, have recognized that programmers do not make great UI designers, and UI designers do not make great programmers. I've worked with people who do fantastic jobs on the UI design front, backed by the most hideous, poorly written, unreliable, difficult-to-maintain junk code on the back-end. And, well, I'd have to admit that while my UI work of late has been better, my early attempts were somewhat limited.

So, says the majority of modern frameworks, what you need to do is separate the application logic from the HTML. Different systems have achieved this to different degrees, but most now at least make a passable effort, if only by using a system of HTML templates, in which code can be embedded. A UI designer need only create the HTML templates, knowing what hooks to use to bring up the application specific logic.

Let's go through four popular web development systems, ordered in my mind from least successful to most, and explain what's going on with each.


ColdFusion

ColdFusion is, well, "interesting". The system pretty much goes against the logic described above, taking the view that the UI designer should be the programmer, and providing a "programming environment" that's supposed to be familiar to anyone who can mark up HTML. In some ways that explains its success: many UI designers love ColdFusion because the concepts are easy for them to understand, allowing them to handle the entire product development themselves.

Add additional hands to a CF project, especially a split between programmer and designer roles, and it starts to show its flaws. ColdFusion requires that the same people handle both ends of the project, with a certain amount of logic embedded on every page - indeed, every CF page is by default a program. You can't design something in an HTML editor and have it run in CF without making significant changes.

And the advantage for designers, that the "code" syntax is similar to that of HTML, is a massive handicap when you put a CF project in front of a programmer. CF's native code format is difficult to read and follow. While CF also supports a more programmer friendly syntax, that syntax is a bolt-on, and so any project started by a person who used the HTML-like syntax will contain a large amount of that code that needs to be maintained.

There are workarounds for CF's issues. The major successful CF project I've been exposed to used web services to abstract as much of the programming logic as possible, ensuring the front-end only contained the code that was absolutely necessary. I still wouldn't have used CF, but it meant a UI designer could be involved who would have otherwise had issues with a different environment.

CF also, for what it's worth, has other problems, the major one being that the only complete implementation of the system is proprietary, and comes with licensing issues. It's hard for me to think of an occasion on which I'd recommend the use of ColdFusion for a new project, but others would.


PHP

PHP is, like ColdFusion, an established web application development system with a long history, based around a custom programming language. Unlike CF, PHP makes no attempt to provide a UI designer friendly programming system - PHP scripts look similar in many ways to Perl, a language notorious for its confusing, punctuation-based, syntax.

However, PHP changes things around in one important respect: PHP pages are, by default, HTML. Code is embedded in the HTML where needed. This critical difference makes it easier for programmers and UI designers to work together: UI designers can develop and prototype HTML templates, and programmers can modify those templates with embedded PHP mark-up while the integrity of the HTML remains intact.
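
To make that concrete, here's a minimal sketch of a PHP page (the content is, of course, illustrative):

    <html>
      <body>
        <h1>Welcome</h1>
        <?php
          // Everything outside the <?php ... ?> markers is sent to the
          // browser verbatim; only these islands are executed as code.
          echo "Today is " . date("l") . ".";
        ?>
      </body>
    </html>

To an HTML editor this is still a recognizable page; the programmer's contribution is confined to clearly marked islands.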

PHP is free software, that is, there are no licensing implications from merely using it, and modern iterations of PHP have moderately advanced programming features. It's not a bad system, not ideal, but it works fairly well.


JSP

Early in the history of Java, there was some debate as to where the language was going and what it would be most suited to. After an abortive effort trying to promote it as a solution to cross platform development of desktop applications, its real niche as an enterprise-class secure language started to become more apparent. Sun developed various standards for web development, starting with two technologies, servlets and JavaServer Pages (JSP.)

While servlets took a strictly programmer-oriented approach to web development, similar in some ways to CGI, JSP took the same template approach that PHP was taking, and took it further. JSP pages are, like PHP, standard HTML pages with embedded Java. But JSP is a substantially more advanced framework, oriented towards larger, more difficult to maintain, applications. In particular, it is easier for a UI designer to have complete control over the HTML generation, maintaining the HTML template system, while handing over business logic tasks to a programmer. JSP achieves this in a variety of ways, allowing programmers to work at different levels with the HTML, from embedding raw code on a JSP page (as with PHP), to providing tag libraries (custom "HTML" tags the designer can embed in the page, whose logic is controlled by the programmer.) Because the programming side is Java, programmers have a wide variety of options and can make use of a well supported, modern, scalable, environment with a huge amount of support.
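
As a rough sketch of those two levels - raw embedded Java versus a tag library - consider the following page (the variable names are illustrative; the c:forEach tag comes from JSTL, the standard tag library):

    <%@ taglib prefix="c" uri="http://java.sun.com/jsp/jstl/core" %>
    <html>
      <body>
        <%-- Raw embedded Java (a "scriptlet"): works, but puts logic in the page --%>
        <% String user = request.getRemoteUser(); %>
        <p>Signed in as <%= user %></p>

        <%-- The tag library approach: the designer sees only HTML-like tags,
             while a programmer controls what lies behind them --%>
        <c:forEach var="order" items="${orders}">
          <p>Order ${order.id}: ${order.total}</p>
        </c:forEach>
      </body>
    </html>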

JSP is rarely used as-is. For all of its advantages over something like PHP, Java's infamous over-engineering frequently makes it hard to do the simplest things, and as a result a large library of third party frameworks has come into being to make it easier to develop full applications. But JSP is an extremely flexible system, and coupled with good JavaScript libraries like GWT and jQuery, it's one of the two I'd generally recommend for all but the smallest of projects.


JavaServer Faces and Facelets

Originally built upon JSP, JSF has come a long way from its origins as an attempt to make an MVC-friendly Java framework. Early versions were, frankly, abysmal, notorious for inheriting Java's over-engineering and bureaucracy, undermining the principal advantages of the framework.

Over time, the major issues have been fixed, with JSF 1.2 introducing "Facelets", a replacement for JSP that more closely matched JSF's requirements, and JSF 2 bringing a large collection of major improvements that allowed Facelets to be built around HTML templates, and that undid massive amounts of the bureaucracy associated with JSF using a system of annotations.

In some respects, it's better to pretend JSF didn't exist until JSF2. So let's do that and discuss what JSF2 has over JSP.

With JSF2, pages are marked up using standard HTML, as with JSP and PHP before. However, standard HTML tags that need to be manipulated in some way, such as form controls, can be linked to internal JSF2 controls using a custom "jsfc" attribute, making the link between the original HTML and Facelet version even more transparent.
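
A sketch of how that looks in practice (the bean name searchBean is illustrative):

    <!-- To an HTML editor this is an ordinary form; at render time, Facelets
         replaces each tag carrying a jsfc attribute with the named JSF component -->
    <form jsfc="h:form">
      <input type="text" jsfc="h:inputText" value="#{searchBean.query}" />
      <input type="submit" jsfc="h:commandButton" value="Search"
             action="#{searchBean.search}" />
    </form>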

But that's not the major advantage. Whereas the systems mentioned previously simply allow code and HTML to be mixed, JSF introduces the concept of a "view", which closely matches the user's perception of how a website is navigated. Each view consists of an HTML template, and a set of "beans" (high level Java objects) that link to different fields on the template. When the user updates things - changes the text on forms, hits submit buttons, etc - those beans are updated in turn, and the beans are also asked whether the user should be directed to a new page (a new view), or to the same template with updates (i.e. the same view.)
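
On the Java side, the backing bean for such a view might look something like this minimal sketch (the class and outcome names are illustrative):

    import javax.faces.bean.ManagedBean;

    @ManagedBean
    public class SearchBean {
        private String query;   // linked to an input field on the template

        public String getQuery() { return query; }
        public void setQuery(String query) { this.query = query; }

        // Called on submit; the return value tells JSF whether to stay on
        // the same view (null) or navigate to a new one ("results").
        public String search() {
            return (query == null || query.isEmpty()) ? null : "results";
        }
    }

JSF populates query from the submitted form before calling search(), and uses the returned string to pick the next view.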

This view model is a much higher level approach than JSP's, and from a split-of-development standpoint, it makes for a much more scalable system. Programmers and UI designers can easily understand each other's work, with the UI designers doing barely any programming, and the programmers putting together reusable interfaces to the business logic. And (with JSF2, at least) development is faster: the developers do not have to concern themselves with parsing the results of every form submission; they simply have to put together the containers, the beans, that contain the data. Even if no web designer is involved, JSF is a great platform.

It's a staggeringly powerful system, and I suspect that if it weren't for those earlier releases, JSF2 would be the most popular Java web platform around. As you've probably figured out by now, JSF2 is the other of the two frameworks I would recommend (the first, as I said earlier, was something JSP based), but you'd probably want to know why I might recommend a JSP solution over a JSF solution.

JSF's Achilles heel is that it is entirely session and form based. Navigation must be done by submitting forms. The results of those form submissions modify the beans associated with the current view or session. This means that pages cannot be "spidered" - that is, a JSF application is pretty close to impossible to put into a search engine - and users cannot bookmark pages. There are workarounds, such as PrettyFaces, but they, to a certain extent, aren't really designed for the kinds of applications you'd want JSF for in the first place.

If I were writing a JavaScript-heavy interactive application where much of the logic is embedded in the scripts, there'd be little reason for me to use JSF, which would mostly get in the way. That's not a hard and fast rule, but it works as a generalization. And if I were writing something like, say, a retail front-end, where each product has its own product page, I certainly wouldn't want to use JSF; JSP would be fine for those applications.

Conversely, if I'm writing an actual application, like an airline reservation system, or an Intranet-located front-end to a database, JSP would work, but JSF would be easier to maintain, and easier to work with.


Conclusion

In the end, there's a certain comfort factor involved, and it's hard to get people used to knocking up quick and dirty tools in CF or PHP to move to a more robust, powerful system. But I must admit, after this long in the industry, it's hard for me to recommend those kinds of systems, even for the "quick and dirty" tools. Java provides two extremely good, robust options for web development, with features not present in CF or PHP, that suit teamwork and rapid development practices. So if you ever commission something from me, and want to know why I'm using JSP or JSF, you now know why.

Monday, September 27, 2010

Ubuntu is ready for the desktop. But.

For as long as I can remember, the joke has been that "This year will be the year of Linux on the desktop", which ceased to be funny around 2006 or so, when Ubuntu genuinely became a usable, just-works, great to work with operating system for the rest of us. Ubuntu may never have been the clear leader in any category, but "Almost as user friendly as Mac OS X (and light-years ahead of Windows)" and "Almost as functional as Windows (and light-years ahead of Mac OS X)" certainly isn't a bad place to be.

There's just one irony here, though: while the Ubuntu people have done a great job making an operating system traditionally associated with cheap servers into a platform for the rest of us, their attempts to make a server have become increasingly ridiculous.

I'm going to explain why.

Domain management

I've had an interesting experience recently. I wanted to put together a "poor man's Active Directory" for myself, and I also wanted to integrate a product with an existing domain.

Active Directory is a technology from Microsoft that, in turn, is built upon (amongst others) two major open standards, Kerberos and LDAP. The two technologies are largely complementary although they have some overlap, which we'll come to in a moment. AD is an example of something called a domain controller system. Domain control is a centralized security system, whereby a single server stores all of the information about all of the users on a network, plus security and configuration information.

How important is it? Answer: very. Virtually every corporate entity with a network of more than a handful of users should be using a system like this. Every non-trivial network has a number of services running on it, which at minimum include people's ability to log in to their PCs, network shares for storing data, VPNs, and email. Without a centralized security system, without a domain controller, each of these subsystems needs to be managed separately, with multiple passwords, multiple concepts of what constitutes a "user", and, increasingly, with complex security giving way to no security at all.

Domain controllers use a system called LDAP to store information about all the users and other entities on the network. When a server needs to know whether to grant a specific user access to a resource, it simply queries the LDAP database and uses the results to make a decision. Typically users are added to one or more groups, and their group membership then determines their roles. But the LDAP database also stores other information, so a server can extract details such as the user's email address or even their phone number.
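
For a sense of what such a query looks like, here's a minimal sketch using Java's standard JNDI API (the hostname, base DN, and user are illustrative, and a real server would authenticate its connection to the directory rather than bind anonymously):

    import java.util.Hashtable;
    import javax.naming.Context;
    import javax.naming.NamingEnumeration;
    import javax.naming.NamingException;
    import javax.naming.directory.Attributes;
    import javax.naming.directory.DirContext;
    import javax.naming.directory.InitialDirContext;
    import javax.naming.directory.SearchControls;
    import javax.naming.directory.SearchResult;

    public class LdapLookup {
        public static void main(String[] args) throws NamingException {
            Hashtable<String, String> env = new Hashtable<String, String>();
            env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
            env.put(Context.PROVIDER_URL, "ldap://dc.example.com/dc=example,dc=com");

            DirContext ctx = new InitialDirContext(env);
            SearchControls controls = new SearchControls();
            controls.setSearchScope(SearchControls.SUBTREE_SCOPE);

            // Find one user, and pull back their email address.
            NamingEnumeration<SearchResult> results =
                ctx.search("", "(&(objectClass=person)(uid=jdoe))", controls);
            while (results.hasMore()) {
                Attributes attrs = results.next().getAttributes();
                System.out.println("mail: " + attrs.get("mail"));
            }
            ctx.close();
        }
    }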

Take a typical email system. There are multiple ways in which email and LDAP can be interlinked. Email clients can use the LDAP database to configure themselves. The email server can use LDAP to configure itself, determining which users to store and what groups (mailing lists) to set up. When a user is composing an email, they can use LDAP to look up other people in the corporate directory.

In a modern domain controller, the LDAP system is complemented by Kerberos. The job of Kerberos is to identify the person trying to access a resource. Typically when a user logs in (say, into their PC in a Windows environment), what's happening behind the scenes is that their credentials are being checked by Kerberos, which then issues them a ticket. From that point on, until the user logs out, whenever the user does some kind of network activity, the ticket is available should a server ask for it. And servers will ask for a ticket when they're trying to determine who is using their services.

Let's take the email example again to explain how Kerberos fits into all of this. Usually when you make a connection to an email server, you need to tell the server who you are so that it can determine whose email to give you. Without a Kerberos system, that involves sending a username and password. However, if the email system accepts Kerberos, then your email client will simply send the Kerberos ticket you received when you logged in to your PC, identifying you to the server, which means you can safely use the email client to access your email without having to give it your username and password.
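
For the curious, the client half of that exchange can be sketched with Java's standard GSS-API. This is a sketch only: it assumes the user already holds a ticket from logging in, and the service name is illustrative:

    import org.ietf.jgss.GSSContext;
    import org.ietf.jgss.GSSException;
    import org.ietf.jgss.GSSManager;
    import org.ietf.jgss.GSSName;
    import org.ietf.jgss.Oid;

    public class KerberosClient {
        public static void main(String[] args) throws GSSException {
            GSSManager manager = GSSManager.getInstance();
            // The service we want to talk to, e.g. an IMAP server.
            GSSName service = manager.createName("imap@mail.example.com",
                                                 GSSName.NT_HOSTBASED_SERVICE);
            Oid krb5 = new Oid("1.2.840.113554.1.2.2"); // the Kerberos v5 mechanism
            GSSContext context = manager.createContext(service, krb5, null,
                                                       GSSContext.DEFAULT_LIFETIME);
            // This token, derived from the user's existing ticket, is what gets
            // sent to the server in place of a username and password.
            byte[] token = context.initSecContext(new byte[0], 0, 0);
            System.out.println("Authentication token: " + token.length + " bytes");
        }
    }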

So domain management is a pillar of modern networking, up there with DHCP and even DNS as something no corporate network should be without.  And virtually any piece of software designed to run within a corporate network needs to be built to support it.

And that's not hard. There's nothing proprietary about the technologies used with even Active Directory. (AD does include some non-standard extensions to Kerberos, but they only matter if you plan to create a complex network that has several Kerberos servers from different vendors within it.)

Now, it's important to note that while these systems are built upon the same two common standards, the way LDAP is used differs from implementation to implementation. Most non-AD implementations standardize on something called the Posix schema (a schema being a set of rules about what information to store in LDAP and how to store it). The Posix schema is built for Unix-like operating systems, and is based on the Unix security model. The Unix security model is, to be honest, woefully outdated, which is why, when Microsoft built Active Directory, it virtually ignored the Posix schema and built its own, based upon the security model used by the Windows NT (200x, XP, Vista, and 7) operating systems. Despite the poor reputation of Windows in the security sphere, the actual model it uses (as opposed to the implementation...) is extremely solid, so this was a wise choice.
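
To make the contrast concrete, here's a sketch of a user entry under the Posix schema in LDIF, LDAP's standard text format (all values illustrative). Note how uidNumber, gidNumber, and loginShell bake the Unix model directly into the directory:

    dn: uid=jdoe,ou=People,dc=example,dc=com
    objectClass: inetOrgPerson
    objectClass: posixAccount
    uid: jdoe
    cn: Jane Doe
    sn: Doe
    uidNumber: 10001
    gidNumber: 10001
    homeDirectory: /home/jdoe
    loginShell: /bin/bash
    mail: jdoe@example.com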

So, what's the problem?

Domain controllers and Ubuntu

As I said above, I tried doing two things involving domain controllers in the last month or two. One was setting up a "poor man's Active Directory server". Now, by that I mean a cheap domain controller.

Microsoft's own solutions to domain management are actually fairly (well, perhaps a better term would be "ridiculously") expensive. For Active Directory, Microsoft requires that you buy a server version of Windows (upwards of $1,000), upgrade all of your XP, Vista, and Windows 7 installations to Professional or above (add around $100 per client), and then add an additional $70ish "Client Access License" per client on top of that.

So the cheap solution is to install a free or open source server, and jump through the necessary hoops needed to integrate it with non-Posix boxes.

Ubuntu has instructions here on how to set this up. Theoretically, if you follow the instructions, you end up with a Kerberos domain controller with an LDAP back-end, which in theory is what you're after. In practice, those instructions do not work.

They've never worked. They can't work. And even if you fix the obvious issues, such as the wrong LDAP administrator account being given, you still end up with entirely separate LDAP and Kerberos systems. Add users to one, they will not appear in the other. If a user changes their password, they need to change it in two places.

And that's leaving aside personal preference issues such as the decision to go with MIT Kerberos rather than the infinitely superior Heimdal system.

What I want to know is why that page even exists. Why post bogus guides on your website that will cause people to waste enormous amounts of time trying to get something to work? There is no way in a million years anyone put that information together in good faith; it's simply not possible it ever worked, for anyone.

And that's my first issue with Ubuntu server. It's not that it's not capable of being a domain controller - I mean, if all else fails you can compile and configure, by hand, virtually any combination of open source software on Ubuntu, so it can be one. It's the attitude: here's something fairly important in the grand scheme of things, and they don't really know what it is, but they don't want to admit it, so they're willing to post complete crap rather than actually say "This isn't supported yet."

To make matters worse, part of the problem here appears to be a decision by the Ubuntu people to completely change the way the OpenLDAP server works in 10.04. There doesn't appear to be any advantage in the changes, which effectively remove support for the most common format used to distribute schemas. Someone, somewhere (perhaps the OpenLDAP people, though Canonical, the producer of Ubuntu, at least had the ability to delay such changes if they weren't ready for prime time - which clearly they aren't) decided to go ahead and incorporate the changes in Ubuntu without considering for a second what the actual consequences of such a move would be.

The second problem came when I attempted to integrate a server with an Active Directory system. This should have been easy: the Ubuntu people ship a tool with Ubuntu called likewise-open that does everything on the back-end and ensures things like ssh and ftp "just work", and since Apache natively supports Kerberos, LDAP, and a whole host of other protocols, it should just be a matter of telling your web apps to use Apache's built-in authentication. How hard could it be?
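
On the Apache side, it really is that simple in principle. Here's a sketch of the kind of configuration "use Apache's built-in authentication" means, using the stock mod_authnz_ldap module (the URL is illustrative, and a real AD setup would also need a bind account):

    <Location /intranet>
        AuthType Basic
        AuthName "Intranet"
        AuthBasicProvider ldap
        # Look users up in the directory by their AD account name
        AuthLDAPURL "ldap://dc.example.com/dc=example,dc=com?sAMAccountName?sub"
        Require valid-user
    </Location>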

Likewise-open is broken. There's a bug on it, but nobody seems to be interested in fixing it. Essentially, the system doesn't support default domains at the time of writing - it says it does, but it ignores the setting. Every user needs to log in as COMPANYDOMAIN\userid. What? Why?

Now, in fairness, you can use Kerberos to get around some of this, except where the apps you install that actually use the usernames have restrictions upon what can be in them, or where client software still hasn't been Kerberized and requires that you use a username and password anyway. As an example, the popular secure shell client PuTTY is still awaiting full Kerberization, and certain tools, such as the popular Subversion source code management system, actually use PuTTY to talk to the back-end. So much for transparent "just works" operation.

And that brings me on to the second part: despite the ease of doing so, most open source applications eschew using the built-in authentication schemes offered by the underlying platforms, instead taking the trouble to re-invent, and re-invent, and re-invent the wheel, over and over again. Those that are willing to talk to something like LDAP rarely do so cleanly or generically, usually requiring a specific schema be supported, and in some cases the ability to write back to the database.

Did I get to do what I planned? Well, yes, but never by doing things the official way. One web app I was integrating supported a plug-in architecture, and so after a lot of searching through the code I ended up writing a plug-in to handle authentication and integration with the AD back-end.  The "Poor man's AD" I haven't completed, largely because I need to update the hardware involved, but I did get a proof of concept running based around Apache's excellent Apache Directory project.


Why is it so bad?

Full blown domain controllers are either easy to set up or cheap, but never both. Microsoft gets away with charging what it does for Active Directory because the alternatives are just too difficult to get working in the majority of cases. I might add that, as a developer, it's always been awkward to play with these technologies, because in the context of employment, the people who have domain servers are rarely the people who need to experiment with them and integrate their security with them. There are several implications to this, but one is that the number of people who understand domain security is limited to a select few: a handful who have had exposure to it, plus a very small subset of geeks who are interested in it and are willing to spend a small fortune on experimenting.

This is ultimately somewhat damaging. In order to be integrated into anything but the smallest of networks, a web server, web app, file server, or any other type of server, really needs to be capable of being integrated into that network's security system. If it isn't, it becomes not merely another box to be managed, but another system, another domain, to be managed too. Quite literally, just adding one domain-less Ubuntu server to a Windows network, used by everyone in the organization, potentially doubles the amount of work required to administer the network. Right now, it's not clear that enough people in the open source world understand this.
  • Canonical clearly understands that the concept is necessary, but has no idea what it is or how to do it. The most obvious solution for Canonical right now is to bring in someone who knows what they're doing to set up a proper domain management system, based upon the Posix schema, that can be installed as easily as, say, Apache, and to ship packages that make it easy to set up Ubuntu systems as clients of both that solution and Active Directory. Likewise, every app in Ubuntu's repository that acts as a server, be it a web app or anything else, should be capable of using domain based security. No excuses. Oh, and before anything else is done, take out the bogus documentation - it doesn't work, it wastes people's time, it shouldn't be there.
  • Kudos to the Apache Directory Project, and to Samba, both of whom, from separate angles, know what needs to be done on the domain controller end, and are working on it. Also kudos to Red Hat for FreeIPA, although I personally think the Apache system makes more sense as an architecture, making Kerberos and LDAP part of the same physical server rather than having them awkwardly communicate with one another.
  • Anyone considering writing a non-trivial open source web app or anything similar should ask themselves whether re-inventing the wheel to make their own super special cookie-based authentication system is really, actually, a good idea. Stop it! Knock it off! Learn what authentication is, and do it properly.
  • The client side needs to be thought out. At this point, an Ubuntu client can, if set up correctly, authenticate against an Active Directory system and have its security managed by that system, but making Windows clients authenticate against a Posix domain controller is a little less clean. While Windows Professional allows a machine to be joined to an arbitrary Kerberos domain, the security of the system isn't going to be managed by such a controller. As such, while systems like the Apache Directory Project and FreeIPA are very welcome, given that no centralized security system can be taken seriously unless it takes into account the number of Windows desktops out there, some effort needs to be made to produce a good domain client for Windows.
How hard could it be?

Friday, September 17, 2010

Issues with the monoculture

A few years ago, there was a brief blip when people started to talk about a major security issue in modern computing being "the monoculture". Part of this was a response to the antitrust lawsuits against Microsoft, plus an attempt to explain why Microsoft's operating systems received so much attention from hackers, and non-Microsoft operating systems barely any.

The issue is this: if everyone runs the same operating system, then anyone who writes a virus or worm that exploits a fault in that operating system will find their virus or worm impacts virtually everyone. There are two angles to this:

  • From the point of view of a victim, the consequences of a monoculture can be devastating. A single virus can destroy a business's ability to function, as every employee's personal computer becomes infected and non-functional, as well as the computers belonging to their suppliers (and clients.)
  • From the point of view of the hacker, spreading the virus becomes merely a step of finding a mechanism to identify other computers, as it's guaranteed that those computers can be infected.
The discussion died down for a variety of reasons. Ironically, many of those who would benefit from the argument refused to accept it, because it meant accepting that their own chosen, non-Microsoft, platforms were just as flawed as Windows. A case in point is Mac OS X. Mac OS X has always had security holes. Earlier versions, up until a point release of 10.3, were actually so insecure that all a programmer had to do to "deliver" malware to a user was to ensure a website they browsed to sent it. Safari would, automatically, without the user's involvement or say so, by default download and unarchive the "application", and its mere presence on the user's disk would "install" it. To get it to run, all you had to do was ensure the application was associated with a common file extension or two, so that the next time the user clicked on that type of file, it would open. This was something the hacker/programmer could do; they didn't need the user to do anything.

Now that particular hole (which was open for years, undermining the notion that Mac OS X was ever built by anyone who considered security a priority) has been fixed, but holes still pop up. Mac OS X has, since its release, had security updates delivered automatically every month or so. These updates would be, by definition, unnecessary if Mac OS X were already secure. Yet many Apple enthusiasts, to this day, didn't and don't accept the idea that Mac OS X might be insecure. The fact that no virus has hit OS X users has been used as evidence of this.

But in actual fact, the reason Mac OS X hasn't been hit by a virus is that it has such a comparatively small market share, in any community of users. Now, that last bit takes some explaining.

Back in the 1980s, viruses were common and were successful on a range of platforms, even platforms that weren't particularly popular. MS DOS had many, but so did the Commodore Amiga and the Atari ST, both of whose market shares were dwarfed by the PC's. Part of the reason for this had to do with the "networks" of the time. Networks, in the 1980s, generally consisted of people swapping disks with one another. Those disks were intended for a single platform - people didn't expect a disk for a Mac to even be readable by an Amiga or PC, largely because, without special software, it wasn't. So while Macs might have had a small market share, Mac viruses were successful because the Mac was a monoculture within the networks it belonged to. As were the Amiga and the ST.

Fast forward to today, and that's not the case. Someone who writes a virus for a Mac will find that the vast majority of machines their virus "hits" will be unable to run it. Worse, if they use something like the old "disguise an executable as a JPG and email it to everyone" trick that plagued users during the late 1990s and early 21st century, they'd find that Mac users would be warned pretty quickly by all the non-Mac users who would receive the corrupt JPG. The fact that the virus would attempt to infect uninfectable computers would make alarm bells ring.

So, anyway, the point here is not to analyze Mac security holes, but explain why standardizing on a single platform has negative implications for any business. By standardizing on a single platform, regardless of what that platform is, you do many things that will negatively impact your security:
  • You significantly improve a virus's chance of success by ensuring the majority of machines it will hit can be infected.
  • You reduce the likelihood of early warning and detection, by ensuring that viruses don't attempt (and fail) to infect machines they weren't built for.
  • You ensure that any virus that hits your organization will have a devastating, crippling effect on your business.
Moving away from the monoculture tends to scare many system administrators. Licensing seems easier if everyone uses the same platform (although it can also be much more expensive), and certain tools work better with specific platforms. Still, much of that is changing. Apple is doing its best to ensure Mac OS X fits transparently into organizations that are primarily based upon Windows, and while Ubuntu seems to lack a community of developers that understand, say, domain-based security, it is at least moving in the right direction.

There's certainly no reason to prevent your users from using the right tools for the job, and by doing so, you also help make your own network more robust, and you become a better Internet citizen. The right choice is choice.

Friday, August 20, 2010

Oracle, Google, and Java

Oracle's decision to sue Google over the use of Java technologies in the Android operating system has certainly cast a cloud over the Java platform in general. Sun's behavior towards Java was paternalistic but not aggressive: a single lawsuit, against Microsoft, was issued at a time when Microsoft was doing what it could to muddy what Java actually was (in some senses, Microsoft's actions constituted something akin to trademark infringement, even if the technicalities of the case said otherwise), but beyond that, Sun did not sue anyone producing similar or competing platforms.

And the competing or similar platforms were legion. The FSF, the Apache Foundation, and IBM produced their own Java Virtual Machine implementations, IBM licensing and certifying theirs, but the others being entirely independent. Open source developers put together a Java alternative called Parrot. And, of course, Google, while Java was still owned by Sun, put together Dalvik.

What's Dalvik? Well, it's the virtual machine at the heart of the Android operating system. Programs are written in a high level language like Java, translated into Dalvik code, and that is the code that is distributed, in much the same way that code written for the Java platform is translated into Java bytecode, put into JAR files, and distributed.
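
The pipeline, sketched with the stock Android SDK tools of the time (the class name is illustrative):

    javac Hello.java                           # Java source -> standard JVM bytecode
    dx --dex --output=classes.dex Hello.class  # JVM bytecode -> Dalvik bytecode
    # classes.dex is what gets packaged into the .apk and run by the Dalvik VM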

Sun did not object, strongly at least, to what Google was doing. In fact, the then Sun CEO, Jonathan Schwartz, welcomed Android upon its announcement. This isn't to suggest Sun didn't have concerns: Sun wanted mobile devices to converge around the J2ME system, and while Android contained some Java technologies, the underlying platform was entirely incompatible with J2ME.

Oracle has taken these political concerns, and turned them into a lawsuit. Given the fact people generally believed that Java had been turned into an open source technology by Sun, this is raising concerns about the status of the entire platform.

Are people right to be concerned? Well, let's address what's going on.

The status of Java

Java's status as an open source technology is affirmed by two actions Sun took before being swallowed by Oracle. The first was to take the official Sun version of Java, and release the source code under the GNU General Public License. This license permits anyone to take the code and distribute it, or modified versions of it, as long as they in turn provide the code to those they distribute to, and license it under the GPL as well. The basis of the GPL is copyright law, so Sun's license covered any risk of infringing copyright by copying Java - as long as you kept to the license, you would not be infringing upon Sun's copyrights.

The second action was to create a patent grant. This took the software patents that form the core of Sun's Java technologies, and allowed anyone to implement technology covered by those patents, as long as certain conditions were met. However, the conditions here were more limited than those imposed by the GPL copyright license: you could implement technologies using the Java patents if, and only if, you implemented the official, Sun-sanctioned, Java specification. There's some debate as to whether you could create a product that implemented a superset of Java under the grant, but there's no debate that implementing a subset (or a superset of a subset) would violate it.

In some ways, the latter license seriously cripples the former license. The GPL is supposed to allow those who receive code under it complete freedom to modify that code as they see fit (within the confines of the law.) However, the patent grant limits that freedom quite severely.

So, in essence:
  • You can create a custom implementation of Java, based upon Sun's code, if, and only if, your implementation follows the Java specification.
That's the legal status of Java, at least as Oracle understands it.

Consequences

Java was only recently made open source, and a large community has centered around it despite Java's initial lack of openness. Hence it's unlikely, at this stage, that Oracle's actions will kill Java outright. However, Oracle's lawsuit has angered quite a few people in the Java community.

Google's embrace of the Java language has also helped increase momentum for the language against rivals such as C#. It's not clear to me how Google will address the issues, and a lot will depend upon whether Oracle wins its lawsuit (or Google believes it will.) The best thing that could happen for almost everyone would be for Google to take the suit to court, and win. This would ensure Java remains a safe, open platform, without anyone believing they're taking an excessive legal risk by dabbling with it.

But what if Google loses? Well, the consequences would be substantial.
  • Google and Oracle would have to negotiate a system of licenses covering Android, or else see the destruction of the Android platform. This might involve changes to Android itself, or it might involve some form of payment to Oracle to cover future updates. In the worst case, Android might cease to be open source, but such a move would likely end much of the support for the platform.
  • Developers would be wary of working with Java at different levels. Those who write end user applications would be largely unaffected, although the use of technologies developed to work around Java's limitations could be compromised. As development goes down the chain, frameworks, alternative libraries, and third party implementations of Java would increasingly be impacted by developers concerned about the legal risks of playing with the platform. The progress made integrating Java with many GNU/Linux distributions may be partially undone.
  • Mindshare would inevitably shift towards alternatives, rightly or wrongly. Microsoft's .NET platform, for instance, certainly would benefit from an overly litigious Oracle.
On the surface, as long as developers see Java as a platform, and avoid third party implementations or extensions, Oracle's lawsuit might not be an issue. But the risk is that if Oracle continues down this path, successfully, we'd be looking at a stagnating platform. One hopes that Oracle will change course, or else Google will win the right to create its own implementation of a technology Sun had claimed it had freed.

Monday, August 16, 2010

On Virtualization

I've been using virtual machines almost as long as I've had computers capable of running them. Back in the 1980s, computers weren't as standardized as they are today, and it became very popular to make emulators that would simulate one computer, running on another, so that you could run software for the emulated machine.

As time moved on, and computers started to standardize upon a common architecture, the emphasis moved from emulation to somehow fooling an operating system into believing that it had control of the computer when, in fact, it was running as just another program. Microsoft Windows 3.1 provided DOS in exactly this way if you had a powerful enough CPU. And a company called VMware commercialized a system that allowed you to run far more powerful operating systems the same way.

So, what is virtualization? And what is it good for?

Virtualization is simple to describe, but it takes many forms and has even more applications. In principle, if you can run more than one operating system instance on your computer at a time, then you are engaging in virtualization.

In early instances, usually called "emulation" at the time, virtualization was used to provide compatibility with programs written for a different platform. For example, the Commodore Amiga had available for it several PC emulators, programs that simulated a complete IBM PC, allowing a real copy of MS DOS to be installed and real MS DOS applications to be run.

We're still doing it. Microsoft's Windows 7 comes with something called Windows XP Mode. Windows XP Mode is actually Windows XP running inside a virtual PC - a process designed to fool Windows XP into thinking it's running inside a real computer. Many Mac OS X users run a tool called Parallels, which makes it easy to run Windows applications without having to reboot into Windows and lose access to their Mac OS X applications.

The same technologies used to simulate whole computers on a user's desktops can also be used for applications other than compatibility. Developers, for example, love virtualized computers. On my development laptop, I'm running Windows 7 as my primary environment, but I also develop for GNU/Linux, and I have VirtualBox installed so that I can run Ubuntu without ever leaving my Windows desktop. In theory, Microsoft's Virtual PC, which is provided as standard with Windows 7, ought to be capable of running Ubuntu, but I've had problems there that I hope will be fixed in a future update of either Ubuntu or Virtual PC.

Having virtual computers as a developer means more than being able to develop for other platforms. Virtual computers can easily be wiped, replaced, backed-up, duplicated, and so on. I can create test environments without worrying about losing my primary environment.

Now, there's another major reason why you might want to use virtualization, but it doesn't lend itself to the "run program that pretends to be a computer on your desktop" approach. Increasingly people are using servers. Servers provide tools over a network, such as web sites, and generally servers need to be reliable, have excellent connectivity, and need to be able to run the applications installed on them without the risk of one upsetting another.

What servers generally do not need is oodles of memory. And given the above requirements, virtualization is a good fit, as long as the virtual servers are all installed on computers that are powerful enough, are reliable, and are located somewhere with good connectivity.

A fairly popular system for managing such an environment is Xen. Xen runs as an operating system in its own right, called a hypervisor, and it encourages the use of paravirtualization, a technique whereby operating systems know that they're not really running on bare hardware, and instead cooperate with the underlying virtualization system. Typically, in a Xen set up, a computer is set up with the base Xen hypervisor, a very basic, lightweight operating system installation called the "Dom0" that can be used to control the system, and then one or more actual servers that do the work. Reboot a Xen system - should you need to - and Xen will shut down the virtual machines and bring them back up. Your single computer becomes a multitude, running as many applications as you need.
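
To give a flavor, each guest is described by a small configuration file in the Dom0; a minimal paravirtualized example might look like this sketch (all paths and names illustrative):

    # /etc/xen/web.cfg -- started from the Dom0 with "xm create web.cfg"
    name    = "web"
    memory  = 256                            # MB of RAM given to this guest
    kernel  = "/boot/vmlinuz-2.6.32-xen"     # a Xen-aware (paravirtualized) kernel
    ramdisk = "/boot/initrd.img-2.6.32-xen"
    disk    = ['phy:/dev/vg0/web,xvda,w']    # an LVM volume as the guest's disk
    vif     = ['bridge=xenbr0']              # one network interface on the Dom0 bridge
    root    = "/dev/xvda ro"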

How does this work? Well, I'll give you an example. I have a machine in my closet that is a dedicated Xen box. It runs several virtual computers, all of which, at this point, run Ubuntu.
  • I have one VM that handles my DSL connection. Because this VM handles the connection to the outside world, I can keep the number of services on it to a minimum, which helps keep my network secure. People would find it hard to hack into this VM, and they would find it hard to hack into the other computers on my network, because to get that access, they would first need to hack this VM.
  • I have one VM that handles my email and runs a website. My wife and I use MediaWiki to store things like shopping lists and package tracking numbers, in a way we can both get at easily. This VM runs the website, and the databases needed by the website.
  • I've set up other VMs for more obscure things I do from time to time. Some time ago I wanted to learn about a system called Kerberos, and so I set up a VM to manage a Kerberos domain. Because this was a virtual machine, I did not have to worry about the configuration messing up anything I was not playing with.
Why use Ubuntu? Well, part of the reason is that Ubuntu is one of the operating systems that supports Xen's paravirtualization system. Another is that Ubuntu has no licensing issues that would cause problems with virtualization, making it easy to create and destroy installations as I need them. But other options exist: if you must have Windows, Xen has a method to run it (alas, without the benefits of paravirtualization), and Xen supports other variants of GNU/Linux, as well as more obscure choices such as Solaris.

For my next job, I'm expecting to have to do a lot of development under GNU/Linux. I'll be developing software for servers, and the ability to create and destroy test servers as I need them will be invaluable. Ultimately, the software will be running on production servers that themselves will be virtualized, making things easier for the system administrators as well as saving the company a fortune in hardware costs. It's a beautiful thing.

Intimidation

I had a very strange conversation with my wife and my step father a while ago. Both said that when they met me, they found me intimidating. You see, they heard that I'm a computer programmer, and automatically assumed I was smart, and found that intimidating. Of course, once they got to know me, and discovered I was really a harmless, big, doofus, they didn't find it an issue any more.

Had a similar issue today. Our fence has collapsed, and I came across my neighbor fixing it. I hadn't made any attempts myself because, quite honestly, I wouldn't know where to start. This is ignorance on my part, and a fault: I should be doing something about my lack of knowledge, rather than running away from tasks that need that knowledge.

My neighbor - nice guy, getting on with it, knows exactly what to do - asked me if I wanted various things done, and I didn't really know what to tell him. I felt a tad awkward. He knew all of this stuff. I didn't.

Sunday, August 8, 2010

"Can I use this open source program?"

I've heard some very weird things about open source of late, with people convincing themselves of mysterious legal dangers that somehow lurk in the open source world but not the proprietary software world: from one person, the head of IT at a major corporation, who announced it was OK to "use software under the GPL but not the GNU license" (GNU is a software project, and is licensed under the GPL), to others who are convinced that Android isn't open source because (follow this logic!) the license allows phone manufacturers to make proprietary modifications - which, they reason, means nobody can do anything unless they ask Google for permission.

Confused? You wouldn't be the first person. (And yes, Android is open source, and Google cannot prevent you from doing whatever you want with it.)

Look, here's the deal with open source. You are very, very unlikely to get into trouble for using open source software. Indeed, the mere act of having open source software installed on your computer, and running those programs, will never get you into trouble. Proprietary software, on the other hand, cannot claim that. You may think it's just a matter of paying for an application, but in fact many proprietary software licenses contain obscure rules that are easy to trip over entirely unintentionally.

That same corporation that had problems understanding open source licenses? Had to pay several million dollars because it uninstalled a proprietary application it used, and re-installed it on another server, one with more than one CPU. Yes, really.

So why are people confused about open source? And what kind of risks do you take in using open source software?

There are really two issues, and most of my clients will never run into either. The first is that open source software normally has some requirements associated with its redistribution. In some cases, the requirements are minor and inconsequential: you might be required to include a notice crediting the authors if you redistribute their work. At the other end of the spectrum, some licenses require that you pass on the same rights you received to anyone you give a modified version of the program to. That is, if you were to take Linux, for example, make a change to it, and give people copies of your modified Linux, you'd have to make sure the recipients are allowed to change the copies you give them too.

The second problem is that some people are taking out patents on the way programs work, and open source, by its nature, has no major protections against being sued for using those patented technologies. To be fair, running proprietary software carries patent risks too, and the legal status of so-called software patents is still the subject of much debate (though you should assume they are enforceable for now). The chances of anyone discovering that you are using a patented technology in software you run privately, on your own PC, are slim, but the issue has been raised.

Let's get back to the first point though, about running afoul of an open source license, and work out how to tell whether the issue might affect you. Remember, if it's open source, the chances are there are fewer licensing implications for you than there would be if the software were proprietary. Many have learned this the hard way.

So, you're thinking of using a particular open source program, and you want to know if you need to study the license before using it. How do you determine if this will be an issue?

1. Is the program really open source?

This is the first question you should ask. You might not have to pay anything to download a particular application, but that doesn't mean it's open source. Adobe's Acrobat Reader and Apple's Safari are both proprietary programs that are available at no cost, for example.

In order to check whether an application is open source, ensure that it is, in its entirety, licensed under one or more of the licenses approved by the Open Source Initiative and described as "Free software licenses" by the Free Software Foundation. The lists are published on the OSI's and the FSF's websites respectively. I would avoid any license that isn't considered open source, or free software, by both groups.

Also when checking, be aware that some software packages are distributed in a non-open source form. For example, Sun's (excellent) VirtualBox tool is available in an open source form, but if you want a version you can just install and run, Sun only distributes a closed, proprietary variant from their website, and it comes with some unusual restrictions. How unusual? Well, if you want to install it on your PC for your individual use, I believe you can do that, but you can't ask me to install it for you; I'd be violating the license if I did. This is why proprietary software is where the real licensing issues lie: it's easy to run afoul of them doing things you'd expect to be perfectly innocent.

If the software really is open source, go to question 2; otherwise, skip to step 4.

2. Are you going to be selling or distributing software?

If not, you're in the clear. If you're not actually distributing software, you're not going to be distributing the application you're concerned about either, so stop worrying and install the app! Otherwise, go to question 3.

3. How will you be using the software?

OK, we've established you're in the business of distributing software, which means we need to figure out if you're going to be distributing this code directly or indirectly. So, how would you characterize your use of the software?
  • You're going to run the application on your PC for work unrelated to software development or distribution ---> stop worrying and install the app. It's not an issue.
  • You'll be incorporating it into software your organization runs internally, and only internally ---> stop worrying and install the app. In fairness, there are some exceptions to this rule, but they're relatively obscure and only really apply if you're using a more liberal definition of "organization" than most people would.
  • You'll be using it to create things you'll include in something you redistribute, but you will not be distributing anything from the application itself. For example, you plan to use the GIMP application to create some graphics for a computer game, or you intend to use GCC to compile your application. ---> Again, stop worrying and install the app. It's not an issue. But make sure that's really what you're doing.
  • You'll be including some or all of the code in something you'll be redistributing to others ---> go to step 4.
4. If you got here, you need to read the license.

If you got here, it means you'll be distributing some part of the open source program to others, or the code was never open source to begin with. If you're going to be distributing open source, you may have some obligations that go with the rights you were given. A good place to get the gist of what those requirements may be is the FSF licensing page I mentioned above. The FSF generally describes each license as follows:
  • A "Strong copyleft" is a license that requires you release your changes and additions to the code under the same license.
  • A "Weak copyleft" is a license that requires you release only your changes to the code itself under the same license. If you have added something entirely new, you don't need to license that under the license, you can release it under any license you want, including a proprietary license.
  • Licenses that aren't copyleft generally only require you provide attribution.
These are not the only differences, and some licenses have unusual requirements you may find objectionable, such as allowing a named third party to make modifications to your code without releasing them back. But the above should get you started.
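
In fact, the whole procedure, from question 1 through step 4, is simple enough to boil down to a few lines of code. Here it is as a tongue-in-cheek Python sketch (the function and argument names are mine, and this is most certainly not legal advice):

    def must_read_the_license(is_open_source, you_distribute_software,
                              you_include_its_code):
        # Question 1: if it isn't open source, the license needs reading anyway.
        if not is_open_source:
            return True
        # Question 2: if you don't distribute software, stop worrying.
        if not you_distribute_software:
            return False
        # Question 3: a tool's output is yours (GIMP graphics, GCC-compiled code).
        if not you_include_its_code:
            return False
        # Step 4: you're redistributing the code itself, so read the license.
        return True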

Generally speaking, open source licenses are more liberal and less likely to get you into trouble than proprietary licenses. Much of the confusion has to do with expectations - everyone knows that if you were to redistribute copies of Microsoft Word you made yourself without permission, Microsoft would sue you into the ground. OpenOffice.org, conversely, is open source, so you are allowed to do just that, but the copyright holders nonetheless ask that you do something in return. The license will tell you what.

Friday, August 6, 2010

Google Apps as a small business office system

I'll admit, setting up Harritronics was a spur of the moment thing. I had been asked to do some contracting work, was advised to set up an LLC, and jumped in to get everything up and running before the start date. Aside from the legal stuff (which the State of Florida makes phenomenally easy via sunbiz.org), I wanted to make sure I had the same, or even better, computing facilities than I had at previous employers. That meant email and an integrated calendar, a web server, somewhere to deposit documents, and a full office suite.

Now, I'm a nerd; I have a lot of that anyway. In my closet is a server box running three virtualized servers, connected to the Internet via DSL, and those servers include something to receive email on my own domain, and even a web server. With a Wiki. But while this kind of thing is fun to do for yourself, for a business it carries risks: I'm at the mercy of disk crashes, issues with my Internet link, and software updates going wrong. It also wasn't enough by itself. I could read my email on my network, but not outside of it; I had no online calendar of any description; and while I love the MediaWiki system (which also powers Wikipedia), it's not something I would recommend as a general shared document repository.

So what were the available options?

The first was to set things up myself on a Virtual Private Server, or VPS. You can "rent" a VPS, a virtual computer sitting in a managed server room somewhere across the country, and then install virtually any software on it you wish. The services offered by the server are available anywhere on the Internet, the system is backed up and rock solid - you don't have to worry about your own Internet connection causing problems - and it's as flexible as the software you install. The downside of such a solution is that you still have to spend a lot of time or money installing the "right" software.

The second was to buy an out-of-the-box solution such as the Microsoft Exchange and SharePoint suites. The issues I have with these are that they're (very!) expensive, they take time to install and get ready, and they're overengineered for a small business; and unless you combine them with a VPS, you still have the issue of your system being dependent on your own computers and networks. I also admit that I'm uncomfortable with a proprietary solution, and Exchange is infamous for working poorly with others.

The third was to get third party email and web hosting. And, in the course of researching this, I found something called Google Apps.


What is Google Apps?


Despite the name, Google Apps is not a collection of applications, or at least, not as we generally think of them. Google Apps is an integrated email, calendar, web site, and document repository system, based on GMail, Google Calendar, Google Sites, and Google Docs. The system is designed so that you register a domain name (like Harritronics.com), point the domain at Google, and then set up users under the domain, each of whom has their own email address, calendar, and so on, and all of whom have shared access to certain common resources. It's similar to Microsoft Exchange and Microsoft SharePoint, but it's all on the web.

And if you have fewer than fifty users, the system is 100% free of charge. You see ads, and if you choose to create a public website using Google Apps your customers will see a message at the bottom of it saying you're using Google Sites, but otherwise there's no cost, in any real sense, to you the customer. For more than fifty users, or if you expect to receive and store a lot of email, you need to subscribe to the "Premium" version, which costs a whopping $50 per user per year - $2,500 a year for fifty users - which is probably affordable to any business that can afford fifty employees, and certainly compares well to, say, an Exchange license.

As a subscriber, you create users as you see fit, and each user automatically has access to the services Google Apps provides. New users appear in the address books of every person in your organization, and you can easily assign them rights to access or change shared resources such as websites and calendars.

Let's go through the features one by one.

GMail

The most obvious feature of Google Apps is GMail. Email to your chosen domain is available to your users through the GMail interface. If you'd prefer to read your email using a normal desktop client, you can use any client that supports IMAP (or POP3), which these days is pretty much all of them. The same client can also be used to compose email as long as it supports "Secure SMTP". Google provides standalone email clients for Android and iPhone too.
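
To give you an idea of how standard the IMAP side is, here's a minimal sketch using Python's stock imaplib module. The address and password are placeholders; any Google Apps account should behave the same way:

    import imaplib

    # Connect to Google's IMAP server over SSL (the hostname is the same
    # for Google Apps domains as for ordinary GMail accounts).
    conn = imaplib.IMAP4_SSL("imap.gmail.com", 993)
    conn.login("you@example.com", "your-password")  # placeholders!

    # Count the unread messages in the inbox.
    conn.select("INBOX")
    status, data = conn.search(None, "UNSEEN")
    print("Unread messages:", len(data[0].split()))
    conn.logout()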

I have to admit I don't find the GMail interface particularly attractive. Most email systems have two things in common: they present one email to the user at a time (multiple windows notwithstanding), and they provide the ability for the user to create folders in which to file emails once read.

Not GMail. GMail takes the approach that users should read "conversations": when you click on an email, all the emails that are part of the same thread (the emails responded to, and so on) come up at the same time. As you read emails, you either mark them read and leave them filed in the archive, or you tag them with one or more labels you can return to later.

In theory, it's a great system. In practice, it's one of those times where programmers have spent a little too much time trying to be clever. Presenting entire conversations at once gets confusing very quickly, and the tagging feature, while interesting, doesn't in practice seem to help with filing any more than folders do. Indeed, the combination of the two works against itself - whether intentionally or not, conversations, not individual emails, end up being tagged.

To a certain extent, all of this is mitigated by the ability to use third party IMAP clients. I'll discuss IMAP in a moment.


Google Calendar

The second major feature of Google Apps is the integrated calendar system. Again, this is the regular Google Calendar system, but enhanced to allow sharing between employees. Users can create multiple calendars, subscribe to other calendars, and publish their own, and they can schedule appointments, sending out meeting requests and processing incoming ones in a way that's integrated with GMail. Sending meeting requests to parties outside the Google Calendar system works too, as long as their system supports them (Microsoft Outlook/Exchange users, for example).

As with GMail, the system is integrated with Android and the iPhone, although the Android version is, right now, a little kludgy.


Google Sites

Google Sites is a tool that makes it easy to set up websites, both public and private. As with Google Calendar and GMail, Google Apps simply integrates the existing Google Sites service and allows the users you set up to collaborate.

Google Sites takes an interesting approach to website development. Instead of having users hand-roll raw HTML, they edit predefined templates, with an editor that makes it easy to change fonts, create links, and so on. The feature-set is similar to that of a Wiki, but the service is much more user friendly, and there's more flexibility in defining what each user can do and see. Google Sites users can also embed more complex applications, such as shared spreadsheets.

There doesn't appear to be any restriction on the number of websites you create using Google Sites, and you can give each site a different level of access. In my case, www.harritronics.com is set up as a public website anyone can read. I also set up some private websites for my organization - a company homepage, for instance, that gathers the links to all of the different systems we use.

You can't do anything you want using Google Sites, as Google places strict restrictions on the look and feel of the sites you build, but the service works reasonably well both as a way to advertise a business and as a collaboration tool. If you were considering setting up a Wiki in-house, or you want to put together a basic website to explain your services, the system will serve you fairly well.


Google Docs

Google Docs is probably the easiest part of Google Apps to misunderstand. Again, the system is a commandeered service Google was already operating. Google Docs provides a spreadsheet, a presentation editor, and a word processor that all work within your browser.

And if that was all there was to it, you probably could stop right there. OpenOffice.org, for instance, provides word processors, spreadsheets, and presentation editors (and a whole lot of other great tools) for free that are far more feature-full. You would probably not choose to write your novel, or create an invoice, using the Google Docs word processor. You would probably prefer not to design an advanced report using the Google Docs spreadsheet.

But that's not what these tools are for. What makes Google Docs special is that these are dynamic multiuser applications that run over the network. Your employees can all access the same document without having to worry about competing copies, or about one person locking everyone else out of the file. Two people on different sides of the planet can update a spreadsheet, which in turn is shown in real time on a website you've built yourself.

Back up a bit. I said above "You would probably not choose to (...) create an invoice using the Google Docs word processor", but if the invoice is complex enough, you may want to involve several people in its preparation, and so you might very well use Google Docs to write the draft.

Google Apps provides a central repository for your company, making it easier for employees to share documents with one another.


Integration with the desktop

As I noted above, Google Apps has a web based interface that, on occasion, is supplemented by systems that allow you to make use of dedicated applications. For example, you can avoid the GMail interface and use any IMAP client. This means that in theory, you can use, say, Microsoft Outlook, for your email. It also means that your users, given the right email client, can have offline access to email. But that isn't entirely the case, and here's why.

Users of Microsoft Exchange are used to an environment in which everything is integrated. Your address book includes everyone in your organization, and everyone you've manually added to it. Likewise, GMail includes the same feature through its own email interface. What Google Apps does not do, however, is provide third party applications with access to the same information.

Now, to be fair, there are workarounds, and some would argue that the process of synchronizing contacts is not exactly standardized anyway. There's a system called LDAP which can be used to store global address books, but Google Apps certainly doesn't publish user lists that way. For individual contacts associated with a single user, a system called SyncML exists, but it has very little traction: Google Apps supports it only partially, and client support is hit and miss. Outlook does not directly support the standard, and while plug-ins exist to provide it, the available ones don't appear to work with Google Apps.

Calendar support is certainly better, but this too is hampered by a lack of universally promoted standards. Google allows calendars to be published using the open iCalendar format, and edited using the open CalDAV protocol. Unfortunately, Outlook only supports read-only access to the former, and doesn't support the latter at all. Other clients have mixed support. The Mozilla Foundation produces an excellent email client called Thunderbird, which has a plug-in called Lightning that supports CalDAV, but I found it somewhat unreliable when connecting to Google Calendar.
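
To show how simple the iCalendar side is, here's a sketch that pulls a published calendar feed and prints the event titles. The URL is a placeholder - Google Calendar gives each calendar its own private .ics address, which you'll find in the calendar's settings page:

    import urllib.request

    # Placeholder: substitute the private .ics address from your
    # calendar's settings page.
    ICS_URL = "https://www.google.com/calendar/ical/.../basic.ics"

    with urllib.request.urlopen(ICS_URL) as response:
        for line in response.read().decode("utf-8").splitlines():
            # Each event in an iCalendar feed carries a SUMMARY: line.
            if line.startswith("SUMMARY:"):
                print(line[len("SUMMARY:"):])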

In fairness, third party plug-ins exist to do most of this, but I found their ease of installation, and their reliability, questionable. One calendar synchronization app I used, which will remain nameless, kept popping up a requester asking me to select an Outlook profile every time it tried to synchronize. Nothing worked until I uninstalled Office 2010, which had initially been installed using Click-to-Run, and re-installed it using the separate download Microsoft conveniently forgets to tell you exists.

The issue of not having all three (email, contacts, and calendar information) available to a single application is more of a concern than you might realize. Mix a desktop client and the web interface, and without a lot of syncing you'll have problems keeping track of which calendars have which appointments, and you'll constantly find yourself wondering what happened to your contacts. The reality is that the only way to use all three together is to use the Google web interface, plus the clients Google has written for specific platforms.


Advantages and Disadvantages

Google Apps has various advantages and disadvantages over its more conventional cousins. Let's compare it, for the sake of argument, to Microsoft's Exchange suite.

Pros:
  • It's cheap. For up to 50 users, you don't have to pay a penny. Moreover, there are no per-seat licenses to track, as there would be with a typical Exchange configuration.
  • You can run your entire business using some office-grade laptops, a DSL connection, and a Wifi router you bought at Office Depot. I'm serious. The age of the "server room" for anything but the largest enterprises is rapidly coming to an end. You can hire some guy from Harritronics (ahem) to come over and set everything up, and then just call him, or someone similar (Geek Squad?) if you actually have a computer problem.
  • It's fast. Outlook takes about 15 seconds to launch on my system; GMail loads in about 5. That might not sound like a dramatic difference, but it certainly cuts down the annoyance factor!
  • You can access everything everywhere. People can work from home, or even from their cellphones, without the need to configure VPNs or other complicated inter-network systems. I found it astonishing that for the first week or so I was effectively running Harritronics from my Android cellphone 90% of the time.
  • Disaster recovery is built-in. If a hurricane destroys your office, you can be up and running as soon as you find a coffee shop with a working Wifi system. Your email, calendars, contacts, and important documents will all be waiting for you.
  • It's easy to administer. Just go to the website and add or delete users as you need to. No special software required, and no concerns about the right people being locked out of the system, as long as you've set everything up properly.
  • It's easy to set up. I had the system up and running within ten minutes, and that included registering a domain name at register.com.
  • The system is entirely cross platform. Your users can use Internet Explorer, Firefox, Safari, or Google Chrome, on Microsoft Windows, Mac OS X, or even Ubuntu GNU/Linux, to access everything in the system. The only other recommended tool is a PDF viewer such as Adobe's Acrobat Reader, as various tools "print" by exporting a PDF file. Your employees can run the operating system and web browser they feel most comfortable with, which improves productivity and makes for happy employees, as well as reducing the risk of a monoculture - where running the same operating system throughout an organization makes that organization exceptionally vulnerable to viruses and hackers.
Cons:
  • Desktop applications always feel better on a desktop, no matter how well a website is designed, and GMail really isn't that friendly. At this stage, it's hard to recommend the use of Google Apps with desktop applications because of the difficulty configuring the two to work together seamlessly.
  • You may have legal restrictions that prevent you from making full use of the service, in particular the collaboration services may not be useful to you if you work in the healthcare field. The important thing to note is that the entire system is hosted on servers outside of your direct control, and while there's no reason to especially distrust Google's security over, say, a firewall on your own network, legally there is a distinction.
  • Unless you pay for the premium version of Google Apps, you will not be able to (easily) integrate your office network's security system with that of Google Apps. In practice, this means users will need to remember two sets of passwords, the password they use to log into their computer, and the password they use to log into Google Apps.
  • You don't manage the security, and you can't lock out access to the system from computers not on your network. While Google's security is fairly robust, there's ultimately nothing stopping someone in, say, China from hacking into one of your users' accounts if he or she can guess the password - something less likely to be possible if the email server were behind your own firewall. This means you need to have policies in force requiring users to set strong passwords, something easier said than done.
Useful links and workarounds

I used Register.com (sponsored link) to register my domain name. You don't need to buy anything other than the domain itself, although depending on your needs, you might want to buy the web hosting service if you'd prefer your public website to have no Google Apps branding at all.

Once you have the domain name set up, you can go into Register.com's control panel to point specific host names at Google. For example, I set up www.harritronics.com to point at ghs.google.com using the following:
  1. Go to register.com, and click on "Your account"
  2. Log in, using the same registration details you used when you bought the domain name.
  3. User "Services for account", you'll see your domain name (and some additional services underneath that you can ignore.) Click on the domain name itself.
  4. Scroll down to the bottom of the screen that comes up, and you'll see a set of options under the heading "Advanced Technical Settings". Click on "Edit Domain Aliases Records" next to "CNAME".
  5. Enter "www" in one of the empty boxes on the left, and enter "ghs.google.com." in the box to its right. Hit Continue, and Continue again to confirm the settings.
You'll be asked to set up aliases to point at ghs.google.com frequently when setting up Google Apps, so it's worth keeping this in mind. Setting up the email alias is slightly different:

  1. Go to register.com, and click on "Your account"
  2. Log in, using the same registration details you used when you bought the domain name.
  3. User "Services for account", you'll see your domain name (and some additional services underneath that you can ignore.) Click on the domain name itself.
  4. Scroll down to the bottom of the screen that comes up, and you'll see a set of options under the heading "Advanced Technical Settings". Click on "Edit Mail Exchanger Records" next to "MX".
  5. Leave the two boxes on the left blank. Set one row to Priority = "High" and Mail server = "aspmx.L.google.com", and another to Priority = "Medium" and Mail server = "alt1.aspmx.L.google.com". Hit "Continue", and "Continue" again to confirm the settings.
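
DNS changes can take a while to propagate, so it's worth sanity-checking the records afterwards. Here's a quick sketch, assuming you have the third party dnspython package installed (the domain is mine - substitute your own):

    import dns.resolver  # the third party "dnspython" package

    # Did the CNAME take?
    for answer in dns.resolver.query("www.harritronics.com", "CNAME"):
        print("www points at:", answer.target)

    # And the mail exchangers?
    for answer in dns.resolver.query("harritronics.com", "MX"):
        print("Mail goes to:", answer.preference, answer.exchange)
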
I'm using Office 2010 on Windows 7, and as you might expect, the suite suffers from being new and from developers not being quite sure how to deal with it. The critical thing to remember when you install it is not to use the "Click to Run" installation that is downloaded by default. On the download page, if you're downloading Office from Microsoft, you'll find an "Advanced Download options" button (or something similar); use it to select the plain 32 bit or 64 bit download instead.

Once you install Office, you can, kinda sorta, use Outlook with Google Apps, however:
  • Without paying for the premium Google Apps product, there's no easy way to automatically sync contacts between the two systems. You can, however, tell your users to manually export and import contact lists between them.
  • If you want to create a common address book, you'll need to build a separate LDAP server and ensure any changes made on the Google Apps side are also made on the LDAP side.
  • You will need a third party plug-in on each user's PC to synchronize the calendars; otherwise, accepting an invitation on one system will not be reflected on the other. At this point I have not come across a calendar synchronization tool that I can recommend. The tool I'm using right now, for instance, insists on prompting for an "Outlook profile" despite being told which one to use, and despite there being only one profile on the system!
  • The other tools are built for web browser use. You can't integrate Excel with Google Docs, for example, but you wouldn't want to; the two applications are built for different tasks.
There are alternatives to Office! Consider using OpenOffice.org (from, uh, www.openoffice.org). It's a complete and extremely powerful office suite from Oracle (formerly Sun), and it's free. For email and calendar support, you might want to consider Thunderbird, from the Mozilla Foundation (famous for the Firefox web browser). With the Lightning plug-in installed (which is also free), the system works as a high quality replacement for Microsoft Outlook. Be aware, though, that at the time of writing Lightning does not work reliably with Google Calendar...

So...  have you used Google Apps? Would you like an office collaboration system that "just works" and has no licensing complications?


Thursday, July 29, 2010

Good evening

So, this is my journal, and you're all more than welcome to comment on anything I write here. I'm hoping to write some articles that others will find useful, especially about IT. I'll make a point of updating the journal once a week, so please check back; I hope you find what's written here useful.