Until the last decade, I believed that what one does with a computer is to load it with code early and often. Most programmers have learned their craft in such a world. We now live in a world swarming with viruses, trojans, rootkits, and worms, which makes such behavior foolish. No longer can we assume that loading code is an explicit action or that, barring bugs, what the code does is something we desire. Such assumptions may still apply today in a well-managed IT organization that maintains corporate servers. And perhaps some skilled amateurs can maintain their own PCs carefully enough. But such assumptions clearly do not apply to the average computer user, let alone the average smartphone or PDA user. And code isn't necessarily loaded explicitly at all. Often it enters our machines by stealth, even in careful IT shops.
In some embedded computers, all code comes pre-loaded in ROM and there is no provision for adding to or modifying that code. Otherwise, some code must be installed before a computer can do anything. With current practices and economics, we must also maintain and enhance the code, e.g., distribute and install patches, new versions, or new applications. Every one of those practices has proven to be an avenue for malware entry. In the not-too-distant future, many computers may be so cheap and ubiquitous that maintaining, patching, and enhancing their code is neither practical nor worthwhile. That is not the case today, but treating computers as disposable may be the only effective long-term solution. In some cases it is already easier and cheaper to throw away and replace a severely infected Windows PC than to attempt to clean it!
We need to adopt a new perspective based upon the principle that loading code, while essential for the operation of a single computer, is fundamentally threatening to any multicellular computing system of which it is a part. Until very recently, virtually all our practices for installing and maintaining software were legacies of the single-cell computing world. The programming priesthood would determine what code should be installed on important machines and the rest of the users were largely on their own. Then corporate networks found themselves compromised by end-user machines and we began to recognize the risks of loading code onto any networked computer that will ever be allowed within the firewall. Lots of stopgap measures have been tried including individual firewalls for each machine, encrypted wireless nets, anti-virus software, rules, rules, and more rules that have proven to be easily circumvented by "social engineering". Yet little has been done in the way of fundamentally rethinking the implications of allowing code to move from machine to machine. Meanwhile, each month brings discoveries of new security holes not covered by existing band-aids.
Biological metaphors such as immune systems have already played some role in detecting viral code. However, at least two differences between biology and computing suggest that biological immune system analogies must be used with care. First, cells in multicellular organisms come with a full complement of DNA. Any attempt to inject DNA or RNA into such cells is, by definition, an attack that is to be blocked. So cells have mechanisms to trigger their own suicide if such an attempt is detected. In contrast, most computing systems need to accept and execute code on a regular basis, e.g., JavaScript or Flash code in Web pages or macros in Microsoft Office documents. Second, cells need only recognize two kinds of code, DNA and RNA, whereas executable computing "code" comes in many forms. It masquerades as data inside email attachments, HTML documents, images, etc., or piggybacks onto USB devices that plug into PCs: thumb drives, mobile phones, GPS systems and the like.
One approach is to reexamine the whole notion of application privileges at the OS level. The familiar Unix “permission” structure – applications and users have predefined permission to read, write or execute files – was designed to protect a single-cell world where the data on disk was the most important aspect of the machine. From a multicellular computing perspective, communication traffic between computers, not management of a file system, will often be the most important task a computer does. Hence communication between machines (e.g., TCP/IP) and with the user (keyboard, mouse and screen) should be controlled with a revamped permission structure appropriate to modern attacks on networked computers.
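To make the idea of a communication-oriented permission structure a little more concrete, here is a minimal sketch in Python. It is not a description of any existing operating system facility. The names `AppPolicy`, `CommPermissions`, `grant_outbound`, and `may_connect`, as well as the example hosts, are hypothetical and chosen only for illustration. The sketch assumes a default-deny rule: an application may open a connection, read the keyboard, or capture the screen only if that permission has been granted explicitly, much as Unix grants read, write, or execute permission on files.

```python
from dataclasses import dataclass, field


@dataclass
class AppPolicy:
    """Hypothetical per-application communication permissions."""
    # Set of (host, port) pairs this application may connect to.
    allowed_outbound: set = field(default_factory=set)
    # User-facing channels are denied unless explicitly granted.
    may_read_keyboard: bool = False
    may_capture_screen: bool = False


class CommPermissions:
    """Default-deny table of communication permissions, keyed by application name."""

    def __init__(self) -> None:
        self._policies: dict[str, AppPolicy] = {}

    def policy_for(self, app: str) -> AppPolicy:
        # Create an empty (deny-everything) policy the first time an app is seen.
        return self._policies.setdefault(app, AppPolicy())

    def grant_outbound(self, app: str, host: str, port: int) -> None:
        self.policy_for(app).allowed_outbound.add((host, port))

    def may_connect(self, app: str, host: str, port: int) -> bool:
        # Unknown applications and unlisted destinations are both refused.
        policy = self._policies.get(app)
        return policy is not None and (host, port) in policy.allowed_outbound


perms = CommPermissions()
perms.grant_outbound("mail", "mail.example.com", 993)

print(perms.may_connect("mail", "mail.example.com", 993))   # granted explicitly
print(perms.may_connect("mail", "other.example.net", 80))   # never granted
print(perms.may_connect("screensaver", "mail.example.com", 993))  # unknown app
```

The point of the sketch is only the shape of the rule: the thing being protected is the conversation between machines and with the user, not the file on disk. Where such a check would actually be enforced (kernel, firewall, or hypervisor) is left open here.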
A second approach might be to improve our ability to discriminate code (in all its varieties) from data. Executable code, whether machine language, Java bytecodes, or JavaScript, among many others, will almost certainly have statistical or structural regularities that differ from those of non-executable data. The structure of the CPUs, compilers, and code interpreters that must run the code necessarily imposes regularities upon the code itself. Whether such regularities can be reliably detected in chunks of code of the size found in malware exploits is an open question, as is the question of whether the malware can be obfuscated sufficiently to prevent detection yet still execute properly. Ongoing research into such topics may well bear fruit.
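As a small illustration of what a "statistical regularity" might mean in practice, the following Python sketch computes two very crude statistics over a chunk of bytes: the Shannon entropy of the byte distribution and the fraction of printable ASCII bytes. These particular statistics are my own choice for the example, and the function names are hypothetical. Nothing here is claimed to separate code from data reliably; that remains the open question raised above. The sketch only shows the kind of measurement a detector might start from.

```python
import math
from collections import Counter


def byte_entropy(chunk: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0.0 to 8.0)."""
    if not chunk:
        return 0.0
    total = len(chunk)
    counts = Counter(chunk)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def printable_fraction(chunk: bytes) -> float:
    """Fraction of bytes that are printable ASCII or tab, newline, carriage return."""
    if not chunk:
        return 0.0
    printable = sum(
        1 for b in chunk if 0x20 <= b <= 0x7E or b in (0x09, 0x0A, 0x0D)
    )
    return printable / len(chunk)


def summarize(chunk: bytes) -> dict:
    """Collect the statistics for one chunk; a real detector would need far more."""
    return {
        "length": len(chunk),
        "entropy_bits_per_byte": byte_entropy(chunk),
        "printable_fraction": printable_fraction(chunk),
    }


if __name__ == "__main__":
    samples = {
        "plain text": b"Often it enters our machines by stealth.",
        "constant padding": b"\x00" * 64,
        "all byte values": bytes(range(256)),
    }
    for name, chunk in samples.items():
        print(name, summarize(chunk))
```

A constant run of bytes has zero entropy and a chunk containing every byte value equally often has the maximum of 8 bits per byte. Where machine code, bytecode, or script text falls between those extremes, and whether short or obfuscated exploits can be told apart at all, is exactly what such measurements would have to establish.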
What, then, can we say about computing paradigms that fundamentally require the transmission of code? Robots, whether on the surface of Mars, in orbit around Jupiter, or on a battlefield, may require code updates on the fly. Computer scientists have spent a decade or more exploring mobile agents that are based upon moving code from one machine to another. And a small industry is growing around the notion of cycle-scavenging, light-weight P2P grids for distributed computing that require the transmission of code. There is probably a different answer for each case. Robots that accept code are simply not going to be connected to the open Internet (except perhaps as toy examples for the enjoyment of netizens). Still, insufficient security protection in communication with battlefield robots could allow them to be hijacked by the enemy, even though a great deal of effort undoubtedly has been expended to prevent such occurrences. Mobile agent systems can perhaps be sufficiently guarded if they are used with specialized hardware and operating systems designed specifically for mobile agents rather than general-purpose (and very vulnerable) Windows clients. P2P cycle-scavenging systems might also be protected sufficiently, especially since they tend to be aimed at computational problems with little I/O or communication with the outside world. As long as they are designed to get out of the way as soon as the user needs the CPU, they may not represent much risk. Nonetheless, the whole point of such systems is to scavenge compute cycles from millions of Windows machines, which is an inherently dangerous environment. Inasmuch as CPU cycles are getting increasingly cheap, and light-weight grids are useful only for a narrow range of “embarrassingly parallel” problems, one wonders why taking the risk is worthwhile.
Contact: sburbeck at mindspring.com
Last revised 6/12/2012