Creating Your Own PHPUnit @requires Annotations

PHPUnit offers a feature that lets you skip a test when certain requirements aren’t met. This can be done in two ways:

  1. You can manually check if the requirements are met, and then skip the test with $this->markTestSkipped() if they are not.
  2. In some cases, you can use the @requires annotation, and the test will be skipped automatically when the requirements aren’t met.

Using the @requires annotation is nicer, but PHPUnit only has so many options built in. Sometimes you have custom requirements that can’t really be checked reliably with any of the built-in options. An example is when you need some tests you’ve written for a WordPress plugin to run only when WordPress’s multisite feature is enabled on the test site. In my tests, I find myself needing this a lot. So I’ve been writing this over and over:

if ( ! is_multisite() ) {
     $this->markTestSkipped( 'Multisite must be enabled.' );
}

But just yesterday I realized that this was silly, and that I could easily add my own custom @requires annotation. So I did. Here is the code:

	/**
	 * Skip the test when a custom @requires annotation isn't satisfied.
	 */
	protected function checkRequirements() {

		// Let PHPUnit handle its built-in @requires options first.
		parent::checkRequirements();

		$annotations = $this->getAnnotations();

		// Annotations may be set on the test class or on the test method.
		foreach ( array( 'class', 'method' ) as $depth ) {

			if ( empty( $annotations[ $depth ]['requires'] ) ) {
				continue;
			}

			// Flip the list so each requirement can be checked with isset().
			$requires = array_flip( $annotations[ $depth ]['requires'] );

			if ( isset( $requires['WordPress multisite'] ) && ! is_multisite() ) {
				$this->markTestSkipped( 'Multisite must be enabled.' );
			} elseif ( isset( $requires['WordPress !multisite'] ) && is_multisite() ) {
				$this->markTestSkipped( 'Multisite must not be enabled.' );
			}
		}
	}

You just need to add that method to your base test case class, and you will then be able to use @requires WordPress multisite instead of messing with markTestSkipped() all the time. For tests that should only run when multisite isn’t enabled, you can use @requires WordPress !multisite.

You could easily add more options for any other requirements your tests commonly have.

Fixing a Chronically Corrupt Git Index File

A while back I was working on a project where the website was under version control using git. Whenever I would try to run git status before checking out new changes, I would always get an error:

$ git status
error: bad index file sha1 signature 
fatal: index file corrupt 
fatal: 'git status --porcelain' failed

A quick internet search brought up what I thought was the solution:

$ rm -f .git/index
$ git reset

I say I thought this was the solution. But it wasn’t. I’d try to run git status after this, and the index file would still be corrupt. I struggled with this for a long time, until I finally realized what the problem was. It turned out that there were two folders in the git repo that contained .git directories—in other words, what we had was nested git repositories. This wasn’t intentional, and I had no idea that these .git directories were there. Once they were deleted, everything worked again.

TL;DR

Check to make sure that none of the non-submodule directories under your project have .git directories in them.
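One way to hunt for stray repositories is with find. This is only a sketch: find_nested_git_dirs is a helper name I made up, and it assumes a find that supports -mindepth (GNU and BSD find both do). Submodules set up by recent versions of git normally have a .git file rather than a .git directory, so they shouldn't show up in the output.

```shell
#!/bin/sh
# Hypothetical helper: list .git directories nested below the project root.
find_nested_git_dirs() {
    root="${1:-.}"
    # -mindepth 2 skips the project's own top-level .git directory.
    # -type d matches directories only, so a submodule's .git file is ignored.
    # -prune keeps find from descending into each .git directory it reports.
    find "$root" -mindepth 2 -type d -name .git -prune -print
}
```

Run it from the top of the project as find_nested_git_dirs . and look over anything it prints before deleting it.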

The Living Computer

I’m a programmer, but I’m also a nature lover, and I enjoy learning more about all of the sciences, especially biology. Recently, I’ve come to realize how much programming and biology have in common.

The basic building block of life is the cell. Actually, cells don’t have to just be building blocks. Single-celled creatures are just one single cell. And yet they have to confront all of the same basic challenges to life that you and I do.

Are Cells Computers?

Are cells living computers? No. They are so much more than that. But, just like you and I have a brain that has amazing computational power, cells have some aspects that are computer-like as well. Cells don’t have brains of course, or anything analogous to a nervous system. But they do have something else, an aspect that we don’t even understand yet in regard to the brain. They have software. Actually, we can go further than that. Cells have a complete OS.

There are several programming languages involved; one of the most well known is DNA. “But wait, isn’t DNA for storing information?” Thanks for asking! Actually, yes, you are correct, DNA is used by the cell to store huge volumes of information, which includes the blueprint not only for the cell’s structure, but also for its development. In your cells’ nuclei is all of the information needed to construct and maintain your body. How much is this? Over 3.2 billion base-pairs of DNA.

The Biotic Byte

Let’s convert that number to something more familiar. Instead of base-pairs, we could use bytes. Let’s take a minute to talk about bytes, just to show that this is a valid comparison. Bytes are actually groups of smaller units, called bits. Bits are binary; they can only be one of two things, a zero or a one. A byte is a string of exactly 8 bits. There are 2^8 or 256 different possible combinations of 8 bits, and so there are 256 unique bytes.

A strand of DNA is made up of base-pairs. These are in groups of three, called codons. We can think of these codons like bytes, and like bytes they are also made up of smaller units, the base-pairs. Unlike bits, which come in only two types, DNA is made of bases that come in 4 different letters, A, C, T, and G. That means that twice as much information can be stored in a single letter as can be represented by a bit. So 4 letters of DNA can store the same amount of information as one byte.

Now that we know how to convert base-pairs to bytes, we can do the math. We have 3.2 billion base-pairs or letters, so to get the number of bytes we just divide by 4: 3.2 billion / 4 = 0.8 billion. So the size of the human genome is approximately 800 million bytes, or 763 megabytes.
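As a quick sanity check on that arithmetic, here is a minimal shell sketch. The function name genome_size is just something I made up for illustration. It assumes 2 bits per base-pair, as worked out above, and counts a megabyte as 1024 × 1024 bytes, which is where the 763 comes from.

```shell
#!/bin/sh
# Hypothetical helper: convert a base-pair count to bytes and megabytes.
genome_size() {
    base_pairs="${1:-3200000000}"
    awk -v bp="$base_pairs" 'BEGIN {
        # Each base is one of 4 letters, so it carries 2 bits; 8 bits per byte.
        bytes = bp * 2 / 8
        printf "%d bytes, %.0f MiB\n", bytes, bytes / (1024 * 1024)
    }'
}
```

Calling genome_size with no arguments uses the 3.2 billion figure and prints 800000000 bytes, 763 MiB.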

Now think of this: Each cell in your body has two copies of the genome (except for red blood cells, which have none). And it’s estimated that there are 37.2 trillion cells in the average adult human body. Even if we assume that 17 trillion of these are red blood cells, that means that your body contains 23 trillion gigabytes of DNA. That could also be written as 22 million petabytes, or 21 zettabytes. To put this in perspective, the world’s total effective two-way telecommunications capacity was “only” 65,000 petabytes per day in 2007. At that rate, it would take almost a whole year to transmit all of the information encoded on all of the DNA in your body.

A year. And yet all of that information fits inside of you. Despite the fact that the strands of DNA in a single cell would stretch out to about 2 m (6 ft) long if laid end to end, in the nucleus they are packed into a whopping diameter of just 6-10 millionths of a meter. That means all of the DNA in your body could fit into a 22 cm (8.5 in) cube. Let’s compare that size to how much room it would take to store the same amount of information on computers. Let’s imagine we put it all onto 1 terabyte hard drives that measure 3 in by 4 in by 0.5 in. They would make a cube about 424 ft (130 m) on a side. A building of that size would have a volume of 76 million cu ft, which would make it the eighth largest building in the world.

Not Just For Information Storage

DNA is obviously an extremely efficient medium of information storage. We’ve looked at it from the angle of just how much your body contains. But we can also look at it from the other angle. A single copy of the entire human genome takes up only 0.8 gigabytes. Compare that with the raw size of OS X Yosemite, which is 5.18 gigabytes. Windows 8 requires about 6–8 gigabytes. In other words, modern computer operating systems take almost 10 times as much code as it takes to create and run your body.

DNA is like a computer program but far, far more advanced than any software ever created.—Bill Gates, founder of Microsoft, in The Road Ahead

The really amazing thing about DNA (and this is what I started out to say a while back) is that it isn’t just a blueprint. Most of it doesn’t encode genes. Not even close. The protein-coding portion takes up less than 2% of your DNA, or about 15 megabytes. So what does the rest of the DNA do? Lots of things, actually. It does so much, in fact, that we aren’t even beginning to understand it all. But we do know enough to know that DNA is far more than a blueprint. Is it a computer program? Sort of. It really goes beyond that, but that’s the closest thing to it we’ve ever created.

Beyond Programming

As a programmer, it is amazing how much DNA is like a programming language. However, it is even more amazing how much DNA goes beyond modern programming.

How can DNA program for so much in so little space? We can’t yet fully answer that question, but we’re starting to find clues. One is that DNA isn’t just one programming language. It is several, all at once. The same DNA strand can carry several different codes, in both directions. I can’t imagine trying to write code that has to do one thing when read forwards and another when read backwards. Most of our languages couldn’t possibly do that, because of their syntax. They are inherently one-way.

Take PHP for example. Its syntax requires the code to be interpreted from left to right. It’s not just that you couldn’t interpret it backwards as PHP, but it would be really difficult even to create a language with inverted PHP syntax. The same goes for JavaScript.

Of course, some languages are simpler (like BASIC), and could potentially work forwards and backwards. These languages are also far less human-readable. They are already hard for us to grok as it is, so how in the world would we ever be able to write meaningful two-way code like that? It might seem like it would be easy to do, if we just wrote the one-way code and used computer algorithms to compress it into two-way code. But that’s far easier said than done.

The Modular Genome

Among programming best practices is that of writing modular code. Instead of creating one huge, garbled, interconnected whole, a project can be split into discrete parts that are interoperable.

While I was contemplating writing this post, I happened to come across an article that revealed that some genomes are like this. Actually, all genomes are modular, in the sense that they are made up of discrete genes. But what has been discovered in this case is something different. The DNA isn’t just modular, it is actually split into discrete packages.

The genome of the unicellular ciliate Stylonychia lemnae is really astounding. These creatures actually maintain two copies of their genome in separate nuclei. In one nucleus, called the micronucleus, all of the DNA is stored in a single chromosome. In the other nucleus the DNA is split into thousands of different chromosomes. More than 16 thousand, to be exact. This type of nucleus is much larger than the other, and is called the macronucleus.

The moment I read this, I thought of packagist.org. Thousands of different discrete modules maintained in a single repository. Actually though, it is much more like the plugin repository on WordPress.org, which isn’t just a listing directory, but actually holds all of the code for the 37,000+ plugins in a single SVN repository.

The fascinating thing is that the macronucleus is about 10 times larger than the micronucleus. In effect, this means that the copy of the genome which is used in genetic transmission is kept under 10x compression. 10x! It is amazing that the genome can be compressed this much, and yet still be usable for genetic recombination.

Compile-time Optimization

Languages like PHP get compiled before they run: PHP into opcodes for its virtual machine, other languages directly into machine code. Some compilers have features that modify the compiled code in various ways to try to improve its performance. This is called compile-time optimization. It’s usually not trivial to do this, because the compiler is risking the possibility of introducing a bug instead of an optimization. It can also mean compilation itself is much less performant, because the compiler has to run sophisticated algorithms over the code.

In the genome, we might think of the transcription of DNA to RNA as compilation. It’s been known for some time that the nucleus sometimes makes modifications to the RNA after transcription. That’s kind of like compile-time optimization. But in fact, it is much more than that. Sometimes the changes are very simple, and affect just a single base. It’s been recently discovered that this type of RNA editing may be very common. But it has also been known for some time that much more complex forms of RNA editing occur as well. This is called alternative splicing, and it involves taking a gene and splitting it into its modular components. These are then rearranged from their usual configuration, with some being doubled or removed. Then they might be combined with pieces of a completely different gene.

This goes beyond our conventional compile-time optimizations. It’d be like compiling two different components of a program, breaking them down into smaller pieces, and rearranging them to create something entirely new.

Living Programmer

As a programmer, all of this is fascinating. I can sit here and write computer programs because of the trillions of programs being run inside of my body’s cells. This naturally leads us to a question: where did those programs come from? Who wrote them?

You might answer, “I don’t know.” But a staunch evolutionist will tell you that is the wrong answer. (Unless you catch them off guard.) They will tell you no one wrote the program. As a programmer, that’s unbelievable. As a programmer, I know that programs don’t just happen, they take intelligence. And just being “smart” isn’t enough: you have to have skill too, you have to know the language. Even with high intelligence and superb skill, how often do we get it right the first time? How often do we have to do lots of testing to make sure the thing really works?

Yet evolutionists would have us believe that the unimaginable complexity of the genome happened by accident, that a programming language just created itself, and that, over time, a program was shaped through typos in the code.

Of course, as a programmer, I know that is ludicrous. One typo or mistake can easily kill a program. Even if a typo isn’t syntactically invalid, it can still cause the program to stop working properly. And even if that doesn’t happen, it’s still highly probable that a small bug has been introduced by it—and those small bugs are the real killers. You can argue that natural selection will, in effect, “weed out” those really bad bugs. And that’s true (though the reproduction rate isn’t high enough to sustain that level of mutation for millions of years). But you can’t say that about the small bugs. They’re little changes that don’t really seem to have much effect—most of the time. Instead, they’ll build up in the population until it is driven to the point of extinction.

Just imagine a program you’ve written being eroded this way over time. Before long, it would cease to do anything useful at all.

As a programmer, it is obvious: someone programmed me. And not just anyone either. Someone who has unbelievable intelligence, skill, and artistry. Someone who can build something infinitely more complex than Microsoft Windows, using less code, and even have that thing reproduce itself. Do you know anyone like that? It clearly wasn’t one of us. It clearly wasn’t any other form of biological life either (from here or elsewhere), because all life is based on programs. All life requires a Programmer.

As one living programmer, let me ask you: have you met the Programmer of all life? Have you met the living Programmer?

Travis CI, Composer, and PHP 5.2

Once I’ve written some PHP unit tests for my plugins, I like to make sure I put them to good use. I develop the plugins on GitHub, so with the right tools, it’s easy to set up Travis CI to run my tests. This also lets me run the tests against all of the PHP versions I need, without the hassle of trying to do this locally.

The only problem is that WordPress still supports PHP 5.2, and while I want to run my tests against that version, I’m using Composer to install some of my dev dependencies. And as you probably know, Composer requires PHP 5.3. So I searched around the internet to see if anyone had a solution to this dilemma. I did find one project on GitHub, but it requires you to have a separate config file for PHP 5.2, and doesn’t appear to be maintained at this time.

What I was really hoping for was a way to run composer using PHP 5.3 even when the tests are running on 5.2, since all of the PHP versions are installed on the Travis test box. I couldn’t find any helpful information about switching PHP versions on Travis, but with a little research into phpenv (which Travis uses to manage the PHP environment), I was able to figure something out.

It’s actually as easy as this:

phpenv global 5.3
composer install
phpenv global "$TRAVIS_PHP_VERSION"

Just drop that into the before_install section of your .travis.yml, and you’re ready to go!
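If you’d rather keep the logic in a script, the same three steps could be wrapped in a small shell function. This is only a sketch under my own assumptions: the name composer_install_with_php53 is made up, and the only things it relies on are the phpenv and composer commands and the TRAVIS_PHP_VERSION variable already used above. The one thing it adds is that the PHP version gets switched back even when composer fails, and the failure is still reported.

```shell
#!/bin/sh
# Hypothetical wrapper around the three commands shown above.
composer_install_with_php53() {
    # Switch to PHP 5.3, which Composer needs; give up if that fails.
    phpenv global 5.3 || return 1

    # Remember Composer's exit status instead of stopping right away.
    status=0
    composer install || status=$?

    # Always switch back to the PHP version the tests should run on.
    phpenv global "$TRAVIS_PHP_VERSION"

    return "$status"
}
```

You would define the function in a script, source it, and call it from before_install instead of listing the three commands one by one.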

Improving Plugin Security One Day at a Time

As a security-conscious developer, I like to inspect a WordPress plugin’s source code before I install it on one of my sites. I didn’t always do this. But ever since the first time I did (and found a vulnerability), I’ve believed strongly in its importance.

I also keep tabs on WordPress-related security reports on sites like Packet Storm, Secunia, Exploit Database, and Bugtraq. I use IFTTT for this:

IFTTT Recipe: Email me WordPress security reports from Packet Storm

IFTTT Recipe: Email me WordPress security reports on Secunia

IFTTT Recipe: Email me WordPress security reports on Exploit DB

IFTTT Recipe: Email me WordPress security reports from Bugtraq

Over the last year I’ve come to realize something: almost every plugin has vulnerabilities. Okay, all code has vulnerabilities. But almost every plugin has glaringly obvious vulnerabilities, or at least ones that can be found without a great deal of effort.

That seems a little bit scary. Of course, many of these aren’t particularly serious. But they demonstrate that the people creating the plugins often don’t understand basic WordPress security.

I know I’m not the only one who realizes this. Probably, most experienced WordPress developers come to this realization sooner or later. But just realizing it and thinking, “I wish it weren’t so,” isn’t going to make things any better. So I’ve decided to do something about it. I’m going to improve plugin security one day at a time. I’m going to try to do something, every day, that will make WordPress plugins more secure.

How will I do this? In many ways. I said that I’m not the only one who understands the situation, and I’m not the only one who’s doing things about it either. The folks on the plugin review team for WordPress.org try to catch the vulnerabilities when a plugin is first submitted to the repo. They also handle security reports and make sure they make it to the plugin authors. There are also the great folks on the WordPress docs team contributing to the plugin developer handbook. Hopefully the security-related material in there will help to educate the next generation of plugin developers about these issues right from the start.

One of the greatest things I can do is to try to help educate plugin devs about security. I’ll also try to make sure that reports of vulnerabilities make their way to the plugin developer. And I’ll continue to review plugins that I use, and report vulnerabilities that I find. I might then move on to investigating other popular plugins as well. I have even pondered creating a PHP source code security scanner, but that would be quite a project. (There are many of these out there, but none of them are intelligent enough for me.)

Regardless of how, I want to try to do a little something every day to improve WordPress plugin security. If just a few folks did this, how different might things look in a few years? We’ll just have to wait and see.