This post was written by Manuel Pais, co-author of Team Guide to Software Releasability.
Context
We recently wrote an article on DZone about bringing security into the delivery lifecycle where we covered both the people and technical perspectives.
To demo a deployment pipeline setup including automated security checks, we decided to use a VM created with the popular Vagrant tool. Docker was another possibility (and would have saved us the 2 minutes Vagrant takes to boot the base VM), but we felt Vagrant was still more widespread and easier to install for non-tech-savvy clients, especially under Windows via an MSI (although “Dockerization” is reaching the Windows space too).
We picked an OWASP Rails demo app called RailsGoat to illustrate the kinds of security checks we could integrate into the pipeline. We expected the build stage for this app to be fairly fast and painless, giving us a realistic yet simple pipeline.
What we expected to be a quick setup and (shell) provisioning process turned out to be time-consuming and fragile.
In this blog post we explore the issues we encountered and how we streamlined the provisioning process to reduce its duration and increase repeatability.
Caveat: all measurements mentioned in this post are merely indicative and are by no means statistically meaningful (the law of coworking says your internet connection stability is inversely proportional to your benchmark reliability).
Starting State
The diagram below illustrates the sequence and example duration of each command in the shell provisioning, before we looked at how to reduce the provisioning time.
The edges show the time spent by each command. Those that consumed the most time are highlighted in red.
Note: several of the commands required root access; “sudo” is omitted here to simplify the graph.
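Per-command timings like the ones quoted in this post can be collected with a small wrapper around each provisioning step. The following is a minimal sketch, not the script we used: the function name timed is hypothetical, and it reports whole seconds of wall-clock time via date rather than the real/user/sys breakdown that the time keyword gives.

```shell
# Hypothetical helper: echo the command, run it, then report wall-clock seconds.
timed() {
  echo "+++ $*"
  start=$(date +%s)
  status=0
  "$@" || status=$?          # keep the exit code without aborting the wrapper
  end=$(date +%s)
  echo "elapsed $((end - start))s"
  return "$status"
}

# Example: time a harmless command.
timed sleep 1
```

Wrapping every step this way keeps the log format uniform, which makes it easier to spot the slowest commands at a glance.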
Lesson #1: Balance dependencies on latest versions vs package availability
The RailsGoat maintainers do a good job of using the latest stable Ruby version. When we forked the app in early December last year, the Ruby version was 2.2.3. Unfortunately there was no Debian package for Ruby 2.2.3, so RVM (Ruby Version Manager) downloaded the C sources and compiled the Ruby runtime from scratch.
Installing Ruby 2.2.3 and the libraries it requires took more than a quarter of the total provisioning time (about 9.5 of 35 minutes):
+++ curl -L https://get.rvm.io | bash -s stable --autolibs=2 --ruby=2.2.3
real 8m15.049s
user 5m6.648s
sys 0m46.736s
+++ sudo apt-get install -y ruby-dev
real 0m32.177s
user 0m2.876s
sys 0m4.672s
+++ sudo apt-get install -y build-essential bison openssl libreadline6 libreadline6-dev curl git-core zlib1g zlib1g-dev libssl-dev libyaml-dev libxml2-dev autoconf libc6-dev ncurses-dev automake libtool
real 0m40.855s
user 0m4.016s
sys 0m10.028s
Solution
We decided to downgrade Ruby to the closest version for which a Debian package was available, which turned out to be 2.2.1. As long as the demo app still installed and launched, we’d be fine, and we hoped to cut the provisioning time significantly.
In fact, this shaved about six and a half minutes (nearly 20%) off the provisioning time.
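Picking “the closest version with a package” can be scripted once you have a list of packaged versions. The sketch below is illustrative only: closest_available is a hypothetical helper, the version list passed to it is made up for the example, and it assumes a sort that supports -V (version ordering), as GNU coreutils does.

```shell
# Hypothetical helper: print the highest available version that is not newer
# than the wanted one. Usage: closest_available WANTED AVAILABLE...
closest_available() {
  wanted="$1"; shift
  best=""
  for v in "$@"; do
    # v is acceptable when the wanted version sorts last (or they are equal)
    highest=$(printf '%s\n%s\n' "$v" "$wanted" | sort -V | tail -n 1)
    if [ "$highest" = "$wanted" ]; then
      # keep v if nothing is chosen yet, or if it is newer than the current best
      if [ -z "$best" ] || [ "$(printf '%s\n%s\n' "$best" "$v" | sort -V | tail -n 1)" = "$v" ]; then
        best="$v"
      fi
    fi
  done
  [ -n "$best" ] && printf '%s\n' "$best"
}

# Example with an illustrative (made-up) list of packaged versions.
closest_available 2.2.3 2.1.5 2.2.1 2.2.0
```

Whether a downgrade like this is acceptable still has to be checked by hand: the app needs to install and launch on the older runtime.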
Lesson #2: Don’t think of commands as atomic
Seasoned Ruby developers know that gem documentation is installed by default and that this takes a long time. A really long time.
Installing the gauntlt gem (one of the security testing tools we used for the article) and all its gem dependencies alone took nearly half of the total provisioning time!
+++ gem install gauntlt
real 15m37.152s
user 10m13.332s
sys 0m31.460s
Solution
We knew we’d save some time by including the “--no-ri --no-rdoc” options in the gem install command. But we were stunned when we found out how much. It turns out the documentation install was taking up roughly two thirds (more than 10 minutes) of the gem installation time!
The main lesson here was that while we focused first on “fixing” the Ruby version issue (lesson #1), it turned out the biggest gain was as simple as adding a couple of options.
You want to reduce the total time as much as possible within reasonable effort, but it’s definitely worthwhile to first analyze in more depth where the time is being consumed. Don’t look at each command as atomic; instead, check for the multiple discrete actions that each command might be executing.
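Rather than remembering the options on every gem install, they can be made the default through a gemrc file. This is a minimal sketch, assuming a RubyGems version that still accepts --no-ri and --no-rdoc (newer RubyGems releases replace both with --no-document); write_gemrc is a hypothetical helper name.

```shell
# Hypothetical helper: write a gemrc that skips documentation for every gem command.
# Defaults to the current user's ~/.gemrc; pass another path to override.
write_gemrc() {
  target="${1:-$HOME/.gemrc}"
  printf 'gem: --no-ri --no-rdoc\n' > "$target"
}

# Usage during provisioning (not executed here):
#   write_gemrc
#   gem install gauntlt
```

The one-off form is simply gem install gauntlt --no-ri --no-rdoc; the gemrc variant also covers gems installed later by other steps.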
Lesson #3: Avoid downloading unnecessary metadata, even with dynamic dependencies
After we addressed those major time drains, we looked at other places where we could reduce time.
sqlmap is another command-line security tool we used in the demo. Initially we followed their instructions to clone their GitHub repo, as this would ensure we were installing the latest version of sqlmap during provisioning (we wanted to find any problems with new versions as soon as possible, so we could update the demo):
+++ git clone https://github.com/sqlmapproject/sqlmap.git sqlmap-dev
real 0m35.306s
user 0m4.696s
sys 0m8.512s
The execution above is reasonably fast but there’s still a lot of repo metadata being unnecessarily downloaded.
Solution
Although cloning the repo of sqlmap (a Python tool) worked fine out of the box, we could improve the time considerably by downloading a single tarball file instead:
+++ wget -q https://github.com/sqlmapproject/sqlmap/tarball/master ; tar -xf master ; mv sqlmapproject-* sqlmap
real 0m7.778s
user 0m0.324s
sys 0m0.500s
The single file download took less than a quarter of the time (about 8 seconds versus 35) and still gave us the latest version at any moment, rather than a hardcoded sqlmap version.
Caveat: this step is highly dependent on GitHub’s availability and latency, which might introduce errors or delays. Here we were forced to trade repeatability of the process for access to the latest tool version and modifications.
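The unpack step can be made a little more repeatable by extracting straight into a fixed directory instead of renaming whatever sqlmapproject-* directory the tarball produces. A minimal sketch, assuming a tar that supports --strip-components (GNU tar and bsdtar do); unpack_into and the local file name sqlmap.tar.gz are hypothetical.

```shell
# Hypothetical helper: unpack a tarball that contains a single top-level
# directory into DEST, dropping that top-level directory name.
unpack_into() {
  tarball="$1"
  dest="$2"
  mkdir -p "$dest"
  tar -xf "$tarball" -C "$dest" --strip-components=1
}

# Usage during provisioning (needs network access, so not executed here):
#   wget -q -O sqlmap.tar.gz https://github.com/sqlmapproject/sqlmap/tarball/master
#   unpack_into sqlmap.tar.gz sqlmap
```

This avoids the glob in the mv command, which can misbehave if a sqlmap directory already exists from an earlier run. Another option we did not measure is a shallow clone (git clone --depth 1), which also skips most of the repo history.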
End State
Let’s revisit the diagram illustrating the sequence and example duration of each command in the shell provisioning, this time after we implemented the solutions described above.
As you can see, the remaining time bottlenecks (besides the gauntlt gem install, which we reduced significantly) are those we did not or could not address, namely the GoCD server installation and the installation of the libraries Ruby requires.
Conclusion
Having a clear idea of our needs, in particular which dependencies can be static (Ruby in our case) and which need to be dynamic (the security tools), allowed us to reduce the provisioning time roughly by half with three (rather) simple changes.
We learned that major gains are not always those we expect and that pre-analyzing time consumption (not thinking of commands as atomic units of execution and understanding which kind of activities are consuming most time) helps us improve faster.
Have a look here if you’re curious to know what the provisioning shell script looks like.