Chapter 5 Open Source rather than proprietary

Open source languages such as Python and R are increasing in popularity across government. One advantage of using these tools is that we can reduce the number of steps in which data has to be moved from one program (or format) into another. This is in line with the principle of reproducibility given in the guidance on producing quality analysis for government (the Aqua Book): the entire process can be represented as a single step in code, greatly reducing the likelihood of manual transcription errors.
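To make the idea of "a single step in code" concrete, here is a minimal sketch in Python using only the standard library. The data, the column names (`sector`, `year`, `value`) and the function name `run_pipeline` are all hypothetical and chosen purely for illustration; they are not taken from any real project. The point is only that reading, summarising and writing the output happen inside one function call, with no copying and pasting between programs.

```python
import csv
import io
from collections import defaultdict

# Hypothetical raw data, standing in for an extract from a data store.
RAW_CSV = """sector,year,value
film,2015,10.0
film,2016,12.5
music,2015,4.0
music,2016,5.0
"""


def run_pipeline(raw_text: str) -> str:
    """Read raw CSV text, total `value` by sector, and return a CSV report."""
    # Step 1: read the raw data and accumulate a total per sector.
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(raw_text)):
        totals[row["sector"]] += float(row["value"])

    # Step 2: write the summary table, sorted so the output is deterministic.
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["sector", "total"])
    for sector in sorted(totals):
        writer.writerow([sector, totals[sector]])
    return out.getvalue()


# The whole process, from raw data to report, is one call.
report = run_pipeline(RAW_CSV)
print(report)
```

Because no value is ever retyped by hand, running the function again on the same input gives the same report, and a change to the input only requires rerunning the code.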

Chapter 3 discussed why spreadsheets are dangerous. Moving away from proprietary software towards open source has the additional benefit of being more compatible with tried and tested software development tools and techniques (as well as the obvious benefit of no longer needing to pay for a licence).

In GDS’s project with DCMS described in Chapter 4, we decided to use the R language; however, we could equally have chosen Python or another language entirely. The techniques we outline in this book are language agnostic.