{"id":22,"date":"2016-07-29T05:20:17","date_gmt":"2016-07-29T05:20:17","guid":{"rendered":"http:\/\/blog.orkan-tief.de\/?p=22"},"modified":"2022-09-27T10:08:27","modified_gmt":"2022-09-27T10:08:27","slug":"unit-testing-in-scientific-programming","status":"publish","type":"post","link":"http:\/\/www.martin-rdz.de\/index.php\/2016\/07\/29\/unit-testing-in-scientific-programming\/","title":{"rendered":"Unit Tests in Scientific Programming"},"content":{"rendered":"<p>Testing your code seems like a sane idea, at least from a programmer&#8217;s perspective [e.g. <a href=\"http:\/\/www.bbc.co.uk\/academy\/technology\/article\/art20150223103348419\">1<\/a>]. As a scientist crunching numbers, in the majority of cases you hack together a small script and hope that it runs as intended. If strange behavior\u00a0(aka a bug) occurs while processing a gigabyte-sized dataset, you give it a quick fix, pray for the best and start the computation all over. After nights of debugging it becomes clear that this is not the best approach (or earlier &#8211; I learned it the hard way).<\/p>\n<p>Ensuring correct execution is rather easy for small pieces of code. It is obvious what the following (exaggeratedly simple)\u00a0function does:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"wpcustom\">def larger_zero(number): \r\n    return number &gt; 0<\/pre>\n<p>Problems develop with increasing function length or when multiple functions\/objects interact with each other. In the real world you are more likely to have a ~1000-lines-of-code data analysis script. It reads from different files, does computations and plotting, and finally writes the results out. 
How do you ensure correct execution in this case?<\/p>\n<p>When your scientific software is modularized (which it ideally is, but that topic is too broad to discuss here), testing seems pretty straightforward:<\/p>\n<ol>\n<li>start at the bottom: test independent functions on their own (the core of unit testing)<\/li>\n<li>make sure that the communication between functions works correctly<\/li>\n<li>check whether the whole script does what you intended<\/li>\n<\/ol>\n<p>Some people argue that you should write the tests <strong>before<\/strong> you start to write your software. In science, where you might not know at the beginning what your program should look like at the end, this is not the perfect solution. My current approach looks roughly like this (I&#8217;m not claiming that there is no room for improvement):<\/p>\n<ul>\n<li>play with the problem, get an idea about the structure of your program<\/li>\n<li>draft (and maybe write) the tests<\/li>\n<li>write the actual software, unit test regularly<\/li>\n<li>if you encounter a bug, write a test reproducing this bug, then debug<\/li>\n<li>test on a larger scale (for example with a test dataset &#8211; if there is one)<\/li>\n<li>give it a try with the real data (remember: bug -&gt; test to reproduce -&gt; fix the bug)<\/li>\n<\/ul>\n<p>But enough theory, let&#8217;s look at a simple example. As I am still a novice in the field of (unit) testing, don&#8217;t expect too much \ud83d\ude09 Python is my language of choice for scientific tasks [<a href=\"http:\/\/hplgit.github.io\/primer.html\/doc\/pub\/half\/book.pdf\">2<\/a>]. Aside from unittest (included in the Python standard library), pytest [<a href=\"http:\/\/docs.pytest.org\/en\/latest\/\">3<\/a>] seems like a good library to get started with unit testing.\u00a0Installation is straightforward using pip. To illustrate the case I will use a (slightly simplified) piece of code written during my master&#8217;s thesis. 
The\u00a0<code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">quality_mask()<\/code>\u00a0function is used to categorize measurements based on quality criteria.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"wpcustom\">from __future__ import print_function\r\nimport numpy as np\r\n\r\ndef quality_mask(data):\r\n    \"\"\"Return an integer quality flag (0-6) for one measurement dict.\"\"\"\r\n\r\n    if data['Zcr'] &gt; -25. and (data['LDRcr'] &lt; -14. or np.isnan(data['LDRcr'])):\r\n        print('# particle influenced')\r\n        flag = 1\r\n\r\n        if data['momentswp'][0][0] &lt; -29.:\r\n            #print('# low reflectivity')\r\n            flag = 2\r\n\r\n        if data['momentswp'][0][1] &lt; data['vcr']:\r\n            # left of cloud radar and above convective boundary layer\r\n            flag = 6\r\n\r\n            if (len(data['momentswp']) &gt; 1\r\n                and data['momentswp'][1][1] &gt; data['vcr']\r\n                and data['momentswp'][1][0] &gt; -29.):\r\n                flag = 4\r\n\r\n        if len(data['momentswp']) &gt; 6 \\\r\n                and data['momentswp'][4][0] &gt; -30.:\r\n            # print('# too many peaks (melting layer)')\r\n            flag = 5\r\n\r\n        #if snr_main_peak &lt; 20.:\r\n        if data['SNRwp'] &lt; 15.:\r\n            # print('# low snr')\r\n            flag = 3\r\n\r\n    elif data['LDRcr'] &gt; -14. and data['Zcr'] &gt; 0.:\r\n        # melting layer\r\n        flag = 5\r\n        \r\n    else:\r\n        # not particle influenced\r\n        flag = 0\r\n        \r\n    return flag<\/pre>\n<p>As there are several input parameters and a lot of if-conditions, you want to make sure that the classification works as expected. This calls for a unit test. It can live in a separate file to keep things clean. 
Each test has its own data dictionary and checks for the correct output.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-title=\"test_calculate.py\" data-enlighter-theme=\"wpcustom\">from __future__ import print_function\r\nimport numpy as np\r\n\r\ndef quality_mask(data):\r\n    \"\"\" \"\"\"\r\n  \r\n    if data['Zcr'] &gt; -25. and (data['LDRcr'] &lt; - 14 or np.isnan(data['LDRcr'])):\r\n        print('# particle influenced')\r\n        flag = 1\r\n\r\n        if data['momentswp'][0][0] &lt; -29.:\r\n            #print('# low reflectivity')\r\n            flag = 2\r\n\r\n        if data['momentswp'][0][1] &lt; data['vcr']:\r\n            # left of cloud radar and above convective boundary layer\r\n            flag = 6\r\n\r\n            if (len(data['momentswp']) &gt; 1\r\n                and data['momentswp'][1][1] &gt; data['vcr']\r\n                and data['momentswp'][1][0] &gt; -29.):\r\n                flag = 4\r\n\r\n        if len(data['momentswp']) &gt; 6 \\\r\n                and data['momentswp'][4][0] &gt; -30.:\r\n            # print('# too many peaks (melting layer)')\r\n            flag = 5\r\n\r\n        #if snr_main_peak &lt; 20.:\r\n        if data['SNRwp'] &lt; 15.:\r\n            # print('# low snr')\r\n            flag = 3\r\n\r\n    elif data['LDRcr'] &gt; - 14. and data['Zcr'] &gt; 0.:\r\n        # melting layer\r\n        flag = 5\r\n        \r\n    else:\r\n        # not particle influence\r\n        flag = 0\r\n        \r\n    return flag<\/pre>\n<p>Now we are ready to run the test with\u00a0 <code class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">py.test -v test_calculate.py<\/code>. 
The output looks like this:<\/p>\n<p><a href=\"http:\/\/www.martin-rdz.de\/wp-content\/uploads\/2016\/07\/unittest_output.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-28 size-full\" src=\"http:\/\/www.martin-rdz.de\/wp-content\/uploads\/2016\/07\/unittest_output.png\" alt=\"unittest_output\" width=\"799\" height=\"368\" \/><\/a>The first three tests passed nicely. The fourth one fails, caused either by a bug in the function or &#8211; in this case &#8211; by an error in the data used for that test.<\/p>\n<p>Furthermore, the test procedure is very fast: 0.07s. You can run it frequently during and after coding to ensure code quality without losing too much time. Many professional teams run their tests every time they commit to their repository, or even automatically overnight.<\/p>\n<p><strong>Further reading<\/strong><\/p>\n<ul>\n<li>[1] http:\/\/www.bbc.co.uk\/academy\/technology\/article\/art20150223103348419<\/li>\n<li>[2] Langtangen, A Primer on Scientific Programming with Python <a href=\"http:\/\/hplgit.github.io\/primer.html\/doc\/pub\/half\/book.pdf\">link<\/a><\/li>\n<li>[3] <a href=\"http:\/\/docs.pytest.org\/en\/latest\/\">http:\/\/docs.pytest.org\/en\/latest\/<\/a><\/li>\n<li><a href=\"https:\/\/sea.ucar.edu\/sites\/default\/files\/TDD_For_Scientists.pdf\">TDD for Scientists<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Testing your code seems like a sane idea, at least from a programmer&#8217;s perspective [e.g. 1]. As a scientist crunching numbers, in the majority of cases you hack together a small script and hope that it runs as intended. 
If strange behavior\u00a0(aka a bug) occurs while processing a gigabyte-sized dataset, you &hellip;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[7],"tags":[3,4],"jetpack_sharing_enabled":true,"jetpack_featured_media_url":"","_links":{"self":[{"href":"http:\/\/www.martin-rdz.de\/index.php\/wp-json\/wp\/v2\/posts\/22"}],"collection":[{"href":"http:\/\/www.martin-rdz.de\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.martin-rdz.de\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.martin-rdz.de\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/www.martin-rdz.de\/index.php\/wp-json\/wp\/v2\/comments?post=22"}],"version-history":[{"count":24,"href":"http:\/\/www.martin-rdz.de\/index.php\/wp-json\/wp\/v2\/posts\/22\/revisions"}],"predecessor-version":[{"id":298,"href":"http:\/\/www.martin-rdz.de\/index.php\/wp-json\/wp\/v2\/posts\/22\/revisions\/298"}],"wp:attachment":[{"href":"http:\/\/www.martin-rdz.de\/index.php\/wp-json\/wp\/v2\/media?parent=22"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.martin-rdz.de\/index.php\/wp-json\/wp\/v2\/categories?post=22"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.martin-rdz.de\/index.php\/wp-json\/wp\/v2\/tags?post=22"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}