micro-refactoring: long bool statements

“Hell is other people’s code” – while trying to reprocess some data on our server, a colleagues script did not produce the desired output. The problem was, that the location attribute was missing in the input netCDF files [1]. Which leads us to the code, that checked the attributes:

#next the formatting I'm supposed to check it against:

(key_for_range_dimension,key_for_time_dimension,key_for_azimuth_variable,key_for_elevation_variable,key_for_range_variable,key_for_unix_time_variable,key_for_doppler_velocity_variable,key_for_doppler_velocity_error_variable,key_for_nyquist_velocity_variable)=variable_name_key_setting_function(location_we_want,system_we_want,location_info,formatting_translator_dictionary,verbose)
       
#this is a logical statement that includes everything, using brackets to break up inline statements and make it easier to see
formatting_is_correct=(
               'location' in that_files_attributes
               ) and (
                              'system' in that_files_attributes
                              ) and (
                                            key_for_range_dimension in that_files_dimensions
                                            ) and (
                                                           key_for_range_variable in that_files_variables
                                                           ) and (
                                                                         key_for_time_dimension in that_files_dimensions
                                                                          ) and (
                                                                                        key_for_unix_time_variable in that_files_variables
                                                                                        ) and (
                                                                                                        key_for_azimuth_variable in that_files_variables
                                                                                                        ) and (
                                                                                                                       key_for_elevation_variable in that_files_variables
                                                                                                                       ) and (
                                                                                                                                     key_for_doppler_velocity_variable in that_files_variables
                                                                                                                                     )

print("Check format keys:")
print(('location' in that_files_attributes),
    ('system' in that_files_attributes),(key_for_range_dimension in that_files_dimensions),
    (key_for_range_variable in that_files_variables),(key_for_time_dimension in that_files_dimensions),
    (key_for_unix_time_variable in that_files_variables),
    (key_for_azimuth_variable in that_files_variables),
    (key_for_elevation_variable in that_files_variables),
    (key_for_doppler_velocity_variable in that_files_variables))

[You might want to expand the code snippet to fully appreciate the >380 column lines]

At least for me, that looked like a classical broken window. The final print statement is already a owned to the debugging efforts. Two points, that I consider problematic:

    1. insanely long lines (not the only occurrence in that script but a prominent one)
    2. the long and nested boolean statement

Both make it hard to understand, debug and modify the code. Let’s have see, what we quickly can do about it.

var_name_key_setting = variable_name_key_setting_function(
               location_we_want, system_we_want, location_info,
               formatting_translator_dictionary, verbose)
# unpack the function return
(key_for_range_dimension, key_for_time_dimension, 
 key_for_azimuth_variable, key_for_elevation_variable, 
 key_for_range_variable, key_for_unix_time_variable, 
 key_for_doppler_velocity_variable, 
 key_for_doppler_velocity_error_variable,
 key_for_nyquist_velocity_variable) = var_name_key_setting
       
formatting_is_correct_list = [
    'location' in that_files_attributes, 
    'location in attributes',
    'system' in that_files_attributes, 
    'system in attributes',
    key_for_range_dimension in that_files_dimensions, 
    'key range dim in files_dimensions',
    key_for_range_variable in that_files_variables,
    'key range var in files_variables',
    key_for_time_dimension in that_files_dimensions,
    'key time dim in files_dimensions',
    key_for_unix_time_variable in that_files_variables,
    'key unix_time_var in files_variables',
    key_for_azimuth_variable in that_files_variables,
    'key azimuth var in files_variables',
    key_for_elevation_variable in that_files_variables,
    'key elevation var in files_variables',
    key_for_doppler_velocity_variable in that_files_variables,
    'key doppler_vel_var in files_variables',
       ]
formatting_is_correct = all(formatting_is_correct_list[::2])

print("Check format keys:", formatting_is_correct)
if not formatting_is_correct:
       [print('  ', e[0], e[1]) for e in \
            zip(formatting_is_correct_list[0:-1:2], 
                formatting_is_correct_list[1::2])]

Still not perfect, but a bit more explicit and easier to read. In fact the [::2] could also be omitted, as a string also evaluates to True in all().

Maybe it’s time to setup a compulsory linter or at least make PEP 8 and the Google Python Style Guide required reading.

[1] actually required a second pair of eyes to figure out…