Error handling anti-patterns

Hiding exceptions

There are multiple variations of this anti-pattern:

# Silence all exceptions
def toast(bread):
    try:
        toaster = Toaster()
        toaster.insert(bread)
        toaster.toast()
    except:
        pass


# Silence some exceptions
def toast(bread):
    try:
        toaster = Toaster()
        toaster.insert(bread)
        toaster.toast()
    except ValueError:
        pass

It depends on the context, but in most cases this is a bad pattern:

  • Debugging those silent errors will be really difficult, because they won't show up in logs or exception reporting tools such as Sentry. Say you have an undefined variable in Toaster.insert(): it will raise NameError, which will be caught and ignored, and you will never know about this developer error.
  • The user experience will randomly degrade without anybody knowing about it, including the user.
  • Identifying those errors will be impossible. Say Toaster.toast() does an HTTP request to another service, and that service starts misbehaving. There won't be any exception or metric that will let you identify it.

An article even named this the most diabolical Python antipattern.

Consider the following full example:

from collections import namedtuple

Bread = namedtuple('Bread', 'color')

class ToastException(Exception):
    pass

def toast(bread):
    try:
        put_in_toaster(bread)
    except:
        raise ToastException('Could not toast bread')


def put_in_toaster(bread):
    brad.color = 'light_brown'  # Note the typo


toast(Bread('yellow'))

It will raise this cryptic, impossible-to-debug error:

Traceback (most recent call last):
  File "python-examples/reraise_exceptions.py", line 19, in <module>
    toast(Bread('yellow'))
  File "python-examples/reraise_exceptions.py", line 12, in toast
    raise ToastException('Could not toast bread')
__main__.ToastException: Could not toast bread

Sometimes it's tempting to think that graceful degradation is about silencing exceptions. It's not.

  • Graceful degradation needs to happen at the highest level of the code, so that the user can get a very explicit error message (e.g. "we're having issues with X, please retry in a moment"). That requires knowing that there was an error, which you can't tell if you're silencing the exception (see the sketch below this list).
  • You need to know when graceful degradation happens, and you need to be alerted if it happens too often. This requires adding monitoring (using something like statsd) and logging (Python's logger.exception automatically adds the exception stacktrace to the log message, for instance). Silencing an exception won't make the error go away: all things being equal, it's better for something to break hard than for an error to be silenced.
  • It is tempting to confuse silencing the exception with fixing it. Say you're getting sporadic timeouts from a service. You might think: let's ignore those timeouts and just do something else, like return an empty response. But this is very different from (1) actually finding the root cause of those timeouts (e.g. maybe a specific edge case impacting certain objects) and (2) doing proper graceful degradation (e.g. asking users to retry later because the request failed).

In other words, ask yourself: would it be a problem if every single action was failing? If you're silencing the error, how would you know it's happening for every single action?
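
To make the first point concrete, here is a minimal sketch of graceful degradation done at the highest level. handle_toast_request, render_error_page and render_success_page are hypothetical names, and toast is the function from the examples above:

import logging

logger = logging.getLogger(__name__)


def handle_toast_request(request):
    # Hypothetical top-level web handler: this is the level with enough
    # context to show an explicit message to the user.
    try:
        toast(request.bread)
    except Exception:
        # The failure is logged with its traceback, not silently swallowed,
        # and the user gets an explicit degradation message.
        logger.exception('Could not toast bread')
        return render_error_page(
            "We're having issues with the toaster, please retry in a moment.")
    return render_success_page()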

Here are a number of better ways to do this:

Log and create metrics

import logging

import statsd  # assuming a statsd client exposing a count() helper

logger = logging.getLogger(__name__)


def toast(bread):
    # Note: no exception handling here.
    toaster = Toaster()
    toaster.insert(bread)
    toaster.toast()


def main():
    try:
        toast('brioche')
    except Exception:
        logger.exception('Could not toast bread')
        statsd.count('toast.error', 1)

Very important: adding logging and metrics won't be sufficient if they are difficult to consume; they won't help the developer who's debugging. There needs to be automatic alerting associated with those stats. The logs need to be surfaced somewhere, ideally next to the higher-level exception (e.g. say our main function above is used in a web interface: the web interface could say "additionally, the following logs were generated" and display them). For instance, Sentry can be configured to surface logger.exception errors the same way exceptions are surfaced. Otherwise the developer will still have to read the code to understand what's going on. Also, this won't work for sporadic errors: those need to be dealt with properly, and until then, it's better to let them go to your usual alerting tool.
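
As an illustration, here is a minimal sketch of that kind of Sentry setup using the sentry_sdk logging integration (the DSN is a placeholder, and the exact options may differ depending on your Sentry version):

import logging

import sentry_sdk
from sentry_sdk.integrations.logging import LoggingIntegration

sentry_sdk.init(
    dsn='https://publickey@o0.ingest.sentry.io/0',  # placeholder DSN
    integrations=[
        # Log records at INFO and above become breadcrumbs; records at ERROR
        # and above (including logger.exception calls) become Sentry events.
        LoggingIntegration(level=logging.INFO, event_level=logging.ERROR),
    ],
)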

Note that knowing where to catch is very important too. If you're catching inside the toast function, you might be hiding things a caller would need to know. Since this function does not return anything, how would you tell the difference between a success and a failure? You can't. That's why you want to let it raise, and catch only in the caller, where you have the context to know how to handle the exception.

Re-raise immediately

def toast(bread):
    try:
        toaster = Toaster()
        toaster.insert(bread)
        toaster.toast()
    except Exception:
        raise ToastException('Could not toast bread %r' % bread)


def main():
    # Note: no exception handling here
    toast('brioche')
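
A related note: in Python 3 you can chain the original exception explicitly with raise ... from, so the wrapped exception keeps a pointer to the root cause in the traceback (a small sketch, reusing ToastException from above):

def toast(bread):
    try:
        put_in_toaster(bread)
    except Exception as exc:
        # 'from exc' stores the original exception as __cause__, so the
        # traceback shows both exceptions instead of hiding the root cause.
        raise ToastException('Could not toast bread %r' % bread) from exc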

Be very specific about the exceptions that are caught

import logging

import statsd  # assuming a statsd client exposing a count() helper

logger = logging.getLogger(__name__)


def toast(bread):
    try:
        toaster = Toaster()
    except ValueError:
        # Note that even though we catch, we're still logging + creating
        # a metric
        logger.exception('Could not get toaster')
        statsd.count('toaster.instantiate.error', 1)
        return

    toaster.insert(bread)
    toaster.toast()


def main():
    toast('brioche')
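
If you can't be fully specific, catching Exception is still preferable to a bare except: a bare except also swallows BaseException subclasses such as KeyboardInterrupt and SystemExit, which you almost never want to intercept. A small sketch, reusing the logger from the snippet above:

def main():
    try:
        toast('brioche')
    except Exception:
        # Toasting errors are logged, but Ctrl-C (KeyboardInterrupt) and
        # sys.exit() (SystemExit) still propagate as expected.
        logger.exception('Could not toast bread')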

Raising unrelated/unspecific exceptions

Most languages have predefined exceptions, including Python. It is important to make sure that the right exception is raised from a semantic standpoint.

Bad:

def validate(toast):
    if isinstance(toast, Brioche):
        # RuntimeError is too broad
        raise RuntimeError('Invalid toast')


def validate(toast):
    if isinstance(toast, Brioche):
        # SystemError should only be used for internal interpreter errors
        raise SystemError('Invalid toast')

Good:

def validate(toast):
    if isinstance(toast, Brioche):
        raise TypeError('Invalid toast')

Here, TypeError is perfectly meaningful and clearly conveys the context around the error.
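
When no built-in exception fits well, another option (shown here as an illustrative sketch, with made-up class names) is to define a domain-specific exception that subclasses the closest built-in one, so callers can catch it either specifically or by its built-in category:

class InvalidToastError(TypeError):
    """Raised when an object cannot be toasted."""


def validate(toast):
    if isinstance(toast, Brioche):
        # Callers can catch InvalidToastError specifically, or TypeError
        # more broadly.
        raise InvalidToastError('Invalid toast: %r' % (toast,))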

Unconstrained defensive programming

While defensive programming can be a very good technique to make the code more resilient, it can seriously backfire when misused. This anti-pattern is very similar to carelessly silencing exceptions (see the hiding exceptions section above).

One example is to handle an edge case as a generic case at a very low level. Consider the following example:

import requests


def get_user_name(user_id):
    url = 'http://127.0.0.1/users/%s' % user_id
    response = requests.get(url)
    if response.status_code == 404:
        return 'unknown'
    return response.text

While this may look like a very good example of defensive programming (we're returning 'unknown' when we can't find the user), this can have terrible repercussions, very similar to the ones caused by an unrestricted bare try... except:

  • A new developer might not know about this magical convention, and assume that get_user_name is guaranteed to return a real user name.
  • The external service that we're getting user names from might start failing and returning 404s. We would silently return 'unknown' as the user name for all users, which could have terrible repercussions.

A much cleaner way is to raise an exception on 404, and let the caller decide how it wants to handle users that are not found.
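
Here is a minimal sketch of that cleaner version (UserNotFound is an illustrative exception name):

import requests


class UserNotFound(Exception):
    pass


def get_user_name(user_id):
    url = 'http://127.0.0.1/users/%s' % user_id
    response = requests.get(url)
    if response.status_code == 404:
        # Let the caller decide what a missing user means in its context.
        raise UserNotFound('No user with id %s' % user_id)
    return response.text


def display_user_name(user_id):
    # Example caller: this is where we have the context to decide.
    try:
        return get_user_name(user_id)
    except UserNotFound:
        return 'unknown'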

Unnecessarily catching and re-raising exceptions

Bad:

def toast(bread):
    try:
        put_in_toaster(bread)
    except InvalidBreadException:
        raise ToastException('Could not toast')

Side note: unconditionally catching all exceptions is considered even worse (see hiding exceptions above).

This is a better pattern because we explicitly state what happened in the exception message:

def toast(bread):
    try:
        put_in_toaster(bread)
    except InvalidBreadException:
        raise ToastException('Cannot toast this bread type')

If we need to do some cleanup or extra logging, it's better to just raise the original exception again. The developer will know exactly what happened.

def toast(bread):
    try:
        put_in_toaster(bread)
    except:
        print('Got exception while trying to toast')
        raise  # note the absence of a specific exception

Here's what would happen:

Got exception while trying to toast
Traceback (most recent call last):
  File "reraise_exceptions_good.py", line 20, in <module>
    toast(Bread('yellow'))
  File "reraise_exceptions_good.py", line 10, in toast
    put_in_toaster(bread)
  File "reraise_exceptions_good.py", line 17, in put_in_toaster
    brad.color = 'light_brown'  # Note the typo
NameError: name 'brad' is not defined

Another way to show how absurd the anti-pattern becomes at scale is through an example:

def call_1():
    try:
        call_2()
    except Call2Exception:
        raise Call1Exception()


def call_2():
    try:
        call_3()
    except Call3Exception:
        raise Call2Exception()


def call_3():
    ...

Another problem with this pattern is that all this catching and re-raising is largely useless: the caller will still need to catch at the top level anyway. In other words:

Error handling, and recovery are best done at the outer layers of your code base. This is known as the end-to-end principle. The end-to-end principle argues that it is easier to handle failure at the far ends of a connection than anywhere in the middle. If you have any handling inside, you still have to do the final top level check. If every layer atop must handle errors, so why bother handling them on the inside?

Write code that is easy to delete, not easy to extend

A better way:

# This is the highest level function where we have enough
# context to know how to handle the exceptions
def call_1():
    try:
        call_2()
    except Call2Exception:
        # handle it...
        pass
    except Call3Exception:
        # handle it...
        pass


def call_2():
    # Do not handle anything here.
    call_3()


def call_3():
    ...

More resources: