The definitive guide to solving the infamous Python exception “ModuleNotFoundError”

Among common Python exceptions, the most infamous and time-consuming one to solve is no doubt “ModuleNotFoundError”, but it is actually pretty simple to fix once you understand a couple of concepts.
Fundamentally it can be raised for three reasons:

1. A typo or a wrong path specified in the import statement

This is the easiest one to spot, and if you are using an IDE like PyCharm you will notice it immediately, before even running your code.

In order to reproduce the exception, let’s consider a project structure like:

/proj
    /foo
        __init__.py
        bar.py
    main.py

A main.py containing:

from fo.bar import BarClass
c = BarClass()

and bar.py containing:

class BarClass:
    pass

Using /proj as the current working directory and running:

python main.py

We will obtain the following exception:

Traceback (most recent call last):
  File "/Users/dave/PycharmProjects/proj/main.py", line 1, in <module>
    from fo.bar import BarClass
ModuleNotFoundError: No module named 'fo'

To solve the problem, we simply have to change the import so that it matches the right path (“foo.bar” instead of “fo.bar”):

from foo.bar import BarClass
c = BarClass()
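
When the typo is less obvious, it can help to ask Python which name it failed to resolve. The following is only a minimal sketch, not part of the original example: is_importable is a hypothetical helper and fo_typo_example is a made-up module name that is assumed not to be installed.

```python
import importlib.util


def is_importable(top_level_name: str) -> bool:
    """Return True if a top-level module or package can be located.

    Hypothetical helper: it only locates the module, it does not execute it.
    """
    return importlib.util.find_spec(top_level_name) is not None


try:
    import fo_typo_example  # made-up misspelled name, assumed not installed
except ModuleNotFoundError as exc:
    missing_name = exc.name  # the module name Python failed to resolve

print(missing_name)
print(is_importable("json"), is_importable("fo_typo_example"))
```

The name attribute of the exception gives you exactly the string to compare against your directory and file names.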

So far, so easy… but let’s go on with scenario N.2

2. An execution context that requires an extra sys.path entry which has not been added

This one occurs when we execute a Python script from a directory where the interpreter cannot resolve the module named in the import statement, because sys.path is missing the required entry or is badly configured.
To understand it, you first have to know how Python looks up modules, so here is what the official documentation says:

When a module named spam is imported, the interpreter first searches for a built-in module with that name. If not found, it then searches for a file named spam.py in a list of directories given by the variable sys.path. sys.path is initialized from these locations:

  1. The directory containing the input script (or the current directory when no file is specified).
  2. PYTHONPATH (a list of directory names, with the same syntax as the shell variable PATH).
  3. The installation-dependent default.

Let’s keep the structure of scenario N.1, but with main.py containing:

class BaseClass:
    pass

and bar.py containing:

from main import BaseClass
c = BaseClass()

but now let’s change the working directory to “foo”, and launch the command:

python bar.py 

We will obtain the following exception:

Traceback (most recent call last):
  File "bar.py", line 1, in <module>
    from main import BaseClass
ModuleNotFoundError: No module named 'main'

This happens because we are in the “foo” directory and we didn’t update sys.path, so Python looks for a main.py file in that directory, and obviously there is none!
We can fix this issue in two ways: by using the PYTHONPATH environment variable or by extending the sys.path list.
To use the PYTHONPATH in a single shot, we can launch the script with the following command:

PYTHONPATH=../ python bar.py

In this way, we are practically saying “hey python, please consider also the parent directory for the module lookup”.
The same can be specified programmatically in this way:

import sys
sys.path.append('../')

Of course, the code above must be placed before the import statements that depend on it. In any case, my advice is to avoid this approach and to rely only on the PYTHONPATH environment variable.
Use sys.path instead to debug your current path resolution in this way:

import sys

for p in sys.path:
    print(p)
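
If you decide to extend sys.path from code anyway, keep in mind that '../' is resolved against the current working directory, not against the script. A minimal sketch of a variant anchored to the script location could look like this; add_project_root is a hypothetical helper name and /proj/foo/bar.py simply mirrors the example layout.

```python
import sys
from pathlib import Path


def add_project_root(script_file: str, levels_up: int = 1) -> str:
    """Append an ancestor of the script's directory to sys.path.

    Hypothetical helper: levels_up=1 means the parent of the script directory.
    """
    script_dir = Path(script_file).resolve().parent
    root = str(script_dir.parents[levels_up - 1])
    if root not in sys.path:
        sys.path.append(root)
    return root


# Inside bar.py you would pass __file__; a literal path is used here
# so that the sketch runs on its own.
root = add_project_root("/proj/foo/bar.py")
print(root)
```

Because the path is computed from the script file, the import works no matter which directory the script is launched from.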

3. Circular dependency

This is the most annoying one you can face. It happens when module A requires something from module B and, in turn, module B requires something from module A, thus generating a “deadly” circular reference.
In my experience it most often happens after an automatic refactoring with PyCharm (typically if you use the logging framework in the classical way)*. If it happens for other reasons, it’s a signal that your software design is not sound and that you must review it carefully.

* by classical usage of the logging framework I mean:

import logging

log = logging.getLogger(__name__)

class MyClass:
    def my_method(self):
        log.info('My method invoked')

Then, after moving MyClass to another module (via automatic refactoring), PyCharm tends to add an import of log (which 1. is not required, since each module has its own logger, and 2. may cause the circular dependency).
To manually reproduce the exception, let’s consider a super simple structure like the following:

    /proj
        a.py
        b.py

With a.py containing:

from b import ClassB

class ClassA:
    def __init__(self):
        self.b = ClassB()

and b.py containing:

from a import ClassA

class ClassB:
    pass

a = ClassA()

By running python a.py in the project root, we will get the following exception:

Traceback (most recent call last):
  File "/Users/dave/PycharmProjects/proj/a.py", line 1, in <module>
    from b import ClassB
  File "/Users/dave/PycharmProjects/proj/b.py", line 1, in <module>
    from a import ClassA
  File "/Users/dave/PycharmProjects/proj/a.py", line 1, in <module>
    from b import ClassB
ImportError: cannot import name 'ClassB'

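If we pay attention, we can quite easily spot that this time we are facing a circular reference issue, since the stack trace is longer than the previous ones and shows a “ping-pong” between a.py and b.py. Note also that the exception raised here is ImportError, the parent class of ModuleNotFoundError: module b is found, but it is only partially initialized and ClassB is not defined in it yet.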
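
One common workaround (besides reviewing the design) is to defer the imports into the functions that need them, so that neither module needs the other at import time. The sketch below is only an illustration under assumptions: mod_a and mod_b are hypothetical stand-ins for a.py and b.py, written to a temporary directory so the example runs on its own, and make_a replaces the module-level instantiation from the original b.py.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# mod_a / mod_b are hypothetical stand-ins for a.py / b.py.
MOD_A = '''
class ClassA:
    def __init__(self):
        from mod_b import ClassB  # deferred: resolved when an instance is created
        self.b = ClassB()
'''

MOD_B = '''
class ClassB:
    pass


def make_a():
    from mod_a import ClassA  # deferred for the same reason
    return ClassA()
'''

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "mod_a.py").write_text(MOD_A)
    Path(tmp, "mod_b.py").write_text(MOD_B)
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        mod_b = importlib.import_module("mod_b")
        instance = mod_b.make_a()
    finally:
        sys.path.remove(tmp)

created = (type(instance).__name__, type(instance.b).__name__)
print(created)
```

By the time make_a() runs, mod_b is fully initialized, so the import inside ClassA.__init__ finds ClassB without any ping-pong.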

Writing better software with Python 3.6 type hints

One of the recent features of Python 3 that I like the most is definitely the support for type annotations.
Type annotations are a precious tool (especially in combination with an advanced IDE like PyCharm) that allows us to write clear and implicitly documented code, prevents us from invoking methods with the wrong data types (ok, actually we can do whatever we want at runtime, since Python is a dynamic language and a type hint, as the name suggests, is just that: a hint), and gives us useful code suggestions and autocompletion.
Starting with Python 3.6 it is now possible to specify not only argument types in method signatures, but also types for variables. Let’s see it in action with some sample code:

from datetime import datetime, timedelta
from enum import Enum
from typing import List


class Sex(Enum):
    M = 'M'
    F = 'F'


class Person:
    def __init__(self, 
                 first_name: str, 
                 last_name: str, 
                 birth_date: datetime, 
                 sex: Sex):
        self._first_name: str = first_name
        self._last_name: str = last_name
        self._birth_date: datetime = birth_date
        self._sex: Sex = sex
        self._hobbies: List[str] = []

    def get_age(self) -> int:
        diff: timedelta = datetime.now() - self._birth_date
        return int(diff.days / 365)

    @property
    def hobbies(self) -> List[str]:
        return self._hobbies

    @hobbies.setter
    def hobbies(self, hobby_list: List[str]):
        self._hobbies = hobby_list

So, basically we have created a Person class and a Sex enum and by using type hints we have declared that:
“first_name” and “last_name” must be a str type, “birth_date” a datetime type and “sex” a custom enum type Sex.
We have also specified the return type of get_age() method as int and inside its implementation we have referenced the date difference as a timedelta object.
Finally, we have imported List from the “typing” module in order to declare “hobbies” as a list of string objects (if we don’t care about the list content, we can just use the built-in list type and skip the import).

By using PyCharm, we can see that if we try to pass an invalid type as argument it complains as expected:

Unfortunately, PyCharm does not complain if we try to set “hobbies” via simple assignment:

But in my opinion, using the type hints as shown in the example code has the huge value of keeping code documented, especially if you work in a team, or if you want to write an open source project.
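
To make the “it’s just a hint” point concrete, here is a minimal sketch (greet is a hypothetical function, not part of the Person example) showing that the interpreter stores the annotations but does not enforce them:

```python
def greet(name: str) -> str:
    return 'Hello, {}'.format(name)


# The annotation says str, but nothing stops us from passing an int.
result = greet(42)
print(result)

# The hints are simply stored on the function for tools and IDEs to read.
print(greet.__annotations__)
```

The checks come from the IDE or from an external type checker, never from the interpreter itself.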

One limitation of type hints that I found is that you can’t create “circular references”: a method cannot directly name its own class as an argument type, because the class name is not yet defined while the class body is being evaluated.

Update:

As suggested in the comments, this can be “bypassed” by using strings in place of types.
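
A minimal sketch of the string-based workaround follows; Node and link are hypothetical names chosen only for illustration.

```python
class Node:
    def __init__(self, value: int):
        self.value: int = value
        self.next = None

    # 'Node' is written as a string because the name Node does not
    # exist yet while the class body is being evaluated.
    def link(self, other: 'Node') -> 'Node':
        self.next = other
        return other


head = Node(1)
tail = head.link(Node(2))
print(head.next is tail)
print(Node.link.__annotations__)
```

Type checkers and IDEs resolve the string to the class, while at runtime it is stored as a plain string.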

Testing Flask subdomain routing locally

I’m working on a project in which each customer gets a subdomain for their personal area, and thanks to Flask it’s just a matter of using the “subdomain” parameter in the routing config. The tricky part is how to set up Flask and the /etc/hosts file in order to make it work on a local development machine.
So the first step is to map 127.0.0.1 to a custom domain name in /etc/hosts, and to do the same for a custom subdomain name:

127.0.0.1 localwebsite
127.0.0.1 peter.localwebsite

Of course you can choose any name you like; the only important thing is that the “fake subdomain” entry ends with the main “fake domain” (in the case above, “localwebsite”)!

Then in Flask you have to specify the server name and its port (required!):

app.config['SERVER_NAME'] = 'localwebsite:8080'

In the above scenario I’m using the previously defined “fake domain” together with the port on which the Flask server is running (8080 in my development settings; Flask’s default is 5000 if none is specified).
Finally we can register the routes:

@app.route('/', subdomain='<customer>')
def customer_subdomain(customer):
    return 'Customer is: {}'.format(customer)

and if we call peter.localwebsite:8080 it will return “Customer is: peter” as a response! (If we point to localwebsite:8080 the default home page view will be used as expected)
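
Putting the pieces together, a minimal sketch of the whole application might look like the following. Two things here are assumptions and not part of the original setup: the home view, and the subdomain_matching=True argument, which recent Flask releases (1.0 and later, as far as I know) appear to need before subdomain routes are matched. The test client is used so that the sketch can be tried without starting a server or touching /etc/hosts.

```python
from flask import Flask

# subdomain_matching=True is an assumption about the installed Flask version.
app = Flask(__name__, subdomain_matching=True)
app.config['SERVER_NAME'] = 'localwebsite:8080'


@app.route('/')
def home():
    # Hypothetical default home page view.
    return 'Home page'


@app.route('/', subdomain='<customer>')
def customer_subdomain(customer):
    return 'Customer is: {}'.format(customer)


# Exercise both routes by setting the host through base_url.
client = app.test_client()
home_body = client.get(
    '/', base_url='http://localwebsite:8080').get_data(as_text=True)
customer_body = client.get(
    '/', base_url='http://peter.localwebsite:8080').get_data(as_text=True)
print(home_body)
print(customer_body)
```

To serve it for real, you would start the development server on port 8080 (for example with app.run(port=8080)) and open peter.localwebsite:8080 in the browser.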

Dynamic (and crazy) Python class runtime definition using built-in type() function

I just realized that thanks to the dynamic nature of Python we can create absurd class names at runtime… even a “?” class!
As everybody knows the following code raises a SyntaxError:

class ?What:
    pass

but… what if we create it dynamically using the built-in function type()?
The main use of type() is to get the type of an object like:

class Foo:
    pass

f = Foo()

type(f) # -> <class '__main__.Foo'>

But the function can also be used to create a class at runtime by passing: a string representing the class name, a tuple containing the superclass(es) from which to inherit, and a dictionary containing the class attributes.
The previously defined class can be dynamically defined in this way:

type('Foo', (object, ), {})

…the crazy thing is that, since we provide the name as a string, we can dynamically create class names which would otherwise be impossible to define in the classic static way. Example:

question_mark_type = type('?', (object, ), {})
question_mark_instance = question_mark_type()
type(question_mark_instance) # -> <class '__main__.?'>

We have defined a “?” class! :D
Of course you should avoid such an abomination, but this is a cool Python feature, since it allows magic things to happen. In fact, I realized this while testing dynamic database introspection with SQLAlchemy.
I created tables with names containing invalid chars like “!table”, “$table”, “#table” and so on (which are allowed in some databases), and I was expecting the ORM automapping to fail, since those names can’t be valid class names… but clearly SQLAlchemy makes use of type() in order to create dynamic model classes, and so it is possible to map bad table names to working Python classes… really cool!
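
To give an idea of how a mapper could build classes from table names, here is a deliberately naive sketch. It is not SQLAlchemy’s actual implementation: make_model and its arguments are hypothetical, and the “columns” are just plain class attributes.

```python
def make_model(table_name, columns):
    """Build a class named after a table, with one attribute per column."""
    attrs = {column: None for column in columns}
    attrs['__tablename__'] = table_name
    attrs['__repr__'] = lambda self: '<{} row>'.format(type(self).__name__)
    return type(table_name, (object,), attrs)


Model = make_model('!table', ['id', 'name'])
row = Model()
row.name = 'peter'

print('!table'.isidentifier())  # the name is not a valid identifier
print(Model.__name__)
print(repr(row))
```

The class statement would reject “!table”, but type() never checks whether the name is a valid identifier.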

Regular Expressions in Python: how to match English and non-English letters

Ok, this is a quick (and I hope super-helpful) tip on how to match foreign-language letters (like ö, è…) in a Python regex.
As everybody knows, matching letters is just a matter of using [a-z] or \w (the latter will also match digits and underscores!), but unfortunately letters with “decorations” are not matched by these selectors. If you want to match them, you have to use Unicode ranges (something like [\u00D8-\u00F6]), but Python can automatically match all the Unicode variants if you simply pass the flag re.UNICODE to compile(). So this:

re.compile(r'[^\W_]', re.IGNORECASE | re.UNICODE)

will match any English or non-English letter (and, strictly speaking, digits too).
But let me explain… \w matches letters, digits and underscores, while \W (note the uppercase) does the opposite and matches everything except letters, digits and underscores, so [^\W_] will match letters and digits only (thanks to the negation ^).
Bear in mind what the Python docs say about the re.UNICODE flag:

“Makes several escapes like \w, \b, \s and \d dependent on the Unicode character database”

A stupid demonstration:

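# -*- coding: utf-8 -*-
# Written for Python 2: on Python 3, str patterns are Unicode-aware by
# default, so the first assertion below fails there.
import re

ENGLISH_CHARS = re.compile(r'[^\W_]', re.IGNORECASE)
ALL_CHARS = re.compile(r'[^\W_]', re.IGNORECASE | re.UNICODE)

assert len(ENGLISH_CHARS.findall('_àÖÎ_')) == 0
assert len(ALL_CHARS.findall('_àÖÎ_')) == 3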

ps: not all languages have implemented a Unicode flag; JavaScript, for example, had not when I wrote this …I love Python :)

UPDATE:
Webucator has published a video based on this post and, as explained in the video, the flag is no longer required in Python 3, since strings are Unicode by default instead of ASCII! Check out the video
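
For Python 3 the demonstration can be turned around: Unicode matching is the default for str patterns, and re.ASCII brings back the old ASCII-only behavior. In this minimal sketch the pattern [^\W\d_] is my own variation that also excludes digits, and the constant names are hypothetical.

```python
import re

# Letters only: not a non-word char, not a digit, not an underscore.
ASCII_LETTERS = re.compile(r'[^\W\d_]', re.ASCII)
UNICODE_LETTERS = re.compile(r'[^\W\d_]')  # Unicode-aware by default

text = '_àÖÎ_a1'

ascii_found = ASCII_LETTERS.findall(text)
unicode_found = UNICODE_LETTERS.findall(text)
print(ascii_found)
print(unicode_found)
```

With re.ASCII only the plain “a” is matched, while the default pattern also picks up the accented letters.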