Python: reading numbers from JSON without loss of precision using the Decimal class for data processing

In the project I’m working on, I’m using an external API which returns a JSON response containing conversion rates for currencies. Since I’m dealing with currencies and prices, numeric precision plays an important role in how values are calculated in the application. The good thing about JSON, despite its name being an acronym of “JavaScript Object Notation”, is that it’s a cross-language format, so it’s not limited to the capabilities of a specific language like JavaScript: numbers in JSON may have a higher precision than a JavaScript float!
This is a quote from Wikipedia about JSON numbers (emphasis is mine):

Number — a signed decimal number that may contain a fractional part and may use exponential E notation. JSON does not allow non-numbers like NaN, nor does it make any distinction between integer and floating-point. (Even though JavaScript uses a double-precision floating-point format for all its numeric values, other languages implementing JSON may encode numbers differently)

By default Python’s json module loads decimal numbers as float, so if we have a JSON document like:

{ "number": 1.00000000000000000001 }

the default conversion into Python will be {u'number': 1.0} if we just write the following code:

import json

json_string = '{ "number": 1.00000000000000000001 }'
json.loads(json_string)  # {u'number': 1.0} - precision is lost!

But fortunately it’s dead simple to load JSON numbers using the decimal module: there is no need to write custom decoders as I saw suggested on the web, it’s just a matter of specifying the Decimal class for float parsing in the loads() function, in this way:

import json
from decimal import Decimal

json.loads(json_string, parse_float=Decimal)

In this way the loaded Python object will be:
{u'number': Decimal('1.00000000000000000001')}
And we will be able to perform precise arithmetic computations!
It’s also possible to use Decimal even for integer numbers, by specifying parse_int:

json.loads(json_string, 
           parse_int=Decimal, 
           parse_float=Decimal)
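
As a quick sketch of why this matters (the amount and rate values below are made up for illustration), arithmetic on the parsed values stays exact, with no binary floating-point rounding involved:

import json
from decimal import Decimal

# parse both ints and floats as Decimal
data = json.loads('{"amount": 100, "rate": 0.8576}',
                  parse_int=Decimal,
                  parse_float=Decimal)

# exact arithmetic between Decimal instances
total = data['amount'] * data['rate']
print total  # 85.7600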

Additional reading: the official Python docs for the json and decimal modules.

Abstract classes in Python using abc module

Python is a powerful language, but it lacks some OOP features which are the foundation of other programming languages like Java.
One of these is abstract classes (classes you can’t instantiate but only extend, in order to inherit common base methods and to be forced to implement abstract methods representing a common interface).
A common practice that can be found in Python projects is to “mimic” abstract classes/methods by creating a base class and defining a series of methods that raise a NotImplementedError. The official Python documentation in fact says:

In user defined base classes, abstract methods should raise this exception when they require derived classes to override the method.
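
For reference, the classic “mimicked” abstract method looks something like this (a minimal sketch; the class and method names are just examples):

class AbstractAnimal(object):

    def run(self):
        # "abstract" only by convention: nothing prevents
        # instantiating this class or skipping the override
        raise NotImplementedError('subclasses must implement run()')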

But… to my great pleasure I discovered that, starting from Python 2.6, a module called abc allows creating “real” abstract classes, methods and properties!!

So, let’s see how to implement a better and more effective abstract class in Python, forgetting the old NotImplementedError:

from abc import ABCMeta, abstractmethod

class AbstractAnimal(object):
    # assigning ABCMeta as metaclass makes the class abstract (Python 2 syntax)
    __metaclass__ = ABCMeta

    @abstractmethod
    def run(self):
        pass

Now… if you try to instantiate an AbstractAnimal, the Python interpreter will complain, saying:

TypeError: Can't instantiate abstract class AbstractAnimal with abstract methods run

Since now you get it… let’s extend the abstract class with a concrete one:

class Dog(AbstractAnimal):
    pass

But say you don’t trust me, or simply forget to implement the abstract method (which MUST be implemented, since it’s marked as abstract)… once again the interpreter will complain with:

TypeError: Can't instantiate abstract class Dog with abstract methods run

(which, to be honest, is a misleading message, since the class Dog is not actually abstract but simply doesn’t implement the required methods… it is, however, an “understandable” exception)

Uh… if you use a cool IDE like PyCharm, it will mark the class as invalid by showing a message in the tooltip: “Dog must implement all abstract methods”!

Finally, once the method is implemented:

class Dog(AbstractAnimal):
    def run(self):
        print 'running like a dog...'

It’s also possible to define abstract properties using the related decorator @abstractproperty.
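
A minimal sketch of how that looks (the name property is just an example):

from abc import ABCMeta, abstractproperty

class AbstractAnimal(object):
    __metaclass__ = ABCMeta

    @abstractproperty
    def name(self):
        pass

class Dog(AbstractAnimal):
    # concrete subclasses must provide the property,
    # otherwise instantiation fails with the usual TypeError
    @property
    def name(self):
        return 'Rex'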

So… to recap: say goodbye to NotImplementedError, use the abc module to assign ABCMeta to the __metaclass__ property of the abstract class you want to implement, and use @abstractmethod and @abstractproperty to provide abstract methods and properties.

Read more about the abc module in the official Python documentation.

Creating a custom AMI with Postgis and its dependencies in order to deploy Django + GeoDjango on Amazon Elastic Beanstalk

While the installation of PostgreSQL + Postgis on my development machine (my beloved MacBook Pro) was very easy thanks to MacPorts, installing the necessary software on Amazon Elastic Beanstalk (in order to move my project Cygora.com from local to the cloud) has been a hard challenge!
Theoretically you can customize an environment using configuration files in which you specify packages and other resources to install, but the problem is that in the Amazon 64bit Linux distribution for Python (which is an extremely customized version of Red Hat) you don’t have apt (for which postgis packages are available); instead you have to rely on yum. It’s possible to install extra repositories for yum (see here: http://postgis.net/install) in order to easily install postgis… but honestly I had no idea which repository would be the right one for Amazon Linux, so… it’s been painful, but I opted for an “old school” installation, downloading and compiling the missing packages by myself. So, after launching my EC2 instance I connected to it via SSH and:

1. Switch to root user:

sudo su -

2. Update all the installed packages (which Amazon doesn’t update very often!):

yum update -y

3. Install development tools and necessary libraries (some of them, like “graphviz”, are not required for GeoDjango and you can avoid installing them if you want… I’m listing all my libraries as a future reference for myself :P):

yum install -y python-devel libpcap libpcap-devel libnet libnet-devel pcre pcre-devel gcc gcc-c++ libtool make libyaml libyaml-devel binutils libxml2 libxml2-devel zlib zlib-devel file-devel postgresql postgresql-devel postgresql-contrib geoip geoip-devel graphviz graphviz-devel gettext libtiff-devel libjpeg-devel libzip-devel freetype-devel lcms2-devel libwebp-devel tcl-devel tk-devel

4. Download and compile proj:

wget http://download.osgeo.org/proj/proj-4.8.0.zip
unzip proj-4.8.0.zip && cd proj-4.8.0
./configure && make && sudo make install
cd ..

5. Download and compile geos:

wget http://download.osgeo.org/geos/geos-3.4.2.tar.bz2
tar -xvf geos-3.4.2.tar.bz2 && cd geos-3.4.2
./configure && make && sudo make install
cd ..

6. Download and compile gdal (this library is by far the SLOWEST to compile and, depending on the type of instance you have launched, it may take up to a couple of hours… be patient!):

wget http://download.osgeo.org/gdal/1.10.1/gdal1101.zip
unzip gdal1101.zip && cd gdal-1.10.1
./configure --with-python=yes && make && sudo make install
cd ..

7. Download and install postgis:

wget http://download.osgeo.org/postgis/source/postgis-2.1.1.tar.gz
tar -xvf postgis-2.1.1.tar.gz && cd postgis-2.1.1
./configure && make && sudo make install

8. Update the library paths known to the dynamic linker (this step is necessary to avoid issues related to invalid library paths). Note that we are still root from step 1 (a plain “sudo echo … >>” wouldn’t work anyway, since the output redirection is not executed under sudo):

echo /usr/local/lib >> /etc/ld.so.conf
ldconfig

It’s also a nice idea to export the environment variable LD_LIBRARY_PATH (export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH).
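
If Django still fails to locate the compiled libraries at runtime, GeoDjango also lets you point at them explicitly from the settings module (a sketch: GEOS_LIBRARY_PATH and GDAL_LIBRARY_PATH are standard GeoDjango settings, but the exact .so paths below are assumptions, check what make install actually produced on your instance):

# settings.py
GEOS_LIBRARY_PATH = '/usr/local/lib/libgeos_c.so'
GDAL_LIBRARY_PATH = '/usr/local/lib/libgdal.so'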
Once you have installed all the necessary software on your machine you can create a custom AMI by going to: EC2 > instances > select your instance > create AMI. To use that AMI as the default one for your application you have to specify its id in your Elastic Beanstalk environment configuration.

ps. If you are interested in the Elastic Beanstalk + Django topic, you should subscribe to my RSS feed and follow me on Twitter (@daveoncode)… I will write a series of posts on running my project in the cloud using Amazon Web Services :)

Django 1.6 has finally been released… these are the changes I introduced into my codebase

Django 1.6 has finally been released, but this update is quite disruptive, in fact…

Issue n.1: tests not found!

The dev team has significantly changed how unit tests are discovered, and my old approach stopped working (no tests to run were found).
In my previous setup I created a “tests” package for each app in my project; in these packages I created a test class for each tested class, named like “TestClassName”, and I exposed these classes at the “tests” package level by customizing its __init__.py file.
But starting with Django 1.6 the default test runner doesn’t care about “tests” modules/packages, it looks instead for files matching the pattern “test*.py”, and since this pattern is CASE SENSITIVE (aaaarghh!!!) my test files (named after the classes, so starting with an uppercase letter) are ignored! Fortunately it’s very simple to override the default runner in order to match another pattern :)
This is my custom runner:

from django.test.runner import DiscoverRunner

class TestRunner(DiscoverRunner):

    def __init__(self, pattern=None, top_level=None, verbosity=1, interactive=True, failfast=False, **kwargs):
        # force discovery to match test files starting with an uppercase "Test"
        super(TestRunner, self).__init__('Test*.py', top_level, verbosity, interactive, failfast, **kwargs)

and in the settings module:

TEST_RUNNER = 'myapp.TestRunner.TestRunner'

Issue n.2: PyCharm Django Tests configuration is now broken!

If you are using PyCharm as your Python IDE, the “Django tests” run configuration is now broken (honestly I don’t understand why the guys at JetBrains set up such a perverse way to run Django tests by writing their own Python modules LOL)… so I defined a simple Python run configuration in which I call “python manage.py test”. It’s not as pleasant as the Django tests run configuration, since it’s basically the same output as the shell script without the “green bar” and the red/yellow/green buttons that show the results of the run tests… but at the moment it does the job, and the important thing is that, unlike launching tests from the shell, I can still use breakpoints in the IDE to stop code execution and debug the situation (like before)!
Uh… the good part is that now my tests are executed ~40% faster!!! (10 seconds vs 14 seconds in the old run configuration)

Issue n.3: Where has GeoIP gone?!

Another problem I faced is that the GeoIP wrapper around the MaxMind API has been moved from django.contrib.gis.utils to django.contrib.gis.geoip (and this change doesn’t seem to be documented in the release notes!)
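
In practice it’s just a matter of updating the import; and if your code has to support both Django versions, a simple try/except on the import does the job (a minimal sketch):

try:
    # Django 1.6+
    from django.contrib.gis.geoip import GeoIP
except ImportError:
    # older Django versions
    from django.contrib.gis.utils import GeoIP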