Wednesday, December 13, 2006

Module Methods and Failing Tests

I have been looking at how CPython handles keyword arguments in methods today. I've had my fair share of experience with Python over the years (though not so much in the last few months), but I was totally unaware that methods may or may not support keyword arguments! Maybe that's because I often used the PyQt GUI toolkit bindings, which didn't support keyword arguments anyway; I'm not sure.

At the end of my last entry my _csv module was in a position where I was ready to implement the register_dialect() method. To do this I needed to figure out how Jython handles arguments, as I thought I would need to support keyword arguments for register_dialect() - it's a Python method after all, and all Python methods support keyword arguments, don't they? As it turns out, this isn't always the case! Although not explicitly mentioned in the documentation, some CPython methods don't support keyword arguments, and if you try to pass them you will get a TypeError. Indeed, csv.register_dialect() is one such method:


Python 2.3.6 (#1, Nov 17 2006, 22:32:43)
[GCC 4.1.2 20060928 (prerelease) (Ubuntu 4.1.1-13ubuntu5)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import csv
>>> csv.register_dialect(dialect=None, name="excel")
Traceback (most recent call last):
File "<stdin>", line 1, in ?
TypeError: register_dialect() takes no keyword arguments

Presumably register_dialect() behaves like this because it is not really a Python method. The csv.py module just exposes register_dialect() from the C module, but a normal Python developer would not know this and would quite rightly expect the method to support keyword arguments. This inconsistency is less than ideal and it's tempting to fix it for Jython, but I think that would be a mistake. Jython is supposed to mimic CPython's behaviour, rightly or wrongly. From a Jython perspective, it's right if it's the way CPython behaves.
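The difference is easy to reproduce in a current CPython. Here is a minimal sketch, assuming Python 3 and using the built-in len() as a stand-in for a C-implemented function (register_dialect() itself has a different signature in newer releases). The names pure_python_len and accepts_keywords are hypothetical helpers for illustration only:

```python
def pure_python_len(obj):
    """A function written in Python: its parameter can be passed by keyword."""
    return len(obj)


def accepts_keywords(func, **kwargs):
    """Return True if func can be called with the given keyword arguments."""
    try:
        func(**kwargs)
    except TypeError:
        return False
    return True


# The Python-level function accepts obj=..., the C-implemented built-in does not.
python_ok = accepts_keywords(pure_python_len, obj=[1, 2, 3])
builtin_ok = accepts_keywords(len, obj=[1, 2, 3])
print(python_ok, builtin_ok)
```

Nothing in the calling code hints at which kind of function it is dealing with, which is exactly why the TypeError comes as a surprise.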

So, in the case of register_dialect() I can explicitly specify the arguments as follows:

public static void register_dialect(PyObject name, PyObject dialect) {
}

If I try to run test_csv.py now I get the following error:

Traceback (innermost last):
File "dist/Lib/test/test_csv.py", line 9, in ?
ImportError: no module named gc

The test_csv.py module uses the gc module, which isn't supported by Jython yet. For now, I have sidestepped this problem by making a copy of test_csv.py and removing all the tests that involve gc! Problem solved (temporarily at least)!
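A less drastic alternative to copying the file is to guard the import and skip only the tests that need gc. This is a sketch only: it assumes a modern unittest (skipIf did not exist in the Python of the time), and the test class and method names are hypothetical rather than taken from test_csv.py:

```python
import unittest

try:
    import gc
except ImportError:  # an interpreter that ships without a gc module
    gc = None


class GcDependentTests(unittest.TestCase):
    @unittest.skipIf(gc is None, "gc module not available")
    def test_collect_returns_count(self):
        # gc.collect() returns the number of unreachable objects it found.
        self.assertGreaterEqual(gc.collect(), 0)

    def test_without_gc(self):
        # Tests that do not touch gc run regardless.
        self.assertEqual("a,b".split(","), ["a", "b"])


suite = unittest.defaultTestLoader.loadTestsFromTestCase(GcDependentTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

The upside of this approach is that the skipped tests stay visible in the report instead of silently disappearing from a private copy.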

Now, when I run my own copy of test_csv.py without the gc calls I get yet another annoying error:

Traceback (innermost last):
File "test_csv.py", line 363, in ?
File "test_csv.py", line 364, in TestEscapedExcel
File "/jy/dist/Lib/csv.py", line 39, in __init__
None: Dialect did not validate: quoting parameter not set

This, and no doubt many other cryptic errors to come, is due to the fact that all the identifiers are the wrong type: they are all PyObjects, which confuses Jython a great deal. Now is the right time to revisit each identifier and change it to the correct type.

It's worth noting at this stage that my goal for today is to get _csv into a state where it is good enough to fail all tests. Wow, what a statement - let's say that again: I want _csv to be good enough to fail all tests! What a strange goal to aim for. Well, actually, once I have _csv in a state where test_csv.py can properly execute I am in a far better position than I was before. I can analyse the output of test_csv and tackle one test at a time, gaining satisfaction and confidence as I go. This is one of the primary advantages of Test Driven Development and it's surprising how effective it is.

First, I will tackle the methods. Rather than figure out the parameters for each method I have simply specified "PyObject[] args" as the parameter list which just means the method supports 0 or more arguments. For example, I have implemented unregister_dialect() as follows:

public static PyObject unregister_dialect(PyObject[] args) {
    return null;
}

For all the QUOTE_xxx identifiers I looked in the _csv.c module and saw they were enums. In Java I just made these separate integers to get them to work initially. I left Error as a PyObject as I will need to spend some time looking at exceptions at a later date. Similarly, I have left Dialect well alone and will look into it when the time is right. Finally, I changed __doc__ and __version__ to empty Strings to complete the process.
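From the Python side the QUOTE_xxx names are plain integers, which is why separate int fields are enough to get started. A quick check against CPython's csv module is sketched below; the values are those of current CPython, and the dictionary name quote_constants is only for illustration:

```python
import csv

# The four quoting styles, in the order they are declared in _csv.c.
names = ("QUOTE_MINIMAL", "QUOTE_ALL", "QUOTE_NONNUMERIC", "QUOTE_NONE")
quote_constants = {name: getattr(csv, name) for name in names}

for name, value in quote_constants.items():
    print(name, value)
```

On CPython this prints the values 0 to 3 in that order, so a Java port needs to expose the same integers under the same names.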

With all the identifiers now the correct type, Jython is happy to run test_csv.py. Of course, not many tests pass and there is a lot of output - here's a sample of it:

======================================================================
FAIL: test_reader_arg_valid1 (__main__.Test_Csv)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/jy/dist/Lib/unittest.py", line 229, in __call__
File "test_csv.py", line 19, in test_reader_arg_valid1
File "/jy/dist/Lib/unittest.py", line 295, in failUnlessRaises
AssertionError: TypeError
----------------------------------------------------------------------
Ran 65 tests in 0.666s

FAILED (failures=4, errors=55)
Traceback (innermost last):
File "test_csv.py", line 716, in ?
File "test_csv.py", line 0, in test_main
File "/jy/dist/Lib/test/test_support.py", line 262, in run_unittest
File "/jy/dist/Lib/test/test_support.py", line 246, in run_suite
TestFailed: errors occurred; run in verbose mode for details

I may have 59 failures but this is a much better position than before. I now have something to focus on - I can tackle each test as it comes and gain confidence as the number of failures decreases and the number of passes increases until the porting process is complete. Yippee!

Now I am ready to implement the module proper. The first task is to find out what "c.s.v" stands for! ;) :)
