------------
iopro.pyodbc
------------

This project is an enhancement of the Python database module for ODBC
that implements the Python DB API 2.0 specification.  You can see the
original project here:

:homepage: http://code.google.com/p/pyodbc
:source:   http://github.com/mkleehammer/pyodbc
:source:   http://code.google.com/p/pyodbc/source/list

The enhancements are documented in this file.  For general info about
the pyodbc package, please refer to the original project
documentation.

This module enhancement requires:

* Python 2.4 or greater
* ODBC 3.0 or greater
* NumPy 1.5 or greater (1.7 is required for datetime64 support)

The enhancements in this module consist mainly in the addition of some
new methods for fetching the data after a query and put it in a
variety of NumPy containers.

Using NumPy as data containers instead of the classical list of tuples
has a couple of advantages:

1) The NumPy container is much more compact, and hence, it
requires much less memory, than the original approach.

2) As a NumPy container can hold arbitrarily large arrays, it requires
much less object creation than the original approach (one Python
object per datum retrieved).

This means that this enhancements will allow to fetch data out of
relational databases in a much faster way, while consuming
significantly less resources.


API additions
=============

Variables
~~~~~~~~~

* `pyodbc.npversion`  The version for the NumPy additions

Methods
~~~~~~~

**Cursor.fetchdictarray** (size=cursor.arraysize)

This is similar to the original `Cursor.fetchmany(size)`, but the data
is returned in a dictionary where the keys are the names of the
columns and the values are NumPy containers.

For example, it a SELECT is returning 3 columns with names 'a', 'b'
and 'c' and types `varchar(10)`, `integer` and `timestamp`, the
returned object will be something similar to::

  {'a': array([...], dtype='S11'),
   'b': array([...], dtype=int32),
   'c': array([...], dtype=datetime64[us])}

Note that the `varchar(10)` type is translated automatically to a
string type of 11 elements ('S11').  This is because the ODBC driver
needs one additional space to put the trailing '\0' in strings, and
NumPy needs to provide the room for this.

Also, it is important to stress that all the `timestamp` types are
translated into a NumPy `datetime64` type with a resolution of
microseconds by default.

**Cursor.fetchsarray** (size=cursor.arraysize)

This is similar to the original `Cursor.fetchmany(size)`, but the data
is returned in a NumPy structured array, where the name and type of
the fields matches to those resulting from the SELECT.

Here it is an example of the output for the SELECT above::

  array([(...),
         (...)], 
        dtype=[('a', '|S11'), ('b', '<i4'), ('c', ('<M8[us]', {}))])

Note that, due to efficiency considerations, this method is calling the
`fetchdictarray()` behind the scenes, and then doing a conversion to
get an structured array.  So, in general, this is a bit slower than
its `fetchdictarray()` counterpart.


Data types supported
====================

The new methods listed above have support for a subset of the standard
ODBC.  In particular:

* String support (SQL_VARCHAR) is supported.

* Numerical types, be them integers or floats (single and double
  precision) are fully supported.  Here it is the complete list:
  SQL_INTEGER, SQL_TINYINT, SQL_SMALLINT, SQL_FLOAT and SQL_DOUBLE.

* Dates, times, and timestamps are mapped to the `datetime64` and
  `timedelta` NumPy types.  The list of supported data types are:
  SQL_DATE, SQL_TIME and SQL_TIMESTAMP,

* Binary data is not supported yet.

* Unicode strings are not supported yet.


NULL values
===========

As there is not (yet) a definitive support for missing values (NA) in
NumPy, this module represents NA data as particular values depending
on the data type.  Here it is the current table of the particular
values::

  int8: -128 (-2**7)
  uint8: 255 (2**8-1)
  int16: -32768 (-2**15)
  uint16: 65535 (2**16-1)
  int32: -2147483648 (-2**31)
  uint32: 4294967295 (2**32-1)
  int64: -9223372036854775808 (-2**63)
  uint64: 18446744073709551615 (2**64-1)
  float32: NaN
  float64: NaN
  datetime64: NaT
  timedelta64: NaT (or -2**63)
  string: 'NA'


Improvements for 1.1 release
============================

* The rowcount is not trusted anymore for the `fetchdict()` and
  `fetchsarray()` methods.  Now the NumPy containers are built
  incrementally, using realloc for a better use of resources.

* The Python interpreter does not exit anymore when fetching an exotic
  datatype not supported by NumPy.

* The docsctrings for `fetchdict()` and `fetchsarray()` have been improved.

