### The DAL: A quick tour
web2py defines the following classes that make up the DAL:
The **DAL** object represents a database connection. For example:
``sqlite``:inxx
``
db = DAL('sqlite://storage.db')
``:code
``define_table``:inxx
**Table** represents a database table. You do not directly instantiate Table; instead, ``DAL.define_table`` instantiates it.
``
db.define_table('mytable', Field('myfield'))
``:code
``insert``:inxx ``truncate``:inxx ``drop``:inxx ``import_from_csv_file``:inxx
The most important methods of a Table are ``insert``, ``truncate``, ``drop``, and ``import_from_csv_file``.
``Query``:inxx
**Query** is an object that represents a database "where" clause. For example:
``
myquery = (db.mytable.myfield != None) | (db.mytable.myfield > 'A')
``:code
``Set``:inxx
**Set** is an object that represents a set of records. Its most important methods are ``count``, ``select``, ``update``, and ``delete``. For example:
``
myset = db(myquery)
rows = myset.select()
myset.update(myfield='somevalue')
myset.delete()
``:code
``Expression``:inxx
**Expression** is something like an ``orderby`` or ``groupby`` expression. The Field class is derived from Expression. Here is an example:
``
myorder = db.mytable.myfield.upper() | db.mytable.id
db().select(db.mytable.ALL, orderby=myorder)
``:code
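As a quick sketch tying these pieces together (assuming the ``mytable`` definition above):
``
rows = db(db.mytable.myfield != None).select(orderby=db.mytable.myfield)
for row in rows:
    print row.myfield
``:code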
### Using the DAL "stand-alone"
The web2py DAL can be used in a non-web2py environment via
``
from gluon import DAL, Field
# also consider: from gluon.validators import *
``:code

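For example, a minimal stand-alone script might look like the following sketch (the file and table names are illustrative; note the explicit ``folder`` and the explicit ``db.commit()``, both of which web2py normally handles for you):
``
from gluon import DAL, Field

db = DAL('sqlite://storage.db', folder='databases')  # .table files go here
db.define_table('thing', Field('name'))
db.thing.insert(name='Chair')
db.commit()  # required outside web2py
print db(db.thing.id > 0).count()
``:code
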

[[dal_constructor]]
### DAL constructor
Basic use:
``
>>> db = DAL('sqlite://storage.db')
``:code

The database is now connected and the connection is stored in the global variable ``db``.

At any time you can retrieve the connection string.
``_uri``:inxx
``
>>> print db._uri
sqlite://storage.db
``:code

and the database name
``_dbname``:inxx
``
>>> print db._dbname
sqlite
``:code

The connection string is called a ``_uri`` because it is an instance of a Uniform Resource Identifier.

The DAL allows multiple connections with the same database or with different databases, even databases of different types. For now, we will assume the presence of a single database since this is the most common situation.

#### DAL signature
``
DAL(uri='sqlite://dummy.db',
    pool_size=0,
    folder=None,
    db_codec='UTF-8',
    check_reserved=None,
    migrate=True,
    fake_migrate=False,
    migrate_enabled=True,
    fake_migrate_all=False,
    decode_credentials=False,
    driver_args=None,
    adapter_args=None,
    attempts=5,
    auto_import=False,
    bigint_id=False,
    debug=False,
    lazy_tables=False,
    db_uid=None,
    do_connect=True,
    after_connection=None,
    tables=None,
    ignore_field_case=True,
    entity_quoting=False,
    table_hash=None)
``:code

[[connection_strings]]
#### Connection strings (the uri parameter)
``connection strings``:inxx
A connection with the database is established by creating an instance of the DAL object:
``
>>> db = DAL('sqlite://storage.db', pool_size=0)
``:code
``db`` is not a keyword; it is a local variable that stores the connection object ``DAL``. You are free to give it a different name. The constructor of ``DAL`` requires a single argument, the connection string. The connection string is the only web2py code that depends on a specific back-end database. Here are examples of connection strings for specific types of supported back-end databases (in all cases, we assume the database is running from localhost on its default port and is named "test"):
``ndb``:inxx
-------------
**SQLite** | ``sqlite://storage.db``
**MySQL** | ``mysql://username:password@localhost/test``
**PostgreSQL** | ``postgres://username:password@localhost/test``
**MSSQL** | ``mssql://username:password@localhost/test``
**FireBird** | ``firebird://username:password@localhost/test``
**Oracle** | ``oracle://username/password@test``
**DB2** | ``db2://username:password@test``
**Ingres** | ``ingres://username:password@localhost/test``
**Sybase** | ``sybase://username:password@localhost/test``
**Informix** | ``informix://username:password@test``
**Teradata** | ``teradata://DSN=dsn;UID=user;PWD=pass;DATABASE=test``
**Cubrid** | ``cubrid://username:password@localhost/test``
**SAPDB** | ``sapdb://username:password@localhost/test``
**IMAP** | ``imap://user:password@server:port``
**MongoDB** | ``mongodb://username:password@localhost/test``
**Google/SQL** | ``google:sql://project:instance/database``
**Google/NoSQL** | ``google:datastore``
**Google/NoSQL/NDB** | ``google:datastore+ndb``
-------------
Notice that in SQLite the database consists of a single file. If it does not exist, it is created. This file is locked every time it is accessed. In the case of MySQL, PostgreSQL, MSSQL, FireBird, Oracle, DB2, Ingres and Informix the database "test" must be created outside web2py. Once the connection is established, web2py will create, alter, and drop tables appropriately.
In the Google/NoSQL case the ``+ndb`` option turns on NDB. NDB uses a Memcache buffer to read data that is accessed often. This is completely automatic and done at the datastore level, not at the web2py level.
It is also possible to set the connection string to ``None``. In this case DAL will not connect to any back-end database, but the API can still be accessed for testing. Examples of this will be discussed in Chapter 7.
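For example, a sketch of the no-connection mode:
``
db = DAL(None)
db.define_table('thing', Field('name'))
``:code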
Sometimes you may need to generate SQL as if you had a connection but without actually connecting to the database. This can be done with
``
db = DAL('...', do_connect=False)
``:code
In this case you will be able to call ``_select``, ``_insert``, ``_update``, and ``_delete`` to generate SQL but not call ``select``, ``insert``, ``update``, and ``delete``. In most of the cases you can use ``do_connect=False`` even without having the required database drivers.
Notice that by default web2py uses utf8 character encoding for databases. If you work with existing databases that behave differently, you have to change it with the optional parameter ``db_codec`` like
``
db = DAL('...', db_codec='latin1')
``:code
Otherwise you'll get UnicodeDecodeError tickets.

#### Connection pooling
``connection pooling``:inxx
A common argument of the DAL constructor is the ``pool_size``; it defaults to zero.
As it is rather slow to establish a new database connection for each request, web2py implements a mechanism for connection pooling. Once a connection is established and the page has been served and the transaction completed, the connection is not closed but goes into a pool. When the next http request arrives, web2py tries to recycle a connection from the pool and use that for the new transaction. If there are no available connections in the pool, a new connection is established.
When web2py starts, the pool is always empty. The pool grows up to the minimum between the value of ``pool_size`` and the max number of concurrent requests. This means that if ``pool_size=10`` but our server never receives more than 5 concurrent requests, then the actual pool size will only grow to 5. If ``pool_size=0`` then connection pooling is not used.
Connections in the pools are shared sequentially among threads, in the sense that they may be used by two different but not simultaneous threads. There is only one pool for each web2py process.
The ``pool_size`` parameter is ignored by SQLite and Google App Engine. In the case of SQLite, pooling would not yield any benefit.
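For example, to keep up to 10 connections in the pool (a sketch; the credentials are illustrative):
``
db = DAL('postgres://username:password@localhost/test', pool_size=10)
``:code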
#### Connection failures (attempts parameter)

If web2py fails to connect to the database, it waits one second and by default tries again up to 5 times before declaring a failure. In case of connection pooling it is possible that a pooled connection that stays open but unused for some time is closed by the database end. Thanks to the retry feature web2py tries to re-establish these dropped connections.
The number of attempts is set via the ``attempts`` parameter.
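For example, to allow more retries (a sketch):
``
db = DAL('mysql://username:password@localhost/test', attempts=10)
``:code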
#### Lazy Tables
Setting ``lazy_tables = True`` provides a major performance boost. See below: [[lazy tables #lazy_tables]]
#### Replicated databases
The first argument of ``DAL(...)`` can be a list of URIs. In this case web2py tries to connect to each of them. The main purpose for this is to deal with multiple database servers and distribute the workload among them. Here is a typical use case:
``
db = DAL(['mysql://...1','mysql://...2','mysql://...3'])
``:code
In this case the DAL tries to connect to the first and, on failure, it
will try the second and the third. This can also be used to distribute load
in a database master-slave configuration. We will talk more about this
in Chapter 13 in the context of scalability.
#### Reserved keywords
``reserved Keywords``:inxx
``check_reserved`` tells the constructor to check table names and column names against reserved SQL keywords in target back-end databases. ``check_reserved`` defaults to None.
This is a list of strings that contain the database back-end adapter names.
The adapter name is the same as used in the DAL connection string. So if you want to check against PostgreSQL and MSSQL then your connection string would look as follows:
``
db = DAL('sqlite://storage.db',
check_reserved=['postgres', 'mssql'])
``:code
The DAL will scan the keywords in the same order as the list.
There are two extra options "all" and "common". If you specify all, it will check against all known SQL keywords. If you specify common, it will only check against common SQL keywords such as ``SELECT``, ``INSERT``, ``UPDATE``, etc.
For supported back-ends you may also specify if you would like to check against the non-reserved SQL keywords as well. In this case you would append ``_nonreserved`` to the name. For example:
``
check_reserved=['postgres', 'postgres_nonreserved']
``:code
The following database backends support reserved words checking.
-----
**PostgreSQL** | ``postgres(_nonreserved)``
**MySQL** | ``mysql``
**FireBird** | ``firebird(_nonreserved)``
**MSSQL** | ``mssql``
**Oracle** | ``oracle``
-----
#### Database quoting and case settings (entity_quoting, ignore_field_case)
You can also use explicit quoting of SQL entities at the DAL level. It works transparently so you can use the same names in Python and in the DB schema. The relevant parameters are ``ignore_field_case`` and ``entity_quoting``.
Here is an example:
``
db = DAL('postgres://...', ...,ignore_field_case=False, entity_quoting=True)
db.define_table('table1', Field('column'), Field('COLUMN'))
print db(db.table1.COLUMN != db.table1.column).select()
``:code
#### Other DAL constructor parameters
##### Database folder location
``folder`` sets where the .table files will be created. It is set automatically within web2py; use an explicit path when using the DAL outside web2py.
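For example, outside web2py (a sketch; the path is illustrative):
``
db = DAL('sqlite://storage.db', folder='path/to/app/databases')
``:code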
##### Default migration settings
Migration is detailed below in [[table migrations #table_migrations]]. The DAL constructor migration settings are booleans affecting defaults and global behaviour.
``migrate = True`` sets the default migrate behavior for all tables

``fake_migrate = False`` sets the default fake_migrate behavior for all tables

``migrate_enabled = True`` If set to False, disables ALL migrations
``fake_migrate_all = False`` If set to True, fake migrates ALL tables
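For example, to freeze the schema in production (a sketch):
``
db = DAL('sqlite://storage.db', migrate_enabled=False)
``:code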

#### Experiment with the web2py shell

You can experiment with the DAL API using the web2py shell (the -S [[command line option ../04#CommandLineOptions]]).

Start by creating a connection. For the sake of example, you can use SQLite. Nothing in this discussion changes when you change the back-end engine.
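For example, assuming an application named ``myapp``, the following starts a shell with the application models (and thus ``db``) already loaded:
``
python web2py.py -S myapp -M
``:code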
[[table_constructor]]
### Table constructor
``define_table``:inxx ``Field``:inxx
#### define_table signature
Tables are defined in the DAL via ``define_table``:
``
>>> db.define_table('person', Field('name'))
``:code
Besides the table name and the fields, ``define_table`` accepts the following optional keyword arguments, many of which are described below: ``rname``, ``redefine``, ``common_filter``, ``fake_migrate``, ``fields``, ``format``, ``migrate``, ``on_define``, ``plural``, ``polymodel``, ``primarykey``, ``sequence_name``, ``singular``, ``table_class``, and ``trigger_name``.
It defines, stores and returns a ``Table`` object called "person" containing a field (column) "name". This object can also be accessed via ``db.person``, so you do not need to catch the return value.
#### ``id``: The primary key ``id`` field
Do not declare a field called "id", because one is created by web2py anyway. Every table has a field called "id" by default. It is an auto-increment integer field (starting at 1) used for cross-reference and for making every record unique, so "id" is a primary key. (Note: the id counter starting at 1 is back-end specific. For example, this does not apply to the Google App Engine NoSQL.)
``named id field``:inxx
Optionally you can define a Field of ``type='id'`` and web2py will use this field as the auto-increment id field. This is not recommended except when accessing legacy database tables which have a primary key under a different name. With some limitation, you can also use different primary keys via the ``primarykey`` parameter; [[primarykey #primarykey]] is explained shortly below.
#### ``plural`` and ``singular``
Smartgrid objects may need to know the singular and plural name of the table. The defaults are smart but these parameters allow you to be specific. See smartgrid for more information.
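For example (a sketch):
``
db.define_table('person', Field('name'),
    singular='Person', plural='People')
``:code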
#### ``redefine``
Tables can be defined only once but you can force web2py to redefine an existing table:
``
db.define_table('person', Field('name'))
db.define_table('person', Field('name'), redefine=True)
``:code
The redefinition may trigger a migration if field content is different.
[[record_representation]]
#### format: Record representation
It is optional but recommended to specify a format representation for records with the ``format`` parameter.
``
>>> db.define_table('person', Field('name'), format='%(name)s')
``:code
or
``
>>> db.define_table('person', Field('name'), format='%(name)s %(id)s')
``:code
or even more complex ones using a function:
``
>>> db.define_table('person', Field('name'),
format=lambda r: r.name or 'anonymous')
``:code
The format attribute will be used for two purposes:
- To represent referenced records in select/option drop-downs.
- To set the ``db.othertable.person.represent`` attribute for all fields referencing this table. This means that SQLTABLE will not show references by id but will use the format preferred representation instead.
#### rname: Table real name
``rname`` sets a database backend name for the table. This makes the web2py table name an alias, and ``rname`` is the real name used when constructing the query for the backend.
To illustrate just one use, ``rname`` can be used to provide MSSQL fully qualified table names accessing tables belonging to other databases on the server: ``rname = 'db1.dbo.table1'``:code

[[primarykey]]
#### primarykey: Support for legacy tables
``primarykey`` helps support legacy tables with existing primary keys, even multi-part.
See [[Legacy Databases #LegacyDatabases]] below.

#### migrate, fake_migrate
``migrate`` sets migration options for the table. See [[Table Migrations #table_migrations]] below.

#### table_class
If you define your own Table class as a sub-class of gluon.dal.Table, you can provide it here; this allows you to extend and override methods. Example: ``table_class=MyTable``:code
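A minimal sketch, assuming a hypothetical subclass (the class and method names are illustrative):
``
from gluon.dal import Table

class MyTable(Table):
    def describe(self):
        # illustrative helper; _tablename and fields are standard Table attributes
        return 'table %s with fields %s' % (self._tablename, self.fields)

db.define_table('person', Field('name'), table_class=MyTable)
``:code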

#### polymodel
For Google App Engine.

#### on_define
``on_define`` is a callback triggered when a lazy table is instantiated, although it is called anyway if the table is not lazy. This allows dynamic changes to the table without losing the advantages of delayed instantiation.

Example:
``
db = DAL(lazy_tables=True)
db.define_table('person', Field('name'), Field('age', 'integer'),
    on_define=lambda table: [
        table.name.set_attributes(requires=IS_NOT_EMPTY(), default=''),
        table.age.set_attributes(requires=IS_INT_IN_RANGE(0, 120), default=30)])
``:code
Note this example shows how to use ``on_define`` but it is not actually necessary. The simple ``requires`` values could be added to the Field definitions and the table would still be lazy. However, ``requires`` which take a Set object as the first argument, such as IS_IN_DB, will make a query like ``db.sometable.somefield == some_value``:code which would cause ``sometable`` to be defined early. This is the situation saved by ``on_define``.
[[lazy_tables]]
#### Lazy Tables, a major performance boost
``lazy tables``:inxx
web2py models are executed before controllers, so all tables are defined at every request. Not all tables are needed to handle each request, so it is possible that some of the time spent defining tables is wasted. Conditional models ([[conditional models, chapter 4 ../04/#conditional_models]]) can help, but web2py offers a big performance boost via lazy_tables. This feature means that table creation is deferred until the table is actually referenced. Enabling lazy tables is done when initialising a database via the DAL constructor, by setting the ``DAL(..., lazy_tables=True)`` parameter. This is one of the most significant response-time performance boosts in web2py.

#### Adding attributes to fields and tables
If you need to add custom attributes to fields, you can simply do this:
``db.table.field.extra = {}``:code

"extra" is not a keyword; it's a custom attribute now attached to the field object. You can do it with tables too, but the attribute name must be preceded by an underscore to avoid naming conflicts with fields:

``db.table._extra = {}``:code

[[field_constructor]]
### Field constructor
``Field constructor``:inxx
These are the default values of a Field constructor:
``
Field(name, 'string', length=None, default=None,
required=False, requires='<default>',
ondelete='CASCADE', notnull=False, unique=False,
uploadfield=True, widget=None, label=None, comment=None,
writable=True, readable=True, update=None, authorize=None,
autodelete=False, represent=None, compute=None,
uploadfolder=None,
      uploadseparate=None, uploadfs=None,
      rname=None)
``:code
Not all of them are relevant for every field. "length" is relevant only for fields of type "string". "uploadfield" and "authorize" are relevant only for fields of type "upload". "ondelete" is relevant only for fields of type "reference" and "upload".
- ``length`` sets the maximum length of a "string", "password" or "upload" field. If ``length`` is not specified a default value is used but the default value is not guaranteed to be backward compatible. ''To avoid unwanted migrations on upgrades, we recommend that you always specify the length for string, password and upload fields.''
- ``default`` sets the default value for the field. The default value is used when performing an insert if a value is not explicitly specified. It is also used to pre-populate forms built from the table using SQLFORM. Note, rather than being a fixed value, the default can instead be a function (including a lambda function) that returns a value of the appropriate type for the field. In that case, the function is called once for each record inserted, even when multiple records are inserted in a single transaction.
- ``required`` tells the DAL that no insert should be allowed on this table if a value for this field is not explicitly specified.
- ``requires`` is a validator or a list of validators. This is not used by the DAL, but it is used by SQLFORM. The default validators for the given types are shown in the following table:
- ``uploadfolder`` defaults to ``None``, in which case most DB adapters upload files into ``os.path.join(request.folder, 'uploads')``. MongoAdapter does not seem to do so at present.
- ``rname`` provides the field with a "real name", a name for the field known to the database adapter; when the field is used, it is the rname value which is sent to the database. The web2py name for the field is then effectively an alias.

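For example, a single field combining several of these parameters (a sketch):
``
Field('name', 'string', length=64, default='unknown',
      required=True, requires=IS_NOT_EMPTY(),
      label='Full name')
``:code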
[[field_types]]
#### Field types
``field types``:inxx
----------
**field type** | **default field validators**
``string`` | ``IS_LENGTH(length)`` default length is 512
``text`` | ``IS_LENGTH(65536)``
``blob`` | ``None``
``boolean`` | ``None``
``integer`` | ``IS_INT_IN_RANGE(-1e100, 1e100)``
``double`` | ``IS_FLOAT_IN_RANGE(-1e100, 1e100)``
``decimal(n,m)`` | ``IS_DECIMAL_IN_RANGE(-1e100, 1e100)``
``date`` | ``IS_DATE()``
``time`` | ``IS_TIME()``
``datetime`` | ``IS_DATETIME()``
``password`` | ``None``
``upload`` | ``None``
``reference <table>`` | ``IS_IN_DB(db,table.field,format)``
``list:string`` | ``None``
``list:integer`` | ``None``
``list:reference <table>`` | ``IS_IN_DB(db, table.field, format, multiple=True)``
``bigint`` | ``None``
``big-id`` | ``None``
``big-reference`` | ``None``
----------
A Field object also carries attributes about where it is defined, including its parent table, tablename, and parent connection:
``
>>> db.person.name._table == db.person
True
>>> db.person.name._tablename == 'person'
True
>>> db.person.name._db == db
True
``:code
A field also has methods. Some of them are used to build queries and we will see them later.
A special method of the field object is ``validate`` and it calls the validators for the field.
``
print db.person.name.validate('John')
``:code
which returns a tuple ``(value, error)``. ``error`` is ``None`` if the input passes validation.
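For example, a sketch of validating input before an insert:
``
value, error = db.person.name.validate('John')
if error is None:
    db.person.insert(name=value)
``:code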
[[table_migrations]]
### Migrations
``migrations``:inxx
``define_table`` checks whether or not the corresponding table exists. If it does not, it generates the SQL to create it and executes the SQL. If the table does exist but differs from the one being defined, it generates the SQL to alter the table and executes it. If a field has changed type but not name, it will try to convert the data (if you do not want this, you need to redefine the table twice: the first time letting web2py drop the field by removing it, and the second time adding the newly defined field so that web2py can create it). If the table exists and matches the current definition, it will leave it alone. In all cases it will create the ``db.person`` object that represents the table.
We refer to this behavior as a "migration". web2py logs all migrations and migration attempts in the file "databases/sql.log".
The first argument of ``define_table`` is always the table name. The other unnamed arguments are the fields (Field). The function also takes an optional keyword argument called "migrate":
``
>>> db.define_table('person', Field('name'), migrate='person.table')
``:code
The value of migrate is the filename (in the "databases" folder for the application) where web2py stores internal migration information for this table.
These files are very important and should never be removed while the corresponding tables exist. In cases where a table has been dropped and the corresponding file still exists, it can be removed manually. By default, migrate is set to True. This causes web2py to generate the filename from a hash of the connection string. If migrate is set to False, the migration is not performed, and web2py assumes that the table exists in the datastore and that it contains (at least) the fields listed in ``define_table``.
The best practice is to give an explicit name to the migrate file.
There may not be two tables in the same application with the same migrate filename.
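For example, a sketch giving each table its own migration file:
``
db.define_table('person', Field('name'), migrate='person.table')
db.define_table('dog', Field('name'), migrate='dog.table')
``:code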
The DAL class also takes a "migrate" argument, which determines the default value of migrate for calls to ``define_table``. For example,
``
>>> db = DAL('sqlite://storage.db', migrate=False)
``:code

### ``truncate`` and ``bulk_insert``
``truncate``:inxx
The ``truncate`` method empties a table without dropping it. After a truncate, if you insert a record again, the counter starts again at 1 (this is back-end specific and does not apply to the Google App Engine NoSQL).
Notice you can pass parameters to ``truncate``, for example you can tell SQLite to restart the id counter.
``
db.person.truncate('RESTART IDENTITY CASCADE')
``:code
The argument is in raw SQL and therefore engine specific.
``bulk_insert``:inxx
web2py also provides a bulk_insert method
``
>>> db.person.bulk_insert([{'name':'Alex'}, {'name':'John'}, {'name':'Tim'}])
[3,4,5]
``:code
It takes a list of dictionaries of fields to be inserted and performs multiple inserts at once. It returns the IDs of the inserted records. On the supported relational databases there is no advantage in using this function as opposed to looping and performing individual inserts, but on the Google App Engine NoSQL there is a major speed advantage.
### ``commit`` and ``rollback``
No create, drop, insert, truncate, delete, or update operation is actually committed until web2py issues the commit command. In models, views and controllers, web2py does this for you, but in modules you are required to do the commit.
``commit``:inxx
``
>>> db.commit()
``:code
To check it let's insert a new record:
``
>>> db.person.insert(name="Bob")
2
``:code
and roll back, i.e., ignore all operations since the last commit:
``rollback``:inxx
``
>>> db.rollback()
``:code
If you now insert again, the counter will again be set to 2, since the previous insert was rolled back.
``
>>> db.person.insert(name="Bob")
2
``:code
Code in models, views and controllers is enclosed in web2py code that looks like this:
``
try:
    execute models, controller function and view
except:
    rollback all connections
    log the traceback
    send a ticket to the visitor
else:
    commit all connections
    save cookies, sessions and return the page
``:code
So in models, views and controllers there is no need to ever call ``commit`` or ``rollback`` explicitly unless you need more granular control. However, in modules you will need to use ``commit()``.
### ``drop``
Finally, you can drop tables and all data will be lost:
``drop``:inxx
``
>>> db.person.drop()
``:code
Note for sqlite: web2py will not re-create the dropped table until you navigate the file system to the databases directory of your app, and delete the file associated with the dropped table.
### Indexes
Currently the DAL API does not provide a command to create indexes on tables, but this can be done using the ``executesql`` command. This is because the existence of indexes can make migrations complex, and it is better to deal with them explicitly. Indexes may be needed for those fields that are used in recurrent queries.
Here is an example of how to [[create an index using SQL in SQLite http://www.sqlite.org/lang_createindex.html]]:
``
>>> db = DAL('sqlite://storage.db')
>>> db.define_table('person', Field('name'))
>>> db.executesql('CREATE INDEX IF NOT EXISTS myidx ON person (name);')
``:code
Other database dialects have very similar syntaxes but may not support the optional "IF NOT EXISTS" directive.
[[LegacyDatabases]]
### Legacy databases and keyed tables
web2py can connect to legacy databases under some conditions.
The easiest way is when these conditions are met:
- Each table must have a unique auto-increment integer field called "id"
- Records must be referenced exclusively using the "id" field.
When accessing an existing table, i.e., a table not created by web2py in the current application, always set ``migrate=False``.
If the legacy table has an auto-increment integer field but it is not called "id", web2py can still access it, but then the table definition must explicitly contain ``Field('....','id')``, where .... is the name of the auto-increment integer field.
``keyed table``:inxx
Finally if the legacy table uses a primary key that is not an auto-increment id field it is possible to use a "keyed table", for example:
``
db.define_table('account',
Field('accnum','integer'),
Field('acctype'),
Field('accdesc'),
    primarykey=['accnum','acctype'],
    migrate=False)
``:code
- ``primarykey`` is a list of the field names that make up the primary key.
#### Common fields and multi-tenancy
``common fields``:inxx ``multi tenancy``:inxx
``db._common_fields`` is a list of fields that should belong to all the tables. This list can also contain tables and it is understood as all fields from the table. For example, occasionally you find yourself in need to add a signature to all your tables but the ``auth`` tables. In this case, after you ``auth.define_tables()`` but before defining any other table, insert
``
db._common_fields.append(auth.signature)
``:code
One field is special: "request_tenant".
This field does not exist but you can create it and add it to any of your tables (or them all):
``
db._common_fields.append(Field('request_tenant',
default=request.env.http_host,writable=False))
``:code
For every table with a field whose name matches ``db._request_tenant`` (by default "request_tenant"), all records for all queries are always automatically filtered by:
``
db.table.request_tenant == db.table.request_tenant.default
``:code
and for every record inserted, this field is set to the default value.
In the example above we have chosen
``
default = request.env.http_host
``:code
i.e. we have chosen to ask our app to filter all tables in all queries with
``
db.table.request_tenant == request.env.http_host
``:code
This simple trick allows us to turn any application into a multi-tenant application: even if we run one instance of the app and we use one single database, when the app is accessed under two or more domains (in the example the domain name is retrieved from ``request.env.http_host``) the visitors will see different data depending on the domain. Think of running multiple web stores under different domains with one app and one database.
``ignore_common_filters``:inxx
You can turn off multi-tenancy filters using:
``
rows = db(query, ignore_common_filters=True).select()
``:code
#### Common filters
A common filter is a generalization of the above multi-tenancy idea.
It provides an easy way to avoid repeating the same query everywhere.
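For example, a sketch (the table and field names are illustrative; ``common_filter`` is the ``define_table`` parameter listed above):
``
db.define_table('blog_post',
    Field('subject'),
    Field('post_text', 'text'),
    Field('is_public', 'boolean'),
    common_filter=lambda query: db.blog_post.is_public == True)
``:code
Any select, delete or update on this table will then include only public blog posts; as with the multi-tenancy filter, individual queries can bypass it with ``ignore_common_filters=True``.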
#### Moving your data to a different database
Assume you have an existing database, for example:
``
db = DAL('sqlite://storage.sqlite')
``:code
and you wish to move to another database using a different connection string:
``
db = DAL('postgres://username:password@localhost/mydb')
``:code
Before you switch, you want to move the data and rebuild all the metadata for the new database. We assume the new database to exist but we also assume it is empty.
Web2py provides a script that does this work for you:
``
cd web2py
python scripts/cpdb.py \
-f applications/app/databases \
-y 'sqlite://storage.sqlite' \
-Y 'postgres://username:password@localhost/mydb'
``:code
After running the script you can simply switch the connection string in the model and everything should work out of the box. The new data should be there.
This script provides various command line options that allow you to move data from one application to another, move all tables or only some tables, and clear the data in the tables. For more information, run the script with the ``-h`` option.
-### Connection strings
``connection strings``:inxx
A connection with the database is established by creating an instance of the DAL object:
``
>>> db = DAL('sqlite://storage.db', pool_size=0)
``:code
``db`` is not a keyword; it is a local variable that stores the connection object ``DAL``. You are free to give it a different name. The constructor of ``DAL`` requires a single argument, the connection string. The connection string is the only web2py code that depends on a specific back-end database. Here are examples of connection strings for specific types of supported back-end databases (in all cases, we assume the database is running from localhost on its default port and is named "test"):
-------------
**SQLite** | ``sqlite://storage.db``
**MySQL** | ``mysql://username:password@localhost/test``
**PostgreSQL** | ``postgres://username:password@localhost/test``
**MSSQL** | ``mssql://username:password@localhost/test``
**FireBird** | ``firebird://username:password@localhost/test``
**Oracle** | ``oracle://username/password@test``
**DB2** | ``db2://username:password@test``
**Ingres** | ``ingres://username:password@localhost/test``
**Sybase** | ``sybase://username:password@localhost/test``
**Informix** | ``informix://username:password@test``
**Teradata** | ``teradata://DSN=dsn;UID=user;PWD=pass;DATABASE=test``
A connection with the database is established by creating an instance of the DAL
-------------
Notice that in SQLite the database consists of a single file. If it does not exist, it is created. This file is locked every time it is accessed. In the case of MySQL, PostgreSQL, MSSQL, FireBird, Oracle, DB2, Ingres and Informix the database "test" must be created outside web2py. Once the connection is established, web2py will create, alter, and drop tables appropriately.
It is also possible to set the connection string to ``None``. In this case DAL will not connect to any back-end database, but the API can still be accessed for testing. Examples of this will be discussed in Chapter 7.
Some times you may need to generate SQL as if you had a connection but without actually connecting to the database. This can be done with
``
db = DAL('...', do_connect=False)
``:code
In this case you will be able to call ``_select``, ``_insert``, ``_update``, and ``_delete`` to generate SQL but not call ``select``, ``insert``, ``update``, and ``delete``. In most of the cases you can use ``do_connect=False`` even without having the required database drivers.
Notice that by default web2py uses utf8 character encoding for databases. If you work with existing databases that behave differently, you have to change it with the optional parameter ``db_codec`` like
``
db = DAL('...', db_codec='latin1')
``:code
-otherwise you'll get UnicodeDecodeError tickets.
#### Connection pooling
``connection pooling``:inxx
The second argument of the DAL constructor is the ``pool_size``; it defaults to zero.
As it is rather slow to establish a new database connection for each request, web2py implements a mechanism for connection pooling. Once a connection is established and the page has been served and the transaction completed, the connection is not closed but goes into a pool. When the next http request arrives, web2py tries to recycle a connection from the pool and use that for the new transaction. If there are no available connections in the pool, a new connection is established.
When web2py starts, the pool is always empty. The pool grows up to the minimum between the value of ``pool_size`` and the max number of concurrent requests. This means that if ``pool_size=10`` but our server never receives more than 5 concurrent requests, then the actual pool size will only grow to 5. If ``pool_size=0`` then connection pooling is not used.
Connections in the pools are shared sequentially among threads, in the sense that they may be used by two different but not simultaneous threads. There is only one pool for each web2py process.
The ``pool_size`` parameter is ignored by SQLite and Google App Engine.
Connection pooling is ignored for SQLite, since it would not yield any benefit.
-#### Connection failures
-If web2py fails to connect to the database it waits 1 seconds and tries again up to 5 times before declaring a failure. In case of connection pooling it is possible that a pooled connection that stays open but unused for some time is closed by the database end. Thanks to the retry feature web2py tries to re-establish these dropped connections.
#### Replicated databases
The first argument of ``DAL(...)`` can be a list of URIs. In this case web2py tries to connect to each of them. The main purpose for this is to deal with multiple database servers and distribute the workload among them). Here is a typical use case:
``
db = DAL(['mysql://...1','mysql://...2','mysql://...3'])
``:code
In this case the DAL tries to connect to the first and, on failure, it
will try the second and the third. This can also be used to distribute load
in a database master-slave configuration. We will talk more about this
in Chapter 13 in the context of scalability.
### Reserved keywords
``reserved Keywords``:inxx
``check_reserved`` is yet another argument that can be passed to the DAL constructor. It tells it to check table names and column names against reserved SQL keywords in target back-end databases. ``check_reserved`` defaults to None.
This is a list of strings that contain the database back-end adapter names.
The adapter name is the same as used in the DAL connection string. So if you want to check against PostgreSQL and MSSQL then your connection string would look as follows:
``
db = DAL('sqlite://storage.db',
check_reserved=['postgres', 'mssql'])
``:code
The DAL will scan the keywords in the same order as of the list.
There are two extra options "all" and "common". If you specify all, it will check against all known SQL keywords. If you specify common, it will only check against common SQL keywords such as ``SELECT``, ``INSERT``, ``UPDATE``, etc.
For supported back-ends you may also specify if you would like to check against the non-reserved SQL keywords as well. In this case you would append ``_nonreserved`` to the name. For example:
``
check_reserved=['postgres', 'postgres_nonreserved']
``:code
The following database backends support reserved words checking.
-----
**PostgreSQL** | ``postgres(_nonreserved)``
**MySQL** | ``mysql``
**FireBird** | ``firebird(_nonreserved)``
**MSSQL** | ``mssql``
**Oracle** | ``oracle``
-----
-[[DAL_table_field]]
-### ``DAL``, ``Table``, ``Field``
You can experiment with the DAL API using the web2py shell.
Start by creating a connection. For the sake of example, you can use SQLite. Nothing in this discussion changes when you change the back-end engine.
-[[dal_constructor]]
-#### DAL constructor
``
->>> db = DAL('sqlite://storage.db')
``:code
The database is now connected and the connection is stored in the global variable ``db``.
-At any time you can retrieve the connection string.
-``_uri``:inxx
-``
->>> print db._uri
sqlite://storage.db
``:code
-and the database name
-``_dbname``:inxx
-``
->>> print db._dbname
-sqlite
``:code
-The connection string is called a ``_uri`` because it is an instance of a Uniform Resource Identifier.
-The DAL allows multiple connections with the same database or with different databases, even databases of different types. For now, we will assume the presence of a single database since this is the most common situation.
-[[table_constructor]]
-#### Table constructor
``define_table``:inxx ``Field``:inxx
-``type``:inxx ``length``:inxx ``default``:inxx ``requires``:inxx ``required``:inxx ``unique``:inxx
-``notnull``:inxx ``ondelete``:inxx ``uploadfield``:inxx ``uploadseparate``:inxx ``migrate``:inxx ``sql.log``:inxx
-The most important method of a DAL is ``define_table``:
-``
->>> db.define_table('person', Field('name'))
``:code
It defines, stores and returns a ``Table`` object called "person" containing a field (column) "name". This object can also be accessed via ``db.person``, so you do not need to catch the return value.
-Do not declare a field called "id", because one is created by web2py anyway. Every table has a field called "id" by default. It is an auto-increment integer field (starting at 1) used for cross-reference and for making every record unique, so "id" is a primary key. (Note: the id's starting at 1 is back-end specific. For example, this does not apply to the Google App Engine NoSQL.)
``named id field``:inxx
Optionally you can define a field of ``type='id'`` and web2py will use this field as auto-increment id field. This is not recommended except when accessing legacy database tables. With some limitation, you can also use different primary keys and this is discussed in the section on "Legacy databases and keyed tables".
Tables can be defined only once but you can force web2py to redefine an existing table:
``
db.define_table('person', Field('name'))
db.define_table('person', Field('name'), redefine=True)
``:code
The redefinition may trigger a migration if field content is different.
[[lazy_tables]]
#### Lazy Tables, a major performance boost
``lazy tables``:inxx
web2py models are executed before controllers, so all tables are defined at every request. Not all tables are needed to handle each request, so it is possible that some of the time spent defining tables is wasted. Conditional models ([[conditional models, chapter 4 ../04/#conditional_models]]) can help, but web2py offers a big performance boost via lazy_tables. This feature means that table creation is deferred until the table is actually referenced. Enabling lazy tables requires setting the ``DAL(...,lazy_tables=True)`` parameter. This is one of the most significant response-time performance boosts in web2py.
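For example, a minimal sketch of a model using lazy tables:
``
db = DAL('sqlite://storage.db', lazy_tables=True)
db.define_table('person', Field('name'))
# the table is only actually instantiated on first access, e.g.:
row = db.person(1)
``:code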

#### Adding attributes to fields and tables
If you need to add custom attributes to fields, you can simply do this:
``db.table.field.extra = {}``:code

"extra" is not a keyword; it is a custom attribute now attached to the field object. You can do it with tables too, but the attribute name must be preceded by an underscore to avoid naming conflicts with fields:

``db.table._extra = {}``:code

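A concrete sketch (the attribute names ``tooltip`` and ``owner`` are purely illustrative):
``
db.define_table('person', Field('name'))
db.person.name.extra = {'tooltip': 'Full legal name'}  # custom attribute on a field
db.person._extra = {'owner': 'hr-team'}                # on a table: note the underscore
``:code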
[[record_representation]]
#### Record representation
It is optional but recommended to specify a format representation for records:
``
>>> db.define_table('person', Field('name'), format='%(name)s')
``:code
or
``
>>> db.define_table('person', Field('name'), format='%(name)s %(id)s')
``:code
or even more complex ones using a function:
``
>>> db.define_table('person', Field('name'),
format=lambda r: r.name or 'anonymous')
``:code
The format attribute will be used for two purposes:
- To represent referenced records in select/option drop-downs.
- To set the ``db.othertable.person.represent`` attribute for all fields referencing this table. This means that SQLTABLE will not show references by id but will use the preferred format representation instead.
[[field_constructor]]
#### Field constructor
``Field constructor``:inxx
These are the default values of a Field constructor:
``
Field(name, 'string', length=None, default=None,
required=False, requires='<default>',
ondelete='CASCADE', notnull=False, unique=False,
uploadfield=True, widget=None, label=None, comment=None,
writable=True, readable=True, update=None, authorize=None,
autodelete=False, represent=None, compute=None,
uploadfolder=None,
uploadseparate=None, uploadfs=None)
``:code
Not all of them are relevant for every field. "length" is relevant only for fields of type "string". "uploadfield" and "authorize" are relevant only for fields of type "upload". "ondelete" is relevant only for fields of type "reference" and "upload".
- ``length`` sets the maximum length of a "string", "password" or "upload" field. If ``length`` is not specified a default value is used but the default value is not guaranteed to be backward compatible. ''To avoid unwanted migrations on upgrades, we recommend that you always specify the length for string, password and upload fields.''
- ``default`` sets the default value for the field. The default value is used when performing an insert if a value is not explicitly specified. It is also used to pre-populate forms built from the table using SQLFORM. Note, rather than being a fixed value, the default can instead be a function (including a lambda function) that returns a value of the appropriate type for the field. In that case, the function is called once for each record inserted, even when multiple records are inserted in a single transaction. A sketch follows this list.
- ``required`` tells the DAL that no insert should be allowed on this table if a value for this field is not explicitly specified.
- ``requires`` is a validator or a list of validators. This is not used by the DAL, but it is used by SQLFORM. The default validators for the given types are shown in the following table:
- ``uploadfolder`` defaults to ``None``, in which case most DB adapters upload files into ``os.path.join(request.folder, 'uploads')``. MongoAdapter does not appear to do so at present.
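For instance, here is a sketch of ``default`` used both as a fixed value and as a callable that is re-evaluated for each insert (the table and field names are illustrative; ``request.now`` is web2py's request timestamp):
``
import uuid
db.define_table('log_entry',
    Field('body', 'text'),
    Field('created_on', 'datetime', default=request.now),
    Field('token', length=36, default=lambda: str(uuid.uuid4())))
``:code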
[[field_types]]
#### Field types
``field types``:inxx
----------
**field type** | **default field validators**
``string`` | ``IS_LENGTH(length)`` default length is 512
``text`` | ``IS_LENGTH(65536)``
``blob`` | ``None``
``boolean`` | ``None``
``integer`` | ``IS_INT_IN_RANGE(-1e100, 1e100)``
``double`` | ``IS_FLOAT_IN_RANGE(-1e100, 1e100)``
``decimal(n,m)`` | ``IS_DECIMAL_IN_RANGE(-1e100, 1e100)``
``date`` | ``IS_DATE()``
``time`` | ``IS_TIME()``
``datetime`` | ``IS_DATETIME()``
``password`` | ``None``
``upload`` | ``None``
``reference <table>`` | ``IS_IN_DB(db,table.field,format)``
``list:string`` | ``None``
``list:integer`` | ``None``
``list:reference <table>`` | ``IS_IN_DB(db,table.field,format,multiple=True)``
``json`` | ``IS_JSON()``
``bigint`` | ``None``
----------
A field object also carries attributes, including its parent table, tablename, and parent connection:
``
>>> db.person.name._table == db.person
True
>>> db.person.name._tablename == 'person'
True
>>> db.person.name._db == db
True
``:code
A field also has methods. Some of them are used to build queries and we will see them later.
A special method of the field object is ``validate`` and it calls the validators for the field.
``
print db.person.name.validate('John')
``
which returns a tuple ``(value, error)``. ``error`` is ``None`` if the input passes validation.
### Migrations
``migrations``:inxx
``define_table`` checks whether or not the corresponding table exists. If it does not, it generates the SQL to create it and executes the SQL. If the table does exist but differs from the one being defined, it generates the SQL to alter the table and executes it. If a field has changed type but not name, it will try to convert the data (if you do not want this, you need to redefine the table twice: the first time letting web2py drop the field by removing it, and the second time adding the newly defined field so that web2py can create it). If the table exists and matches the current definition, it will leave it alone. In all cases it will create the ``db.person`` object that represents the table.
We refer to this behavior as a "migration". web2py logs all migrations and migration attempts in the file "databases/sql.log".
The first argument of ``define_table`` is always the table name. The other unnamed arguments are the fields (Field). The function also takes an optional keyword argument called "migrate":
``
>>> db.define_table('person', Field('name'), migrate='person.table')
``:code
The value of migrate is the filename (in the "databases" folder for the application) where web2py stores internal migration information for this table.
These files are very important and should never be removed while the corresponding tables exist. In cases where a table has been dropped and the corresponding file still exists, it can be removed manually. By default, migrate is set to True. This causes web2py to generate the filename from a hash of the connection string. If migrate is set to False, the migration is not performed, and web2py assumes that the table exists in the datastore and that it contains (at least) the fields listed in ``define_table``.
The best practice is to give each migrate file an explicit name.
No two tables in the same application may have the same migrate filename.
The DAL class also takes a "migrate" argument, which determines the default value of migrate for calls to ``define_table``. For example,
``
>>> db = DAL('sqlite://storage.db', migrate=False)
``:code
Finally, you can drop tables and all data will be lost:
``drop``:inxx
``
>>> db.person.drop()
``:code
Note for sqlite: web2py will not re-create the dropped table until you navigate the file system to the databases directory of your app, and delete the file associated with the dropped table.
### Indexes
Currently the DAL API does not provide a command to create indexes on tables, but this can be done using the ``executesql`` command. This is because the existence of indexes can make migrations complex, and it is better to deal with them explicitly. Indexes may be needed for those fields that are used in recurrent queries.
Here is an example of how to [[create an index using SQL in SQLite http://www.sqlite.org/lang_createindex.html]]:
``
>>> db = DAL('sqlite://storage.db')
>>> db.define_table('person', Field('name'))
>>> db.executesql('CREATE INDEX IF NOT EXISTS myidx ON person (name);')
``:code
Other database dialects have very similar syntaxes but may not support the optional "IF NOT EXISTS" directive.
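For example, on an engine that lacks "IF NOT EXISTS" you would issue the plain statement, possibly guarded by your own existence check (a sketch):
``
db.executesql('CREATE INDEX myidx ON person (name);')
``:code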
### Legacy databases and keyed tables
web2py can connect to legacy databases under some conditions.
The easiest way is when these conditions are met:
- Each table must have a unique auto-increment integer field called "id"
- Records must be referenced exclusively using the "id" field.
When accessing an existing table, i.e., a table not created by web2py in the current application, always set ``migrate=False``.
If the legacy table has an auto-increment integer field that is not called "id", web2py can still access it, but the table definition must explicitly contain ``Field('....','id')``, where ... is the name of the auto-increment integer field.
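For example, a sketch of a legacy table whose auto-increment key is called "uid" (the table and field names are illustrative):
``
db.define_table('employee',
    Field('uid', 'id'),   # the legacy auto-increment integer key
    Field('name'),
    migrate=False)        # never migrate tables web2py did not create
``:code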
``keyed table``:inxx
Finally, if the legacy table uses a primary key that is not an auto-increment id field, it is possible to use a "keyed table", for example:
``
db.define_table('account',
Field('accnum','integer'),
Field('acctype'),
Field('accdesc'),
primarykey=['accnum','acctype'],
migrate=False)
``:code
``db._common_fields`` is a list of fields that should belong to all the tables. This list can also contain tables, in which case it is understood as all fields from the table. For example, occasionally you may need to add a signature to all your tables but the ``auth`` tables. In this case, after ``auth.define_tables()`` but before defining any other table, insert
``
db._common_fields.append(auth.signature)
``
One field is special: "request_tenant".
This field does not exist but you can create it and add it to any of your tables (or them all):
``
db._common_fields.append(Field('request_tenant',
default=request.env.http_host,writable=False))
``
For every table with a field called ``request_tenant``, all records for all queries are always automatically filtered by:
``
db.table.request_tenant == db.table.request_tenant.default
``:code
and for every record insert, this field is set to the default value.
In the example above we have chosen
``
default = request.env.http_host
``
i.e., we have chosen to ask our app to filter all tables in all queries with
``
db.table.request_tenant == request.env.http_host
``
This simple trick allows us to turn any application into a multi-tenant application: even if we run one instance of the app and use one single database, when the app is accessed under two or more domains (in the example the domain name is retrieved from ``request.env.http_host``) the visitors will see different data depending on the domain. Think of running multiple web stores under different domains with one app and one database.
You can turn off multi tenancy filters using: ``ignore_common_filters``:inxx
``
rows = db(query, ignore_common_filters=True).select()
``:code
#### Common filters
A common filter is a generalization of the above multi-tenancy idea.
It provides an easy way to avoid repeating the same query over and over.
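For example, a common filter can be attached to a table definition so that every query on that table is filtered automatically (a sketch; the ``blog_post`` table and ``is_public`` field are illustrative):
``
db.define_table('blog_post',
    Field('subject'),
    Field('post_text', 'text'),
    Field('is_public', 'boolean'),
    common_filter=lambda query: db.blog_post.is_public == True)
# every select/update/delete on blog_post is now restricted to public posts,
# unless ignore_common_filters=True is passed, as shown above
``:code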
#### Copying data from one db into another
Suppose you have been using the following database:
``
db = DAL('sqlite://storage.sqlite')
``:code
and you wish to move to another database using a different connection string:
``
db = DAL('postgres://username:password@localhost/mydb')
``
Before you switch, you want to move the data and rebuild all the metadata for the new database. We assume the new database exists and is empty.
Web2py provides a script that does this work for you:
``
cd web2py
python scripts/cpdb.py \
-f applications/app/databases \
-y 'sqlite://storage.sqlite' \
-Y 'postgres://username:password@localhost/mydb'
``
After running the script you can simply switch the connection string in the model and everything should work out of the box. The new data should be there.
This script provides various command line options that allow you to move data from one application to another, move all tables or only some tables, and clear the data in the tables. For more information, try:
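``
python scripts/cpdb.py -h
``:code
(the standard ``-h`` flag should print the script's option list.)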

Not all of them are relevant for every field. "length" is relevant only for fields of type "string". "uploadfield" and "authorize" are relevant only for fields of type "upload". "ondelete" is relevant only for fields of type "reference" and "upload".
- ``length`` sets the maximum length of a "string", "password" or "upload" field. If ``length`` is not specified a default value is used but the default value is not guaranteed to be backward compatible. ''To avoid unwanted migrations on upgrades, we recommend that you always specify the length for string, password and upload fields.''
- ``default`` sets the default value for the field. The default value is used when performing an insert if a value is not explicitly specified. It is also used to pre-populate forms built from the table using SQLFORM. Note, rather than being a fixed value, the default can instead be a function (including a lambda function) that returns a value of the appropriate type for the field. In that case, the function is called once for each record inserted, even when multiple records are inserted in a single transaction.
- ``required`` tells the DAL that no insert should be allowed on this table if a value for this field is not explicitly specified.
- ``requires`` is a validator or a list of validators. This is not used by the DAL, but it is used by SQLFORM. The default validators for the given types are shown in the following table:
- ``uploadfolder`` while the default is ``None``, most DB adapters will default to uploading files into os.path.join(request.folder, 'uploads'). MongoAdapter does not seem to be doing so at present.
[[field_types]]
#### Field types
``field types``:inxx
Not all of them are relevant for every field. "length" is relevant only for fields of type "string". "uploadfield" and "authorize" are relevant only for fields of type "upload". "ondelete" is relevant only for fields of type "reference" and "upload".
- ``length`` sets the maximum length of a "string", "password" or "upload" field. If ``length`` is not specified a default value is used but the default value is not guaranteed to be backward compatible. ''To avoid unwanted migrations on upgrades, we recommend that you always specify the length for string, password and upload fields.''
- ``default`` sets the default value for the field. The default value is used when performing an insert if a value is not explicitly specified. It is also used to pre-populate forms built from the table using SQLFORM. Note, rather than being a fixed value, the default can instead be a function (including a lambda function) that returns a value of the appropriate type for the field. In that case, the function is called once for each record inserted, even when multiple records are inserted in a single transaction.
- ``required`` tells the DAL that no insert should be allowed on this table if a value for this field is not explicitly specified.
- ``requires`` is a validator or a list of validators. This is not used by the DAL, but it is used by SQLFORM. The default validators for the given types are shown in the following table:
[[field_types]]
#### Field types
``field types``:inxx

[[field_constructor]]
#### Field constructor
``Field constructor``:inxx
These are the default values of a Field constructor:
``
Field(name, 'string', length=None, default=None,
required=False, requires='<default>',
ondelete='CASCADE', notnull=False, unique=False,
uploadfield=True, widget=None, label=None, comment=None,
writable=True, readable=True, update=None, authorize=None,
autodelete=False, represent=None, compute=None,
uploadfolder=None,
uploadseparate=None,uploadfs=None)
``:code
[[field_constructor]]
#### Field constructor
``Field constructor``:inxx
These are the default values of a Field constructor:
``
Field(name, 'string', length=None, default=None,
required=False, requires='<default>',
ondelete='CASCADE', notnull=False, unique=False,
uploadfield=True, widget=None, label=None, comment=None,
writable=True, readable=True, update=None, authorize=None,
autodelete=False, represent=None, compute=None,
uploadfolder=os.path.join(request.folder,'uploads'),
uploadseparate=None,uploadfs=None)
``:code

#### Google SQL
Google SQL has the same problems as MySQL and more. In particular, table metadata itself must be stored in the database in a table that is not migrated by web2py, because Google App Engine has a read-only file system. web2py migrations in Google SQL, combined with the MySQL issue described above, can result in metadata corruption. Again, this can be prevented (by migrating the table at once and then setting migrate=False so that the metadata table is not accessed any more) or it can be fixed a posteriori (by accessing the database using the Google dashboard and deleting any corrupted entry from the table called ``web2py_filesystem``).
#### MSSQL (Microsoft SQL Server)
``limitby``:inxx
MSSQL does not support the SQL OFFSET keyword. Therefore the database cannot do pagination. When doing a ``limitby=(a,b)`` web2py will fetch the first ``b`` rows and discard the first ``a``. This may result in a considerable overhead when compared with other database engines.
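For example (a sketch of the overhead):
``
# intent: return rows 11..30 of the result set
rows = db(query).select(limitby=(10, 30))
# on MSSQL the adapter fetches the first 30 rows and discards the first 10
``:code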
#### Oracle
Oracle also does not support pagination. It supports neither the OFFSET nor the LIMIT keywords. web2py achieves pagination by translating a ``db(...).select(limitby=(a,b))`` into a complex three-way nested select (as suggested by official Oracle documentation).
This works for simple selects but may break for complex selects involving aliased fields and/or joins.
#### MSSQL
MSSQL has problems with circular references in tables that have ONDELETE CASCADE. This is an MSSQL bug and you can work around it by setting the ondelete attribute for all reference fields to "NO ACTION".
You can also do it once and for all before you define tables:
``
db = DAL('mssql://....')
for key in ['reference', 'reference FK']:
    db._adapter.types[key] = db._adapter.types[key].replace(
        '%(on_delete_action)s', 'NO ACTION')
``:code
MSSQL also has problems with arguments passed to the DISTINCT keyword and therefore
while this works,
``
db(query).select(distinct=True)
``
this does not
``
db(query).select(distinct=db.mytable.myfield)
``
#### Google NoSQL (Datastore)
Google NoSQL (Datastore) does not allow joins, left joins, aggregates, expressions, ORs involving more than one table, or the ``like`` operator to search in "text" fields.

#### ``like``, ``regexp``, ``startswith``, ``endswith``, ``contains``, ``upper``, ``lower``
``like``:inxx ``startswith``:inxx ``endswith``:inxx ``regexp``:inxx
``contains``:inxx ``upper``:inxx ``lower``:inxx
Fields have a like operator that you can use to match strings:
``
>>> for row in db(db.log.event.like('port%')).select():
        print row.event
port scan
``:code
Here "port%" indicates a string starting with "port". The percent sign character, "%", is a wild-card character that means "any sequence of characters".
The like operator is case-insensitive but it can be made case-sensitive with
``
db.mytable.myfield.like('value',case_sensitive=True)
``:code
web2py also provides some shortcuts:
``
db.mytable.myfield.startswith('value')
db.mytable.myfield.endswith('value')
db.mytable.myfield.contains('value')
``:code
which are equivalent respectively to
``
db.mytable.myfield.like('value%')
db.mytable.myfield.like('%value')
db.mytable.myfield.like('%value%')
``:code

``SQLCustomType`` is a field type factory. Its ``type`` argument must be one of the standard web2py types. It tells web2py how to treat the field values at the web2py level. ``native`` is the type of the field as far as the database is concerned. Allowed names depend on the database engine. ``encoder`` is an optional transformation function applied when the data is stored and ``decoder`` is the optional reversed transformation function.
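Here is a sketch of a custom compressed-text type (assuming ``SQLCustomType`` is importable from ``gluon.dal``; under Python 2, ``zlib`` operates on byte strings):
``
import zlib
from gluon.dal import SQLCustomType

compressed = SQLCustomType(
    type='text',                                  # how web2py treats the values
    native='text',                                # the column type in the database
    encoder=(lambda x: zlib.compress(x or '')),   # applied when storing
    decoder=(lambda x: zlib.decompress(x)))       # applied when retrieving

db.define_table('example', Field('data', type=compressed))
``:code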

- ``linkto`` a lambda function or an action to be used to link reference fields (defaults to None).
If you assign it a string with the name of an action, it will generate a link to that function, passing it, as args, the name of the table and the id of each record (in this order). Example:
``
linkto = 'pointed_function' # generates something like <a href="pointed_function/table_name/id_value">
``:code
If you want a different link to be generated, you can specify a lambda, which will receive as parameters the value of the id, the type of the object (e.g. table), and the name of the object. For example, if you want to receive the args in reverse order:
``
linkto = lambda id, type, name: URL(f='pointed_function', args=[id, name])
``:code
- ``upload`` the URL or the download action to allow downloading of uploaded files (defaults to None)
- ``headers`` a dictionary mapping field names to their labels to be used as headers (default to ``{}``). It can also be an instruction. Currently we support ``headers='fieldname:capitalize'``.
- ``truncate`` the number of characters for truncating long values in the table (default is 16)
- ``columns`` the list of fieldnames to be shown as columns (in tablename.fieldname format).
Those not listed are not displayed (defaults to all).
- ``**attributes`` generic helper attributes to be passed to the most external TABLE object.

``SQLTABLE`` can be used directly in a view. Here is an example:
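A minimal sketch, assuming ``rows`` was produced by a ``select()`` in the controller action that renders this view:
``
{{extend 'layout.html'}}
<h1>Records</h1>
{{=SQLTABLE(rows)}}
``:code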
``SQLFORM.grid``:inxx ``SQLFORM.smartgrid``:inxx
------
``SQLTABLE`` is useful but there are times when one needs more. ``SQLFORM.grid`` is an extension of SQLTABLE that creates a table with search features and pagination, as well as the ability to open detailed records, and to create, edit and delete records. ``SQLFORM.smartgrid`` is a further generalization that allows all of the above but also creates buttons to access referencing records.
------
Here is an example of usage of ``SQLFORM.grid``:
``
def index():
    return dict(grid=SQLFORM.grid(query))
``:code
and the corresponding view:
``
{{extend 'layout.html'}}
{{=grid}}
``
For working with multiple rows, ``SQLFORM.grid`` and ``SQLFORM.smartgrid`` are preferred to ``SQLTABLE`` because they are more powerful. Please see chapter 7.
#### ``orderby``, ``groupby``, ``limitby``, ``distinct``, ``having``, ``orderby_on_limitby``, ``left``, ``cache``
The ``select`` command takes a number of optional arguments.
##### orderby
You can fetch the records sorted by name:
``orderby``:inxx ``groupby``:inxx ``having``:inxx
``
>>> for row in db().select(
        db.person.ALL, orderby=db.person.name):
        print row.name
Alex
Bob
Carl
``:code
You can fetch the records sorted by name in reverse order (notice the tilde):
``
>>> for row in db().select(
        db.person.ALL, orderby=~db.person.name):
        print row.name
Carl
Bob
Alex
``:code

You can delete records by id:
``
del db.mytable[id]
``:code
and this is equivalent to
``
db(db.mytable.id==id).delete()
``:code
and deletes the record with the given ``id``, if it exists.
Note: this delete shortcut syntax does not currently work if [[versioning #versioning]] is activated.

You can insert records:
``
db.mytable[0] = dict(myfield='somevalue')
``:code
It is equivalent to
``
db.mytable.insert(myfield='somevalue')
``:code
and it creates a new record with field values specified by the dictionary on the right hand side.
You can update records:
``
db.mytable[id] = dict(myfield='somevalue')
``:code
[[versioning]]
#### Record versioning
``_enable_record_versioning``:inxx
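As a sketch of how this is typically enabled per table (the argument names below are an assumption based on the verbose form of ``_enable_record_versioning``; the ``stored_item`` table is illustrative):
``
db.define_table('stored_item',
    Field('name'),
    Field('quantity', 'integer'),
    Field('is_active', 'boolean',
          writable=False, readable=False, default=True))

db.stored_item._enable_record_versioning(
    archive_db=db,                        # keep the archive in the same database
    archive_name='stored_item_archive',   # table that receives old copies
    current_record='current_record',      # archive field pointing to the live record
    is_active='is_active')                # flag used to soft-delete records
``:code
With versioning enabled, records are not really deleted: a deleted record is copied into the archive table (like an updated record) and its ``is_active`` field is set to False.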

You can do an intersection of the records in two sets of rows:
``
>>> rows3 = rows1 & rows2
>>> print rows3
name
Tim
``:code
You can do a union of the records removing duplicates:
``
>>> rows3 = rows1 | rows2
>>> print rows3
name
Max
Tim
John
``:code
#### ``find``, ``exclude``, ``sort``
``find``:inxx ``exclude``:inxx ``sort``:inxx
Sometimes you need to perform two selects and one contains a subset of a previous select. In this case it is pointless to access the database again. The ``find``, ``exclude`` and ``sort`` methods allow you to manipulate a Rows object and generate another one without accessing the database. More specifically (a sketch follows the list):
- ``find`` returns a new set of Rows filtered by a condition and leaves the original unchanged.
- ``exclude`` returns a new set of Rows filtered by a condition and removes them from the original Rows.
- ``sort`` returns a new set of Rows sorted by a condition and leaves the original unchanged.
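A sketch, assuming the ``person`` table from the earlier examples:
``
rows = db().select(db.person.ALL)
m_people = rows.find(lambda row: row.name.startswith('M'))    # rows is unchanged
others = rows.exclude(lambda row: row.name.startswith('M'))   # matches are removed from rows
by_name = rows.sort(lambda row: row.name)                     # rows is unchanged
``:code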
It is then easy to select all persons and the things they own:
``
>>> for row in persons_and_things().select():
        print row.person.name, row.thing.name
Alex Boat
Alex Chair
Bob Shoes
Curt Boat
``:code
Similarly, you can search for all things owned by Alex:
``
>>> for row in persons_and_things(db.person.name=='Alex').select():
print row.thing.name
Boat
Chair
``:code
and all owners of Boat:
``
>>> for row in persons_and_things(db.thing.name=='Boat').select():
print row.person.name
Alex
Curt
``:code
A lighter alternative to Many 2 Many relations is tagging. Tagging is discussed in the context of the ``IS_IN_DB`` validator. Tagging works even on database backends that do not support JOINs like the Google App Engine NoSQL.
### ``list:<type>`` and ``contains``
``list:string``:inxx
``list:integer``:inxx
``list:reference``:inxx
``contains``:inxx
``multiple``:inxx
``tags``:inxx
web2py provides the following special field types:
``
list:string
list:integer
list:reference <table>
``:code
They can contain lists of strings, of integers and of references respectively.
On Google App Engine NoSQL, ``list:string`` is mapped into a ``StringListProperty``; the other two are mapped into ``ListProperty(int)``. On relational databases they are all mapped into text fields which contain the list of items separated by ``|``. For example, ``[1,2,3]`` is mapped into ``|1|2|3|``.
For lists of strings the items are escaped so that any ``|`` in an item is replaced by ``||``. This is an internal representation, however, and it is transparent to the user.
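For example, a sketch using ``list:string`` together with ``contains`` (the ``product`` table is illustrative):
``
db.define_table('product',
    Field('name'),
    Field('colors', 'list:string'))

db.product.insert(name='Toy Car', colors=['red', 'blue'])
# stored internally as |red|blue| ; query list fields with contains:
rows = db(db.product.colors.contains('red')).select()
``:code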
#### Callbacks on record insert, delete and update
web2py provides a mechanism to register callbacks to be called before and/or after the insert, update and delete of records. Each table stores six lists of callbacks: ``_before_insert``, ``_after_insert``, ``_before_update``, ``_after_update``, ``_before_delete``, and ``_after_delete``.
This is best explained via some examples.
``
>>> db.person._before_delete.append(lambda s: pprint(s))
>>> db.person._after_delete.append(lambda s: pprint(s))
``:code
Here ``f`` is a dict of fields passed to insert or update, ``id`` is the id of the newly inserted record, ``s`` is the Set object used for update or delete.
``
>>> db.person.insert(name='John')
({'name': 'John'},)
({'name': 'John'}, 1)
>>> db(db.person.id==1).update(name='Tim')
(<Set (person.id = 1)>, {'name': 'Tim'})
(<Set (person.id = 1)>, {'name': 'Tim'})
>>> db(db.person.id==1).delete()
(<Set (person.id = 1)>,)
(<Set (person.id = 1)>,)
``:code
The return values of these callbacks should be ``None`` or ``False``. If any of the ``_before_*`` callbacks returns a ``True`` value, it will abort the actual insert/update/delete operation.
``update_naive``:inxx
Sometimes a callback may need to perform an update in the same or a different table, and one wants to avoid callbacks calling themselves recursively.
For this purpose, Set objects have an ``update_naive`` method that works like ``update`` but ignores before and after callbacks.
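For example, a sketch of an ``_after_update`` callback that stamps records without re-triggering itself (the ``modified_on`` field is illustrative):
``
def stamp(s, f):
    # update_naive skips the before/after callbacks, so this cannot recurse
    s.update_naive(modified_on=request.now)

db.person._after_update.append(stamp)
``:code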

The remaining arguments of the ``Field`` constructor are described below. For example, an upload field defined with ``uploadfolder=os.path.join(request.folder, 'static/temp')`` will upload files to the "web2py/applications/myapp/static/temp" folder.
- ``widget`` must be one of the available widget objects, including custom widgets, for example: ``SQLFORM.widgets.string.widget``. A list of available widgets will be discussed later. Each field type has a default widget.
- ``label`` is a string (or a helper or something that can be serialized to a string) that contains the label to be used for this field in auto-generated forms.
- ``comment`` is a string (or a helper or something that can be serialized to a string) that contains a comment associated with this field, and will be displayed to the right of the input field in the autogenerated forms.
- ``writable`` declares whether a field is writable in forms.
- ``readable`` declares whether a field is readable in forms. If a field is neither readable nor writable, it will not be displayed in create and update forms.
- ``update`` contains the default value for this field when the record is updated.
- ``compute`` is an optional function. If a record is inserted or updated, the compute function will be executed and the field will be populated with the function result. The record is passed to the compute function as a ``dict``, and the dict will not include the current value of that, or any other compute field.
- ``authorize`` can be used to require access control on the corresponding field, for "upload" fields only. It will be discussed more in detail in the context of Authentication and Authorization.
- ``autodelete`` determines if the corresponding uploaded file should be deleted when the record referencing the file is deleted. For "upload" fields only.
- ``represent`` can be None or can point to a function that takes a field value and returns an alternate representation for the field value. Examples:
``
db.mytable.name.represent = lambda name, row: name.capitalize()
db.mytable.other_id.represent = lambda id, row: row.myfield
db.mytable.some_uploadfield.represent = lambda value, row: \
    A('get it', _href=URL('download', args=value))
``:code
``blob``:inxx
"blob" fields are also special. By default, binary data is encoded in base64 before being stored into the actual database field, and it is decoded when extracted. This has the negative effect of using 25% more storage space than necessary in blob fields, but has two advantages. On average it reduces the amount of data communicated between web2py and the database server, and it makes the communication independent of back-end-specific escaping conventions.

#### Run-time field and table modification

Most attributes of fields and tables can be modified after they are defined:
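A sketch, reusing the ``person`` table from earlier (``IS_NOT_EMPTY`` is one of the standard validators):
``
db.define_table('person', Field('name', default=''), format='%(name)s')
db.person._format = '%(name)s/%(id)s'   # table attributes start with an underscore
db.person.name.default = 'anonymous'
db.person.name.writable = False
db.person.name.requires = IS_NOT_EMPTY()
``:code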
-----
**PostgreSQL** | ``postgres(_nonreserved)``
**MySQL** | ``mysql``
**FireBird** | ``firebird(_nonreserved)``
**MSSQL** | ``mssql``
**Oracle** | ``oracle``
-----
### ``DAL``, ``Table``, ``Field``
You can experiment with the DAL API using the web2py shell.
Start by creating a connection. For the sake of example, you can use SQLite. Nothing in this discussion changes when you change the back-end engine.
``
>>> db = DAL('sqlite://storage.db')
``:code
The database is now connected and the connection is stored in the global variable ``db``.
At any time you can retrieve the connection string.
``_uri``:inxx
``
>>> print db._uri
sqlite://storage.db
``:code
and the database name
``_dbname``:inxx
``
>>> print db._dbname
sqlite
``:code
The connection string is called a ``_uri`` because it is an instance of a Uniform Resource Identifier.
The DAL allows multiple connections with the same database or with different databases, even databases of different types. For now, we will assume the presence of a single database since this is the most common situation.
``define_table``:inxx ``Field``:inxx
``type``:inxx ``length``:inxx ``default``:inxx ``requires``:inxx ``required``:inxx ``unique``:inxx
``notnull``:inxx ``ondelete``:inxx ``uploadfield``:inxx ``uploadseparate``:inxx ``migrate``:inxx ``sql.log``:inxx
The most important method of a DAL is ``define_table``:
``
>>> db.define_table('person', Field('name'))
``:code
It defines, stores and returns a ``Table`` object called "person" containing a field (column) "name". This object can also be accessed via ``db.person``, so you do not need to catch the return value.
Do not declare a field called "id", because one is created by web2py anyway. Every table has a field called "id" by default. It is an auto-increment integer field (starting at 1) used for cross-reference and for making every record unique, so "id" is a primary key. (Note: the id's starting at 1 is back-end specific. For example, this does not apply to the Google App Engine NoSQL.)
``named id field``:inxx
Optionally you can define a field of ``type='id'`` and web2py will use this field as auto-increment id field. This is not recommended except when accessing legacy database tables. With some limitation, you can also use different primary keys and this is discussed in the section on "Legacy databases and keyed tables".
Tables can be defined only once but you can force web2py to redefine an existing table:
``
db.define_table('person', Field('name'))
db.define_table('person', Field('name'), redefine=True)
``:code
The redefinition may trigger a migration if field content is different.
----------
Because usually in web2py models are executed before controllers, it is possible that some table are defined even if not needed. It is therefore necessary to speed up the code by making table definitions lazy. This is done by setting the ``DAL(...,lazy_tables=True)`` parameter. Tables will be actually created only when accessed.
----------
#### Adding attributes to fields and tables
If you need to add custom attributes to fields, you can simply do this:
``db.table.field.extra = {}``:code
"extra" is not a keyword ; it's a custom attributes now attached to the field object. You can do it with tables too but they must be preceded by an
underscore to avoid naming conflicts with fields:
``db.table._extra = {} ``:code
-### Record representation
It is optional but recommended to specify a format representation for records:
``
>>> db.define_table('person', Field('name'), format='%(name)s')
``:code
or
``
>>> db.define_table('person', Field('name'), format='%(name)s %(id)s')
``:code
or even more complex ones using a function:
``
>>> db.define_table('person', Field('name'),
format=lambda r: r.name or 'anonymous')
``:code
The format attribute will be used for two purposes:
- To represent referenced records in select/option drop-downs.
- To set the ``db.othertable.person.represent`` attribute for all fields referencing this table. This means that SQLTABLE will not show references by id but will use the format preferred representation instead.
-
``Field constructor``:inxx
These are the default values of a Field constructor:
``
Field(name, 'string', length=None, default=None,
required=False, requires='<default>',
ondelete='CASCADE', notnull=False, unique=False,
uploadfield=True, widget=None, label=None, comment=None,
writable=True, readable=True, update=None, authorize=None,
autodelete=False, represent=None, compute=None,
uploadfolder=os.path.join(request.folder,'uploads'),
uploadseparate=None,uploadfs=None)
``:code
Not all of them are relevant for every field. "length" is relevant only for fields of type "string". "uploadfield" and "authorize" are relevant only for fields of type "upload". "ondelete" is relevant only for fields of type "reference" and "upload".
- ``length`` sets the maximum length of a "string", "password" or "upload" field. If ``length`` is not specified a default value is used but the default value is not guaranteed to be backward compatible. ''To avoid unwanted migrations on upgrades, we recommend that you always specify the length for string, password and upload fields.''
- ``default`` sets the default value for the field. The default value is used when performing an insert if a value is not explicitly specified. It is also used to pre-populate forms built from the table using SQLFORM. Note, rather than being a fixed value, the default can instead be a function (including a lambda function) that returns a value of the appropriate type for the field. In that case, the function is called once for each record inserted, even when multiple records are inserted in a single transaction.
- ``required`` tells the DAL that no insert should be allowed on this table if a value for this field is not explicitly specified.
- ``requires`` is a validator or a list of validators. This is not used by the DAL, but it is used by SQLFORM. The default validators for the given types are shown in the following table:
----------
**field type** | **default field validators**
``string`` | ``IS_LENGTH(length)`` default length is 512
``text`` | ``IS_LENGTH(65536)``
``blob`` | ``None``
``boolean`` | ``None``
``integer`` | ``IS_INT_IN_RANGE(-1e100, 1e100)``
``double`` | ``IS_FLOAT_IN_RANGE(-1e100, 1e100)``
``decimal(n,m)`` | ``IS_DECIMAL_IN_RANGE(-1e100, 1e100)``
``date`` | ``IS_DATE()``
``time`` | ``IS_TIME()``
``datetime`` | ``IS_DATETIME()``
``password`` | ``None``
``upload`` | ``None``
``reference <table>`` | ``IS_IN_DB(db,table.field,format)``
``list:string`` | ``None``
``list:integer`` | ``None``
``list:reference <table>`` | ``IS_IN_DB(db,table.field,format,multiple=True)``
``json`` | ``IS_JSON()``
``bigint`` | ``None``
----------
- ``ondelete`` translates into the "ON DELETE" SQL statement. By default it is set to "CASCADE". This tells the database that when it deletes a record, it should also delete all records that refer to it. To disable this feature, set ``ondelete`` to "NO ACTION" or "SET NULL".
- ``notnull=True`` translates into the "NOT NULL" SQL statement. It prevents the database from inserting null values for the field.
- ``unique=True`` translates into the "UNIQUE" SQL statement and it makes sure that values of this field are unique within the table. It is enforced at the database level.
- ``uploadfield`` applies only to fields of type "upload". A field of type "upload" stores the name of a file saved somewhere else, by default on the filesystem under the application "uploads/" folder. If ``uploadfield`` is set, then the file is stored in a blob field within the same table and the value of ``uploadfield`` is the name of the blob field. This will be discussed in more detail later in the context of SQLFORM.
- ``uploadfolder`` determines where uploaded files are stored; it defaults to the application's "uploads/" folder. For example, ``Field(..., uploadfolder=os.path.join(request.folder, 'static/temp'))`` will upload files to the "web2py/applications/myapp/static/temp" folder.
- ``widget`` must be one of the available widget objects, including custom widgets, for example: ``SQLFORM.widgets.string.widget``. A list of available widgets will be discussed later. Each field type has a default widget.
- ``label`` is a string (or a helper or something that can be serialized to a string) that contains the label to be used for this field in auto-generated forms.
- ``comment`` is a string (or a helper or something that can be serialized to a string) that contains a comment associated with this field, and will be displayed to the right of the input field in the autogenerated forms.
- ``writable`` declares whether a field is writable in forms.
- ``readable`` declares whether a field is readable in forms. If a field is neither readable nor writable, it will not be displayed in create and update forms.
- ``update`` contains the default value for this field when the record is updated.
- ``compute`` is an optional function. If a record is inserted or updated, the compute function will be executed and the field will be populated with the function result. The record is passed to the compute function as a ``dict``, and the dict will not include the current value of that, or any other compute field. Computed fields are evaluated in the order in which they are defined, and a computed field can refer to previously defined computed fields (new after v 2.5.1). See also the sketch after this list.
- ``authorize`` can be used to require access control on the corresponding field, for "upload" fields only. It will be discussed more in detail in the context of Authentication and Authorization.
- ``autodelete`` determines if the corresponding uploaded file should be deleted when the record referencing the file is deleted. For "upload" fields only.
- ``represent`` can be None or can point to a function that takes a field value and returns an alternate representation for the field value. Examples:
``
db.mytable.name.represent = lambda name,row: name.capitalize()
db.mytable.other_id.represent = lambda id,row: row.myfield
db.mytable.some_uploadfield.represent = lambda value,row: \
A('get it', _href=URL('download', args=value))
``:code
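Here is a sketch that pulls several of these attributes together (the ``post`` table and its fields are hypothetical, chosen only for illustration):
``
import datetime

db.define_table('post',
    Field('title', 'string', length=128, notnull=True),
    Field('slug', 'string', length=128, unique=True),
    # a callable default is evaluated once for each inserted record
    Field('created_on', 'datetime', default=datetime.datetime.now),
    # compute receives the record as a dict and stores the result
    Field('summary', compute=lambda r: r['title'][:40]))
``:code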
``blob``:inxx
"blob" fields are also special. By default, binary data is encoded in base64 before being stored into the actual database field, and it is decoded when extracted. This has the negative effect of using 25% more storage space than necessary in blob fields, but has two advantages. On average it reduces the amount of data communicated between web2py and the database server, and it makes the communication independent of back-end-specific escaping conventions.
Most attributes of fields and tables can be modified after they are defined. For example, you can attach a virtual field to an already-defined table:

``
>>> db.item.total_price = Field.Virtual(
        'total_price',
        lambda row: row.item.unit_price*row.item.quantity)
``:code

The Gotchas section at the end of this chapter has some more information about specific databases.
The Windows binary distribution works out of the box with SQLite and MySQL. The Mac binary distribution works out of the box with SQLite.
To use any other database back-end, run from the source distribution and install the appropriate driver for the required back end.
``database drivers``:inxx
Once the proper driver is installed, start web2py from source, and it will find the driver. Here is a list of drivers:
``DAL``:inxx ``SQLite``:inxx ``MySQL``:inxx ``PostgresSQL``:inxx ``Oracle``:inxx ``MSSQL``:inxx ``FireBird``:inxx ``DB2``:inxx ``Informix``:inxx ``Sybase``:inxx ``Teradata``:inxx ``MongoDB``:inxx ``CouchDB``:inxx ``SAPDB``:inxx ``Cubrid``:inxx
----------
database | drivers (source)
SQLite | sqlite3 or pysqlite2 or zxJDBC ``zxjdbc``:cite (on Jython)
PostgreSQL | psycopg2 ``psycopg2``:cite or pg8000 ``pg8000``:cite or zxJDBC ``zxjdbc``:cite (on Jython)
MySQL | pymysql ``pymysql``:cite or MySQLdb ``mysqldb``:cite
Oracle | cx_Oracle ``cxoracle``:cite
MSSQL | pyodbc ``pyodbc``:cite
FireBird | kinterbasdb ``kinterbasdb``:cite or fdb or pyodbc
DB2 | pyodbc ``pyodbc``:cite
Informix | informixdb ``informixdb``:cite
Ingres | ingresdbi ``ingresdbi``:cite
Cubrid | cubriddb ``cubridb``:cite
----------
The URI string is then parsed in more detail by the adapter itself.
For any adapter you can replace the driver with a different one:
``
import MySQLdb as mysqldb
from gluon.dal import MySQLAdapter
MySQLAdapter.driver = mysqldb
``
That is, ``mysqldb`` has to be ''that module'' with a ``.connect()`` method.
You can specify optional driver arguments and adapter arguments:
``
db = DAL(..., driver_args={}, adapter_args={})
``
### Gotchas

#### SQLite
SQLite does not support dropping and altering columns. That means that web2py migrations will work only up to a point. If you delete a field from a table, the column will remain in the database but will be invisible to web2py. If you decide to reinstate the column, web2py will try to re-create it and fail. In this case you must set ``fake_migrate=True`` so that metadata is rebuilt without attempting to add the column again. Also, for the same reason, SQLite is not aware of any change of column type. If you insert a number in a string field, it will be stored as a string. If you later change the model and replace the type "string" with type "integer", SQLite will continue to keep the number as a string and this may cause problems when you try to extract the data.

SQLite doesn't have a boolean type. web2py internally maps booleans to a 1-character string, with 'T' and 'F' representing True and False. The DAL handles this completely; the abstraction of a true boolean value works well. But if you are updating the SQLite table with SQL directly, be aware of the web2py implementation, and avoid using 0 and 1 values.
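For example, if you query a web2py-managed SQLite table with raw SQL, match the 'T'/'F' convention (a sketch; the ``task`` table is hypothetical):
``
db.define_table('task', Field('done', 'boolean'))
db.task.insert(done=True)
print db(db.task.done == True).count()                     # the DAL maps True to 'T'
print db.executesql("SELECT * FROM task WHERE done='T';")  # raw SQL must use 'T'/'F'
``:code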
#### MySQL
MySQL does not support multiple ALTER TABLE statements within a single transaction. This means that any migration process is broken into multiple commits. If something causes a failure it is possible to break a migration (the web2py metadata are no longer in sync with the actual table structure in the database). This is unfortunate but it can be prevented (migrate one table at a time) or it can be fixed a posteriori (revert the web2py model to what corresponds to the table structure in the database, set ``fake_migrate=True`` and, after the metadata has been rebuilt, set ``fake_migrate=False`` and migrate the table again).
#### Google SQL
Google SQL has the same problems as MySQL and more. In particular, table metadata itself must be stored in the database in a table that is not migrated by web2py. This is because Google App Engine has a read-only file system. web2py migrations in Google SQL, combined with the MySQL issue described above, can result in metadata corruption. Again, this can be prevented (by migrating the table at once and then setting ``migrate=False`` so that the metadata table is not accessed any more) or it can be fixed a posteriori (by accessing the database using the Google dashboard and deleting any corrupted entry from the table called ``web2py_filesystem``).
#### MSSQL (Microsoft SQL Server)
``limitby``:inxx
MSSQL does not support the SQL OFFSET keyword, so the database cannot do pagination natively. When doing a ``limitby=(a,b)`` web2py will fetch the first ``b`` rows and discard the first ``a``. This may result in considerable overhead when compared with other database engines.

MSSQL also has problems with circular references in tables that have ONDELETE CASCADE. This is an MSSQL bug; you can work around it by setting the ondelete attribute for all reference fields to "NO ACTION". You can also do it once and for all before you define tables:
``
db = DAL('mssql://....')
for key in ['reference', 'reference FK']:
    db._adapter.types[key] = db._adapter.types[key].replace(
        '%(on_delete_action)s', 'NO ACTION')
``:code
MSSQL also has problems with arguments passed to the DISTINCT keyword; therefore, while this works,
``
db(query).select(distinct=True)
``:code
this does not:
``
db(query).select(distinct=db.mytable.myfield)
``:code

#### Oracle
Oracle also does not support pagination: it supports neither the OFFSET nor the LIMIT keyword. web2py achieves pagination by translating a ``db(...).select(limitby=(a,b))`` into a complex three-way nested select (as suggested by official Oracle documentation). This works for simple selects but may break for complex selects involving aliased fields and/or joins.
#### Google NoSQL (Datastore)
Google NoSQL (Datastore) does not allow joins, left joins, aggregates, expressions, ORs involving more than one table, or the 'like' operator to search in "text" fields.

Transactions are limited and not provided automatically by web2py (you need to use the Google API ``run_in_transaction``, which you can look up in the Google App Engine documentation online).

Google also limits the number of records you can retrieve in each query (1000 at the time of writing). On the Google datastore, record IDs are integers but they are not sequential.
While on SQL the "list:string" type is mapped into a "text" type, on the Google Datastore it is mapped into a ``ListStringProperty``. Similarly "list:integer" and "list:reference" are mapped into a ``ListProperty``. This makes searches for content inside these field types more efficient on Google NoSQL than on SQL databases.
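For example, membership searches in list fields use ``contains``, which benefits from this mapping (a sketch; the ``doc`` table is hypothetical):
``
db.define_table('doc', Field('tags', 'list:string'))
db.doc.insert(tags=['web2py', 'dal'])
rows = db(db.doc.tags.contains('web2py')).select()
``:code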

#### Rendering rows using represent
You may wish to rewrite rows returned by a select to take advantage of the formatting information contained in the ``represent`` setting of the fields.
``
rows = db(query).select()
repr_row = rows.render(0)
``:code
If you don't specify an index, you get a generator to iterate over all the rows:
``
for row in rows.render():
    print row.myfield
``:code
It can also be applied to slices:
``
for row in rows[0:10].render():
    print row.myfield
``:code
If you only want to transform selected fields via their "represent" attribute, you can list them in the ``fields`` argument:
``
repr_row = rows.render(0, fields=[db.mytable.myfield])
``:code
Note that this returns a transformed copy of the original Row, so there's no ``update_record`` (which you wouldn't want anyway) or ``delete_record``.

#### Shortcuts
``DAL shortcuts``:inxx
The DAL supports various code-simplifying shortcuts.
In particular:
``
myrecord = db.mytable[id]
``:code
returns the record with the given ``id`` if it exists. If the ``id`` does not exist, it returns ``None``. The above statement is equivalent to
``
myrecord = db(db.mytable.id==id).select().first()
``:code
You can delete records by id:
``
del db.mytable[id]
``:code
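Similar shortcuts exist for inserting and updating records; a short sketch:
``
db.mytable[0] = dict(myfield='somevalue')   # inserts a new record
db.mytable[id] = dict(myfield='othervalue') # updates the record with this id
``:code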
#### ``orderby``, ``groupby``, ``limitby``, ``distinct``, ``having``, ``orderby_on_limitby``, ``left``, ``cache``
The ``select`` command takes a number of optional arguments.
##### orderby
You can fetch the records sorted by name:
``orderby``:inxx ``groupby``:inxx ``having``:inxx
``
>>> for row in db().select(
db.person.ALL, orderby=db.person.name):
print row.name
Alex
Bob
Carl
``:code
You can fetch the records sorted by name in reverse order (notice the tilde):
``
>>> for row in db().select(
db.person.ALL, orderby=~db.person.name):
print row.name
Carl
Bob
Alex
``:code
-----
The use of ``orderby='<random>'`` is not supported on Google NoSQL. However, in this situation and likewise in many others where built-ins are insufficient, imports can be used:
``
import random
rows=db(...).select().sort(lambda row: random.random())
``:code
-----
You can sort the records according to multiple fields by concatenating them with a "|" (here by name in reverse order, then by id):
``
>>> for row in db().select(
        db.person.ALL, orderby=~db.person.name|db.person.id):
        print row.name
Carl
Bob
Alex
``:code
##### groupby, having
Using ``groupby`` together with ``orderby``, you can group records with the same value for the specified field (this is back-end specific, and is not on the Google NoSQL):
``
>>> for row in db().select(
db.person.ALL,
orderby=db.person.name, groupby=db.person.name):
print row.name
Alex
Bob
Carl
``:code
You can use ``having`` in conjunction with ``groupby`` to group conditionally (only those ``having`` the condition are grouped).
``
>>> print db(query1).select(db.person.ALL, groupby=db.person.name, having=query2)
``
Notice that query1 filters records to be displayed, query2 filters records to be grouped.
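A concrete sketch, assuming a hypothetical ``sale`` table:
``
>>> db.define_table('sale', Field('person'), Field('amount', 'double'))
>>> total = db.sale.amount.sum()
>>> for row in db(db.sale.amount > 0).select(
        db.sale.person, total,
        groupby=db.sale.person,
        having=(total > 100)):
        print row.sale.person, row[total]
``:code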
##### distinct
``distinct``:inxx
With the argument ``distinct=True``, you can specify that you only want to select distinct records. This has the same effect as grouping using all specified fields except that it does not require sorting. When using distinct it is important not to select ALL fields, and in particular not to select the "id" field, else all records will always be distinct.
Here is an example:
``
>>> for row in db().select(db.person.name, distinct=True):
print row.name
Alex
Bob
Carl
``:code
Notice that ``distinct`` can also be an expression, for example:
``
>>> for row in db().select(db.person.name,distinct=db.person.name):
print row.name
Alex
Bob
Carl
``:code
##### limitby
With limitby=(min, max), you can select a subset of the records from offset=min to but not including offset=max (in this case, the first two starting at zero):
``limitby``:inxx
``
>>> for row in db().select(db.person.ALL, limitby=(0, 2)):
print row.name
Alex
Bob
``:code
##### orderby_on_limitby
``orderby_on_limitby``:inxx
Note that the DAL defaults to implicitly adding an orderby when using a limitby. This ensures that the same query returns the same results each time, which is important for pagination. But it can cause performance problems. Use ``orderby_on_limitby=False`` to change this behavior (it defaults to True).
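For example:
``
rows = db(query).select(limitby=(0, 10), orderby_on_limitby=False)
``:code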
##### left
This argument is discussed below, in the section on joins.
##### cache, cacheable
An example use which gives much faster selects is:
``
rows = db(query).select(cache=(cache.ram, 3600), cacheable=True)
``:code
See the discussion on 'caching selects' below to understand the tradeoffs.

A field object can be accessed together with its parent table, tablename, and parent connection:
``
>>> db.person.name._table == db.person
True
>>> db.person.name._tablename == 'person'
True
>>> db.person.name._db == db
True
``:code
A field also has methods. Some of them are used to build queries and we will see them later.
A special method of the field object is ``validate`` and it calls the validators for the field.
``
print db.person.name.validate('John')
``
which returns a tuple ``(value, error)``. ``error`` is ``None`` if the input passes validation.
### Migrations
``migrations``:inxx

The first argument of ``define_table`` is always the table name. The other unnamed arguments are the fields (Field). The function also takes an optional keyword argument called "migrate":
``
>>> db.define_table('person', Field('name'), migrate='person.table')
``:code
The value of migrate is the filename (in the "databases" folder for the application) where web2py stores internal migration information for this table.
These files are very important and should never be removed while the corresponding tables exist. In cases where a table has been dropped and the corresponding file still exists, it can be removed manually. By default, migrate is set to True. This causes web2py to generate the filename from a hash of the connection string. If migrate is set to False, the migration is not performed, and web2py assumes that the table exists in the datastore and that it contains (at least) the fields listed in ``define_table``.
The best practice is to give an explicit name to the migrate file.
No two tables in the same application may have the same migrate filename.
The DAL class also takes a "migrate" argument, which determines the default value of migrate for calls to ``define_table``. For example,
``
>>> db = DAL('sqlite://storage.db', migrate=False)
``:code
will set the default value of migrate to False whenever ``db.define_table`` is called without a migrate argument.
------
Notice that web2py only migrates new columns, removed columns, and changes in column type (except in sqlite). web2py does not migrate changes in attributes such as changes in the values of ``default``, ``unique``, ``notnull``, and ``ondelete``.
------
Migrations can be disabled for all tables at once:
``
db = DAL(..., migrate_enabled=False)
``:code
This is the recommended behavior when two apps share the same database. Only one of the two apps should perform migrations; the other should disable them.
### Fixing broken migrations
``fake_migrate``:inxx
There are two common problems with migrations and there are ways to recover from them.
One problem is specific to SQLite. SQLite does not enforce column types and cannot drop columns. This means that if you have a column of type string and you remove it, it is not really removed. If you add the column again with a different type (for example datetime) you end up with a datetime column that contains strings (junk for practical purposes). web2py does not complain about this because it does not know what is in the database, until it tries to retrieve records and fails.
If web2py returns an error in the gluon.sql.parse function when selecting records, this is the problem: corrupted data in a column because of the above issue.
The solution consists of updating all records of the table, setting the values in the column in question to None.
The other problem is more generic but typical of MySQL. MySQL does not allow more than one ALTER TABLE in a transaction. This means that web2py must break complex transactions into smaller ones (one ALTER TABLE at a time) and commit one piece at a time. It is therefore possible that part of a complex transaction gets committed and one part fails, leaving web2py in a corrupted state. Why would part of a transaction fail? Because, for example, if it involves altering a table and converting a string column into a datetime column, web2py tries to convert the data; if the data cannot be converted, that part fails. What happens to web2py? It gets confused about the exact table structure actually stored in the database.
The solution consists of disabling migrations for all tables and enabling fake migrations:
``
db.define_table(....,migrate=False,fake_migrate=True)
``:code
This will rebuild web2py metadata about the table according to the table definition. Try multiple table definitions to see which one works (the one before the failed migration and the one after the failed migration). Once successful, remove the ``fake_migrate=True`` parameter.
Before attempting to fix migration problems it is prudent to make a copy of "applications/yourapp/databases/*.table" files.
Migration problems can also be fixed for all tables at once:
``
db = DAL(...,fake_migrate_all=True)
``:code
This also fails if the model describes tables that do not exist in the database, but it can help narrow down the problem.
### Migration control summary

The logic of the various migration arguments is summarized in this pseudo-code:
``
if DAL.migrate_enabled and table.migrate:
    if DAL.fake_migrate_all or table.fake_migrate:
        perform fake migration
    else:
        perform migration
``:code

### Virtual fields
Virtual fields are computed on the fly from other fields when records are selected; they are not stored in the database.

------
Mind that virtual fields do not have the same attributes as the other fields (default, readable, requires, etc.), they do not appear in the list of ``db.table.fields``, and they are not visualized by default in tables (TABLE) and grids (SQLFORM.grid, SQLFORM.smartgrid). For the grids, see the discussion on virtual fields in the Forms chapter.
------
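A minimal new-style sketch (the ``item`` table is hypothetical):
``
db.define_table('item',
    Field('unit_price', 'double'),
    Field('quantity', 'integer'),
    Field.Virtual('total_price',
        lambda row: row.item.unit_price * row.item.quantity))
db.item.insert(unit_price=1.5, quantity=4)
for row in db(db.item).select():
    print row.total_price  # 6.0
``:code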

#### ``update_or_insert``
``update_or_insert``:inxx
``update_or_insert`` updates a matching record if one exists, and inserts a new one otherwise. The selection criterion can be a single field, or a query, such as
``
db.person.update_or_insert(
    (db.person.name == 'John') & (db.person.birthplace == 'Chicago'),
    name='John', birthplace='Chicago', pet='Rover')
``:code


#### ``validate_and_insert``, ``validate_and_update``
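A minimal sketch, assuming the standard DAL behavior of running the field validators before the operation and returning an object carrying ``id`` and ``errors``:
``
ret = db.mytable.validate_and_insert(myfield='somevalue')
if ret.errors:
    print ret.errors  # validation failed, nothing was inserted
else:
    print ret.id      # id of the newly inserted record
``:code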


Note for SQLite: web2py will not re-create a dropped table until you navigate to the "databases" directory of your app on the file system and delete the file associated with the dropped table.
### Indexes
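The DAL does not provide a dedicated command to create indexes, but they can be created with raw SQL through ``executesql`` (discussed below); a sketch for SQLite:
``
db.executesql('CREATE INDEX IF NOT EXISTS myidx ON person (name);')
``:code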

-------------
**SQLite** | ``sqlite://storage.db``
**MySQL** | ``mysql://username:password@localhost/test``
**PostgreSQL** | ``postgres://username:password@localhost/test``
**MSSQL** | ``mssql://username:password@localhost/test``
**FireBird** | ``firebird://username:password@localhost/test``
**Oracle** | ``oracle://username/password@test``
**DB2** | ``db2://username:password@test``
**Ingres** | ``ingres://username:password@localhost/test``
**Sybase** | ``sybase://username:password@localhost/test``
**Informix** | ``informix://username:password@test``
**Teradata** | ``teradata://DSN=dsn;UID=user;PWD=pass;DATABASE=test``
**Cubrid** | ``cubrid://username:password@localhost/test``
**SAPDB** | ``sapdb://username:password@localhost/test``
**IMAP** | ``imap://user:password@server:port``
**MongoDB** | ``mongodb://username:password@localhost/test``
**Google/SQL** | ``google:sql://project:instance/database``
**Google/NoSQL** | ``google:datastore``
-------------
Notice that in SQLite the database consists of a single file. If it does not exist, it is created. This file is locked every time it is accessed. In the case of MySQL, PostgreSQL, MSSQL, FireBird, Oracle, DB2, Ingres and Informix the database "test" must be created outside web2py. Once the connection is established, web2py will create, alter, and drop tables appropriately.
It is also possible to set the connection string to ``None``. In this case DAL will not connect to any back-end database, but the API can still be accessed for testing. Examples of this will be discussed in Chapter 7.
Sometimes you may need to generate SQL as if you had a connection but without actually connecting to the database. This can be done with
``
db = DAL('...', do_connect=False)
``:code
In this case you will be able to call ``_select``, ``_insert``, ``_update``, and ``_delete`` to generate SQL but not call ``select``, ``insert``, ``update``, and ``delete``. In most of the cases you can use ``do_connect=False`` even without having the required database drivers.
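For example, a sketch (the underscore-prefixed methods return the generated SQL string without executing anything):
``
db = DAL('sqlite://storage.db', do_connect=False)
db.define_table('person', Field('name'))
print db.person._insert(name='Alex')  # prints the SQL, nothing is executed
``:code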
Notice that by default web2py uses utf8 character encoding for databases. If you work with existing databases that behave differently, you have to change it with the optional parameter ``db_codec`` like
``
db = DAL('...', db_codec='latin1')
``:code
otherwise you'll get UnicodeDecodeError tickets.
#### Connection pooling
``connection pooling``:inxx
The second argument of the DAL constructor is the ``pool_size``; it defaults to zero.
As it is rather slow to establish a new database connection for each request, web2py implements a mechanism for connection pooling. Once a connection is established and the page has been served and the transaction completed, the connection is not closed but goes into a pool. When the next http request arrives, web2py tries to recycle a connection from the pool and use that for the new transaction. If there are no available connections in the pool, a new connection is established.
When web2py starts, the pool is always empty. The pool grows up to the minimum between the value of ``pool_size`` and the max number of concurrent requests. This means that if ``pool_size=10`` but our server never receives more than 5 concurrent requests, then the actual pool size will only grow to 5. If ``pool_size=0`` then connection pooling is not used.
Connections in the pools are shared sequentially among threads, in the sense that they may be used by two different but not simultaneous threads. There is only one pool for each web2py process.
The ``pool_size`` parameter is ignored by SQLite and Google App Engine; for SQLite, pooling would not yield any benefit.
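Enabling pooling is a matter of passing a nonzero ``pool_size``, for example:
``
db = DAL('postgres://username:password@localhost/test', pool_size=10)
``:code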
#### Connection failures
If web2py fails to connect to the database, it waits 1 second and tries again, up to 5 times, before declaring a failure. In the case of connection pooling it is possible that a pooled connection that stays open but unused for some time is closed at the database end. Thanks to the retry feature, web2py tries to re-establish these dropped connections.
#### Replicated databases
The first argument of ``DAL(...)`` can be a list of URIs. In this case web2py tries to connect to each of them. The main purpose for this is to deal with multiple database servers and distribute the workload among them. Here is a typical use case:
``
db = DAL(['mysql://...1','mysql://...2','mysql://...3'])
``:code
In this case the DAL tries to connect to the first and, on failure, it
will try the second and the third. This can also be used to distribute load
in a database master-slave configuration. We will talk more about this
in Chapter 13 in the context of scalability.
### Reserved keywords
``reserved Keywords``:inxx
``check_reserved`` is yet another argument that can be passed to the DAL constructor. It tells it to check table names and column names against reserved SQL keywords in target back-end databases. ``check_reserved`` defaults to None; if set, it is a list of strings that contain the database back-end adapter names.
The adapter name is the same as used in the DAL connection string. So if you want to check against PostgreSQL and MSSQL, the DAL call would look as follows:
``
db = DAL('sqlite://storage.db',
check_reserved=['postgres', 'mssql'])
``:code
The DAL will scan the keywords in the same order as the list.
There are two extra options "all" and "common". If you specify all, it will check against all known SQL keywords. If you specify common, it will only check against common SQL keywords such as ``SELECT``, ``INSERT``, ``UPDATE``, etc.
For supported back-ends you may also specify if you would like to check against the non-reserved SQL keywords as well. In this case you would append ``_nonreserved`` to the name. For example:
``
check_reserved=['postgres', 'postgres_nonreserved']
``:code
The following database backends support reserved words checking.
The most important method of a DAL is ``define_table``:
>>> db.define_table('person', Field('name'))
``:code
It defines, stores and returns a ``Table`` object called "person" containing a field (column) "name". This object can also be accessed via ``db.person``, so you do not need to catch the return value.
Do not declare a field called "id", because one is created by web2py anyway. Every table has a field called "id" by default. It is an auto-increment integer field (starting at 1) used for cross-reference and for making every record unique, so "id" is a primary key. (Note: the id's starting at 1 is back-end specific. For example, this does not apply to the Google App Engine NoSQL.)
``named id field``:inxx
Optionally you can define a field of ``type='id'`` and web2py will use this field as auto-increment id field. This is not recommended except when accessing legacy database tables. With some limitation, you can also use different primary keys and this is discussed in the section on "Legacy databases and keyed tables".
Tables can be defined only once but you can force web2py to redefine an existing table:
``
db.define_table('person', Field('name'))
db.define_table('person', Field('name'), redefine=True)
``:code
The redefinition may trigger a migration if field content is different.
----------
Because in web2py models are usually executed before controllers, it is possible that some tables are defined even if not needed. Table definitions can therefore be made lazy, to speed up the code, by passing ``DAL(..., lazy_tables=True)``. Tables will actually be created only when accessed.
----------
### Record representation
It is optional but recommended to specify a format representation for records:
``
>>> db.define_table('person', Field('name'), format='%(name)s')
``:code
or
``
>>> db.define_table('person', Field('name'), format='%(name)s %(id)s')
``:code
or even more complex ones using a function:
``
>>> db.define_table('person', Field('name'),
format=lambda r: r.name or 'anonymous')
``:code
All queries are automatically timed by web2py. The variable ``db._timings`` is a list of tuples; each tuple contains the raw SQL query as passed to the database driver and the time it took to execute, in seconds. This variable can be displayed in views, for example using the toolbar:
``
{{=response.toolbar()}}
``:code
#### ``executesql``
The DAL allows you to explicitly issue SQL statements.
``executesql``:inxx
``
>>> print db.executesql('SELECT * FROM person;')
[(1, u'Massimo'), (2, u'Massimo')]
``:code
In this case, the return values are not parsed or transformed by the DAL, and the format depends on the specific database driver. This usage with selects is normally not needed, but it is more common with indexes.
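For example, here is a sketch of creating an index via raw SQL (the exact statement and the index name ``idx_person_name`` are back-end specific and illustrative):
``
db.executesql('CREATE INDEX idx_person_name ON person (name);')
db.commit()  # needed when running outside web2py's request/commit cycle
``:code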
``executesql`` takes four optional arguments: ``placeholders``, ``as_dict``, ``fields`` and ``colnames``.
``placeholders`` is an optional sequence of values to be substituted in or, if supported by the DB driver, a dictionary with keys matching named placeholders in your SQL.
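For instance, a sketch using positional placeholders (the marker, here ``%s``, depends on the DB driver; SQLite's driver uses ``?``):
``
rows = db.executesql('SELECT * FROM person WHERE name=%s;',
                     placeholders=('Alex',))
``:code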
If ``as_dict`` is set to True, the results cursor returned by the DB driver will be converted to a sequence of dictionaries keyed with the db field names. Results returned with ``as_dict = True`` are the same as those returned when applying **.as_list()** to a normal select.
``
[{field1: value1, field2: value2}, {field1: value1b, field2: value2b}]
``:code
The ``fields`` argument is a list of DAL Field objects that match the
fields returned from the DB. The Field objects should be part of one or
more Table objects defined on the DAL object. The ``fields`` list can
include one or more DAL Table objects in addition to or instead of
including Field objects, or it can be just a single table (not in a
list). In that case, the Field objects will be extracted from the
table(s).
Instead of specifying the ``fields`` argument, the ``colnames`` argument
can be specified as a list of field names in tablename.fieldname format.
Again, these should represent tables and fields defined on the DAL
object.
It is also possible to specify both ``fields`` and the associated
``colnames``. In that case, ``fields`` can also include DAL Expression
objects in addition to Field objects. For Field objects in ``fields``, the
associated ``colnames`` must still be in tablename.fieldname format; for
Expression objects, the associated ``colnames`` can be any arbitrary labels.
Here is an example:
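A minimal sketch, assuming the ``person`` table defined earlier:
``
rows = db.executesql('SELECT id, name FROM person;', fields=db.person)
``:code
The returned rows then behave much like the result of a regular select on ``db.person``.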
``SQLFORM.grid``:inxx ``SQLFORM.smartgrid``:inxx
------
``SQLTABLE`` is useful but there are times when one needs more. ``SQLFORM.grid`` is an extension of SQLTABLE that creates a table with search features and pagination, as well as ability to open detailed records, create, edit and delete records. ``SQLFORM.smartgrid`` is a further generalization that allows all of the above but also creates buttons to access referencing records.
------
Here is an example of usage of ``SQLFORM.grid``:
``
def index():
return dict(grid=SQLFORM.grid(query))
``:code
and the corresponding view:
``
{{extend 'layout.html'}}
{{=grid}}
``
``SQLFORM.grid`` and ``SQLFORM.smartgrid`` should be preferred to ``SQLTABLE`` because they are more powerful although higher level and therefore more constraining. They will be explained in more detail in chapter 7.
#### ``orderby``, ``groupby``, ``limitby``, ``distinct``, ``having``
The ``select`` command takes five optional arguments: orderby, groupby, limitby, left and cache. Here we discuss the first three.
You can fetch the records sorted by name:
``orderby``:inxx ``groupby``:inxx ``having``:inxx
``
>>> for row in db().select(
db.person.ALL, orderby=db.person.name):
print row.name
Alex
Bob
Carl
``:code
You can fetch the records sorted by name in reverse order (notice the tilde):
``
>>> for row in db().select(
        db.person.ALL, orderby=~db.person.name):
    print row.name
Carl
Bob
Alex
``:code
You can do a union of two Rows objects (here ``rows1`` and ``rows2``, obtained from previous selects), removing duplicates:
``
>>> rows3 = rows1 | rows2
>>> print rows3
name
Max
Tim
John
``:code
#### ``find``, ``exclude``, ``sort``
``find``:inxx ``exclude``:inxx ``sort``:inxx
Sometimes you need to perform two selects and one contains a subset of a previous select. In this case it is pointless to access the database again. The ``find``, ``exclude`` and ``sort`` methods allow you to manipulate a Rows object and generate another one without accessing the database. More specifically:
- ``find`` returns a new set of Rows filtered by a condition and leaves the original unchanged.
- ``exclude`` returns a new set of Rows filtered by a condition and removes them from the original Rows.
- ``sort`` returns a new set of Rows sorted by a condition and leaves the original unchanged.
All these methods take a single argument, a function that acts on each individual row.
Here is an example of usage:
``
>>> db.define_table('person',Field('name'))
>>> db.person.insert(name='John')
>>> db.person.insert(name='Max')
>>> db.person.insert(name='Alex')
>>> rows = db(db.person).select()
>>> for row in rows.find(lambda row: row.name[0]=='M'):
print row.name
Max
>>> print len(rows)
3
>>> for row in rows.exclude(lambda row: row.name[0]=='M'):
print row.name
Max
>>> for row in rows.sort(lambda row: row.name):
print row.name
Alex
John
``:code
They can be combined:
``
>>> rows = db(db.person).select()
>>> rows = rows.find(
lambda row: 'x' in row.name).sort(
lambda row: row.name)
>>> for row in rows:
print row.name
Alex
Max
``:code
Sort takes an optional argument ``reverse=True`` with the obvious meaning.
The ``find`` method has an optional ``limitby`` argument with the same syntax and functionality as the Set ``select`` method.
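For example, a sketch that stops at the first match:
``
>>> first = rows.find(lambda row: 'a' in row.name, limitby=(0, 1))
``:code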
### Other methods
#### ``update_or_insert``
``update_or_insert``:inxx
Sometimes you need to perform an insert only if there is no record with the same values as those being inserted.
This can be done with
``
db.define_table('person',Field('name'),Field('birthplace'))
db.person.update_or_insert(name='John',birthplace='Chicago')
``:code
The record will be inserted only if there is no other user called John born in Chicago.
You can specify which values to use as a key to determine if the record exists. For example:
``
db.person.update_or_insert(db.person.name=='John',
name='John',birthplace='Chicago')
``:code
If there is a John his birthplace will be updated, else a new record will be created.
#### ``validate_and_insert``, ``validate_and_update``
``validate_and_insert``:inxx ``validate_and_update``:inxx
The function
``
ret = db.mytable.validate_and_insert(field='value')
``:code
works very much like
``
id = db.mytable.insert(field='value')
``:code
except that it calls the validators for the fields before performing the insert and bails out if the validation does not pass. If validation does not pass the errors can be found in ``ret.errors``. If it passes, the id of the new record is in ``ret.id``. Mind that normally validation is done by the form processing logic so this function is rarely needed.
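A minimal usage sketch, assuming the ``person`` table and validators on its ``name`` field:
``
ret = db.person.validate_and_insert(name='Alex')
if ret.errors:
    print ret.errors   # maps field names to error messages
else:
    print ret.id       # id of the newly inserted record
``:code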
Similarly
``
ret = db(query).validate_and_update(field='value')
``:code
works very much the same as
``
num = db(query).update(field='value')
``:code
except that it calls the validators for the fields before performing the update. Notice that it only works if the query involves a single table. The number of updated records can be found in ``ret.updated`` and errors will be in ``ret.errors``.
#### ``smart_query`` (experimental)
There are times when you need to parse a query using natural language such as
``
name contain m and age greater than 18
``
The DAL provides a method to parse these types of queries:
``
search = 'name contain m and age greater than 18'
rows = db.smart_query([db.person],search).select()
``
The first argument must be a list of tables or fields that should be allowed in the search. It raises a ``RuntimeError`` if the search string is invalid. This functionality can be used to build RESTful interfaces (see chapter 10) and it is used internally by the ``SQLFORM.grid`` and ``SQLFORM.smartgrid``.
In the smartquery search string, a field can be identified by fieldname alone or by tablename.fieldname. Strings may be delimited by double quotes if they contain spaces.
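For instance, a sketch of a fully qualified, quoted search (the ``age`` field is assumed, as in the example above):
``
search = 'person.name contain "John Smith" and person.age greater than 18'
rows = db.smart_query([db.person], search).select()
``:code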
In a ``select``, the argument:
``
left = db.thing.on(...)
``:code
does the left join query. Here the argument of ``db.thing.on`` is the condition required for the join (the same used above for the inner join). In the case of a left join, it is necessary to be explicit about which fields to select.
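For example, a sketch of a complete left join, assuming the ``person``/``thing`` tables with an ``owner_id`` reference used earlier:
``
rows = db().select(db.person.ALL, db.thing.ALL,
                   left=db.thing.on(db.person.id == db.thing.owner_id))
``:code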
Multiple left joins can be combined by passing a list or tuple of ``db.mytable.on(...)`` to the ``left`` attribute.
#### Grouping and counting
When doing joins, sometimes you want to group rows according to certain criteria and count them. For example, count the number of things owned by every person. web2py allows this as well. First, you need a count operator. Second, you want to join the person table with the thing table by owner. Third, you want to select all rows (person + thing), group them by person, and count them while grouping:
``grouping``:inxx
``
>>> count = db.person.id.count()
>>> for row in db(db.person.id==db.thing.owner_id).select(
db.person.name, count, groupby=db.person.name):
print row.person.name, row[count]
Alex 2
Bob 1
``:code
Notice the ``count`` operator (which is built-in) is used as a field. The only issue here is in how to retrieve the information. Each row clearly contains a person and the count, but the count is not a field of a person nor is it a table. So where does it go? It goes into the storage object representing the record with a key equal to the query expression itself. The count method of the Field object has an optional ``distinct`` argument. When set to ``True`` it specifies that only distinct values of the field in question are to be counted.
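For example, a sketch counting distinct names:
``
>>> count = db.person.name.count(distinct=True)
>>> print db(db.person).select(count).first()[count]
``:code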
### Many to many
``many-to-many``:inxx
In the previous examples, we allowed a thing to have one owner but one person could have many things. What if Boat was owned by Alex and Curt? This requires a many-to-many relation, and it is realized via an intermediate table that links a person to a thing via an ownership relation.
Here is how to do it:
``
>>> db.define_table('person',
Field('name'))
>>> db.define_table('thing',
Field('name'))
>>> db.define_table('ownership',
Field('person', 'reference person'),
Field('thing', 'reference thing'))
``:code
the existing ownership relationship can now be rewritten as:
``
>>> db.ownership.insert(person=1, thing=1) # Alex owns Boat
>>> db.ownership.insert(person=1, thing=2) # Alex owns Chair
``:code
The ``contains`` method (defined for string fields and ``list:``-type fields) can be asked to match all values in a list (``all=True``) or any value from the list:
``
db.mytable.myfield.contains(['value1', 'value2'], all=False)
``:code
There is also a ``regexp`` method that works like the ``like`` method but allows regular expression syntax for the look-up expression. It is only supported by PostgreSQL and SQLite.
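A sketch, mirroring the ``like`` example below (PostgreSQL and SQLite only):
``
>>> for row in db(db.log.event.regexp('^port')).select():
    print row.event
port scan
``:code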
The ``upper`` and ``lower`` methods allow you to convert the value of the field to upper or lower case, and you can also combine them with the like operator:
``upper``:inxx ``lower``:inxx
``
>>> for row in db(db.log.event.upper().like('PORT%')).select():
print row.event
port scan
``:code
#### ``year``, ``month``, ``day``, ``hour``, ``minutes``, ``seconds``
``hour``:inxx ``minutes``:inxx ``seconds``:inxx ``day``:inxx ``month``:inxx ``year``:inxx
The date and datetime fields have day, month and year methods. The datetime and time fields have hour, minutes and seconds methods. Here is an example:
``
>>> for row in db(db.log.event_time.year()==2013).select():
print row.event
port scan
xss injection
unauthorized login
``:code
#### ``belongs``
The SQL IN operator is realized via the ``belongs`` method, which returns true when the field value belongs to the specified set (a list or tuple):
``belongs``:inxx
``
>>> for row in db(db.log.severity.belongs((1, 2))).select():
print row.event
port scan
xss injection
``:code
The DAL also allows a nested select as the argument of the belongs operator. The only caveat is that the nested select has to be a ``_select``, not a ``select``, and only one field has to be selected explicitly, the one that defines the set.
``nested select``:inxx
``
>>> bad_days = db(db.log.severity==3)._select(db.log.event_time)
>>> for row in db(db.log.event_time.belongs(bad_days)).select():
print row.event
port scan
xss injection
unauthorized login
``:code
In those cases where a nested select is required and the look-up field is a reference we can also use a query as argument. For example:
``
db.define_table('person', Field('name'))
db.define_table('thing', Field('name'), Field('owner_id', 'reference person'))
db(db.thing.owner_id.belongs(db.person.name=='Jonathan')).select()
``:code
In this case it is obvious that the next select only needs the field referenced by the ``db.thing.owner_id`` field so we do not need the more verbose ``_select`` notation.
``nested_select``:inxx
A nested select can also be used as insert/update value but in this case the syntax is different:
``
lazy = db(db.person.name=='Jonathan').nested_select(db.person.id)
db(db.thing.id==1).update(owner_id = lazy)
``:code
In this case ``lazy`` is a nested expression that computes the ``id`` of person "Jonathan". The two lines result in one single SQL query.
#### ``sum``, ``avg``, ``min``, ``max`` and ``len``
``sum``:inxx ``avg``:inxx ``min``:inxx ``max``:inxx
Previously, you have used the count operator to count records. Similarly, you can use the sum operator to add (sum) the values of a specific field from a group of records. As in the case of count, the result of a sum is retrieved via the store object:
``
>>> sum = db.log.severity.sum()
>>> print db().select(sum).first()[sum]
``:code
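The ``len`` mentioned in the heading computes the length of a field's value. A sketch, assuming ``len()`` is available as an expression method on string fields:
``
length = db.log.event.len()   # an Expression computing LENGTH(event)
rows = db().select(db.log.event, length)
``:code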
Each record is identified by an ID and referenced by that ID. If you
have two copies of the database used by distinct web2py installations,
the ID is unique only within each database and not across the databases.
This is a problem when merging records from different databases.
In order to make records uniquely identifiable across databases, they must:
- have a unique id (UUID),
- have an event_time (to figure out which one is more recent if multiple copies),
- reference the UUID instead of the id.
This can be achieved without modifying web2py. Here is what to do:
Change the above model into:
``
import uuid
db.define_table('person',
Field('uuid', length=64, default=lambda:str(uuid.uuid4())),
Field('modified_on', 'datetime', default=request.now),
Field('name'),
format='%(name)s')
db.define_table('thing',
Field('uuid', length=64, default=lambda:str(uuid.uuid4())),
Field('modified_on', 'datetime', default=request.now),
Field('owner_id', length=64),
Field('name'),
format='%(name)s')
db.thing.owner_id.requires = IS_IN_DB(db,'person.uuid','%(name)s')
if not db(db.person.id).count():
id = uuid.uuid4()
db.person.insert(name="Massimo", uuid=id)
db.thing.insert(owner_id=id, name="Chair")
``:code
-------
Notice that in the above table definitions, the default value for the two ``uuid`` fields is set to a lambda function, which returns a UUID (converted to a string). The lambda function is called once for each record inserted, ensuring that each record gets a unique UUID, even if multiple records are inserted in a single transaction.
-------
Create a controller action to export the database:
``
import StringIO
def export():
    s = StringIO.StringIO()
    db.export_to_csv_file(s)
    response.headers['Content-Type'] = 'text/csv'
    return s.getvalue()
``:code
The ``export_to_csv_file`` function accepts a keyword argument named ``represent``. When set to ``True`` it will use the columns' ``represent`` function while exporting the data instead of the raw data.
``colnames``:inxx
The function also accepts a keyword argument named ``colnames`` that should contain a list of column names one wishes to export. It defaults to all columns.
Both ``export_to_csv_file`` and ``import_from_csv_file`` accept keyword arguments that tell the csv parser the format to save/load the files:
- ``delimiter``: delimiter to separate values (default ',')
- ``quotechar``: character to use to quote string values (default to double quotes)
- ``quoting``: quote system (default ``csv.QUOTE_MINIMAL``)
Here is some example usage:
``
>>> import csv
>>> rows = db(query).select()
>>> rows.export_to_csv_file(open('/tmp/test.txt', 'w'),
delimiter='|',
quotechar='"',
quoting=csv.QUOTE_NONNUMERIC)
``:code
Which would render something similar to
``
"hello"|35|"this is the text description"|"2013-03-03"
``:code
For more information consult the official Python documentation ``quoteall``:cite
### Caching selects
The select method also takes a cache argument, which defaults to None. For caching purposes, it should be set to a tuple where the first element is the cache model (cache.ram, cache.disk, etc.), and the second element is the expiration time in seconds.
In the following example, you see a controller that caches a select on the previously defined db.log table. The actual select fetches data from the back-end database no more frequently than once every 60 seconds and stores the result in cache.ram. If the next call to this controller occurs in less than 60 seconds since the last database IO, it simply fetches the previous data from cache.ram.
``cache select``:inxx
``
def cache_db_select():
logs = db().select(db.log.ALL, cache=(cache.ram, 60))
return dict(logs=logs)
``:code
``cacheable``:inxx
The ``select`` method has an optional ``cacheable`` argument, normally set to ``False``. When ``cacheable=True`` the resulting ``Rows`` object is serializable, but the rows lack the ``update_record`` and ``delete_record`` methods.
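For example, combining a cached select with ``cacheable`` (a sketch; the 3600-second expiry is illustrative):
``
rows = db(query).select(cache=(cache.ram, 3600), cacheable=True)
``:code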
#### Callbacks on record insert, delete and update
``_before_insert``:inxx ``_after_insert``:inxx
web2py lets you register callbacks on each table, stored in six lists: ``_before_insert``, ``_after_insert``, ``_before_update``, ``_after_update``, ``_before_delete`` and ``_after_delete``. This is best explained via some examples.
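The transcript below assumes callbacks registered along these lines (a sketch; ``pprint`` is a hypothetical helper that prints its arguments as a tuple):
``
>>> db.define_table('person', Field('name'))
>>> def pprint(*args): print args
>>> db.person._before_insert.append(lambda f: pprint(f))
>>> db.person._after_insert.append(lambda f, id: pprint(f, id))
>>> db.person._before_update.append(lambda s, f: pprint(s, f))
>>> db.person._after_update.append(lambda s, f: pprint(s, f))
>>> db.person._before_delete.append(lambda s: pprint(s))
>>> db.person._after_delete.append(lambda s: pprint(s))
``:code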
Here ``f`` is a dict of fields passed to insert or update, ``id`` is the id of the newly inserted record, ``s`` is the Set object used for update or delete.
``
>>> db.person.insert(name='John')
({'name': 'John'},)
({'name': 'John'}, 1)
>>> db(db.person.id==1).update(name='Tim')
(<Set (person.id = 1)>, {'name': 'Tim'})
(<Set (person.id = 1)>, {'name': 'Tim'})
>>> db(db.person.id==1).delete()
(<Set (person.id = 1)>,)
(<Set (person.id = 1)>,)
``:code
The return values of these callbacks should be ``None`` or ``False``. If any of the ``_before_*`` callbacks returns a ``True`` value it will abort the actual insert/update/delete operation.
``update_naive``:inxx
Sometimes a callback may need to perform an update in the same or a different table, and one wants to avoid callbacks calling themselves recursively.
For this purpose, Set objects have an ``update_naive`` method that works like ``update`` but ignores before and after callbacks.
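A sketch (the normalization performed is purely illustrative):
``
def normalize_name(s, f):
    if f.get('name'):
        # update_naive does not re-trigger _before_update/_after_update
        s.update_naive(name=f['name'].strip())
db.person._after_update.append(normalize_name)
``:code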
#### Record versioning
``_enable_record_versioning``:inxx
It is possible to ask web2py to save every copy of a record when the record is individually modified. There are different ways to do it and it can be done for all tables at once using the syntax:
``
auth.enable_record_versioning(db)
``:code
This requires ``Auth`` and is discussed in the chapter about authentication.
It can also be done for each individual table as discussed below.
Consider the following table:
``
db.define_table('stored_item',
    Field('name'),
    Field('quantity', 'integer'),
    Field('is_active', 'boolean',
          writable=False, readable=False, default=True))
``:code
Notice the hidden boolean field ``is_active``, which defaults to True.
We can tell web2py to create a new table (in the same or a different database) and store all previous versions of each record in the table, when modified.
This is done in the following way:
``
db.stored_item._enable_record_versioning()
``:code
or in a more verbose syntax:
``
db.stored_item._enable_record_versioning(
archive_db = db,
archive_name = 'stored_item_archive',
current_record = 'current_record',
is_active = 'is_active')
``
The ``archive_db=db`` tells web2py to store the archive table in the same database as the ``stored_item`` table. The ``archive_name`` sets the name for the archive table. The archive table has the same fields as the original table ``stored_item``, except that unique fields are no longer unique (because it needs to store multiple versions) and it has an extra field, whose name is specified by ``current_record``, which is a reference to the current record in the ``stored_item`` table.
When records are deleted, they are not really deleted. A deleted record is copied in the ``stored_item_archive`` table (as when it is modified) and the ``is_active`` field is set to False. By enabling record versioning web2py sets a ``custom_filter`` on this table that hides all records in table ``stored_item`` where the ``is_active`` field is set to False. The ``is_active`` parameter in the ``_enable_record_versioning`` method allows you to specify the name of the field used by the ``custom_filter`` to determine whether the record was deleted.
``custom_filter``s are ignored by the appadmin interface.
#### Common fields and multi-tenancy
``common fields``:inxx
``multi tenancy``:inxx
``db._common_fields`` is a list of fields that should belong to all tables. This list can also contain tables, in which case it is understood as all fields from the table. For example, occasionally you find yourself needing to add a signature to all your tables but the ``auth`` tables. In this case, after ``auth.define_tables()`` but before defining any other table, insert
``
db._common_fields.append(auth.signature)
``
One field is special: "request_tenant".
This field does not exist but you can create it and add it to any of your tables (or them all):
``
db._common_fields.append(Field('request_tenant',
default=request.env.http_host,writable=False))
``
When a table has such a field, all records in all queries are automatically filtered to those whose ``request_tenant`` equals its default value (here the requesting HTTP host); this provides a simple form of multi-tenancy. A related mechanism is the common filter. For example, given the filter below on a hypothetical ``blog_post`` table, any select, delete or update on this table will include only public blog posts.
``
db.blog_post._common_filter = lambda query: db.blog_post.is_public == True
``
It serves both as a way to avoid repeating the "db.blog_post.is_public==True" phrase in each blog post search, and as a security enhancement that prevents you from forgetting to exclude non-public posts.
In case you actually do want items left out by the common filter (for example, allowing the admin to see non-public posts), you can either remove the filter:
``
db.blog_post._common_filter = None
``
or ignore it:
``
db(query, ignore_common_filters=True).select(...)
``
#### Custom ``Field`` types (experimental)
``SQLCustomType``:inxx
Aside from using ``filter_in`` and ``filter_out``, it is possible to define new/custom field types.
For example we consider here a field that contains binary data in compressed form:
``
from gluon.dal import SQLCustomType
import zlib
compressed = SQLCustomType(
type ='text',
native='text',
encoder =(lambda x: zlib.compress(x or '')),
decoder = (lambda x: zlib.decompress(x))
)
db.define_table('example', Field('data',type=compressed))
``:code
``SQLCustomType`` is a field type factory. Its ``type`` argument must be one of the standard web2py types. It tells web2py how to treat the field values at the web2py level. ``native`` is the name of the field as far as the database is concerned. Allowed names depend on the database engine. ``encoder`` is an optional transformation function applied when the data is stored and ``decoder`` is the optional reversed transformation function.
This feature is marked as experimental. In practice it has been in web2py for a long time and it works but it can make the code not portable, for example when the native type is database specific. It does not work on Google App Engine NoSQL.
#### Using DAL without define tables
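A minimal sketch of the idea, assuming the ``auto_import`` argument (which reads table definitions back from the ``.table`` migration metadata files instead of requiring ``define_table`` calls):
``
from gluon import DAL
db = DAL('sqlite://storage.db',
         folder='path/to/app/databases',  # where the .table files live
         auto_import=True)
# previously defined tables, e.g. db.person, are now accessible
``:code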
The DAL is internally composed of adapter classes, more or less one per supported back-end:
``
SQLiteAdapter extends BaseAdapter
JDBCSQLiteAdapter extends SQLiteAdapter
MySQLAdapter extends BaseAdapter
PostgreSQLAdapter extends BaseAdapter
JDBCPostgreSQLAdapter extends PostgreSQLAdapter
OracleAdapter extends BaseAdapter
MSSQLAdapter extends BaseAdapter
MSSQL2Adapter extends MSSQLAdapter
FireBirdAdapter extends BaseAdapter
FireBirdEmbeddedAdapter extends FireBirdAdapter
InformixAdapter extends BaseAdapter
DB2Adapter extends BaseAdapter
IngresAdapter extends BaseAdapter
IngresUnicodeAdapter extends IngresAdapter
GoogleSQLAdapter extends MySQLAdapter
NoSQLAdapter extends BaseAdapter
GoogleDatastoreAdapter extends NoSQLAdapter
CubridAdapter extends MySQLAdapter (experimental)
TeradataAdapter extends DB2Adapter (experimental)
SAPDBAdapter extends BaseAdapter (experimental)
CouchDBAdapter extends NoSQLAdapter (experimental)
IMAPAdapter extends NoSQLAdapter (experimental)
MongoDBAdapter extends NoSQLAdapter (experimental)
``
which override the behavior of the ``BaseAdapter``.
Each adapter has more or less this structure:
``
class MySQLAdapter(BaseAdapter):
    # specify a driver to use
driver = globals().get('pymysql',None)
# map web2py types into database types
types = {
'boolean': 'CHAR(1)',
'string': 'VARCHAR(%(length)s)',
'text': 'LONGTEXT',
...
    }
``:code
At startup, the uri prefix of the connection string is mapped to an adapter class:
``
ADAPTERS = {
'oracle': OracleAdapter,
'mssql': MSSQLAdapter,
'mssql2': MSSQL2Adapter,
'db2': DB2Adapter,
'teradata': TeradataAdapter,
'informix': InformixAdapter,
'firebird': FireBirdAdapter,
'firebird_embedded': FireBirdAdapter,
'ingres': IngresAdapter,
'ingresu': IngresUnicodeAdapter,
'sapdb': SAPDBAdapter,
'cubrid': CubridAdapter,
'jdbc:sqlite': JDBCSQLiteAdapter,
'jdbc:sqlite:memory': JDBCSQLiteAdapter,
'jdbc:postgres': JDBCPostgreSQLAdapter,
'gae': GoogleDatastoreAdapter, # discouraged, for backward compatibility
'google:datastore': GoogleDatastoreAdapter,
'google:sql': GoogleSQLAdapter,
'couchdb': CouchDBAdapter,
'mongodb': MongoDBAdapter,
    'imap': IMAPAdapter
}
``:code
The uri string is then parsed in more detail by the adapter itself.
For any adapter you can replace the driver with a different one:
``
import MySQLdb as mysqldb
from gluon.dal import MySQLAdapter
MySQLAdapter.driver = mysqldb
``
i.e. ``mysqldb`` has to be ''that module'' with a .connect() method.
You can specify optional driver arguments and adapter arguments:
``
db = DAL(..., driver_args={}, adapter_args={})
``
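For instance, a sketch passing a driver-specific option (``connect_timeout`` is an argument of the MySQL drivers; it is illustrative, not a DAL feature):
``
db = DAL('mysql://username:password@localhost/test',
         driver_args={'connect_timeout': 10})
``:code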
#### Gotchas
**SQLite** does not support dropping and altering columns. That means that web2py migrations will work up to a point. If you delete a field from a table, the column will remain in the database but will be invisible to web2py. If you decide to reinstate the column, web2py will try to re-create it and fail. In this case you must set ``fake_migrate=True`` so that metadata is rebuilt without attempting to add the column again. Also, for the same reason, **SQLite** is not aware of any change of column type. If you insert a number in a string field, it will be stored as a string. If you later change the model and replace the type "string" with type "integer", SQLite will continue to keep the number as a string and this may cause problems when you try to extract the data.
**MySQL** does not support multiple ALTER TABLE statements within a single transaction. This means that any migration process is broken into multiple commits. If something happens that causes a failure it is possible to break a migration (the web2py metadata are no longer in sync with the actual table structure in the database). This is unfortunate but it can be prevented (migrate one table at a time) or it can be fixed a posteriori (revert the web2py model to what corresponds to the table structure in the database, set ``fake_migrate=True`` and, after the metadata has been rebuilt, set ``fake_migrate=False`` and migrate the table again).
**Google SQL** has the same problems as MySQL and more. In particular, table metadata itself must be stored in the database in a table that is not migrated by web2py. This is because Google App Engine has a read-only file system. web2py migrations in Google SQL combined with the MySQL issue described above can result in metadata corruption. Again, this can be prevented (by migrating the table at once and then setting ``migrate=False`` so that the metadata table is not accessed any more) or it can be fixed a posteriori (by accessing the database using the Google dashboard and deleting any corrupted entry from the table called ``web2py_filesystem``).
``limitby``:inxx
**MSSQL** does not support the SQL OFFSET keyword. Therefore the database cannot do pagination. When doing a ``limitby=(a,b)`` web2py will fetch the first ``b`` rows and discard the first ``a``. This may result in a considerable overhead when compared with other database engines.
**Oracle** also does not support pagination. It supports neither the OFFSET nor the LIMIT keywords. web2py achieves pagination by translating a ``db(...).select(limitby=(a, b))`` into a complex three-way nested select (as suggested by official Oracle documentation). This works for simple selects but may break for complex selects involving aliased fields and/or joins.
**MSSQL** has problems with circular references in tables that have ONDELETE CASCADE. This is an MSSQL bug and you work around it by setting the ondelete attribute for all reference fields to "NO ACTION". You can also do it once and for all before you define tables:
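A sketch of the table-wide approach (the field-level form, ``Field(..., ondelete='NO ACTION')``, certainly exists; whether the DAL constructor accepts an ``ondelete`` default depends on the DAL version, so treat this as an assumption):
``
db = DAL('mssql://username:password@localhost/test',
         ondelete='NO ACTION')  # assumed DAL-level default for reference fields
``:code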
For reference, here are examples of connection strings for the supported back-ends:
-------------
**SQLite** | ``sqlite://storage.db``
**MySQL** | ``mysql://username:password@localhost/test``
**PostgreSQL** | ``postgres://username:password@localhost/test``
**MSSQL** | ``mssql://username:password@localhost/test``
**FireBird** | ``firebird://username:password@localhost/test``
**Oracle** | ``oracle://username/password@test``
**DB2** | ``db2://username:password@test``
**Ingres** | ``ingres://username:password@localhost/test``
**Sybase** | ``sybase://username:password@localhost/test``
**Informix** | ``informix://username:password@test``
**Teradata** | ``teradata://DSN=dsn;UID=user;PWD=pass;DATABASE=name``
**Cubrid** | ``cubrid://username:password@localhost/test``
**SAPDB** | ``sapdb://username:password@localhost/test``
**IMAP** | ``imap://user:password@server:port``
**MongoDB** | ``mongodb://username:password@localhost/test``
**Google/SQL** | ``google:sql``
**Google/NoSQL** | ``google:datastore``
-------------
Notice that in SQLite the database consists of a single file. If it does not exist, it is created. This file is locked every time it is accessed. In the case of MySQL, PostgreSQL, MSSQL, FireBird, Oracle, DB2, Ingres and Informix the database "test" must be created outside web2py. Once the connection is established, web2py will create, alter, and drop tables appropriately.
It is also possible to set the connection string to ``None``. In this case DAL will not connect to any back-end database, but the API can still be accessed for testing. Examples of this will be discussed in Chapter 7.
Sometimes you may need to generate SQL as if you had a connection but without actually connecting to the database. This can be done with
``
db = DAL('...', do_connect=False)
``:code
In this case you will be able to call ``_select``, ``_insert``, ``_update``, and ``_delete`` to generate SQL but not call ``select``, ``insert``, ``update``, and ``delete``. In most of the cases you can use ``do_connect=False`` even without having the required database drivers.
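For example, a sketch that only generates SQL (``migrate=False`` avoids touching migration metadata):
``
db = DAL('postgres://username:password@localhost/test', do_connect=False)
db.define_table('person', Field('name'), migrate=False)
print db(db.person.name == 'Alex')._select(db.person.id)
``:code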
Notice that by default web2py uses utf8 character encoding for databases. If you work with existing databases that behave differently, you have to change it with the optional parameter ``db_codec`` like
``
db = DAL('...', db_codec='latin1')
``:code
otherwise you'll get UnicodeDecodeError tickets.
#### Connection pooling
``connection pooling``:inxx
The second argument of the DAL constructor is the ``pool_size``; it defaults to zero.
As it is rather slow to establish a new database connection for each request, web2py implements a mechanism for connection pooling. Once a connection is established and the page has been served and the transaction completed, the connection is not closed but goes into a pool. When the next http request arrives, web2py tries to recycle a connection from the pool and use that for the new transaction. If there are no available connections in the pool, a new connection is established.
When web2py starts, the pool is always empty. The pool grows up to the minimum between the value of ``pool_size`` and the max number of concurrent requests. This means that if ``pool_size=10`` but our server never receives more than 5 concurrent requests, then the actual pool size will only grow to 5. If ``pool_size=0`` then connection pooling is not used.
Connections in the pools are shared sequentially among threads, in the sense that they may be used by two different but not simultaneous threads. There is only one pool for each web2py process.
The ``pool_size`` parameter is ignored by SQLite and Google App Engine.
Connection pooling is ignored for SQLite, since it would not yield any benefit.
#### Connection failures
If web2py fails to connect to the database it waits 1 seconds and tries again up to 5 times before declaring a failure. In case of connection pooling it is possible that a pooled connection that stays open but unused for some time is closed by the database end. Thanks to the retry feature web2py tries to re-establish these dropped connections.
#### Replicated databases
The first argument of ``DAL(...)`` can be a list of URIs. In this case web2py tries to connect to each of them. The main purpose for this is to deal with multiple database servers and distribute the workload among them). Here is a typical use case:
``
db = DAL(['mysql://...1','mysql://...2','mysql://...3'])
``:code
In this case the DAL tries to connect to the first and, on failure, it
will try the second and the third. This can also be used to distribute load
in a database master-slave configuration. We will talk more about this
in Chapter 13 in the context of scalability.
### Reserved keywords
``reserved Keywords``:inxx
-``check_reserved`` is yet another argument that can be passed to the DAL constructor. It tells it to check table names and column names against reserved SQL keywords in target back-end databases.
-
This argument is ``check_reserved`` and it defaults to None.
This is a list of strings that contain the database back-end adapter names.
The adapter name is the same as used in the DAL connection string. So if you want to check against PostgreSQL and MSSQL then your connection string would look as follows:
``
db = DAL('sqlite://storage.db',
check_reserved=['postgres', 'mssql'])
``:code
The DAL will scan the keywords in the same order as of the list.
There are two extra options "all" and "common". If you specify all, it will check against all known SQL keywords. If you specify common, it will only check against common SQL keywords such as ``SELECT``, ``INSERT``, ``UPDATE``, etc.
For supported back-ends you may also specify if you would like to check against the non-reserved SQL keywords as well. In this case you would append ``_nonreserved`` to the name. For example:
``
check_reserved=['postgres', 'postgres_nonreserved']
``:code
The following database backends support reserved words checking.
The most important method of a DAL is ``define_table``:
>>> db.define_table('person', Field('name'))
``:code
It defines, stores and returns a ``Table`` object called "person" containing a field (column) "name". This object can also be accessed via ``db.person``, so you do not need to catch the return value.
Do not declare a field called "id", because one is created by web2py anyway. Every table has a field called "id" by default. It is an auto-increment integer field (starting at 1) used for cross-reference and for making every record unique, so "id" is a primary key. (Note: the id's starting at 1 is back-end specific. For example, this does not apply to the Google App Engine NoSQL.)
``named id field``:inxx
Optionally you can define a field of ``type='id'`` and web2py will use this field as auto-increment id field. This is not recommended except when accessing legacy database tables. With some limitation, you can also use different primary keys and this is discussed in the section on "Legacy databases and keyed tables".
Tables can be defined only once but you can force web2py to redefine an existing table:
``
db.define_table('person', Field('name'))
db.define_table('person', Field('name'), redefine=True)
``:code
The redefinition may trigger a migration if field content is different.
----------
Because usually in web2py models are executed before controllers, it is possible that some table are defined even if not needed. It is therefore necessary to speed up the code by making table definitions lazy. This is done by setting the ``DAL(...,lazy_tables=True)`` attributes. Tables will be actually created only when accessed.
----------
### Record representation
It is optional but recommended to specify a format representation for records:
``
>>> db.define_table('person', Field('name'), format='%(name)s')
``:code
or
``
>>> db.define_table('person', Field('name'), format='%(name)s %(id)s')
``:code
or even more complex ones using a function:
``
>>> db.define_table('person', Field('name'),
format=lambda r: r.name or 'anonymous')
``:code
All queries are automatically timed by web2py. The variable ``db._timings`` is a
{{=response.toolbar()}}
``
#### ``executesql``
The DAL allows you to explicitly issue SQL statements.
``executesql``:inxx
``
>>> print db.executesql('SELECT * FROM person;')
[(1, u'Massimo'), (2, u'Massimo')]
``:code
In this case, the return values are not parsed or transformed by the DAL, and the format depends on the specific database driver. This usage with selects is normally not needed, but it is more common with indexes.
``executesql`` takes four optional arguments: ``placeholders``, ``as_dict``, ``fields`` and ``colnames``.
``placeholders`` is an optional
sequence of values to be substituted in
or, if supported by the DB driver, a dictionary with keys
matching named placeholders in your SQL.
-If ``as_dict`` is set to True,
-and the results cursor returned by the DB driver will be
-converted to a sequence of dictionaries keyed with the db
-field names. Results returned with ``as_dict = True ``are
the same as those returned when applying **.as_list()** to a normal select.
``
[{field1: value1, field2: value2}, {field1: value1b, field2: value2b}]
``:code
The ``fields`` argument is a list of DAL Field objects that match the
fields returned from the DB. The Field objects should be part of one or
more Table objects defined on the DAL object. The ``fields`` list can
include one or more DAL Table objects in addition to or instead of
including Field objects, or it can be just a single table (not in a
list). In that case, the Field objects will be extracted from the
table(s).
Instead of specifying the ``fields`` argument, the ``colnames`` argument
can be specified as a list of field names in tablename.fieldname format.
Again, these should represent tables and fields defined on the DAL
object.
It is also possible to specify both ``fields`` and the associated
``colnames``. In that case, ``fields`` can also include DAL Expression
objects in addition to Field objects. For Field objects in "fields",
Here is an example:
``SQLFORM.grid``:inxx ``SQLFORM.smartgrid``:inxx
------
``SQLTABLE`` is useful but there are times when one needs more. ``SQLFORM.grid`` is an extension of SQLTABLE that creates a table with search features and pagination, as well as ability to open detailed records, create, edit and delete records. ``SQLFORM.smartgrid`` is a further generalization that allows all of the above but also creates buttons to access referencing records.
------
Here is an example of usage of ``SQLFORM.grid``:
``
def index():
return dict(grid=SQLFORM.grid(query))
``:code
and the corresponding view:
``
{{extend 'layout.html'}}
{{=grid}}
``
``SQLFORM.grid`` and ``SQLFORM.smartgrid`` should be preferred to ``SQLTABLE`` because they are more powerful although higher level and therefore more constraining. They will be explained in more detail in chapter 8.
#### ``orderby``, ``groupby``, ``limitby``, ``distinct``, ``having``
The ``select`` command takes five optional arguments: orderby, groupby, limitby, left and cache. Here we discuss the first three.
You can fetch the records sorted by name:
``orderby``:inxx ``groupby``:inxx ``having``:inxx
``
>>> for row in db().select(
db.person.ALL, orderby=db.person.name):
print row.name
Alex
Bob
Carl
``:code
You can fetch the records sorted by name in reverse order (notice the tilde):
``
>>> for row in db().select(
name
Max
Tim
John
Tim
``:code
You can do a union of the records removing duplicates:
``
>>> rows3 = rows1 | rows2
>>> print rows3
name
Max
Tim
John
``:code
#### ``find``, ``exclude``, ``sort``
``find``:inxx ``exclude``:inxx ``sort``:inxx
Some times you to perform two selects and one contains a subset of a previous select. In this case it is pointless to access the database again. The ``find``, ``exclude`` and ``sort`` objects allow you to manipulate a Rows objects and generate another one without accessing the database. More specifically:
- ``find`` returns a new set of Rows filtered by a condition and leaves the original unchanged.
- ``exclude`` returns a new set of Rows filtered by a condition and removes them from the original Rows.
- ``sort`` returns a new set of Rows sorted by a condition and leaves the original unchanged.
All these methods take a single argument, a function that acts on each individual row.
Here is an example of usage:
``
>>> db.define_table('person',Field('name'))
>>> db.person.insert(name='John')
>>> db.person.insert(name='Max')
>>> db.person.insert(name='Alex')
>>> rows = db(db.person).select()
>>> for row in rows.find(lambda row: row.name[0]=='M'):
print row.name
Max
>>> print len(rows)
3
>>> for row in rows.exclude(lambda row: row.name[0]=='M'):
print row.name
Max
>>> for row in rows.sort(lambda row: row.name):
print row.name
Alex
John
``:code
They can be combined:
``
>>> rows = db(db.person).select()
>>> rows = rows.find(
lambda row: 'x' in row.name).sort(
lambda row: row.name)
>>> for row in rows:
print row.name
Alex
Max
``:code
Sort takes an optional argument ``reverse=True`` with the obvious meaning.
The ``find`` method as an optional limitby argument with the same syntax and functionality as the Set select ``method``.
### Other methods
#### ``update_or_insert``
``update_or_insert``:inxx
Some times you need to perform an insert only if there is no record with the same values as those being inserted.
This can be done with
``
db.define_table('person',Field('name'),Field('birthplace'))
db.person.update_or_insert(name='John',birthplace='Chicago')
``:code
The record will be inserted only of there is no other user called John born in Chicago.
You can specify which values to use as a key to determine if the record exists. For example:
``
db.person.update_or_insert(db.person.name=='John',
name='John',birthplace='Chicago')
``:code
and if there is John his birthplace will be updated else a new record will be created.
#### ``validate_and_insert``, ``validate_and_update``
``validate_and_insert``:inxx ``validate_and_update``:inxx
The function
``
ret = db.mytable.validate_and_insert(field='value')
``:code
works very much like
``
id = db.mytable.insert(field='value')
``:code
except that it calls the validators for the fields before performing the insert and bails out if the validation does not pass. If validation does not pass the errors can be found in ``ret.error``. If it passes, the id of the new record is in ``ret.id``. Mind that normally validation is done by the form processing logic so this function is rarely needed.
Similarly
``
ret = db(query).validate_and_update(field='value')
``:code
works very much the same as
``
num = db(query).update(field='value')
``:code
except that it calls the validators for the fields before performing the update. Notice that it only works if query involves a single table. The number of updated records can be found in ``res.updated`` and errors will be ``ret.errors``.
#### ``smart_query`` (experimental)
There are times when you need to parse a query using natural language such as
``
name contain m and age greater than 18
``
The DAL provides a method to parse this type of queries:
``
search = 'name contain m and age greater than 18'
rows = db.smart_query([db.person],search).select()
``
The first argument must be a list of tables or fields that should be allowed in the search. It raises a ``RuntimeError`` if the search string is invalid. This functionality can be used to build RESTful interfaces (see chapter 10) and it is used internally by the ``SQLFORM.grid`` and ``SQLFORM.smartgrid``.
In the smartquery search string, a field can be identified by fieldname only and or by tablename.fieldname. Strings may be delimited by double quotes if they contain spaces.
left = db.thing.on(...)
``:code
does the left join query. Here the argument of ``db.thing.on`` is the condition required for the join (the same used above for the inner join). In the case of a left join, it is necessary to be explicit about which fields to select.
Multiple left joins can be combined by passing a list or tuple of ``db.mytable.on(...)`` to the ``left`` attribute.
#### Grouping and counting
When doing joins, sometimes you want to group rows according to certain criteria and count them. For example, count the number of things owned by every person. web2py allows this as well. First, you need a count operator. Second, you want to join the person table with the thing table by owner. Third, you want to select all rows (person + thing), group them by person, and count them while grouping:
``grouping``:inxx
``
>>> count = db.person.id.count()
>>> for row in db(db.person.id==db.thing.owner_id).select(
db.person.name, count, groupby=db.person.name):
print row.person.name, row[count]
Alex 2
Bob 1
``:code
Notice the count operator (which is built-in) is used as a field. The only issue here is in how to retrieve the information. Each row clearly contains a person and the count, but the count is not a field of a person nor is it a table. So where does it go? It goes into the storage object representing the record with a key equal to the query expression itself. The count method of the Field object has an optional ``distinct`` argument. When set to ``True`` it specifies that only distinct values of the field in question are to be counted.
### Many to many
``many-to-many``:inxx
In the previous examples, we allowed a thing to have one owner but one person could have many things. What if Boat was owned by Alex and Curt? This requires a many-to-many relation, and it is realized via an intermediate table that links a person to a thing via an ownership relation.
Here is how to do it:
``
>>> db.define_table('person',
Field('name'))
>>> db.define_table('thing',
Field('name'))
>>> db.define_table('ownership',
Field('person', 'reference person'),
Field('thing', 'reference thing'))
``:code
the existing ownership relationship can now be rewritten as:
``
>>> db.ownership.insert(person=1, thing=1) # Alex owns Boat
>>> db.ownership.insert(person=1, thing=2) # Alex owns Chair
or any value from the list
db.mytable.myfield.contains(['value1','value2'], all=false)
``
There is a also a ``regexp`` method that works like the ``like`` method but allows regular expression syntax for the look-up expression. It is only supported by PostgreSQL and SQLite.
The ``upper`` and ``lower`` methods allow you to convert the value of the field to upper or lower case, and you can also combine them with the like operator:
``upper``:inxx ``lower``:inxx
``
>>> for row in db(db.log.event.upper().like('PORT%')).select():
print row.event
port scan
``:code
#### ``year``, ``month``, ``day``, ``hour``, ``minutes``, ``seconds``
``hour``:inxx ``minutes``:inxx ``seconds``:inxx ``day``:inxx ``month``:inxx ``year``:inxx
The date and datetime fields have day, month and year methods. The datetime and time fields have hour, minutes and seconds methods. Here is an example:
``
>>> for row in db(db.log.event_time.year()==2009).select():
print row.event
port scan
xss injection
unauthorized login
``:code
#### ``belongs``
The SQL IN operator is realized via the belongs method which returns true when the field value belongs to the specified set (list of tuples):
``belongs``:inxx
``
>>> for row in db(db.log.severity.belongs((1, 2))).select():
print row.event
port scan
xss injection
``:code
The DAL also allows a nested select as the argument of the belongs operator. The only caveat is that the nested select has to be a ``_select``, not a ``select``, and only one field has to be selected explicitly, the one that defines the set.
``nested select``:inxx
``
>>> bad_days = db(db.log.severity==3)._select(db.log.event_time)
>>> for row in db(db.log.event_time.belongs(bad_days)).select():
print row.event
port scan
xss injection
unauthorized login
``:code
In those cases where a nested select is required and the look-up field is a reference we can also use a query as argument. For example:
``
-db.define_table('person',Field('name'))
-db.define_table('thing,Field('name'), Field('owner_id','reference thing'))
db(db.thing.owner_id.belongs(db.person.name=='Jonathan')).select()
``:code
In this case it is obvious that the next select only needs the field referenced by the ``db.thing.owner_id`` field so we do not need the more verbose ``_select`` notation.
``nested_select``:inxx
A nested select can also be used as insert/update value but in this case the syntax is different:
``
lazy = db(db.person.name=='Jonathan').nested_select(db.person.id)
db(db.thing.id==1).update(owner_id = lazy)
``:code
In this case ``lazy`` is a nested expression that computes the ``id`` of person "Jonathan". The two lines result in one single SQL query.
#### ``sum``, ``avg``, ``min``, ``max`` and ``len``
``sum``:inxx ``avg``:inxx ``min``:inxx ``max``:inxx
Previously, you have used the count operator to count records. Similarly, you can use the sum operator to add (sum) the values of a specific field from a group of records. As in the case of count, the result of a sum is retrieved via the store object:
if not db(db.person).count():
``:code
Each record is identified by an ID and referenced by that ID. If you
have two copies of the database used by distinct web2py installations,
the ID is unique only within each database and not across the databases.
This is a problem when merging records from different databases.
In order to make a record uniquely identifiable across databases, they
must:
- have a unique id (UUID),
- have an event_time (to figure out which one is more recent if multiple copies),
- reference the UUID instead of the id.
This can be achieved without modifying web2py. Here is what to do:
Change the above model into:
``
db.define_table('person',
Field('uuid', length=64, default=lambda:str(uuid.uuid4())),
Field('modified_on', 'datetime', default=now),
Field('name'),
format='%(name)s')
db.define_table('thing',
Field('uuid', length=64, default=lambda:str(uuid.uuid4())),
Field('modified_on', 'datetime', default=now),
Field('owner_id', length=64),
Field('name'),
format='%(name)s')
db.thing.owner_id.requires = IS_IN_DB(db,'person.uuid','%(name)s')
if not db(db.person.id).count():
id = uuid.uuid4()
db.person.insert(name="Massimo", uuid=id)
db.thing.insert(owner_id=id, name="Chair")
``:code
-------
Notice that in the above table definitions, the default value for the two ``uuid`` fields is set to a lambda function, which returns a UUID (converted to a string). The lambda function is called once for each record inserted, ensuring that each record gets a unique UUID, even if multiple records are inserted in a single transaction.
-------
Create a controller action to export the database:
``
def export():
The ``export_to_csv_file`` function accepts a keyword argument named ``represent
``colnames``:inxx
The function also accepts a keyword argument named ``colnames`` that should contain a list of column names one wish to export. It defaults to all columns.
Both ``export_to_csv_file`` and ``import_from_csv_file`` accept keyword arguments that tell the csv parser the format to save/load the files:
- ``delimiter``: delimiter to separate values (default ',')
- ``quotechar``: character to use to quote string values (default to double quotes)
- ``quoting``: quote system (default ``csv.QUOTE_MINIMAL``)
Here is some example usage:
``
>>> import csv
>>> rows = db(query).select()
>>> rows.export_to_csv_file(open('/tmp/test.txt', 'w'),
delimiter='|',
quotechar='"',
quoting=csv.QUOTE_NONNUMERIC)
``:code
Which would render something similar to
``
"hello"|35|"this is the text description"|"2009-03-03"
``:code
For more information consult the official Python documentation ``quoteall``:cite
### Caching selects
The select method also takes a cache argument, which defaults to None. For caching purposes, it should be set to a tuple where the first element is the cache model (cache.ram, cache.disk, etc.), and the second element is the expiration time in seconds.
In the following example, you see a controller that caches a select on the previously defined db.log table. The actual select fetches data from the back-end database no more frequently than once every 60 seconds and stores the result in cache.ram. If the next call to this controller occurs in less than 60 seconds since the last database IO, it simply fetches the previous data from cache.ram.
``cache select``:inxx
``
def cache_db_select():
logs = db().select(db.log.ALL, cache=(cache.ram, 60))
return dict(logs=logs)
``:code
``cacheable``:inxx
The ``select`` method has an optional ``cacheable`` argument, normally set to ``False``. When ``cacheable=True`` the resulting ``Rows`` is serializable but The ``Row``s lack ``update_record`` and ``delete_record`` methods.
This is best explained via some examples.
``:code
Here ``f`` is a dict of fields passed to insert or update, ``id`` is the id of the newly inserted record, ``s`` is the Set object used for update or delete.
``
>>> db.person.insert(name='John')
({'name': 'John'},)
({'name': 'John'}, 1)
>>> db(db.person.id==1).update(name='Tim')
(<Set (person.id = 1)>, {'name': 'Tim'})
(<Set (person.id = 1)>, {'name': 'Tim'})
>>> db(db.person.id==1).delete()
(<Set (person.id = 1)>,)
(<Set (person.id = 1)>,)
``:code
The return values of these callback should be ``None`` or ``False``. If any of the ``_before_*`` callback returns a ``True`` value it will abort the actual insert/update/delete operation.
``update_naive``:inxx.
Some times a callback may need to perform an update in the same of a different table and one wants to avoid callbacks calling themselves recursively.
For this purpose there the Set objects have an ``update_naive`` method that works like ``update`` but ignores before and after callbacks.
#### Record versioning
``_enable_record_versioning``:inxx
It is possible to ask web2py to save every copy of a record when the record is individually modified. There are different ways to do it and it can be done for all tables at once using the syntax:
``
auth.enable_record_versioning(db)
``:code
this requires Auth and it is discussed in the chapter about authentication.
It can also be done for each individual table as discussed below.
Consider the following table:
``
db.define_table('stored_item',
True.
We can tell web2py to create a new table (in the same or a different database) and store all previous versions of each record in the table, when modified.
This is done in the following way:
``
db.stored_item._enable_record_versioning()
``:code
or in a more verbose syntax:
``
db.stored_item._enable_record_versioning(
archive_db = db,
archive_name = 'stored_item_archive',
current_record = 'current_record',
is_active = 'is_active')
``
The ``archive_db=db`` tells web2py to store the archive table in the same database as the ``stored_item`` table. The ``archive_name`` sets the name for the archive table. The archive table has the same fields as the original table ``stored_item`` except that unique fields are no longer unique (because it needs to store multiple versions) and has an extra field which name is specified by ``current_record`` and which is a reference to the current record in the ``stored_item`` table.
When records are deleted, they are not really deleted. A deleted record is copied in the ``stored_item_archive`` table (like when it is modified) and the ``is_active`` field is set to False. By enabling record versioning web2py sets a ``custom_filter`` on this table that hides all fields in table ``stored_item`` where the ``is_active`` field is set to False. The ``is_active`` parameter in the ``_enable_record_versioning`` method allows to specify the name of the field used by the ``custom_filter`` to determine if the field was deleted or not.
``custom_filter``s are ignored by the appadmin interface.
#### Common fields and multi-tenancy
``common fields``:inxx
``multi tenancy``:inxx
``db._common_fields`` is a list of fields that should belong to all the tables. This list can also contain tables and it is understood as all fields from the table. For example occasionally you find yourself in need to add a signature to all your tables but the ```auth`` tables. In this case, after you ``db.define_tables()`` but before defining any other table, insert
``
db._common_fields.append(auth.signature)
``
One field is special: "request_tenant".
This field does not exist, but you can create it and add it to any of your tables (or all of them):
``
db._common_fields.append(
    Field('request_tenant',
          default=request.env.http_host,
          writable=False))
``:code
For every table with a field called "request_tenant", all records for all queries are automatically filtered by ``db.table.request_tenant == db.table.request_tenant.default``, and for every inserted record the field is set to its default. In the example above, each hostname serving the application sees only its own records: this is multi-tenancy.
``common filter``:inxx
A common filter is a generalization of this idea: a filter that the DAL applies automatically to every query on a table. For example, given a ``blog_post`` table with a boolean ``is_public`` field, you can set:
``
db.blog_post._common_filter = lambda query: db.blog_post.is_public == True
``:code
Any subsequent select, delete or update on this table will include only public blog posts. It serves both as a way to avoid repeating the ``db.blog_post.is_public==True`` phrase in each blog post search, and as a security enhancement that prevents you from forgetting to disallow viewing of non-public posts.
In case you actually do want items left out by the common filter (for example, allowing the admin to see non-public posts), you can either remove the filter:
``
db.blog_post._common_filter = None
``
or ignore it:
``
db(query, ignore_common_filters=True).select(...)
``
#### Custom ``Field`` types (experimental)
``SQLCustomType``:inxx
Aside from using ``filter_in`` and ``filter_out``, it is possible to define new/custom field types.
Consider, for example, a field that contains binary data in compressed form:
``
from gluon.dal import SQLCustomType
import zlib

compressed = SQLCustomType(
    type='text',
    native='text',
    encoder=(lambda x: zlib.compress(x or '')),
    decoder=(lambda x: zlib.decompress(x)))

db.define_table('example', Field('data', type=compressed))
``:code
``SQLCustomType`` is a field type factory. Its ``type`` argument must be one of the standard web2py types. It tells web2py how to treat the field values at the web2py level. ``native`` is the name of the field as far as the database is concerned. Allowed names depend on the database engine. ``encoder`` is an optional transformation function applied when the data is stored and ``decoder`` is the optional reversed transformation function.
This feature is marked as experimental. In practice it has been in web2py for a long time and it works, but it can make the code non-portable, for example when the native type is database-specific. It does not work on Google App Engine NoSQL.
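A short usage sketch: values are compressed transparently on insert and decompressed on select:
``
>>> row_id = db.example.insert(data='some long repetitive text' * 100)
>>> db.example(row_id).data == 'some long repetitive text' * 100
True
``:code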
#### Using DAL without define tables
``auto_import``:inxx
The DAL can also be used without redefining your tables, by asking it to read the table definitions from the migration metadata files via the ``auto_import`` argument (a minimal sketch):
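``
db = DAL('sqlite://storage.db', auto_import=True)
``:code
This gives access to any ``db.table`` without re-defining it, as long as the corresponding ".table" metadata file exists in the application's "databases" folder.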
Internally, every supported database back end is handled by an adapter class. The adapters are organized in the following hierarchy:
``
SQLiteAdapter extends BaseAdapter
JDBCSQLiteAdapter extends SQLiteAdapter
MySQLAdapter extends BaseAdapter
PostgreSQLAdapter extends BaseAdapter
JDBCPostgreSQLAdapter extends PostgreSQLAdapter
OracleAdapter extends BaseAdapter
MSSQLAdapter extends BaseAdapter
MSSQL2Adapter extends MSSQLAdapter
FireBirdAdapter extends BaseAdapter
FireBirdEmbeddedAdapter extends FireBirdAdapter
InformixAdapter extends BaseAdapter
DB2Adapter extends BaseAdapter
IngresAdapter extends BaseAdapter
IngresUnicodeAdapter extends IngresAdapter
GoogleSQLAdapter extends MySQLAdapter
NoSQLAdapter extends BaseAdapter
GoogleDatastoreAdapter extends NoSQLAdapter
CubridAdapter extends MySQLAdapter (experimental)
TeradataAdapter extends DB2Adapter (experimental)
SAPDBAdapter extends BaseAdapter (experimental)
CouchDBAdapter extends NoSQLAdapter (experimental)
MongoDBAdapter extends NoSQLAdapter (experimental)
``
Each adapter overrides or extends the behavior of ``BaseAdapter`` and has more or less this structure:
``
class MySQLAdapter(BaseAdapter):
    # specify a driver to use
    driver = globals().get('pymysql', None)
    # map web2py types into database types
    types = {
        'boolean': 'CHAR(1)',
        'string': 'VARCHAR(%(length)s)',
        'text': 'LONGTEXT',
        ...
    }
``:code
Each connection-string prefix is mapped to an adapter class in the ``ADAPTERS`` dictionary (shown here in part):
``
ADAPTERS = {
'oracle': OracleAdapter,
'mssql': MSSQLAdapter,
'mssql2': MSSQL2Adapter,
'db2': DB2Adapter,
'teradata': TeradataAdapter,
'informix': InformixAdapter,
'firebird': FireBirdAdapter,
'firebird_embedded': FireBirdAdapter,
'ingres': IngresAdapter,
'ingresu': IngresUnicodeAdapter,
'sapdb': SAPDBAdapter,
'cubrid': CubridAdapter,
'jdbc:sqlite': JDBCSQLiteAdapter,
'jdbc:sqlite:memory': JDBCSQLiteAdapter,
'jdbc:postgres': JDBCPostgreSQLAdapter,
'gae': GoogleDatastoreAdapter, # discouraged, for backward compatibility
'google:datastore': GoogleDatastoreAdapter,
'google:sql': GoogleSQLAdapter,
'couchdb': CouchDBAdapter,
'mongodb': MongoDBAdapter,
}
``:code
The uri string is then parsed in more detail by the adapter itself.
For any adapter you can replace the driver with a different one:
``
import MySQLdb as mysqldb
from gluon.dal import MySQLAdapter
MySQLAdapter.driver = mysqldb
``
i.e. ``mysqldb`` has to be ''that module'' with a .connect() method.
You can specify optional driver arguments and adapter arguments:
``
db = DAL(..., driver_args={}, adapter_args={})
``
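For example, a sketch passing a driver argument through to SQLite's ``connect`` (``check_same_thread`` is a parameter of the standard ``sqlite3`` driver, used here purely as an illustration):
``
db = DAL('sqlite://storage.db',
         driver_args={'check_same_thread': False})
``:code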
#### Gotchas
**SQLite** does not support dropping and altering columns. That means that web2py migrations will work up to a point. If you delete a field from a table, the column will remain in the database but be invisible to web2py. If you decide to reinstate the column, web2py will try to re-create it and fail. In this case you must set ``fake_migrate=True`` so that metadata is rebuilt without attempting to add the column again. Also, for the same reason, **SQLite** is not aware of any change of column type. If you insert a number in a string field, it will be stored as a string. If you later change the model and replace the type "string" with type "integer", SQLite will continue to keep the number as a string and this may cause problems when you try to extract the data.
**MySQL** does not support multiple ALTER TABLE within a single transaction. This means that any migration process is broken into multiple commits. If something happens that causes a failure, it is possible to break a migration (the web2py metadata are no longer in sync with the actual table structure in the database). This is unfortunate but it can be prevented (migrate one table at a time) or fixed a posteriori (revert the web2py model to what corresponds to the table structure in the database, set ``fake_migrate=True``, and after the metadata has been rebuilt, set ``fake_migrate=False`` and migrate the table again).
**Google SQL** has the same problems as MySQL and more. In particular, table metadata itself must be stored in the database in a table that is not migrated by web2py. This is because Google App Engine has a read-only file system. Web2py migrations in Google:SQL, combined with the MySQL issue described above, can result in metadata corruption. Again, this can be prevented (by migrating the table at once and then setting migrate=False so that the metadata table is not accessed any more) or it can be fixed a posteriori (by accessing the database using the Google dashboard and deleting any corrupted entry from the table called ``web2py_filesystem``).
``limitby``:inxx
**MSSQL** does not support the SQL OFFSET keyword. Therefore the database cannot do pagination. When doing a ``limitby=(a,b)`` web2py will fetch the first ``b`` rows and discard the first ``a``. This may result in a considerable overhead when compared with other database engines.
**Oracle** also does not support pagination. It supports neither the OFFSET nor the LIMIT keywords. Web2py achieves pagination by translating a ``db(...).select(limitby=(a,b))`` into a complex three-way nested select (as suggested by official Oracle documentation). This works for a simple select but may break for complex selects involving aliased fields and/or joins.
**MSSQL** has problems with circular references in tables that have ONDELETE CASCADE. This is an MSSQL bug and you can work around it by setting the ondelete attribute for all reference fields to "NO ACTION". You can also do it once and for all before you define the tables.


#### Recursive selects
``recursive selects``:inxx
Consider the previous table person and a new table "thing" referencing a "person":
``
>>> db.define_table('thing',
        Field('name'),
        Field('owner_id', 'reference person'))
``:code
and a simple select from this table:
``
>>> things = db(db.thing).select()
``:code
which is equivalent to
``
>>> things = db(db.thing._id>0).select()
``:code
where ``._id`` is a reference to the primary key of the table. Normally ``db.thing._id`` is the same as ``db.thing.id`` and we will assume that in most of this book. ``_id``:inxx
For each Row of things it is possible to fetch not just fields from the selected table (thing) but also from linked tables (recursively):
``
>>> for thing in things: print thing.name, thing.owner_id.name
``:code
Here ``thing.owner_id.name`` requires one database select for each thing in things and it is therefore inefficient. We suggest using joins whenever possible instead of recursive selects, nevertheless this is convenient and practical when accessing individual records.
You can also do it backwards, by selecting the things referenced by a person:
``
person = db.person(id)
for thing in person.thing.select(orderby=db.thing.name):
    print person.name, 'owns', thing.name
``:code
In this last expression ``person.thing`` is a shortcut for
``
db(db.thing.owner_id==person.id)
``:code
i.e. the Set of ``thing``s referenced by the current ``person``. This syntax breaks down if the referencing table has multiple references to the referenced table. In this case one needs to be more explicit and use a full Query.
#### Serializing ``Rows`` in views
Given the following action containing a query
``SQLTABLE``:inxx
``
def index():
return dict(rows = db(query).select())
``:code
The result of a select can be displayed in a view with the following syntax:
``
{{extend 'layout.html'}}
<h1>Records</h1>
{{=rows}}
``:code
Virtual fields can be ''lazy''; all they need to do is return a function and access it by calling the function. For example, using a lambda:
``
>>> class MyVirtualFields(object):
        def lazy_total_price(self):
            return lambda self=self: self.item.unit_price \
                * self.item.quantity
``:code
### One to many relation
``one to many``:inxx
To illustrate how to implement one to many relations with the web2py DAL, define another table "thing" that refers to the table "person" which we redefine here:
``
>>> db.define_table('person',
        Field('name'),
        format='%(name)s')
>>> db.define_table('thing',
        Field('name'),
        Field('owner_id', 'reference person'),
        format='%(name)s')
``:code
Table "thing" has two fields, the name of the thing and the owner of the thing. The "owner_id" field id a reference field. A reference type can be specified in two equivalent ways:
``
Field('owner_id', 'reference person')
Field('owner_id', db.person)
``:code
The latter is always converted to the former. They are equivalent except in the case of lazy tables, self references or other types of cyclic references where the former notation is the only allowed notation.
When a field type is another table, it is intended that the field reference the other table by its id. In fact, you can print the actual type value and get:
``
>>> print db.thing.owner_id.type
reference person
``:code
Now, insert three things, two owned by Alex and one by Bob:
``
>>> db.thing.insert(name='Boat', owner_id=1)
1
>>> db.thing.insert(name='Chair', owner_id=1)
2
>>> db.thing.insert(name='Shoes', owner_id=2)
3
``:code
You can select as you did for any other table:
``
>>> for row in db(db.thing.owner_id==1).select():
        print row.name
Boat
Chair
``:code
Because a thing has a reference to a person, a person can have many things, so a record of table person now acquires a new attribute thing, which is a Set, that defines the things of that person. This allows looping over all persons and fetching their things easily:
``referencing``:inxx
``
>>> for person in db().select(db.person.ALL):
        print person.name
        for thing in person.thing.select():
            print ' ', thing.name
Alex
Boat
Chair
Bob
Shoes
Carl
``:code
#### Inner joins
Another way to achieve a similar result is by using a join, specifically an INNER JOIN. web2py performs joins automatically and transparently when the query links two or more tables as in the following example:
``Rows``:inxx ``inner join``:inxx ``join``:inxx
``
>>> rows = db(db.person.id==db.thing.owner_id).select()
>>> for row in rows:
        print row.person.name, 'has', row.thing.name
Alex has Boat
Alex has Chair
Bob has Shoes
``:code
Observe that web2py did a join, so the rows now contain two records, one from each table, linked together. Because the two records may have fields with conflicting names, you need to specify the table when extracting a field value from a row. This means that while before you could do:
``
row.name
``:code
and it was obvious whether this was the name of a person or a thing, in the result of a join you have to be more explicit and say:
``
row.person.name
``:code
or:
``
row.thing.name
``:code
There is an alternative syntax for INNER JOINS:
``
>>> rows = db(db.person).select(join=db.thing.on(db.person.id==db.thing.owner_id))
>>> for row in rows:
        print row.person.name, 'has', row.thing.name
Alex has Boat
Alex has Chair
Bob has Shoes
``:code
While the output is the same, the generated SQL in the two cases can be different. The latter syntax removes possible ambiguities when the same table is joined twice and aliased:
``
>>> db.define_table('thing',
        Field('name'),
        Field('owner_id1', 'reference person'),
        Field('owner_id2', 'reference person'))
>>> rows = db(db.person).select(
        join=[db.person.with_alias('owner_id1').on(db.person.id==db.thing.owner_id1),
              db.person.with_alias('owner_id2').on(db.person.id==db.thing.owner_id2)])
``:code
The value of ``join`` can be a list of ``db.table.on(...)`` to join.
#### Left outer join
Notice that Carl did not appear in the list above because he has no things. If you intend to select on persons (whether they have things or not) and their things (if they have any), then you need to perform a LEFT OUTER JOIN. This is done using the argument "left" of the select command. Here is an example:
``Rows``:inxx ``left outer join``:inxx ``outer join``:inxx
``
>>> rows = db().select(
        db.person.ALL, db.thing.ALL,
        left=db.thing.on(db.person.id==db.thing.owner_id))
>>> for row in rows:
        print row.person.name, 'has', row.thing.name
Alex has Boat
Alex has Chair
Bob has Shoes
Carl has None
``:code
where:
``
left = db.thing.on(...)
``:code
does the left join query. Here the argument of ``db.thing.on`` is the condition required for the join (the same used above for the inner join). In the case of a left join, it is necessary to be explicit about which fields to select.
Multiple left joins can be combined by passing a list or tuple of ``db.mytable.on(...)`` to the ``left`` attribute.
#### Grouping and counting
When doing joins, sometimes you want to group rows according to certain criteria and count them. For example, count the number of things owned by every person. web2py allows this as well. First, you need a count operator. Second, you want to join the person table with the thing table by owner. Third, you want to select all rows (person + thing), group them by person, and count them while grouping:
``grouping``:inxx
``
>>> count = db.person.id.count()
>>> for row in db(db.person.id==db.thing.owner_id).select(
        db.person.name, count, groupby=db.person.name):
        print row.person.name, row[count]
Alex 2
Bob 1
``:code
Notice the count operator (which is built-in) is used as a field. The only issue here is in how to retrieve the information. Each row clearly contains a person and the count, but the count is not a field of a person nor is it a table. So where does it go? It goes into the storage object representing the record with a key equal to the query expression itself. The count method of the Field object has an optional ``distinct`` argument. When set to ``True`` it specifies that only distinct values of the field in question are to be counted.
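For example, a small sketch using ``distinct`` with the tables above, counting how many distinct owners appear in the thing table:
``
>>> count = db.thing.owner_id.count(distinct=True)
>>> print db(db.thing).select(count).first()[count]
2
``:code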
### Many to many
``many-to-many``:inxx
In the previous examples, we allowed a thing to have one owner but one person could have many things. What if Boat was owned by Alex and Curt? This requires a many-to-many relation, and it is realized via an intermediate table that links a person to a thing via an ownership relation.
Here is how to do it:
``
>>> db.define_table('person',
        Field('name'))
>>> db.define_table('thing',
        Field('name'))
>>> db.define_table('ownership',
        Field('person', 'reference person'),
        Field('thing', 'reference thing'))
``:code
#### ``belongs``
``belongs``:inxx
The SQL IN operator is realized via the ``belongs`` method, which returns true when the field value belongs to the specified set (a list or tuple):
``
>>> for row in db(db.log.event.belongs(('port scan', 'xss injection'))).select():
        print row.event
port scan
xss injection
``:code
The DAL also allows a nested select as the argument of the belongs operator. The only caveat is that the nested select has to be a ``_select``, not a ``select``, and only one field has to be selected explicitly, the one that defines the set.
``nested select``:inxx
``
>>> bad_days = db(db.log.severity==3)._select(db.log.event_time)
>>> for row in db(db.log.event_time.belongs(bad_days)).select():
        print row.event
port scan
xss injection
unauthorized login
``:code
In those cases where a nested select is required and the look-up field is a reference we can also use a query as argument. For example:
``
db.define_table('person', Field('name'))
db.define_table('thing', Field('name'), Field('owner_id', 'reference person'))
db(db.thing.owner_id.belongs(db.person.name=='Jonathan')).select()
``:code
In this case it is obvious that the next select only needs the field referenced by the ``db.thing.owner_id`` field so we do not need the more verbose ``_select`` notation.
``nested_select``:inxx
A nested select can also be used as insert/update value but in this case the syntax is different:
``
lazy = db(db.person.name=='Jonathan').nested_select(db.person.id)
db(db.thing.id==1).update(owner_id = lazy)
``:code
In this case ``lazy`` is a nested expression that computes the ``id`` of person "Jonathan". The two lines result in one single SQL query.
#### ``sum``, ``avg``, ``min``, ``max`` and ``len``
``sum``:inxx ``avg``:inxx ``min``:inxx ``max``:inxx
Previously, you have used the count operator to count records. Similarly, you can use the sum operator to add (sum) the values of a specific field from a group of records. As in the case of count, the result of a sum is retrieved via the store object:
``
>>> sum = db.log.severity.sum()
>>> print db().select(sum).first()[sum]
6
``:code
You can also use ``avg``, ``min``, and ``max`` to get the average, minimum, and maximum value respectively for the selected records. For example:
``
>>> max = db.log.severity.max()
>>> print db().select(max).first()[max]
3
``:code
And finally, here is ``_update``: ``_update``:inxx
``
>>> print db(db.person.name=='Alex')._update()
UPDATE person SET WHERE person.name='Alex';
``:code
-----
Moreover you can always use ``db._lastsql`` to return the most recent
SQL code, whether it was executed manually using executesql or was SQL
generated by the DAL.
-----
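For example, a quick sketch (the exact SQL string depends on the back end):
``
>>> rows = db(db.log.severity==3).select()
>>> print db._lastsql   # the SELECT statement the DAL just executed
``:code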
### Exporting and importing data
``export``:inxx ``import``:inxx
#### CSV (one Table at a time)
When a Rows object is converted to a string it is automatically
serialized in CSV:
``csv``:inxx
``
>>> rows = db(db.person.id==db.thing.owner_id).select()
>>> print rows
person.id,person.name,thing.id,thing.name,thing.owner_id
1,Alex,1,Boat,1
1,Alex,2,Chair,1
2,Bob,3,Shoes,2
``:code
You can serialize a single table in CSV and store it in a file "test.csv":
``
>>> open('test.csv', 'wb').write(str(db(db.person.id).select()))
``:code
This is equivalent to
``
>>> rows = db(db.person.id).select()
>>> rows.export_to_csv_file(open('test.csv', 'wb'))
``:code
You can read the CSV file back with:
``
>>> db.person.import_from_csv_file(open('test.csv', 'r'))
``:code
The file does not include uploaded files if these are not stored in the database. In any case it is easy enough to zip the "uploads" folder separately.
When importing, the new records will be appended to the database if it is not empty. In general the new imported records will not have the same record id as the original (saved) records but web2py will restore references so they are not broken, even if the id values may change.
If a table contains a field called
"uuid", this field will be used to identify duplicates. Also, if an
imported record has the same "uuid" as an existing record, the
previous record will be updated.
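#### CSV (all tables at once)
You can also export and import an entire database at once; this is the mechanism the synchronization recipe below relies on:
``
db.export_to_csv_file(open('somefile.csv', 'wb'))
db.import_from_csv_file(open('somefile.csv', 'rb'))
``:code
This mechanism can be used even if the importing database is of a different type than the exporting database.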
#### CSV and remote database synchronization
Consider the following model:
``
db = DAL('sqlite:memory')
db.define_table('person',
    Field('name'),
    format='%(name)s')
db.define_table('thing',
    Field('owner_id', 'reference person'),
    Field('name'),
    format='%(name)s')
if not db(db.person).count():
    id = db.person.insert(name="Massimo")
    db.thing.insert(owner_id=id, name="Chair")
``:code
Each record is identified by an ID and referenced by that ID. If you
have two copies of the database used by distinct web2py installations,
the ID is unique only within each database and not across the databases.
This is a problem when merging records from different databases.
In order to make records uniquely identifiable across databases, they must:
- have a unique id (UUID),
- have an event_time (to figure out which one is more recent if multiple copies),
- reference the UUID instead of the id.
This can be achieved without modifying web2py. Here is what to do:
Change the above model into:
``
import uuid
db.define_table('person',
    Field('uuid', length=64, default=lambda:str(uuid.uuid4())),
    Field('modified_on', 'datetime', default=now),
    Field('name'),
    format='%(name)s')
db.define_table('thing',
    Field('uuid', length=64, default=lambda:str(uuid.uuid4())),
    Field('modified_on', 'datetime', default=now),
    Field('owner_id', length=64),
    Field('name'),
    format='%(name)s')
db.thing.owner_id.requires = IS_IN_DB(db, 'person.uuid', '%(name)s')
if not db(db.person.id).count():
    id = uuid.uuid4()
    db.person.insert(name="Massimo", uuid=id)
    db.thing.insert(owner_id=id, name="Chair")
``:code
-------
Notice that in the above table definitions, the default value for the two ``uuid`` fields is set to a lambda function, which returns a UUID (converted to a string). The lambda function is called once for each record inserted, ensuring that each record gets a unique UUID, even if multiple records are inserted in a single transaction.
-------
Create a controller action to export the database:
``
import StringIO
def export():
    s = StringIO.StringIO()
    db.export_to_csv_file(s)
    response.headers['Content-Type'] = 'text/csv'
    return s.getvalue()
``:code
Create a controller action to import a saved copy of the other database and sync records:
``
def import_and_sync():
    form = FORM(INPUT(_type='file', _name='data'), INPUT(_type='submit'))
    if form.process().accepted:
        db.import_from_csv_file(form.vars.data.file, unique=False)
        # for every table
        for table in db.tables:
            # for every uuid, delete all but the latest
            items = db(db[table]).select(db[table].id,
                       db[table].uuid,
                       orderby=db[table].modified_on,
                       groupby=db[table].uuid)
            for item in items:
                db((db[table].uuid==item.uuid)&\
                   (db[table].id!=item.id)).delete()
    return dict(form=form)
``:code
Alternatively, you can use XML-RPC to export/import the file.
If the records reference uploaded files, you also need to export/import the content of the uploads folder. Notice that files therein are already labeled by UUIDs so you do not need to worry about naming conflicts and references.
#### HTML and XML (one Table at a time)
``Rows objects``:inxx
Rows objects also have an ``xml`` method (like helpers) that serializes it to XML/HTML:
``HTML``:inxx
``
>>> rows = db(db.person.id > 0).select()
>>> print rows.xml()
<table>
<thead>
<tr>
<th>person.id</th>
<th>person.name</th>
<th>thing.id</th>
<th>thing.name</th>
<th>thing.owner_id</th>
</tr>
</thead>
<tbody>
<tr class="even">
<td>1</td>
<td>Alex</td>
<td>1</td>
<td>Boat</td>
<td>1</td>
</tr>
...
</tbody>
</table>
``:code

web2py comes with a Database Abstraction Layer (DAL), an API that maps Python objects into database objects such as queries, tables, and records. The DAL dynamically generates the SQL in real time using the specified dialect for the database back end, so that you do not have to write SQL code or learn different SQL dialects (the term SQL is used generically), and the application will be portable among different types of databases. A partial list of supported databases is shown in the table below. Please check on the web2py web site and mailing list for more recent adapters. Google NoSQL is treated as a particular case in Chapter 13.
The Windows binary distribution works out of the box with SQLite and MySQL. The Mac binary distribution works out of the box with SQLite.
To use any other database back-end, run from the source distribution and install the appropriate driver for the required back end.
``database drivers``:inxx
Once the proper driver is installed, start web2py from source, and it will find the driver. Here is a list of drivers:
``DAL``:inxx ``SQLite``:inxx ``MySQL``:inxx ``PostgreSQL``:inxx ``Oracle``:inxx ``MSSQL``:inxx ``FireBird``:inxx ``DB2``:inxx ``Informix``:inxx ``Sybase``:inxx ``Teradata``:inxx ``MongoDB``:inxx ``CouchDB``:inxx ``SAPDB``:inxx ``Cubrid``:inxx
----------
database | drivers (source)
SQLite | sqlite3 or pysqlite2 or zxJDBC ``zxjdbc``:cite (on Jython)
PostgreSQL | psycopg2 ``psycopg2``:cite or pg8000 ``pg8000``:cite or zxJDBC ``zxjdbc``:cite (on Jython)
MySQL | pymysql ``pymysql``:cite or MySQLdb ``mysqldb``:cite
Oracle | cx_Oracle ``cxoracle``:cite
MSSQL | pyodbc ``pyodbc``:cite
FireBird | kinterbasdb ``kinterbasdb``:cite or fdb or pyodbc
DB2 | pyodbc ``pyodbc``:cite
Informix | informixdb ``informixdb``:cite
Ingres | ingresdbi ``ingresdbi``:cite
Cubrid | cubriddb ``cubridb``:cite
Sybase | Sybase ``Sybase``:cite
Teradata | pyodbc ``Teradata``:cite
SAPDB | sapdb ``SAPDB``:cite
MongoDB | pymongo ``pymongo``:cite
IMAP | imaplib ``IMAP``:cite
---------
``sqlite3``, ``pymysql``, ``pg8000``, and ``imaplib`` ship with web2py. Support of MongoDB is experimental. The IMAP option allows one to use the DAL to access IMAP.
Sometimes you may need to generate SQL as if you had a connection but without actually connecting to the database. This can be done with
``
db = DAL('...', do_connect=False)
``:code
In this case you will be able to call ``_select``, ``_insert``, ``_update``, and ``_delete`` to generate SQL but not call ``select``, ``insert``, ``update``, and ``delete``. In most of the cases you can use ``do_connect=False`` even without having the required database drivers.
Notice that by default web2py uses utf8 character encoding for databases. If you work with existing databases that behave differently, you have to change it with the optional parameter ``db_codec`` like
``
db = DAL('...', db_codec='latin1')
``:code
otherwise you'll get UnicodeDecodeError tickets.
#### Connection pooling
``connection pooling``:inxx
The second argument of the DAL constructor is the ``pool_size``; it defaults to zero.
As it is rather slow to establish a new database connection for each request, web2py implements a mechanism for connection pooling. Once a connection is established and the page has been served and the transaction completed, the connection is not closed but goes into a pool. When the next http request arrives, web2py tries to recycle a connection from the pool and use that for the new transaction. If there are no available connections in the pool, a new connection is established.
When web2py starts, the pool is always empty. The pool grows up to the minimum between the value of ``pool_size`` and the max number of concurrent requests. This means that if ``pool_size=10`` but our server never receives more than 5 concurrent requests, then the actual pool size will only grow to 5. If ``pool_size=0`` then connection pooling is not used.
Connections in the pools are shared sequentially among threads, in the sense that they may be used by two different but not simultaneous threads. There is only one pool for each web2py process.
The ``pool_size`` parameter is ignored by SQLite and Google App Engine.
Connection pooling is ignored for SQLite, since it would not yield any benefit.
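For example, a minimal sketch enabling a pool of up to 10 connections (the connection string is illustrative):
``
db = DAL('postgres://user:password@localhost/mydb', pool_size=10)
``:code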
#### Connection failures
If web2py fails to connect to the database, it waits 1 second and tries again, up to 5 times, before declaring a failure. In case of connection pooling it is possible that a pooled connection that stays open but unused for some time is closed by the database end. Thanks to the retry feature web2py tries to re-establish these dropped connections.
#### Replicated databases
The first argument of ``DAL(...)`` can be a list of URIs. In this case web2py tries to connect to each of them. The main purpose for this is to deal with multiple database servers and distribute the workload among them. Here is a typical use case:
``
db = DAL(['mysql://...1','mysql://...2','mysql://...3'])
``:code
In this case the DAL tries to connect to the first and, on failure, it
will try the second and the third. This can also be used to distribute load
in a database master-slave configuration. We will talk more about this
in Chapter 13 in the context of scalability.
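If you want connection attempts spread across the servers rather than always starting from the first URI, a small sketch (the shuffling is an assumption, not a web2py API):
``
import random
uris = ['mysql://...1', 'mysql://...2', 'mysql://...3']
random.shuffle(uris)  # assumption: randomize which server each process tries first
db = DAL(uris)
``:code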
### Reserved keywords
``reserved Keywords``:inxx
``check_reserved`` is yet another argument that can be passed to the DAL constructor. It tells the DAL to check table names and column names against reserved SQL keywords in target back-end databases. It defaults to ``None``; otherwise it is a list of strings containing database back-end adapter names.
The adapter name is the same as used in the DAL connection string. So if you want to check against PostgreSQL and MSSQL then your connection string would look as follows:
``
db = DAL('sqlite://storage.db',
check_reserved=['postgres', 'mssql'])
``:code
The DAL will scan the keywords in the same order as the list.
There are two extra options "all" and "common". If you specify all, it will check against all known SQL keywords. If you specify common, it will only check against common SQL keywords such as ``SELECT``, ``INSERT``, ``UPDATE``, etc.
For supported back-ends you may also specify if you would like to check against the non-reserved SQL keywords as well. In this case you would append ``_nonreserved`` to the name. For example:
``
check_reserved=['postgres', 'postgres_nonreserved']
``:code
The following database backends support reserved words checking.
-----
**PostgreSQL** | ``postgres(_nonreserved)``
**MySQL** | ``mysql``
**FireBird** | ``firebird(_nonreserved)``
**MSSQL** | ``mssql``
**Oracle** | ``oracle``
-----
### ``DAL``, ``Table``, ``Field``
You can experiment with the DAL API using the web2py shell.
Start by creating a connection. For the sake of example, you can use SQLite. Nothing in this discussion changes when you change the back-end engine.
``
>>> db = DAL('sqlite://storage.db')
``:code
Each field type has a default validator and a number of optional attributes. Not all of them are relevant for every field: "length" is relevant only for fields of type "string", "password" and "upload". The table below lists some field types and their default validators:
---------
field type | default field validators
``decimal(n,m)`` | ``IS_DECIMAL_IN_RANGE(-1e100, 1e100)``
``date`` | ``IS_DATE()``
``time`` | ``IS_TIME()``
``datetime`` | ``IS_DATETIME()``
``password`` | ``None``
``upload`` | ``None``
``reference <table>`` | ``IS_IN_DB(db,table.field,format)``
``list:string`` | ``None``
``list:integer`` | ``None``
``list:reference <table>`` | ``IS_IN_DB(db,table.field,format,multiple=True)``
``json`` | ``IS_JSON()``
``bigint`` | ``None``
``big-id`` | ``None``
``big-reference`` | ``None``
---------
Decimal requires and returns values as ``Decimal`` objects, as defined in the Python ``decimal`` module. SQLite does not handle the ``decimal`` type so internally we treat it as a ``double``. The (n,m) are the number of digits in total and the number of digits after the decimal point respectively.
The ``big-id`` and ``big-reference`` are only supported by some of the database engines and are experimental. They are not normally used as field types except for legacy tables; however, the DAL constructor has a ``bigint_id`` argument that, when set to ``True``, makes the ``id`` fields and ``reference`` fields ``big-id`` and ``big-reference`` respectively.
The ``list:<type>`` fields are special because they are designed to take advantage of certain denormalization features on NoSQL (in the case of Google App Engine NoSQL, the field types ``ListProperty`` and ``StringListProperty``) and back-port them to all the other supported relational databases. On relational databases lists are stored as a ``text`` field. The items are separated by a ``|`` and each ``|`` in a string item is escaped as ``||``. They are discussed in their own section.
The ``json`` field type is pretty much self-explanatory. It can store any JSON-serializable object. It is designed to work specifically with MongoDB and is backported to the other database adapters for portability.
-------
Notice that ``requires=...`` is enforced at the level of forms, ``required=True`` is enforced at the level of the DAL (insert), while ``notnull``, ``unique`` and ``ondelete`` are enforced at the level of the database. While they sometimes may seem redundant, it is important to maintain the distinction when programming with the DAL.
-------
``ondelete``:inxx
- ``ondelete`` translates into the "ON DELETE" SQL statement. By default it is set to "CASCADE". This tells the database that when it deletes a record, it should also delete all records that refer to it. To disable this feature, set ``ondelete`` to "NO ACTION" or "SET NULL".
- ``notnull=True`` translates into the "NOT NULL" SQL statement. It prevents the database from inserting null values for the field.
- ``unique=True`` translates into the "UNIQUE" SQL statement and it makes sure that values of this field are unique within the table. It is enforced at the database level.
- ``uploadfield`` applies only to fields of type "upload". A field of type "upload" stores the name of a file saved somewhere else, by default on the filesystem under the application "uploads/" folder. If ``uploadfield`` is set, then the file is stored in a blob field within the same table and the value of ``uploadfield`` is the name of the blob field. This will be discussed in more detail later in the context of SQLFORM.
- ``uploadfolder`` defaults to the application's "uploads/" folder. If set to a different path, files will be uploaded to that folder. For example,
``
Field(..., uploadfolder=os.path.join(request.folder, 'static/temp'))
``:code
will upload files to the "web2py/applications/myapp/static/temp" folder.
- ``uploadseparate`` if set to True will upload files under different subfolders of the ''uploadfolder'' folder. This is optimized to avoid too many files under the same folder/subfolder. ATTENTION: You cannot change the value of ``uploadseparate`` from True to False without breaking links to existing uploads. web2py either uses the separate subfolders or it does not. Changing the behavior after files have been uploaded will prevent web2py from being able to retrieve those files. If this happens it is possible to move files and fix the problem, but this is not described here.
- ``uploadfs`` allows you to specify a different file system where to upload files, including Amazon S3 storage or remote SFTP storage. This option requires PyFileSystem to be installed, and ``uploadfs`` must point to a ``PyFileSystem`` object. ``PyFileSystem``:inxx ``uploadfs``:inxx
- ``widget`` must be one of the available widget objects, including custom widgets, for example: ``SQLFORM.widgets.string.widget``. A list of available widgets will be discussed later. Each field type has a default widget.
- ``label`` is a string (or a helper or something that can be serialized to a string) that contains the label to be used for this field in auto-generated forms.
- ``comment`` is a string (or a helper or something that can be serialized to a string) that contains a comment associated with this field, and will be displayed to the right of the input field in the autogenerated forms.
- ``writable`` declares whether a field is writable in forms.
- ``readable`` declares whether a field is readable in forms. If a field is neither readable nor writable, it will not be displayed in create and update forms.
- ``update`` contains the default value for this field when the record is updated.
- ``compute`` is an optional function. If a record is inserted or updated, the compute function will be executed and the field will be populated with the function result. The record is passed to the compute function as a ``dict``, and the dict will not include the current value of that, or any other compute field (see the sketch after this list).
- ``authorize`` can be used to require access control on the corresponding field, for "upload" fields only. It will be discussed more in detail in the context of Authentication and Authorization.
- ``autodelete`` determines if the corresponding uploaded file should be deleted when the record referencing the file is deleted. For "upload" fields only.
- ``represent`` can be None or can point to a function that takes a field value and returns an alternate representation for the field value. Examples:
``
db.mytable.name.represent = lambda name, row: name.capitalize()
db.mytable.other_id.represent = lambda id, row: row.myfield
db.mytable.some_uploadfield.represent = lambda value, row: \
    A('get it', _href=URL('download', args=value))
``:code
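Here is the sketch promised above for ``compute``, using a hypothetical ``item`` table:
``
db.define_table('item',
    Field('unit_price', 'double'),
    Field('quantity', 'integer'),
    Field('total_price',
          compute=lambda r: r['unit_price'] * r['quantity']))
``:code
On insert or update, ``total_price`` is filled in automatically from the other two fields.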
``blob``:inxx
"blob" fields are also special. By default, binary data is encoded in base64 before being stored into the actual database field, and it is decoded when extracted. This has the negative effect of using 25% more storage space than necessary in blob fields, but has two advantages. On average it reduces the amount of data communicated between web2py and the database server, and it makes the communication independent of back-end-specific escaping conventions.
Most attributes of fields and tables can be modified after they are defined:
``
db.define_table('person',Field('name',default=''),format='%(name)s')
db.person._format = '%(name)s/%(id)s'
``:code
``migrations``:inxx
### Migrations
``define_table`` checks whether or not the corresponding table exists. If it does not, it generates the SQL to create it and executes it. If the table does exist but differs from the one being defined, it generates the SQL to alter the table and executes it. If a field has changed type but not name, it will try to convert the data. We refer to this behavior as a "migration". web2py logs all migrations and migration attempts in the file "sql.log".
The first argument of ``define_table`` is always the table name. The other unnamed arguments are the fields (Field). The function also takes an optional last argument called "migrate" which must be referred to explicitly by name as in:
``
>>> db.define_table('person', Field('name'), migrate='person.table')
``:code
The value of migrate is the filename (in the "databases" folder for the application) where web2py stores internal migration information for this table.
These files are very important and should never be removed while the corresponding tables exist. In cases where a table has been dropped and the corresponding file still exist, it can be removed manually. By default, migrate is set to True. This causes web2py to generate the filename from a hash of the connection string. If migrate is set to False, the migration is not performed, and web2py assumes that the table exists in the datastore and it contains (at least) the fields listed in ``define_table``.
The best practice is to give an explicit name to the migrate file.
There may not be two tables in the same application with the same migrate filename.
The DAL class also takes a "migrate" argument, which determines the default value of migrate for calls to ``define_table``. For example,
``
>>> db = DAL('sqlite://storage.db', migrate=False)
``:code
will set the default value of migrate to False whenever ``db.define_table`` is called without a migrate argument.
------
Notice that web2py only migrates new columns, removed columns, and changes in column type (except in sqlite). web2py does not migrate changes in attributes such as changes in the values of ``default``, ``unique``, ``notnull``, and ``ondelete``.
------
Migrations can be disabled for all tables at once:
``
db = DAL(...,migrate_enabled=False)
``
This is the recommended behavior when two apps share the same database. Only one of the two apps should perform migrations; the other should disable them.
### Fixing broken migrations
``fake_migrate``:inxx
There are two common problems with migrations and there are ways to recover from them.
One problem is specific with SQLite. SQLite does not enforce column types and cannot drop columns. This means that if you have a column of type string and you remove it, it is not really removed. If you add the column again with a different type (for example datetime) you end up with a datetime column that contains strings (junk for practical purposes). web2py does not complain about this because it does not know what is in the database, until it tries to retrieve records and fails.
If web2py returns an error in the gluon.sql.parse function when selecting records, this is the problem: corrupted data in a column because of the above issue.
The solution consists in updating all records of the table and updating the values in the column in question with None.
The other problem is more generic but typical of MySQL. MySQL does not allow more than one ALTER TABLE in a transaction. This means that web2py must break complex transactions into smaller ones (one ALTER TABLE at a time) and commit one piece at a time. It is therefore possible that part of a complex transaction gets committed and another part fails, leaving web2py in a corrupted state. Why would part of a transaction fail? Because, for example, it involves altering a table and converting a string column into a datetime column: web2py tries to convert the data, but the data cannot be converted. What happens to web2py? It gets confused about what exactly is the table structure actually stored in the database.
The solution consists of disabling migrations for all tables and enabling fake migrations:
``
db.define_table(....,migrate=False,fake_migrate=True)
``:code
This will rebuild web2py metadata about the table according to the table definition. Try multiple table definitions to see which one works (the one before the failed migration and the one after the failed migration). Once successful remove the ``fake_migrate=True`` attribute.
Before attempting to fix migration problems it is prudent to make a copy of the "applications/yourapp/databases/*.table" files.
Migration problems can also be fixed for all tables at once:
``
db = DAL(...,fake_migrate_all=True)
``:code
This will also fail if the model describes tables that do not exist in the database, but it can help narrow down the problem.
### ``insert``
Given a table, you can insert records
``insert``:inxx
``
>>> db.person.insert(name="Alex")
1
>>> db.person.insert(name="Bob")
2
``:code
Insert returns the unique "id" value of each record inserted.
You can truncate the table, i.e., delete all records and reset the counter of the id.
``truncate``:inxx
``
>>> db.person.truncate()
``:code
Finally, you can drop tables and all data will be lost:
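``drop``:inxx
``
>>> db.person.drop()
``:code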
Currently the DAL API does not provide a command to create indexes on tables, but this can be done using the ``executesql`` command. This is because the existence of indexes can make migrations complex, and it is better to deal with them explicitly. Indexes may be needed for those fields that are used in recurrent queries.
Here is an example of how to [[create an index using SQL in SQLite http://www.sqlite.org/lang_createindex.html]]:
``
>>> db = DAL('sqlite://storage.db')
>>> db.define_table('person', Field('name'))
>>> db.executesql('CREATE INDEX IF NOT EXISTS myidx ON person (name);')
``:code
Other database dialects have very similar syntaxes but may not support the optional "IF NOT EXISTS" directive.
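Dropping an index is similar (again via raw, dialect-specific SQL):
``
>>> db.executesql('DROP INDEX myidx;')
``:code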
### Legacy databases and keyed tables
web2py can connect to legacy databases under some conditions.
The easiest way is when these conditions are met:
- Each table must have a unique auto-increment integer field called "id"
- Records must be referenced exclusively using the "id" field.
When accessing an existing table, i.e., a table not created by web2py in the current application, always set ``migrate=False``.
If the legacy table has an auto-increment integer field but it is not called "id", web2py can still access it, but then the table definition must declare it explicitly, as in ``Field('....','id')``, where .... is the name of the auto-increment integer field.
``keyed table``:inxx
Finally if the legacy table uses a primary key that is not an auto-increment id field it is possible to use a "keyed table", for example:
``
db.define_table('account',
Field('accnum','integer'),
Field('acctype'),
Field('accdesc'),
primarykey=['accnum','acctype'],
migrate=False)
``:code
- ``primarykey`` is a list of the field names that make up the primary key.
- All primarykey fields have a ``NOT NULL`` set even if not specified.
- Keyed tables can only reference other keyed tables.
- Referencing fields must use the ``reference tablename.fieldname`` format, as in the sketch after this list.
- The ``update_record`` function is not available for Rows of keyed tables.
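For example, a hypothetical keyed table referencing ``account`` might look as follows (the ``deposit`` table and its fields are illustrative):
``
db.define_table('deposit',
    Field('depnum', 'integer'),
    Field('accnum', 'reference account.accnum'),
    Field('amount', 'double'),
    primarykey=['depnum'],
    migrate=False)
``:code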
-------
Currently keyed tables are only supported for DB2, MS-SQL, Ingres and Informix, but other engines will be added.
-------
At the time of writing, we cannot guarantee that the ``primarykey`` attribute works with every existing legacy table and every supported database backend.
For simplicity, we recommend, if possible, creating a database view that has an auto-increment id field.
### Distributed transaction
``distributed transactions``:inxx
------
At the time of writing this feature is only supported
by PostgreSQL, MySQL and Firebird, since they expose API for two-phase commits.
------
Assuming you have two (or more) connections to distinct PostgreSQL databases, for example:
``
db_a = DAL('postgres://...')
db_b = DAL('postgres://...')
``:code
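In your models or controllers you can then commit them both at the same time:
``
DAL.distributed_transaction_commit(db_a, db_b)
``:code
On failure of either connection, this function rolls back both and raises an Exception.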
You can have the fetched records appear in random order:
``
>>> for row in db().select(
db.person.ALL, orderby='<random>'):
print row.name
Carl
Alex
Bob
``:code
-----
The use of ``orderby='<random>'`` is not supported on Google NoSQL. However, in this situation and likewise in many others where built-ins are insufficient, imports can be used:
``
import random
rows=db(...).select().sort(lambda row: random.random())
``:code
-----
You can sort the records according to multiple fields by concatenating them with a "|":
``
>>> for row in db().select(
db.person.ALL, orderby=db.person.name|db.person.id):
print row.name
Carl
Bob
Alex
``:code
Using ``groupby`` together with ``orderby``, you can group records with the same value for the specified field (this is back-end specific, and is not available on Google NoSQL):
``
>>> for row in db().select(
db.person.ALL,
orderby=db.person.name, groupby=db.person.name):
print row.name
Alex
Bob
Carl
``:code
web2py also allows updating a single record that is already in memory using ``update_record``. This should not be confused with ``update``, because for a single row the method ``update`` changes the row object but not the database record, whereas ``update_record`` changes both.
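For example (assuming a record with id 2 exists in ``person``):
``
>>> row = db(db.person.id==2).select().first()
>>> row.update_record(name='Curt')
``:code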
It is also possible to change the attributes of a row (one at a time) and then call ``update_record()`` without arguments to save the changes:
``
>>> row = db(db.person.id > 2).select().first()
>>> row.name = 'Curt'
>>> row.update_record() # saves above change
``:code
The ``update_record`` method is available only if the table's ``id`` field is included in the select, and ``cacheable`` is not set to ``True``.
#### Inserting and updating from a dictionary
A common issue consists of needing to insert or update records in a table where the name of the table, the field to be updated, and the value for the field are all stored in variables. For example: ``tablename``, ``fieldname``, and ``value``.
The insert can be done using the following syntax:
``
db[tablename].insert(**{fieldname:value})
``:code
The update of record with given id can be done with: ``_id``:inxx
``
db(db[tablename]._id==id).update(**{fieldname:value})
``:code
Notice we used ``table._id`` instead of ``table.id``. In this way the query works even for tables with a field of type "id" which has a name other than "id".
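Putting the two together, a minimal sketch (the table, field, and values are illustrative):
``
tablename, fieldname, value = 'person', 'name', 'Max'
new_id = db[tablename].insert(**{fieldname: value})
db(db[tablename]._id == new_id).update(**{fieldname: 'Maxim'})
``:code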
#### ``first`` and ``last``
``first``:inxx ``last``:inxx
Given a Rows object containing records:
``
>>> rows = db(query).select()
>>> first_row = rows.first()
>>> last_row = rows.last()
``:code
Rows objects can be combined at the Python level. Assume ``rows1`` contains the names "Max" and "Tim" and ``rows2`` contains "John" and "Tim". You can combine the records of the two sets (keeping duplicates):
``
>>> rows3 = rows1 & rows2
>>> print rows3
name
Max
Tim
John
Tim
``:code
You can do a union of the records removing duplicates:
``
>>> rows3 = rows1 | rows2
>>> print rows3
name
Max
Tim
John
``:code
#### ``find``, ``exclude``, ``sort``
``find``:inxx ``exclude``:inxx ``sort``:inxx
Sometimes you need to perform two selects, and one contains a subset of the previous one. In this case it is pointless to access the database again. The ``find``, ``exclude`` and ``sort`` methods allow you to manipulate a Rows object and generate another one without accessing the database. More specifically:
- ``find`` returns a new set of Rows filtered by a condition and leaves the original unchanged.
- ``exclude`` returns a new set of Rows filtered by a condition and removes them from the original Rows.
- ``sort`` returns a new set of Rows sorted by a condition and leaves the original unchanged.
All these methods take a single argument, a function that acts on each individual row.
Here is an example of usage:
``
>>> db.define_table('person',Field('name'))
>>> db.person.insert(name='John')
>>> db.person.insert(name='Max')
>>> db.person.insert(name='Alex')
>>> rows = db(db.person).select()
>>> for row in rows.find(lambda row: row.name[0]=='M'):
print row.name
Max
>>> print len(rows)
3
>>> for row in rows.exclude(lambda row: row.name[0]=='M'):
print row.name
Max
>>> print len(rows)
2
``:code
Notice that ``exclude`` returns the excluded rows and also removes them from the original ``rows``, which now contains two records.
The DAL also provides methods, prefixed with an underscore, that return the SQL of an operation without executing it. Here is ``_delete``: ``_delete``:inxx
``
>>> print db(db.person.name=='Alex')._delete()
DELETE FROM person WHERE person.name='Alex';
``:code
And finally, here is ``_update`` ``_update``:inxx
``
>>> print db(db.person.name=='Alex')._update()
UPDATE person SET WHERE person.name='Alex';
``:code
-----
Moreover you can always use ``db._lastsql`` to return the most recent
SQL code, whether it was executed manually using executesql or was SQL
generated by the DAL.
-----
### Exporting and importing data
``export``:inxx ``import``:inxx
#### CSV (one Table at a time)
When a Rows object is converted to a string it is automatically
serialized in CSV:
``csv``:inxx
``
>>> rows = db(db.person.id==db.thing.owner).select()
>>> print rows
person.id,person.name,thing.id,thing.name,thing.owner
1,Alex,1,Boat,1
1,Alex,2,Chair,1
2,Bob,3,Shoes,2
``:code
You can serialize a single table in CSV and store it in a file "test.csv":
``
>>> open('test.csv', 'wb').write(str(db(db.person.id).select()))
``:code
This is equivalent to
``
>>> rows = db(db.person.id).select()
>>> rows.export_to_csv_file(open('test.csv', 'wb'))
``:code
You can read the CSV file back with:
``
>>> db.person.import_from_csv_file(open('test.csv', 'r'))
``:code
When importing, web2py looks for the field names in the CSV header. In this example, it finds two columns: "person.id" and "person.name". It ignores the "person." prefix, and it ignores the "id" fields. Then all records are appended and assigned new ids. Both of these operations can be performed via the appadmin web interface.
#### CSV (all tables at once)
In web2py, you can backup/restore an entire database with two commands:
To export:
``
>>> db.export_to_csv_file(open('somefile.csv', 'wb'))
``:code
To import:
``
>>> db.import_from_csv_file(open('somefile.csv', 'rb'))
``:code
#### CSV and remote database synchronization
Consider the following model (the ``person``/``thing`` pair used earlier in the chapter):
``
db.define_table('person',
    Field('name'),
    format='%(name)s')
db.define_table('thing',
    Field('owner', 'reference person'),
    Field('name'),
    format='%(name)s')
if not db(db.person).count():
    id = db.person.insert(name="Massimo")
    db.thing.insert(owner=id, name="Chair")
``:code
Each record is identified by an ID and referenced by that ID. If you
have two copies of the database used by distinct web2py installations,
the ID is unique only within each database and not across the databases.
This is a problem when merging records from different databases.
In order to make records uniquely identifiable across databases, they must:
- have a unique id (UUID),
- have an event_time (to figure out which one is more recent if multiple copies),
- reference the UUID instead of the id.
This can be achieved without modifying web2py. Here is what to do:
Change the above model into:
``
import uuid
now = request.now  # assumed: 'now' must be defined; request.now works in a model
db.define_table('person',
    Field('uuid', length=64, default=lambda:str(uuid.uuid4())),
    Field('modified_on', 'datetime', default=now),
    Field('name'),
    format='%(name)s')
db.define_table('thing',
    Field('uuid', length=64, default=lambda:str(uuid.uuid4())),
    Field('modified_on', 'datetime', default=now),
    Field('owner', length=64),
    Field('name'),
    format='%(name)s')
db.thing.owner.requires = IS_IN_DB(db, 'person.uuid', '%(name)s')
if not db(db.person.id).count():
    id = uuid.uuid4()
    db.person.insert(name="Massimo", uuid=id)
    db.thing.insert(owner=id, name="Chair")
``:code
-------
Notice that in the above table definitions, the default value for the two ``uuid`` fields is set to a lambda function, which returns a UUID (converted to a string). The lambda function is called once for each record inserted, ensuring that each record gets a unique UUID, even if multiple records are inserted in a single transaction.
-------
Create a controller action to export the database:
``
def export():
    import StringIO  # not a web2py global; needed in the controller
    s = StringIO.StringIO()
    db.export_to_csv_file(s)
    response.headers['Content-Type'] = 'text/csv'
    return s.getvalue()
``:code
Create a controller action to import a saved copy of the other database and sync records:
``
def import_and_sync():
    form = FORM(INPUT(_type='file', _name='data'), INPUT(_type='submit'))
    if form.process().accepted:
        db.import_from_csv_file(form.vars.data.file, unique=False)
        # for every table
        for table in db.tables:
            # for every uuid, delete all but the latest
            items = db(db[table]).select(db[table].id,
                                         db[table].uuid,
                                         orderby=db[table].modified_on,
                                         groupby=db[table].uuid)
            for item in items:
                db((db[table].uuid==item.uuid)&\
                   (db[table].id!=item.id)).delete()
    return dict(form=form)
``:code
Optionally you should create an index manually to make the search by uuid faster.
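For example, on SQLite, reusing the ``executesql`` approach shown earlier (the index name is illustrative):
``
db.executesql('CREATE INDEX IF NOT EXISTS person_uuid_idx ON person (uuid);')
``:code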
``XML-RPC``:inxx
Alternatively, you can use XML-RPC to export/import the file.
If the records reference uploaded files, you also need to export/import the content of the uploads folder. Notice that files therein are already labeled by UUIDs so you do not need to worry about naming conflicts and references.
#### HTML and XML (one Table at a time)
``Rows objects``:inxx
Rows objects also have an ``xml`` method (like helpers) that serializes them to XML/HTML:
``HTML``:inxx
``
>>> rows = db(db.person.id > 0).select()
>>> print rows.xml()
<table>
<thead>
<tr>
<th>person.id</th>
<th>person.name</th>
<th>thing.id</th>
<th>thing.name</th>
<th>thing.owner</th>
</tr>
</thead>
<tbody>
<tr class="even">
<td>1</td>
<td>Alex</td>
<td>1</td>
<td>Boat</td>
<td>1</td>
</tr>
...
</tbody>
</table>
``:code
``Rows custom tags``:inxx
If you need to serialize the Rows in any other XML format with custom tags, you can easily do that using the universal TAG helper and the * notation:
``XML``:inxx
``
>>> rows = db(db.person.id > 0).select()
>>> print TAG.result(*[TAG.row(*[TAG.field(r[f], _name=f) \
for f in db.person.fields]) for r in rows])
<result>
<row>
<field name="id">1</field>
<field name="name">Alex</field>
</row>
...
</result>
``:code
#### Data representation
``export_to_csv_file``:inxx
The ``export_to_csv_file`` function accepts a keyword argument named ``represent``. When ``True`` it will use each column's ``represent`` function while exporting the data, instead of the raw data.
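For example:
``
rows = db(db.person.id > 0).select()
rows.export_to_csv_file(open('test.csv', 'wb'), represent=True)
``:code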
#### Caching selects
The select method also takes a ``cache`` argument, which defaults to None. For caching purposes, set it to a tuple where the first element is a cache model (``cache.ram``, ``cache.disk``, etc.) and the second element is the expiration time in seconds.
In the following example, you see a controller that caches a select on the previously defined db.log table. The actual select fetches data from the back-end database no more frequently than once every 60 seconds and stores the result in cache.ram. If the next call to this controller occurs in less than 60 seconds since the last database IO, it simply fetches the previous data from cache.ram.
``cache select``:inxx
``
def cache_db_select():
logs = db().select(db.log.ALL, cache=(cache.ram, 60))
return dict(logs=logs)
``:code
``cacheable``:inxx
The ``select`` method has an optional ``cacheable`` argument, normally set to ``False``. When ``cacheable=True`` the resulting ``Rows`` object is serializable, though the ``Row`` objects lack ``update_record`` and ``delete_record`` methods. The results of a select are otherwise complex, un-pickleable objects that cannot be stored in a session or cached in other ways.
If you do not need those methods, you can speed up selects considerably by setting the cacheable attribute:
``
rows = db(query).select(cacheable=True)
``:code
When the ``cache`` argument is set but ``cacheable=False`` (default) only the database results are cached, not the actual Rows object. When the ``cache`` argument is used in conjunction with ``cacheable=True`` the entire Rows object is cached and this results in much faster caching:
``
rows = db(query).select(cache=(cache.ram,3600),cacheable=True)
``:code
### Self-Reference and aliases
``self reference``:inxx
``alias``:inxx
It is possible to define tables with fields that refer to themselves, here is an example:
``reference table``:inxx
``
db.define_table('person',
    Field('name'),
    Field('father_id', 'reference person'),
    Field('mother_id', 'reference person'))
``:code
Notice that the alternative notation of using a table object as field type will fail in this case, because it uses a variable ``db.person`` before it is defined:
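``
db.define_table('person',
    Field('name'),
    Field('father_id', db.person),  # wrong! db.person is not yet defined
    Field('mother_id', db.person))  # wrong!
``:code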
Here ``f`` is a dict of fields passed to insert or update, ``id`` is the id of the newly inserted record, and ``s`` is the Set object used for the update or delete. With both a before and an after callback registered (each printing its arguments), an update and a delete each produce two lines of output:
``
>>> db(db.person.id==1).update(name='Tim')
(<Set (person.id = 1)>, {'name': 'Tim'})
(<Set (person.id = 1)>, {'name': 'Tim'})
>>> db(db.person.id==1).delete()
(<Set (person.id = 1)>,)
(<Set (person.id = 1)>,)
``:code
The return values of these callbacks should be ``None`` or ``False``. If a ``_before_*`` callback returns ``True``, it aborts the actual insert/update/delete operation.
``update_naive``:inxx
Sometimes a callback may need to perform an update on the same or a different table, and one wants to avoid callbacks calling themselves recursively.
For this purpose, Set objects have an ``update_naive`` method that works like ``update`` but ignores the before and after callbacks.
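For example, a hypothetical sketch (the callback and the ``modified_on`` field are illustrative): stamping a record on every update without re-triggering the callback itself:
``
import datetime
def stamp(the_set, fields):
    # update_naive skips before/after callbacks, so this cannot recurse
    the_set.update_naive(modified_on=datetime.datetime.now())
db.person._after_update.append(stamp)
``:code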
#### Record versioning
``_enable_record_versioning``:inxx
It is possible to ask web2py to save every copy of a record when the record is individually modified. There are different ways to do it and it can be done for all tables at once using the syntax:
``
auth.enable_record_versioning(db)
``:code
this requires Auth and it is discussed in the chapter about authentication.
It can also be done for each individual table as discussed below.
Consider the following table:
``
db.define_table('stored_item',
Field('name'),
Field('quantity','integer'),
Field('is_active','boolean',
writable=False,readable=False,default=True))
``:code
Notice the hidden boolean field called ``is_active`` and defaulting to
True.
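Record versioning for the table is then turned on with ``_enable_record_versioning``. The call below is a sketch; the keyword arguments shown are the conventional names (the archive table defaults to ``<tablename>_archive``):
``
db.stored_item._enable_record_versioning(
    archive_db = db,
    archive_name = 'stored_item_archive',
    current_record = 'current_record',
    is_active = 'is_active')
``:code
With versioning enabled, records are not really deleted: a delete marks ``is_active=False``, and previous versions are stored in the archive table.
#### Common filters
A common filter automatically restricts every query on a table. The discussion below assumes a hypothetical ``blog_post`` table along these lines (a sketch):
``
db.define_table('blog_post',
    Field('subject'),
    Field('post_text', 'text'),
    Field('is_public', 'boolean', default=True))
``:code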
Setting the table's ``_common_filter``:
``
db.blog_post._common_filter = lambda query: db.blog_post.is_public == True
``
ensures that any select, delete or update on this table will include only public blog posts. It serves both as a way to avoid repeating the "db.blog_post.is_public==True" phrase in each query, and as a security enhancement that prevents you from forgetting to disallow viewing of non-public posts.
In case you actually do want items left out by the common filter (for example, allowing the admin to see non-public posts), you can either remove the filter:
``
db.blog_post._common_filter = None
``
or ignore it:
``
db(query, ignore_common_filters=True).select(...)
``
#### Custom ``Field`` types (experimental)
``SQLCustomType``:inxx
Aside from using ``filter_in`` and ``filter_out``, it is possible to define new/custom field types.
For example, consider here a field that contains binary data in compressed form:
``
from gluon.dal import SQLCustomType
import zlib
compressed = SQLCustomType(
    type='text',
    native='text',
    encoder=(lambda x: zlib.compress(x or '')),
    decoder=(lambda x: zlib.decompress(x)))
db.define_table('example', Field('data',type=compressed))
``:code
``SQLCustomType`` is a field type factory. Its ``type`` argument must be one of the standard web2py types. It tells web2py how to treat the field values at the web2py level. ``native`` is the name of the field as far as the database is concerned. Allowed names depend on the database engine. ``encoder`` is an optional transformation function applied when the data is stored and ``decoder`` is the optional reversed transformation function.
This feature is marked as experimental. In practice it has been in web2py for a long time and it works but it can make the code not portable, for example when the native type is database specific. It does not work on Google App Engine NoSQL.
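A quick usage sketch (the value is illustrative; compression and decompression happen transparently through the encoder and decoder):
``
rid = db.example.insert(data='hello, hello, hello')
row = db(db.example.id == rid).select().first()
print row.data  # 'hello, hello, hello', decompressed on retrieval
``:code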
#### Using the DAL without defining tables
To access the data and its attributes, we still have to define all the tables we are going to access with ``db.define_table``.
If we just need access to the data but not to the web2py table attributes, we can get away without redefining the tables by simply asking web2py to read the necessary info from the metadata in the .table files:
``
from gluon import DAL, Field
db = DAL('sqlite://storage.sqlite', folder='path/to/app/databases',
         auto_import=True)
``:code
This allows us to access any ``db.table`` without need to re-define it.
#### PostGIS, SpatiaLite, and MS Geo (experimental)
``PostGIS``:inxx ``SpatiaLite``:inxx ``Geo Extensions``:inxx
``geometry``:inxx ``geoPoint``:inxx ``geoLine``:inxx ``geoPolygon``:inxx
The DAL supports geographical APIs using PostGIS (for PostgreSQL), SpatiaLite (for SQLite), and the MSSQL Spatial Extensions. This is a feature that was sponsored by the Sahana project and implemented by Denes Lengyel.
The DAL provides geometry and geography field types and the following functions:
``st_asgeojson``:inxx ``st_astext``:inxx ``st_contains``:inxx
``st_distance``:inxx ``st_equals``:inxx ``st_intersects``:inxx ``st_overlaps``:inxx
``st_simplify``:inxx ``st_touches``:inxx ``st_within``:inxx
``
st_asgeojson (PostGIS only)
st_astext
st_contains
st_distance
st_equals
st_intersects
st_overlaps
st_simplify (PostGIS only)
st_touches
st_within
st_x
st_y
``
Here are some examples:
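The block below is a sketch, assuming a spatially-enabled back end such as PostGIS; geometry values are expressed in WKT (well-known text) format:
``
db.define_table('spatial', Field('loc', 'geometry()'))
db.spatial.insert(loc='POINT (1 1)')
db.spatial.insert(loc='LINESTRING (100 100, 20 180, 180 180)')
db.spatial.insert(loc='POLYGON ((0 0, 150 0, 150 150, 0 150, 0 0))')
# the st_* functions listed above can be used in selects and queries
for row in db(db.spatial).select(db.spatial.loc.st_astext()):
    print row
``:code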
Internally, the DAL delegates every operation to an adapter. For example, ``db.mytable.insert(myfield='myvalue')`` calls
``
Table.insert(myfield='myvalue')
``
which delegates to the adapter by calling:
``
db._adapter.insert(db.mytable,db.mytable._listify(dict(myfield='myvalue')))
``
Here ``db.mytable._listify`` converts the dict of arguments into a list of ``(field,value)`` and calls the ``insert`` method of the ``adapter``. ``db._adapter`` does more or less the following:
``
query = db._adapter._insert(db.mytable,list_of_fields)
db._adapter.execute(query)
``
where the first line builds the query and the second executes it.
``BaseAdapter`` defines the interface for all adapters.
"gluon/dal.py" at the moment of writing this book, contains the following adapters:
``
SQLiteAdapter extends BaseAdapter
JDBCSQLiteAdapter extends SQLiteAdapter
MySQLAdapter extends BaseAdapter
PostgreSQLAdapter extends BaseAdapter
JDBCPostgreSQLAdapter extends PostgreSQLAdapter
OracleAdapter extends BaseAdapter
MSSQLAdapter extends BaseAdapter
MSSQL2Adapter extends MSSQLAdapter
FireBirdAdapter extends BaseAdapter
FireBirdEmbeddedAdapter extends FireBirdAdapter
InformixAdapter extends BaseAdapter
DB2Adapter extends BaseAdapter
IngresAdapter extends BaseAdapter
IngresUnicodeAdapter extends IngresAdapter
GoogleSQLAdapter extends MySQLAdapter
NoSQLAdapter extends BaseAdapter
Which adapter is used is determined by the connection string. As an example, here is a simplified sketch of ``MySQLAdapter`` (the constructor signature is shared by all adapters):
``
class MySQLAdapter(BaseAdapter):
    def __init__(self, db, uri, pool_size=0, folder=None,
                 db_codec='UTF-8', credential_decoder=lambda x: x,
                 driver_args={}, adapter_args={}):
        # parse uri string and store parameters in driver_args
        ...
        # define a connection function
        def connect(driver_args=driver_args):
            return self.driver.connect(**driver_args)
        # place it in the pool
        self.pool_connection(connect)
        # set optional parameters (after connection)
        self.execute('SET FOREIGN_KEY_CHECKS=1;')
        self.execute("SET sql_mode='NO_BACKSLASH_ESCAPES';")
    # override BaseAdapter methods as needed
    def lastrowid(self, table):
        self.execute('select last_insert_id();')
        return int(self.cursor.fetchone()[0])
``:code
Looking at the various adapters as examples, it should be easy to write new ones.
web2py comes with a Database Abstraction Layer (DAL), an API that maps Python objects into database objects such as queries, tables, and records. The DAL dynamically generates the SQL in real time using the specified dialect for the database back end, so that you do not have to write SQL code or learn different SQL dialects (the term SQL is used generically), and the application will be portable among different types of databases. At the time of this writing, the supported databases are SQLite (which comes with Python and thus web2py), PostgreSQL, MySQL, Oracle, MSSQL, FireBird, DB2, Informix, Ingres, MongoDB, and the Google App Engine (SQL and NoSQL). Experimentally we support more databases. Please check on the web2py web site and mailing list for more recent adapters. Google NoSQL is treated as a particular case in Chapter 13.
The Windows binary distribution works out of the box with SQLite and MySQL. The Mac binary distribution works out of the box with SQLite.
To use any other database back-end, run from the source distribution and install the appropriate driver for the required back end.
``database drivers``:inxx
Once the proper driver is installed, start web2py from source, and it will find the driver. Here is a list of drivers:
``DAL``:inxx ``SQLite``:inxx ``MySQL``:inxx ``PostgreSQL``:inxx ``Oracle``:inxx ``MSSQL``:inxx ``FireBird``:inxx ``DB2``:inxx ``Informix``:inxx ``Sybase``:inxx ``Teradata``:inxx ``MongoDB``:inxx ``CouchDB``:inxx ``SAPDB``:inxx ``Cubrid``:inxx
----------
database | drivers (source)
SQLite | sqlite3 or pysqlite2 or zxJDBC ``zxjdbc``:cite (on Jython)
PostgreSQL | psycopg2 ``psycopg2``:cite or pg8000 ``pg8000``:cite or zxJDBC ``zxjdbc``:cite (on Jython)
MySQL | pymysql ``pymysql``:cite or MySQLdb ``mysqldb``:cite
Oracle | cx_Oracle ``cxoracle``:cite
MSSQL | pyodbc ``pyodbc``:cite
FireBird | kinterbasdb ``kinterbasdb``:cite or fdb or pyodbc
DB2 | pyodbc ``pyodbc``:cite
Informix | informixdb ``informixdb``:cite
Ingres | ingresdbi ``ingresdbi``:cite
Cubrid | cubriddb ``cubridb``:cite
Sybase | Sybase ``Sybase``:cite
Teradata | pyodbc ``Teradata``:cite
SAPDB | sapdb ``SAPDB``:cite
MongoDB | pymongo ``pymongo``:cite
IMAP | imaplib ``IMAP``:cite
---------
``sqlite3``, ``pymysql``, ``pg8000``, and ``imaplib`` ship with web2py. Support for MongoDB is experimental. The IMAP option allows one to use the DAL to access IMAP mailboxes.
Sometimes you may need to generate SQL as if you had a connection, but without actually connecting to the database. This can be done with:
``
db = DAL('...', do_connect=False)
``:code
In this case you will be able to call ``_select``, ``_insert``, ``_update``, and ``_delete`` to generate SQL, but not ``select``, ``insert``, ``update``, and ``delete``. In most cases you can use ``do_connect=False`` even without having the required database drivers.
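For example, a minimal sketch (no running server or driver is needed; the connection string is illustrative):
``
db = DAL('postgres://user:pw@localhost/mydb', do_connect=False)
db.define_table('person', Field('name'), migrate=False)
print db(db.person.name == 'Alex')._select()
``:code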
Notice that by default web2py uses utf8 character encoding for databases. If you work with existing databases that behave differently, you have to change it with the optional parameter ``db_codec`` like
``
db = DAL('...', db_codec='latin1')
``:code
otherwise you'll get UnicodeDecodeError tickets.
#### Connection pooling
``connection pooling``:inxx
The second argument of the DAL constructor is the ``pool_size``; it defaults to zero.
As it is rather slow to establish a new database connection for each request, web2py implements a mechanism for connection pooling. Once a connection is established and the page has been served and the transaction completed, the connection is not closed but goes into a pool. When the next http request arrives, web2py tries to obtain a connection from the pool and use that for the new transaction. If there are no available connections in the pool, a new connection is established.
The ``pool_size`` parameter is ignored by SQLite and Google App Engine; for SQLite, pooling would not yield any benefit.
Connections in the pool are shared sequentially among threads, in the sense that they may be used by two different but not simultaneous threads. There is only one pool for each web2py process.
When web2py starts, the pool is always empty. The pool grows up to the minimum between the value of ``pool_size`` and the maximum number of concurrent requests. This means that if ``pool_size=10`` but the server never receives more than 5 concurrent requests, then the actual pool size will only grow to 5. If ``pool_size=0`` then connection pooling is not used.
#### Connection failures
If web2py fails to connect to the database, it waits 1 second and tries again, up to 5 times, before declaring a failure. With connection pooling it is possible that a pooled connection that stays open but unused for some time is closed by the database server, because of a malfunction or a timeout. Thanks to the retry feature, web2py detects these dropped connections and re-establishes them.
#### Replicated databases
The first argument of ``DAL(...)`` can be a list of URIs. In this case web2py tries to connect to each of them. The main purpose for this is to deal with multiple database servers and distribute the workload among them. Here is a typical use case:
``
db = DAL(['mysql://...1','mysql://...2','mysql://...3'])
``:code
In this case the DAL tries to connect to the first and, on failure, it
will try the second and the third. This can also be used to distribute load
in a database master-slave configuration. We will talk more about this
in Chapter 13 in the context of scalability.
### Reserved keywords
``reserved Keywords``:inxx
There is also another argument that can be passed to the DAL constructor to check table names and column names against reserved SQL keywords in target back-end databases.
This argument is ``check_reserved`` and it defaults to None.
This is a list of strings that contain the database back-end adapter names.
The adapter name is the same as used in the DAL connection string. So if you want to check against PostgreSQL and MSSQL, your DAL call would look as follows:
``
db = DAL('sqlite://storage.db',
check_reserved=['postgres', 'mssql'])
``:code
The DAL will scan the keywords in the same order as the list.
There are two extra options "all" and "common". If you specify all, it will check against all known SQL keywords. If you specify common, it will only check against common SQL keywords such as ``SELECT``, ``INSERT``, ``UPDATE``, etc.
For supported back-ends you may also specify if you would like to check against the non-reserved SQL keywords as well. In this case you would append ``_nonreserved`` to the name. For example:
``
check_reserved=['postgres', 'postgres_nonreserved']
``:code
The following database backends support reserved words checking.
-----
**PostgreSQL** | ``postgres(_nonreserved)``
**MySQL** | ``mysql``
**FireBird** | ``firebird(_nonreserved)``
**MSSQL** | ``mssql``
**Oracle** | ``oracle``
-----
### ``DAL``, ``Table``, ``Field``
The best way to understand the DAL API is to try each function yourself. This can be done interactively via the web2py shell, although ultimately, DAL code goes in the models and controllers.
Start by creating a connection. For the sake of example, you can use SQLite. Nothing in this discussion changes when you change the back-end engine.
``
>>> db = DAL('sqlite://storage.db')
``:code
Not all attributes are relevant for every field. "length" is relevant only for fields of type "string", "password", and "upload". Each field type also has a default validator, applied to forms:
---------
field type | default field validators
``decimal(n,m)`` | ``IS_DECIMAL_IN_RANGE(-1e100, 1e100)``
``date`` | ``IS_DATE()``
``time`` | ``IS_TIME()``
``datetime`` | ``IS_DATETIME()``
``password`` | ``None``
``upload`` | ``None``
``reference <table>`` | ``IS_IN_DB(db,table.field,format)``
``list:string`` | ``None``
``list:integer`` | ``None``
``list:reference <table>`` | ``IS_IN_DB(db,table.field,format,multiple=True)``
``json`` | ``IS_JSON()``
``bigint`` | ``None``
``big-id`` | ``None``
``big-reference`` | ``None``
---------
Decimal requires and returns values as ``Decimal`` objects, as defined in the Python ``decimal`` module. SQLite does not handle the ``decimal`` type so internally we treat it as a ``double``. The (n,m) are the number of digits in total and the number of digits after the decimal point respectively.
The ``big-id`` and ``big-reference`` types are only supported by some of the database engines and are experimental. They are not normally used as field types except for legacy tables; however, the DAL constructor has a ``bigint_id`` argument that, when set to ``True``, makes the ``id`` fields and ``reference`` fields ``big-id`` and ``big-reference`` respectively.
The ``list:`` fields are special because they are designed to take advantage of certain denormalization features on NoSQL (in the case of Google App Engine NoSQL, the field types ``ListProperty`` and ``StringListProperty``) and back-port them to all the other supported relational databases. On relational databases, lists are stored as a ``text`` field. The items are separated by a ``|`` and each ``|`` in a string item is escaped as ``||``. They are discussed in their own section.
The ``json`` field type is pretty much explanatory. It can store any json serializable object. It is designed to work specifically for MongoDB and backported to the other database adapters for portability.
-------
Notice that ``requires=...`` is enforced at the level of forms, ``required=True`` is enforced at the level of the DAL (insert), while ``notnull``, ``unique`` and ``ondelete`` are enforced at the level of the database. While they sometimes may seem redundant, it is important to maintain the distinction when programming with the DAL.
-------
``ondelete``:inxx
- ``ondelete`` translates into the "ON DELETE" SQL statement. By default it is set to "CASCADE". This tells the database that when it deletes a record, it should also delete all records that refer to it. To disable this feature, set ``ondelete`` to "NO ACTION" or "SET NULL".
- ``notnull=True`` translates into the "NOT NULL" SQL statement. It prevents the database from inserting null values for the field.
- ``unique=True`` translates into the "UNIQUE" SQL statement and it makes sure that values of this field are unique within the table. It is enforced at the database level.
- ``uploadfield`` applies only to fields of type "upload". A field of type "upload" stores the name of a file saved somewhere else, by default on the filesystem under the application "uploads/" folder. If ``uploadfield`` is set, then the file is stored in a blob field within the same table and the value of ``uploadfield`` is the name of the blob field. This will be discussed in more detail later in the context of SQLFORM.
- ``uploadfolder`` defaults to the application's "uploads/" folder. If set to a different path, files will be uploaded to that folder instead. For example, ``uploadfolder=os.path.join(request.folder,'static/temp')`` will upload files to the web2py/applications/myapp/static/temp folder.
- ``uploadseparate`` if set to True will upload files under different subfolders of the ``uploadfolder`` folder. This is optimized to avoid too many files under the same folder/subfolder. ATTENTION: You cannot change the value of ``uploadseparate`` from True to False without breaking the system. web2py either uses the separate subfolders or it does not. Changing the behavior after files have been uploaded will prevent web2py from being able to retrieve those files. If this happens it is possible to move files and fix the problem, but this is not described here.
- ``uploadfs`` allows you to specify a different file system to which to upload files, including Amazon S3 or remote FTP storage. This option requires PyFileSystem to be installed, and ``uploadfs`` must point to a ``PyFileSystem`` object. ``PyFileSystem``:inxx ``uploadfs``:inxx
- ``widget`` must be one of the available widget objects, including custom widgets, for example: ``SQLFORM.widgets.string.widget``. A list of available widgets will be discussed later. Each field type has a default widget.
- ``label`` is a string (or something that can be serialized to a string) that contains the label to be used for this field in auto-generated forms.
- ``comment`` is a string (or something that can be serialized to a string) that contains a comment associated with this field, and will be displayed to the right of the input field in the autogenerated forms.
- ``writable`` if a field is writable, it can be edited in autogenerated create and update forms.
- ``readable`` if a field is readable, it will be visible in read-only forms. If a field is neither readable nor writable, it will not be displayed in create and update forms.
- ``update`` contains the default value for this field when the record is updated.
- ``compute`` is an optional function. If a record is inserted or updated, the compute function will be executed and the field will be populated with the function result. The record is passed to the compute function as a ``dict``, and the dict will not include the current value of that, or any other compute field.
- ``authorize`` can be used to require access control on the corresponding field, for "upload" fields only. It will be discussed more in detail in the context of Authentication and Authorization.
- ``autodelete`` determines if the corresponding uploaded file should be deleted when the record referencing the file is deleted. For "upload" fields only.
- ``represent`` can be None or can point to a function that takes a field value and returns an alternate representation for the field value. Examples:
``
db.mytable.name.represent = lambda name,row: name.capitalize()
db.mytable.other_id.represent = lambda id,row: row.myfield
db.mytable.some_uploadfield.represent = lambda value,row: \
A('get it', _href=URL('download', args=value))
``:code
``blob``:inxx
"blob" fields are also special. By default, binary data is encoded in base64 before being stored into the actual database field, and it is decoded when extracted. This has the negative effect of using 25% more storage space than necessary in blob fields, but has two advantages. On average it reduces the amount of data communicated between web2py and the database server, and it makes the communication independent of back-end-specific escaping conventions.
Most attributes of fields and tables can be modified after they are defined:
``
db.define_table('person',Field('name',default=''),format='%(name)s')
db.person._format = '%(name)s/%(id)s'
``:code
When a table is defined, ``define_table`` checks whether the table exists in the database. If it does not, web2py generates the SQL to create it and executes it. If the table exists but differs from the one being defined, web2py generates the SQL to alter the table and executes it. If a field has changed type, web2py also tries to convert the data. We refer to this behavior as a "migration". web2py logs all migrations and migration attempts in the file "sql.log".
The first argument of ``define_table`` is always the table name. The other unnamed arguments are the fields (Field). The function also takes an optional last argument called "migrate" which must be referred to explicitly by name as in:
``
>>> db.define_table('person', Field('name'), migrate='person.table')
``:code
The value of migrate is the filename (in the "databases" folder for the application) where web2py stores internal migration information for this table.
These files are very important and should never be removed while the corresponding tables exist. In cases where a table has been dropped and the corresponding file still exist, it can be removed manually. By default, migrate is set to True. This causes web2py to generate the filename from a hash of the connection string. If migrate is set to False, the migration is not performed, and web2py assumes that the table exists in the datastore and it contains (at least) the fields listed in ``define_table``.
The best practice is to give an explicit name to the migrate table.
There may not be two tables in the same application with the same migrate filename.
The DAL class also takes a "migrate" argument, which determines the default value of migrate for calls to ``define_table``. For example,
``
>>> db = DAL('sqlite://storage.db', migrate=False)
``:code
will set the default value of migrate to False whenever ``db.define_table`` is called without a migrate argument.
------
Notice that web2py only migrates new columns, removed columns, and changes in column type (not in sqlite). web2py does not migrate changes in attributes such as changes in the values of ``default``, ``unique``, ``notnull``, and ``ondelete``.
------
Migrations can be disabled for all tables at the moment of connection:
``
db = DAL(...,migrate_enabled=False)
``
This is the recommended behavior when two apps share the same database. Only one of the two apps should perform migrations, the other should disabled them.
### Fixing broken migrations
``fake_migrate``:inxx
There are two common problems with migrations and there are ways to recover from them.
One problem is specific with SQLite. SQLite does not enforce column types and cannot drop columns. This means that if you have a column of type string and you remove it, it is not really removed. If you add the column again with a different type (for example datetime) you end up with a datetime column that contains strings (junk for practical purposes). web2py does not complain about this because it does not know what is in the database, until it tries to retrieve records and fails.
If web2py returns an error in the gluon.sql.parse function when selecting records, this is the problem: corrupted data in a column because of the above issue.
The solution consists in updating all records of the table and updating the values in the column in question with None.
The other problem is more generic but typical with MySQL. MySQL does not allow more than one ALTER TABLE in a transaction. This means that web2py must break complex transactions into smaller ones (one ALTER TABLE at the time) and commit one piece at the time. It is therefore possible that part of a complex transaction gets committed and one part fails, leaving web2py in a corrupted state. Why would part of a transaction fail? Because, for example, it involves altering a table and converting a string column into a datetime column, web2py tries to convert the data, but the data cannot be converted. What happens to web2py? It gets confused about what exactly is the table structure actually stored in the database.
The solution consists of disabling migrations for all tables and enabling fake migrations:
``
db.define_table(....,migrate=False,fake_migrate=True)
``:code
This will rebuild web2py metadata about the table according to the table definition. Try multiple table definitions to see which one works (the one before the failed migration and the one after the failed migration). Once successful remove the ``fake_migrate=True`` attribute.
Before attempting to fix migration problems it is prudent to make a copy of "applications/yourapp/databases/*.table" files.
Migration problems can also be fixed for all tables at once:
``
db = DAL(...,fake_migrate_all=True)
``:code
-Although if this fails, it will not help in narrowing down the problem.
### ``insert``
Given a table, you can insert records
``insert``:inxx
``
>>> db.person.insert(name="Alex")
1
>>> db.person.insert(name="Bob")
2
``:code
Insert returns the unique "id" value of each record inserted.
You can truncate the table, i.e., delete all records and reset the counter of the id.
``truncate``:inxx
``
>>> db.person.truncate()
Finally, you can drop tables and all data will be lost:
Currently the DAL API does not provide a command to create indexes on tables, but this can be done using the ``executesql`` command. This is because the existence of indexes can make migrations complex, and it is better to deal with them explicitly. Indexes may be needed for those fields that are used in recurrent queries.
Here is an example of how to [[create an index using SQL in SQLite http://www.sqlite.org/lang_createindex.html]]:
``
>>> db = DAL('sqlite://storage.db')
>>> db.define_table('person', Field('name'))
>>> db.executesql('CREATE INDEX IF NOT EXISTS myidx ON person (name);')
``:code
Other database dialects have very similar syntaxes but may not support the optional "IF NOT EXISTS" directive.
### Legacy databases and keyed tables
web2py can connect to legacy databases under some conditions.
The easiest way is when these conditions are met:
- Each table must have a unique auto-increment integer field called "id"
- Records must be referenced exclusively using the "id" field.
When accessing an existing table, i.e., a table not created by web2py in the current application, always set ``migrate=False``.
If the legacy table has an auto-increment integer field but it is not called "id", web2py can still access it but the table definition must contain explicitly as ``Field('....','id')`` where ... is the name of the auto-increment integer field.
``keyed table``:inxx
Finally if the legacy table uses a primary key that is not an auto-increment id field it is possible to use a "keyed table", for example:
``
db.define_table('account',
Field('accnum','integer'),
Field('acctype'),
Field('accdesc'),
primarykey=['accnum','acctype'],
migrate=False)
``:code
- ``primarykey`` is a list of the field names that make up the primary key.
- All primarykey fields have a ``NOT NULL`` set even if not specified.
- Keyed tables can only reference other keyed tables.
- Referencing fields must use the ``reference tablename.fieldname`` format.
- The ``update_record`` function is not available for Rows of keyed tables.
-------
Note that currently this is only available for DB2, MS-SQL, Ingres and Informix, but others can be easily added.
-------
At the time of writing, we cannot guarantee that the ``primarykey`` attribute works with every existing legacy table and every supported database backend.
For simplicity, we recommend, if possible, creating a database view that has an auto-increment id field.
### Distributed transaction
``distributed transactions``:inxx
------
At the time of writing this feature is only supported
by PostgreSQL, MySQL and Firebird, since they expose API for two-phase commits.
------
Assuming you have two (or more) connections to distinct PostgreSQL databases, for example:
``
db_a = DAL('postgres://...')
db_b = DAL('postgres://...')
``:code
Alex
``:code
You can have the fetched records appear in random order:
``
>>> for row in db().select(
db.person.ALL, orderby='<random>'):
print row.name
Carl
Alex
Bob
``:code
-----
The use of ``orderby='<random>'`` is not supported on Google NoSQL. However, in this situation and likewise in many others where built-ins are insufficient, imports can be used:
``
import random
rows=db(...).select().sort(lambda row: random.random())
``:code
-----
And you can sort the records according to multiple fields by concatenating them with a "|":
``
>>> for row in db().select(
db.person.ALL, orderby=db.person.name|db.person.id):
print row.name
Carl
Bob
Alex
``:code
Using ``groupby`` together with ``orderby``, you can group records with the same value for the specified field (this is back-end specific, and is not on the Google NoSQL):
``
>>> for row in db().select(
db.person.ALL,
orderby=db.person.name, groupby=db.person.name):
print row.name
Alex
Bob
Carl
``:code
web2py also allows updating a single record that is already in memory using ``up
because for a single row, the method ``update`` updates the row object but not the database record, as in the case of ``update_record``.
It is also possible to change the attributes of a row (one at a time) and then call ``update_record()`` without arguments to save the changes:
``
>>> row = db(db.person.id > 2).select().first()
>>> row.name = 'Curt'
>>> row.update_record() # saves above change
``:code
The ``update_record`` method is available only if the table's ``id`` field is included in the select, and ``cacheable`` is not set to ``True``.
#### Inserting and updating from a dictionary
A common issue consists of needing to insert or update records in a table where the name of the table, the field to be updated, and the value for the field are all stored in variables. For example: ``tablename``, ``fieldname``, and ``value``.
The insert can be done using the following syntax:
``
db[tablename].insert(**{fieldname:value})
``:
The update of record with given id can be done with: ``_id``:inxx
``
db(db[tablename]._id==id).update(**{fieldname:value})
``:code
Notice we used ``table._id`` instead of ``table.id``. In this way the query works even for tables with a field of type "id" which has a name other than "id".
#### ``first`` and ``last``
``first``:inxx ``last``:inxx
Given a Rows object containing records:
``
>>> rows = db(query).select()
>>> first_row = rows.first()
>>> last_row = rows.last()
``:code
name
Max
Tim
John
Tim
``:code
You can do a union of the records removing duplicates:
``
>>> rows3 = rows1 | rows2
>>> print rows3
name
Max
Tim
John
``:code
#### ``find``, ``exclude``, ``sort``
``find``:inxx ``exclude``:inxx ``sort``:inxx
There are times when one needs to perform two selects and one contains a subset of a previous select. In this case it is pointless to access the database again. The ``find``, ``exclude`` and ``sort`` objects allow you to manipulate a Rows objects and generate another one without accessing the database. More specifically:
- ``find`` returns a new set of Rows filtered by a condition and leaves the original unchanged.
- ``exclude`` returns a new set of Rows filtered by a condition and removes them from the original Rows.
- ``sort`` returns a new set of Rows sorted by a condition and leaves the original unchanged.
All these methods take a single argument, a function that acts on each individual row.
Here is an example of usage:
``
>>> db.define_table('person',Field('name'))
>>> db.person.insert(name='John')
>>> db.person.insert(name='Max')
>>> db.person.insert(name='Alex')
>>> rows = db(db.person).select()
>>> for row in rows.find(lambda row: row.name[0]=='M'):
print row.name
Max
>>> print len(rows)
3
>>> for row in rows.exclude(lambda row: row.name[0]=='M'):
print row.name
Here is ``_delete`` ``_delete``:inxx
DELETE FROM person WHERE person.name='Alex';
``:code
And finally, here is ``_update`` ``_update``:inxx
``
>>> print db(db.person.name=='Alex')._update()
UPDATE person SET WHERE person.name='Alex';
``:code
-----
Moreover you can always use ``db._lastsql`` to return the most recent
SQL code, whether it was executed manually using executesql or was SQL
generated by the DAL.
-----
### Exporting and importing data
``export``:inxx ``import``:inxx
#### CSV (one Table at a time)
When a DALRows object is converted to a string it is automatically
serialized in CSV:
``csv``:inxx
``
>>> rows = db(db.person.id==db.thing.owner).select()
>>> print rows
person.id,person.name,thing.id,thing.name,thing.owner
1,Alex,1,Boat,1
1,Alex,2,Chair,1
2,Bob,3,Shoes,2
``:code
You can serialize a single table in CSV and store it in a file "test.csv":
``
>>> open('test.csv', 'w').write(str(db(db.person.id).select()))
``:code
This is equivalent to
``
>>> rows = db(db.person.id).select()
>>> rows.export_to_csv_file(open('test.csv', 'w'))
``:code
You can read the CSV file back with:
``
>>> db.person.import_from_csv_file(open('test.csv', 'r'))
``:code
When importing, web2py looks for the field names in the CSV header. In this example, it finds two columns: "person.id" and "person.name". It ignores the "person." prefix, and it ignores the "id" fields. Then all records are appended and assigned new ids. Both of these operations can be performed via the appadmin web interface.
#### CSV (all tables at once)
In web2py, you can backup/restore an entire database with two commands:
To export:
``
>>> db.export_to_csv_file(open('somefile.csv', 'wb'))
``:code
To import:
``
db.define_table('thing',
format='%(name)s')
if not db(db.person).count():
id = db.person.insert(name="Massimo")
db.thing.insert(owner=id, name="Chair")
``:code
Each record is identified by an ID and referenced by that ID. If you
have two copies of the database used by distinct web2py installations,
the ID is unique only within each database and not across the databases.
This is a problem when merging records from different databases.
In order to make a record uniquely identifiable across databases, they
must:
- have a unique id (UUID),
- have an event_time (to figure out which one is more recent if multiple copies),
- reference the UUID instead of the id.
This can be achieved without modifying web2py. Here is what to do:
**1.** Change the above model into:
``
db.define_table('person',
Field('uuid', length=64, default=lambda:str(uuid.uuid4())),
Field('modified_on', 'datetime', default=now),
Field('name'),
format='%(name)s')
db.define_table('thing',
Field('uuid', length=64, default=lambda:str(uuid.uuid4())),
Field('modified_on', 'datetime', default=now),
Field('owner', length=64),
Field('name'),
format='%(name)s')
db.thing.owner.requires = IS_IN_DB(db,'person.uuid','%(name)s')
if not db(db.person.id).count():
id = uuid.uuid4()
db.person.insert(name="Massimo", uuid=id)
db.thing.insert(owner=id, name="Chair")
``:code
-------
Note, in the above table definitions, the default value for the two 'uuid' fields is set to a lambda function, which returns a UUID (converted to a string). The lambda function is called once for each record inserted, ensuring that each record gets a unique UUID, even if multiple records are inserted in a single transaction.
-------
**2.** Create a controller action to export the database:
``
def export():
s = StringIO.StringIO()
db.export_to_csv_file(s)
response.headers['Content-Type'] = 'text/csv'
return s.getvalue()
``:code
**3.** Create a controller action to import a saved copy of the other database and sync records:
``
def import_and_sync():
form = FORM(INPUT(_type='file', _name='data'), INPUT(_type='submit'))
if form.process(session=None).accepted:
db.import_from_csv_file(form.vars.data.file,unique=False)
# for every table
for table in db.tables:
# for every uuid, delete all but the latest
items = db(db[table]).select(db[table].id,
db[table].uuid,
orderby=db[table].modified_on,
groupby=db[table].uuid)
for item in items:
db((db[table].uuid==item.uuid)&\
(db[table].id!=item.id)).delete()
return dict(form=form)
``:code
-Notice that ``session=None`` disables the CSRF protection since this URL is intended to be accessed from outside.
-
**4.** Create an index manually to make the search by uuid faster.
-Notice that steps 2 and 3 work for every database model; they are not
-specific for this example.
``XML-RPC``:inxx
Alternatively, you can use XML-RPC to export/import the file.
If the records reference uploaded files, you also need to export/import the content of the uploads folder. Notice that files therein are already labeled by UUIDs so you do not need to worry about naming conflicts and references.
#### HTML and XML (one Table at a time)
-``DALRows objects``:inxx
-DALRows objects also have an ``xml`` method (like helpers) that serializes it to XML/HTML:
``HTML``:inxx
``
>>> rows = db(db.person.id > 0).select()
>>> print rows.xml()
<table>
<thead>
<tr>
<th>person.id</th>
<th>person.name</th>
<th>thing.id</th>
<th>thing.name</th>
<th>thing.owner</th>
</tr>
</thead>
<tbody>
<tr class="even">
<td>1</td>
<td>Alex</td>
<td>1</td>
<td>Boat</td>
<td>1</td>
</tr>
...
</tbody>
</table>
``:code
-``DALRows custom tags``:inxx
-If you need to serialize the DALRows in any other XML format with custom tags, you can easily do that using the universal TAG helper and the * notation:
``XML``:inxx
``
>>> rows = db(db.person.id > 0).select()
>>> print TAG.result(*[TAG.row(*[TAG.field(r[f], _name=f) \
for f in db.person.fields]) for r in rows])
<result>
<row>
<field name="id">1</field>
<field name="name">Alex</field>
</row>
...
</result>
``:code
#### Data representation
``export_to_csv_file``:inxx
The ``export_to_csv_file`` function accepts a keyword argument named ``represent``. When ``True`` it will use the columns ``represent`` function while exporting the data instead of the raw data.
The select method also takes a cache argument, which defaults to None. For cachi
In the following example, you see a controller that caches a select on the previously defined db.log table. The actual select fetches data from the back-end database no more frequently than once every 60 seconds and stores the result in cache.ram. If the next call to this controller occurs in less than 60 seconds since the last database IO, it simply fetches the previous data from cache.ram.
``cache select``:inxx
``
def cache_db_select():
logs = db().select(db.log.ALL, cache=(cache.ram, 60))
return dict(logs=logs)
``:code
``cacheable``:inxx
The ``select`` method has an optional ``cacheable`` argument, normally set to ``False``. When ``cacheable=True`` the resulting ``Rows`` is serializable but The ``Row``s lack ``update_record`` and ``delete_record`` methods.
If you do not need these methods you can speed up selects a lot by setting the cacheable attribute:
``
rows = db(query).select(cacheable=True)
``:code
-The results of a ``select`` are normally complex, un-pickleable objects; they cannot be stored in a session and cannot be cached in any other way than the one explained here unless the ``cache`` attribute is set or ``cacheable=True``.
-
When the ``cache`` argument is set but ``cacheable=False`` (default) only the database results are cached, not the actual Rows object. When the ``cache`` argument is used in conjunction with ``cacheable=True`` the entire Rows object is cached and this results in much faster caching:
``
rows = db(query).select(cache=(cache.ram,3600),cacheable=True)
``:code
### Self-Reference and aliases
``self reference``:inxx
``alias``:inxx
It is possible to define tables with fields that refer to themselves, here is an example:
``reference table``:inxx
``
db.define_table('person',
Field('name'),
Field('father_id', 'reference person'),
Field('mother_id', 'reference person'))
``:code
Notice that the alternative notation of using a table object as field type will fail in this case, because it uses a variable ``db.person`` before it is defined:
Here ``f`` is a dict of fields passed to insert or update, ``id`` is the id of t
>>> db(db.person.id==1).update(name='Tim')
(<Set (person.id = 1)>, {'name': 'Tim'})
(<Set (person.id = 1)>, {'name': 'Tim'})
>>> db(db.person.id==1).delete()
(<Set (person.id = 1)>,)
(<Set (person.id = 1)>,)
``:code
The return values of these callbacks should be ``None`` or ``False``. If any of the ``_before_*`` callbacks returns a ``True`` value it will abort the actual insert/update/delete operation.
``update_naive``:inxx
Sometimes a callback may need to perform an update on the same or a different table, and one wants to avoid callbacks calling themselves recursively.
For this purpose, Set objects have an ``update_naive`` method that works like ``update`` but ignores the before and after callbacks.
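A minimal sketch of both mechanisms (the ``audit`` table is an assumption made for illustration):
``
# assumes: db.define_table('audit', Field('changes', 'text'))
db.person._after_update.append(
    lambda s, f: db.audit.insert(changes=str(f)))

# inside a callback, update_naive avoids re-triggering the callbacks
db(db.person.id == 1).update_naive(name='Tim')
``:code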
#### Record versioning
``_enable_record_versioning``:inxx
It is possible to ask web2py to save every copy of a record when the record is modified. There are different ways to do it and it can be done for all tables at once using the syntax:
``
auth.enable_record_versioning(db)
``:code
this requires Auth and it is discussed in the chapter about authentication.
It can also be done for each individual table as discussed below.
Consider the following table:
``
db.define_table('stored_item',
    Field('name'),
    Field('quantity', 'integer'),
    Field('is_active', 'boolean',
          writable=False, readable=False, default=True))
``:code
Notice the hidden boolean field called ``is_active``, defaulting to True; we will return to this table when enabling record versioning below.
#### Common filters
A related mechanism is the table-level ``_common_filter`` attribute. Consider for example a ``blog_post`` table with an ``is_public`` boolean field, and set:
``
db.blog_post._common_filter = lambda query: db.blog_post.is_public == True
``
Any select, delete or update on this table will now include only public blog posts. The common filter serves both as a way to avoid repeating the "db.blog_post.is_public==True" phrase in each blog post search, and as a security enhancement that prevents you from forgetting to disallow viewing of non-public posts.
In case you actually do want items left out by the common filter (for example, allowing the admin to see non-public posts), you can either remove the filter:
``
db.blog_post._common_filter = None
``
or ignore it:
``
db(query, ignore_common_filters=True).select(...)
``
#### Custom ``Field`` types (experimental)
``SQLCustomType``:inxx
It is possible to define new/custom field types. As an example, consider a field that contains binary data in compressed form:
``
from gluon.dal import SQLCustomType
import zlib
compressed = SQLCustomType(
    type='text',
    native='text',
    encoder=(lambda x: zlib.compress(x or '')),
    decoder=(lambda x: zlib.decompress(x)))
db.define_table('example', Field('data',type=compressed))
``:code
``SQLCustomType`` is a field type factory. Its ``type`` argument must be one of the standard web2py types. It tells web2py how to treat the field values at the web2py level. ``native`` is the name of the field as far as the database is concerned. Allowed names depend on the database engine. ``encoder`` is an optional transformation function applied when the data is stored and ``decoder`` is the optional reversed transformation function.
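As a sketch of the round trip (the sample string is illustrative), the encoder and decoder run transparently on write and read:
``
key = db.example.insert(data='a long string to be compressed')
row = db.example[key]
print row.data   # decompressed transparently on read
``:code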
This feature is marked as experimental. In practice it has been in web2py for a long time and it works but it can make the code not portable, for example when the native type is database specific. It does not work on Google App Engine NoSQL.
#### Using DAL without defining tables
To access the data and its attributes, we still have to define all the tables we are going to access with ``db.define_table``.
If we just need access to the data but not to the web2py table attributes, we can get away without re-defining the tables by simply asking web2py to read the necessary info from the metadata in the .table files:
``
from gluon import DAL, Field
db = DAL('sqlite://storage.sqlite', folder='path/to/app/databases',
         auto_import=True)
``:code
This allows us to access any ``db.table`` without the need to re-define it.
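For example (``mytable`` here stands for any hypothetical table already present in the metadata):
``
print db.tables            # tables discovered from the .table files
rows = db(db.mytable).select()
``:code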
#### PostGIS, SpatiaLite, and MS Geo (experimental)
``PostGIS``:inxx ``SpatiaLite``:inxx ``Geo Extensions``:inxx
``geometry``:inxx ``geoPoint``:inxx ``geoLine``:inxx ``geoPolygon``:inxx
The DAL supports geographical APIs using PostGIS (for PostgreSQL), SpatiaLite (for SQLite), and the MSSQL Spatial Extensions. This is a feature that was sponsored by the Sahana project and implemented by Denes Lengyel.
DAL provides geometry and geography field types and the following functions:
``st_asgeojson``:inxx ``st_astext``:inxx ``st_contained``:inxx ``st_contains``:inxx
``st_distance``:inxx ``st_equals``:inxx ``st_intersects``:inxx ``st_overlaps``:inxx
``st_simplify``:inxx ``st_touches``:inxx ``st_within``:inxx
``
st_asgeojson (PostGIS only)
st_astext
st_contains
st_distance
st_equals
st_intersects
st_overlaps
st_simplify (PostGIS only)
st_touches
st_within
st_x
st_y
``
Here are some examples:
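Below is a minimal sketch; the ``spot`` table is illustrative, and a PostGIS- or SpatiaLite-enabled connection is assumed. The ``geoPoint``, ``geoLine``, and ``geoPolygon`` helpers build the corresponding WKT strings:
``
from gluon.dal import geoPoint, geoLine, geoPolygon
db.define_table('spot', Field('loc', 'geometry()'))
db.spot.insert(loc=geoPoint(1, 1))
db.spot.insert(loc=geoLine((100, 100), (20, 180), (180, 180)))
# spatial query: which stored geometries contain the point (1,1)?
rows = db(db.spot.loc.st_contains(geoPoint(1, 1))).select()
``:code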
#### Adapters
Internally, a call like ``db.mytable.insert(myfield='myvalue')`` calls
``
Table.insert(myfield='myvalue')
``
which delegates to the adapter by returning:
``
db._adapter.insert(db.mytable,db.mytable._listify(dict(myfield='myvalue')))
``
Here ``db.mytable._listify`` converts the dict of arguments into a list of ``(field,value)`` and calls the ``insert`` method of the ``adapter``. ``db._adapter`` does more or less the following:
``
query = db._adapter._insert(db.mytable,list_of_fields)
db._adapter.execute(query)
``
where the first line builds the query and the second executes it.
``BaseAdapter`` defines the interface for all adapters.
At the moment of writing this book, "gluon/dal.py" contains the following adapters:
``
SQLiteAdapter extends BaseAdapter
JDBCSQLiteAdapter extends SQLiteAdapter
MySQLAdapter extends BaseAdapter
PostgreSQLAdapter extends BaseAdapter
JDBCPostgreSQLAdapter extends PostgreSQLAdapter
OracleAdapter extends BaseAdapter
MSSQLAdapter extends BaseAdapter
MSSQL2Adapter extends MSSQLAdapter
FireBirdAdapter extends BaseAdapter
FireBirdEmbeddedAdapter extends FireBirdAdapter
InformixAdapter extends BaseAdapter
DB2Adapter extends BaseAdapter
IngresAdapter extends BaseAdapter
IngresUnicodeAdapter extends IngresAdapter
GoogleSQLAdapter extends MySQLAdapter
NoSQLAdapter extends BaseAdapter
``:code
As an example, here is an abridged sketch of the ``MySQLAdapter`` (the constructor signature is abbreviated; see the source for the full version):
``
class MySQLAdapter(BaseAdapter):
    def __init__(self, db, uri, pool_size=0, folder=None, db_codec='UTF-8',
                 credential_decoder=lambda x: x, driver_args={},
                 adapter_args={}):
        # parse uri string and store parameters in driver_args
        ...
        # define a connection function
        def connect(driver_args=driver_args):
            return self.driver.connect(**driver_args)
        # place it in the pool
        self.pool_connection(connect)
        # set optional parameters (after connection)
        self.execute('SET FOREIGN_KEY_CHECKS=1;')
        self.execute("SET sql_mode='NO_BACKSLASH_ESCAPES';")

    # override BaseAdapter methods as needed
    def lastrowid(self, table):
        self.execute('select last_insert_id();')
        return int(self.cursor.fetchone()[0])
``:code
Looking at the various adapters as examples, it should be easy to write new ones.

#### Field attributes
- ``ondelete`` translates into the "ON DELETE" SQL statement. By default it is set to "CASCADE". This tells the database that when it deletes a record, it should also delete all records that refer to it. To disable this feature, set ``ondelete`` to "NO ACTION" or "SET NULL".
- ``notnull=True`` translates into the "NOT NULL" SQL statement. It prevents the database from inserting null values for the field.
- ``unique=True`` translates into the "UNIQUE" SQL statement and it makes sure that values of this field are unique within the table. It is enforced at the database level.
- ``uploadfield`` applies only to fields of type "upload". A field of type "upload" stores the name of a file saved somewhere else, by default on the filesystem under the application "uploads/" folder. If ``uploadfield`` is set, then the file is stored in a blob field within the same table and the value of ``uploadfield`` is the name of the blob field. This will be discussed in more detail later in the context of SQLFORM.
- ``uploadfolder`` defaults to the application's "uploads/" folder. If set to a different path, files will be uploaded to a different folder. For example, ``uploadfolder=os.path.join(request.folder,'static/temp')`` will upload files to the web2py/applications/myapp/static/temp folder.
- ``uploadseparate`` if set to True will upload files under different subfolders of the ``uploadfolder`` folder. This is optimized to avoid too many files under the same folder/subfolder. ATTENTION: You cannot change the value of ``uploadseparate`` from True to False without breaking the system. web2py either uses the separate subfolders or it does not. Changing the behavior after files have been uploaded will prevent web2py from being able to retrieve those files. If this happens it is possible to move files and fix the problem, but this is not described here.
- ``uploadfs`` allows you to specify a different file system where to upload files, including an Amazon S3 storage or a remote FTP storage. This option requires PyFileSystem installed. ``uploadfs`` must point to ``PyFileSystem``. ``PyFileSystem``:inxx ``uploadfs``:inxx
- ``widget`` must be one of the available widget objects, including custom widgets, for example: ``SQLFORM.widgets.string.widget``. A list of available widgets will be discussed later. Each field type has a default widget.
- ``label`` is a string (or something that can be serialized to a string) that contains the label to be used for this field in auto-generated forms.
- ``comment`` is a string (or something that can be serialized to a string) that contains a comment associated with this field, and will be displayed to the right of the input field in the autogenerated forms.
- ``writable`` if a field is writable, it can be edited in autogenerated create and update forms.
- ``readable`` if a field is readable, it will be visible in read-only forms. If a field is neither readable nor writable, it will not be displayed in create and update forms.
- ``update`` contains the default value for this field when the record is updated.
- ``compute`` is an optional function. If a record is inserted or updated, the compute function will be executed and the field will be populated with the function result. The record is passed to the compute function as a ``dict``, and the dict will not include the current value of that, or any other compute field. See the sketch after this list.
- ``authorize`` can be used to require access control on the corresponding field, for "upload" fields only. It will be discussed more in detail in the context of Authentication and Authorization.
- ``autodelete`` determines if the corresponding uploaded file should be deleted when the record referencing the file is deleted. For "upload" fields only.
- ``represent`` can be None or can point to a function that takes a field value and returns an alternate representation for the field value. Examples:
``
db.mytable.name.represent = lambda name,row: name.capitalize()
db.mytable.other_id.represent = lambda id,row: row.myfield
db.mytable.some_uploadfield.represent = lambda value,row: \
    A('get it', _href=URL('download', args=value))
``:code
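A minimal sketch of ``compute``, as promised above (the ``item2`` table is illustrative):
``
db.define_table('item2',
    Field('unit_price', 'double'),
    Field('quantity', 'integer'),
    # total is computed automatically on insert and update
    Field('total', 'double',
          compute=lambda r: r['unit_price'] * r['quantity']))
``:code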
``blob``:inxx
"blob" fields are also special. By default, binary data is encoded in base64 before being stored into the actual database field, and it is decoded when extracted. This has the negative effect of using 25% more storage space than necessary in blob fields, but has two advantages. On average it reduces the amount of data communicated between web2py and the database server, and it makes the communication independent of back-end-specific escaping conventions.
Most attributes of fields and tables can be modified after they are defined:
``
db.define_table('person',Field('name',default=''),format='%(name)s')
db.person._format = '%(name)s/%(id)s'
``:code
Given a field, you can access the attributes set in its definition:
``
>>> print db.person.name.type
string
>>> print db.person.name.unique
False
>>> print db.person.name.notnull
False
>>> print db.person.name.length
32
``:code
including its parent table, tablename, and parent connection:
``
>>> db.person.name._table == db.person
True
>>> db.person.name._tablename == 'person'
True
>>> db.person.name._db == db
True
``:code
A field also has methods. Some of them are used to build queries and we will see them later.
A special method of the field object is ``validate`` and it calls the validators for the field.
``
print db.person.name.validate('John')
``
which returns a tuple ``(value, error)``. ``error`` is ``None`` if the input passes validation.
### Migrations
``migrations``:inxx
``define_table`` checks whether or not the corresponding table exists. If it does not, it generates the SQL to create it and executes the SQL. If the table does exist but differs from the one being defined, it generates the SQL to alter the table and executes it. If a field has changed type but not name, it will try to convert the data (If you do not want this, you need to redefine the table twice, the first time, letting web2py drop the field by removing it, and the second time adding the newly defined field so that web2py can create it.). If the table exists and matches the current definition, it will leave it alone. In all cases it will create the ``db.person`` object that represents the table.
We refer to this behavior as a "migration". web2py logs all migrations and migration attempts in the file "databases/sql.log".
The first argument of ``define_table`` is always the table name. The other unnamed arguments are the fields (Field). The function also takes an optional last argument called "migrate" which must be referred to explicitly by name as in:
``
>>> db.define_table('person', Field('name'), migrate='person.table')
``:code
### Query, Set, Rows
You can build a query (using operators like ``==``, ``!=``, ``<``, ``>``, ``<=``, ``>=``, ``like``, ``belongs``) such as ``q = db.person.name=='Alex'``.
When you call ``db`` with a query, you define a set of records. You can store it in a variable ``s`` and write:
``Set``:inxx
``
>>> s = db(q)
``:code
Notice that no database query has been performed so far. DAL + Query simply define a set of records in this db that match the query.
web2py determines from the query which table (or tables) are involved and, in fact, there is no need to specify that.
### ``select``
Given a Set, ``s``, you can fetch the records with the command ``select``:
``Rows``:inxx ``select``:inxx
``
>>> rows = s.select()
``:code
``Row``:inxx
It returns an iterable object of class ``gluon.sql.Rows`` whose elements are Row objects. ``gluon.sql.Row`` objects act like dictionaries, but their elements can also be accessed as attributes, like ``gluon.storage.Storage``. The former differs from the latter because its values are read-only.
The Rows object allows looping over the result of the select and printing the selected field values for each row:
``
>>> for row in rows:
        print row.id, row.name
1 Alex
``:code
You can do all the steps in one statement:
``
>>> for row in db(db.person.name=='Alex').select():
        print row.name
Alex
``:code
``ALL``:inxx
The select command can take arguments. All unnamed arguments are interpreted as the names of the fields that you want to fetch. For example, you can be explicit on fetching field "id" and field "name":
``
>>> for row in db().select(db.person.id, db.person.name):
        print row.name
``:code
#### Virtual fields
Here we consider a table ``item`` with a ``unit_price`` and a ``quantity`` field.
One can define a ``total_price`` virtual field as
``
>>> db.item.total_price = Field.Virtual(
        lambda row: row.item.unit_price*row.item.quantity)
``:code
i.e. by simply defining a new field ``total_price`` to be a ``Field.Virtual``. The only argument of the constructor is a function that takes a row and returns the computed value.
A virtual field defined as the one above is automatically computed for all records when the records are selected:
``
>>> for row in db(db.item).select(): print row.total_price
``
It is also possible to define method fields which are calculated on-demand, when called.
For example:
``
>>> db.item.discounted_total = Field.Method(lambda row, discount=0.0: \
        row.item.unit_price*row.item.quantity*(1.0-discount/100.0))
``:code
In this case ``row.discounted_total`` is not a value but a function. The function takes the same arguments as the function passed to the ``Method`` constructor except for ``row``, which is implicit (think of it as ``self`` for row objects).
The lazy field in the example above allows one to compute the total price for each ``item``:
``
>>> for row in db(db.item).select(): print row.discounted_total()
``
It also allows you to pass an optional ``discount`` percentage (e.g. 15%):
``
>>> for row in db(db.item).select(): print row.discounted_total(15)
``
Virtual and Method fields can also be defined in place when a table is defined:
``
>>> db.define_table('item',
        Field('unit_price', 'double'),
        Field('quantity', 'integer'),
        Field.Virtual('total_price',
            lambda row: row.item.unit_price*row.item.quantity),
        Field.Method('discounted_total',
            lambda row, discount=0.0:
                row.item.unit_price*row.item.quantity*(1.0-discount/100.0)))
``:code
#### ``list:`` fields
A field of type ``list:reference <table>`` gets a default ``IS_IN_DB(..., multiple=True)`` validator that produces a ``SELECT/OPTION`` multiple drop-box in forms.
Also notice that this field gets a default ``represent`` attribute which represents the list of references as a comma-separated list of formatted references. This is used in read forms and ``SQLTABLE``s.
-----
While ``list:reference`` has a default validator and a default representation, ``list:integer`` and ``list:string`` do not. So these two need an ``IS_IN_SET`` or an ``IS_IN_DB`` validator if you want to use them in forms.
-----
### Other operators
web2py has other operators that provide an API to access equivalent SQL operators.
Let's define another table "log" to store security events, their event_time and severity, where the severity is an integer number.
``date``:inxx ``datetime``:inxx ``time``:inxx
``
>>> db.define_table('log', Field('event'),
        Field('event_time', 'datetime'),
        Field('severity', 'integer'))
``:code
As before, insert a few events, a "port scan", an "xss injection" and an "unauthorized login".
For the sake of the example, you can log events with the same event_time but with different severities (1, 2, and 3 respectively).
``
>>> import datetime
>>> now = datetime.datetime.now()
>>> print db.log.insert(
        event='port scan', event_time=now, severity=1)
1
>>> print db.log.insert(
        event='xss injection', event_time=now, severity=2)
2
>>> print db.log.insert(
        event='unauthorized login', event_time=now, severity=3)
3
``:code
#### ``like``, ``regexp``, ``startswith``, ``contains``, ``upper``, ``lower``
``like``:inxx ``startswith``:inxx ``regexp``:inxx
``contains``:inxx ``upper``:inxx ``lower``:inxx
Fields have a ``like`` operator that you can use to match strings:
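A minimal sketch with the ``log`` table defined above (the output assumes the three events inserted earlier):
``
>>> for row in db(db.log.event.like('port%')).select():
        print row.event
port scan
>>> for row in db(db.log.event.contains('injection')).select():
        print row.event
xss injection
``:code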
Returning to the ``stored_item`` table defined above under Record versioning, versioning can be enabled for an individual table with:
``
db.stored_item._enable_record_versioning()
``:code
or in a more verbose syntax:
``
db.stored_item._enable_record_versioning(
    archive_db=db,
    archive_name='stored_item_archive',
    current_record='current_record',
    is_active='is_active')
``:code
The ``archive_db=db`` tells web2py to store the archive table in the same database as the ``stored_item`` table. The ``archive_name`` sets the name for the archive table. The archive table has the same fields as the original table ``stored_item``, except that unique fields are no longer unique (because it needs to store multiple versions) and it has an extra field, whose name is specified by ``current_record``, which is a reference to the current record in the ``stored_item`` table.
When records are deleted, they are not really deleted. A deleted record is copied into the ``stored_item_archive`` table (like when it is modified) and the ``is_active`` field is set to False. By enabling record versioning, web2py sets a ``custom_filter`` on this table that hides all records in table ``stored_item`` where the ``is_active`` field is set to False. The ``is_active`` parameter in the ``_enable_record_versioning`` method allows you to specify the name of the field used by the ``custom_filter`` to determine whether the record was deleted or not.
``custom_filter``s are ignored by the appadmin interface.
#### Common fields and multi-tenancy
``common fields``:inxx
``multi tenancy``:inxx
``db._common_fields`` is a list of fields that should belong to all the tables. This list can also contain tables, in which case it is understood as all the fields from that table. For example, occasionally you find yourself in need of adding a signature to all your tables but the ``auth`` tables. In this case, after ``auth.define_tables()`` but before defining any other table, insert:
``
db._common_fields.append(auth.signature)
``
One field is special: "request_tenant".
This field does not exist but you can create it and add it to any of your tables (or them all):
``
db._common_fields.append(Field('request_tenant',
default=request.env.http_host,writable=False))
``
For every table with a field whose name matches ``db._request_tenant`` (by default "request_tenant"), all records for all queries are always automatically filtered by:
``
db.table.request_tenant == db.table.request_tenant.default
``:code
and for every record insert, this field is set to the default value.
#### Drivers and adapter arguments
When a connection is established, the uri string is parsed by the DAL to select an adapter; the string is then parsed in more detail by the adapter itself.
For any adapter you can replace the driver with a different one:
``
import MySQLdb
from gluon.dal import MySQLAdapter
MySQLAdapter.driver = MySQLdb
``
and you can specify optional driver arguments and adapter arguments:
``
db = DAL(..., driver_args={}, adapter_args={})
``
#### Gotchas
**SQLite** does not support dropping and altering columns. That means that web2py migrations will work only up to a point. If you delete a field from a table, the column will remain in the database but be invisible to web2py. If you decide to reinstate the column, web2py will try to re-create it and fail. In this case you must set ``fake_migrate=True`` so that metadata is rebuilt without attempting to add the column again. Also, for the same reason, **SQLite** is not aware of any change of column type. If you insert a number in a string field, it will be stored as a string. If you later change the model and replace the type "string" with type "integer", SQLite will continue to keep the number as a string and this may cause problems when you try to extract the data.
**MySQL** does not support multiple ALTER TABLE within a single transaction. This means that any migration process is broken into multiple commits. If something happens that causes a failure it is possible to break a migration (the web2py metadata are no longer in sync with the actual table structure in the database). This is unfortunate, but it can be prevented (migrate one table at a time) or it can be fixed a posteriori (revert the web2py model to what corresponds to the table structure in the database, set ``fake_migrate=True`` and, after the metadata has been rebuilt, set ``fake_migrate=False`` and migrate the table again).
**Google SQL** has the same problems as MySQL and more. In particular, table metadata itself must be stored in the database in a table that is not migrated by web2py, because Google App Engine has a read-only file system. Web2py migrations in Google SQL, combined with the MySQL issue described above, can result in metadata corruption. Again, this can be prevented (by migrating all tables at once and then setting migrate=False so that the metadata table is not accessed any more) or it can be fixed a posteriori (by accessing the database using the Google dashboard and deleting any corrupted entry from the table called ``web2py_filesystem``).
``limitby``:inxx
**MSSQL** does not support the SQL OFFSET keyword. Therefore the database cannot do pagination. When doing a ``limitby=(a,b)`` web2py will fetch the first ``b`` rows and discard the first ``a``. This may result in a considerable overhead when compared with other database engines.
**Oracle** also does not support pagination. It supports neither the OFFSET nor the LIMIT keywords. Web2py achieves pagination by translating a ``db(...).select(limitby=(a,b))`` into a complex three-way nested select (as suggested by official Oracle documentation). This works for simple selects but may break for complex selects involving aliased fields and/or joins.
**MSSQL** has problems with circular references in tables that have ONDELETE CASCADE. This is an MSSQL bug and you can work around it by setting the ondelete attribute for all reference fields to "NO ACTION". You can also do it once and for all before you define tables:
``
db = DAL('mssql://....')
for key in ['reference', 'reference FK']:
    db._adapter.types[key] = db._adapter.types[key].replace(
        '%(on_delete_action)s', 'NO ACTION')
``:code
**MSSQL** also has problems with arguments passed to the DISTINCT keyword; therefore, while this works,
``
db(query).select(distinct=True)
``
this does not:
``
db(query).select(distinct=db.mytable.myfield)
``
**Google NoSQL (Datastore)** does not allow joins, left joins, aggregates, expressions, OR involving more than one table, or the ``like`` operator to search in "text" fields. Transactions are limited and not provided automatically by web2py (you need to use the Google API ``run_in_transaction``, which you can look up in the Google App Engine documentation online). Google also limits the number of records you can retrieve in each query (1000 at the time of writing). On the Google datastore, record IDs are integer but they are not sequential. While on SQL the "list:string" type is mapped into a "text" type, on the Google Datastore it is mapped into a ``ListStringProperty``. Similarly "list:integer" and "list:reference" are mapped into "ListProperty". This makes searches for content inside these field types more efficient on Google NoSQL than on SQL databases.

Queries can be negated either using the "``!=``" operator or by explicit negation with the "``~``" unary operator:
``
>>> rows = db(~(db.person.name=='Alex') | (db.person.id>3)).select()
>>> for row in rows: print row.id, row.name
2 Bob
3 Carl
``:code
------
Due to Python restrictions in overloading "``and``" and "``or``" operators, these cannot be used in forming queries. The binary operators "``&``" and "``|``" must be used instead. Note that these operators (unlike "``and``" and "``or``") have higher precedence than comparison operators, so the "extra" parentheses in the above examples are mandatory. Similarly, the unary operator "``~``" has higher precedence than comparison operators, so ``~``-negated comparisons must also be parenthesized.
------

### Supported databases
web2py comes with a Database Abstraction Layer (DAL), an API that maps Python objects into database objects such as queries, tables, and records. The DAL dynamically generates the SQL in real time using the specified dialect for the database back end, so that you do not have to write SQL code or learn different SQL dialects (the term SQL is used generically), and the application will be portable among different types of databases. At the time of this writing, the supported databases are SQLite (which comes with Python and thus web2py), PostgreSQL, MySQL, Oracle, MSSQL, FireBird, DB2, Informix, Ingres, MongoDB, and the Google App Engine (SQL and NoSQL). Experimentally we support more databases. Please check on the web2py web site and mailing list for more recent adapters. Google NoSQL is treated as a particular case in Chapter 13.
The Windows binary distribution works out of the box with SQLite and MySQL. The Mac binary distribution works out of the box with SQLite.
To use any other database back-end, run from the source distribution and install the appropriate driver for the required back end.
``database drivers``:inxx
Once the proper driver is installed, start web2py from source, and it will find the driver. Here is a list of drivers:
``DAL``:inxx ``SQLite``:inxx ``MySQL``:inxx ``PostgresSQL``:inxx ``Oracle``:inxx ``MSSQL``:inxx ``FireBird``:inxx ``DB2``:inxx ``Informix``:inxx ``Sybase``:inxx ``Teradata``:inxx ``MongoDB``:inxx ``CouchDB``:inxx ``SAPDB``:inxx ``Cubrid``:inxx
----------
database | drivers (source)
SQLite | sqlite3 or pysqlite2 or zxJDBC ``zxjdbc``:cite (on Jython)
PostgreSQL | psycopg2 ``psycopg2``:cite or pg8000 ``pg8000``:cite or zxJDBC ``zxjdbc``:cite (on Jython)
MySQL | pymysql ``pymysql``:cite or MySQLdb ``mysqldb``:cite
Oracle | cx_Oracle ``cxoracle``:cite
MSSQL | pyodbc ``pyodbc``:cite
FireBird | kinterbasdb ``kinterbasdb``:cite or fdb or pyodbc
DB2 | pyodbc ``pyodbc``:cite
Informix | informixdb ``informixdb``:cite
Ingres | ingresdbi ``ingresdbi``:cite
----------
Not all of them are relevant for every field. "length" is relevant only for fields of type "string". The default value of ``requires`` depends on the field type; for example:
----------
field type | default field validators
``time`` | ``IS_TIME()``
``datetime`` | ``IS_DATETIME()``
``password`` | ``None``
``upload`` | ``None``
``reference <table>`` | ``IS_IN_DB(db,table.field,format)``
``list:string`` | ``None``
``list:integer`` | ``None``
``list:reference <table>`` | ``IS_IN_DB(db,table.field,format,multiple=True)``
``json`` | ``IS_JSON()``
``bigint`` | ``None``
``big-id`` | ``None``
``big-reference`` | ``None``
---------
``decimal(n,m)`` requires and returns values as ``Decimal`` objects, as defined in the Python ``decimal`` module. SQLite does not handle the ``decimal`` type so internally we treat it as a ``double``. The (n,m) are the number of digits in total and the number of digits after the decimal point respectively.
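For instance, a minimal sketch of a decimal field (the ``product`` table is illustrative):
``
from decimal import Decimal
db.define_table('product', Field('price', 'decimal(10,2)'))
db.product.insert(price=Decimal('9.99'))
``:code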
The ``big-id`` and ``big-reference`` types are only supported by some of the database engines and are experimental. They are not normally used as field types except for legacy tables; however, the DAL constructor has a ``bigint_id`` argument that, when set to ``True``, makes the ``id`` fields and ``reference`` fields ``big-id`` and ``big-reference`` respectively.
The ``list:`` fields are special because they are designed to take advantage of certain denormalization features on NoSQL (in the case of Google App Engine NoSQL, the field types ``ListProperty`` and ``StringListProperty``) and back-port them to all the other supported relational databases. On relational databases lists are stored as a ``text`` field. The items are separated by a ``|`` and each ``|`` in a string item is escaped as a ``||``. They are discussed in their own section.
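A quick sketch of a ``list:`` field (the ``post`` table is illustrative); on a relational back end the tags below are stored in a single ``text`` column using ``|`` separators:
``
db.define_table('post', Field('tags', 'list:string'))
db.post.insert(tags=['web2py', 'dal'])
row = db(db.post).select().first()
print row.tags   # ['web2py', 'dal']
``:code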
The ``json`` field type is pretty much self-explanatory. It can store any json serializable object. It is designed to work specifically for MongoDB and is backported to the other database adapters for portability.
-------
Notice that ``requires=...`` is enforced at the level of forms, ``required=True`` is enforced at the level of the DAL (insert), while ``notnull``, ``unique`` and ``ondelete`` are enforced at the level of the database. While they sometimes may seem redundant, it is important to maintain the distinction when programming with the DAL.
-------

The value of migrate is the filename (in the "databases" folder for the application) where web2py stores internal migration information for this table.
These files are very important and should never be removed while the corresponding tables exist. In cases where a table has been dropped and the corresponding file still exists, it can be removed manually. By default, migrate is set to True. This causes web2py to generate the filename from a hash of the connection string. If migrate is set to False, the migration is not performed, and web2py assumes that the table exists in the datastore and it contains (at least) the fields listed in ``define_table``.
The best practice is to give an explicit name to the migrate table.

Notice that by default web2py uses utf8 character encoding for databases. If you work with existing databases that behave differently, you have to change it with the optional parameter ``db_codec`` like

``
db = DAL('...', db_codec='latin1')
``:code

otherwise you'll get UnicodeDecodeError tickets.

The ``executesql`` method lets you issue explicit SQL statements. The return values are not parsed or transformed by the DAL, and the format depends on the specific database driver. This usage with selects is normally not needed, but it is more common with indexes.
``executesql`` takes four optional arguments: ``placeholders``, ``as_dict``, ``fields`` and ``colnames``.
``placeholders`` is an optional sequence of values to be substituted in or, if supported by the DB driver, a dictionary with keys matching named placeholders in your SQL.
If ``as_dict`` is set to True, the results cursor returned by the DB driver will be converted to a sequence of dictionaries keyed with the db field names. Results returned with ``as_dict=True`` are the same as those returned when applying **.as_list()** to a normal select:
``
[{field1: value1, field2: value2}, {field1: value1b, field2: value2b}]
``:code
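For instance, a sketch assuming the ``person`` table defined earlier (the placeholder style, here ``%s``, depends on the DB driver):
``
rows = db.executesql('SELECT name FROM person WHERE id > %s;',
    placeholders=[3], as_dict=True)
``:code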
The ``fields`` argument is a list of DAL Field objects that match the
fields returned from the DB. The Field objects should be part of one or
more Table objects defined on the DAL object. The ``fields`` list can
include one or more DAL Table objects in addition to or instead of
including Field objects, or it can be just a single table (not in a
list). In that case, the Field objects will be extracted from the
table(s).
Instead of specifying the ``fields`` argument, the ``colnames`` argument
can be specified as a list of field names in tablename.fieldname format.
Again, these should represent tables and fields defined on the DAL
object.
It is also possible to specify both ``fields`` and the associated
``colnames``. In that case, ``fields`` can also include DAL Expression
objects in addition to Field objects. For Field objects in "fields",
the associated ``colnames`` must still be in tablename.fieldname format.
For Expression objects in ``fields``, the associated ``colnames`` can
be any arbitrary labels.
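As a sketch of the ``fields`` variant, passing a single table (assuming ``person`` is defined on ``db``) lets the DAL parse the raw results into a regular Rows object:
``
rows = db.executesql('SELECT id, name FROM person;', fields=db.person)
for row in rows: print row.name
``:code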
The SQLTABLE constructor takes the following optional arguments:
- ``headers`` a dictionary mapping field names to their labels to be used as headers (defaults to ``{}``). It can also be an instruction; currently we support ``headers='fieldname:capitalize'``.
- ``truncate`` the number of characters for truncating long values in the table (default is 16)
- ``columns`` the list of fieldnames to be shown as columns (in tablename.fieldname format).
Those not listed are not displayed (defaults to all).
- ``**attributes`` generic helper attributes to be passed to the most external TABLE object.
Here is an example:
``
{{extend 'layout.html'}}
<h1>Records</h1>
{{=SQLTABLE(rows,
     headers='fieldname:capitalize',
     truncate=100,
     upload=URL('download'))
}}
``:code
``SQLFORM.grid``:inxx ``SQLFORM.smartgrid``:inxx
------
``SQLTABLE`` is useful but there are times when one needs more. ``SQLFORM.grid`` is an extension of SQLTABLE that creates a table with search features and pagination, as well as ability to open detailed records, create, edit and delete records. ``SQLFORM.smartgrid`` is a further generalization that allows all of the above but also creates buttons to access referencing records.
------

Sometimes you may need to generate SQL as if you had a connection, but without actually connecting to the database. This can be done with
``
db = DAL('...', do_connect=False)
``:code
In this case you will be able to call ``_select``, ``_insert``, ``_update``, and ``_delete`` to generate SQL but not call ``select``, ``insert``, ``update``, and ``delete``. In most of the cases you can use ``do_connect=False`` even without having the required database drivers.
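For example, a minimal sketch (the exact SQL string produced depends on the adapter; ``migrate=False`` avoids touching any metadata):
``
db = DAL('sqlite://storage.db', do_connect=False)
db.define_table('person', Field('name'), migrate=False)
print db(db.person.id > 0)._select(db.person.name)
``:code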
#### Connection pooling
``connection pooling``:inxx
The second argument of the DAL constructor is the ``pool_size``; it defaults to zero.
As it is rather slow to establish a new database connection for each request, web2py implements a mechanism for connection pooling. Once a connection is established and the page has been served and the transaction completed, the connection is not closed but goes into a pool. When the next http request arrives, web2py tries to obtain a connection from the pool and use that for the new transaction. If there are no available connections in the pool, a new connection is established.
The ``pool_size`` parameter is ignored by SQLite and Google App Engine.
Connections in the pools are shared sequentially among threads, in the sense that they may be used by two different but not simultaneous threads. There is only one pool for each web2py process.
When web2py starts, the pool is always empty. The pool grows up to the minimum between the value of ``pool_size`` and the max number of concurrent requests. This means that if ``pool_size=10`` but our server never receives more than 5 concurrent requests, then the actual pool size will only grow to 5. If ``pool_size=0`` then connection pooling is not used.
Connection pooling is ignored for SQLite, since it would not yield any benefit.
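For example, with a hypothetical connection string:
``
db = DAL('postgres://user:password@localhost/mydb', pool_size=10)
``:code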
#### Connection failures
If web2py fails to connect to the database it waits 1 second and then tries again, up to 5 times, before declaring a failure. In case of connection pooling it is possible that a pooled connection that stays open but unused for some time is closed by the database end. Thanks to the retry feature web2py tries to re-establish these dropped connections.
It is also possible to build queries using in-place logical operators:
``
>>> query = db.person.name!='Alex'
>>> query &= db.person.id>3
>>> query |= db.person.name=='John'
``
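The composed query can then be used like any other:
``
rows = db(query).select()
``:code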
#### ``count``, ``isempty``, ``delete``, ``update``
You can count records in a set:
``count``:inxx ``isempty``:inxx
``
>>> print db(db.person.id > 0).count()
3
``:code
Notice that ``count`` takes an optional ``distinct`` argument which defaults to False, and it works very much like the same argument for ``select``. ``count`` has also a ``cache`` argument that works very much like the equivalent argument of the ``select`` method.
Sometimes you may need to check if a table is empty. A more efficient way than counting is using the ``isempty`` method:
``
>>> print db(db.person.id > 0).isempty()
False
``:code
or equivalently:
``
>>> print db(db.person).isempty()
False
``:code
You can delete records in a set:
``delete``:inxx
``
>>> db(db.person.id > 3).delete()
``:code
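You can also ``update`` all records in a set by passing named arguments corresponding to the fields to be updated:
``
>>> db(db.person.id > 3).update(name='Ken')
``:code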
#### Combining rows
Row objects can be combined at the Python level. Here we assume:
``
>>> print rows1
person.name
Max
Tim
>>> print rows2
person.name
John
Tim
``
You can do a union of the records in two sets of rows:
``
>>> rows3 = rows1 & rows2
>>> print rows3
name
Max
Tim
John
Tim
``:code
You can do a union of the records removing duplicates:
``
>>> rows3 = rows1 | rows2
>>> print rows3
name
Max
Tim
John
``:code
#### ``find``, ``exclude``, ``sort``
``find``:inxx ``exclude``:inxx ``sort``:inxx
There are times when one needs to perform two selects and one contains a subset of a previous select. In this case it is pointless to access the database again. The ``find``, ``exclude`` and ``sort`` methods allow you to manipulate a Rows object and generate another one without accessing the database. More specifically:
- ``find`` returns a new set of Rows filtered by a condition and leaves the original unchanged.
- ``exclude`` returns a new set of Rows filtered by a condition and removes them from the original Rows.
- ``sort`` returns a new set of Rows sorted by a condition and leaves the original unchanged.
All these methods take a single argument, a function that acts on each individual row.
For example (assuming the ``person`` table contains Alex, John and Max; ``exclude`` removes the matching rows from the original Rows and returns them):
``
>>> rows = db(db.person).select()
>>> for row in rows.exclude(lambda row: row.name[0]=='M'):
        print row.name
Max
>>> print len(rows)
2
>>> for row in rows.sort(lambda row: row.name):
        print row.name
Alex
John
``:code
They can be combined:
``
>>> rows = db(db.person).select()
>>> rows = rows.find(
        lambda row: 'x' in row.name).sort(
        lambda row: row.name)
>>> for row in rows:
        print row.name
Alex
Max
``:code
``sort`` takes an optional argument ``reverse=True`` with the obvious meaning.
The ``find`` method has an optional ``limitby`` argument with the same syntax and functionality as the Set ``select`` method.
### Other methods
#### ``update_or_insert``
``update_or_insert``:inxx
Sometimes you need to perform an insert only if there is no record with the same values as those being inserted.
This can be done with
``
db.define_table('person',Field('name'),Field('birthplace'))
db.person.update_or_insert(name='John',birthplace='Chicago')
``:code
The record will be inserted only if there is no other person called John born in Chicago.
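You can also specify which values must match by passing a query as the first argument:
``
db.person.update_or_insert(db.person.name=='John',
    name='John', birthplace='Chicago')
``:code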
The SQL IN operator is realized via the ``belongs`` method, which returns true when the field value belongs to the specified set (list or tuple):
``belongs``:inxx
``
>>> for row in db(db.log.severity.belongs((1, 2))).select():
print row.event
port scan
xss injection
``:code
The DAL also allows a nested select as the argument of the belongs operator. The only caveat is that the nested select has to be a ``_select``, not a ``select``, and only one field has to be selected explicitly, the one that defines the set.
``nested select``:inxx
``
>>> bad_days = db(db.log.severity==3)._select(db.log.event_time)
>>> for row in db(db.log.event_time.belongs(bad_days)).select():
print row.event
port scan
xss injection
unauthorized login
``:code
In those cases where a nested select is required and the look-up field is a reference we can also use a query as argument. For example:
``
db.define_table('person',Field('name'))
db.define_table('thing',Field('name'),Field('owner','reference person'))
db(db.thing.owner.belongs(db.person.name=='Jonathan')).select()
``:code
In this case it is obvious that the next select only needs the field referenced by the ``db.thing.owner`` field so we do not need the more verbose ``_select`` notation.
``nested_select``:inxx
A nested select can also be used as insert/update value but in this case the syntax is different:
``
lazy = db(db.person.name=='Jonathan').nested_select(db.person.id)
db(db.thing.id==1).update(owner = lazy)
``:code
In this case ``lazy`` is a nested expression that computes the ``id`` of person "Jonathan". The two lines result in one single SQL query.
A query on tables that define common filters can bypass them with:
``
db(query, ignore_common_filters=True).select(...)
``:code
``SQLCustomType``:inxx
It is possible to define new/custom field types. For example, consider a field that contains binary data in compressed form:
``
from gluon.dal import SQLCustomType
import zlib

compressed = SQLCustomType(
    type='text',
    native='text',
    encoder=(lambda x: zlib.compress(x or '')),
    decoder=(lambda x: zlib.decompress(x))
)

db.define_table('example', Field('data', type=compressed))
``:code
``SQLCustomType`` is a field type factory. Its ``type`` argument must be one of the standard web2py types. It tells web2py how to treat the field values at the web2py level. ``native`` is the name of the field as far as the database is concerned. Allowed names depend on the database engine. ``encoder`` is an optional transformation function applied when the data is stored and ``decoder`` is the optional reversed transformation function.
This feature is marked as experimental. In practice it has been in web2py for a long time and it works, but it can make your code non-portable, for example when the native type is database-specific. It does not work on Google App Engine NoSQL.
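A usage sketch (values are passed through the ``encoder`` on insert and the ``decoder`` on select):
``
db.example.insert(data='some long text to be compressed')
row = db(db.example.id > 0).select().first()
print row.data  # transparently decompressed
``:code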
#### Using DAL without define tables
The DAL can be used from any Python program simply by doing this:
``
from gluon import DAL, Field
db = DAL('sqlite://storage.sqlite',folder='path/to/app/databases')
``:code
i.e. import the DAL and Field, connect, and specify the folder which contains the .table files (the app/databases folder).
To access the data and its attributes we still have to define all the tables we are going to access with ``db.define_table(...)``.
If we just need access to the data but not to the web2py table attributes, we get away without re-defining the tables by simply asking web2py to read the necessary info from the metadata in the .table files:
``
from gluon import DAL, Field
db = DAL('sqlite://storage.sqlite',folder='path/to/app/databases',
    auto_import=True)
``:code
This allows us to access any ``db.table`` without the need to re-define it.
For any adapter you can replace the driver with a different one:
``
from gluon.dal import MySQLAdapter
MySQLAdapter.driver = mysqldb
``:code
and you can specify optional driver arguments and adapter arguments:
``
db = DAL(..., driver_args={}, adapter_args={})
``:code
#### Gotchas
**SQLite** does not support dropping and altering columns. That means that web2py migrations will work up to a point. If you delete a field from a table, the column will remain in the database but be invisible to web2py. If you decide to reinstate the column, web2py will try to re-create it and fail. In this case you must set ``fake_migrate=True`` so that metadata is rebuilt without attempting to add the column again. Also, for the same reason, **SQLite** is not aware of any change of column type. If you insert a number in a string field, it will be stored as a string. If you later change the model and replace the type "string" with type "integer", SQLite will continue to keep the number as a string and this may cause problems when you try to extract the data.
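A minimal sketch of the ``fake_migrate`` fix (for a hypothetical ``person`` table whose metadata is out of sync):
``
db.define_table('person', Field('name'), fake_migrate=True)
``:code
Once the metadata has been rebuilt, the ``fake_migrate=True`` argument can be removed.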
**MySQL** does not support multiple ALTER TABLE within a single transaction. This means that any migration process is broken into multiple commits. If something happens that causes a failure it is possible to break a migration (the web2py metadata are no longer in sync with the actual table structure in the database). This is unfortunate but it can be prevented (migrate one table at the time) or it can be fixed a posteriori (revert the web2py model to what corresponds to the table structure in database, set ``fake_migrate=True`` and after the metadata has been rebuilt, set ``fake_migrate=False`` and migrate the table again).
**Google SQL** has the same problems as MySQL and more. In particular, table metadata itself must be stored in the database in a table that is not migrated by web2py. This is because Google App Engine has a read-only file system. Web2py migrations in Google:SQL combined with the MySQL issue described above can result in metadata corruption. Again, this can be prevented (by migrating the table all at once and then setting migrate=False so that the metadata table is not accessed any more) or it can be fixed a posteriori (by accessing the database using the Google dashboard and deleting any corrupted entry from the table called ``web2py_filesystem``).
``limitby``:inxx
**MSSQL** does not support the SQL OFFSET keyword. Therefore the database cannot do pagination. When doing a ``limitby=(a,b)`` web2py will fetch the first ``b`` rows and discard the first ``a``. This may result in a considerable overhead when compared with other database engines.
**Oracle** also does not support pagination. It supports neither the OFFSET nor the LIMIT keywords. Web2py achieves pagination by translating a ``db(...).select(limitby=(a,b))`` into a complex three-way nested select (as suggested by official Oracle documentation). This works for simple selects but may break for complex selects involving aliased fields and/or joins.
**MSSQL** has problems with circular references in tables that have ONDELETE CASCADE. This is an MSSQL bug and you can work around it by setting the ondelete attribute for all reference fields to "NO ACTION". You can also do it once and for all before you define tables:
``
db = DAL('mssql://....')
for key in ['reference','reference FK']:
    db._adapter.types[key] = db._adapter.types[key].replace(
        '%(on_delete_action)s', 'NO ACTION')
``:code
**MSSQL** also has problems with arguments passed to the DISTINCT keyword and therefore
while this works,
``
db(query).select(distinct=True)
``
this does not
``
db(query).select(distinct=db.mytable.myfield)
``
**Google NoSQL (Datastore)** does not allow joins, left joins, aggregates, expressions, OR involving more than one table, the like operator, or search in "text" fields. Transactions are limited and not provided automatically by web2py (you need to use the Google API ``run_in_transaction``, which you can look up in the Google App Engine documentation online). Google also limits the number of records you can retrieve in each query (1000 at the time of writing). On the Google datastore record IDs are integers but they are not sequential. While on SQL the "list:string" type is mapped into a "text" type, on the Google Datastore it is mapped into a ``StringListProperty``. Similarly "list:integer" and "list:reference" are mapped into "ListProperty". This makes searches for content inside these field types more efficient on Google NoSQL than on SQL databases.

----------
**field type** | **default field validators**
``string`` | ``IS_LENGTH(length)`` default length is 512
``text`` | ``IS_LENGTH(65536)``
``blob`` | ``None``
``boolean`` | ``None``
``integer`` | ``IS_INT_IN_RANGE(-1e100, 1e100)``
``double`` | ``IS_FLOAT_IN_RANGE(-1e100, 1e100)``
``decimal(n,m)`` | ``IS_DECIMAL_IN_RANGE(-1e100, 1e100)``
``date`` | ``IS_DATE()``
``time`` | ``IS_TIME()``
``datetime`` | ``IS_DATETIME()``
``password`` | ``None``
``upload`` | ``None``
``reference <table>`` | ``IS_IN_DB(db,table.field,format)``
``list:string`` | ``None``
``list:integer`` | ``None``
``list:reference <table>`` | ``IS_IN_DB(db,table.field,format,multiple=True)``
``json`` | ``IS_JSON()``
``bigint`` | ``None``
``big-id`` | ``None``
``big-reference`` | ``None``
---------
Decimal requires and returns values as ``Decimal`` objects, as defined in the Python ``decimal`` module. SQLite does not handle the ``decimal`` type so internally we treat it as a ``double``. The (n,m) are the number of digits in total and the number of digits after the decimal point respectively.
The ``big-id`` and ``big-reference`` types are only supported by some of the database engines and are experimental. They are not normally used as field types except for legacy tables; however, the DAL constructor has a ``bigint_id`` argument that, when set to ``True``, makes the ``id`` fields and ``reference`` fields ``big-id`` and ``big-reference`` respectively.
The ``list:`` fields are special because they are designed to take advantage of certain denormalization features on NoSQL (in the case of Google App Engine NoSQL, the field types ``ListProperty`` and ``StringListProperty``) and back-port them to all the other supported relational databases. On relational databases lists are stored as a ``text`` field. The items are separated by a ``|`` and each ``|`` in a string item is escaped as ``||``. They are discussed in their own section.
-------
Notice that ``requires=...`` is enforced at the level of forms, ``required=True`` is enforced at the level of the DAL (insert), while ``notnull``, ``unique`` and ``ondelete`` are enforced at the level of the database. While they sometimes may seem redundant, it is important to maintain the distinction when programming with the DAL.
-------
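As a sketch of the ``list:`` types, consider a hypothetical ``product`` table:
``
db.define_table('product',
    Field('name'),
    Field('colors', 'list:string'))
db.product.insert(name='shirt', colors=['red', 'blue'])
``:code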

For example, given a table ``db.item`` with fields ``unit_price`` and ``quantity``, one can define a virtual field on the fly:
``
>>> db.item.total_price = Field.Virtual(
        lambda row: row.item.unit_price*row.item.quantity)
``:code
i.e. by simply defining a new field ``total_price`` to be a ``Field.Virtual``. The only argument of the constructor is a function that takes a row and returns the computed values.
A virtual field defined as the one above is automatically computed for all records when the records are selected:
``
>>> for row in db(db.item).select(): print row.total_price
``
It is also possible to define method fields which are calculated on-demand, when called.
For example:
``
>>> db.item.discounted_total = Field.Method(lambda row, discount=0.0: \
        row.item.unit_price*row.item.quantity*(1.0-discount/100))
``:code
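In this case ``row.discounted_total`` is not a value but a function, evaluated on demand; a usage sketch with a 15% discount:
``
>>> for row in db(db.item).select():
        print row.discounted_total(15)
``:code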

- ``ondelete`` translates into the "ON DELETE" SQL statement. By default it is set to "CASCADE". This tells the database that when it deletes a record, it should also delete all records that refer to it. To disable this feature, set ``ondelete`` to "NO ACTION" or "SET NULL".
- ``notnull=True`` translates into the "NOT NULL" SQL statement. It prevents the database from inserting null values for the field.
- ``unique=True`` translates into the "UNIQUE" SQL statement and it makes sure that values of this field are unique within the table. It is enforced at the database level.
- ``uploadfield`` applies only to fields of type "upload". A field of type "upload" stores the name of a file saved somewhere else, by default on the filesystem under the application "uploads/" folder. If ``uploadfield`` is set, then the file is stored in a blob field within the same table and the value of ``uploadfield`` is the name of the blob field. This will be discussed in more detail later in the context of SQLFORM.
- ``uploadfolder`` defaults to the application's "uploads/" folder. If set to a different path, files will be uploaded to a different folder. For example, ``uploadfolder=os.path.join(request.folder,'static/temp')`` will upload files to the web2py/applications/myapp/static/temp folder.
- ``uploadseparate`` if set to True will upload files under different subfolders of the ``uploadfolder`` folder. This is optimized to avoid too many files under the same folder/subfolder. ATTENTION: You cannot change the value of ``uploadseparate`` from True to False without breaking the system. web2py either uses the separate subfolders or it does not. Changing the behavior after files have been uploaded will prevent web2py from being able to retrieve those files. If this happens it is possible to move files and fix the problem, but this is not described here.
- ``uploadfs`` allows you to specify a different file system where to upload files, including an Amazon S3 storage or a remote FTP storage. This option requires PyFileSystem installed. ``uploadfs`` must point to ``PyFileSystem``. ``PyFileSystem``:inxx ``uploadfs``:inxx
- ``widget`` must be one of the available widget objects, including custom widgets, for example: ``SQLFORM.widgets.string.widget``. A list of available widgets will be discussed later. Each field type has a default widget.
- ``label`` is a string (or something that can be serialized to a string) that contains the label to be used for this field in autogenerated forms.
- ``comment`` is a string (or something that can be serialized to a string) that contains a comment associated with this field, and will be displayed to the right of the input field in the autogenerated forms.
- ``writable`` if a field is writable, it can be edited in autogenerated create and update forms.
- ``readable`` if a field is readable, it will be visible in readonly forms. If a field is neither readable nor writable, it will not be displayed in create and update forms.
- ``update`` contains the default value for this field when the record is updated.
- ``compute`` is an optional function. If a record is inserted or updated, the compute function will be executed and the field will be populated with the function result. The record is passed to the compute function as a ``dict``, and the dict will not include the current value of that, or any other compute field.
- ``authorize`` can be used to require access control on the corresponding field, for "upload" fields only. It will be discussed more in detail in the context of Authentication and Authorization.
- ``autodelete`` determines if the corresponding uploaded file should be deleted when the record referencing the file is deleted. For "upload" fields only.
- ``represent`` can be None or can point to a function that takes a field value and returns an alternate representation for the field value. Examples:
``
db.mytable.name.represent = lambda name,row: name.capitalize()
db.mytable.other_id.represent = lambda id,row: row.myfield
db.mytable.some_uploadfield.represent = lambda value,row: \
    A('get it', _href=URL('download', args=value))
``:code
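As a sketch of the ``compute`` attribute described above (hypothetical ``item`` table; ``total_price`` is filled in automatically on insert and update):
``
db.define_table('item',
    Field('unit_price', 'double'),
    Field('quantity', 'integer'),
    Field('total_price', 'double',
        compute=lambda r: r['unit_price']*r['quantity']))
``:code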
``blob``:inxx
"blob" fields are also special. By default, binary data is encoded in base64 before being stored into the actual database field, and it is decoded when extracted. This has the negative effect of using 25% more storage space than necessary in blob fields, but has two advantages. On average it reduces the amount of data communicated between web2py and the database server, and it makes the communication independent of back-end-specific escaping conventions.
Insert returns the unique "id" value of each record inserted.
You can truncate the table, i.e., delete all records and reset the counter of the id.
``truncate``:inxx
``
>>> db.person.truncate()
``:code
Now, if you insert a record again, the counter starts again at 1 (this is back-end specific and does not apply to Google NoSQL):
``
>>> db.person.insert(name="Alex")
1
``:code
#### ``as_dict`` and ``as_list``
``as_list``:inxx ``as_dict``:inxx
A Row object can be serialized into a regular dictionary using the ``as_dict()`` method and a Rows object can be serialized into a list of dictionaries using the ``as_list()`` method. Here are some examples:
``
>>> rows = db(query).select()
>>> rows_list = rows.as_list()
>>> first_row_dict = rows.first().as_dict()
``:code
These methods are convenient for passing Rows to generic views and for storing Rows in sessions (Rows objects themselves cannot be serialized because they contain a reference to an open DB connection):
``
>>> rows = db(query).select()
>>> session.rows = rows # not allowed!
>>> session.rows = rows.as_list() # allowed!
``:code
In the following example, you see a controller that caches a select on the previously defined ``db.log`` table. The results are cached in ram and the cache expires after 60 seconds:
``
def cache_db_select():
    logs = db().select(db.log.ALL, cache=(cache.ram, 60))
    return dict(logs=logs)
``:code
``cacheable``:inxx
The ``select`` method has an optional ``cacheable`` argument, normally set to ``False``. When ``cacheable=True`` the resulting ``Rows`` is serializable, but the ``Row`` objects lack ``update_record`` and ``delete_record`` methods.
If you do not need these methods you can speed up selects a lot by setting the cacheable attribute:
``
rows = db(query).select(cacheable=True)
``:code
-------
The results of a ``select`` are normally complex, un-pickleable objects; they cannot be stored in a session and cannot be cached in any other way than the one explained here unless the ``cache`` attribute is set or ``cacheable=True``.
-------
When the ``cache`` argument is set but ``cacheable=False`` (default) only the database results are cached, not the actual Rows object. When the ``cache`` argument is used in conjunction with ``cacheable=True`` the entire Rows object is cached and this results in much faster caching:
``
rows = db(query).select(cache=(cache.ram,3600),cacheable=True)
``:code
### Self-Reference and aliases
``self reference``:inxx
``alias``:inxx
It is possible to define tables with fields that refer to themselves, here is an example:
``reference table``:inxx
``
db.define_table('person',
    Field('name'),
    Field('father_id', 'reference person'),
    Field('mother_id', 'reference person'))
``:code
Notice that the alternative notation of using a table object as field type will fail in this case, because it uses a variable ``db.person`` before it is defined:
``
db.define_table('person',
    Field('name'),
    Field('father_id', db.person), # wrong!
    Field('mother_id', db.person)) # wrong!
``:code
In general ``db.tablename`` and ``"reference tablename"`` are equivalent field types, but the latter is the only one allowed for self-references.
``with_alias``:inxx
If the table refers to itself, then it is not possible to perform a JOIN to select a person and its parents without the use of the SQL "AS" keyword. This is achieved in web2py using ``with_alias``. Here is an example:
``
>>> Father = db.person.with_alias('father')
>>> Mother = db.person.with_alias('mother')
>>> db.person.insert(name='Massimo')
1
>>> db.person.insert(name='Claudia')
2
>>> db.person.insert(name='Marco', father_id=1, mother_id=2)
3
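One can then select every person together with both parents via a left join on the aliased tables (a sketch following the aliases and inserts above):
``
>>> rows = db().select(db.person.name, Father.name, Mother.name,
        left=(Father.on(Father.id==db.person.father_id),
              Mother.on(Mother.id==db.person.mother_id)))
>>> for row in rows:
        print row.person.name, row.father.name, row.mother.name
``:code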
#### PostGIS, SpatiaLite, and MS Geo (experimental)
``PostGIS``:inxx ``SpatiaLite``:inxx ``Geo Extensions``:inxx
``geometry``:inxx ``geoPoint``:inxx ``geoLine``:inxx ``geoPolygon``:inxx
The DAL supports geographical APIs using PostGIS (for PostgreSQL), SpatiaLite (for SQLite), and the MSSQL Spatial Extensions. This is a feature that was sponsored by the Sahana project and implemented by Denes Lengyel.
DAL provides geometry and geography field types and the following functions:
``st_asgeojson``:inxx ``st_astext``:inxx ``st_contained``:inxx ``st_contains``:inxx
``st_distance``:inxx ``st_equals``:inxx ``st_intersects``:inxx ``st_overlaps``:inxx
``st_simplify``:inxx ``st_touches``:inxx ``st_within``:inxx
``
st_asgeojson (PostGIS only)
st_astext
st_contained
st_contains
st_distance
st_equals
st_intersects
st_overlaps
st_simplify (PostGIS only)
st_touches
st_within
st_x
st_y
``
Here are some examples:
``
from gluon.dal import DAL, Field, geoPoint, geoLine, geoPolygon
db = DAL("mssql://user:pass@host:db")
sp = db.define_table('spatial', Field('loc','geometry()'))
``:code
Below we insert a point, a line, and a polygon:
``
sp.insert(loc=geoPoint(1,2))
sp.insert(loc=geoLine((100,100),(20,180),(180,180)))
sp.insert(loc=geoPolygon((0,0),(150,0),(150,150),(0,150),(0,0)))
``:code
Notice that
``
rows = db(sp.id>0).select()
``:code
always returns the geometry data serialized as text.
You can also do the same more explicitly using ``st_astext()``:
``
print db(sp.id>0).select(sp.id, sp.loc.st_astext())
spatial.id,spatial.loc.STAsText()
1, "POINT (1 2)"
2, "LINESTRING (100 100, 20 180, 180 180)"
3, "POLYGON ((0 0, 150 0, 150 150, 0 150, 0 0))"
``:code
You can ask for the native representation by using ``st_asgeojson()`` (in PostGIS only):
``
print db(sp.id>0).select(sp.id, sp.loc.st_asgeojson().with_alias('loc'))
spatial.id,loc
1, [1, 2]
2, [[100, 100], [20, 180], [180, 180]]
3, [[[0, 0], [150, 0], [150, 150], [0, 150], [0, 0]]]
``:code
(Notice that an array is a point, an array of arrays is a line, and an array of arrays of arrays is a polygon.)
Here are examples of how to use geographical functions:
``
query = sp.loc.st_intersects(geoLine((20,120),(60,160)))
query = sp.loc.st_overlaps(geoPolygon((1,1),(11,1),(11,11),(1,11),(1,1)))
query = sp.loc.st_contains(geoPoint(1,1))
print db(query).select(sp.id,sp.loc)
spatial.id,spatial.loc
3,"POLYGON ((0 0, 150 0, 150 150, 0 150, 0 0))"
``:code
- ``ondelete`` translates into the "ON DELETE" SQL statement. By default it is set to "CASCADE". This tells the database that when it deletes a record, it should also delete all records that refer to it. To disable this feature, set ``ondelete`` to "NO ACTION" or "SET NULL".
- ``notnull=True`` translates into the "NOT NULL" SQL statement. It prevents the database from inserting null values for the field.
- ``unique=True`` translates into the "UNIQUE" SQL statement and it makes sure that values of this field are unique within the table. It is enforced at the database level.
- ``uploadfield`` applies only to fields of type "upload". A field of type "upload" stores the name of a file saved somewhere else, by default on the filesystem under the application "uploads/" folder. If ``uploadfield`` is set, then the file is stored in a blob field within the same table and the value of ``uploadfield`` is the name of the blob field. This will be discussed in more detail later in the context of SQLFORM.
- ``uploadfolder`` defaults to the application's "uploads/" folder. If set to a different path, files will uploaded to a different folder. For example, uploadfolder=os.path.join(request.folder,'static/temp') will upload files to the web2py/applications/myapp/static/temp folder.
- ``uploadseparate`` if set to True will upload files under different subfolders of the ''uploadfolder'' folder. This is optimized to avoid too many files under the same folder/subfolder. ATTENTION: You cannot change the value of ``uploadseparate`` from True to False without breaking the system. web2py either uses the separate subfolders or it does not. Changing the behavior after files have been uploaded will prevent web2py from being able to retrieve those files. If this happens it is possible to move files and fix the problem but this is not described here.
- ``uploadfs`` allows you to specify a different filesystem to which files are uploaded, including Amazon S3 storage or remote FTP storage. This option requires PyFileSystem to be installed, and ``uploadfs`` must point to a ``PyFileSystem`` object. ``PyFileSystem``:inxx ``uploadfs``:inxx
- ``widget`` must be one of the available widget objects, including custom widgets, for example: ``SQLFORM.widgets.string.widget``. A list of available widgets will be discussed later. Each field type has a default widget.
- ``label`` is a string (or something that can be serialized to a string) that contains the label to be used for this field in autogenerated forms.
- ``comment`` is a string (or something that can be serialized to a string) that contains a comment associated with this field, and will be displayed to the right of the input field in the autogenerated forms.
- ``writable`` if a field is writable, it can be edited in autogenerated create and update forms.
- ``readable`` if a field is readable, it will be visible in readonly forms. If a field is neither readable nor writable, it will not be displayed in create and update forms.
- ``update`` contains the default value for this field when the record is updated.
- ``compute`` is an optional function. When a record is inserted or updated, the compute function is executed and the field is populated with the function result. The record is passed to the compute function as a ``dict``, which does not include the current value of that field, or of any other compute field (see the sketch after this list).
- ``authorize`` can be used to require access control on the corresponding field, for "upload" fields only. It will be discussed more in detail in the context of Authentication and Authorization.
- ``autodelete`` determines if the corresponding uploaded file should be deleted when the record referencing the file is deleted. For "upload" fields only.
- ``represent`` can be None or can point to a function that takes a field value and returns an alternate representation for the field value. Examples:
``
db.mytable.name.represent = lambda name,row: name.capitalize()
db.mytable.other_id.represent = lambda id,row: row.myfield
db.mytable.some_uploadfield.represent = lambda value,row: \
A('get it', _href=URL('download', args=value))
``:code
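Here is a minimal sketch of ``compute`` (the ``item`` table and its fields are illustrative):
``
db.define_table('item',
    Field('unit_price', 'double'),
    Field('quantity', 'integer'),
    # computed at insert/update time from the values being saved
    Field('total_price', 'double',
          compute=lambda r: r['unit_price'] * r['quantity']))
id = db.item.insert(unit_price=1.99, quantity=5)
print db.item[id].total_price  # 9.95
``:code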
``blob``:inxx
"blob" fields are also special. By default, binary data is encoded in base64 before being stored into the actual database field, and it is decoded when extracted. This has the negative effect of using 25% more storage space than necessary in blob fields, but has two advantages. On average it reduces the amount of data communicated between web2py and the database server, and it makes the communication independent of back-end-specific escaping conventions.
#### ``as_dict`` and ``as_list``
``as_list``:inxx ``as_dict``:inxx
A Row object can be serialized into a regular dictionary using the ``as_dict()`` method and a Rows object can be serialized into a list of dictionaries using the ``as_list()`` method. Here are some examples:
``
>>> rows = db(query).select()
>>> rows_list = rows.as_list()
>>> first_row_dict = rows.first().as_dict()
``:code
These methods are convenient for passing Rows objects to generic views or for storing them in sessions (Rows objects themselves cannot be serialized because they contain a reference to an open DB connection):
``
>>> rows = db(query).select()
>>> session.rows = rows # not allowed!
>>> session.rows = rows.as_list() # allowed!
``:code
#### Combining rows
Rows objects can be combined at the Python level. Here we assume:
``
>>> print rows1
person.name
Max
Tim
>>> print rows2
person.name
John
Tim
``
You can do a union of the records in two sets of rows:
``
>>> rows3 = rows1 & rows2
>>> print rows3
person.name
Max
Tim
John
Tim
``:code
You can do a union of the records removing duplicates:
``
>>> rows3 = rows1 | rows2
>>> print rows3
person.name
Max
Tim
John
``:code
Rows objects also provide ``find``, ``exclude`` and ``sort`` methods, which filter and order the records at the Python level without re-querying the database. For example, given rows containing the names Alex and John:
``
>>> print len(rows)
2
>>> for row in rows.sort(lambda row: row.name):
        print row.name
Alex
John
``:code
They can be combined:
``
>>> rows = db(db.person).select()
>>> rows = rows.find(
lambda row: 'x' in row.name).sort(
lambda row: row.name)
>>> for row in rows:
        print row.name
Alex
Max
``:code
``sort`` takes an optional argument ``reverse=True`` with the obvious meaning.
The ``find`` method has an optional ``limitby`` argument with the same syntax and functionality as the Set ``select`` method.
### Other methods
#### ``update_or_insert``
``update_or_insert``:inxx
Sometimes you need to perform an insert only if there is no record with the same values as those being inserted.
This can be done with
``
db.define_table('person',Field('name'),Field('birthplace'))
db.person.update_or_insert(name='John',birthplace='Chicago')
``:code
The record will be inserted only if there is no other person called John born in Chicago.
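You can also specify which values to use as a key to determine whether the record exists, by passing a query as first argument:
``
db.person.update_or_insert(db.person.name=='John',
     name='John', birthplace='Chicago')
``:code
If a person called John exists, his birthplace is updated; otherwise a new record is created.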
In the following example, you see a controller that caches a select on the previously defined ``db.log`` table. The select retrieves data from the back-end database no more often than once every 60 seconds and stores the result in ``cache.ram``:
``
def cache_db_select():
    logs = db().select(db.log.ALL, cache=(cache.ram, 60))
    return dict(logs=logs)
``:code
``cacheable``:inxx
The ``select`` method has an optional ``cacheable`` argument, normally set to ``False``. When ``cacheable=True`` the resulting ``Rows`` object is serializable, but the ``Row``s lack ``update_record`` and ``delete_record`` methods.
If you do not need these methods you can speed up selects a lot by setting the cacheable attribute:
``
rows = db(query).select(cacheable=True)
``:code
-------
The results of a ``select`` are normally complex, un-pickleable objects; they cannot be stored in a session and cannot be cached in any other way than the one explained here.
-------
When the ``cache`` argument is set but ``cacheable=False`` (default) only the database results are cached, not the actual Rows object. When the ``cache`` argument is used in conjunction with ``cacheable=True`` the entire Rows object is cached and this results in much faster caching:
``
rows = db(query).select(cache=(cache.ram,3600),cacheable=True)
``:code
### Self-Reference and aliases
``self reference``:inxx
``alias``:inxx
It is possible to define tables with fields that refer to themselves, here is an example:
``reference table``:inxx
``
db.define_table('person',
Field('name'),
Field('father_id', 'reference person'),
Field('mother_id', 'reference person'))
``:code
Notice that the alternative notation of using a table object as field type will fail in this case, because it uses the variable ``db.person`` before it is defined:
``
db.define_table('person',
Field('name'),
Field('father_id', db.person), # wrong!
Field('mother_id', db.person)) # wrong!
``:code
In general ``db.tablename`` and ``"reference tablename"`` are equivalent field types, but the latter is the only one allowed for self-references.
``with_alias``:inxx
If the table refers to itself, then it is not possible to perform a JOIN to select a person and its parents without using the SQL "AS" keyword. This is achieved in web2py using ``with_alias``. Here is an example:
``
>>> Father = db.person.with_alias('father')
>>> Mother = db.person.with_alias('mother')
>>> db.person.insert(name='Massimo')
1
>>> db.person.insert(name='Claudia')
2
>>> db.person.insert(name='Marco', father_id=1, mother_id=2)
3
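>>> rows = db().select(db.person.name, Father.name, Mother.name,
        left=(Father.on(Father.id==db.person.father_id),
              Mother.on(Mother.id==db.person.mother_id)))
>>> for row in rows:
        print row.person.name, row.father.name, row.mother.name
Marco Massimo Claudia
``:code
Notice that we have chosen to make a distinction between "father_id" (the name of the field in table "person") and "father" (the alias we want to use for the table referenced by that field); the same applies to "mother_id" and "mother".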

#### Inserting and updating from a dictionary

A common issue consists of needing to insert or update records in a table where the name of the table, the field to be updated, and the value for the field are all stored in variables. For example: ``tablename``, ``fieldname``, and ``value``.

The insert can be done using the following syntax:

``
db[tablename].insert(**{fieldname:value})
``:code

The update of a record with a given id can be done with: ``_id``:inxx

``
db(db[tablename]._id==id).update(**{fieldname:value})
``:code

Notice we used ``table._id`` instead of ``table.id``. In this way the query works even for tables with a field of type "id" which has a name other than "id".

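Putting it together, a small sketch (the variable values are illustrative):
``
tablename, fieldname, value = 'person', 'name', 'Tim'
id = db[tablename].insert(**{fieldname: value})
db(db[tablename]._id == id).update(**{fieldname: 'Timothy'})
``:code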
#### ``first`` and ``last``
``first``:inxx ``last``:inxx
``
>>> rows = db(query).select()
>>> first_row = rows.first()
>>> last_row = rows.last()
``:code
are equivalent to
``
>>> first_row = rows[0] if len(rows)>0 else None
>>> last_row = rows[-1] if len(rows)>0 else None
``:code

When connecting to a legacy database, a table without an auto-increment id field can still be used if it has a primary key, by passing a ``primarykey`` attribute to ``define_table`` (see the example after this list):
- ``primarykey`` is a list of the field names that make up the primary key.
- All primarykey fields have a ``NOT NULL`` set even if not specified.
- Keyed tables can only reference other keyed tables.
- Referencing fields must use the ``reference tablename.fieldname`` format.
- The ``update_record`` function is not available for Rows of keyed tables.
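For example, a legacy table accessed via ``primarykey`` (``migrate=False`` because the table already exists in the legacy database):
``
db.define_table('account',
    Field('accnum', 'integer'),
    Field('acctype'),
    Field('accdesc'),
    primarykey=['accnum', 'acctype'],
    migrate=False)
``:code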
-------
Note that currently this is only available for DB2, MS-SQL, Ingres and Informix, but others can be easily added.
-------
At the time of writing, we cannot guarantee that the ``primarykey`` attribute works with every existing legacy table and every supported database backend.
For simplicity, we recommend, if possible, creating a database view that has an auto-increment id field.
### Distributed transaction
``distributed transactions``:inxx
------
At the time of writing this feature is only supported
by PostgreSQL, MySQL and Firebird, since they expose an API for two-phase commits.
------
Assuming you have two (or more) connections to distinct PostgreSQL databases, for example:
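``
db_a = DAL('postgres://...')
db_b = DAL('postgres://...')
``:code
In your models or controllers, you can commit them concurrently with:
``
DAL.distributed_transaction_commit(db_a, db_b)
``:code
On failure, this function rolls back and raises an ``Exception``.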
#### Table inheritance
``table inheritance``:inxx
It is possible to define a table that contains all the fields of another table. It is sufficient to pass the other table in place of a field to ``define_table``:
``
db.define_table('person', Field('name'))
db.define_table('doctor', db.person, Field('specialization'))
``:code
``dummy table``:inxx
It is also possible to define a dummy table that is not stored in a database in order to reuse it in multiple other places. For example:
``
signature = db.Table(db, 'signature',
Field('created_on', 'datetime', default=request.now),
Field('created_by', db.auth_user, default=auth.user_id),
Field('updated_on', 'datetime', update=request.now),
Field('updated_by', db.auth_user, update=auth.user_id))
db.define_table('payment', Field('amount', 'double'), signature)
``:code
This example assumes that standard web2py authentication is enabled.
Notice that if you use ``Auth`` web2py already creates one such table for you:
``
auth = Auth(db)
db.define_table('payment', Field('amount', 'double'), auth.signature)
``
When using table inheritance, if you want the inheriting table to inherit validators, be sure to define the validators of the parent table before defining the inheriting table.
#### ``filter_in`` and ``filter_out``
``filter_in``:inxx ``filter_out``:inxx
It is possible to define a filter for each field to be called before a value is inserted into the database for that field and after a value is retrieved from the database.
Imagine for example that you want to store a serializable Python data structure in a field in the json format. Here is how it could be accomplished:
``
>>> from simplejson import loads, dumps
>>> db.define_table('anyobj',Field('name'),Field('data','text'))
>>> db.anyobj.data.filter_in = lambda obj, dumps=dumps: dumps(obj)
>>> db.anyobj.data.filter_out = lambda txt, loads=loads: loads(txt)
>>> myobj = ['hello', 'world', 1, {2: 3}]
>>> id = db.anyobj.insert(name='myobjname', data=myobj)
>>> row = db.anyobj(id)
>>> row.data
['hello', 'world', 1, {2: 3}]
``:code
### Record versioning
``_enable_record_versioning``:inxx
It is possible to ask web2py to save every copy of a record when the record is modified. There are different ways to do it, and it can be done for all tables at once using the syntax:
``
auth.enable_record_versioning(db)
``:code
this requires Auth and it is discussed in the chapter about authentication.
It can also be done for each individual table as discussed below.
Consider the following table:
``
db.define_table('stored_item',
Field('name'),
Field('quantity','integer'),
Field('is_active','boolean',
writable=False,readable=False,default=True))
``:code
Notice the hidden boolean field called ``is_active``, which defaults to True. When a record in a versioned table is deleted, it is not actually deleted: it is marked as inactive (``is_active=False``) and hidden by a common filter. Sets on such tables provide an ``update_naive`` method that works like ``update`` but ignores the common filter.
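Versioning can also be enabled for an individual table; a minimal sketch (keyword names as in the ``_enable_record_versioning`` API, values illustrative):
``
db.stored_item._enable_record_versioning(
    archive_db=db,                        # where to store old versions
    archive_name='stored_item_archive',   # name of the archive table
    current_record='current_record',      # field linking archive to record
    is_active='is_active')
``:code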


The second argument of the DAL constructor is the ``pool_size``; it defaults to zero.
As it is rather slow to establish a new database connection for each request, web2py implements a mechanism for connection pooling. Once a connection is established and the page has been served and the transaction completed, the connection is not closed but goes into a pool. When the next http request arrives, web2py tries to obtain a connection from the pool and use that for the new transaction. If there are no available connections in the pool, a new connection is established.
The ``pool_size`` parameter is ignored by SQLite and Google App Engine.
Connections in the pools are shared sequentially among threads, in the sense that they may be used by two different but not simultaneous threads. There is only one pool for each web2py process.
When web2py starts, the pool is always empty. The pool grows up to the minimum between the value of ``pool_size`` and the max number of concurrent requests. This means that if ``pool_size=10`` but our server never receives more than 5 concurrent requests, then the actual pool size will only grow to 5. If ``pool_size=0`` then connection pooling is not used.
Connection pooling is ignored for SQLite, since it would not yield any benefit.
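For example, a sketch that enables a pool of up to 10 connections:
``
db = DAL('postgres://username:password@localhost/test', pool_size=10)
``:code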
#### Connection failures
If web2py fails to connect to the database, it waits 1 second and then tries again, up to 5 times, before declaring a failure. When connection pooling is used, a connection is used, put back in the pool, and then recycled. It is possible that while a connection is idle in the pool it is closed by the database server, because of a malfunction or a timeout. Thanks to the retry feature, web2py detects these dropped connections and re-establishes them.
#### Replicated databases
The first argument of ``DAL(...)`` can be a list of URIs. In this case web2py tries to connect to each of them. The main purpose for this is to deal with multiple database servers and distribute the workload among them. Here is a typical use case:
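``
db = DAL(['mysql://...1', 'mysql://...2', 'mysql://...3'])
``:code
In this case the DAL tries to connect to the first and, on failure, it will try the second and the third. This can also be used to distribute load in a database master-slave configuration.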
### Raw SQL
#### Timing queries
All queries are automatically timed by web2py. The variable ``db._timings`` is a list of tuples. Each tuple contains the raw SQL query as passed to the database driver and the time it took to execute in seconds. This variable can be displayed in views using the toolbar:
``
{{=response.toolbar()}}
``
#### ``executesql``
The DAL allows you to explicitly issue SQL statements.
``executesql``:inxx
``
>>> print db.executesql('SELECT * FROM person;')
[(1, u'Massimo'), (2, u'Massimo')]
``:code
In this case, the return values are not parsed or transformed by the DAL, and the format depends on the specific database driver. This usage with selects is normally not needed, but it is more common with indexes.
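``executesql`` also accepts optional arguments, notably ``placeholders`` (a sequence of values substituted into the query, using the placeholder style of the underlying driver) and ``as_dict=True`` (results returned as a list of dictionaries keyed by column name). A sketch, assuming a driver that uses ``%s`` placeholders:
``
rows = db.executesql('SELECT * FROM person WHERE name=%s;',
                     placeholders=['Massimo'], as_dict=True)
``:code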
#### ``validate_and_insert`` and ``validate_and_update``
``validate_and_insert``:inxx ``validate_and_update``:inxx
``
ret = db.mytable.validate_and_insert(field='value')
``:code
works very much like
``
id = db.mytable.insert(field='value')
``:code
except that it calls the validators for the fields before performing the insert and bails out if the validation does not pass. If validation does not pass, the errors can be found in ``ret.errors``. If it passes, the id of the new record is in ``ret.id``. Mind that normally validation is done by the form processing logic, so this function is rarely needed.
Similarly
``
ret = db(query).validate_and_update(field='value')
``:code
works very much the same as
``
num = db(query).update(field='value')
``:code
except that it calls the validators for the fields before performing the update. Notice that it only works if the query involves a single table. The number of updated records can be found in ``ret.updated`` and errors will be in ``ret.errors``.
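A sketch of checking the outcome of ``validate_and_insert`` (the table and validator are illustrative):
``
db.define_table('person', Field('name', requires=IS_NOT_EMPTY()))
ret = db.person.validate_and_insert(name='')
if ret.errors:
    print ret.errors  # dict mapping field names to error messages
else:
    print ret.id
``:code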
#### ``smart_query`` (experimental)
There are times when you need to parse a query using natural language such as
``
name contain m and age greater than 18
``
The DAL provides a method to parse this type of queries:
``
search = 'name contain m and age greater than 18'
rows = db.smart_query([db.person],search).select()
``
The first argument must be a list of tables or fields that should be allowed in the search. It raises a ``RuntimeError`` if the search string is invalid. This functionality can be used to build RESTful interfaces (see chapter 10) and it is used internally by the ``SQLFORM.grid`` and ``SQLFORM.smartgrid``.
In the smartquery search string, a field can be identified by fieldname alone or by tablename.fieldname. Strings may be delimited by double quotes if they contain spaces.
The DAL defines one adapter class for each supported database back-end; the experimental ones include:
``
TeradataAdapter extends DB2Adapter (experimental)
SAPDBAdapter extends BaseAdapter (experimental)
CouchDBAdapter extends NoSQLAdapter (experimental)
MongoDBAdapter extends NoSQLAdapter (experimental)
``
which override the behavior of the ``BaseAdapter``.
Each adapter has more or less this structure:
``
class MySQLAdapter(BaseAdapter):
    # specify a driver to use
    driver = globals().get('pymysql', None)
    # map web2py types into database types
    types = {
        'boolean': 'CHAR(1)',
        'string': 'VARCHAR(%(length)s)',
        'text': 'LONGTEXT',
        ...
        }
    # connect to the database using the driver
    def __init__(self, db, uri, pool_size=0, folder=None, db_codec='UTF-8',
                 credential_decoder=lambda x: x, driver_args={},
                 adapter_args={}):
        # parse uri string and store parameters in driver_args
        ...
        # define a connection function
        def connect(driver_args=driver_args):
            return self.driver.connect(**driver_args)
        # place it in the pool
        self.pool_connection(connect)
        # set optional parameters (after connection)
        self.execute('SET FOREIGN_KEY_CHECKS=1;')
        self.execute("SET sql_mode='NO_BACKSLASH_ESCAPES';")
    # override BaseAdapter methods as needed
    def lastrowid(self, table):
        self.execute('select last_insert_id();')
        return int(self.cursor.fetchone()[0])
``:code
Optional driver-specific and adapter-specific arguments can be passed to the DAL constructor:
``
db = DAL(..., driver_args={}, adapter_args={})
``:code
**MSSQL** has problems with circular references among tables that have ``ondelete`` CASCADE; as a workaround, the on-delete action can be set to "NO ACTION" for all reference field types before the tables are defined:
``
db = DAL('mssql://....')
for key in ['reference', 'reference FK']:
    db._adapter.types[key] = db._adapter.types[key].replace(
        '%(on_delete_action)s', 'NO ACTION')
``:code
**MSSQL** also has problems with arguments passed to the DISTINCT keyword and therefore
while this works,
``
db(query).select(distinct=True)
``
this does not
``
db(query).select(distinct=db.mytable.myfield)
``
**Google NoSQL (Datastore)** does not allow joins, left joins, aggregates, expressions, OR involving more than one table, the like operator, or searches in "text" fields. Transactions are limited and not provided automatically by web2py (you need to use the Google API ``run_in_transaction``, which you can look up in the Google App Engine documentation online). Google also limits the number of records you can retrieve in each query (1000 at the time of writing). On the Google datastore, record IDs are integers but they are not sequential. While on SQL the "list:string" type is mapped into a "text" type, on the Google Datastore it is mapped into a ``ListStringProperty``. Similarly "list:integer" and "list:reference" are mapped into "ListProperty". This makes searches for content inside these field types more efficient on Google NoSQL than on SQL databases.

-------------
**SQLite** | ``sqlite://storage.db``
**MySQL** | ``mysql://username:password@localhost/test``
**PostgreSQL** | ``postgres://username:password@localhost/test``
**MSSQL** | ``mssql://username:password@localhost/test``
**FireBird** | ``firebird://username:password@localhost/test``
**Oracle** | ``oracle://username/password@test``
**DB2** | ``db2://username:password@test``
**Ingres** | ``ingres://username:password@localhost/test``
**Sybase** | ``sybase://username:password@localhost/test``
**Informix** | ``informix://username:password@test``
**Teradata** | ``teradata://DSN=dsn;UID=user;PWD=pass;DATABASE=name``
**Cubrid** | ``cubrid://username:password@localhost/test``
**SAPDB** | ``sapdb://username:password@localhost/test``
**IMAP** | ``imap://user:password@server:port``
**MongoDB** | ``mongodb://username:password@localhost/test``
**Google/SQL** | ``google:sql``
**Google/NoSQL** | ``google:datastore``
-------------

### Virtual fields
``virtual fields``:inxx
Virtual fields are also computed fields (as in the previous subsection) but they differ from those because they are ''virtual'' in the sense that they are not stored in the db and they are computed each time records are extracted from the database. They can be used to simplify the user's code without using additional storage but they cannot be used for searching.
#### New style virtual fields (experimental)

web2py provides a new and easier way to define virtual fields and lazy virtual fields. This section is marked experimental because the APIs may still change a little from what is described here.

Here we will consider the same example as in the previous subsection. In particular, we consider the following model:

``
>>> db.define_table('item',
        Field('unit_price','double'),
        Field('quantity','integer'))
``:code

One can define a ``total_price`` virtual field as

``
>>> db.item.total_price = Field.Virtual(lambda row: row.unit_price*row.quantity)
``:code

i.e. by simply defining a new field ``total_price`` to be a ``Field.Virtual``. The only argument of the constructor is a function that takes a row and returns the computed value.

A virtual field defined as the one above is automatically computed for all records when the records are selected:

``
>>> for row in db(db.item).select(): print row.total_price
``

It is also possible to define method fields which are calculated on-demand, when called.
For example:

``
>>> db.item.discounted_total = Field.Method(lambda row, discount=0.0: \
        row.unit_price*row.quantity*(1.0-discount/100))
``:code

In this case ``row.discounted_total`` is not a value but a function. The function takes the same arguments as the function passed to the ``Method`` constructor except for ``row``, which is implicit (think of it as ``self`` for rows).

The lazy field in the example above allows one to compute the total price for each ``item``:

``
>>> for row in db(db.item).select(): print row.discounted_total()
``

It also allows one to pass an optional ``discount`` percentage (here 15%):

``
>>> for row in db(db.item).select(): print row.discounted_total(15)
``

Virtual and Method fields can also be defined in place when a table is defined:

``
>>> db.define_table('item',
        Field('unit_price','double'),
        Field('quantity','integer'),
        Field.Virtual('total_price', lambda row: ...),
        Field.Method('discounted_total', lambda row, discount=0.0: ...))
``:code

------
Mind that virtual fields do not have the same attributes as the other fields (default, readable, requires, etc.), they do not appear in the list of ``db.table.fields``, and they are not visualized by default in tables (TABLE) and grids (SQLFORM.grid, SQLFORM.smartgrid).
------

#### Old style virtual fields

In order to define one or more virtual fields, you can also define a container class, instantiate it and link it to a table or to a select. For example, consider the following table:

``
>>> db.define_table('item',
        Field('unit_price','double'),
        Field('quantity','integer'))
``:code
One can define a ``total_price`` virtual field as
``
>>> class MyVirtualFields(object):
        def total_price(self):
            return self.item.unit_price*self.item.quantity
>>> db.item.virtualfields.append(MyVirtualFields())
``:code
Notice that each method of the class that takes a single argument (self) is a new virtual field. ``self`` refers to each one row of the select. Field values are referred by full path as in ``self.item.unit_price``. The table is linked to the virtual fields by appending an instance of the class to the table's ``virtualfields`` attribute.
Virtual fields can also access recursive fields, i.e. fields of a referenced table; for example, for a hypothetical table ``item_order`` that references ``item``, a virtual field on ``item_order`` could return ``self.item_order.item.unit_price * self.item_order.quantity``.
Virtual fields can be ''lazy''; all they need to do is return a function, which is then accessed by calling it:
``
>>> db.define_table('item',
        Field('unit_price','double'),
        Field('quantity','integer'))
>>> class MyVirtualFields(object):
        def lazy_total_price(self):
            def lazy(self=self):
                return self.item.unit_price \
                    * self.item.quantity
            return lazy
>>> db.item.virtualfields.append(MyVirtualFields())
>>> for item in db(db.item).select():
        print item.lazy_total_price()
``:code
or shorter using a lambda function:
``
>>> class MyVirtualFields(object):
        def lazy_total_price(self):
            return lambda self=self: self.item.unit_price \
                * self.item.quantity
``:code
### One to many relation
``one to many``:inxx
To illustrate how to implement one to many relations with the web2py DAL, define another table "thing" that refers to the table "person" which we redefine here:
``
>>> db.define_table('person',
Field('name'),
format='%(name)s')
>>> db.define_table('thing',
Field('name'),
Field('owner', 'reference person'),
format='%(name)s')
``:code
Table "thing" has two fields, the name of the thing and the owner of the thing. The "owner" field id a reference field. A reference type can be specified in two equivalent ways:
``
Field('owner', 'reference person')
Field('owner', db.person)
``:code
These examples assume a many-to-many setup with an ``ownership`` crosstable linking persons and things, and a Set defined as ``persons_and_things = db((db.person.id==db.ownership.person) & (db.thing.id==db.ownership.thing))``. You can then search for all things owned by Alex:
``
>>> for row in persons_and_things(db.person.name=='Alex').select():
        print row.thing.name
Boat
Chair
``:code
and all owners of Boat:
``
>>> for row in persons_and_things(db.thing.name=='Boat').select():
        print row.person.name
Alex
Curt
``:code
A lighter alternative to many-to-many relations is tagging. Tagging is discussed in the context of the ``IS_IN_DB`` validator. Tagging works even on database backends that do not support JOINs, like the Google App Engine NoSQL.
### ``list:<type>`` and ``contains``
``list:string``:inxx
``list:integer``:inxx
``list:reference``:inxx
``contains``:inxx
``multiple``:inxx
``tags``:inxx
web2py provides the following special field types:
``
list:string
list:integer
list:reference <table>
``:code
They can contain lists of strings, of integers and of references respectively.
On Google App Engine NoSQL ``list:string`` is mapped into a ``StringListProperty``; the other two are mapped into ``ListProperty(int)``. On relational databases they are all mapped into text fields which contain the list of items separated by ``|``. For example ``[1,2,3]`` is mapped into ``|1|2|3|``.
For lists of strings the items are escaped so that any ``|`` in an item is replaced by ``||``. This is an internal representation and it is transparent to the user.
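For example, a sketch of a ``list:string`` field together with the ``contains`` operator (the table and values are illustrative):
``
db.define_table('product',
    Field('name'),
    Field('colors', 'list:string'))
db.product.insert(name='Toy Car', colors=['red', 'green', 'blue'])
# contains(value) matches records whose list includes the value
for row in db(db.product.colors.contains('red')).select():
    print row.name, row.colors
``:code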
The DAL can be used from any Python program simply by doing this:
``
from gluon import DAL, Field
db = DAL('sqlite://storage.sqlite',folder='path/to/app/databases')
``:code
i.e. import the DAL, Field, connect and specify the folder which contains the .table files (the app/databases folder).
To access the data and its attributes, we still have to define all the tables we are going to access with ``db.define_table(...)``.
If we just need access to the data but not to the web2py table attributes, we get away without re-defining the tables but simply asking web2py to read the necessary info from the metadata in the .table files:
``
from gluon import DAL, Field
db = DAL('sqlite://storage.sqlite',folder='path/to/app/databases',
    auto_import=True)
``:code
This allows us to access any ``db.table`` without need to re-define it.

----------
database | drivers (source)
SQLite | sqlite3 or pysqlite2 or zxJDBC ``zxjdbc``:cite (on Jython)
PostgreSQL | psycopg2 ``psycopg2``:cite or pg8000 ``pg8000``:cite or zxJDBC ``zxjdbc``:cite (on Jython)
MySQL | pymysql ``pymysql``:cite or MySQLdb ``mysqldb``:cite
Oracle | cx_Oracle ``cxoracle``:cite
MSSQL | pyodbc ``pyodbc``:cite
FireBird | kinterbasdb ``kinterbasdb``:cite or fdb or pyodbc
DB2 | pyodbc ``pyodbc``:cite
Informix | informixdb ``informixdb``:cite
Ingres | ingresdbi ``ingresdbi``:cite
Cubrid | cubriddb ``cubriddb``:cite
Sybase | Sybase ``Sybase``:cite
Teradata | pyodbc ``Teradata``:cite
SAPDB | sapdb ``SAPDB``:cite
MongoDB | pymongo ``pymongo``:cite
IMAP | imaplib ``IMAP``:cite
---------

#### PostGIS, SpatiaLite, and MS Geo Extensions (experimental)
``PostGIS``:inxx ``SpatiaLite``:inxx ``Geo Extensions``:inxx
``geometry``:inxx ``geoPoint``:inxx ``geoLine``:inxx ``geoPolygon``:inxx

The DAL supports geographical APIs using PostGIS (for PostgreSQL), SpatiaLite (for SQLite), and the MSSQL Spatial Extensions. This feature was sponsored by the Sahana project and implemented by Denes Lengyel.
DAL provides geometry and geography field types and the following functions:
``st_asgeojson``:inxx ``st_astext``:inxx ``st_contained``:inxx ``st_contains``:inxx
``st_distance``:inxx ``st_equals``:inxx ``st_intersects``:inxx ``st_overlaps``:inxx
``st_simplify``:inxx ``st_touches``:inxx ``st_within``:inxx

``
st_asgeojson (PostGIS only)
st_astext
st_contained
st_contains
st_distance
st_equals
st_intersects
st_overlaps
st_simplify (PostGIS only)
st_touches
st_within
``
Here are some examples:
``
from gluon.dal import DAL, Field, geoPoint, geoLine, geoPolygon
db = DAL("mssql://user:pass@host:db")
sp = db.define_table('spatial',
    Field('loc', 'geometry()'))
``:code
Below we insert a point, a line, and a polygon:
``
sp.insert(loc=geoPoint(1,2))
sp.insert(loc=geoLine((100,100),(20,180),(180,180)))
sp.insert(loc=geoPolygon((0,0),(150,0),(150,150),(0,150),(0,0)))
``:code
Notice that
``
rows = db(sp.id>0).select()
``:code
always returns the geometry data serialized as text.
You can also do the same more explicitly using ``st_astext()``:
``
print db(sp.id>0).select(sp.id, sp.loc.st_astext())
spatial.id,spatial.loc.STAsText()
1, POINT (1 2)
2, LINESTRING (100 100, 20 180, 180 180)
3, POLYGON ((0 0, 150 0, 150 150, 0 150, 0 0))
``:code
You can ask for the native representation by using ``st_asgeojson()`` (in PostGIS only):
``
print db(sp.id>0).select(sp.id, sp.loc.st_asgeojson().with_alias('loc'))
spatial.id,loc
1, [1, 2]
2, [[100, 100], [20, 180], [180, 180]]
3, [[[0, 0], [150, 0], [150, 150], [0, 150], [0, 0]]]
``:code
Notice that an array is a point, an array of arrays is a line, and an array of arrays of arrays is a polygon.

Here are examples of how to use geographical functions:
``
query = sp.loc.st_intersects(geoLine((20,120),(60,160)))
query = sp.loc.st_overlaps(geoPolygon((1,1),(11,1),(11,11),(1,11),(1,1)))
query = sp.loc.st_contains(geoPoint(1,1))
print db(query).select(sp.id, sp.loc)
spatial.id,spatial.loc
3,"POLYGON ((0 0, 150 0, 150 150, 0 150, 0 0))"
``:code
Computed distances can also be retrieved as floating point numbers:
``
dist = sp.loc.st_distance(geoPoint(-1,2)).with_alias('dist')
print db(sp.id>0).select(sp.id, dist)
spatial.id, dist
1 2.0
2 140.714249456
3 1.0
``:code
#### Copy data from one db into another
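One way to copy data between databases is via the CSV export/import API described later in this chapter. Here is a minimal sketch, assuming both databases define the same tables (the connection strings and the table definition are illustrative):
``
from gluon import DAL, Field

db1 = DAL('sqlite://storage.sqlite')                       # source
db2 = DAL('postgres://username:password@localhost/test')   # destination

# both connections must define the same tables
for db in (db1, db2):
    db.define_table('person', Field('name'))

# dump every table of db1 to CSV and load it into db2
db1.export_to_csv_file(open('/tmp/dump.csv', 'wb'))
db2.import_from_csv_file(open('/tmp/dump.csv', 'rb'))
db2.commit()
``:code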
web2py comes with a Database Abstraction Layer (DAL), an API that maps Python objects into database objects such as queries, tables, and records. The DAL dynamically generates the SQL in real time using the specified dialect for the database back end, so that you do not have to write SQL code or learn different SQL dialects (the term SQL is used generically), and the application will be portable among different types of databases. At the time of this writing, the supported databases are SQLite (which comes with Python and thus web2py), PostgreSQL, MySQL, Oracle, MSSQL, FireBird, DB2, Informix, Ingres, MongoDB, and the Google App Engine (SQL and NoSQL). Experimentally, more databases are supported; please check the web2py web site and mailing list for more recent adapters. Google NoSQL is treated as a particular case in Chapter 13.
The Windows binary distribution works out of the box with SQLite and MySQL. The Mac binary distribution works out of the box with SQLite.
To use any other database back-end, run from the source distribution and install the appropriate driver for the required back end.
``database drivers``:inxx
Once the proper driver is installed, start web2py from source, and it will find the driver. Here is a list of drivers:
``DAL``:inxx ``SQLite``:inxx ``MySQL``:inxx ``PostgreSQL``:inxx ``Oracle``:inxx ``MSSQL``:inxx ``FireBird``:inxx ``DB2``:inxx ``Informix``:inxx ``Sybase``:inxx ``Teradata``:inxx ``MongoDB``:inxx ``CouchDB``:inxx ``SAPDB``:inxx ``Cubrid``:inxx
----------
database | drivers (source)
SQLite | sqlite3 or pysqlite2 or zxJDBC ``zxjdbc``:cite (on Jython)
PostgreSQL | psycopg2 ``psycopg2``:cite or pg8000 ``pg8000``:cite or zxJDBC ``zxjdbc``:cite (on Jython)
MySQL | pymysql ``pymysql``:cite or MySQLdb ``mysqldb``:cite
Oracle | cx_Oracle ``cxoracle``:cite
MSSQL | pyodbc ``pyodbc``:cite
FireBird | kinterbasdb ``kinterbasdb``:cite or fdb or pyodbc
DB2 | pyodbc ``pyodbc``:cite
Informix | informixdb ``informixdb``:cite
Ingres | ingresdbi ``ingresdbi``:cite
Cubrid | cubriddb ``cubridb``:cite
Sybase | Sybase ``Sybase``:cite
Teradata | pyodbc ``Teradata``:cite
SAPDB | sapdb ``SAPDB``:cite
MongoDB | pymongo ``pymongo``:cite
IMAP | imaplib ``IMAP``:cite
---------
### Connection strings
``connection strings``:inxx
A connection with the database is established by creating an instance of the DAL object, conventionally stored in a variable called ``db`` (``db`` is not a keyword; you are free to give it a different name). The constructor of ``DAL`` requires a single argument, the connection string. The connection string is the only web2py code that depends on a specific back-end database. Here are examples of connection strings for specific types of supported back-end databases (in all cases, we assume the database is running from localhost on its default port and is named "test"):
-------------
**SQLite** | ``sqlite://storage.db``
**MySQL** | ``mysql://username:password@localhost/test``
**PostgreSQL** | ``postgres://username:password@localhost/test``
**MSSQL** | ``mssql://username:password@localhost/test``
**FireBird** | ``firebird://username:password@localhost/test``
**Oracle** | ``oracle://username/password@test``
**DB2** | ``db2://username:password@test``
**Ingres** | ``ingres://username:password@localhost/test``
**Sybase** | ``sybase://username:password@localhost/test``
**Informix** | ``informix://username:password@test``
**Teradata** | ``teradata://DSN=dsn;UID=user;PWD=pass;DATABASE=database``
**Cubrid** | ``cubrid://username:password@localhost/test``
**SAPDB** | ``sapdb://username:password@localhost/test``
**IMAP** | ``imap://user:password@server:port``
**MongoDB** | ``mongodb://username:password@localhost/test``
**Google App Engine/SQL** | ``google:sql``
**Google App Engine/NoSQL** | ``google:datastore``
-------------
Notice that in SQLite the database consists of a single file. If it does not exist, it is created. This file is locked every time it is accessed. In the case of MySQL, PostgreSQL, MSSQL, FireBird, Oracle, DB2, Ingres and Informix the database "test" must be created outside web2py. Once the connection is established, web2py will create, alter, and drop tables appropriately.
It is also possible to set the connection string to ``None``. In this case DAL will not connect to any back-end database, but the API can still be accessed for testing. Examples of this will be discussed in Chapter 7.
Sometimes you may need to generate SQL as if you had a connection but without actually connecting to the database. This can be done with:

``
db = DAL(..., do_connect=False)
``:code

In this case you will be able to call ``_select``, ``_insert``, ``_update``, and ``_delete`` to generate SQL, but not ``select``, ``insert``, ``update``, and ``delete``. In most cases you can use ``do_connect=False`` even without having the required database drivers.

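For example, a minimal sketch (the exact SQL string depends on the adapter):
``
db = DAL('sqlite://storage.db', do_connect=False)
db.define_table('person', Field('name'))
# _select returns the SQL that select would execute
print db(db.person.name == 'Alex')._select(db.person.id)
``:code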
#### Connection pooling
``connection pooling``:inxx
The second argument of the DAL constructor is the ``pool_size``; it defaults to 0.
As it is rather slow to establish a new database connection for each request, web2py implements a mechanism for connection pooling. Once a connection is established and the page has been served and the transaction completed, the connection is not closed but goes into a pool. When the next http request arrives, web2py tries to obtain a connection from the pool and use that for the new transaction. If there are no available connections in the pool, a new connection is established.
The ``pool_size`` parameter is ignored by SQLite and Google App Engine.
Connections in the pools are shared sequentially among threads, in the sense that they may be used by two different but not simultaneous threads. There is only one pool for each web2py process.
When web2py starts, the pool is always empty. The pool grows up to the minimum between the value of ``pool_size`` and the max number of concurrent requests. This means that if ``pool_size=10`` but our server never receives more than 5 concurrent requests, then the actual pool size will only grow to 5. If ``pool_size=0`` then connection pooling is not used.
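For example, to allow a pool of up to 10 connections (the connection string is illustrative):
``
db = DAL('postgres://username:password@localhost/test', pool_size=10)
``:code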
Connection pooling is ignored for SQLite, since it would not yield any benefit.
#### Connection failures
If web2py fails to connect to the database, it waits 1 second and tries again up to 5 times before declaring a failure. When using connection pooling it is possible that a connection that stays open but unused in the pool is closed by the database server (because of a malfunction or a timeout). Thanks to the retry feature, web2py detects this and re-establishes the dropped connection.
The connection string is called a ``_uri`` because it is an instance of a Uniform Resource Identifier.
The DAL allows multiple connections with the same database or with different databases, even databases of different types. For now, we will assume the presence of a single database since this is the most common situation.
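For example (connection strings are illustrative):
``
db = DAL('sqlite://storage.db')
other_db = DAL('postgres://username:password@localhost/test')
``:code
Each instance manages its own tables and transactions.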
``define_table``:inxx ``Field``:inxx
``type``:inxx ``length``:inxx ``default``:inxx ``requires``:inxx ``required``:inxx ``unique``:inxx
``notnull``:inxx ``ondelete``:inxx ``uploadfield``:inxx ``uploadseparate``:inxx ``migrate``:inxx ``sql.log``:inxx
The most important method of a DAL is ``define_table``:
``
>>> db.define_table('person', Field('name'))
``:code
It defines, stores and returns a ``Table`` object called "person" containing a field (column) "name". This object can also be accessed via ``db.person``, so you do not need to catch the return value.
Do not declare a field called "id", because one is created by web2py anyway. Every table has a field called "id" by default. It is an auto-increment integer field (starting at 1) used for cross-reference and for making every record unique, so "id" is a primary key. (Note: the id's starting at 1 is back-end specific. For example, this does not apply to the Google App Engine NoSQL.)
``named id field``:inxx
Optionally you can define a field of ``type='id'`` and web2py will use this field as the auto-increment id field. This is not recommended except when accessing legacy database tables. With some limitations, you can also use different primary keys; this is discussed in the section on "Legacy databases and keyed tables".
Tables can be defined only once, but you can force web2py to redefine an existing table:

``
db.define_table('person', Field('name'))
db.define_table('person', Field('name'), redefine=True)
``:code

The redefinition may trigger a migration if the field definitions differ.

----------
Because models in web2py are usually executed before controllers, it is possible that some tables are defined even though they are not needed. It is therefore possible to speed up the code by making table definitions lazy: set ``DAL(..., lazy_tables=True)`` and tables will actually be instantiated only when accessed (see the example below).
----------
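For example:
``
db = DAL('sqlite://storage.db', lazy_tables=True)
db.define_table('person', Field('name'))  # no table instantiated yet
print db.person  # first access triggers the actual table instantiation
``:code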
### Record representation
It is optional but recommended to specify a format representation for records:
``
>>> db.define_table('person', Field('name'), format='%(name)s')
``:code
or
``
>>> db.define_table('person', Field('name'), format='%(name)s %(id)s')
``:code
or even more complex ones using a function:
``
>>> db.define_table('person', Field('name'),
format=lambda r: r.name or 'anonymous')
``:code
The format attribute is used to represent referenced records in select/option drop-downs and to set the default ``represent`` attribute for fields referencing this table.
#### Expressions
The value assigned in an update statement can be an expression. For example, consider this model:
``
>>> db.define_table('person',
Field('name'),
Field('visits', 'integer', default=0))
>>> db(db.person.name == 'Massimo').update(
visits = db.person.visits + 1)
``:code
The values used in queries can also be expressions
``
>>> db.define_table('person',
Field('name'),
Field('visits', 'integer', default=0),
Field('clicks', 'integer', default=0))
>>> db(db.person.visits == db.person.clicks + 1).delete()
``:code
#### ``case``
``case``:inxx

An expression can contain a ``case`` clause, for example:
+
+``
+>>> db.define_table('person',Field('name'))
+>>> condition = db.person.name.startswith('M')
+>>> yes_or_no = condition.case('Yes','No')
+>>> for row in db().select(db.person.name, yes_or_no):
+... print row.person.name, row(yes_or_no)
+Max Yes
+John No
+``:code
+
#### ``update_record``
``update_record``:inxx
web2py also allows updating a single record that is already in memory using ``update_record``
``
>>> row = db(db.person.id==2).select().first()
>>> row.update_record(name='Curt')
``:code
``update_record`` should not be confused with
``
>>> row.update(name='Curt')
``:code
because for a single row, the method ``update`` updates the row object but not the database record, as in the case of ``update_record``.
It is also possible to change the attributes of a row (one at a time) and then call ``update_record()`` without arguments to save the changes:
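``
>>> row = db(db.person.id==2).select().first()
>>> row.name = 'Curt'
>>> row.update_record()
``:code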
#### ``before`` and ``after`` callbacks
``_before_insert``:inxx ``_after_insert``:inxx
``_before_update``:inxx ``_after_update``:inxx
``_before_delete``:inxx ``_after_delete``:inxx
web2py provides a mechanism to register callbacks to be executed before and after insert, update, and delete operations. Each table stores six lists of callables to which you can append your own functions: ``_before_insert``, ``_after_insert``, ``_before_update``, ``_after_update``, ``_before_delete``, and ``_after_delete``. Here ``f`` is a dict of fields passed to insert or update, ``id`` is the id of the newly inserted record, and ``s`` is the Set object involved in the update or delete. For example (a sketch; ``pprint`` is a hypothetical helper that prints its arguments as a tuple):
``
>>> def pprint(*args): print args
>>> db.person._before_update.append(lambda s, f: pprint(s, f))
>>> db.person._after_update.append(lambda s, f: pprint(s, f))
>>> db.person._before_delete.append(lambda s: pprint(s))
>>> db.person._after_delete.append(lambda s: pprint(s))
>>> db(db.person.id==1).update(name='Tim')
(<Set (person.id = 1)>, {'name': 'Tim'})
(<Set (person.id = 1)>, {'name': 'Tim'})
>>> db(db.person.id==1).delete()
(<Set (person.id = 1)>,)
(<Set (person.id = 1)>,)
``:code
The return values of these callbacks should be ``None`` or ``False``. If any of the ``_before_*`` callbacks returns a ``True`` value it will abort the actual insert/update/delete operation.
``update_naive``:inxx
Sometimes a callback may need to perform an update on the same or a different table, and one wants to avoid callbacks calling themselves recursively.
For this purpose Set objects have an ``update_naive`` method that works like ``update`` but ignores before and after callbacks.
#### Record versioning
``_enable_record_versioning``:inxx
It is possible to ask web2py to save every copy of a record when the record is modified. There are different ways to do it; it can be done for all tables at once using the syntax:

``
auth.enable_record_versioning(db)
``:code

This requires Auth and is discussed in the chapter about authentication.
It can also be done for each individual table, as discussed below.
Consider the following table:
``
db.define_table('stored_item',
Field('name'),
Field('quantity','integer'),
Field('is_active','boolean',
writable=False,readable=False,default=True))
``:code
Notice the hidden boolean field called ``is_active``, defaulting to True.
We can tell web2py to create a new table (in the same or a different database) and store all previous versions of each record in the table, when modified.
This is done in the following way:
``
db.stored_item._enable_record_versioning()
``:code
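The call above is a shorthand; ``_enable_record_versioning`` accepts optional arguments. A sketch showing the commonly documented defaults (the database where versions are stored, the name of the archive table, the field linking archive rows to the current record, and the field used to mark active records):
``
db.stored_item._enable_record_versioning(
    archive_db = db,
    archive_name = 'stored_item_archive',
    current_record = 'current_record',
    is_active = 'is_active')
``:code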
### Migrations
``migrations``:inxx
``define_table`` checks whether or not the corresponding table exists. If it does not, it generates the SQL to create it and executes it. If the table does exist but differs from the one being defined, it generates the SQL to alter the table and executes it. If a field has changed type but not name, it will try to convert the data. If the table exists and matches the current definition, it leaves it alone.
We refer to this behavior as a "migration". web2py logs all migrations and migration attempts in the file "databases/sql.log".
The first argument of ``define_table`` is always the table name. The other unnamed arguments are the fields (Field). The function also takes an optional last argument called "migrate" which must be referred to explicitly by name as in:
``
>>> db.define_table('person', Field('name'), migrate='person.table')
``:code
The value of migrate is the filename (in the "databases" folder for the application) where web2py stores internal migration information for this table. These files are very important and should never be removed except when the entire database is dropped. In this case, the ".table" files have to be removed manually. By default, migrate is set to True. This causes web2py to generate the filename from a hash of the connection string. If migrate is set to False, the migration is not performed, and web2py assumes that the table exists in the datastore and it contains (at least) the fields listed in ``define_table``.
The best practice is to give an explicit name to the migrate file.
There may not be two tables in the same application with the same migrate filename.
The DAL class also takes a "migrate" argument, which determines the default value of migrate for calls to ``define_table``. For example,
``
>>> db = DAL('sqlite://storage.db', migrate=False)
``:code
will set the default value of migrate to False whenever ``db.define_table`` is called without a migrate argument.
------
Notice that web2py only migrates new columns, removed columns, and changes in column type (except in SQLite). web2py does not migrate changes in attributes such as the values of ``default``, ``unique``, ``notnull``, and ``ondelete``.
------

Migrations can be disabled for all tables at the moment of connection:
``
db = DAL(...,migrate_enabled=False)
``
This is the recommended behavior when two apps share the same database. Only one of the two apps should perform migrations; the other should disable them.
### Fixing broken migrations
``fake_migrate``:inxx
There are two common problems with migrations and there are ways to recover from them.
One problem is specific to SQLite. SQLite does not enforce column types and cannot drop columns. This means that if you have a column of type string and you remove it, it is not really removed. If you add the column again with a different type (for example datetime) you end up with a datetime column that contains strings (junk for practical purposes). web2py does not complain about this because it does not know what is in the database, until it tries to retrieve records and fails.
If web2py returns an error in the gluon.sql.parse function when selecting records, this is the problem: corrupted data in a column because of the above issue.
The solution consists in updating all records of the table and updating the values in the column in question with None.
The other problem is more generic but typical of MySQL. MySQL does not allow more than one ALTER TABLE in a transaction. This means that web2py must break complex transactions into smaller ones (one ALTER TABLE at a time) and commit one piece at a time. It is therefore possible that part of a complex transaction gets committed and another part fails, leaving web2py in a corrupted state. Why would part of a transaction fail? Because, for example, if it involves altering a table and converting a string column into a datetime column, web2py tries to convert the data, but the data may not be convertible. What happens to web2py? It gets confused about the table structure actually stored in the database.
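A common way to recover is to rebuild the web2py metadata against the actual content of the database. A sketch using the ``fake_migrate`` argument of ``define_table`` and the ``fake_migrate_all`` argument of the DAL constructor (the table definition is illustrative):
``
# rebuild the .table metadata for one table without altering the database
db.define_table('person', Field('name'), fake_migrate=True)
``:code
and, if that is not enough:
``
# rebuild the metadata for all defined tables at once
db = DAL('sqlite://storage.db', fake_migrate_all=True)
``:code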
### ``insert``
Given a table, you can insert records:
``insert``:inxx
``
>>> db.person.insert(name="Alex")
1
>>> db.person.insert(name="Bob")
2
``:code
Insert returns the unique "id" value of each record inserted.
You can truncate the table, i.e., delete all records and reset the counter of the id.
``truncate``:inxx
``
>>> db.person.truncate()
``:code
Now, if you insert a record again, the counter starts again at 1 (this is back-end specific and does not apply to Google NoSQL):
``
>>> db.person.insert(name="Alex")
1
``:code
#### Serializing Rows in views
``SQLTABLE``:inxx
A Rows object can be serialized into an HTML table by the ``SQLTABLE`` helper.
------
``SQLTABLE`` is useful but there are times when one needs more. ``SQLFORM.grid`` is an extension of SQLTABLE that creates a table with search features and pagination, as well as the ability to open detailed records, and to create, edit and delete records. ``SQLFORM.smartgrid`` is a further generalization that allows all of the above but also creates buttons to access referencing records.
------
Here is an example of usage of ``SQLFORM.grid``:
``
def index():
return dict(grid=SQLFORM.grid(query))
``:code
and the corresponding view:
``
{{extend 'layout.html'}}
{{=grid}}
``
``SQLFORM.grid`` and ``SQLFORM.smartgrid`` should be preferred to ``SQLTABLE`` because they are more powerful although higher level and therefore more constraining. They will be explained in more detail in chapter 8.
#### ``orderby``, ``groupby``, ``limitby``, ``distinct``, ``having``
The ``select`` command takes five optional arguments: orderby, groupby, limitby, left and cache. Here we discuss the first three.
You can fetch the records sorted by name:
``orderby``:inxx ``groupby``:inxx ``having``:inxx
``
>>> for row in db().select(
db.person.ALL, orderby=db.person.name):
print row.name
Alex
Bob
Carl
``:code
You can fetch the records sorted by name in reverse order (notice the tilde):
``
>>> for row in db().select(
db.person.ALL, orderby=~db.person.name):
print row.name
Carl
Bob
Alex
``:code
You can have the fetched records appear in random order by passing ``orderby='<random>'``. And you can sort the records according to multiple fields by concatenating them with a "``|``":
``
>>> for row in db().select(
db.person.ALL, orderby=~db.person.name|db.person.id):
print row.name
Carl
Bob
Alex
``:code
Using ``groupby`` together with ``orderby``, you can group records with the same value for the specified field (this is back-end specific, and is not on the Google NoSQL):
``
>>> for row in db().select(
db.person.ALL,
orderby=db.person.name, groupby=db.person.name):
print row.name
Alex
Bob
Carl
``:code
You can use ``having`` in conjunction with ``groupby`` to group conditionally (only those ``having`` the condition are grouped):

``
>>> print db(query1).select(db.person.ALL, groupby=db.person.name, having=query2)
``

Notice that query1 filters records to be displayed, query2 filters records to be grouped.

``distinct``:inxx
With the argument ``distinct=True``, you can specify that you only want to select distinct records. This has the same effect as grouping using all specified fields except that it does not require sorting. When using distinct it is important not to select ALL fields, and in particular not to select the "id" field, else all records will always be distinct.
Here is an example:
``
>>> for row in db().select(db.person.name, distinct=True):
print row.name
Alex
Bob
Carl
``:code
Notice that ``distinct`` can also be an expression for example:
``
>>> for row in db().select(db.person.name,distinct=db.person.name):
print row.name
Alex
Bob
Carl
``:code
Queries can be combined using the binary AND operator "``&``" and the binary OR operator "``|``":
``
>>> rows = db((db.person.name=='Alex') | (db.person.id>3)).select()
>>> for row in rows: print row.id, row.name
1 Alex
``:code
You can negate a query (or sub-query) with the "``!=``" binary operator:
``
>>> rows = db((db.person.name!='Alex') | (db.person.id>3)).select()
>>> for row in rows: print row.id, row.name
2 Bob
3 Carl
``:code
or by explicit negation with the "``~``" unary operator:
``
>>> rows = db(~(db.person.name=='Alex') | (db.person.id>3)).select()
>>> for row in rows: print row.id, row.name
2 Bob
3 Carl
``:code
------
Due to Python restrictions in overloading "``and``" and "``or``" operators, these cannot be used in forming queries. The binary operators "``&``" and "``|``" must be used instead. Note that these operators (unlike "``and``" and "``or``") have higher precedence than comparison operators, so the "extra" parentheses in the above examples are mandatory.
------
It is also possible to build queries using in-place logical operators:
``
>>> query = db.person.name!='Alex'
>>> query &= db.person.id>3
>>> query |= db.person.name=='John'
``
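The resulting query can then be used like any other:
``
>>> rows = db(query).select()
``:code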
#### ``count``, ``isempty``, ``delete``, ``update``
You can count records in a set:
``count``:inxx ``isempty``:inxx
``
>>> print db(db.person.id > 0).count()
3
``:code
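``delete``:inxx ``update``:inxx
You can check whether a set is empty, delete all records in a set, and update all records in a set by passing named arguments corresponding to the fields that need to be updated:
``
>>> print db(db.person.id > 0).isempty()
False
>>> db(db.person.id > 2).delete()
>>> db(db.person.id > 2).update(name='Ken')
``:code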
#### ``as_dict`` and ``as_list``
``as_list``:inxx ``as_dict``:inxx
A Row object can be serialized into a regular dictionary using the ``as_dict()`` method and a Rows object can be serialized into a list of dictionaries using the ``as_list()`` method. Here are some examples:
``
>>> rows = db(query).select()
>>> rows_list = rows.as_list()
>>> first_row_dict = rows.first().as_dict()
``:code
These methods are convenient for passing Rows to generic views and/or for storing Rows in sessions (Rows objects themselves cannot be serialized because they contain a reference to an open DB connection):
``
>>> rows = db(query).select()
>>> session.rows = rows # not allowed!
>>> session.rows = rows.as_list() # allowed!
``:code
#### Combining rows

Rows objects can be combined at the Python level. Here we assume:

``
>>> print rows1
person.name
Max
Tim
>>> print rows2
person.name
John
Tim
``

You can do a union of the records in two sets of rows:

``
>>> rows3 = rows1 & rows2
>>> print rows3
name
Max
Tim
John
Tim
``:code

You can do a union of the records removing duplicates:

``
>>> rows3 = rows1 | rows2
>>> print rows3
name
Max
Tim
John
``:code

#### ``find``, ``exclude``, ``sort``
``find``:inxx ``exclude``:inxx ``sort``:inxx
There are times when one needs to perform two selects and one contains a subset of a previous select. In this case it is pointless to access the database again. The ``find``, ``exclude`` and ``sort`` methods allow you to manipulate a Rows object and generate another one without accessing the database. More specifically:
- ``find`` returns a new set of Rows filtered by a condition and leaves the original unchanged.
- ``exclude`` returns a new set of Rows filtered by a condition and removes them from the original Rows.
- ``sort`` returns a new set of Rows sorted by a condition and leaves the original unchanged.
All these methods take a single argument, a function that acts on each individual row.
Here is an example of usage:
``
>>> db.define_table('person',Field('name'))
>>> db.person.insert(name='John')
>>> db.person.insert(name='Max')
>>> db.person.insert(name='Alex')
>>> rows = db(db.person).select()
>>> for row in rows.find(lambda row: row.name[0]=='M'):
print row.name
Max
>>> print len(rows)
3
>>> for row in rows.exclude(lambda row: row.name[0]=='M'):
print row.name
Max
>>> print len(rows)
2
>>> for row in rows.sort(lambda row: row.name):
print row.name
Alex
John
``:code
They can be combined:
``
>>> rows = db(db.person).select()
>>> rows = rows.find(
lambda row: 'x' in row.name).sort(
lambda row: row.name)
>>> for row in rows:
print row.name
Alex
Max
``:code
``sort`` takes an optional argument ``reverse=True`` with the obvious meaning.

The ``find`` method has an optional ``limitby`` argument with the same syntax and functionality as the Set ``select`` method.
### Other methods
#### ``update_or_insert``
``update_or_insert``:inxx
Sometimes you need to perform an insert only if there is no record with the same values as those being inserted.
This can be done with
``
db.define_table('person',Field('name'),Field('birthplace'))
db.person.update_or_insert(name='John',birthplace='Chicago')
``:code
The record will be inserted only if there is no other person called John born in Chicago.
You can specify which values to use as a key to determine if the record exists. For example:
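``
db.person.update_or_insert(db.person.name=='John',
    name='John', birthplace='Chicago')
``:code
If a record matching the query in the first argument exists, it is updated with the named values; otherwise a new record is inserted.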
### Virtual fields
``virtual fields``:inxx
Virtual fields are computed on the fly from other fields and are not stored in the database. Consider the following model:
``
>>> db.define_table('item',
Field('unit_price','double'),
Field('quantity','integer'))
``:code
One can define a ``total_price`` virtual field as
``
>>> db.item.total_price = Field.Virtual(lambda row: row.unit_price*row.quantity)
``:code
i.e. by simply defining a new field ``total_price`` to be a ``Field.Virtual``. The only argument of the constructor is a function that takes a row and returns the computed value.
A virtual field defined as the one above is automatically computed for all records when the records are selected:
``
>>> for row in db(db.item).select(): print row.total_price
``
It is also possible to define method fields which are calculated on-demand, when called.
For example:
``
>>> db.item.discounted_total = Field.Method(lambda row, discount=0.0: \
row.unit_price*row.quantity*(1.0-discount/100))
``:code
In this case ``row.discounted_total`` is not a value but a function. The function takes the same arguments as the function passed to the ``Method`` constructor except for ``row`` which is implicit (think of it as ``self`` for rows objects).
The lazy field in the example above allows one to compute the total price for each ``item``:
``
>>> for row in db(db.item).select(): print row.discounted_total()
``
And it also allows one to pass an optional ``discount`` percentage (here 15%):
``
>>> for row in db(db.item).select(): print row.discounted_total(15)
``
Virtual and Method fields can also be defined in place when a table is defined:

``
>>> db.define_table('item',
        Field('unit_price','double'),
        Field('quantity','integer'),
        Field.Virtual('total_price', lambda row: ...),
        Field.Method('discounted_total', lambda row, discount=0.0: ...))
``:code

------
Mind that virtual fields do not have the same attributes as the other fields (``default``, ``readable``, ``requires``, etc.), they do not appear in the list of ``db.table.fields``, and they are not visualized by default in tables (TABLE) and grids (SQLFORM.grid, SQLFORM.smartgrid).
------
### One to many relation
``one to many``:inxx
To illustrate how to implement one to many relations with the web2py DAL, define another table "thing" that refers to the table "person" which we redefine here:
``
>>> db.define_table('person',
Field('name'),
format='%(name)s')
>>> db.define_table('thing',
Field('name'),
Field('owner', 'reference person'),
format='%(name)s')
``:code
Table "thing" has two fields, the name of the thing and the owner of the thing. The "owner" field id a reference field. A reference type can be specified in two equivalent ways:
generated by the DAL.
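``
Field('owner', 'reference person')
``:code
and:
``
Field('owner', db.person)
``:code
The latter is always converted into the former. They are equivalent except in the case of lazy tables, self references, and other types of cyclic references, where the former notation is the only one allowed.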
#### CSV (one Table at a time)
When a DALRows object is converted to a string it is automatically
serialized in CSV:
``csv``:inxx
``
>>> rows = db(db.person.id==db.thing.owner).select()
>>> print rows
person.id,person.name,thing.id,thing.name,thing.owner
1,Alex,1,Boat,1
1,Alex,2,Chair,1
2,Bob,3,Shoes,2
``:code
You can serialize a single table in CSV and store it in a file "test.csv":
``
>>> open('test.csv', 'w').write(str(db(db.person.id).select()))
``:code
This is equivalent to:

``
>>> rows = db(db.person.id).select()
>>> rows.export_to_csv_file(open('test.csv', 'w'))
``:code

You can read the CSV file back with:
``
>>> db.person.import_from_csv_file(open('test.csv', 'r'))
``:code
When importing, web2py looks for the field names in the CSV header. In this example, it finds two columns: "person.id" and "person.name". It ignores the "person." prefix, and it ignores the "id" fields. Then all records are appended and assigned new ids. Both of these operations can be performed via the appadmin web interface.
#### CSV (all tables at once)
In web2py, you can backup/restore an entire database with two commands:
To export:
``
>>> db.export_to_csv_file(open('somefile.csv', 'wb'))
``:code
To import:
``
>>> db.import_from_csv_file(open('somefile.csv', 'rb'))
``:code
If you need to serialize the DALRows in any other XML format with custom tags, you can easily do that using the universal TAG helper and the ``*`` notation.
#### Data representation
``export_to_csv_file``:inxx
The ``export_to_csv_file`` function accepts a keyword argument named ``represent``. When ``True`` it will use the column's ``represent`` function while exporting the data instead of the raw data.
``colnames``:inxx
The function also accepts a keyword argument named ``colnames`` that should contain a list of column names one wishes to export. It defaults to all columns.
Both ``export_to_csv_file`` and ``import_from_csv_file`` accept keyword arguments that tell the csv parser the format to save/load the files:
- ``delimiter``: delimiter to separate values (default ',')
- ``quotechar``: character to use to quote string values (default to double quotes)
- ``quoting``: quote system (default ``csv.QUOTE_MINIMAL``)
Here is some example usage:
``
>>> import csv
>>> rows = db(query).select()
>>> rows.export_to_csv_file(open('/tmp/test.txt', 'w'),
delimiter='|',
quotechar='"',
quoting=csv.QUOTE_NONNUMERIC)
``:code
``Expression``:inxx
**Expression** is something like an ``orderby`` or ``groupby`` expression. The Field class is derived from the Expression. Here is an example.
``
myorder = db.mytable.myfield.upper() | db.mytable.id
db().select(db.table.ALL, orderby=myorder)
``:code
### Connection strings
``connection strings``:inxx
A connection with the database is established by creating an instance of the DAL object:
``
>>> db = DAL('sqlite://storage.db', pool_size=0)
``:code
``db`` is not a keyword; it is a local variable that stores the connection object ``DAL``. You are free to give it a different name. The constructor of ``DAL`` requires a single argument, the connection string. The connection string is the only web2py code that depends on a specific back-end database. Here are examples of connection strings for specific types of supported back-end databases (in all cases, we assume the database is running from localhost on its default port and is named "test"):
-------------
**SQLite** | ``sqlite://storage.db``
**MySQL** | ``mysql://username:password@localhost/test``
**PostgreSQL** | ``postgres://username:password@localhost/test``
**MSSQL** | ``mssql://username:password@localhost/test``
which returns a tuple ``(value, error)``. ``error`` is ``None`` if the input pas
We refer to this behavior as a "migration". web2py logs all migrations and migration attempts in the file "databases/sql.log".
The first argument of ``define_table`` is always the table name. The other unnamed arguments are the fields (Field). The function also takes an optional last argument called "migrate" which must be referred to explicitly by name as in:
``
>>> db.define_table('person', Field('name'), migrate='person.table')
``:code
The value of migrate is the filename (in the "databases" folder for the application) where web2py stores internal migration information for this table. These files are very important and should never be removed except when the entire database is dropped. In this case, the ".table" files have to be removed manually. By default, migrate is set to True. This causes web2py to generate the filename from a hash of the connection string. If migrate is set to False, the migration is not performed, and web2py assumes that the table exists in the datastore and it contains (at least) the fields listed in ``define_table``.
The best practice is to give an explicit name to the migrate table.
There may not be two tables in the same application with the same migrate filename.
The DAL class also takes a "migrate" argument, which determines the default value of migrate for calls to ``define_table``. For example,
``
>>> db = DAL('sqlite://storage.db', migrate=False)
``:code
will set the default value of migrate to False whenever ``db.define_table`` is called without a migrate argument.
Migrations can be disabled for all tables at the moment of connection:
``
db = DAL(...,migrate_enabled=False)
``
This is the recommended behavior when two apps share the same database. Only one of the two apps should perform migrations, the other should disabled them.
### Fixing broken migrations
``fake_migrate``:inxx
There are two common problems with migrations and there are ways to recover from them.
One problem is specific with SQLite. SQLite does not enforce column types and cannot drop columns. This means that if you have a column of type string and you remove it, it is not really removed. If you add the column again with a different type (for example datetime) you end up with a datetime column that contains strings (junk for practical purposes). web2py does not complain about this because it does not know what is in the database, until it tries to retrieve records and fails.
If web2py returns an error in the gluon.sql.parse function when selecting records, this is the problem: corrupted data in a column because of the above issue.
The solution consists in updating all records of the table and updating the values in the column in question with None.
The other problem is more generic but typical with MySQL. MySQL does not allow more than one ALTER TABLE in a transaction. This means that web2py must break complex transactions into smaller ones (one ALTER TABLE at the time) and commit one piece at the time. It is therefore possible that part of a complex transaction gets committed and one part fails, leaving web2py in a corrupted state. Why would part of a transaction fail? Because, for example, it involves altering a table and converting a string column into a datetime column, web2py tries to convert the data, but the data cannot be converted. What happens to web2py? It gets confused about what exactly is the table structure actually stored in the database.
Given a table, you can insert records
1
>>> db.person.insert(name="Bob")
2
``:code
Insert returns the unique "id" value of each record inserted.
You can truncate the table, i.e., delete all records and reset the counter of the id.
``truncate``:inxx
``
>>> db.person.truncate()
``:code
Now, if you insert a record again, the counter starts again at 1 (this is back-end specific and does not apply to Google NoSQL):
``
>>> db.person.insert(name="Alex")
1
``:code
``bulk_insert``:inxx
web2py also provides a bulk_insert method
``
>>> db.person.bulk_insert([{'name':'Alex'}, {'name':'John'}, {'name':'Tim'}])
[3,4,5]
``:code
It takes a list of dictionaries of fields to be inserted and performs multiple inserts at once. It returns the IDs of the inserted records. On the supported relational databases there is no advantage in using this function as opposed to looping and performing individual inserts but on Google App Engine NoSQL, there is a major speed advantage.
### ``commit`` and ``rollback``
No create, drop, insert, truncate, delete, or update operation is actually committed until you issue the commit command
``commit``:inxx
``
>>> db.commit()
``:code
To check it let's insert a new record:
``
Here is an example:
------
``SQLTABLE`` is useful but there are types when one needs more. ``SQLFORM.grid`` is an extension of SQLTABLE that creates a table with search features and pagination, as well as ability to open detailed records, create, edit and delete records. ``SQLFORM.smartgrid`` is a further generalization that allows all of the above but also creates buttons to access referencing records.
------
Here is an example of usage of ``SQLFORM.grid``:
``
def index():
return dict(grid=SQLFORM.grid(query))
``:code
and the corresponding view:
``
{{extend 'layout.html'}}
{{=grid}}
``
``SQLFORM.grid`` and ``SQLFORM.smartgrid`` should be preferred to ``SQLTABLE`` because they are more powerful although higher level and therefore more constraining. They will be explained in more detail in chapter 8.
#### ``orderby``, ``groupby``, ``limitby``, ``distinct``
The ``select`` command takes five optional arguments: orderby, groupby, limitby, left and cache. Here we discuss the first three.
You can fetch the records sorted by name:
``orderby``:inxx
``
>>> for row in db().select(
db.person.ALL, orderby=db.person.name):
print row.name
Alex
Bob
Carl
``:code
You can fetch the records sorted by name in reverse order (notice the tilde):
``
>>> for row in db().select(
db.person.ALL, orderby=~db.person.name):
print row.name
Carl
Bob
Alex
``:code
You can have the fetched records appear in random order:
And you can sort the records according to multiple fields by concatenating them
``
>>> for row in db().select(
db.person.ALL, orderby=db.person.name|db.person.id):
print row.name
Carl
Bob
Alex
``:code
Using ``groupby`` together with ``orderby``, you can group records with the same value for the specified field (this is back-end specific and is not available on Google NoSQL):
``
>>> for row in db().select(
db.person.ALL,
orderby=db.person.name, groupby=db.person.name):
print row.name
Alex
Bob
Carl
``:code
``distinct``:inxx
With the argument ``distinct=True``, you can specify that you only want to select distinct records. This has the same effect as grouping using all specified fields except that it does not require sorting. When using distinct it is important not to select ALL fields, and in particular not to select the "id" field, else all records will always be distinct.
Here is an example:
``
>>> for row in db().select(db.person.name, distinct=True):
print row.name
Alex
Bob
Carl
``:code
Notice that ``distinct`` can also be an expression, for example:
``
>>> for row in db().select(db.person.name,distinct=db.person.name):
print row.name
Alex
Bob
Carl
``:code
You can combine two queries, for example with the binary OR operator "``|``":
``
>>> rows = db((db.person.name=='Alex') | (db.person.id>3)).select()
>>> for row in rows: print row.id, row.name
1 Alex
``:code
You can negate a query (or sub-query) with the "``!=``" binary operator:
``
>>> rows = db((db.person.name!='Alex') | (db.person.id>3)).select()
>>> for row in rows: print row.id, row.name
2 Bob
3 Carl
``:code
or by explicit negation with the "``~``" unary operator:
``
>>> rows = db(~(db.person.name=='Alex') | (db.person.id>3)).select()
>>> for row in rows: print row.id, row.name
2 Bob
3 Carl
``:code
Due to Python restrictions in overloading "``and``" and "``or``" operators, these cannot be used in forming queries. The binary operators "``&``" and "``|``" must be used instead. Note that these operators (unlike "``and``" and "``or``") have higher precedence than comparison operators, so the "extra" parentheses in the above examples are mandatory.
It is also possible to build queries using in-place logical operators:
``
>>> query = db.person.name!='Alex'
>>> query &= db.person.id>3
>>> query |= db.person.name=='John'
``:code
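The composed query can then be used like any other query; for example:
``
>>> rows = db(query).select()
``:code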
#### ``count``, ``isempty``, ``delete``, ``update``
You can count records in a set:
``count``:inxx ``isempty``:inxx
``
>>> print db(db.person.id > 0).count()
3
``:code
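``isempty`` checks whether a set is empty without counting all its records, while ``delete`` and ``update`` act on every record in the set. A minimal sketch (values are illustrative):
``
>>> print db(db.person.id > 0).isempty()
False
>>> db(db.person.id > 3).delete()
>>> db(db.person.id > 0).update(name='Ken')
``:code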
#### ``as_dict`` and ``as_list``
``as_list``:inxx ``as_dict``:inxx
A Row object can be serialized into a regular dictionary using the ``as_dict()`` method and a Rows object can be serialized into a list of dictionaries using the ``as_list()`` method. Here are some examples:
``
>>> rows = db(query).select()
>>> rows_list = rows.as_list()
>>> first_row_dict = rows.first().as_dict()
``:code
These methods are convenient for passing Rows to generic views or for storing Rows in sessions (Rows objects themselves cannot be serialized because they contain a reference to an open DB connection):
``
>>> rows = db(query).select()
>>> session.rows = rows # not allowed!
>>> session.rows = rows.as_list() # allowed!
``:code
#### ``find``, ``exclude``, ``sort``
``find``:inxx ``exclude``:inxx ``sort``:inxx
There are times when one needs to perform two selects and one contains a subset of a previous select. In this case it is pointless to access the database again. The ``find``, ``exclude`` and ``sort`` methods allow you to manipulate a Rows object and generate another one without accessing the database. More specifically:
- ``find`` returns a new set of Rows filtered by a condition and leaves the original unchanged.
- ``exclude`` returns a new set of Rows filtered by a condition and removes them from the original Rows.
- ``sort`` returns a new set of Rows sorted by a condition and leaves the original unchanged.
All these methods take a single argument, a function that acts on each individual row.
Here is an example of usage:
``
>>> db.define_table('person',Field('name'))
>>> db.person.insert(name='John')
>>> db.person.insert(name='Max')
>>> db.person.insert(name='Alex')
>>> rows = db(db.person).select()
>>> for row in rows.find(lambda row: row.name[0]=='M'):
    print row.name
Max
>>> print len(rows)
3
>>> for row in rows.exclude(lambda row: row.name[0]=='M'):
    print row.name
Max
>>> print len(rows)
2
>>> for row in rows.sort(lambda row: row.name):
print row.name
Alex
John
``:code
They can be combined:
``
>>> rows = db(db.person).select()
>>> rows = rows.find(
lambda row: 'x' in row.name).sort(
lambda row: row.name)
>>> for row in rows:
print row.name
Alex
Max
``:code
The ``find`` method has an optional ``limitby`` argument with the same syntax and functionality as the Set ``select`` method.
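For example, a brief sketch:
``
>>> rows = db(db.person).select()
>>> match = rows.find(lambda row: 'a' in row.name, limitby=(0, 1))
``:code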
### Other methods
#### ``update_or_insert``
``update_or_insert``:inxx
Sometimes you need to perform an insert only if there is no record with the same values as those being inserted.
This can be done with
``
db.define_table('person',Field('name'),Field('birthplace'))
db.person.update_or_insert(name='John',birthplace='Chicago')
``:code
The record will be inserted only if there is no other user called John born in Chicago.
You can specify which values to use as a key to determine if the record exists. For example:
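``
db.person.update_or_insert(db.person.name=='John',
     name='John', birthplace='Chicago')
``:code
Here the first (unnamed) argument is a query acting as the key: if a record matching it exists it is updated, otherwise a new record is inserted.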
### Virtual fields
``virtual fields``:inxx
Consider the following model:
``
>>> db.define_table('item',
Field('unit_price','double'),
Field('quantity','integer'))
``:code
One can define a ``total_price`` virtual field as
``
>>> db.item.total_price = Field.Virtual(lambda row: row.unit_price*row.quantity)
``:code
i.e. by simply defining a new field ``total_price`` to be a ``Field.Virtual``. The only argument of the constructor is a function that takes a row and returns the computed value.
A virtual field defined as the one above is automatically computed for all records when the records are selected:
``
>>> for row in db(db.item).select(): print row.total_price
``
It is also possible to define lazy virtual fields which are calculated on-demand, when called.
For example:
``
>>> db.item.total_price = Field.Lazy(lambda row, discount=0.0: \
row.unit_price*row.quantity*(1.0-discount/100))
``:code
In this case ``row.total_price`` is not a value but a function. The function takes the same arguments as the function passed to the ``Lazy`` constructor, except for ``row`` which is implicit (think of it as ``self`` for Row objects).
The lazy field in the example above allows one to compute the total price for each ``item``:
``
>>> for row in db(db.item).select(): print row.total_price()
``
It also allows passing an optional ``discount`` percentage (here 15%):
``
>>> for row in db(db.item).select(): print row.total_price(15)
``
------
Mind that virtual fields do not have the same attributes as the other fields (default, readable, requires, etc) and they do not appear in the list of ``db.table.fields`` and are not visualized by default in tables (TABLE) and grids (SQLFORM.grid, SQLFORM.smartgrid).
------
### Exporting and importing data
``export_to_csv_file``:inxx ``import_from_csv_file``:inxx
#### CSV (one Table at a time)
When a Rows object is converted to a string it is automatically serialized in CSV:
``csv``:inxx
``
>>> rows = db(db.person.id==db.thing.owner).select()
>>> print rows
person.id,person.name,thing.id,thing.name,thing.owner
1,Alex,1,Boat,1
1,Alex,2,Chair,1
2,Bob,3,Shoes,2
``:code
You can serialize a single table in CSV and store it in a file "test.csv":
``
>>> open('test.csv', 'w').write(str(db(db.person.id).select()))
``:code
and you can easily read it back with:
``
>>> db.person.import_from_csv_file(open('test.csv', 'r'))
``:code
When importing, web2py looks for the field names in the CSV header. In this example, it finds two columns: "person.id" and "person.name". It ignores the "person." prefix, and it ignores the "id" fields. Then all records are appended and assigned new ids. Both of these operations can be performed via the appadmin web interface.
#### CSV (all tables at once)
In web2py, you can backup/restore an entire database with two commands:
To export:
``
>>> db.export_to_csv_file(open('somefile.csv', 'wb'))
``:code
To import:
``
>>> db.import_from_csv_file(open('somefile.csv', 'rb'))
``:code
If you need to serialize the Rows in any other XML format with custom tags, you can easily do that using the universal ``TAG`` helper and the ``*`` notation.
#### Data representation
``export_to_csv_file``:inxx
The ``export_to_csv_file`` function accepts a keyword argument named ``represent``. When ``True`` it will use the columns' ``represent`` function while exporting the data instead of the raw data.
``colnames``:inxx
The function also accepts a keyword argument named ``colnames`` that should contain a list of names of the columns one wishes to export. It defaults to all columns.
Both ``export_to_csv_file`` and ``import_from_csv_file`` accept keyword arguments that tell the csv parser the format to save/load the files:
- ``delimiter``: delimiter to separate values (default ',')
- ``quotechar``: character to use to quote string values (default to double quotes)
- ``quoting``: quote system (default ``csv.QUOTE_MINIMAL``)
Here is some example usage:
``
>>> import csv
>>> db.export_to_csv_file(open('/tmp/test.txt', 'w'),
    delimiter='|',
    quotechar='"',
    quoting=csv.QUOTE_NONNUMERIC)
``:code
which would render something similar to
``
"hello"|35|"this is the text description"|"2009-03-03"
``:code
For more information consult the official Python documentation ``quoteall``:cite

#### ``first`` and ``last``
``first``:inxx ``last``:inxx
``rows.first()`` and ``rows.last()`` return the first and the last row of a Rows object, or ``None`` if the set is empty:
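``
>>> rows = db(query).select()
>>> first_row = rows.first()
>>> last_row = rows.last()
``:code
which are equivalent to
``
>>> first_row = rows[0] if len(rows)>0 else None
>>> last_row = rows[-1] if len(rows)>0 else None
``:code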
#### ``update_record``
``update_record``:inxx
The ``update_record`` method is available only if the table's ``id`` field is included in the select, and ``cacheable`` is not set to ``True``.
It is also possible to change the attributes of a row (one at a time) and then call ``update_record()`` without arguments to save the changes:
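``
>>> row = db(db.person.id==2).select().first()
>>> row.name = 'Curt'
>>> row.update_record()
``:code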

``sqlite3``, ``pymysql``, ``pg8000``, and ``imaplib`` ship with web2py. Support of MongoDB is experimental. The IMAP option allows you to use the DAL to access IMAP.

#### Recursive selects
``recursive selects``:inxx
Consider the previous table person and a new table "thing" referencing a "person":
``
>>> db.define_table('thing',
    Field('name'),
    Field('owner','reference person'))
``:code
and a simple select from this table:
``
>>> things = db(db.thing).select()
``:code
which is equivalent to
``
>>> things = db(db.thing._id>0).select()
``:code
where ``._id`` is a reference to the primary key of the table. Normally ``db.thing._id`` is the same as ``db.thing.id`` and we will assume that in most of this book. ``_id``:inxx
For each Row of things it is possible to fetch not just fields from the selected table (thing) but also from linked tables (recursively):
``
>>> for thing in things: print thing.name, thing.owner.name
``:code
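The reference can also be traversed in the opposite direction, from a person to the things it owns (a sketch; here ``person.thing`` denotes the set of records in table thing that reference that person):
``
>>> for person in db(db.person).select():
    print person.name, person.thing.select(db.thing.name)
``:code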
### One to many relation
``one to many``:inxx
To illustrate how to implement one to many relations with the web2py DAL, define another table "thing" that refers to the table "person" which we redefine here:
``
>>> db.define_table('person',
Field('name'),
format='%(name)s')
>>> db.define_table('thing',
Field('name'),
Field('owner', 'reference person'),
format='%(name)s')
``:code
+Table "thing" has two fields, the name of the thing and the owner of the thing. The "owner" field id a reference field. A reference type can be specified in two equivalent ways:
+
+``
+Field('owner', 'reference person')
+Field('owner', db.person)
+``:code
+
+The latter is always converted to the former. They are equivalent except in the case of lazy tables, self references or other types of cyclic references where the former notation is the only allowed notation.
+
+When a field type is another table, it is intended that the field reference the other table by its id. In fact, you can print the actual type value and get:
``
>>> print db.thing.owner.type
reference person
``:code
Now, insert three things, two owned by Alex and one by Bob:
``
>>> db.thing.insert(name='Boat', owner=1)
1
>>> db.thing.insert(name='Chair', owner=1)
2
>>> db.thing.insert(name='Shoes', owner=2)
3
``:code
You can select as you did for any other table:
``
>>> for row in db(db.thing.owner==1).select():
print row.name
Boat
Chair
``:code
### Caching selects
The select method also takes a cache argument, which defaults to None. For caching purposes, it should be set to a tuple where the first element is the cache model (cache.ram, cache.disk, etc.), and the second element is the expiration time in seconds.
In the following example, you see a controller that caches a select on the previously defined db.log table. The actual select fetches data from the back-end database no more frequently than once every 60 seconds and stores the result in cache.ram. If the next call to this controller occurs in less than 60 seconds since the last database IO, it simply fetches the previous data from cache.ram.
``cache select``:inxx
``
def cache_db_select():
logs = db().select(db.log.ALL, cache=(cache.ram, 60))
return dict(logs=logs)
``:code
``cacheable``:inxx
The ``select`` method has an optional ``cacheable`` argument, normally set to ``False``. When ``cacheable=True`` the resulting ``Rows`` object is serializable, but the ``Row``s lack ``update_record`` and ``delete_record`` methods.
If you do not need these methods you can speed up selects a lot by setting the cacheable attribute:
``
rows = db(query).select(cacheable=True)
``:code
-------
The results of a ``select`` are normally complex, un-pickleable objects; they cannot be stored in a session and cannot be cached in any other way than the one explained here unless the ``cache`` argument is set or ``cacheable=True``.
-------
When the ``cache`` argument is set but ``cacheable=False`` (the default) only the database results are cached, not the actual Rows object. When the ``cache`` argument is used in conjunction with ``cacheable=True`` the entire Rows object is cached, and this results in much faster caching:

``
rows = db(query).select(cache=(cache.ram,3600),cacheable=True)
``:code

### Self-Reference and aliases
``self reference``:inxx
``alias``:inxx
It is possible to define tables with fields that refer to themselves. Here is an example:
``reference table``:inxx
``
db.define_table('person',
Field('name'),
Field('father_id', 'reference person'),
Field('mother_id', 'reference person'))
``:code
Notice that the alternative notation of using a table object as field type will fail in this case, because it uses the variable ``db.person`` before it is defined:
``
db.define_table('person',
Field('name'),
    Field('father_id', db.person), # wrong!
    Field('mother_id', db.person)) # wrong!
``:code
In general ``db.tablename`` and ``"reference tablename"`` are equivalent field types, but the latter is the only one allowed for self-references.
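The aliases mentioned in the section title are needed to join a table with itself. A sketch using ``with_alias`` (assuming the ``person`` table with ``father_id`` and ``mother_id`` defined above):
``
father = db.person.with_alias('father')
mother = db.person.with_alias('mother')
rows = db().select(db.person.name, father.name, mother.name,
    left=(father.on(father.id==db.person.father_id),
          mother.on(mother.id==db.person.mother_id)))
``:code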

The ``big-id`` and ``big-reference`` types are only supported by some of the database engines and are experimental. They are not normally used as field types except for legacy tables; however, the DAL constructor has a ``bigint_id`` argument that when set to ``True`` makes the ``id`` fields and ``reference`` fields ``big-id`` and ``big-reference`` respectively.
The ``list:`` fields are special because they are designed to take advantage of certain denormalization features on NoSQL (in the case of Google App Engine NoSQL, the field types ``ListProperty`` and ``StringListProperty``) and back-port them to all the other supported relational databases. On relational databases lists are stored as a ``text`` field. The items are separated by a ``|`` and each ``|`` in a string item is escaped as ``||``. They are discussed in their own section.
-------
Notice that ``requires=...`` is enforced at the level of forms, ``required=True`` is enforced at the level of the DAL (insert), while ``notnull``, ``unique`` and ``ondelete`` are enforced at the level of the database. While they sometimes may seem redundant, it is important to maintain the distinction when programming with the DAL.
-------
``ondelete``:inxx
- ``ondelete`` translates into the "ON DELETE" SQL statement. By default it is set to "CASCADE". This tells the database that when it deletes a record, it should also delete all records that refer to it. To disable this feature, set ``ondelete`` to "NO ACTION" or "SET NULL".
- ``notnull=True`` translates into the "NOT NULL" SQL statement. It prevents the database from inserting null values for the field.
- ``unique=True`` translates into the "UNIQUE" SQL statement and it makes sure that values of this field are unique within the table. It is enforced at the database level.
- ``uploadfield`` applies only to fields of type "upload". A field of type "upload" stores the name of a file saved somewhere else, by default on the filesystem under the application "uploads/" folder. If ``uploadfield`` is set, then the file is stored in a blob field within the same table and the value of ``uploadfield`` is the name of the blob field. This will be discussed in more detail later in the context of SQLFORM.
- ``uploadfolder`` defaults to the application's "uploads/" folder. If set to a different path, files will be uploaded to a different folder. For example, ``uploadfolder=os.path.join(request.folder,'static/temp')`` will upload files to the web2py/applications/myapp/static/temp folder.
- ``uploadseparate`` if set to True will upload files under different subfolders of the ''uploadfolder'' folder. This is optimized to avoid too many files under the same folder/subfolder. ATTENTION: You cannot change the value of ``uploadseparate`` from True to False without breaking the system. web2py either uses the separate subfolders or it does not. Changing the behavior after files have been uploaded will prevent web2py from being able to retrieve those files. If this happens it is possible to move files and fix the problem but this is not described here.
- ``uploadfs`` allows you to specify a different filesystem where files are uploaded, including Amazon S3 storage or remote FTP storage. This option requires PyFileSystem to be installed. ``uploadfs`` must point to ``PyFileSystem``. ``PyFileSystem``:inxx ``uploadfs``:inxx
- ``widget`` must be one of the available widget objects, including custom widgets, for example: ``SQLFORM.widgets.string.widget``. A list of available widgets will be discussed later. Each field type has a default widget.
- ``label`` is a string (or something that can be serialized to a string) that contains the label to be used for this field in autogenerated forms.
- ``comment`` is a string (or something that can be serialized to a string) that contains a comment associated with this field, and will be displayed to the right of the input field in the autogenerated forms.
- ``writable`` if a field is writable, it can be edited in autogenerated create and update forms.
The first argument of ``define_table`` is always the table name. The other unnamed arguments are the fields (Field). ``define_table`` also takes an optional last argument called ``migrate``:
``
>>> db.define_table('person', Field('name'), migrate='person.table')
``:code
The value of migrate is the filename (in the "databases" folder for the application) where web2py stores internal migration information for this table. These files are very important and should never be removed except when the entire database is dropped. In this case, the ".table" files have to be removed manually. By default, migrate is set to True. This causes web2py to generate the filename from a hash of the connection string. If migrate is set to False, the migration is not performed, and web2py assumes that the table exists in the datastore and it contains (at least) the fields listed in ``define_table``.
The best practice is to give an explicit name to the migrate file.
There may not be two tables in the same application with the same migrate filename.
The DAL class also takes a "migrate" argument, which determines the default value of migrate for calls to ``define_table``. For example,
``
>>> db = DAL('sqlite://storage.db', migrate=False)
``:code
will set the default value of migrate to False whenever ``db.define_table`` is called without a migrate argument.
Migrations can be disabled for all tables at the moment of connection:
``
db = DAL(...,migrate_enabled=False)
``:code
This is the recommended behavior when two apps share the same database. Only one of the two apps should perform migrations, the other should disable them.
### Fixing broken migrations
``fake_migrate``:inxx
There are two common problems with migrations and there are ways to recover from them.
One problem is specific to SQLite. SQLite does not enforce column types and cannot drop columns. This means that if you have a column of type string and you remove it, it is not really removed. If you add the column again with a different type (for example datetime) you end up with a datetime column that contains strings (junk for practical purposes). web2py does not complain about this because it does not know what is in the database, until it tries to retrieve records and fails.
If web2py returns an error in the gluon.sql.parse function when selecting records, this is the problem: corrupted data in a column because of the above issue.
The solution consists in updating all records of the table, setting the values in the column in question to None.
The other problem is more generic but typical of MySQL. MySQL does not allow more than one ALTER TABLE in a transaction. This means that web2py must break complex transactions into smaller ones (one ALTER TABLE at a time) and commit one piece at a time. It is therefore possible that part of a complex transaction gets committed and one part fails, leaving web2py in a corrupted state. Why would part of a transaction fail? Because, for example, if it involves altering a table and converting a string column into a datetime column, web2py tries to convert the data, but the data may not be convertible. What happens to web2py? It gets confused about which table structure is actually stored in the database.
The solution consists of disabling migrations for all tables and enabling fake migrations:
``
db.define_table(....,migrate=False,fake_migrate=True)
``:code
This will rebuild web2py metadata about the table according to the table definition. Try multiple table definitions to see which one works (the one before the failed migration and the one after the failed migration). Once successful remove the ``fake_migrate=True`` attribute.
The DAL allows you to explicitly issue SQL statements.
``executesql``:inxx
``
>>> print db.executesql('SELECT * FROM person;')
[(1, u'Massimo'), (2, u'Massimo')]
``:code
In this case, the return values are not parsed or transformed by the DAL, and the format depends on the specific database driver. This usage with selects is normally not needed, but it is more common with indexes.
``executesql`` takes two optional arguments: ``placeholders`` and ``as_dict``.
``placeholders`` is an optional sequence of values to be substituted in or, if supported by the DB driver, a dictionary with keys matching named placeholders in your SQL.
If ``as_dict`` is set to True, the results cursor returned by the DB driver will be converted to a sequence of dictionaries keyed with the db field names. Results returned with ``as_dict=True`` are the same as those returned when applying **.as_list()** to a normal select:
``
[{field1: value1, field2: value2}, {field1: value1b, field2: value2b}]
``:code
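For example, a minimal sketch (the placeholder style, here ``?``, depends on the specific DB driver, and the output assumes the person table used earlier):
``
>>> db.executesql('SELECT * FROM person WHERE name=?;',
    placeholders=['Alex'], as_dict=True)
[{'id': 1, 'name': 'Alex'}]
``:code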
``executesql`` also takes two other optional arguments: ``fields`` and ``colnames``.
The ``fields`` argument is a list of DAL Field objects that match the
fields returned from the DB. The Field objects should be part of one or
more Table objects defined on the DAL object. The ``fields`` list can
include one or more DAL Table objects in addition to or instead of
including Field objects, or it can be just a single table (not in a
list). In that case, the Field objects will be extracted from the
table(s).
Instead of specifying the ``fields`` argument, the ``colnames`` argument
can be specified as a list of field names in tablename.fieldname format.
Again, these should represent tables and fields defined on the DAL
object.
It is also possible to specify both ``fields`` and the associated
``colnames``. In that case, ``fields`` can also include DAL Expression
objects in addition to Field objects. For Field objects in "fields",
the associated ``colnames`` must still be in tablename.fieldname format.
For Expression objects in ``fields``, the associated ``colnames`` can
be any arbitrary labels.
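For instance, a brief sketch of both styles (assuming the person table is defined on ``db``):
``
>>> rows = db.executesql('SELECT id, name FROM person;', fields=db.person)
>>> rows = db.executesql('SELECT id, name FROM person;',
    colnames=['person.id', 'person.name'])
``:code
In both cases the result is parsed into a ``Rows`` object rather than returned as raw tuples.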
### Legacy databases and keyed tables
web2py can connect to legacy databases under some conditions. The easiest way is when these conditions are met:
- Each table must have a unique auto-increment integer field called "id"
- Records must be referenced exclusively using the "id" field
When accessing an existing table, i.e., a table not created by web2py in the current application, always set ``migrate=False``.
-------
If the legacy table has an auto-increment integer field but it is not called "id", web2py can still access it, but the table definition must explicitly contain ``Field('....','id')``, where '....' is the name of the auto-increment integer field.
``keyed table``:inxx
Finally if the legacy table uses a primary key that is not an auto-increment id field it is possible to use a "keyed table", for example:
``
db.define_table('account',
Field('accnum','integer'),
Field('acctype'),
Field('accdesc'),
primarykey=['accnum','acctype'],
migrate=False)
``:code
- ``primarykey`` is a list of the field names that make up the primary key.
- All primarykey fields have a ``NOT NULL`` set even if not specified.
- Keyed tables can only refer to other keyed tables.
- Referencing fields must use the ``reference tablename.fieldname`` format.
- The ``update_record`` function is not available for Rows of keyed tables.
-------
Note that currently this is only available for DB2, MS-SQL, Ingres and Informix, but others can be easily added.
-------
At the time of writing, we cannot guarantee that the ``primarykey`` attribute works with every existing legacy table and every supported database backend.
For simplicity, we recommend, if possible, creating a database view that has an auto-increment id field.
### Distributed transaction
``distributed transactions``:inxx
------
At the time of writing this feature is only supported
by PostgreSQL, MySQL and Firebird, since they expose API for two-phase commits.
------
Assuming you have two (or more) connections to distinct PostgreSQL databases, for example:
``
db_a = DAL('postgres://...')
db_b = DAL('postgres://...')
``:code
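In your models or controllers you can then commit both connections concurrently with:
``
DAL.distributed_transaction_commit(db_a, db_b)
``:code
On failure, this function rolls back and raises an Exception.
#### ``SQLTABLE``
``SQLTABLE``:inxx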
The SQLTABLE constructor takes the following optional arguments:
- ``headers`` a dictionary mapping field names to their labels to be used as headers (default to ``{}``). It can also be an instruction. Currently we support ``headers='fieldname:capitalize'``.
- ``truncate`` the number of characters for truncating long values in the table (default is 16)
- ``columns`` the list of fieldnames to be shown as columns (in tablename.fieldname format).
Those not listed are not displayed (defaults to all).
- ``**attributes`` generic helper attributes to be passed to the most external TABLE object.
Here is an example:
``
{{extend 'layout.html'}}
<h1>Records</h1>
{{=SQLTABLE(rows,
headers='fieldname:capitalize',
truncate=100,
upload=URL('download'))
}}
``:code
#### Inner joins
``inner joins``:inxx ``join``:inxx
Another way to achieve a similar result is by using a join, specifically an INNER JOIN. web2py performs joins automatically and transparently when the query links two or more tables:
``
>>> rows = db(db.person.id==db.thing.owner).select()
>>> for row in rows:
    print row.person.name, 'has', row.thing.name
Alex has Boat
Alex has Chair
Bob has Shoes
``:code
Observe that web2py did a join, so the rows now contain two records, one from each table, linked together. Because the two records may have fields with conflicting names, you need to specify the table when extracting a field value from a row. This means that while before you could do:
``
row.name
``:code
and it was obvious whether this was the name of a person or a thing, in the result of a join you have to be more explicit and say:
``
row.person.name
``:code
or:
``
row.thing.name
``:code
There is an alternative syntax for INNER JOINS:
``
>>> rows = db(db.person).select(join=db.thing.on(db.person.id==db.thing.owner))
>>> for row in rows:
print row.person.name, 'has', row.thing.name
Alex has Boat
Alex has Chair
Bob has Shoes
``:code
While the output is the same, the generated SQL in the two cases can be different. The latter syntax removes possible ambiguities when the same table is joined twice and aliased:
``
>>> db.define_table('thing',
Field('name'),
Field('owner1','reference person'),
Field('owner2','reference person'))
>>> rows = db(db.person).select(
join=[db.person.with_alias('owner1').on(db.person.id==db.thing.owner1),
      db.person.with_alias('owner2').on(db.person.id==db.thing.owner2)])
``:code
For the sake of the example, you can log events with the same event_time but with different severity; for instance:
``
>>> print db.log.insert(
event='unauthorized login', event_time=now, severity=3)
3
``:code
#### ``like``, ``regexp``, ``startswith``, ``contains``, ``upper``, ``lower``
``like``:inxx ``startswith``:inxx ``regexp``:inxx
``contains``:inxx ``upper``:inxx ``lower``:inxx
Fields have a like operator that you can use to match strings:
``
>>> for row in db(db.log.event.like('port%')).select():
print row.event
port scan
``:code
Here "port%" indicates a string starting with "port". The percent sign character, "%", is a wild-card character that means "any sequence of characters".
The like operator is case-insensitive but it can be made case-sensitive with
``
db.mytable.myfield.like('value',case_sensitive=True)
``:code
web2py also provides some shortcuts:
``
db.mytable.myfield.startswith('value')
db.mytable.myfield.contains('value')
``:code
which are equivalent respectively to
``
db.mytable.myfield.like('value%')
db.mytable.myfield.like('%value%')
``:code
#### ``belongs``
``belongs``:inxx
The SQL IN operator is implemented via the ``belongs`` method, which returns true when the field value belongs to the specified set (a list or tuple):
``
>>> for row in db(db.log.event.belongs(('port scan', 'xss injection'))).select():
    print row.event
port scan
xss injection
``:code
The DAL also allows a nested select as the argument of the belongs operator. The only caveat is that the nested select has to be a ``_select``, not a ``select``, and only one field has to be selected explicitly, the one that defines the set.
``nested select``:inxx
``
>>> bad_days = db(db.log.severity==3)._select(db.log.event_time)
>>> for row in db(db.log.event_time.belongs(bad_days)).select():
print row.event
port scan
xss injection
unauthorized login
``:code
In those cases where a nested select is required and the look-up field is a reference, we can also use a query as argument. For example:
``
db.define_table('person',Field('name'))
db.define_table('thing',Field('name'),Field('owner','reference person'))
db(db.thing.owner.belongs(db.person.name=='Jonathan')).select()
``:code
In this case it is obvious that the nested select only needs the field referenced by the ``db.thing.owner`` field, so we do not need the more verbose ``_select`` notation.
``nested_select``:inxx
A nested select can also be used as insert/update value but in this case the syntax is different:
``
lazy = db(db.person.name=='Jonathan').nested_select(db.person.id)
db(db.thing.id==1).update(owner = lazy)
``:code
In this case ``lazy`` is a nested expression that computes the ``id`` of person "Jonathan". The two lines result in one single SQL query.
#### ``sum``, ``avg``, ``min``, ``max`` and ``len``
``sum``:inxx ``avg``:inxx ``min``:inxx ``max``:inxx
Previously, you have used the count operator to count records. Similarly, you can use the sum operator to add (sum) the values of a specific field from a group of records. As in the case of count, the result of a sum is retrieved via the Rows object:
``
>>> sum = db.log.severity.sum()
>>> print db().select(sum).first()[sum]
6
``:code
You can also use ``avg``, ``min``, and ``max`` to compute the average, minimum, and maximum value respectively for the selected records. For example:
``
>>> max = db.log.severity.max()
>>> print db().select(max).first()[max]
3
``:code
``.len()`` computes the length of string, text and boolean fields.
Expressions can be combined to form more complex expressions. For example, here we are computing the sum of the lengths of all the severity strings in the logs, increased by one:
``
>>> sum = (db.log.severity.len()+1).sum()
>>> print db().select(sum).first()[sum]
``:code
#### Substrings
One can build an expression to refer to a substring. For example, we can group things whose name starts with the same three characters and select only one from each group:
``
db(db.thing).select(distinct = db.thing.name[:3])
``:code
#### Default values with ``coalesce`` and ``coalesce_zero``
There are times when you need to pull a value from the database but also need a default value if the value for a record is set to NULL. In SQL there is a keyword, ``COALESCE``, for this. web2py has an equivalent ``coalesce`` method:
``
>>> db.define_table('sysuser',Field('username'),Field('fullname'))
>>> db.sysuser.insert(username='max',fullname='Max Power')
>>> db.sysuser.insert(username='tim',fullname=None)
print db(db.sysuser).select(db.sysuser.fullname.coalesce(db.sysuser.username))
"COALESCE(sysuser.fullname,sysuser.username)"
Max Power
tim
``
Other times you need to compute a mathematical expression, but some fields have a value set to None when it should be zero.
``coalesce_zero`` comes to the rescue by defaulting None to zero in the query:
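A minimal sketch (the table name, ``points`` field and values are illustrative):
``
>>> db.define_table('gamer', Field('username'), Field('points', 'integer'))
>>> db.gamer.insert(username='max', points=10)
>>> db.gamer.insert(username='tim', points=None)
>>> print db(db.gamer).select(db.gamer.points.coalesce_zero().sum())
"SUM(COALESCE(gamer.points,0))"
10
``:code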
#### Callbacks on record insert, delete and update
``_before_insert``:inxx ``_after_insert``:inxx ``_before_update``:inxx ``_after_update``:inxx ``_before_delete``:inxx ``_after_delete``:inxx
web2py provides a mechanism to register callbacks to be called before and/or after insert, update and delete of records: ``_before_insert``, ``_after_insert``, ``_before_update``, ``_after_update``, ``_before_delete``, and ``_after_delete``. This is best explained via some examples:
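For example, one can register a simple printing callback for each operation (a minimal sketch; the ``pprint`` helper is illustrative):
``
>>> db.define_table('person', Field('name'))
>>> def pprint(*args): print args
>>> db.person._before_insert.append(lambda f: pprint(f))
>>> db.person._after_insert.append(lambda f, id: pprint(f, id))
>>> db.person._before_update.append(lambda s, f: pprint(s, f))
>>> db.person._after_update.append(lambda s, f: pprint(s, f))
>>> db.person._before_delete.append(lambda s: pprint(s))
>>> db.person._after_delete.append(lambda s: pprint(s))
``:code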
Here ``f`` is a dict of fields passed to insert or update, ``id`` is the id of the newly inserted record, ``s`` is the Set object used for update or delete.
``
>>> db.person.insert(name='John')
({'name': 'John'},)
({'name': 'John'}, 1)
>>> db(db.person.id==1).update(name='Tim')
(<Set (person.id = 1)>, {'name': 'Tim'})
(<Set (person.id = 1)>, {'name': 'Tim'})
>>> db(db.person.id==1).delete()
(<Set (person.id = 1)>,)
(<Set (person.id = 1)>,)
``:code
The return values of these callbacks should be ``None`` or ``False``. If any of the ``_before_*`` callbacks returns a ``True`` value it will abort the actual insert/update/delete operation.
``update_naive``:inxx
Sometimes a callback may need to perform an update in the same or a different table, and one wants to avoid callbacks calling themselves recursively.
For this purpose, Set objects have an ``update_naive`` method that works like ``update`` but ignores before and after callbacks.
#### Record versioning
``_enable_record_versioning``:inxx
It is possible to ask web2py to save every copy of a record when the record is modified. There are many ways to do it and it can be done for all tables at once using the ``auth.enable_record_versioning`` method, discussed in the chapter about authentication, or it can be done for each individual table as discussed here.
Consider the following table:
``
db.define_table('stored_item',
Field('name'),
Field('quantity','integer'),
Field('is_active','boolean',
writable=False,readable=False,default=True))
``:code
Notice the hidden boolean field called ``is_active`` defaulting to True.
We can tell web2py to create a new table (in the same or a different database) and store all previous versions of each record in the table, when modified.
This is done in the following way:
``
db.stored_item._enable_record_versioning()
``:code
or in a more verbose syntax:
``
db.stored_item._enable_record_versioning(
archive_db = db,
archive_name = 'stored_item_archive',
current_record = 'current_record',
is_active = 'is_active')
``:code
The ``archive_db=db`` tells web2py to store the archive table in the same database as the ``stored_item`` table. The ``archive_name`` sets the name for the archive table. The archive table has the same fields as the original table ``stored_item``, except that unique fields are no longer unique (because it needs to store multiple versions) and it has an extra field, whose name is specified by ``current_record``, which is a reference to the current record in the ``stored_item`` table.
When records are deleted, they are not really deleted. A deleted record is copied into the ``stored_item_archive`` table (as when it is modified) and the ``is_active`` field is set to False. By enabling record versioning web2py sets a ``custom_filter`` on this table that hides all records in table ``stored_item`` where the ``is_active`` field is set to False. The ``is_active`` parameter in the ``_enable_record_versioning`` method allows you to specify the name of the field used by the ``custom_filter`` to determine if the record was deleted or not.
``custom_filter``s are ignored by the appadmin interface.
#### Common fields and multi-tenancy
``common fields``:inxx
``multi tenancy``:inxx
``db._common_fields`` is a list of fields that should belong to all the tables. This list can also contain tables, in which case it is understood as all fields from the table. For example, occasionally you find yourself in need of adding a signature to all your tables but the ``auth`` tables. In this case, after you ``auth.define_tables()`` but before defining any other table, insert
``
db._common_fields.append(auth.signature)
``
One field is special: "request_tenant".
This field does not exist but you can create it and add it to any of your tables (or all of them):
``
db._common_fields.append(Field('request_tenant',
default=request.env.http_host,writable=False))
``:code
For every table with a field called ``request_tenant``, all records for all queries are automatically filtered by the condition ``db.table.request_tenant == db.table.request_tenant.default`` (multi-tenancy).
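Common filters such as the multi-tenancy filter can be bypassed for an individual query; a sketch using the ``ignore_common_filters`` argument (``db.mytable`` is a placeholder):
``
rows = db(db.mytable.id > 0, ignore_common_filters=True).select()
``:code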
The file "gluon/dal.py" defines, among other, the following classes.
``
ConnectionPool
BaseAdapter extends ConnectionPool
Row
DAL
Reference
Table
Expression
Field
Query
Set
Rows
``
Their use has been explained in the previous sections, except for ``BaseAdapter``. When the methods of a ``Table`` or ``Set`` object need to communicate with the database, they delegate to adapter methods the task of generating the SQL and/or the function call.
For example:
``
db.mytable.insert(myfield='myvalue')
``
calls
``
Table.insert(myfield='myvalue')
``
which delegates to the adapter by calling:
``
db._adapter.insert(db.mytable,db.mytable._listify(dict(myfield='myvalue')))
``
Here ``db.mytable._listify`` converts the dict of arguments into a list of ``(field,value)`` and calls the ``insert`` method of the ``adapter``. ``db._adapter`` does more or less the following:
``
query = db._adapter._insert(db.mytable,list_of_fields)
db._adapter.execute(query)
``
In case you have multiple drivers installed, you can also specify which one a given adapter should use:
``
from gluon.dal import MySQLAdapter
MySQLAdapter.driver = mysqldb
``
and you can specify optional driver arguments and adapter arguments:
``
db = DAL(..., driver_args={}, adapter_args={})
``
#### Gotchas
**SQLite** does not support dropping and altering columns. That means that web2py migrations will work up to a point. If you delete a field from a table, the column will remain in the database but be invisible to web2py. If you decide to reinstate the column, web2py will try to re-create it and fail. In this case you must set ``fake_migrate=True`` so that metadata is rebuilt without attempting to add the column again. Also, for the same reason, **SQLite** is not aware of any change of column type. If you insert a number in a string field, it will be stored as a string. If you later change the model and replace the type "string" with type "integer", SQLite will continue to keep the number as a string and this may cause problems when you try to extract the data.
**MySQL** does not support multiple ALTER TABLE within a single transaction. This means that any migration process is broken into multiple commits. If something happens that causes a failure, it is possible to break a migration (the web2py metadata are no longer in sync with the actual table structure in the database). This is unfortunate but it can be prevented (migrate one table at a time) or fixed a posteriori (revert the web2py model to what corresponds to the table structure in the database, set ``fake_migrate=True`` and, after the metadata has been rebuilt, set ``fake_migrate=False`` and migrate the table again).
**Google SQL** has the same problems as MySQL and more. In particular table metadata itself must be stored in the database in a table that is not migrated by web2py. This is because Google App Engine has a read-only file system. web2py migrations in Google SQL combined with the MySQL issue described above can result in metadata corruption. Again, this can be prevented (by migrating the table at once and then setting ``migrate=False`` so that the metadata table is not accessed any more) or it can be fixed a posteriori (by accessing the database using the Google dashboard and deleting any corrupted entry from the table called ``web2py_filesystem``).
``limitby``:inxx
**MSSQL** does not support the SQL OFFSET keyword. Therefore the database cannot do pagination. When doing a ``limitby=(a,b)`` web2py will fetch the first ``b`` rows and discard the first ``a``. This may result in a considerable overhead when compared with other database engines.
**Oracle** also does not support pagination. It supports neither the OFFSET nor the LIMIT keywords. web2py achieves pagination by translating a ``db(...).select(limitby=(a,b))`` into a complex three-way nested select (as suggested by official Oracle documentation). This works for simple selects but may break for complex selects involving aliased fields and/or joins.
The ``big-id`` and, ``big-reference`` are only supported by some of the database engines and are experimental. They are not normally used as field types unless for legacy tables, however, the DAL constructor has a ``bigint_id`` argument that when set to ``True`` makes the ``id`` fields and ``reference`` fields ``big-id`` and ``big-referece`` respectively.
The ``list:`` fields are special because they are designed to take advantage of certain denormalization features on NoSQL (in the case of Google App Engine NoSQL, the field types ``ListProperty`` and ``StringListProperty``) and back-port them all the other supported relational databases. On relational databases lists are stored as a ``text`` field. The items are separated by a ``|`` and each ``|`` in string item is escaped as a ``||``. They are discussed in their own section.
-------
Notice that ``requires=...`` is enforced at the level of forms, ``required=True`` is enforced at the level of the DAL (insert), while ``notnull``, ``unique`` and ``ondelete`` are enforced at the level of the database. While they sometimes may seem redundant, it is important to maintain the distinction when programming with the DAL.
-------
``ondelete``:inxx
- ``ondelete`` translates into the "ON DELETE" SQL statement. By default it is set to "CASCADE". This tells the database that when it deletes a record, it should also delete all records that refer to it. To disable this feature, set ``ondelete`` to "NO ACTION" or "SET NULL".
- ``notnull=True`` translates into the "NOT NULL" SQL statement. It prevents the database from inserting null values for the field.
- ``unique=True`` translates into the "UNIQUE" SQL statement and it makes sure that values of this field are unique within the table. It is enforced at the database level.
- ``uploadfield`` applies only to fields of type "upload". A field of type "upload" stores the name of a file saved somewhere else, by default on the filesystem under the application "uploads/" folder. If ``uploadfield`` is set, then the file is stored in a blob field within the same table and the value of ``uploadfield`` is the name of the blob field. This will be discussed in more detail later in the context of SQLFORM.
- ``uploadfolder`` defaults to the application's "uploads/" folder. If set to a different path, files will uploaded to a different folder. For example, uploadfolder=os.path.join(request.folder,'static/temp') will upload files to the web2py/applications/myapp/static/temp folder.
- ``uploadseparate`` if set to True will upload files under different subfolders of the ''uploadfolder'' folder. This is optimized to avoid too many files under the same folder/subfolder. ATTENTION: You cannot change the value of ``uploadseparate`` from True to False without breaking the system. web2py either uses the separate subfolders or it does not. Changing the behavior after files have been uploaded will prevent web2py from being able to retrieve those files. If this happens it is possible to move files and fix the problem but this is not described here.
- ``uploadfs`` allows you specify a different filessystem where to upload files, including an Amazon S3 storage or a remote FTP storage. This option requires PyFileSystem installed. ``uploadfs`` must point to ``PyFileSystem``. ``PyFileSystem``:inxx ``uploadfs``:idxx
- ``widget`` must be one of the available widget objects, including custom widgets, for example: ``SQLFORM.widgets.string.widget``. A list of available widgets will be discussed later. Each field type has a default widget.
- ``label`` is a string (or something that can be serialized to a string) that contains the label to be used for this field in autogenerated forms.
- ``comment`` is a string (or something that can be serialized to a string) that contains a comment associated with this field, and will be displayed to the right of the input field in the autogenerated forms.
- ``writable`` if a field is writable, it can be edited in autogenerated create and update forms.
The first argument of ``define_table`` is always the table name. The other unnam
``:code
The value of migrate is the filename (in the "databases" folder for the application) where web2py stores internal migration information for this table. These files are very important and should never be removed except when the entire database is dropped. In this case, the ".table" files have to be removed manually. By default, migrate is set to True. This causes web2py to generate the filename from a hash of the connection string. If migrate is set to False, the migration is not performed, and web2py assumes that the table exists in the datastore and it contains (at least) the fields listed in ``define_table``.
The best practice is to give an explicit name to the migrate table.
There may not be two tables in the same application with the same migrate filename.
The DAL class also takes a "migrate" argument, which determines the default value of migrate for calls to ``define_table``. For example,
``
>>> db = DAL('sqlite://storage.db', migrate=False)
``:code
will set the default value of migrate to False whenever ``db.define_table`` is called without a migrate argument.
Migrations can be disabled for all tables at the moment of connection:
``
db = DAL(...,migrate_enabled=False)
``
This is the recommended behaviour when two apps share the same database. Only one of the two apps should perform migrations, the other should disabled them.
### Fixing broken migrations
``fake_migrate``:inxx
There are two common problems with migrations and there are ways to recover from them.
One problem is specific with SQLite. SQLite does not enforce column types and cannot drop columns. This means that if you have a column of type string and you remove it, it is not really removed. If you add the column again with a different type (for example datetime) you end up with a datetime column that contains strings (junk for practical purposes). web2py does not complain about this because it does not know what is in the database, until it tries to retrieve records and fails.
If web2py returns an error in the gluon.sql.parse function when selecting records, this is the problem: corrupted data in a column because of the above issue.
The solution consists in updating all records of the table and updating the values in the column in question with None.
The other problem is more generic but typical with MySQL. MySQL does not allow more than one ALTER TABLE in a transaction. This means that web2py must break complex transactions into smaller ones (one ALTER TABLE at the time) and commit one piece at the time. It is therefore possible that part of a complex transaction gets committed and one part fails, leaving web2py in a corrupted state. Why would part of a transaction fail? Because, for example, it involves altering a table and converting a string column into a datetime column, web2py tries to convert the data, but the data cannot be converted. What happens to web2py? It gets confused about what exactly is the table structure actually stored in the database.
The solution consists of disabling migrations for all tables and enabling fake migrations:
``
db.define_table(....,migrate=False,fake_migrate=True)
``:code
This will rebuild web2py metadata about the table according to the table definition. Try multiple table definitions to see which one works (the one before the failed migration and the one after the failed migration). Once successful remove the ``fake_migrate=True`` attribute.
The DAL allows you to explicitly issue SQL statements.
>>> print db.executesql('SELECT * FROM person;')
[(1, u'Massimo'), (2, u'Massimo')]
``:code
In this case, the return values are not parsed or transformed by the DAL, and the format depends on the specific database driver. This usage with selects is normally not needed, but it is more common with indexes.
``executesql`` takes two optional arguments: ``placeholders`` and ``as_dict``
``placeholders`` is an optional
sequence of values to be substituted in
or, if supported by the DB driver, a dictionary with keys
matching named placeholders in your SQL.
If ``as_dict`` is set to True,
and the results cursor returned by the DB driver will be
converted to a sequence of dictionaries keyed with the db
field names. Results returned with ``as_dict = True ``are
the same as those returned when applying **.as_list()** to a normal select.
``
[{field1: value1, field2: value2}, {field1: value1b, field2: value2b}]
``:code
``executesql`` have two optional argumens: ``fields`` and ``colnames``.
The ``fields`` argument is a list of DAL Field objects that match the
fields returned from the DB. The Field objects should be part of one or
more Table objects defined on the DAL object. The ``fields`` list can
include one or more DAL Table objects in addition to or instead of
including Field objects, or it can be just a single table (not in a
list). In that case, the Field objects will be extracted from the
table(s).
Instead of specifying the ``fields`` argument, the ``colnames`` argument
can be specified as a list of field names in tablename.fieldname format.
Again, these should represent tables and fields defined on the DAL
object.
It is also possible to specify both ``fields`` and the associated
``colnames``. In that case, ``fields`` can also include DAL Expression
objects in addition to Field objects. For Field objects in "fields",
the associated ``colnames`` must still be in tablename.fieldname format.
For Expression objects in ``fields``, the associated ``colnames`` can
be any arbitrary labels.
The easiest way is when these conditions are met:
When accessing an existing table, i.e., a table not created by web2py in the current application, always set ``migrate=False``.
-------
If the legacy table has an auto-increment integer field but it is not called "id", web2py can still access it but the table definition must contain explicitly as ``Field('....','id')`` where ... is the name of the auto-increment integer field.
``keyed table``:inxx
Finally, if the legacy table uses a primary key that is not an auto-increment id field, it is possible to use a "keyed table", for example:
``
db.define_table('account',
    Field('accnum', 'integer'),
    Field('acctype'),
    Field('accdesc'),
    primarykey=['accnum', 'acctype'],
    migrate=False)
``:code
- ``primarykey`` is a list of the field names that make up the primary key.
- All primarykey fields have ``NOT NULL`` set even if not specified.
- Keyed tables can only reference other keyed tables.
- Referencing fields must use the ``reference tablename.fieldname`` format.
- The ``update_record`` function is not available for Rows of keyed tables.
-------
Note that currently this is only available for DB2, MS-SQL, Ingres and Informix, but others can be easily added.
-------
At the time of writing, we cannot guarantee that the ``primarykey`` attribute works with every existing legacy table and every supported database backend.
For simplicity, we recommend, if possible, creating a database view that has an auto-increment id field.
### Serializing ``Rows`` in views
The SQLTABLE constructor takes the following optional arguments:
- ``headers`` a dictionary mapping field names to their labels to be used as headers (defaults to ``{}``). It can also be an instruction; currently ``headers='fieldname:capitalize'`` is supported.
- ``truncate`` the number of characters for truncating long values in the table (default is 16)
- ``columns`` the list of fieldnames to be shown as columns (in tablename.fieldname format).
Those not listed are not displayed (defaults to all).
- ``**attributes`` generic helper attributes to be passed to the most external TABLE object.
Here is an example:
``
{{extend 'layout.html'}}
<h1>Records</h1>
{{=SQLTABLE(rows,
            headers='fieldname:capitalize',
            truncate=100,
            upload=URL('download'))
}}
``:code
``SQLFORM.grid``:inxx ``SQLFORM.smartgrid``:inxx
------
``SQLTABLE`` is useful but there are times when one needs more. ``SQLFORM.grid`` is an extension of SQLTABLE that creates a table with search features and pagination, as well as the ability to open detailed records, create, edit and delete records. ``SQLFORM.smartgrid`` is a further generalization that allows all of the above but also creates buttons to access referencing records.
------
Here is an example of usage of ``SQLFORM.grid``:
``
def index():
    return dict(grid=SQLFORM.grid(query))
``:code
and the corresponding view:
``
{{extend 'layout.html'}}
{{=grid}}
``
``SQLFORM.grid`` and ``SQLFORM.smartgrid`` are preferable to ``SQLTABLE`` because they are more powerful, although higher level and therefore more constraining. They will be explained in more detail in chapter 8.
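For comparison, a minimal ``SQLFORM.smartgrid`` usage sketch (assuming a hypothetical ``db.person`` table; smartgrid takes a table rather than a query):
``
def index():
    return dict(grid=SQLFORM.smartgrid(db.person))
``:code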
#### ``orderby``, ``groupby``, ``limitby``, ``distinct``
The ``select`` command takes five optional arguments: ``orderby``, ``groupby``, ``limitby``, ``left`` and ``cache``. Here we discuss the first three.
You can fetch the records sorted by name:
``orderby``:inxx
``
>>> for row in db().select(
        db.person.ALL, orderby=db.person.name):
        print row.name
Alex
Bob
Carl
``:code
You can fetch the records sorted by name in reverse order (notice the tilde):
``
>>> for row in db().select(
        db.person.ALL, orderby=~db.person.name):
        print row.name
Carl
Bob
Alex
``:code
Another way to achieve a similar result is by using a join, specifically an INNER JOIN:
``
>>> rows = db(db.person.id==db.thing.owner).select()
>>> for row in rows:
        print row.person.name, 'has', row.thing.name
Alex has Boat
Alex has Chair
Bob has Shoes
``:code
Observe that web2py did a join, so the rows now contain two records, one from each table, linked together. Because the two records may have fields with conflicting names, you need to specify the table when extracting a field value from a row. This means that while before you could do:
``
row.name
``:code
and it was obvious whether this was the name of a person or a thing, in the result of a join you have to be more explicit and say:
``
row.person.name
``:code
or:
``
row.thing.name
``:code
There is an alternative syntax for INNER JOINS:
``
>>> rows = db(db.person).select(join=db.thing.on(db.person.id==db.thing.owner))
>>> for row in rows:
        print row.person.name, 'has', row.thing.name
Alex has Boat
Alex has Chair
Bob has Shoes
``:code
While the output is the same, the generated SQL in the two cases can be different. The latter syntax removes possible ambiguities when the same table is joined twice and aliased:
``
>>> db.define_table('thing',
        Field('name'),
        Field('owner1', 'reference person'),
        Field('owner2', 'reference person'))
>>> rows = db(db.person).select(
        join=[db.person.with_alias('owner1').on(db.person.id==db.thing.owner1),
              db.person.with_alias('owner2').on(db.person.id==db.thing.owner2)])
``:code
For the sake of the example, you can log events with the same event_time but with different severity:
``
>>> print db.log.insert(
        event='unauthorized login', event_time=now, severity=3)
3
``:code
#### ``like``, ``regexp``, ``startswith``, ``contains``, ``upper``, ``lower``
``like``:inxx ``startswith``:inxx ``regexp``:inxx
``contains``:inxx ``upper``:inxx ``lower``:inxx
Fields have a like operator that you can use to match strings:
``
>>> for row in db(db.log.event.like('port%')).select():
print row.event
port scan
``:code
Here "port%" indicates a string starting with "port". The percent sign character, "%", is a wild-card character that means "any sequence of characters".
The like operator is case-insensitive but it can be made case-sensitive with
``
db.mytable.myfield.like('value',case_sensitive=True)
``:code
web2py also provides some shortcuts:
``
db.mytable.myfield.startswith('value')
db.mytable.myfield.contains('value')
``:code
which are equivalent respectively to
``
db.mytable.myfield.like('value%')
db.mytable.myfield.like('%value%')
``:code
#### ``belongs``
The SQL IN operator is realized via the ``belongs`` method, which returns true when the field value belongs to the specified set (list or tuple):
``belongs``:inxx
``
>>> for row in db(db.log.event.belongs(('port scan', 'xss injection'))).select():
        print row.event
port scan
xss injection
``:code
The DAL also allows a nested select as the argument of the belongs operator. The only caveat is that the nested select has to be a ``_select``, not a ``select``, and only one field has to be selected explicitly, the one that defines the set.
``nested select``:inxx
``
>>> bad_days = db(db.log.severity==3)._select(db.log.event_time)
>>> for row in db(db.log.event_time.belongs(bad_days)).select():
        print row.event
port scan
xss injection
unauthorized login
``:code
In those cases where a nested select is required and the lookup field is a reference, we can also use a query as an argument. For example:
``
db.define_table('person', Field('name'))
db.define_table('thing', Field('name'), Field('owner', 'reference person'))
db(db.thing.owner.belongs(db.person.name=='Jonathan')).select()
``:code
In this case it is obvious that the nested select only needs the field referenced by the ``db.thing.owner`` field, so we do not need the more verbose ``_select`` notation.
``nested_select``:inxx
A nested select can also be used as an insert/update value, but in this case the syntax is different:
``
lazy = db(db.person.name=='Jonathan').nested_select(db.person.id)
db(db.thing.id==1).update(owner = lazy)
``:code
In this case ``lazy`` is a nested expression that computes the ``id`` of person "Jonathan". The two lines result in one single SQL query.
#### ``sum``, ``avg``, ``min``, ``max`` and ``len``
``sum``:inxx ``avg``:inxx ``min``:inxx ``max``:inxx
Previously, you have used the count operator to count records. Similarly, you can use the sum operator to add (sum) the values of a specific field from a group of records. As in the case of count, the result of a sum is retrieved via the store object:
``
>>> sum = db.log.severity.sum()
>>> print db().select(sum).first()[sum]
6
``:code
You can also use ``avg``, ``min``, and ``max`` to fetch the average, minimum, and maximum value respectively for the selected records. For example:
``
>>> max = db.log.severity.max()
>>> print db().select(max).first()[max]
3
``:code
``.len()`` computes the length of string, text, and boolean fields.
Expressions can be combined to form more complex expressions. For example, here we are computing the sum of the lengths of all the severity strings in the logs, increased by one:
``
>>> sum = (db.log.severity.len()+1).sum()
>>> print db().select(sum).first()[sum]
``:code
#### Substrings
One can build an expression to refer to a substring. For example, we can group things whose name starts with the same three characters and select only one from each group:
``
db(db.thing).select(distinct=db.thing.name[:3])
``:code
#### Default values with ``coalesce`` and ``coalesce_zero``
There are times when you need to pull a value from the database but also need a default value if the value for a record is set to NULL. In SQL there is a keyword, ``COALESCE``, for this. web2py has an equivalent ``coalesce`` method:
``
>>> db.define_table('sysuser', Field('username'), Field('fullname'))
>>> db.sysuser.insert(username='max', fullname='Max Power')
>>> db.sysuser.insert(username='tim', fullname=None)
>>> print db(db.sysuser).select(db.sysuser.fullname.coalesce(db.sysuser.username))
"COALESCE(sysuser.fullname,sysuser.username)"
Max Power
tim
``:code
Other times you need to compute a mathematical expression but some fields have a value set to None while it should be zero.
``coalesce_zero`` comes to the rescue by defaulting None to zero in the query:
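A minimal sketch, assuming a hypothetical ``account2`` table with an integer ``points`` field:
``
>>> db.define_table('account2', Field('username'), Field('points', 'integer'))
>>> db.account2.insert(username='max', points=10)
>>> db.account2.insert(username='tim', points=None)
>>> total = db.account2.points.coalesce_zero().sum()
>>> print db(db.account2).select(total).first()[total]
10
``:code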
#### Record versioning
``_enable_record_versioning``:inxx
It is possible to ask web2py to save every copy of a record when the record is modified. There are many ways to do it and it can be done for all tables at once using the ``auth.enable_record_versioning`` method, discussed in the chapter about authentication, or it can be done for each individual table as discussed here.
Consider the following table:
``
db.define_table('stored_item',
    Field('name'),
    Field('quantity', 'integer'),
    Field('is_active', 'boolean',
          writable=False, readable=False, default=True))
``:code
Notice the hidden boolean field called ``is_active``, defaulting to True.
We can tell web2py to create a new table (in the same or a different database) and store all previous versions of each record in the table, when modified.
This is done in the following way:
``
db.stored_item._enable_record_versioning()
``:code
or in a more verbose syntax:
``
db.stored_item._enable_record_versioning(
    archive_db=db,
    archive_name='stored_item_archive',
    current_record='current_record',
    is_active='is_active')
``:code
The ``archive_db=db`` tells web2py to store the archive table in the same database as the ``stored_item`` table. The ``archive_name`` sets the name for the archive table. The archive table has the same fields as the original table ``stored_item``, except that unique fields are no longer unique (because it needs to store multiple versions) and it has an extra field, whose name is specified by ``current_record``, which is a reference to the current record in the ``stored_item`` table.
When records are deleted, they are not really deleted. A deleted record is copied into the ``stored_item_archive`` table (like when it is modified) and the ``is_active`` field is set to False. By enabling record versioning, web2py sets a ``custom_filter`` on this table that hides all records in the table ``stored_item`` where the ``is_active`` field is set to False. The ``is_active`` parameter in the ``_enable_record_versioning`` method allows specifying the name of the field used by the ``custom_filter`` to determine whether the record was deleted or not.
``custom_filter``s are ignored by the appadmin interface.
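As an illustrative sketch of the behavior described above:
``
id = db.stored_item.insert(name='chair', quantity=2)
db(db.stored_item.id == id).update(quantity=3)  # previous version is archived
db(db.stored_item.id == id).delete()            # flagged inactive and archived, not removed
``:code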
#### Common fields and multi-tenancy
``common fields``:inxx
``multi tenancy``:inxx
``db._common_fields`` is a list of fields that should belong to all the tables. This list can also contain tables, and it is understood to mean all the fields from those tables. For example, occasionally you find yourself in need of adding a signature to all your tables but the ``auth`` tables. In this case, after you call ``auth.define_tables()`` but before defining any other table, insert
``
db._common_fields.append(auth.signature)
``
One field is special: "request_tenant".
This field does not exist but you can create it and add it to any of your tables (or all of them):
``
db._common_fields.append(
    Field('request_tenant',
          default=request.env.http_host, writable=False))
``:code
The file "gluon/dal.py" defines, among other, the following classes.
``
ConnectionPool
BaseAdapter extends ConnectionPool
Row
DAL
Reference
Table
Expression
Field
Query
Set
Rows
``
Their use has been explained in the previous sections, except for ``BaseAdapter``. When the methods of a ``Table`` or ``Set`` object need to communicate with the database, they delegate to the adapter the task of generating the SQL and/or the function call.
For example:
``
db.mytable.insert(myfield='myvalue')
``
calls
``
Table.insert(myfield='myvalue')
``
which delegates to the adapter by returning:
``
db._adapter.insert(db.mytable,db.mytable._listify(dict(myfield='myvalue')))
``
Here ``db.mytable._listify`` converts the dict of arguments into a list of ``(field,value)`` and calls the ``insert`` method of the ``adapter``. ``db._adapter`` does more or less the following:
``
query = db._adapter._insert(db.mytable,list_of_fields)
db._adapter.execute(query)
``
If necessary, you can tell an adapter to use a specific driver module, for example:
``
MySQLAdapter.driver = mysqldb
``
and you can specify optional driver arguments and adapter arguments:
``
db = DAL(..., driver_args={}, adapter_args={})
``
#### Gotchas
**SQLite** does not support dropping and altering columns. That means that web2py migrations will work up to a point. If you delete a field from a table, the column will remain in the database but be invisible to web2py. If you decide to reinstate the column, web2py will try to re-create it and fail. In this case you must set ``fake_migrate=True`` so that metadata is rebuilt without attempting to add the column again. Also, for the same reason, **SQLite** is not aware of any change of column type. If you insert a number in a string field, it will be stored as a string. If you later change the model and replace the type "string" with type "integer", SQLite will continue to keep the number as a string and this may cause problems when you try to extract the data.
**MySQL** does not support multiple ALTER TABLE within a single transaction. This means that any migration process is broken into multiple commits. If something happens that causes a failure, it is possible to break a migration (the web2py metadata are no longer in sync with the actual table structure in the database). This is unfortunate but it can be prevented (migrate one table at a time) or it can be fixed a posteriori (revert the web2py model to what corresponds to the table structure in the database, set ``fake_migrate=True`` and, after the metadata has been rebuilt, set ``fake_migrate=False`` and migrate the table again).
**Google SQL** has the same problems as MySQL and more. In particular, table metadata itself must be stored in the database in a table that is not migrated by web2py. This is because Google App Engine has a read-only file system. web2py migrations in Google:SQL combined with the MySQL issue described above can result in metadata corruption. Again, this can be prevented (by migrating the table at once and then setting ``migrate=False`` so that the metadata table is not accessed any more) or it can be fixed a posteriori (by accessing the database using the Google dashboard and deleting any corrupted entry from the table called ``web2py_filesystem``).
``limitby``:inxx
**MSSQL** does not support the SQL OFFSET keyword. Therefore the database cannot do pagination. When doing a ``limitby=(a,b)``, web2py will fetch the first ``b`` rows and discard the first ``a``. This may result in a considerable overhead when compared with other database engines.
**Oracle** also does not support pagination. It supports neither the OFFSET nor the LIMIT keywords. web2py achieves pagination by translating a ``db(...).select(limitby=(a,b))`` into a complex three-way nested select (as suggested by official Oracle documentation). This works for simple selects but may break for complex selects involving aliased fields and/or joins.

#### ``filter_in`` and ``filter_out``
``filter_in``:inxx ``filter_out``:inxx

It is possible to define a filter for each field to be called before a value is inserted into the database for that field and after a value is retrieved from the database.

Imagine for example that you want to store a serializable Python data structure in a field in the json format. Here is how it could be accomplished:

``
>>> from simplejson import loads, dumps
>>> db.define_table('anyobj', Field('name'), Field('data', 'text'))
>>> db.anyobj.data.filter_in = lambda obj, dumps=dumps: dumps(obj)
>>> db.anyobj.data.filter_out = lambda txt, loads=loads: loads(txt)
>>> myobj = ['hello', 'world', 1, {2: 3}]
>>> id = db.anyobj.insert(name='myobjname', data=myobj)
>>> row = db.anyobj(id)
>>> row.data
['hello', 'world', 1, {2: 3}]
``:code

Another way to accomplish the same is by using a Field of type ``SQLCustomType``, as discussed later.

### Other methods
#### ``update_or_insert``
``update_or_insert``:inxx
Sometimes you need to perform an insert only if there is no record with the same values as those being inserted.
This can be done with
``
db.define_table('person',Field('name'),Field('birthplace'))
db.person.update_or_insert(name='John',birthplace='Chicago')
``:code
The record will be inserted only if there is no other user called John who was born in Chicago.
You can specify which values to use as a key to determine if the record exists. For example:
``
db.person.update_or_insert(db.person.name=='John',
                           name='John', birthplace='Chicago')
``:code
It is also possible to define a dummy table that is not stored in a database in order to reuse it in multiple other tables. For example:
``
signature = db.Table(db, 'signature',
    Field('created_on', 'datetime', default=request.now),
    Field('created_by', db.auth_user, default=auth.user_id),
    Field('updated_on', 'datetime', update=request.now),
    Field('updated_by', db.auth_user, update=auth.user_id))
db.define_table('payment', Field('amount', 'double'), signature)
``:code
This example assumes that standard web2py authentication is enabled.
Notice that if you use ``Auth``, web2py already creates one such table for you:
``
auth = Auth(db)
db.define_table('payment', Field('amount', 'double'), auth.signature)
``:code
When using table inheritance, if you want the inheriting table to inherit validators, be sure to define the validators of the parent table before defining the inheriting table.
#### before and after callbacks

``_before_insert``:inxx
``_after_insert``:inxx
``_before_update``:inxx
``_after_update``:inxx
``_before_delete``:inxx
``_after_delete``:inxx

Web2py provides a mechanism to register callbacks to be called before and/or after insert, update and delete of records.

Each table stores six lists of callbacks:

``
db.mytable._before_insert
db.mytable._after_insert
db.mytable._before_update
db.mytable._after_update
db.mytable._before_delete
db.mytable._after_delete
``:code

You can register a callback function by appending it to the corresponding list. The caveat is that, depending on the functionality, the callback has a different signature.

This is best explained via some examples.

``
>>> db.define_table('person', Field('name'))
>>> def pprint(*args): print args
>>> db.person._before_insert.append(lambda f: pprint(f))
>>> db.person._after_insert.append(lambda f, id: pprint(f, id))
>>> db.person._before_update.append(lambda s, f: pprint(s, f))
>>> db.person._after_update.append(lambda s, f: pprint(s, f))
>>> db.person._before_delete.append(lambda s: pprint(s))
>>> db.person._after_delete.append(lambda s: pprint(s))
``:code

Here ``f`` is a dict of fields passed to insert or update, ``id`` is the id of the newly inserted record, and ``s`` is the Set object used for update or delete.

``
>>> db.person.insert(name='John')
({'name': 'John'},)
({'name': 'John'}, 1)
>>> db(db.person.id==1).update(name='Tim')
(<Set (person.id = 1)>, {'name': 'Tim'})
(<Set (person.id = 1)>, {'name': 'Tim'})
>>> db(db.person.id==1).delete()
(<Set (person.id = 1)>,)
(<Set (person.id = 1)>,)
``:code

The return values of these callbacks should be ``None`` or ``False``. If any of the ``_before_*`` callbacks returns a ``True`` value, it will abort the actual insert/update/delete operation.
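This can be used, for example, to veto operations. A minimal sketch (the reserved name check is illustrative):

``
# veto any insert of a person named 'admin': returning True aborts the insert
db.person._before_insert.append(lambda f: f.get('name') == 'admin')
db.person.insert(name='admin')  # the insert is aborted
``:code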

``update_naive``:inxx

Sometimes a callback may need to perform an update on the same or a different table, and one wants to avoid callbacks calling themselves recursively.

For this purpose, Set objects have an ``update_naive`` method that works like ``update`` but ignores before and after callbacks.

The ``find`` method has an optional ``limitby`` argument with the same syntax and functionality as the Set ``select`` method.

### Copy data from one db into another
Consider the situation in which you have been using the following database:
``
db = DAL('sqlite://storage.sqlite')
``:code
and you wish to move to another database using a different connection string:
``
db = DAL('postgres://username:password@localhost/mydb')
``:code
Before you switch, you want to move the data and rebuild all the metadata for the new database. We assume the new database to exist but we also assume it is empty.
Web2py provides a script that does this work for you:
``
cd web2py
python scripts/cpdb.py \
   -f applications/app/databases \
   -y 'sqlite://storage.sqlite' \
   -Y 'postgres://username:password@localhost/mydb'
``

web2py comes with a Database Abstraction Layer (DAL), an API that maps Python objects into database objects such as queries, tables, and records. The DAL dynamically generates the SQL in real time using the specified dialect for the database back end, so that you do not have to write SQL code or learn different SQL dialects (the term SQL is used generically), and the application will be portable among different types of databases. At the time of this writing, the supported databases are SQLite (which comes with Python and thus web2py), PostgreSQL, MySQL, Oracle, MSSQL, FireBird, DB2, Informix, Ingres, the Google App Engine (SQL and NoSQL) and MongoDB. Experimentally we support more databases. Please check on the web2py web site and mailing list for more recent adapters. Google NoSQL is treated as a particular case in Chapter 13.
The Windows binary distribution works out of the box with SQLite and MySQL. The Mac binary distribution works out of the box with SQLite.
To use any other database back-end, run from the source distribution and install the appropriate driver for the required back end.
``database drivers``:inxx
Once the proper driver is installed, start web2py from source, and it will find the driver. Here is a list of drivers:
``DAL``:inxx ``SQLite``:inxx ``MySQL``:inxx ``PostgreSQL``:inxx ``Oracle``:inxx ``MSSQL``:inxx ``FireBird``:inxx ``DB2``:inxx ``Informix``:inxx ``Sybase``:inxx ``Teradata``:inxx ``MongoDB``:inxx ``CouchDB``:inxx ``SAPDB``:inxx ``Cubrid``:inxx
----------
database | drivers (source)
SQLite | sqlite3 or pysqlite2 or zxJDBC ``zxjdbc``:cite (on Jython)
PostgreSQL | psycopg2 ``psycopg2``:cite or pg8000 ``pg8000``:cite or zxJDBC ``zxjdbc``:cite (on Jython)
MySQL | pymysql ``pymysql``:cite or MySQLdb ``mysqldb``:cite
Oracle | cx_Oracle ``cxoracle``:cite
MSSQL | pyodbc ``pyodbc``:cite
FireBird | kinterbasdb ``kinterbasdb``:cite or fdb or pyodbc
DB2 | pyodbc ``pyodbc``:cite
Informix | informixdb ``informixdb``:cite
Ingres | ingresdbi ``ingresdbi``:cite
Cubrid | cubriddb ``cubriddb``:cite ``cubrid``:cite
Sybase | Sybase ``Sybase``:cite
Teradata | pyodbc ``Teradata``:cite
SAPDB | sapdb ``SAPDB``:cite
MongoDB | pymongo ``pymongo``:cite
IMAP | imaplib ``IMAP``:cite
----------
``sqlite3``, ``pymysql``, ``pg8000``, and ``imaplib`` ship with web2py. Support of MongoDB is experimental. The IMAP option allows using the DAL to access IMAP.
### Connection strings
``connection strings``:inxx
A connection with the database is established by creating an instance of the DAL object:
``
>>> db = DAL('sqlite://storage.db', pool_size=0)
``:code
``db`` is not a keyword; it is a local variable that stores the connection object ``DAL``. You are free to give it a different name. The constructor of ``DAL`` requires a single argument, the connection string. The connection string is the only web2py code that depends on a specific back-end database. Here are examples of connection strings for specific types of supported back-end databases (in all cases, we assume the database is running from localhost on its default port and is named "test"):
-------------
**SQLite** | ``sqlite://storage.db``
**MySQL** | ``mysql://username:password@localhost/test``
**PostgreSQL** | ``postgres://username:password@localhost/test``
**MSSQL** | ``mssql://username:password@localhost/test``
**FireBird** | ``firebird://username:password@localhost/test``
**Oracle** | ``oracle://username/password@test``
**DB2** | ``db2://username:password@test``
**Ingres** | ``ingres://username:password@localhost/test``
**Sybase** | ``sybase://username:password@localhost/test``
**Informix** | ``informix://username:password@test``
**Teradata** | ``teradata://DSN=dsn;UID=user;PWD=pass;DATABASE=database``
**Cubrid** | ``cubrid://username:password@localhost/test``
**SAPDB** | ``sapdb://username:password@localhost/test``
**IMAP** | ``imap://user:password@server:port``
**MongoDB** | ``mongodb://username:password@localhost/test``
**Google App Engine/SQL** | ``google:sql``
**Google App Engine/NoSQL** | ``google:datastore``
-------------
Notice that in SQLite the database consists of a single file. If it does not exist, it is created. This file is locked every time it is accessed. In the case of MySQL, PostgreSQL, MSSQL, FireBird, Oracle, DB2, Ingres and Informix the database "test" must be created outside web2py. Once the connection is established, web2py will create, alter, and drop tables appropriately.
It is also possible to set the connection string to ``None``. In this case DAL will not connect to any back-end database, but the API can still be accessed for testing. Examples of this will be discussed in Chapter 7.
#### Connection pooling
``connection pooling``:inxx
The second argument of the DAL constructor is the ``pool_size``; it defaults to 0.
As it is rather slow to establish a new database connection for each request, web2py implements a mechanism for connection pooling. Once a connection is established and the page has been served and the transaction completed, the connection is not closed but goes into a pool. When the next http request arrives, web2py tries to obtain a connection from the pool and use that for the new transaction. If there are no available connections in the pool, a new connection is established.
The ``pool_size`` parameter is ignored by SQLite and Google App Engine.
Connections in the pools are shared sequentially among threads, in the sense that they may be used by two different but not simultaneous threads. There is only one pool for each web2py process.
When web2py starts, the pool is always empty. The pool grows up to the minimum between the value of ``pool_size`` and the max number of concurrent requests. This means that if ``pool_size=10`` but our server never receives more than 5 concurrent requests, then the actual pool size will only grow to 5. If ``pool_size=0`` then connection pooling is not used.
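For example, to allow a pool of up to 10 connections (the connection string is illustrative):
``
db = DAL('mysql://username:password@localhost/test', pool_size=10)
``:code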
For supported back-ends you may also specify if you would like to check against a specific list of reserved SQL keywords, for example:
``
check_reserved=['postgres', 'postgres_nonreserved']
``:code
The following database backends support reserved words checking.
-----
**PostgreSQL** | ``postgres(_nonreserved)``
**MySQL** | ``mysql``
**FireBird** | ``firebird(_nonreserved)``
**MSSQL** | ``mssql``
**Oracle** | ``oracle``
-----
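A sketch of enabling the check while developing on SQLite but targeting PostgreSQL (connection string illustrative):
``
db = DAL('sqlite://storage.db',
         check_reserved=['postgres', 'postgres_nonreserved'])
# an exception is raised at define_table time if a table or field
# name is a reserved keyword for the selected back ends
db.define_table('person', Field('name'))
``:code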
### ``DAL``, ``Table``, ``Field``
The best way to understand the DAL API is to try each function yourself. This can be done interactively via the web2py shell, although ultimately, DAL code goes in the models and controllers.
Start by creating a connection. For the sake of example, you can use SQLite. Nothing in this discussion changes when you change the back-end engine.
``
>>> db = DAL('sqlite://storage.db')
``:code
A table definition can take a ``format`` attribute, for example ``format='%(name)s'``, that sets the preferred representation of its records, or even more complex ones using a function:
``
>>> db.define_table('person', Field('name'),
        format=lambda r: r.name or 'anonymous')
``:code
The format attribute will be used for two purposes:
- To represent referenced records in select/option drop-downs.
- To set the ``db.othertable.person.represent`` attribute for all fields referencing this table. This means that SQLTABLE will not show references by id but will use the format-preferred representation instead.
``Field constructor``:inxx
These are the default values of a Field constructor:
``
Field(name, 'string', length=None, default=None,
      required=False, requires='<default>',
      ondelete='CASCADE', notnull=False, unique=False,
      uploadfield=True, widget=None, label=None, comment=None,
      writable=True, readable=True, update=None, authorize=None,
      autodelete=False, represent=None, compute=None,
      uploadfolder=os.path.join(request.folder, 'uploads'),
      uploadseparate=None, uploadfs=None)
``:code
Not all of them are relevant for every field. "length" is relevant only for fields of type "string". "uploadfield" and "authorize" are relevant only for fields of type "upload". "ondelete" is relevant only for fields of type "reference" and "upload".
- ``length`` sets the maximum length of a "string", "password" or "upload" field. If ``length`` is not specified a default value is used but the default value is not guaranteed to be backward compatible. ''To avoid unwanted migrations on upgrades, we recommend that you always specify the length for string, password and upload fields.''
- ``default`` sets the default value for the field. The default value is used when performing an insert if a value is not explicitly specified. It is also used to pre-populate forms built from the table using SQLFORM. Note, rather than being a fixed value, the default can instead be a function (including a lambda function) that returns a value of the appropriate type for the field. In that case, the function is called once for each record inserted, even when multiple records are inserted in a single transaction.
- ``required`` tells the DAL that no insert should be allowed on this table if a value for this field is not explicitly specified.
- ``requires`` is a validator or a list of validators. This is not used by the DAL, but it is used by SQLFORM. The default validators for the given types are shown in the following table:
----------
**field type** | **default field validators**
``string`` | ``IS_LENGTH(length)`` default length is 512
``text`` | ``IS_LENGTH(65536)``
``blob`` | ``None``
``boolean`` | ``None``
``integer`` | ``IS_INT_IN_RANGE(-1e100, 1e100)``
``double`` | ``IS_FLOAT_IN_RANGE(-1e100, 1e100)``
``decimal(n,m)`` | ``IS_DECIMAL_IN_RANGE(-1e100, 1e100)``
``date`` | ``IS_DATE()``
``time`` | ``IS_TIME()``
``datetime`` | ``IS_DATETIME()``
``password`` | ``None``
``upload`` | ``None``
``reference <table>`` | ``IS_IN_DB(db,table.field,format)``
``list:string`` | ``None``
``list:integer`` | ``None``
``list:reference <table>`` | ``IS_IN_DB(db,table.field,format,multiple=True)``
``bigint`` | ``None``
``big-id`` | ``None``
``big-reference`` | ``None``
----------
Decimal requires and returns values as ``Decimal`` objects, as defined in the Python ``decimal`` module. SQLite does not handle the ``decimal`` type so internally we treat it as a ``double``. The (n,m) are the number of digits in total and the number of digits after the decimal point respectively.
The ``big-id`` and ``big-reference`` types are only supported by some of the database engines and are experimental. They are not normally used as field types except for legacy tables; however, the DAL constructor has a ``bigint_id`` argument that, when set to ``True``, makes the ``id`` and ``reference`` fields ``big-id`` and ``big-reference`` respectively.
The ``list:`` fields are special because they are designed to take advantage of certain denormalization features on NoSQL (in the case of Google App Engine NoSQL, the field types ``ListProperty`` and ``StringListProperty``) and back-port them to all the other supported relational databases. On relational databases, lists are stored as a ``text`` field. The items are separated by a ``|`` and each ``|`` in a string item is escaped as ``||``. They are discussed in their own section.
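A short sketch, assuming a hypothetical ``article`` table with tags (``contains`` on a ``list:string`` field matches records whose list includes the item):
``
db.define_table('article', Field('title'), Field('tags', 'list:string'))
id = db.article.insert(title='DAL intro', tags=['python', 'web2py'])
rows = db(db.article.tags.contains('python')).select()
``:code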
-------
Notice that ``requires=...`` is enforced at the level of forms, ``required=True`` is enforced at the level of the DAL (insert), while ``notnull``, ``unique`` and ``ondelete`` are enforced at the level of the database. While they sometimes may seem redundant, it is important to maintain the distinction when programming with the DAL.
-------
``ondelete``:inxx
- ``ondelete`` translates into the "ON DELETE" SQL statement. By default it is set to "CASCADE". This tells the database that when it deletes a record, it should also delete all records that refer to it. To disable this feature, set ``ondelete`` to "NO ACTION" or "SET NULL".
- ``notnull=True`` translates into the "NOT NULL" SQL statement. It prevents the database from inserting null values for the field.
- ``unique=True`` translates into the "UNIQUE" SQL statement and it makes sure that values of this field are unique within the table. It is enforced at the database level.
- ``uploadfield`` applies only to fields of type "upload". A field of type "upload" stores the name of a file saved somewhere else, by default on the filesystem under the application "uploads/" folder. If ``uploadfield`` is set, then the file is stored in a blob field within the same table and the value of ``uploadfield`` is the name of the blob field. This will be discussed in more detail later in the context of SQLFORM.
- ``uploadfolder`` defaults to the application's "uploads/" folder. If set to a different path, files will be uploaded to a different folder. For example, ``uploadfolder=os.path.join(request.folder, 'static/temp')`` will upload files to the web2py/applications/myapp/static/temp folder.
- ``uploadseparate`` if set to True will upload files under different subfolders of the ''uploadfolder'' folder. This is optimized to avoid too many files under the same folder/subfolder. ATTENTION: You cannot change the value of ``uploadseparate`` from True to False without breaking the system. web2py either uses the separate subfolders or it does not. Changing the behavior after files have been uploaded will prevent web2py from being able to retrieve those files. If this happens it is possible to move files and fix the problem but this is not described here.
- ``uploadfs`` allows you to specify a different file system where to upload files, including an Amazon S3 storage or a remote FTP storage. This option requires PyFileSystem installed. ``uploadfs`` must point to ``PyFileSystem``. ``PyFileSystem``:inxx ``uploadfs``:inxx
- ``widget`` must be one of the available widget objects, including custom widgets, for example: ``SQLFORM.widgets.string.widget``. A list of available widgets will be discussed later. Each field type has a default widget.
- ``label`` is a string (or something that can be serialized to a string) that contains the label to be used for this field in autogenerated forms.
- ``comment`` is a string (or something that can be serialized to a string) that contains a comment associated with this field, and will be displayed to the right of the input field in the autogenerated forms.
- ``writable`` if a field is writable, it can be edited in autogenerated create and update forms.
- ``readable`` if a field is readable, it will be visible in readonly forms. If a field is neither readable nor writable, it will not be displayed in create and update forms.
- ``update`` contains the default value for this field when the record is updated.
- ``compute`` is an optional function. If a record is inserted or updated, the compute function will be executed and the field will be populated with the function result. The record is passed to the compute function as a ``dict``, and the dict will not include the current value of that, or any other compute field.
- ``authorize`` can be used to require access control on the corresponding field, for "upload" fields only. It will be discussed more in detail in the context of Authentication and Authorization.
- ``autodelete`` determines if the corresponding uploaded file should be deleted when the record referencing the file is deleted. For "upload" fields only.
- ``represent`` can be None or can point to a function that takes a field value and returns an alternate representation for the field value. Examples:
``
db.mytable.name.represent = lambda name, row: name.capitalize()
db.mytable.other_id.represent = lambda id, row: row.myfield
db.mytable.some_uploadfield.represent = lambda value, row: \
    A('get it', _href=URL('download', args=value))
``:code
``blob``:inxx
"blob" fields are also special. By default, binary data is encoded in base64 before being stored into the actual database field, and it is decoded when extracted. This has the negative effect of using 25% more storage space than necessary in blob fields, but has two advantages. On average it reduces the amount of data communicated between web2py and the database server, and it makes the communication independent of back-end-specific escaping conventions.
The DAL allows you to explicitly issue SQL statements.
>>> print db.executesql('SELECT * FROM person;')
[(1, u'Massimo'), (2, u'Massimo')]
``:code
In this case, the return values are not parsed or transformed by the DAL, and the format depends on the specific database driver. This usage with selects is normally not needed, but it is more common with indexes.
``executesql`` takes two optional arguments: ``placeholders`` and ``as_dict``
``placeholders`` is an optional
sequence of values to be substituted in
or, if supported by the DB driver, a dictionary with keys
matching named placeholders in your SQL.
If ``as_dict`` is set to True,
and the results cursor returned by the DB driver will be
converted to a sequence of dictionaries keyed with the db
field names. Results returned with ``as_dict = True ``are
the same as those returned when applying **.as_list()** to a normal select.
``
[{field1: value1, field2: value2}, {field1: value1b, field2: value2b}]
``:code
``executesql`` also has an optional ``fields`` argument.
If not None, the results cursor returned by the DB driver will be converted to a DAL Rows object using the ``db._adapter.parse()`` method. This requires specifying the ``fields`` argument as a list of DAL Field objects that match the fields returned from the DB. The Field objects should be part of one or more Table objects defined on the DAL object.
The ``fields`` list can include one or more DAL Table objects, in addition to or instead of Field objects, or it can be just a single table (not in a list). In that case, the Field objects will be extracted from the table(s).
The field names will be extracted from the Field objects, or optionally, a list of field names can be provided (in tablename.fieldname format) via the ``colnames`` argument. Note, the ``fields`` and ``colnames`` must be in the same order as the fields in the results cursor returned from the DB.
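A minimal sketch, assuming the ``person`` table (with a single ``name`` field) defined earlier in this chapter:
``
rows = db.executesql('SELECT id, name FROM person;', fields=db.person)
for row in rows:
    print row.name
``:code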
#### ``_lastsql``
Whether SQL was executed manually using executesql or was SQL generated by the DAL, you can always find the SQL code in ``db._lastsql``. This is useful for debugging purposes:
``_lastsql``:inxx
``
>>> rows = db().select(db.person.ALL)
>>> print db._lastsql
SELECT person.id, person.name FROM person;
``:code
-------
web2py never generates queries using the "*" operator. web2py is always explicit when selecting fields.
-------
### ``drop``
Finally, you can drop tables and all data will be lost:
``drop``:inxx
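A minimal sketch, using the ``person`` table (it is redefined in a later section):
``
>>> db.person.drop()
``:code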
The DAL can also handle legacy tables with compound primary keys, via the ``primarykey`` attribute:
``
db.define_table('account',
Field('accnum','integer'),
Field('acctype'),
Field('accdesc'),
primarykey=['accnum','acctype'],
migrate=False)
``:code
- ``primarykey`` is a list of the field names that make up the primary key.
- All primarykey fields have a ``NOT NULL`` set even if not specified.
- Keyed tables can only reference other keyed tables.
- Referencing fields must use the ``reference tablename.fieldname`` format.
- The ``update_record`` function is not available for Rows of keyed tables.
-------
Note that currently this is only available for DB2, MS-SQL, Ingres and Informix, but others can be easily added.
-------
At the time of writing, we cannot guarantee that the ``primarykey`` attribute works with every existing legacy table and every supported database backend.
For simplicity, we recommend, if possible, creating a database view that has an auto-increment id field.
### Distributed transaction
``distributed transactions``:inxx
------
At the time of writing this feature is only supported
by PostgreSQL, MySQL and Firebird, since they expose API for two-phase commits.
------
Assuming you have two (or more) connections to distinct PostgreSQL databases, for example:
``
db_a = DAL('postgres://...')
db_b = DAL('postgres://...')
``:code
In your models or controllers, you can commit them concurrently with:
``
DAL.distributed_transaction_commit(db_a, db_b)
``:code
On failure, this function rolls back and raises an ``Exception``.
In controllers, when one action returns, if you have two distinct connections and you do not call the above function, web2py commits them separately. This means there is a possibility that one of the commits succeeds and one fails. The distributed transaction prevents this from happening.
### More on uploads
Consider the following model:
``
>>> db.define_table('myfile',
        Field('image', 'upload', default='path/'))
``:code
In the case of an 'upload' field, the default value can optionally be set to a path (an absolute path or a path relative to the current app folder) and the default image will be set to a copy of the file at the path. A new copy is made for each new record that does not specify an image.
Normally an insert is handled automatically via a SQLFORM or a crud form (which is a SQLFORM) but occasionally you already have the file on the filesystem and want to upload it programmatically. This can be done in this way:
``
>>> stream = open(filename, 'rb')
>>> db.myfile.insert(image=db.myfile.image.store(stream, filename))
``:code
It is also possible to insert a file in a simpler way and have the insert method call ``store`` automatically:
``
>>> stream = open(filename, 'rb')
>>> db.myfile.insert(image=stream)
``:code
In this case the filename is obtained from the stream object if available.
The ``store`` method of the upload field object takes a file stream and a filename. It uses the filename to determine the extension (type) of the file, creates a new temp name for the file (according to web2py upload mechanism) and loads the file content in this new temp file (under the uploads folder unless specified otherwise). It returns the new temp name, which is then stored in the ``image`` field of the ``db.myfile`` table.
Note, if the file is to be stored in an associated blob field rather than the file system, the ``store()`` method will not insert the file in the blob field (because ``store()`` is called before the insert), so the file must be explicitly inserted into the blob field:
``
>>> db.define_table('myfile',
Field('image', 'upload', uploadfield='image_file'),
Field('image_file', 'blob'))
>>> stream = open(filename, 'rb')
>>> db.myfile.insert(image=db.myfile.image.store(stream, filename),
image_file=stream.read())
``:code
The opposite of ``.store`` is ``.retrieve``:
``
>>> row = db(db.myfile).select().first()
>>> (filename, stream) = db.myfile.image.retrieve(row.image)
>>> import shutil
>>> shutil.copyfileobj(stream,open(filename,'wb'))
``
### ``Query``, ``Set``, ``Rows``
Let's consider again the table defined (and dropped) previously and insert three records:
``
>>> db.define_table('person', Field('name'))
>>> db.person.insert(name="Alex")
1
>>> db.person.insert(name="Bob")
2
>>> db.person.insert(name="Carl")
3
``:code
The shortcut ``db.mytable[id] = dict(myfield='somevalue')`` is equivalent to
``
db(db.mytable.id==id).update(myfield='somevalue')
``:code
and it updates an existing record with field values specified by the dictionary on the right hand side.
#### Fetching a ``Row``
Yet another convenient syntax is the following:
``
record = db.mytable(id)
record = db.mytable(db.mytable.id==id)
record = db.mytable(id,myfield='somevalue')
``:code
Though apparently similar to ``db.mytable[id]``, the above syntax is more flexible and safer. First of all it checks whether ``id`` is an int (or ``str(id)`` is an int) and returns ``None`` if not (it never raises an exception). It also allows you to specify multiple conditions that the record must meet. If they are not met, it also returns ``None``.
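A short illustrative sketch of this behavior (values are hypothetical):
``
>>> print db.mytable('not-an-int')
None
>>> print db.mytable(id, myfield='some-other-value')
None
``:code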
#### Recursive ``select``s
``recursive selects``:inxx
Consider the previous table person and a new table "thing" referencing a "person":
``
>>> db.define_table('thing', Field('name'), Field('owner','reference person'))
``:code
and a simple select from this table:
``
>>> things = db(db.thing).select()
``:code
which is equivalent to
``
>>> things = db(db.thing._id>0).select()
``:code
where ``._id`` is a reference to the primary key of the table. Normally ``db.thing._id`` is the same as ``db.thing.id`` and we will assume that in most of this book. ``_id``:inxx
For each Row of things it is possible to fetch not just fields from the selected table (thing) but also from linked tables (recursively):
``
>>> for thing in things: print thing.name, thing.owner.name
``:code
Here ``thing.owner.name`` requires one database select for each thing in things and it is therefore inefficient. We suggest using joins whenever possible instead of recursive selects, nevertheless this is convenient and practical when accessing individual records.
You can also do it backwards, by selecting the things referenced by a person:
``
person = db.person(id)
for thing in person.thing.select(orderby=db.thing.name):
    print person.name, 'owns', thing.name
``:code
In this last expression ``person.thing`` is a shortcut for
``
db(db.thing.owner==person.id)
``:code
i.e. the Set of ``thing``s referenced by the current ``person``. This syntax breaks down if the referencing table has multiple references to the referenced table. In this case one needs to be more explicit and use a full Query.
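For example, if ``thing`` had two references to ``person`` (hypothetical fields ``owner1`` and ``owner2``, as in the aliasing example below), you would write the query explicitly:
``
things = db(db.thing.owner1==person.id).select()
``:code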
#### Serializing ``Rows`` in views
Given the following action containing a query
``SQLTABLE``:inxx
``
def index():
return dict(rows = db(query).select())
``:code
The result of a select can be displayed in a view with the following syntax:
``
{{extend 'layout.html'}}
<h1>Records</h1>
{{=rows}}
``:code
which is equivalent to replacing ``{{=rows}}`` with ``{{=SQLTABLE(rows)}}`` in the view.
Due to Python restrictions in overloading the "``and``" and "``or``" operators, these cannot be used in forming queries; the binary operators "``&``" and "``|``" must be used instead.
It is also possible to build queries using in-place logical operators:
``
>>> query = db.person.name!='Alex'
>>> query &= db.person.id>3
>>> query |= db.person.name=='John'
``
#### ``count``, ``isempty``, ``delete``, ``update``
You can count records in a set:
``count``:inxx ``isempty``:inxx
``
>>> print db(db.person.id > 0).count()
3
``:code
Notice that ``count`` takes an optional ``distinct`` argument, which defaults to False, and it works very much like the same argument for ``select``. ``count`` also has a ``cache`` argument that works very much like the equivalent argument of the ``select`` method.
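A hedged sketch of both arguments, assuming usage analogous to ``select`` (``cache.ram`` is available in a web2py application):
``
n = db(db.person.id > 0).count(distinct=True)
n = db(db.person.id > 0).count(cache=(cache.ram, 3600))
``:code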
Sometimes you may need to check if a table is empty. A more efficient way than counting is using the ``isempty`` method:
``
>>> print db(db.person.id > 0).isempty()
False
``:code
or equivalently:
``
>>> print db(db.person).isempty()
False
``:code
You can delete records in a set:
``delete``:inxx
``
>>> db(db.person.id > 3).delete()
``:code
In this case ``row.total_price`` is not a value but a function. The function is evaluated on demand, when it is called.
The lazy field in the example above allows one to compute the total price for each ``item``:
``
>>> for row in db(db.item).select(): print row.total_price()
``
And it also allows to pass an optional ``discount`` percentage (15%):
``
>>> for row in db(db.item).select(): print row.total_price(15)
``
------
Mind that virtual fields do not have the same attributes as the other fields (default, readable, requires, etc) and they do not appear in the list of ``db.table.fields`` and are not visualized by default in tables (TABLE) and grids (SQLFORM.grid, SQLFORM.smartgrid).
------
### One to many relation
``one to many``:inxx
To illustrate how to implement one to many relations with the web2py DAL, define another table "thing" that refers to the table "person" which we redefine here:
``
>>> db.define_table('person',
Field('name'),
format='%(name)s')
>>> db.define_table('thing',
Field('name'),
Field('owner', 'reference person'),
format='%(name)s')
``:code
Table "thing" has two fields, the name of the thing and the owner of the thing. When a field type is another table, it is intended that the field reference the other table by its id. In fact, you can print the actual type value and get:
``
>>> print db.thing.owner.type
reference person
``:code
Now, insert three things, two owned by Alex and one by Bob:
``
>>> db.thing.insert(name='Boat', owner=1)
1
>>> db.thing.insert(name='Chair', owner=1)
2
>>> db.thing.insert(name='Shoes', owner=2)
3
``:code
You can select as you did for any other table:
``
>>> for row in db(db.thing.owner==1).select():
print row.name
Boat
Chair
``:code
Because a thing has a reference to a person, a person can have many things, so a record of table person now acquires a new attribute thing, which is a Set, that defines the things of that person. This allows looping over all persons and fetching their things easily:
``referencing``:inxx
``
>>> for person in db().select(db.person.ALL):
print person.name
    for thing in person.thing.select():
        print ' ', thing.name
Alex
  Boat
  Chair
Bob
Shoes
Carl
``:code
#### Inner joins
Another way to achieve a similar result is by using a join, specifically an INNER JOIN. web2py performs joins automatically and transparently when the query links two or more tables as in the following example:
``Rows``:inxx ``inner join``:inxx ``join``:inxx
``
>>> rows = db(db.person.id==db.thing.owner).select()
>>> for row in rows:
        print row.person.name, 'has', row.thing.name
Alex has Boat
Alex has Chair
Bob has Shoes
``:code
Observe that web2py did a join, so the rows now contain two records, one from each table, linked together. Because the two records may have fields with conflicting names, you need to specify the table when extracting a field value from a row. This means that while before you could do:
``
row.name
``:code
and it was obvious whether this was the name of a person or a thing, in the result of a join you have to be more explicit and say:
``
row.person.name
``:code
or:
``
row.thing.name
``:code
There is an alternative syntax for INNER JOINS:
``
>>> rows = db(db.person).select(join=db.thing.on(db.person.id==db.thing.owner))
>>> for row in rows:
        print row.person.name, 'has', row.thing.name
Alex has Boat
Alex has Chair
Bob has Shoes
``:code
While the output is the same, the generated SQL in the two cases can be different. The latter syntax removes possible ambiguities when the same table is joined twice and aliased:
``
>>> db.define_table('thing',
Field('name'),
Field('owner1','reference person'),
Field('owner2','reference person'))
>>> rows = db(db.person).select(
    join=[db.person.with_alias('owner1').on(db.person.id==db.thing.owner1),
          db.person.with_alias('owner2').on(db.person.id==db.thing.owner2)])
``
The value of ``join`` can be a list of ``db.table.on(...)`` to join.
#### Left outer join
Notice that Carl did not appear in the list above because he has no things. If you intend to select on persons (whether they have things or not) and their things (if they have any), then you need to perform a LEFT OUTER JOIN. This is done using the argument "left" of the select command. Here is an example:
``Rows``:inxx ``left outer join``:inxx ``outer join``:inxx
``
>>> rows=db().select(
       db.person.ALL, db.thing.ALL,
       left=db.thing.on(db.person.id==db.thing.owner))
>>> for row in rows:
        print row.person.name, 'has', row.thing.name
Alex has Boat
Alex has Chair
Bob has Shoes
Carl has None
``:code
where:
``
left = db.thing.on(...)
``:code
does the left join query. Here the argument of ``db.thing.on`` is the condition required for the join (the same used above for the inner join). In the case of a left join, it is necessary to be explicit about which fields to select.
Multiple left joins can be combined by passing a list or tuple of ``db.mytable.on(...)`` to the ``left`` attribute.
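For example, a hedged sketch assuming a second, hypothetical table ``toy`` that also references ``person``:
``
rows = db().select(
    db.person.ALL, db.thing.ALL, db.toy.ALL,
    left=[db.thing.on(db.person.id==db.thing.owner),
          db.toy.on(db.person.id==db.toy.owner)])
``:code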
#### Grouping and counting
When doing joins, sometimes you want to group rows according to certain criteria and count them. For example, count the number of things owned by every person. web2py allows this as well. First, you need a count operator. Second, you want to join the person table with the thing table by owner. Third, you want to select all rows (person + thing), group them by person, and count them while grouping:
``grouping``:inxx
``
>>> count = db.person.id.count()
>>> for row in db(db.person.id==db.thing.owner).select(
db.person.name, count, groupby=db.person.name):
print row.person.name, row[count]
Alex 2
Bob 1
``:code
Notice the count operator (which is built-in) is used as a field. The only issue here is in how to retrieve the information. Each row clearly contains a person and the count, but the count is not a field of a person nor is it a table. So where does it go? It goes into the storage object representing the record, with a key equal to the query expression itself.
The count method of the Field object has an optional ``distinct`` argument. When set to ``True`` it specifies that only distinct values of the field in question are to be counted.
### Many to many
``many-to-many``:inxx
In the previous examples, we allowed a thing to have one owner but one person could have many things. What if Boat was owned by Alex and Curt? This requires a many-to-many relation, and it is realized via an intermediate table that links a person to a thing via an ownership relation.
Here is how to do it:
``
>>> db.define_table('person',
Field('name'))
>>> db.define_table('thing',
Field('name'))
>>> db.define_table('ownership',
Field('person', 'reference person'),
Field('thing', 'reference thing'))
``:code
the existing ownership relationship can now be rewritten as:
``
>>> db.ownership.insert(person=1, thing=1) # Alex owns Boat
>>> db.ownership.insert(person=1, thing=2) # Alex owns Chair
>>> db.ownership.insert(person=2, thing=3) # Bob owns Shoes
``:code
Now you can add the new relation that Curt co-owns Boat:
``
>>> db.ownership.insert(person=3, thing=1) # Curt owns Boat too
``:code
Because you now have a three-way relation between tables, it may be convenient to define a new set on which to perform operations:
``
>>> persons_and_things = db(
(db.person.id==db.ownership.person) \
& (db.thing.id==db.ownership.thing))
``:code
Now it is easy to select all persons and their things from the new Set:
``
>>> for row in persons_and_things.select():
        print row.person.name, row.thing.name
Alex Boat
Alex Chair
Bob Shoes
Curt Boat
``:code
Similarly, you can search for all things owned by Alex:
``
>>> for row in persons_and_things(db.person.name=='Alex').select():
        print row.thing.name
Boat
Chair
``:code
and all owners of Boat:
``
>>> for row in persons_and_things(db.thing.name=='Boat').select():
print row.person.name
Alex
Curt
``:code
A lighter alternative to many-to-many relations is tagging. Tagging is discussed in the context of the ``IS_IN_DB`` validator. Tagging works even on database backends that do not support JOINs, such as the Google App Engine NoSQL.
### Many to many, ``list:<type>``, and ``contains``
``list:string``:inxx
``list:integer``:inxx
``list:reference``:inxx
``contains``:inxx
``multiple``:inxx
``tags``:inxx
web2py provides the following special field types:
``
list:string
list:integer
list:reference <table>
``:code
Let's define another table "log" to store security events, their event_time and severity, where severity is an integer number:
``
>>> db.define_table('log', Field('event'),
                    Field('event_time', 'datetime'),
                    Field('severity', 'integer'))
``:code
As before, insert a few events, a "port scan", an "xss injection" and an "unauthorized login".
For the sake of the example, you can log events with the same event_time but with different severities (1, 2, 3 respectively).
``
>>> import datetime
>>> now = datetime.datetime.now()
>>> print db.log.insert(
event='port scan', event_time=now, severity=1)
1
>>> print db.log.insert(
event='xss injection', event_time=now, severity=2)
2
>>> print db.log.insert(
event='unauthorized login', event_time=now, severity=3)
3
``:code
#### ``like``, ``regexp``, ``startswith``, ``contains``, ``upper``, ``lower``
``like``:inxx ``startswith``:inxx ``regexp``:inxx
``contains``:inxx ``upper``:inxx ``lower``:inxx
Fields have a like operator that you can use to match strings:
``
>>> for row in db(db.log.event.like('port%')).select():
print row.event
port scan
``:code
Here "port%" indicates a string starting with "port". The percent sign character, "%", is a wild-card character that means "any sequence of characters".
The ``like`` operator is case-insensitive, but it can be made case-sensitive with:
``
db.mytable.myfield.like('value', case_sensitive=True)
``:code
web2py also provides some shortcuts:
``
db.mytable.myfield.startswith('value')
db.mytable.myfield.contains('value')
``:code
which are equivalent respectively to
``
db.mytable.myfield.like('value%')
db.mytable.myfield.like('%value%')
``:code
Notice that ``contains`` has a special meaning for ``list:<type>`` fields and it was discussed in a previous section.
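For example, a minimal sketch with a hypothetical ``list:string`` field, where ``contains`` matches an exact item of the stored list:
``
db.define_table('post', Field('tags', 'list:string'))
db.post.insert(tags=['news', 'web2py'])
rows = db(db.post.tags.contains('news')).select()
``:code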
The ``contains`` method can also be passed a list of values and an optional boolean argument ``all`` to search for records that contain all values:
``
db.mytable.myfield.contains(['value1','value2'], all=True)
``
or any value from the list:
``
db.mytable.myfield.contains(['value1','value2'], all=False)
``
There is also a ``regexp`` method that works like the ``like`` method but allows regular expression syntax for the look-up expression. It is only supported by PostgreSQL and SQLite.
The ``upper`` and ``lower`` methods allow you to convert the value of the field to upper or lower case, and you can also combine them with the like operator:
``upper``:inxx ``lower``:inxx
``
>>> for row in db(db.log.event.upper().like('PORT%')).select():
print row.event
port scan
``:code
#### ``year``, ``month``, ``day``, ``hour``, ``minutes``, ``seconds``
``hour``:inxx ``minutes``:inxx ``seconds``:inxx ``day``:inxx ``month``:inxx ``year``:inxx
The date and datetime fields have day, month and year methods. The datetime and time fields have hour, minutes and seconds methods. Here is an example:
``
>>> for row in db(db.log.event_time.year()==2009).select():
print row.event
port scan
xss injection
``:code
#### ``belongs``
The SQL IN operator is realized via the ``belongs`` method, which returns true when the field value belongs to the specified set (list or tuple):
``belongs``:inxx
``
>>> for row in db(db.log.severity.belongs((1, 2))).select():
print row.event
port scan
xss injection
``:code
The DAL also allows a nested select as the argument of the belongs operator. The only caveat is that the nested select has to be a ``_select``, not a ``select``, and only one field has to be selected explicitly, the one that defines the set.
``nested select``:inxx
``
>>> bad_days = db(db.log.severity==3)._select(db.log.event_time)
>>> for row in db(db.log.event_time.belongs(bad_days)).select():
print row.event
port scan
xss injection
unauthorized login
``:code
In those cases where a nested select is required and the look-up field is a reference, we can also use a query as the argument. For example:
``
db.define_table('person', Field('name'))
db.define_table('thing', Field('name'), Field('owner', 'reference person'))
db(db.thing.owner.belongs(db.person.name=='Johnathan')).select()
``:code
In this case it is obvious that the nested select only needs the field referenced by the ``db.thing.owner`` field, so we do not need the more verbose ``_select`` notation.
#### ``sum``, ``avg``, ``min``, ``max`` and ``len``
``sum``:inxx ``avg``:inxx ``min``:inxx ``max``:inxx
Previously, you have used the count operator to count records. Similarly, you can use the sum operator to add (sum) the values of a specific field from a group of records. As in the case of count, the result of a sum is retrieved via the storage object:
``
>>> sum = db.log.severity.sum()
>>> print db().select(sum).first()[sum]
6
``:code
You can also use ``avg``, ``min``, and ``max`` to compute the average, minimum, and maximum value respectively for the selected records. For example:
``
>>> max = db.log.severity.max()
>>> print db().select(max).first()[max]
3
``:code
``.len()`` computes the length of string, text, or boolean fields.
Expressions can be combined to form more complex expressions. For example, here we are computing the sum of the lengths of all the severity strings in the logs, increased by one:
``
>>> sum = (db.log.severity.len()+1).sum()
>>> print db().select(sum).first()[sum]
``:code
#### Substrings
One can build an expression to refer to a substring. For example, we can group things whose name starts with the same three characters and select only one from each group:
``
db(db.thing).select(distinct=db.thing.name[:3])
``:code
#### Default values with ``coalesce`` and ``coalesce_zero``
There are times when you need to pull a value from the database but also need a default value if the value for a record is set to NULL. In SQL there is a keyword, ``COALESCE``, for this. web2py has an equivalent ``coalesce`` method:
``
>>> db.define_table('sysuser',Field('username'),Field('fullname'))
>>> db.sysuser.insert(username='max',fullname='Max Power')
>>> db.sysuser.insert(username='tim',fullname=None)
>>> print db(db.sysuser).select(db.sysuser.fullname.coalesce(db.sysuser.username))
"COALESCE(sysuser.fullname,sysuser.username)"
Max Power
tim
``
Other times you need to compute a mathematical expression but some fields have a value set to None while it should be zero.
``coalesce_zero`` comes to the rescue by defaulting None to zero in the query:
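A minimal hedged sketch, using a hypothetical table ``wallet`` with an integer ``credit`` field:
``
>>> db.define_table('wallet', Field('credit', 'integer'))
>>> db.wallet.insert(credit=10)
>>> db.wallet.insert(credit=None)
>>> total = db.wallet.credit.coalesce_zero().sum()
>>> print db(db.wallet).select(total).first()[total]
10
``:code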
And finally, here is ``_update``: ``_update``:inxx
``
>>> print db(db.person.name=='Alex')._update()
UPDATE person SET WHERE person.name='Alex';
``:code
-----
Moreover you can always use ``db._lastsql`` to return the most recent
SQL code, whether it was executed manually using executesql or was SQL
generated by the DAL.
-----
### Exporting and importing data
``export``:inxx ``import``:inxx
#### CSV (one Table at a time)
When a DALRows object is converted to a string it is automatically
serialized in CSV:
``csv``:inxx
``
>>> rows = db(db.person.id==db.thing.owner).select()
>>> print rows
person.id,person.name,thing.id,thing.name,thing.owner
1,Alex,1,Boat,1
1,Alex,2,Chair,1
2,Bob,3,Shoes,2
``:code
You can serialize a single table in CSV and store it in a file "test.csv":
``
>>> open('test.csv', 'w').write(str(db(db.person.id).select()))
``:code
and you can easily read it back with:
``
>>> db.person.import_from_csv_file(open('test.csv', 'r'))
``:code
When importing, web2py looks for the field names in the CSV header. In this example, it finds two columns: "person.id" and "person.name". It ignores the "person." prefix, and it ignores the "id" fields. Then all records are appended and assigned new ids. Both of these operations can be performed via the appadmin web interface.
#### CSV (all tables at once)
In web2py, you can backup/restore an entire database with two commands:
To export:
``
db.export_to_csv_file(open('somefile.csv', 'wb'))
``:code
To import:
``
db.import_from_csv_file(open('somefile.csv', 'rb'))
``:code
Two tables are separated by ``\r\n\r\n``. The file ends with the line
``
END
``:code
The file does not include uploaded files if these are not stored in the database. In any case it is easy enough to zip the "uploads" folder separately.
When importing, the new records will be appended to the database if it is not empty. In general the new imported records will not have the same record id as the original (saved) records but web2py will restore references so they are not broken, even if the id values may change.
If a table contains a field called
"uuid", this field will be used to identify duplicates. Also, if an
imported record has the same "uuid" as an existing record, the
previous record will be updated.
#### CSV and remote database synchronization
Consider the following model:
``
db = DAL('sqlite:memory:')
db.define_table('person',
Field('name'),
format='%(name)s')
db.define_table('thing',
Field('owner', 'reference person'),
Field('name'),
format='%(name)s')
if not db(db.person).count():
id = db.person.insert(name="Massimo")
db.thing.insert(owner=id, name="Chair")
``:code
Each record is identified by an ID and referenced by that ID. If you
have two copies of the database used by distinct web2py installations,
the ID is unique only within each database and not across the databases.
This is a problem when merging records from different databases.
In order to make a record uniquely identifiable across databases, they
must:
- have a unique id (UUID),
- have an event_time (to figure out which one is more recent if multiple copies),
- reference the UUID instead of the id.
This can be achieved without modifying web2py. Here is what to do:
**1.** Change the above model into:
``
import uuid
# 'now' is assumed defined, e.g. now = request.now in a web2py model
db.define_table('person',
Field('uuid', length=64, default=lambda:str(uuid.uuid4())),
Field('modified_on', 'datetime', default=now),
Field('name'),
format='%(name)s')
db.define_table('thing',
Field('uuid', length=64, default=lambda:str(uuid.uuid4())),
Field('modified_on', 'datetime', default=now),
Field('owner', length=64),
Field('name'),
format='%(name)s')
db.thing.owner.requires = IS_IN_DB(db,'person.uuid','%(name)s')
if not db(db.person.id).count():
id = uuid.uuid4()
db.person.insert(name="Massimo", uuid=id)
db.thing.insert(owner=id, name="Chair")
``:code
-------
Note, in the above table definitions, the default value for the two 'uuid' fields is set to a lambda function, which returns a UUID (converted to a string). The lambda function is called once for each record inserted, ensuring that each record gets a unique UUID, even if multiple records are inserted in a single transaction.
-------
**2.** Create a controller action to export the database:
``
def export():
    import StringIO
    s = StringIO.StringIO()
db.export_to_csv_file(s)
response.headers['Content-Type'] = 'text/csv'
return s.getvalue()
``:code
**3.** Create a controller action to import a saved copy of the other database and sync records:
``
def import_and_sync():
    # one possible implementation: import the CSV, then for every
    # table and every uuid keep only the most recent record
    form = FORM(INPUT(_type='file', _name='data'), INPUT(_type='submit'))
    if form.process().accepted:
        db.import_from_csv_file(form.vars.data.file, unique=False)
        for table in db.tables:
            items = db(db[table]).select(db[table].id, db[table].uuid,
                                         orderby=db[table].modified_on,
                                         groupby=db[table].uuid)
            for item in items:
                db((db[table].uuid==item.uuid) &
                   (db[table].id!=item.id)).delete()
    return dict(form=form)
``:code
The ``uuid`` and ``modified_on`` field names in the synchronization logic are specific for this example.
``XML-RPC``:inxx
Alternatively, you can use XML-RPC to export/import the file.
If the records reference uploaded files, you also need to export/import the content of the uploads folder. Notice that files therein are already labeled by UUIDs so you do not need to worry about naming conflicts and references.
#### HTML and XML (one Table at a time)
``DALRows objects``:inxx
DALRows objects also have an ``xml`` method (like helpers) that serializes them to XML/HTML:
``HTML``:inxx
``
>>> rows = db(db.person.id > 0).select()
>>> print rows.xml()
<table>
<thead>
<tr>
<th>person.id</th>
<th>person.name</th>
   <th>thing.id</th>
   <th>thing.name</th>
   <th>thing.owner</th>
</tr>
</thead>
<tbody>
<tr class="even">
<td>1</td>
<td>Alex</td>
<td>1</td>
<td>Boat</td>
<td>1</td>
</tr>
...
</tbody>
</table>
``:code
web2py comes with a Database Abstraction Layer (DAL), an API that maps Python objects into database objects such as queries, tables, and records. The DAL dynamically generates the SQL in real time using the specified dialect for the database back end, so that you do not have to write SQL code or learn different SQL dialects (the term SQL is used generically), and the application will be portable among different types of databases. At the time of this writing, the supported databases are SQLite (which comes with Python and thus web2py), PostgreSQL, MySQL, Oracle, MSSQL, FireBird, DB2, Informix, Ingres, and (partially) Google App Engine (SQL and NoSQL). More databases are supported experimentally; please check the web2py web site and mailing list for more recent adapters. Google NoSQL is treated as a particular case in Chapter 13.
The Windows binary distribution works out of the box with SQLite and MySQL. The Mac binary distribution works out of the box with SQLite.
To use any other database back-end, run from the source distribution and install the appropriate driver for the required back end.
``database drivers``:inxx
Once the proper driver is installed, start web2py from source, and it will find the driver. Here is a list of drivers:
----------
database | driver (source)
SQLite | sqlite3 or pysqlite2 or zxJDBC ``zxjdbc``:cite (on Jython)
PostgreSQL | psycopg2 ``psycopg2``:cite or zxJDBC ``zxjdbc``:cite (on Jython)
MySQL | pymysql ``pymysql``:cite or MySQLdb ``mysqldb``:cite
Oracle | cx_Oracle ``cxoracle``:cite
MSSQL | pyodbc ``pyodbc``:cite
FireBird | kinterbasdb ``kinterbasdb``:cite
DB2 | pyodbc ``pyodbc``:cite
Informix | informixdb ``informixdb``:cite
Ingres | ingresdbi ``ingresdbi``:cite
---------
(``pymysql`` ships with web2py)
### Connection strings
``connection strings``:inxx
A connection with the database is established by creating an instance of the DAL object:
``
>>> db = DAL('sqlite://storage.db', pool_size=0)
``:code
``db`` is not a keyword; it is a local variable that stores the connection object ``DAL``. You are free to give it a different name. The constructor of ``DAL`` requires a single argument, the connection string. The connection string is the only web2py code that depends on a specific back-end database. Here are examples of connection strings for specific types of supported back-end databases (in all cases, we assume the database is running from localhost on its default port and is named "test"):
-------------
**SQLite** | ``sqlite://storage.db``
**MySQL** | ``mysql://username:password@localhost/test``
**PostgreSQL** | ``postgres://username:password@localhost/test``
**MSSQL** | ``mssql://username:password@localhost/test``
**FireBird** | ``firebird://username:password@localhost/test``
**Oracle** | ``oracle://username/password@test``
**DB2** | ``db2://username:password@test``
**Ingres** | ``ingres://username:password@localhost/test``
**Informix** | ``informix://username:password@test``
**Google App Engine/SQL** | ``google:sql``
**Google App Engine/NoSQL** | ``google:datastore``
-------------
Notice that in SQLite the database consists of a single file. If it does not exist, it is created. This file is locked every time it is accessed. In the case of MySQL, PostgreSQL, MSSQL, FireBird, Oracle, DB2, Ingres and Informix the database "test" must be created outside web2py. Once the connection is established, web2py will create, alter, and drop tables appropriately.
It is also possible to set the connection string to ``None``. In this case DAL will not connect to any back-end database, but the API can still be accessed for testing. Examples of this will be discussed in Chapter 7.
#### Connection pooling
``connection pooling``:inxx
The second argument of the DAL constructor is the ``pool_size``; it defaults to 0.
As it is rather slow to establish a new database connection for each request, web2py implements a mechanism for connection pooling. Once a connection is established and the page has been served and the transaction completed, the connection is not closed but goes into a pool. When the next http request arrives, web2py tries to obtain a connection from the pool and use that for the new transaction. If there are no available connections in the pool, a new connection is established.
The ``pool_size`` parameter is ignored by SQLite and Google App Engine.
Connections in the pools are shared sequentially among threads, in the sense that they may be used by two different but not simultaneous threads. There is only one pool for each web2py process.
When web2py starts, the pool is always empty. The pool grows up to the minimum between the value of ``pool_size`` and the max number of concurrent requests. This means that if ``pool_size=10`` but our server never receives more than 5 concurrent requests, then the actual pool size will only grow to 5. If ``pool_size=0`` then connection pooling is not used.
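For example, a minimal sketch with pooling enabled (connection string illustrative):
``
db = DAL('mysql://username:password@localhost/test', pool_size=10)
``:code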
For supported back-ends, you may also specify whether to check table and field names against reserved SQL keywords, by passing the ``check_reserved`` argument, a list of adapter names to check against; for example:
``
check_reserved=['postgres', 'postgres_nonreserved']
``:code
The following database backends support reserved words checking.
-----
**PostgreSQL** | ``postgres(_nonreserved)``
**MySQL** | ``mysql``
**FireBird** | ``firebird(_nonreserved)``
**MSSQL** | ``mssql``
**Oracle** | ``oracle``
-----
### ``DAL``, ``Table``, ``Field``
The best way to understand the DAL API is to try each function yourself. This can be done interactively via the web2py shell, although ultimately, DAL code goes in the models and controllers.
Start by creating a connection. For the sake of example, you can use SQLite. Nothing in this discussion changes when you change the back-end engine.
``DAL``:inxx ``SQLite``:inxx ``MySQL``:inxx ``PostgreSQL``:inxx ``Oracle``:inxx ``MSSQL``:inxx ``FireBird``:inxx ``DB2``:inxx ``Informix``:inxx
``
>>> db = DAL('sqlite://storage.db')
``:code
The ``format`` attribute of a table can be a string such as ``'%(name)s'``, or even a more complex one using a function:
``
>>> db.define_table('person', Field('name'),
format=lambda r: r.name or 'anonymous')
``:code
The format attribute will be used for two purposes:
- To represent referenced records in select/option drop-downs.
- To set the ``db.othertable.person.represent`` attribute for all fields referencing this table. This means that SQLTABLE will not show references by id but will use the format preferred representation instead.
``Field constructor``:inxx
These are the default values of a Field constructor:
``
Field(name, 'string', length=None, default=None,
required=False, requires='<default>',
ondelete='CASCADE', notnull=False, unique=False,
uploadfield=True, widget=None, label=None, comment=None,
writable=True, readable=True, update=None, authorize=None,
autodelete=False, represent=None, compute=None,
uploadfolder=os.path.join(request.folder,'uploads'),
uploadseparate=None)
``:code
Not all of them are relevant for every field. "length" is relevant only for fields of type "string". "uploadfield" and "authorize" are relevant only for fields of type "upload". "ondelete" is relevant only for fields of type "reference" and "upload".
- ``length`` sets the maximum length of a "string", "password" or "upload" field. If ``length`` is not specified a default value is used but the default value is not guaranteed to be backward compatible. ''To avoid unwanted migrations on upgrades, we recommend that you always specify the length for string, password and upload fields.''
- ``default`` sets the default value for the field. The default value is used when performing an insert if a value is not explicitly specified. It is also used to pre-populate forms built from the table using SQLFORM. Note, rather than being a fixed value, the default can instead be a function (including a lambda function) that returns a value of the appropriate type for the field. In that case, the function is called once for each record inserted, even when multiple records are inserted in a single transaction.
- ``required`` tells the DAL that no insert should be allowed on this table if a value for this field is not explicitly specified.
- ``requires`` is a validator or a list of validators. This is not used by the DAL, but it is used by SQLFORM. The default validators for the given types are shown in the following table:
----------
**field type** | **default field validators**
``string`` | ``IS_LENGTH(length)`` default length is 512
``text`` | ``IS_LENGTH(65536)``
``blob`` | ``None``
``boolean`` | ``None``
``integer`` | ``IS_INT_IN_RANGE(-1e100, 1e100)``
``double`` | ``IS_FLOAT_IN_RANGE(-1e100, 1e100)``
``decimal(n,m)`` | ``IS_DECIMAL_IN_RANGE(-1e100, 1e100)``
``date`` | ``IS_DATE()``
``time`` | ``IS_TIME()``
``datetime`` | ``IS_DATETIME()``
``password`` | ``None``
``upload`` | ``None``
``reference <table>`` | ``IS_IN_DB(db,table.field,format)``
``list:string`` | ``None``
``list:integer`` | ``None``
``list:reference <table>`` | ``IS_IN_DB(db,table.field,format,multiple=True)``
---------
Decimal requires and returns values as ``Decimal`` objects, as defined in the Python ``decimal`` module. SQLite does not handle the ``decimal`` type so internally we treat it as a ``double``. The (n,m) are the number of digits in total and the number of digits after the decimal point respectively.
The ``list:`` fields are special because they are designed to take advantage of certain denormalization features on NoSQL (in the case of Google App Engine NoSQL, the field types ``ListProperty`` and ``StringListProperty``) and back-port them to all the other supported relational databases. On relational databases lists are stored as a ``text`` field. The items are separated by a ``|`` and each ``|`` in a string item is escaped as ``||``. They are discussed in their own section.
-------
Notice that ``requires=...`` is enforced at the level of forms, ``required=True`` is enforced at the level of the DAL (insert), while ``notnull``, ``unique`` and ``ondelete`` are enforced at the level of the database. While they sometimes may seem redundant, it is important to maintain the distinction when programming with the DAL.
-------
``ondelete``:inxx
- ``ondelete`` translates into the "ON DELETE" SQL statement. By default it is set to "CASCADE". This tells the database that when it deletes a record, it should also delete all records that refer to it. To disable this feature, set ``ondelete`` to "NO ACTION" or "SET NULL".
- ``notnull=True`` translates into the "NOT NULL" SQL statement. It prevents the database from inserting null values for the field.
- ``unique=True`` translates into the "UNIQUE" SQL statement and it makes sure that values of this field are unique within the table. It is enforced at the database level.
- ``uploadfield`` applies only to fields of type "upload". A field of type "upload" stores the name of a file saved somewhere else, by default on the filesystem under the application "uploads/" folder. If ``uploadfield`` is set, then the file is stored in a blob field within the same table and the value of ``uploadfield`` is the name of the blob field. This will be discussed in more detail later in the context of SQLFORM.
- ``uploadfolder`` defaults to the application's "uploads/" folder. If set to a different path, files will be uploaded to that folder. For example, ``uploadfolder=os.path.join(request.folder,'static/temp')`` will upload files to the web2py/applications/myapp/static/temp folder (see the sketch after this list).
- ``uploadseparate`` if set to True will upload files under different subfolders of the ''uploadfolder'' folder. This is optimized to avoid too many files under the same folder/subfolder. ATTENTION: You cannot change the value of ``uploadseparate`` from True to False without breaking the system. web2py either uses the separate subfolders or it does not. Changing the behavior after files have been uploaded will prevent web2py from being able to retrieve those files. If this happens it is possible to move files and fix the problem but this is not described here.
- ``widget`` must be one of the available widget objects, including custom widgets, for example: ``SQLFORM.widgets.string.widget``. A list of available widgets will be discussed later. Each field type has a default widget.
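A hedged sketch tying several of these attributes together (table and field names are hypothetical; assumes a web2py model context where ``request`` is available):
``
import os
db.define_table('document',
    Field('title', length=128, default='untitled', required=True,
          label='Title', comment='a short descriptive title'),
    Field('attachment', 'upload',
          uploadfolder=os.path.join(request.folder, 'uploads'),
          uploadseparate=True, autodelete=True))
``:code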
- ``label`` is a string (or something that can be serialized to a string) that contains the label to be used for this field in autogenerated forms.
- ``comment`` is a string (or something that can be serialized to a string) that contains a comment associated with this field, and will be displayed to the right of the input field in the autogenerated forms.
- ``writable`` if a field is writable, it can be edited in autogenerated create and update forms.
- ``readable`` if a field is readable, it will be visible in readonly forms. If a field is neither readable nor writable, it will not be displayed in create and update forms.
- ``update`` contains the default value for this field when the record is updated.
- ``compute`` is an optional function. If a record is inserted or updated, the compute function will be executed and the field will be populated with the function result. The record is passed to the compute function as a ``dict``, and the dict will not include the current value of that, or any other compute field.
- ``authorize`` can be used to require access control on the corresponding field, for "upload" fields only. It will be discussed more in detail in the context of Authentication and Authorization.
- ``autodelete`` determines if the corresponding uploaded file should be deleted when the record referencing the file is deleted. For "upload" fields only.
- ``represent`` can be None or can point to a function that takes a field value and returns an alternate representation for the field value. Examples:
``
db.mytable.name.represent = lambda name,row: name.capitalize()
db.mytable.other_id.represent = lambda id,row: row.myfield
db.mytable.some_uploadfield.represent = lambda value,row: \
A('get it', _href=URL('download', args=value))
``:code
``blob``:inxx
"blob" fields are also special. By default, binary data is encoded in base64 before being stored into the actual database field, and it is decoded when extracted. This has the negative effect of using 25% more storage space than necessary in blob fields, but has two advantages. On average it reduces the amount of data communicated between web2py and the database server, and it makes the communication independent of back-end-specific escaping conventions.
The DAL allows you to explicitly issue SQL statements.
>>> print db.executesql('SELECT * FROM person;')
[(1, u'Massimo'), (2, u'Massimo')]
``:code
In this case, the return values are not parsed or transformed by the DAL, and the format depends on the specific database driver. This usage with selects is normally not needed, but it is more common with indexes.
``executesql`` takes two optional arguments: ``placeholders`` and ``as_dict``
``placeholders`` is an optional
sequence of values to be substituted in
or, if supported by the DB driver, a dictionary with keys
matching named placeholders in your SQL.
If ``as_dict`` is set to True,
and the results cursor returned by the DB driver will be
converted to a sequence of dictionaries keyed with the db
field names. Results returned with ``as_dict = True ``are
the same as those returned when applying **.as_list()** to a normal select.
``
[{field1: value1, field2: value2}, {field1: value1b, field2: value2b}]
``:code
#### ``_lastsql``
Whether SQL was executed manually using executesql or was SQL generated by the DAL, you can always find the SQL code in ``db._lastsql``. This is useful for debugging purposes:
``_lastdb``:inxx
``
>>> rows = db().select(db.person.ALL)
>>> print db._lastsql
SELECT person.id, person.name FROM person;
``:code
-------
web2py never generates queries using the "*" operator. web2py is always explicit when selecting fields.
-------
### ``drop``
Finally, you can drop tables and all data will be lost:
``drop``:inxx
db.define_table('account',
Field('accnum','integer'),
Field('acctype'),
Field('accdesc'),
primarykey=['accnum','acctype'],
migrate=False)
``:code
- ``primarykey`` is a list of the field names that make up the primary key.
- All primarykey fields have a ``NOT NULL`` set even if not specified.
- Keyed table can only refer are to other keyed tables.
- Referenceing fields must use the ``reference tablename.fieldname`` format.
- The ``update_record`` function is not available for Rows of keyed tables.
-------
Note that currently this is only available for DB2, MS-SQL, Ingres and Informix, but others can be easily added.
-------
At the time of writing, we cannot guarantee that the ``primarykey`` attribute works with every existing legacy table and every supported database backend.
For simplicity, we recommend, if possible, creating a database view that has an auto-increment id field.
### Distributed transaction
``distributed transactions``:inxx
------
At the time of writing this feature is only supported
by PostgreSQL, MySQL and Firebird, since they expose API for two-phase commits.
------
Assuming you have two (or more) connections to distinct PostgreSQL databases, for example:
``
db_a = DAL('postgres://...')
db_b = DAL('postgres://...')
``:code
In your models or controllers, you can commit them concurrently with:
``
DAL.distributed_transaction_commit(db_a, db_b)
``:code
On failure, this function rolls back and raises an ``Exception``.
In controllers, when one action returns, if you have two distinct connections and you do not call the above function, web2py commits them separately. This means there is a possibility that one of the commits succeeds and one fails. The distributed transaction prevents this from happening.
### Manual uploads
Consider the following model:
``
->>> db.define_table('myfile', Field('image', 'upload'))
``:code
Normally an insert is handled automatically via a SQLFORM or a crud form (which is a SQLFORM) but occasionally you already have the file on the filesystem and want to upload it programmatically. This can be done in this way:
``
>>> stream = open(filename, 'rb')
>>> db.myfile.insert(image=db.myfile.image.store(stream, filename))
``:code
The ``store`` method of the upload field object takes a file stream and a filename. It uses the filename to determine the extension (type) of the file, creates a new temp name for the file (according to web2py upload mechanism) and loads the file content in this new temp file (under the uploads folder unless specified otherwise). It returns the new temp name, which is then stored in the ``image`` field of the ``db.myfile`` table.
Note, if the file is to be stored in an associated blob field rather than the file system, the ``store()`` method will not insert the file in the blob field (because ``store()`` is called before the insert), so the file must be explicitly inserted into the blob field:
``
>>> db.define_table('myfile',
Field('image', 'upload', uploadfield='image_file'),
Field('image_file', 'blob'))
>>> stream = open(filename, 'rb')
>>> db.myfile.insert(image=db.myfile.image.store(stream, filename),
image_file=stream.read())
``:code
The opposite of ``.store`` is ``.retrieve``:
``
>>> row = db(db.myfile).select().first()
>>> (filename, stream) = db.myfile.image.retrieve(row.image)
>>> import shutil
>>> shutil.copyfileobj(stream,open(filename,'wb'))
``
### ``Query``, ``Set``, ``Rows``
Let's consider again the table defined (and dropped) previously and insert three records:
``
>>> db.define_table('person', Field('name'))
>>> db.person.insert(name="Alex")
1
>>> db.person.insert(name="Bob")
2
>>> db.person.insert(name="Carl")
3
which is equivalent to
db(db.mytable.id==id).update(myfield='somevalue')
``:code
and it updates an existing record with field values specified by the dictionary on the right hand side.
#### Fetching a ``Row``
Yet another convenient syntax is the following:
``
record = db.mytable(id)
record = db.mytable(db.mytable.id==id)
record = db.mytable(id,myfield='somevalue')
``:code
Apparently similar to ``db.mytable[id]`` the above syntax is more flexible and safer. First of all it checks whether ``id`` is an int (or ``str(id)`` is an int) and returns ``None`` if not (it never raises an exception). It also allows to specify multiple conditions that the record must meet. If they are not met, it also returns ``None``.
#### Recursive ``select``s
``recursive selects``:inxx
Consider the previous table person and a new table "dog" referencing a "person":
``
>>> db.define_table('dog', Field('name'), Field('owner','reference person'))
``:code
and a simple select from this table:
``
>>> dogs = db(db.dog).select()
``:code
which is equivalent to
``
>>> dogs = db(db.dog._id>0).select()
``:code
where ``._id`` is a reference to the primary key of the table. Normally ``db.dog._id`` is the same as ``db.dog.id`` and we will assume that in most of this book. ``_id``:inxx
For each Row of dogs it is possible to fetch not just fields from the selected table (dog) but also from linked tables (recursively):
``
>>> for dog in dogs: print dog.name, dog.owner.name
``:code
Here ``dog.owner.name`` requires one database select for each dog in dogs and it is therefore inefficient. We suggest using joins whenever possible instead of recursive selects, nevertheless this is convenient and practical when accessing individual records.
You can also do it backwards, by selecting the dogs referenced by a person:
``
person = db.person(id)
-for dog in person.dog.select(orderby=db.dog.name):
- print person.name, 'owns', dog.name
``:code
In this last expressions ``person.dog`` is a shortcut for
``
db(db.dog.owner==person.id)
``:code
i.e. the Set of ``dog``s referenced by the current ``person``. This syntax breaks down if the referencing table has multiple references to the referenced table. In this case one needs to be more explicit and use a full Query.
#### Serializing ``Rows`` in views
Given the following action containing a query
``SQLTABLE``:inxx
``
def index()
return dict(rows = db(query).select())
``:code
The result of a select can be displayed in a view with the following syntax:
``
{{extend 'layout.html'}}
<h1>Records</h1>
{{=rows}}
``:code
Which is equivalent to:
Due to Python restrictions in overloading "``and``" and "``or``" operators, thes
It is also possible to build queries using in-place logical operators:
``
>>> query = db.person.name!='Alex'
>>> query &= db.person.id>3
>>> query |= db.person.name=='John'
``
#### ``count``, ``isempty``, ``delete``, ``update``
You can count records in a set:
``count``:inxx ``isempty``:inxx
``
>>> print db(db.person.id > 0).count()
3
``:code
Notice that ``count`` takes an optional ``distinct`` argument which defaults to False, and it works very much like the same argument for ``select``.
Sometimes you may need to check is a table is empty. A more efficient way than counting is using the ``isempty`` method:
``
>>> print db(db.person.id > 0).isempty()
False
``:code
or equivalently:
``
>>> print db(db.person).isempty()
False
``:code
You can delete records in a set:
``delete``:inxx
``
>>> db(db.person.id > 3).delete()
In this case ``row.total_price`` is not a value but a function. The function tak
The lazy field in the example above allows one to compute the total price for each ``item``:
``
>>> for row in db(db.item).select(): print row.total_price()
``
And it also allows to pass an optional ``discount`` percentage (15%):
``
>>> for row in db(db.item).select(): print row.total_price(15)
``
------
Mind that virtual fields do not have the same attributes as the other fields (default, readable, requires, etc) and they do not appear in the list of ``db.table.fields`` and are not visualized by default in tables (TABLE) and grids (SQLFORM.grid, SQLFORM.smartgrid).
------
### One to many relation
``one to many``:inxx
To illustrate how to implement one to many relations with the web2py DAL, define another table "dog" that refers to the table "person" which we redefine here:
``
>>> db.define_table('person',
Field('name'),
format='%(name)s')
>>> db.define_table('dog',
Field('name'),
Field('owner', 'reference person'),
format='%(name)s')
``:code
Table "dog" has two fields, the name of the dog and the owner of the dog. When a field type is another table, it is intended that the field reference the other table by its id. In fact, you can print the actual type value and get:
``
>>> print db.dog.owner.type
reference person
``:code
Now, insert three dogs, two owned by Alex and one by Bob:
``
>>> db.dog.insert(name='Skipper', owner=1)
1
>>> db.dog.insert(name='Snoopy', owner=1)
2
>>> db.dog.insert(name='Puppy', owner=2)
3
``:code
You can select as you did for any other table:
``
>>> for row in db(db.dog.owner==1).select():
print row.name
-Skipper
-Snoopy
``:code
Because a dog has a reference to a person, a person can have many dogs, so a record of table person now acquires a new attribute dog, which is a Set, that defines the dogs of that person. This allows looping over all persons and fetching their dogs easily:
``referencing``:inxx
``
>>> for person in db().select(db.person.ALL):
print person.name
- for dog in person.dog.select():
- print ' ', dog.name
Alex
- Skipper
- Snoopy
Bob
Puppy
Carl
``:code
#### Inner joins
Another way to achieve a similar result is by using a join, specifically an INNER JOIN. web2py performs joins automatically and transparently when the query links two or more tables as in the following example:
``Rows``:inxx ``inner join``:inxx ``join``:inxx
``
>>> rows = db(db.person.id==db.dog.owner).select()
>>> for row in rows:
- print row.person.name, 'has', row.dog.name
-Alex has Skipper
-Alex has Snoopy
-Bob has Puppy
``:code
Observe that web2py did a join, so the rows now contain two records, one from each table, linked together. Because the two records may have fields with conflicting names, you need to specify the table when extracting a field value from a row. This means that while before you could do:
``
row.name
``:code
and it was obvious whether this was the name of a person or a dog, in the result of a join you have to be more explicit and say:
``
row.person.name
``:code
or:
``
row.dog.name
``:code
There is an alterantive syntax for INNER JOINS:
``
>>> rows = db(db.person).select(join=db.dog.on(db.person.id==db.dog.owner))
>>> for row in rows:
- print row.person.name, 'has', row.dog.name
-Alex has Skipper
-Alex has Snoopy
-Bob has Puppy
``:code
While the output is the same, the generated SQL in the two cases can be different. The latter syntax removes possible ambiguities when the same table is joined twice and aliased:
``
>>> db.define_table('dog',
Field('name'),
Field('owner1','reference person'),
Field('owner2','reference person'))
>>> rows = db(db.person).select(
- join=[db.person.with_alias('owner1').on(db.person.id==db.dog.owner1).
- db.person.with_alias('owner2').on(db.person.id==db.dog.owner2)])
``
The value of ``join`` can be list of ``db.table.on(...)`` to join.
#### Left outer join
Notice that Carl did not appear in the list above because he has no dogs. If you intend to select on persons (whether they have dogs or not) and their dogs (if they have any), then you need to perform a LEFT OUTER JOIN. This is done using the argument "left" of the select command. Here is an example:
``Rows``:inxx ``left outer join``:inxx ``outer join``:inxx
``
>>> rows=db().select(
- db.person.ALL, db.dog.ALL,
- left=db.dog.on(db.person.id==db.dog.owner))
>>> for row in rows:
- print row.person.name, 'has', row.dog.name
-Alex has Skipper
-Alex has Snoopy
-Bob has Puppy
Carl has None
``:code
where:
``
left = db.dog.on(...)
``:code
does the left join query. Here the argument of ``db.dog.on`` is the condition required for the join (the same used above for the inner join). In the case of a left join, it is necessary to be explicit about which fields to select.
Multiple left joins can be combined by passing a list or tuple of ``db.mytable.on(...)`` to the ``left`` attribute.
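For instance, a sketch with a second, hypothetical ``toy`` table referencing ``dog``, just to show the list form:
``
rows = db().select(db.person.ALL, db.dog.ALL, db.toy.ALL,
                   left=[db.dog.on(db.person.id==db.dog.owner),
                         db.toy.on(db.toy.dog==db.dog.id)])
``:code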
#### Grouping and counting
When doing joins, sometimes you want to group rows according to certain criteria and count them. For example, count the number of dogs owned by every person. web2py allows this as well. First, you need a count operator. Second, you want to join the person table with the dog table by owner. Third, you want to select all rows (person + dog), group them by person, and count them while grouping:
``grouping``:inxx
``
>>> count = db.person.id.count()
>>> for row in db(db.person.id==db.dog.owner).select(
db.person.name, count, groupby=db.person.name):
print row.person.name, row[count]
Alex 2
Bob 1
``:code
Notice the count operator (which is built-in) is used as a field. The only issue here is in how to retrieve the information. Each row clearly contains a person and the count, but the count is not a field of a person nor is it a table. So where does it go? It goes into the storage object representing the record with a key equal to the query expression itself.
### Many to many
``many-to-many``:inxx
In the previous examples, we allowed a dog to have one owner but one person could have many dogs. What if Skipper was owned by Alex and Curt? This requires a many-to-many relation, and it is realized via an intermediate table that links a person to a dog via an ownership relation.
Here is how to do it:
``
>>> db.define_table('person',
Field('name'))
>>> db.define_table('dog',
Field('name'))
>>> db.define_table('ownership',
Field('person', 'reference person'),
Field('dog', 'reference dog'))
``:code
The existing ownership relationships can now be rewritten as:
``
>>> db.ownership.insert(person=1, dog=1) # Alex owns Skipper
>>> db.ownership.insert(person=1, dog=2) # Alex owns Snoopy
>>> db.ownership.insert(person=2, dog=3) # Bob owns Puppy
``:code
Now you can add the new relation that Curt co-owns Skipper:
``
>>> db.ownership.insert(person=3, dog=1) # Curt owns Skipper too
``:code
Because you now have a three-way relation between tables, it may be convenient to define a new set on which to perform operations:
``
>>> persons_and_dogs = db(
(db.person.id==db.ownership.person) \
& (db.dog.id==db.ownership.dog))
``:code
Now it is easy to select all persons and their dogs from the new Set:
``
>>> for row in persons_and_dogs.select():
        print row.person.name, row.dog.name
Alex Skipper
Alex Snoopy
Bob Puppy
Curt Skipper
``:code
Similarly, you can search for all dogs owned by Alex:
``
>>> for row in persons_and_dogs(db.person.name=='Alex').select():
        print row.dog.name
Skipper
Snoopy
``:code
and all owners of Skipper:
``
>>> for row in persons_and_dogs(db.dog.name=='Skipper').select():
print row.person.name
Alex
Curt
``:code
A lighter alternative to many-to-many relations is tagging. Tagging is discussed in the context of the ``IS_IN_DB`` validator. Tagging works even on database backends that do not support JOINs, such as the Google App Engine NoSQL.
### Many to many, ``list:<type>``, and ``contains``
``list:string``:inxx
``list:integer``:inxx
``list:reference``:inxx
``contains``:inxx
``multiple``:inxx
``tags``:inxx
web2py provides the following special field types:
``
list:string
list:integer
list:reference <table>
``:code
They can contain, respectively, a list of strings, a list of integers, and a list of references to records of another table. Such fields are queried with the ``contains`` operator.
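For example, with a ``list:string`` field, ``contains(value)`` matches records whose list includes the value (the ``product`` table here is only for illustration):
``
>>> db.define_table('product',
        Field('name'),
        Field('colors', 'list:string'))
>>> db.product.insert(name='Toy Car', colors=['red','green','blue'])
>>> for item in db(db.product.colors.contains('red')).select():
        print item.name, item.colors
Toy Car ['red', 'green', 'blue']
``:code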
Let's define another table "log" to store security events, their event_time and severity, where severity is an integer number:
``
>>> db.define_table('log', Field('event'),
        Field('event_time', 'datetime'),
        Field('severity', 'integer'))
``:code
As before, insert a few events, a "port scan", an "xss injection" and an "unauthorized login".
For the sake of the example, you can log events with the same event_time but with different severities (1, 2, 3 respectively).
``
>>> import datetime
>>> now = datetime.datetime.now()
>>> print db.log.insert(
event='port scan', event_time=now, severity=1)
1
>>> print db.log.insert(
event='xss injection', event_time=now, severity=2)
2
>>> print db.log.insert(
event='unauthorized login', event_time=now, severity=3)
3
``:code
#### ``like``, ``startswith``, ``contains``, ``upper``, ``lower``
``like``:inxx ``startswith``:inxx
``contains``:inxx ``upper``:inxx ``lower``:inxx
Fields have a ``like`` operator that you can use to match strings:
``
>>> for row in db(db.log.event.like('port%')).select():
print row.event
port scan
``:code
Here "port%" indicates a string starting with "port". The percent sign character, "%", is a wild-card character that means "any sequence of characters".
web2py also provides some shortcuts:
``
db.mytable.myfield.startswith('value')
db.mytable.myfield.contains('value')
``:code
which are equivalent respectively to
``
db.mytable.myfield.like('value%')
db.mytable.myfield.like('%value%')
``:code
Notice that ``contains`` has a special meaning for ``list:<type>`` fields and it was discussed in a previous section.
The ``contains`` method can also be passed a list of values and an optional boolean argument ``all`` to search for records that contain all values:
``
db.mytable.myfield.contains(['value1','value2'], all=True)
``:code
or any value from the list:
``
db.mytable.myfield.contains(['value1','value2'], all=False)
``:code
The ``upper`` and ``lower`` methods allow you to convert the value of the field to upper or lower case, and you can also combine them with the like operator:
``upper``:inxx ``lower``:inxx
``
>>> for row in db(db.log.event.upper().like('PORT%')).select():
print row.event
port scan
``:code
#### ``year``, ``month``, ``day``, ``hour``, ``minutes``, ``seconds``
``hour``:inxx ``minutes``:inxx ``seconds``:inxx ``day``:inxx ``month``:inxx ``year``:inxx
The date and datetime fields have day, month and year methods. The datetime and time fields have hour, minutes and seconds methods. Here is an example:
``
>>> for row in db(db.log.event_time.year()==2009).select():
print row.event
port scan
xss injection
unauthorized login
``:code
#### ``belongs``
The SQL IN operator is realized via the ``belongs`` method, which returns true when the field value belongs to the specified set (list or tuple):
``belongs``:inxx
``
>>> for row in db(db.log.severity.belongs((1, 2))).select():
print row.event
port scan
xss injection
``:code
The DAL also allows a nested select as the argument of the belongs operator. The only caveat is that the nested select has to be a ``_select``, not a ``select``, and only one field has to be selected explicitly, the one that defines the set.
``nested select``:inxx
``
>>> bad_days = db(db.log.severity==3)._select(db.log.event_time)
>>> for row in db(db.log.event_time.belongs(bad_days)).select():
print row.event
port scan
xss injection
unauthorized login
``:code
#### ``sum``, ``min``, ``max`` and ``len``
``sum``:inxx ``min``:inxx ``max``:inxx
Previously, you have used the count operator to count records. Similarly, you can use the sum operator to add (sum) the values of a specific field from a group of records. As in the case of count, the result of a sum is retrieved via the storage object:
``
>>> sum = db.log.severity.sum()
>>> print db().select(sum).first()[sum]
6
``:code
You can also use ``min`` and ``max`` to retrieve the minimum and maximum value for the selected records:
``
>>> max = db.log.severity.max()
>>> print db().select(max).first()[max]
3
``:code
``.len()`` computes the length of a string, text, or boolean field.
Expressions can be combined to form more complex expressions. For example, here we compute the sum of the lengths of all the severity strings in the logs, increased by one:
``
>>> sum = (db.log.severity.len()+1).sum()
>>> print db().select(sum).first()[sum]
``:code
#### Substrings
One can build an expression to refer to a substring. For example, we can group dogs whose name starts with the same three characters and select only one from each group:
``
db(db.dog).select(distinct=db.dog.name[:3])
``:code
#### Default values with ``coalesce`` and ``coalesce_zero``
There are times when you need to pull a value from the database but also need a default value if the value for a record is set to NULL. In SQL there is a keyword, ``COALESCE``, for this. web2py has an equivalent ``coalesce`` method:
``
>>> db.define_table('sysuser', Field('username'), Field('fullname'))
>>> db.sysuser.insert(username='max', fullname='Max Power')
>>> db.sysuser.insert(username='tim', fullname=None)
>>> print db(db.sysuser).select(db.sysuser.fullname.coalesce(db.sysuser.username))
"COALESCE(sysuser.fullname,sysuser.username)"
Max Power
tim
``:code
Other times you need to compute a mathematical expression but some fields have a value set to None when it should be zero.
``coalesce_zero`` comes to the rescue by defaulting None to zero in the query:
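A sketch of how this looks; the ``gamer`` table, its ``points`` field, and the values are made up for illustration:
``
>>> db.define_table('gamer', Field('username'), Field('points', 'integer'))
>>> db.gamer.insert(username='max', points=10)
>>> db.gamer.insert(username='tim', points=None)  # None should count as zero
>>> total = db.gamer.points.coalesce_zero().sum()
>>> print db(db.gamer).select(total).first()[total]
10
``:code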
#### Generating raw SQL
Every DAL command that performs database IO has an equivalent command that does not execute it and simply returns the SQL that would have been executed; these commands have the same names and syntax as the functional ones but start with an underscore (``_insert``, ``_count``, ``_select``, ``_delete``). And finally, here is ``_update``: ``_update``:inxx
``
>>> print db(db.person.name=='Alex')._update()
UPDATE person SET WHERE person.name='Alex';
``:code
-----
Moreover you can always use ``db._lastsql`` to return the most recent
SQL code, whether it was executed manually using ``executesql`` or was SQL
generated by the DAL.
-----
### Exporting and importing data
``export``:inxx ``import``:inxx
#### CSV (one Table at a time)
When a DALRows object is converted to a string it is automatically
serialized in CSV:
``csv``:inxx
``
>>> rows = db(db.person.id==db.dog.owner).select()
>>> print rows
person.id,person.name,dog.id,dog.name,dog.owner
1,Alex,1,Skipper,1
1,Alex,2,Snoopy,1
2,Bob,3,Puppy,2
``:code
You can serialize a single table in CSV and store it in a file "test.csv":
``
>>> open('test.csv', 'w').write(str(db(db.person.id).select()))
``:code
and you can easily read it back with:
``
>>> db.person.import_from_csv_file(open('test.csv', 'r'))
``:code
When importing, web2py looks for the field names in the CSV header. In this example, it finds two columns: "person.id" and "person.name". It ignores the "person." prefix, and it ignores the "id" fields. Then all records are appended and assigned new ids. Both of these operations can be performed via the appadmin web interface.
#### CSV (all tables at once)
In web2py, you can backup/restore an entire database with two commands:
To export:
``
db.export_to_csv_file(open('somefile.csv', 'wb'))
``:code
To import:
``
db.import_from_csv_file(open('somefile.csv', 'rb'))
``:code
In the file, each table starts with a line containing the table name and a line with the field names. Two tables are separated by ``\r\n\r\n``. The file ends with the line
``
END
``:code
The file does not include uploaded files if these are not stored in the database. In any case it is easy enough to zip the "uploads" folder separately.
When importing, the new records will be appended to the database if it is not empty. In general the new imported records will not have the same record id as the original (saved) records but web2py will restore references so they are not broken, even if the id values may change.
If a table contains a field called
"uuid", this field will be used to identify duplicates. Also, if an
imported record has the same "uuid" as an existing record, the
previous record will be updated.
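For a single table, this mechanism corresponds to the ``unique`` argument of ``import_from_csv_file`` (``'uuid'`` is its default value); a sketch:
``
db.person.import_from_csv_file(open('test.csv', 'r'), unique='uuid')
``:code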
#### CSV and remote database synchronization
Consider the following model:
``
db = DAL('sqlite:memory:')
db.define_table('person',
Field('name'),
format='%(name)s')
db.define_table('dog',
Field('owner', 'reference person'),
Field('name'),
format='%(name)s')
if not db(db.person).count():
id = db.person.insert(name="Massimo")
db.dog.insert(owner=id, name="Snoopy")
``:code
Each record is identified by an ID and referenced by that ID. If you
have two copies of the database used by distinct web2py installations,
the ID is unique only within each database and not across the databases.
This is a problem when merging records from different databases.
In order to make a record uniquely identifiable across databases, they
must:
- have a unique id (UUID),
- have an event_time (to figure out which one is more recent if multiple copies),
- reference the UUID instead of the id.
This can be achieved without modifying web2py. Here is what to do:
**1.** Change the above model into:
``
import datetime, uuid
now = datetime.datetime.now()  # inside a web2py request, request.now also works
db.define_table('person',
    Field('uuid', length=64, default=lambda:str(uuid.uuid4())),
    Field('modified_on', 'datetime', default=now),
Field('name'),
format='%(name)s')
db.define_table('dog',
Field('uuid', length=64, default=lambda:str(uuid.uuid4())),
Field('modified_on', 'datetime', default=now),
Field('owner', length=64),
Field('name'),
format='%(name)s')
db.dog.owner.requires = IS_IN_DB(db,'person.uuid','%(name)s')
if not db(db.person.id).count():
id = uuid.uuid4()
db.person.insert(name="Massimo", uuid=id)
db.dog.insert(owner=id, name="Snoopy")
``:code
-------
Note, in the above table definitions, the default value for the two 'uuid' fields is set to a lambda function, which returns a UUID (converted to a string). The lambda function is called once for each record inserted, ensuring that each record gets a unique UUID, even if multiple records are inserted in a single transaction.
-------
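To see why the lambda matters, contrast the two forms (a sketch):
``
Field('uuid', length=64, default=str(uuid.uuid4()))          # evaluated once at define time: all inserts share one UUID
Field('uuid', length=64, default=lambda:str(uuid.uuid4()))   # evaluated at each insert: every record gets a fresh UUID
``:code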
**2.** Create a controller action to export the database:
``
def export():
    import StringIO  # Python 2 stdlib; needed if not already imported
    s = StringIO.StringIO()
db.export_to_csv_file(s)
response.headers['Content-Type'] = 'text/csv'
return s.getvalue()
``:code
**3.** Create a controller action to import a saved copy of the other database and sync records:
This action must import the CSV file produced by ``export`` and then remove duplicates: for each ``uuid``, only the most recent record (according to ``modified_on``) should be kept. The model in step 1 and the actions in steps 2 and 3 are specific for this example.
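A sketch of such an action, assuming the ``uuid`` and ``modified_on`` fields from step 1 (the de-duplication strategy shown is one reasonable choice, not the only one):
``
def import_and_sync():
    form = FORM(INPUT(_type='file', _name='data'), INPUT(_type='submit'))
    if form.process().accepted:
        db.import_from_csv_file(form.vars.data.file, unique=False)
        # for every table, for every uuid, delete all but one record
        for tablename in db.tables:
            table = db[tablename]
            items = db(table).select(table.id, table.uuid,
                                     orderby=table.modified_on,
                                     groupby=table.uuid)
            for item in items:
                db((table.uuid==item.uuid)&(table.id!=item.id)).delete()
    return dict(form=form)
``:code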
``XML-RPC``:inxx
Alternatively, you can use XML-RPC to export/import the file.
If the records reference uploaded files, you also need to export/import the content of the uploads folder. Notice that files therein are already labeled by UUIDs so you do not need to worry about naming conflicts and references.
#### HTML and XML (one Table at a time)
``DALRows objects``:inxx
DALRows objects also have an ``xml`` method (like helpers) that serializes them to XML/HTML:
``HTML``:inxx
``
>>> rows = db(db.person.id > 0).select()
>>> print rows.xml()
<table>
<thead>
<tr>
<th>person.id</th>
<th>person.name</th>
<th>dog.id</th>
<th>dog.name</th>
<th>dog.owner</th>
</tr>
</thead>
<tbody>
<tr class="even">
<td>1</td>
<td>Alex</td>
<td>1</td>
<td>Skipper</td>
<td>1</td>
</tr>
...
</tbody>
</table>
``:code

----------
Because in web2py models are executed before controllers, it is possible that some tables are defined even when they are not needed. Table definitions can therefore be made lazy to speed up the code, by setting the ``DAL(..., lazy_tables=True)`` attribute. Tables will actually be created only when accessed.
----------
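For example (a minimal sketch; the table name is arbitrary):
``
db = DAL('sqlite://storage.db', lazy_tables=True)
db.define_table('big_table', Field('name'))
# nothing is created or migrated yet; the first access triggers it
rows = db(db.big_table).select()
``:code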
### Record representation
It is optional but recommended to specify a format representation for records:
``
>>> db.define_table('person', Field('name'), format='%(name)s')
``:code
or
``
>>> db.define_table('person', Field('name'), format='%(name)s %(id)s')
``:code
or even more complex ones using a function:
``
>>> db.define_table('person', Field('name'),
format=lambda r: r.name or 'anonymous')
``:code
The format attribute will be used for two purposes:
- to represent referenced records in select/option drop-downs;
- to set the ``represent`` attribute of fields that reference this table, so that references are shown with their formatted representation rather than by raw id.
#### Shortcuts
``shortcuts``:inxx
The DAL also supports a dictionary-like update shortcut, ``db.mytable[id] = dict(myfield='somevalue')``, which is equivalent to:
``
db(db.mytable.id==id).update(myfield='somevalue')
``:code
and it updates an existing record with field values specified by the dictionary on the right hand side.
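The same dictionary-like notation also covers fetching and deleting a record by id; a quick sketch of the companion shortcuts:
``
myrecord = db.mytable[id]   # fetch the record with the given id (None if absent)
del db.mytable[id]          # delete the record with the given id
``:code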
#### Fetching a ``Row``
Yet another convenient syntax is the following:
``
record = db.mytable(id)
record = db.mytable(db.mytable.id==id)
record = db.mytable(id,myfield='somevalue')
``:code
Although apparently similar to ``db.mytable[id]``, the above syntax is more flexible and safer. First of all, it checks whether ``id`` is an int (or whether ``str(id)`` is an int) and returns ``None`` if not (it never raises an exception). It also allows you to specify multiple conditions that the record must meet; if they are not met, it also returns ``None``.
#### Recursive ``select``s
``recursive selects``:inxx
Consider the previous table person and a new table "dog" referencing a "person":
``
>>> db.define_table('dog', Field('name'), Field('owner','reference person'))
``:code
and a simple select from this table:
``
>>> dogs = db(db.dog).select()
``:code
which is equivalent to
``
>>> dogs = db(db.dog._id>0).select()
``:code
where ``._id`` is a reference to the primary key of the table. Normally ``db.dog._id`` is the same as ``db.dog.id`` and we will assume that in most of this book. ``_id``:inxx
For each Row of dogs it is possible to fetch not just fields from the selected table (dog) but also from linked tables (recursively):
``
>>> for dog in dogs: print dog.name, dog.owner.name
``:code
#### Virtual fields
``virtual fields``:inxx
In order to define one or more virtual fields, you have to define a container class, instantiate it, and link it to a table or to a select. For example, consider the following table:
``
>>> db.define_table('item',
        Field('unit_price','double'),
        Field('quantity','integer'))
``:code
One can define a ``total_price`` virtual field as
``
>>> class MyVirtualFields(object):
def total_price(self):
return self.item.unit_price*self.item.quantity
>>> db.item.virtualfields.append(MyVirtualFields())
``:code
Notice that each method of the class that takes a single argument (self) is a new virtual field. ``self`` refers to each one row of the select. Field values are referred by full path as in ``self.item.unit_price``. The table is linked to the virtual fields by appending an instance of the class to the table's ``virtualfields`` attribute.
Virtual fields can also access recursive fields as in
``
>>> db.define_table('item',
Field('unit_price','double'))
>>> db.define_table('order_item',
Field('item','reference item'),
Field('quantity','integer'))
>>> class MyVirtualFields(object):
def total_price(self):
return self.order_item.item.unit_price \
* self.order_item.quantity
>>> db.order_item.virtualfields.append(MyVirtualFields())
``:code
Notice the recursive field access ``self.order_item.item.unit_price`` where ``self`` is the looping record.
Virtual fields can also act on the result of a JOIN:
``
>>> db.define_table('item',
Field('unit_price','double'))
>>> db.define_table('order_item',
Field('item','reference item'),
Field('quantity','integer'))
>>> rows = db(db.order_item.item==db.item.id).select()
>>> class MyVirtualFields(object):
def total_price(self):
return self.item.unit_price \
* self.order_item.quantity
>>> rows.setvirtualfields(order_item=MyVirtualFields())
>>> for row in rows: print row.order_item.total_price
``:code
Notice how in this case the syntax is different. The virtual field accesses both ``self.item.unit_price`` and ``self.order_item.quantity`` which belong to the join select. The virtual field is attached to the rows of the table using the ``setvirtualfields`` method of the rows object. This method takes an arbitrary number of named arguments and can be used to set multiple virtual fields, defined in multiple classes, and attach them to multiple tables:
``
>>> class MyVirtualFields1(object):
def discounted_unit_price(self):
return self.item.unit_price*0.90
>>> class MyVirtualFields2(object):
def total_price(self):
return self.item.unit_price \
* self.order_item.quantity
        def discounted_total_price(self):
            return self.item.discounted_unit_price \
                * self.order_item.quantity
>>> rows.setvirtualfields(item=MyVirtualFields1())
>>> rows.setvirtualfields(order_item=MyVirtualFields2())
>>> for row in rows:
        print row.order_item.discounted_total_price
``:code
Virtual fields can also be lazy: instead of computing a value, the method returns a function, and the value is computed only when that function is called. A lazy field of this kind allows one to compute the total price for each item on demand, and also to pass an optional ``discount`` percentage.
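A sketch of such a lazy field, assuming ``item`` is redefined with both fields (the class name and the ``discount`` keyword signature are assumptions). With it appended, each row's ``total_price`` becomes a callable that accepts a discount, as the loop below shows (15% here):
``
>>> db.define_table('item',
        Field('unit_price','double'),
        Field('quantity','integer'))
>>> class MyLazyVirtualFields(object):
        def total_price(self):
            # return a function instead of a value;
            # it is evaluated only when called on a row
            def lazy(discount=0.0, self=self):
                return self.item.unit_price \
                    * self.item.quantity \
                    * (1.0 - discount/100.0)
            return lazy
>>> db.item.virtualfields.append(MyLazyVirtualFields())
``:code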
``
>>> for row in db(db.item).select(): print row.total_price(15)
``:code
------
Mind that virtual fields do not have the same attributes as the other fields (``default``, ``readable``, ``requires``, etc.); they do not appear in the list of ``db.table.fields`` and are not visualized by default in tables (TABLE) and grids (SQLFORM.grid, SQLFORM.smartgrid).
------
### Caching selects
The select method also takes a cache argument, which defaults to None. For caching purposes, it should be set to a tuple where the first element is the cache model (cache.ram, cache.disk, etc.), and the second element is the expiration time in seconds.
In the following example, you see a controller that caches a select on the previously defined db.log table. The actual select fetches data from the back-end database no more frequently than once every 60 seconds and stores the result in cache.ram. If the next call to this controller occurs in less than 60 seconds since the last database IO, it simply fetches the previous data from cache.ram.
``cache select``:inxx
``
def cache_db_select():
logs = db().select(db.log.ALL, cache=(cache.ram, 60))
return dict(logs=logs)
``:code
``cacheable``:inxx
The ``select`` method also takes an optional ``cacheable`` argument, normally set to ``False``. When a select is cached, ``cacheable`` is set to ``True``. This produces a simpler ``Rows`` result which is serializable, but whose ``Row``s lack the ``update_record`` and ``delete_record`` methods.
If you do not need those methods, you can speed up selects even when you do not plan to cache them, by setting the ``cacheable`` attribute:
``
rows = db(query).select(cacheable=True)
``:code
+
-------
The results of a ``select`` are normally complex, un-pickleable objects; they cannot be stored in a session and cannot be cached in any other way than the one explained here unless the ``cache`` attribute is set or ``cacheable=True``.
-------
### Self-Reference and aliases
``self reference``:inxx
``alias``:inxx
It is possible to define tables with fields that refer to themselves. Note that the usual notation of passing a table object as the field type would fail here, because it would use the variable ``db.person`` before the table is defined; instead, reference the table by name:
``
db.define_table('person',
    Field('name'),
    Field('father_id', 'reference person'),
    Field('mother_id', 'reference person'))
``:code
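When the table is then joined with itself, the two references must be distinguished with aliases, created via ``with_alias``. A sketch, assuming the ``person`` table above:
``
father = db.person.with_alias('father')
mother = db.person.with_alias('mother')
rows = db().select(db.person.name, father.name, mother.name,
                   left=(father.on(father.id==db.person.father_id),
                         mother.on(mother.id==db.person.mother_id)))
``:code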