OpenWGA 5.4 - Query languages reference


hql / fullhql

General


Queries against WGA Content Stores on relational database systems normally are done using HQL, the integrated query language of  Hibernate, a software library that OpenWGA uses internally to transform the relational database data into document objects and vice versa. You can find detail information about general HQL in the hibernate dokumentation.

OpenWGA simplifies the usage of HQL so that you only have to specify filter criterias by which you want yor result contents to be selected. The following query for example searches for a content document with the unique name "home":
<tml:query type="hql">content.uniquename = 'home'</tml:query>

In special cases you will want to formulate a whole HQL query which also specifies the objects that are to be returned. Therefor you are able to declare your query of type "fullhql", which expects full valid HQL queries, like the following
<tml:query type="fullhql">from Content as content where content.uniquename = 'home'</tml:query>

General syntax

HQL is an "object oriented" query language as you specify document objects and their properties and decide by what values you want to filter the documents in your result or by which properties you want them to be ordered. Those properties may be text, number or date values or even further objects with their own set of properties. But the base of a query is always the object "content", representing a content document.

Many other things in HQL resemble closely SQL, so in most cases it can be used just like SQL. String literals are specified by delimiting them with single quote characters ' (as you could see in the introductory example). The most commonly know operators and techniques are also available in HQL. So besides the object oriented nature of HQL you should have no problems understanding HQL concepts when coming from SQL.

A property of an object is queried by specifying the object and the property name, divided by a dot. The following HQL selects a content document based on its title property:
content.title = 'Home'

As stated some properties of objects are again objects themselves. You can also query the properties of those sub objects by continuing the expression with the propery name after a dot. For example the "language" property of "content" is a language object which itself has a property "name". To query for all contents of name "de" specify the following:
content.language.name = 'de'


Common operators


The following operators can be used anywhere in HQL and are not dependent on any special functionalities. So you can combine multiple HQL expressions by the operators AND/OR and can negate an expression by NOT:
content.title = 'Home' AND content.language.name = 'de'
content.title = 'Home' OR content.language.name = 'de'
content.title = 'Home' AND NOT content.language.name = 'en'

Using normal brackets you can specify the precendence of expressions:
content.title = 'Home' AND (content.language.name = 'en' OR  content.language.name = 'de')

You can query for empty properties with IS NULL or IS NOT NULL
content.virtuallink is not null

You can test for equality and non equality. For numeric properties you can also test for larger/smaller values:
content.structentry.position = 0
content.structentry.position != 0
content.structentry.position > 0
content.structentry.position >= 0

The "like" operator like in SQL is used for string matching. You can use "?" as a wildcard for a single arbitrary character and "%" as wildcard for an unspecified number of arbitrary characters:
content.title like '%wga%'

The "in" operator tests if a property is in a selection of values:
content.structentry.doctype.name in ('standard', 'homepage', 'news')

The "between" operator tests on numeric values if they are in a given range:

content.structentry.position between 0 and 10

Treatment of list properties

Some properties do not consist of single values but are effectively a list of values or objects. The individual values of those lists are either addressable by specifying a numeric index or a text descriptor.  You can specify this index by appending it to the property name in square brackets. For example the "keywords" property is a list indexed by numbers:
content.keywords[0] = 'Hamburg'


An example for a list property indexed by a text descriptor - and the most common use case of this syntax - is the property "items" on the content object. This lists all the custom data items that a content document contains, which themselves are objects of type "contentItem". The index text is the name of the item in lowercase. Those objects have individidual properties for addressing the items value as "text", "number", "date" or "boolean". Only the property that matches the data type of the item can be queried. Following examples test an item "revision" of number type and an item "docname" of text type:
content.items['revision'].number = 1 AND
content.items['docname'].text = 'ref'

More about treatment of items will be shown later in this document.
Text indexes of content items should always be specified lowercase, as they are internally stored that way.

All list properties have themselves a property "size" which provides the number of values in the list:
content.keywords.size > 10

Sorting

You can specify a sorting clause "order by" in HQL just like you do in SQL. You should prefer sorting inside a HQL query rather than in WebTML because in most cases it is faster and more resource effective:
content.language.name='de' order by content.title, content.created

You can also use item values for sorting:
content.language.name='de' order by content.items['shortname'].text



Items in HQL

Items in HQL, as already stated, are single objects that are available via the list property "items" of the content objects. They have individual properties for each data type that can be stored natively in an item. Testing strings and numbers is straightforward:
content.items['revision'].number = 1 AND
content.items['docname'].text = 'ref'



Boolean item values are compared to 1 (true) and 0 (false)
content.items['checked'].boolean = 1



Date values can be tested against date string literals, whose exact format is determined by the database platform. To find out which format this is search the documentation of your database platform how to do the same thing in it's native SQL syntax.
content.items['startdate'].date > '2005-05-01 10:00:01.0' 


You can of course combine multiple expressions testing for individual item values:
content.items['type'].text = 'article' AND content.items['count'].number = 5


Selecting and sorting items in HQL has its caveats which may let your query not work as expected:
  • When selecting or sorting for an item you are effectively filtering the query result to contain only documents that have an item of that name. All documents who do not contain an item of the given name will not appear in the result
  • You cannot check for values of items who contain themselves a value list or who do not contain one of the primitive data types text, number, date or boolean. While such values can be stored and retrieved in WebTML they are serialized into the database in a form that is not queryable.

Functions


HQL provides some predefined functions which are not dependent on the database platform that is used. Here are some of the most frequently used:

The functions elements and indices return the elements / index values of a list property. They often are combined with operator "in" to test if a certain value/index is contained in the list property:
'body' in indices(content.items)
'Hamburg' in elements(content.keywords)

The functions upper and lower convert strings to upper/lower case:
upper(content.title) like '%HAMBURG%'

In most cases you can also use native SQL functions of your database platform to modify values. The following example uses the MySQL functions SUBDATE()  and NOW() to select contents who were modified less than 10 days ago:

content.lastmodified < SUBDATE(NOW(), 10)
There are problems when using database functions with unusual parameter syntax. For example the MySQL function DATE_SUB() (which is a variant of function SUBDATE() used above) has a second parameter that itself consists of multiple keywords:

DATE_SUB(NOW(), INTERVAL 10 DAYS).

In that case OpenWGA cannot correctly indentify and merge HQL and SQL parts and the query will fail.

Query parameters

HQL supports the specification of query parameters in the query which can be dynamically injected, like with <tml:param> in WebTML. Specify the parameter with a colon and a custom name directly inside the query:
content.item['city'] = :city

Do not surround the parameter with type specific delimiters, like single quotes for strings. The parameter injection functionality will automatically take care of this. Her is a complete example in WebTML:
<tml:query type="hql">
  <tml:param name="city">Berlin</tml:param>
  content.item['city'] = :city
</tml:query>

While it at first might seem easier and better to just inject the needed value at the right position in the query using query parameters has some crucial advantages why they should be preferred where possible:
  • Query parameters are automatically protected against SQL injection attacks. They take care that the injected value will only be used as a literal.
  • Query parameters allow database drivers to precompile the SQL statement and reuse it with different parameter values, preventing unneccessary recompilation.

Content Relations

Content relations (available on OpenWGA content stores of version 5 or higher) are also queryable in HQL. They are usable just as items - specify the relation name in square brackets - instead that there are no data properties to read out but a "target" property which again serves a Content object:
content.relations['dependenton'].target.contentclass = 'project'

They really get useful in HQL in combination with query parameters holding content objects, like default parameter ":content" holding the content currently in context of the <tml:query> tag. You may just compare the "target" property of the relation to it to select documents whose named relation points to the current context document:
content.relations['parent-project'].target = :content

You can inject other WGAPI content objects (WGContent) as query parameters with <tml:param> and use them the same way.

Virtual resultsets in "fullhql"

The query type "fullhql" can be used to specify the full HQL query instead of only the part after WHERE. That way you have complete control over the SELECT part of the query and are able to return other data than content documents, like the result of aggregate functions. OpenWGA will interpret each "result row" as a "virtual document" (in true meaning of virtual, not meaning docs that point to some other content) holding the row colums as items. Therefor you should specify an alias for each value that you select which will become the virtual item name.

See the following simple "fullhql" query which just counts the available content documents:
select count(*) as cnt from Content as content

This will return a result set of a single "virtual" content document. It will contain a single item "cnt" containing the result of the aggregate function.

Of course you can combine this syntax with grouping clauses, like this query which groups contents by their content type and then counts each group:
select content.structentry.contenttype.name, count(*) as cnt from Content as content
group by content.structentry.contenttype

This query will return a "virtual" content document for each group, providing items "name" containing the name of the contenttype and again "cnt", which is the count of content documents with the given content type.

Unfortunately it is impossible in HQL to reference items via the syntax "content.items['itemname']" in the select clause. So to use aggregate functions on items you will need to select them directly and filter the item of the name you want via the where clause. Selections on the document itself can then be done via the "parentcontent" property of the item. For example like this:

select AVG(ratingitem.number) as avrating from ContentItem ratingitem

where ratingitem.name='rating' and rating.parentcontent.contentclass='pollresult'

Queries returning virtual result sets cannot be cached in Query cache. Therefor you cannot combine these queries together with <tml:query> attribute cache="true".

Object reference

The following documents provide complete reference information about the objects queryable in HQL:

Table of contents: