Wednesday, September 23, 2009

Informix technology

"Let S be an that is not generally contained in a . All >s that are contained in s that are generally contained, without an intervening
whose subject routines do not include an SQL function, either in S without
an intervening or in an contained in the of a trigger activated as a consequence of executing S, are effectively evaluated simultaneously.
The time of evaluation of a during the execution of S and its activated triggers
is implementation-dependent."



The above is a citation of SQL Standard 2008/2003/1999 (part 2 - SQL/Foundation). It's a little nightmare to read, follow and understand, but it's the cause of one behavior generally considered an annoyance in Informix: Inside a stored procedure, all CURRENT references will show the same value.

Let's check the Informix SQL Syntax guide to see how the above ANSI rule is explained in the fine manual:

"SQL is not a procedural language, and CURRENT might not execute in the lexical order of its position in a statement.
You should not use CURRENT to mark the start, the end, nor a specific point in the execution of an SQL statement.

If you use the CURRENT operator in more than once in a single statement,
identical values might be returned by each instance of CURRENT. You cannot rely
on CURRENT to return distinct values each time it executes.

The returned value is based on the system clock and is fixed when the SQL
statement that specifies CURRENT starts execution. For example, any call to
CURRENT from inside the SPL function that an EXECUTE FUNCTION (or
EXECUTE PROCEDURE) statement invokes returns the value of the system clock
when the SPL function starts."

This looks like a more friendly explanation. But it doesn't point out the reason why this is implemented as is, and that reason is mainly for ANSI compliance. In practice, what I usually hear from customers is that this is inconvenient. Many times they are trying to use CURRENT YEAR TO SECOND/FRACTION to find out the time spent on a procedure or parts of it. And it simply doesn't work, because all the values they get are the same. Is there a solution? Yes, if you just need precision up to second:


SELECT
DBINFO('utc_to_datetime', sh_curtime)
INTO
current_time
FROM
sysmaster:sysshmvals;


The sh_curtime field of sysmaster:sysshmvals contains the current unix time (number of seconds since January 1 1970). The DBINFO function with the 'utc_to_datetime' converts it into a DATETIME YEAR TO SECOND value.

IDS 11.50.xC5

It's been too long and I have a lot of catch up to do... Let's start with the latest version of IBM Informix Dynamic Server. It's 11.50.xC5, the latest fixpack in the 11.50 family, available since the end of July 2009.

Keeping up with what can now be considered as usual, this fixpack brings important features. Let's list them:

  • High availability
    • Configuring RS Secondary Server Latency for Disaster Recovery
      This allows the system administrator to configure a delay of time between the primary server and a RSS (remote secondary server). This means that the secondary server will lag behind the primary server for the specified amount of time.
      Associated to this we have the ability to stop and restart the roll forward of the logical logs on the secondary server.
      Note that for integrity purposes the primary will still send the logical logs as soon as possible. It's up to the secondary to hold the logs without applying them. This means that if your primary server disappears you're still able to recover up to the most current image of the primary server.
      The configuration of this feature involves three new parameters: DELAY_APPLY, STOP_APPLY and LOG_STAGING_DIR.
      DELAY_APPLY can be used to configure a delay between primary and secondary.
      STOP_APPLY can be used to specify a specific time to stop applying logs
      LOG_STAGING_DIR is a secondary locally writable directory where the server will keep the logs before applying them.
      All this can be dinamically changed with onmode -wf/-wm command
  • Administration
    • Forcing the Database Server to Shut Down
      Two new utilities were made available to handle situations where things went wrong... If for example your IDS server is hang, or when it was not shutdwon properly, you may have trouble to stop it and/or restart it (it can leave shared memory segments behind for example).
      The new utilities are onclean and onshutdown. Onclean can be used to force the IDS server down (it can kill the engine processes) and it will try to clean all the shared memory segments that were left behind. Without options it's use is to remove the sahred memory segments of a server that was already stopped. With "-k" it will kill the server and then make sure the shared memory is freed.
      Onshutdown is used to attempt to kill the server using the normal "onmode -ky", but if it fails it calls onclean.
      This is not a big feature, but it can be handy for scripting and for new users.
    • Enhanced Support for Multibyte Character Strings
      By using a new environment variable, SQL_LOGICAL_CHAR, the server will interpret the size of character fields in terms of characters instead of bytes as usual
    • New default for the INFORMIXTERM environment variable for terminal set up on UNIX
      Simply a new default for INFORMIXTERM variable
    • onconfig Portal: Configuration Parameters Listed by Functional Categories
    • onstat Portal: onstat Commands Listed by Functional Categories
      These two are important documentation enhancements available at IDS Infocenter
    • Enhancements to the Enterprise Replication plug-in for the OpenAdmin Tool for IDS
      Open Admin Tool keeps being expanded and improved. And free...
  • Enterprise Replication
    • Enterprise Replication Stops if Memory Allocation Fails
      Better handling of low memory conditions. Replication will stop and raise an alarm. After solving the memory issues just run cdr start
    • Notification of an Incorrect Log Position When Enterprise Replication Restarts
      A new alarm is fired if the replay position is invalid (too old or later than current log position)
    • Workaround: Reclaiming Space from Enterprise Replication Paging Smart Large Objects
      Later versions could allocate space for replication of SLOBs that was not necessary. This space can now be reclaim.
    • Improving the Performance of Consistency Checking with an Index
      A new shadow column can be added to replicated tables. This column (ifx_replcheck) can be used in a composite index with the primary key in order to speed up the checking of replicates.
    • Specifying the Range of Data Sync Threads to Apply Replicated Transactions
      An ONCONFIG parameter can control the number of replication threads
  • Warehousing
    These features are presented in the Warehousing category, but I personally think they can be used in any kind of environment
    • Loading Data into a Warehouse with the MERGE Statement
      This IDS version introduces support for the MERGE sql statement. We specify a destination table, a source table/query and a match condition. If the match condition is true than the row in the destination table is UPDATEd with the data from the source table/query. If the match condition is false a new row is inserted in the destination table
    • Retrieving Data by Using Hierarchical Queries
      Introduces support for the CONNECT BY sql construct. This allows the use of the so called hierarchical queries. Please note that the most efficient way to handle this would be using the node datablade. But for compatibility reasons IBM decided to introduce this feature. It's a nice feature for application portability
So, that's all for xC5. Let's use the new features and keep an eye on xC6 that should appear before the end of the year.

This is probably the first post I'll write following a customer facing situation. Although there may be good reasons to don't write about some customer facing situation, I feel this may bring some value to the blog, and we never know when someone having the same problem finds the blog on Google... So this may be the first of several small and direct posts.


Isolation level in WebSphere


Some time ago I had to go into a customer site who was having "major performance issues". After some examination and some talks with the development team I was able to identify several sessions running in Repeatable Read isolation level. The application has several components and one of them is an instance of WebSphere Application Server (WAS) v6.1.

Further investigation allowed us to understand that the application was not using EJBs, nor Session Beans (these allow the isolation level to be specified in the deployment descriptor XML file). As such the database connections were using the WAS default isolation level which is repeatable read.
For those less familiar with repeatable read it's equivalent to ANSI Serializable mode. Basically any record read in order to find the result set is locked and remains locked in shared mode for the duration of the transaction. So an instance which was configured for 20000 locks could reach about one million of them (thanks to the lock table automatic expansion). Obviously this caused a lot of contention between sessions and a lot more internal work for the engine. This was causing their "performance" problems.

The solution was simply to redefine the WAS isolation level for the datasource used by the application. This can be done by using a custom property called webSphereDefaultIsolationLevel which as the name implies can be used to change the database connection default isolation level. Complete information about that can be found in the following documentation:

http://www-01.ibm.com/support/docview.wss?rs=180&uid=swg21224492

In there you can find the property description with the explanation of the values it accepts as well as other ways to change the default isolation level.
After changing this the application behaved properly an most of the performance issues went away. The were some other issues like lack of indexes, and some minor configuration changes on the database side, but those were clearly not the cause of the problems.

11.50.xC4: Another mark in Informix history

IBM Informix Dynamic Server 11.50xC4 is available for customers on
It's another fixpack that comes with significant improvements. Traditionally we had to wait for full releases to have some significant features but we're getting used to see great improvements in fixpacks.
This one is no exception and from the release notes we can see an overview of the new functionality:


  • High Availability
    • Availability Enabled on Windows 64-bit Operating System
      Some of the HA features were not available on MS Windows 64 bit versions

  • Administration
    • Save Disk Space by Compressing Data
      Data compression for tables. This deserves a few more words... Check the end of the article
    • Improved Options for Specifying Processor Affinity
      New options for providing binding between virtual CPUs and physical CPUs. We can now specify a list of physical CPUs, one ore more ranges, and ranges with increments
    • Disable IPv6 Support
      If you prefer to disable the IPv6 protocol you can create and empty file named $INFORMIXDIR/etc/IFX_DISABLE_IPV6 (readable by informix) or set and enrionment variable: IFX_DISABLE_IPV6=yes
    • Enhancements to the OpenAdmin Tool for IDS
      These include: Control of table compression, query by example for all the usual table operations (query, insert, delete and update), SQL explorer history, managing of external directives and export/import of OAT connection definitions
    • Generating a Customized Database Server Configuration File
      A new utility called genoncfg was introduced. It takes a default onconfig.std and a text file with some directives (number of CPUs, memory, rootdbs location) and adjusts the instance configuration. The result is saved in a file called onconfig on the local directory. This can be the beginning of a great tool

  • Enterprise Replication
    • Generate ATS and RIS Files in XML Format for Custom Repair Script
      The Aborted Transaction Spooling and Row Information Spooling files can be generated in XML format to facilitate the implementation of recovery processes
    • Prevent Upserts During Conflict Resolution
      An Upsert happens if an Update it send to target and the row does not exists. This may not be desirable if the row was previously deleted. Now we can make the DELETE "win"
    • New Event Alarms for Enterprise Replication
      Three new situations will trigger an alarm: When a server connection is dropped, when a server, replicate or replicate set changes state (stopping, starting, modifying, deleting, or changing the connection status) and when a transaction fails (generating ATS/RIS files)
    • Monitor the Progress of a Consistency Check or a Direct synchronization
      New cdr commands to see the status of check and sync processes (cdr stats check and cdr stats sync)
    • Specify the Duration of a Consistency Check
      By default, inconsistent rows are rechecked once after two seconds, which might not be enough time for replicated transactions to be applied on the target server. You can now specify the number of seconds to spend on rechecking the consistency of inconsistent rows. Rechecking prevents transactions that are in progress from being listed as inconsistent in the consistency report. You can use the --inprogress option of the cdr check replicate and cdr check replicateset commands to specify the maximum number of seconds to perform rechecking.
    • Schedule Synchronization or Consistency Checking Operations
      cdr check and cdr sync allow for a new flag ( --background ) that will use the scheduler to make periodic runs of those processes
    • Improve Consistency Checking Performance
      Several options allow the execution of parallel threads for check operations. You can specify which data or intervals to check
    • Improve the Performance of Checking or Synchronizing Replicate sets
      A new option ( --process ) define how many threads to use in paralell for check and synchronization jobs
    • Role separation for Enterprise Replication (UNIX)
      If you're using role separations your DBSAs can control enterprise replication. Previously only user informix could
    • Enhancements to the Enterprise Replication plug-in for the OpenAdmin Tool for IDS
      A lot of new features in the ER plugin for Open Admin Tool (OAT)
  • Performance
    • Enable Concurrent I/O to Improve Performance on AIX Operating Systems
      IDS supports DIRECT I/O since version 11.10. Now it also supports Concurrent I/O (CIO) on AIX. It's a needed step to really get to the performance of raw devices. I have plans for an article on this subject...
    • IFX_LARGE_PAGES Environment Variable (AIX, Solaris)
      Allows IDS to take advantage of a pool of large memory pages configured by the system administrator. The shared memory virtual segments must be set to RESIDENT by the database system administrator

  • Application Development
    • DataBlade Module Registration through SQL
      Datablades can now be registered using plain old SQL. A new function, SYSBldPrepare is used for the registration process. So you can do it from any SQL tool or interface
    • Control the Results of a Fuzzy Search with the Basic Text Search DataBlade Module
      You can now specify the degree of similarity of search results in fuzzy searches when using the Basic Text Search DataBlade module. Specify a number between 0 and 1, where a higher value results in a higher degree of similarity. To limit results, specify a higher number. To maximize results, specify a lower number. The default degree of similarity is 0.5.
    • Map Characters for Indexing with the Basic Text Search DataBlade Module
      This allows us to replace special characters with other characters for indexing purposes
    • Default Boolean Operator in Basic Text Search Queries
      When we query a BTS index, the space between words is considered a default operator. Now we can define that default operator at index creation time
    • Storage for Temporary Basic Text Search Files
      It's now possible to define a separate sbspace for the temporary files used in BTS searches. This can increase performance. Up to now they were created in the same space as the index.
    • Track Basic Text Search Query Trends
      The possibility to save the queries made using a BTS index. This information can be used to provide hints to the end users in order to improve their own queries. It's like using past experiences to improve future queries
    • Fragment bts Indexes by Expressions
      It's now possible to split a BTS index into several sbspaces by using fragmentation
    • Monitor Change Data Capture Sessions
      A new onstat option ( onstat -g cdc ) that allows monitoring on data capture sessions. Change Data Capture API was introduced in 11.50.xC3 and can be used to capture data change and propagate it into other databases, web services etc. It's also one basic component to use SolidDB as a frontend cache for IDS.
    • View Change Data Capture Error Message Text
      A new function ( cdc_errortext() ) that retrieves an error message from an error generated by CDC

  • Security
    • Automatic Check for Installation Path Security (UNIX, Linux, Mac OS X)
      The installation process checks that the directory structure is secured
    • Utility to Enhance the Security of the Dynamic Server Installation Path (UNIX, Linux, Mac OS X)
      A new utility ( onsecurity ) that can check the integrity of the product installation. It can detect if the installation was compromised and in that case forbids the use of the product
As we've seen in previous fixpacks, IBM introduces another major feature in this one. Compression. There is a lot to say about IDS compression, but it would be a waste of time spending too much time here. The subject is already covered in several places:
  • The white paper from Scott Lashley:

  • The IDS compression site:

    On this site you can access the IDS compression estimator. It's a very simple tool that you can use to connect to your instances and estimates the level of compression you can achieve. The tool requires IBM Informix Connect or Client Software Developer Kit (CSDK) and works only on Windows
  • A developerworks article from Nilesh Ozarkar:

I'd like to highlight a few aspects of IDS compression:
  • It's a dictionary based compression, similar to DB2 deep compression. This means you don't have to rely on pattern repetition on each page
  • Since the dictionary is created by sampling the table it's very fast to get a compression estimate
  • It works for normal INSERTs and UPDATEs. It doesn't depend on load operations
  • It works online (without blocking the tables or rows)
  • It comes with two other features: repack and shrink. Repack regroups the data rows sequentially on the table, leaving the free space at the end. Shrink can return this free space to the dbspace/chunk where the table resides
But if compression is the brightest feature we should not let the others go by without notice:
  • From the list above we can see that Basic Text Search datablade has seen a lot of nice improvements. I'd bet that if IBM keeps improving it like this it will have to change it's name to something like "Not so Basic Text Search Datablade" :)
  • And again, Enterprise Replication (ER) sees a lot of improvements
  • A new version of Open Admin Tool (OAT) with several major improvements
  • The new utility to generate a configuration file (genoncfg). It's still a bit simple, but I believe it can be the start of a great utility to help new Informix users to get a basic working configuration
  • The introduction of CIO for AIX. This is a very important step after the introduction of DIRECT_IO. I would love to write a post about this. Stay tuned.
And that's it for now. If you're an IBM customer you can download 11.50.xC4 from the usual sites. If you're not an IDS user, you can try it by downloading the IBM Informix Dynamic Server Developer Edition from:

No comments:

Post a Comment