DavidYahalom.com is an IT knowledgebase dedicated to the world of databses and RDBMS systems by David Yahalom. Here you'll find articles, views, news, tips and in-depth analysis about Oracle, DB2 LUW, Sql Server and MySql. I hope you'll enjoy your stay.

6th
JAN

Proactive monitoring with Oracle Grid Control

Posted by David Yahalom under ITIL, Grid Control, Monitoring, General IT, Oracle

Data centers today are becoming huge and complex. There’s a definite need to provide 24/7, always-on, high SLA services. The sheer maintenance of such complex environments can be a daunting task for any IT staff.As the complexity of systems increases, the need for better, more comprehensive monitoring solutions becomes more critical. The monitoring tools and methodologies that were once considered good-enough just don’t cut it anymore.

Knowing that servers are up and running and that the backups have finished successfully isn’t enough. Complex systems require complex monitoring methodologies taking into account complex SLAs scenarios that usually span dozens of servers and tens of connected systems.

There are two methodologies for troubleshooting system performance. You can be reactive – wait for the problems to happen and then do your best to solve them as fast as you can or be proactive – learn of potential problems and bottlenecks before they become critical and affect your business.

Being proactive has many advantages including minimizing downtime and freeing your IT staff to advance future projects rather then put out fires all day long. In the IT world staying in one place is pretty much the same as going backward.

The proactive approach is derived from identifying problems before they happen and addressing them before they become a problem for your company. You can define a set of rules, either application based rules or system based rules which will be defined as the SLA indicator for the application. These rules can be anything from “page load time” to system-oriented tests such as CPU load or blocking sessions to a combination of both. Being proactive means being alerted once the load time for a certain page in the application has exceeded a predefined threshold.

Grid Control is Oracle’s high-end, fully fledged, enterprise-wide, monitoring and management solution. Grid Control is a single, integrated solution for administering, operating and monitoring applications and systems.

While Grid Control provides has the most complete set of Oracle Database monitoring metrics from any other tool on the market today, it can also do much more as a monitoring solution beside monitor Oracle Databases only.

Grid Control provides the following capabilities and features:

1) Application Management.

2) Database Management.

3) Middleware management.

4) Real User Experience insight

5) Host / Server management

6) SOA management

7) Identity management.

Each of these features consist of both predefined metrics for evaluating system performance and availability plus the ability to create custom metrics handing over control on what specifically Grid Control monitors to you. You have the capability to group several key systems together and define a service group that is dependent on the availability and performance of all the targets belonging to that service group.

Grid Control allows for self-tuned and self managed databases, automation of complex or routine manual tasks which are often error-prone, rapid root cause analysis through established standards and complete compliance enforcement.

Using the combination of Grid Control and Oracle 10g’s ADDM feature you can periodically examines the state of the database, automatically identify potential bottlenecks and have Grid Control recommend corrective actions.

Oracle Enterprise Manager presents ADDM findings and recommendations in a convenient and intuitive fashion and guides administrators in step-by-step process to quickly resolve performance problems by implementing ADDM recommendations.

Grid Control also offers a large array of compliance assessments (with out-of-box included policies), security violation reports, patch advisories, multiple-target deployment and of course – alerts. You can have Grid Control notify you of potential problems both via email, SMS text messages and SNMP traps.

 Using GridControl for ITIL compliance

This last section of the article can be crucial to some of you looking for ITIL compliance. ITIL stands for IT Infrastructure Library - a widely accepted approach and methodologies to IT service management providing a complete and comprehensive set of best practice guidebooks supported by a qualifications scheme, training and implementation and assessment tools. ITIL was designed to promote the quality of computer services in the organization and provides management with a template framework for best practices to achieve quality IT services and overcome difficulties associated with the growth of IT systems. ITIL is divided into a set of text books, qualification programs, software tools and user groups slicing the services expected from an IT department:

1. Service support.

2. Service delivery.

3. Managerial.

4. Software support.

5. Computer operations.

6. Security management.

7. Environmental

So how exactly can Grid Control help you become ITIL certified?

Since Grid Control can essentially monitor everything in your DataCenter it can better help you align business aspects with your IT aspects.

 

ITIL configuration management

The basic process defined by ITIL is that of configuration management. This is essential for both service support and service delivery.

Configuration management means tracking all of the individual hardware components supporting your business IT infrastructure and all of the changes these components undertake. Grid Control tracks all of these changes in a build-in Configuration Management DB (or CMDB for short) which keeps all history data easily accessible using Grid Control’s interface.

Each one of these Configuration Items is linkable to other Configuration Items in your infrastructure. You can define a connection between servers X,Y,Z and the organization HR system. Grid Control also has the ability to automatically identify the relation between separate configuration items and greatly reduce the overhead with manually documenting and linking individual configuration items.

This allows the developer to notify the system administrator that ever since he replaced the network card in the production server the entire application is running slower than usual. Identify the change that caused the problem.

  

ITIL Problem management

Another key ITIL process is the diagnostic of reoccurring problems and failures. Reoccurring errors are usually a sign of problems that were poorly handled when they first occurred. Grid Control allows us to identify the error when it happens by using root cause analysis, helping us identify the underlying problematic configuration item thus greatly simplifying troubleshooting.

ITIL Change management

Change management in complex IT environments is all about they way your IT staff can coordinate changes in your infrastructure so that they can easily identify which systems are being affected by the proposed change. Grid Control automatically documents any change made in your business IT infrastructure. By creating “baseline snapshots” using Grid Control you can later compare your infrastructure configuration to a given point in time.

Grid Control also supports Change Automation allowing us to schedule automatic deployment of Oracle patches.

 

ITIL Release Management

ITIL release management is the ability to release final, production-grade synchronized versions of your IT system along with all of its configuration items corresponding to the various installation procedures and regulations in place.

This gives us the ability to keep a “closed” version of our configuration management (combination of hardware, O/S choice, DB version…) as a standby baseline and use this baseline to easily make future copies. Grid Control fully supports ITIL release management by allowing us to automate our deployment procedures of the entire system from the bare-metal upwards.

ITIL Capacity Management

Since both system load and capacity play critical roles in the availability and performance of your business systems, ITIL defines a process to easily manage and quantify your systems capacities. In order to predict the capacity required to support future workloads, Grid Control tracks the system performance over a period of time allowing you to slice the performance by various indicators such as user activity.

This is just a quick tour of Grid Control and really, only a drop in the sea in terms of capabilities and options at your disposal. We strongly recommend that you install Grid Control for yourself and explore the world of proactive, smart IT infrastructure monitoring and management. 

And remember…. Once you go Grid, you’ll never go back!

 

25th
NOV

Hashing strings in Oracle 8i,9i,10g

Posted by David Yahalom under Oracle

There’s an easy way to hash strings in Oracle 8i,9i,10g.

In Oracle 10g you can use the ORA_HASH function which computes a hash value for any given expression. It recieves three arguments:

  • expr - determines the data for which you want Oracle Database to compute a hash value. You can go wild on the length of this string there isn’t any restriction on this.
  • max_bucket (optional) - determines the maximum bucket value returned by the hash function. You can specify any value between 0 and 4294967295. The default is 4294967295.
  • seed_value argument (optional) - enables Oracle to produce many different results for the same set of data. Oracle applies the hash function to the combination of expr and seed_value. You can specify any value between 0 and 4294967295. The default is 0.

SQL> select ora_hash('hello world') from dual;

ORA_HASH('HELLOWORLD')
----------------------
1896528268

SQL>

In Oralce 8i and 9i there is not ORA_HASH function. What we can use instead is the DBMS_UTILITY.GET_HASH_VALUE function. It is similar yet different from ORA_HASH .

The GET_HASH_VALUE function takes three mandatory parameters:

  • name - The string we want converted from string to integer via hash.
  • base - The base value for the integer used in the hashing algorithm.
  • hash_size - This is the size of the hash table. The total number of values that are available to the hashing algorithm as conversions from the string inputs. It is recommended that this will be the power of two.

For example, to get a hash value on a string where the hash value should be between 1000 and 3047, use 1000 as the base value and 2048 as the hash_size value.

SQL> select DBMS_UTILITY.GET_HASH_VALUE('hello world1', 2,1048576) from dual;

DBMS_UTILITY.GET_HASH_VALUE('HELLOWORLD1',2,1048576)
----------------------------------------------------
113894

SQL>

1st
NOV

Started a new job!

Posted by David Yahalom under General IT

Pop up the champagne bottles people!

Veracity Logo

I have started a new job as a Senior Project Manager at the Veracity Group , the leading Oracle Consulting group in Israel. Veracity is the only Oracle certified Value Added Distributor (VAD) in Israel and part of a very small and exclusive list of Oracle VADs around the world. Our customers benefit from the unique advantage we offer of both directly licensing and selling Oracle products as well as having an outstanding expert Professional Services team. We’re Simply the best in our field, trust me. As part of my new job at Veracity, I’m also holding the CTO title of a very special group in the Oracle Database development business, but more on this later.

I’ve been able to accomplish a great deal in the last six years as a database consultant, especially during my time at Xpert-One1 which was a very rewarding experience. For some time now I’ve been looking into getting a more managerial position and when the opportunity presented itself, I simply couldn’t resist. My new position puts me directly in the middle decision making process about all things technological - so I think this would make a great learning experience for me. I’m still (and always will be) a bits and bytes kind of guy so I’m looking forward combining my deep passion for technology with having a managerial position. Any tips on how to exactly accomplish this from all you veterans out there are welcome! :)

24th
AUG

Migrating from SPARC to x86

Posted by David Yahalom under RAC, Hardware, Solaris, Unix, Oracle

Remember the SPARC Vs. x86 post I made some time ago? I’ve talked about sizing old SPARC hardware compared to new x86 based servers.

Well, I have just finished a rather big project migrating several single-instance Oracle 8i databases running on old(ish) SPARC machines to a state of the art Oracle 10g RAC on commodity x86 Linux servers.

I have migrated my client from this:

Two: 4 X UltraSparc III+ 1GHZ CPUs, 8GB RAM.
Two: 3X UltraSparc II 300Mhz CPUs, 4GB RAM.

To this:

Two HP DL580G5: 2X Intel QuadCore Xeon X7350@2.93GHz, 16GB RAM (list price: $15,000).
One HP DL380G5: 2X Intel QuadCore Xeon E5345@2.33GHz, 8GB RAM (list price: $8,000).

The workload remained the same. The old config had several different unique schemas on each physical server. Each applications running connected to a different schema on a different physical server. No shared data between servers. This was done to spread the workload across several machines. Five years ago they (client) considered it a poor man’s cluster. :)

When installing and configuration the RAC cluster, I have combines all of the different schemas under a single database.

I’ll try and published more detailed performance benchmarks later on, but now all I can say is this: in terms of raw CPU power (taking storage I/O and memory restrictions out of the function) the RAC cluster smokes, obliterates and completely destroys all of the older SPARC servers combined.

The RAC cluster consist a total of 6 CPUs, each with 4 cores (a total of 24 cores).  The average load on each node is around the 2.0-2.2 mark. Meaning a combined load of ~6/24 or roughly 25%.  Given the same workload,  the old SPARC machines reached well above 100%. For example, one of the “more powerful” Ultrasparc III+ 1Ghz machines (total of 4 CPUs) reached load ranges of above 20!

The workload is a mixed multi-million user, multi-session web application with short, cached queries and transactions plus several single-threaded very cpu-intensive financial applications that perform thousands of logical DML operations per second. One of these DML intensive processes caused a 4XUltraSprac III+ 1Ghz machine to reach average loads of 15-20. The same process running on the same data barely tickles the QuadCore Xeon.

Now, please keep in mind two things:
1) This isn’t scientific, far from it. At least not yet. I have also performed more concrete benchmarks, but until I’ll have time to arrange all the data into user-readable format you’ll just have to take my word on it. :)

2) You might say that comparing 4 year old SPARC hardware to brand new x86 machines might not be fair. And I’ll agree. It isn’t. But what’s important is to take the general sizing figures from this post and try to adapt them to occasions where you have a site “suck” with old SPARC hardware and you want sizing information to decide how exactly SPARC Mhz translates to x86 Mhz.

The bottom line is this: don’t be afraid to migrate old SPARC machines to modern Linux x86 hardware. It will usually be allot faster.
And…. you can achieve great performance when using cheap, x86 servers in a RAC cluster and very high ROI.

31st
JUL

Show full process name / path / string in Solaris using ps

Posted by David Yahalom under Solaris

Let’s say you have a JAVA process running on an old Sun Solaris machine. You want to see the full path of the process running including any run-time variables that were passed to it.

Using ps -eaf | grep -i java gives us this:


[root@hostname ~]# which ps
/bin/ps

[root@hostname ~]# ps -eaf | grep -i java

nobody  4589  4588   0 14:26:35 ?           0:21 /usr/java1.4/bin/java -Djava.awt.headless=true -Xms200m -Xmx300m org.apache.jse

As you can see the process name is trimmed and you can’t see all parameters passed to Java. No matter what parameters or scripting you’ll try and do, using /bin/ps (default) will crop your process name.

There is, however, an entriely different ps program we can use. :)

[root@hostname ~]#/usr/ucb/ps -auxwww | grep -i java

nobody    4589  0.1  4.228480842528 ?        S 14:26:35  0:21 /usr/java1.4/bin/java -Djava.awt.headless=true -Xms200m -Xmx300m org.apache.jserv.JServ /jserv/etc/jserv.properties

Fantastic!

30th
JUL

Limelight

Posted by David Yahalom under Security, Oracle

I would like to thank the several people that have emailed me about this. Pete Finnegan, who is one of the most prominent figures in the world of Oracle databases and Oracle Security has mentioned AuthorityBase on his web site!

He has a writeup mentioning the Oracle Security presentation I’ve created for my company (XpertOne1) and he seemed to liked it. :)

Check it out at Pete’s site!

23rd
JUL

Intermittent ORA-12545 When Trying To Connect To RAC Database

Posted by David Yahalom under RAC, Oracle

Well, got another RAC tidbit for you today. I’ve been doing some very interesting RAC installations in the past few weeks and came across several “bugs” (or “random features” in Oracle-tongue :) ) I felt like sharing.

So, you have a brand new 10g RAC cluster installed and when you try to connect using OCI (probably JDBC thin as well) only to recieve intermittent ORA-12545 errors. The client will connect fine every other attempt or so.

This error happens when you enable server side load balancing but the client does not have domain address / DNS search function setup (or valid /etc/hosts file) so it cannot translate a hostname to a proper IP address.

The solution is simple, make sure that the client from which you are trying to connect is able to resolve all the hostnames in your RAC installation. Either via registering the nodes in a DNS server or by updateding the /etc/hosts file (remember, Windows also has an /etc/hosts file under <WINDOWS>\system32\drivers\etc\hosts).

I’ve seen many DBAs forget that for server-side LOB you need to be able to resolve each cluster node hostname from the client EVEN when you only use IPs in your TNSNAMES.ORA entries. Even without server-side LOB, it’s good practice to be able to resolve RAC hostnames from the client.

This isn’t actually a bug, it’s by-design and if you understand they way server-side LOB works in RAC you’ll see why the client has to be able to resolve RAC hostnames to IP addresses.

23rd

Ora-12520 When listeners on VIP in 10g RAC Setup

Posted by David Yahalom under RAC, Oracle

A few days ago I’ve stumbled across an annoying problem with Oracle 10.2.0.1 (yep, need to patch it) RAC causing my OCI client fail to connect with an ORA-12520 error (ORA-12520: TNS:listener could not find available handler for requested type of server ).

Everything was configured correctly on both the server side and the client.

1. Remote_Listener parameter is set to an alias which is defined in the Tnsnames.ora file on both the nodes.
2. The hostname is setup correctly in the tnsnames.ora file.
3. The IP address, VIP and hostname are all properly configured in /etc/hosts on all nodes.

This is a documented Oracle bug with a complete note on this in Metalink (342419.1). 10g RAC instances have a problem where the instances are not registering themselves correctly with the virtual IP address. They get registered to the real ip address where the listener is not listening.

The workaround is very simple:

Define a LOCAL_LISTENER entry for each node in the node’s local TNSNAMES.ORA file:

LISTENER_NODE_XXX =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = <node1-vip>)(PORT =1521))
)

Then issue the following command in the sqlplus prompt:

sql>Alter system set LOCAL_LISTENER= 'LISTENER_XXXX' scope=both sid='SID1'.

Remember to do the same for all other nodes in the cluster (each node should have an entry in its TNSNAMES.ORA file with the LISTENER_NODE_XXX definition + setting the LOCAL_LISTENER spfile parameter to the same TNSNAMES.ORA entry).

23rd

Simulate load on your linux server using a one-liner

Posted by David Yahalom under Unix, Linux, General IT

Yesterday I needed a quick way to simulate load on a Linux sever. I wanted to find out that my monitoring system is working and sending alerts.

While there are many many tools that can do this, all I needed was a simple bash one-liner shell script that can make the CPUs beg for mercy.

What I ended up using is this:

dd if=/dev/zero bs=100M | gzip | gzip -d | gzip | gzip -d | gzip | gzip -d > /dev/null &

Send a few of these babies to the background and you’ll start seeing smoke coming from your server soon enough.

6th
JUN

Start your servers: Intel’s next-gen CPU smokes!

Posted by David Yahalom under Hardware, Oracle

UPDATE: Whoops! Had bad URL in link to Nehalem Benchmarks. Fixed!

AnandTech, one of the best sources of hardware news & reviews,  have just published preliminary benchmarks of Intel’s next-gen CPU - the Nehalem.

Even when the CPU is not using final silicon and being coupled with a very early and unstable motherboard, it smoked the benchmarks suppressing a Penryn-based Core2 quad by as much as %50! In fact, a 2.6Ghz Nehalem is faster than a 3.2Ghz Penryn! Amazing.

This got me thinking about the x86 server market, especially when compared to non-x86 servers such as SPARC or PowerPC machines.
Even now we see that dollar-to-dollar, x86 servers can sometime outperform non-x86 hardware.

Nowadays when Oracle RAC clusters based on cheap $7000 x86 servers outperform single-instance super expensive high-end Sun/IBM servers, Intel’s new Nehalem could very well be the final proof many DBAs need in order to ditch their old SPARC/PowerPC servers and transition to Linux x86 RAC installations.

Anyway, AMD is going to launch a new chip design in early 2009 and I still see great promise in IBMs Cell CPU deign.

Interesting times ahead.

Read more about Nehalem’s architecture and on-die memory controller here.

DavidYahalom.com - Oracle, Databases, SQL, news, views, articles and in-depth analysis is powered by Wordpress. Designed by Free WordPress Themes.