Monday, November 28, 2005

Quick Sort

  1. Pick an element, called a pivot, from the list.
  2. Reorder the list so that all elements less than the pivot come before the pivot and all elements greater than the pivot come after it (equal values can go either way). After this partitioning, the pivot is in its final position. This is called the partition operation.
  3. Recursively sort the sub-list of lesser elements and the sub-list of greater elements.

Pseudo Code:
function quicksort(q)
    var list less, pivotList, greater
    if length(q) <= 1
        return q
    else
        select a pivot value pivot from q
        for each x in q except the pivot element
            if x < pivot then add x to less
            if x >= pivot then add x to greater
        add pivot to pivotList
        return concatenate(quicksort(less), pivotList, quicksort(greater))



TYPE int_arr IS TABLE OF INTEGER INDEX BY BINARY_INTEGER;

procedure qsort(arr in out int_arr, lo in INTEGER, hi in INTEGER) is
  i   INTEGER := lo;
  j   INTEGER := hi;
  x   INTEGER := arr(FLOOR((lo + hi) / 2));   -- pivot: the middle element
  tmp INTEGER;
begin
  LOOP
    WHILE arr(i) < x LOOP
      i := i + 1;
    END LOOP;
    WHILE arr(j) > x LOOP
      j := j - 1;
    END LOOP;
    IF i <= j THEN
      tmp := arr(i); arr(i) := arr(j); arr(j) := tmp;   -- swap
      i := i + 1;
      j := j - 1;
    END IF;
    EXIT WHEN i > j;
  END LOOP;
  -- recurse into each partition
  IF lo < j THEN qsort(arr, lo, j); END IF;
  IF i < hi THEN qsort(arr, i, hi); END IF;
end;

Monday, November 21, 2005

Oracle: Choosing Index Keys

Guidelines for choosing keys to index:
  • Consider indexing keys that are used frequently in WHERE clauses.
  • Consider indexing keys that are used frequently to join tables in SQL statements.
  • Consider indexing keys that have high selectivity (the percentage of rows in a table that have the same value for the indexed key). Indexing a low-selectivity column can still be helpful if the data distribution is skewed so that one or two values occur much less often than other values.
  • Do not use B-tree indexes on keys with few distinct values.
  • Do not index columns that are modified frequently.
  • Do not index keys that appear in WHERE clauses only inside functions or operators, unless you use function-based indexes (see the sketch after this list).
  • Consider indexing the foreign keys of referential integrity constraints when a large number of INSERT/UPDATE/DELETE statements access the parent and child tables; this avoids share locking on the child table.
  • When choosing to index a key, balance the performance gain for queries against the performance loss for INSERT/UPDATE/DELETE by using the SQL trace facility.
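
A minimal sketch of the kinds of indexes mentioned above, using a hypothetical orders/order_items schema (all table, column and index names are illustrative):

-- B-tree index on a key used frequently in WHERE clauses and joins
create index orders_cust_id_idx on orders (customer_id);

-- function-based index, so WHERE UPPER(status) = 'SHIPPED' can still use an index
create index orders_status_fbi on orders (upper(status));

-- index on the foreign key of a child table to avoid share locks during parent I/U/D
create index order_items_order_id_idx on order_items (order_id);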

Saturday, November 19, 2005

Oracle: Flashback Query in 10g

-- all committed versions of each row, oldest to newest
select versions_starttime, versions_endtime, versions_xid,
       versions_operation, rate
from rates versions between timestamp minvalue and maxvalue
order by versions_starttime;

-- versions within a specific time window
select rate, versions_starttime, versions_endtime
from rates versions between timestamp
     to_date('12/1/2003 15:57:54','mm/dd/yyyy hh24:mi:ss')
     and to_date('12/1/2003 16:57:55','mm/dd/yyyy hh24:mi:ss');

-- versions within an SCN range
select rate, versions_starttime, versions_endtime
from rates versions between scn 1000 and 1001;

-- versions from a point in time onward
select versions_starttime, versions_endtime, versions_xid,
       versions_operation, rate
from rates versions between timestamp
     to_date('12/11/2003 15:57:52', 'mm/dd/yyyy hh24:mi:ss')
     and maxvalue
order by versions_starttime;

-- SQL needed to undo the changes made by a given transaction (XID comes from versions_xid above)
SELECT UNDO_SQL
FROM FLASHBACK_TRANSACTION_QUERY
WHERE XID = '000A000D00000029';

Monday, November 14, 2005

Oracle: Links

http://www.cuddletech.com/articles/oracle/node1.html

Oracle: Flashback

-- flashback doesn't work on the SYSTEM tablespace (don't log in as sys/system)

- startup mount

- alter database archivelog;

- alter system set db_flashback_retention_target=1440;

- alter system set db_recovery_file_dest_size=512000000;

- alter database flashback on;

- alter database open;

- flashback table "TEST" to before drop;

-- two important things: SCN and the recycle bin
show recyclebin;
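
A quick way to check the setup and to restore a dropped table under a new name (a sketch; flashback table ... to before drop needs only the recycle bin, while the flashback logging above also assumes db_recovery_file_dest has been pointed at a flash recovery area):

-- confirm that flashback logging is enabled
select flashback_on from v$database;

-- objects still available in the recycle bin
select object_name, original_name, droptime from user_recyclebin;

-- restore a dropped table under a different name
flashback table "TEST" to before drop rename to test_restored;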

Thursday, November 10, 2005

Oracle: Using Loop and Random

Create table test_normal (empno number(10), ename varchar2(30), sal number(10));

Begin
For i in 1..1000000
Loop
Insert into test_normal
values(i, dbms_random.string('U',30), dbms_random.value(1000,7000));
If mod(i, 10000) = 0 then
Commit;
End if;
End loop;
End;
/

Friday, November 04, 2005

Oracle: Histograms

When to use histograms:
1. When the column is used frequently in WHERE clauses.
2. When the column's data distribution is highly skewed.

Creating Histograms:
- execute DBMS_STATS.GATHER_TABLE_STATS
('scott','emp', METHOD_OPT => 'FOR COLUMNS SIZE 10 sal');
(creates a 10-bucket histogram on the sal column of the emp table)

The default number of buckets is 75.
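
To check what was created, the histogram buckets can be read back from the data dictionary (a sketch, assuming the GATHER_TABLE_STATS call above was run as SCOTT):

select column_name, endpoint_number, endpoint_value
from user_tab_histograms
where table_name = 'EMP' and column_name = 'SAL'
order by endpoint_number;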

Oracle: Generating statistics

Oracle uses the following techniques to generate statistics:
- Estimation based on random data sampling
- Exact computation
- User-defined statistics collection methods

To collect statistics on a schema:
- execute DBMS_STATS.GATHER_SCHEMA_STATS('OE',DBMS_STATS.AUTO_SAMPLE_SIZE);

You can use DBMS_STATS to gather, modify, view, export, import and delete statistics.
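
For a single table, the sampling-versus-exact choice is controlled by the ESTIMATE_PERCENT parameter; a sketch (the usual SCOTT.EMP example, names illustrative):

-- estimation based on random sampling, letting Oracle pick the sample size
execute DBMS_STATS.GATHER_TABLE_STATS('scott','emp', estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE);

-- exact computation (reads every row)
execute DBMS_STATS.GATHER_TABLE_STATS('scott','emp', estimate_percent => NULL);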

DBMS_STATS Package procedures:
- GATHER_INDEX_STATS : Index statistics
- GATHER_TABLE_STATS : Table, Columns and index statistics
- GATHER_SCHEMA_STATS : Statistics for all objects in a schema
- GATHER_DATABASE_STATS : Statistics for all objects in a database
- GATHER_SYSTEM_STATS : CPU and I/O statistics for the system

Gathering statistics on table/col/index will invalidate parsed SQL statements.
Gathering system statistics will not invalidate parsed SQL.
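
The interval-gathering examples below store their results in a user statistics table called mystats; that table has to exist first. A sketch of creating it (assuming the SCOTT schema):

execute DBMS_STATS.CREATE_STAT_TABLE('scott','mystats');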


- Gather statistics. Gathering ends after 720 minutes and the results are stored in the mystats table.
** day time **
BEGIN
DBMS_STATS.GATHER_SYSTEM_STATS(
gathering_mode => 'interval',
interval => 720,
stattab => 'mystats',
statid => 'OLTP');
end;
/

** night time **
BEGIN
DBMS_STATS.GATHER_SYSTEM_STATS(
gathering_mode => 'interval',
interval => 720,
stattab => 'mystats',
statid => 'OLAP');
end;
/

- To import the OLAP statistics for nighttime:

BEGIN
DBMS_JOB.SUBMIT(:jobno,
'DBMS_STATS.IMPORT_SYSTEM_STATS(''mystats'',''OLAP'');',
SYSDATE + 0.5, 'SYSDATE + 1');
COMMIT;
END;
/

Oracle: Statistics

- Statistics describe the data distribution and storage characteristics of tables, columns, indexes and partitions.
- Selectivity is the fraction of rows in a table that a SQL statement's predicate selects.

- Statistics generated include the following (the dictionary queries after this list show where they are stored):
  - Table statistics
    - Number of rows
    - Number of blocks
    - Average row length
  - Column statistics
    - Number of distinct values in column
    - Number of nulls in column
    - Data distribution (histogram)
  - Index statistics
    - Number of leaf blocks
    - Levels
    - Clustering factor
  - System statistics
    - I/O performance and utilization
    - CPU performance and utilization
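
Once statistics have been gathered, most of these figures can be read back from the data dictionary; a sketch (run as the table owner, EMP is just an example):

-- table statistics
select num_rows, blocks, avg_row_len from user_tables where table_name = 'EMP';

-- column statistics
select column_name, num_distinct, num_nulls, num_buckets
from user_tab_col_statistics where table_name = 'EMP';

-- index statistics
select index_name, leaf_blocks, blevel, clustering_factor
from user_indexes where table_name = 'EMP';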

Oracle: setting up pga_aggregate_target

- Oracle9i introduced a new feature: pga_aggregate_target / workarea_size_policy.

- pga_aggregate_target defines how much PGA memory is shared among all sessions. If there are more user sessions, each session gets less work-area memory; if there are fewer sessions, each gets more.

- This eliminates tuning of hash_area_size, sort_area_size and so on.

- How to set (see the sketch below):
set workarea_size_policy to auto
set pga_aggregate_target to 16% of physical memory on an OLTP system and 40% of memory on a DSS system.
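
As a concrete sketch (the 1600M value is only an illustration; derive it from the 16%/40% guideline above):

alter system set workarea_size_policy = auto;
alter system set pga_aggregate_target = 1600M;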

- How to measure the performance:
select * from v$pgastat;
  -- "over allocation count" should be 0
  -- "cache hit percentage" should be close to 100
select name, value from v$sysstat where name like 'workarea exe%';
  -- one-pass and multi-pass executions should be zero

- advisor feature in 9iR2
v$pga_target_advice

- How to query v$pga_target_advice:
To eliminate over-allocation:
select round(pga_target_for_estimate/1024/1024) as target_size_MB,
bytes_processed,estd_extra_bytes_rw as est_rw_extra_bytes,
estd_pga_cache_hit_percentage as est_hit_pct,
estd_overalloc_count as est_overalloc
from v$pga_target_advice;

to eliminate extra reads/writes
select round(pga_target_for_estimate/1024/1024) as target_size_MB,
bytes_processed,estd_extra_bytes_rw as est_rw_extra_bytes,
estd_pga_cache_hit_percentage as est_hit_pct,
estd_overalloc_count as est_overalloc
from v$pga_target_advice;

Thursday, November 03, 2005

Oracle: what happens during a hot backup

  1. Oracle checkpoints the tablespace, flushing all changes from shared memory to disk.
  2. The SCN markers for each datafile in that tablespace are "frozen" at their current values. Even though further updates will be sent to the datafiles, the SCN markers will not be updated until the tablespace is taken out of backup mode.
  3. Oracle switches to logging full images of changed database blocks to the redo logs. Instead of recording how it changed a particular block (the change vector), it will log the entire image of the block after the change. This is why the redo logs grow at a much faster rate while hot backups are going on (a minimal command sketch follows this list).
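
A minimal sketch of the surrounding commands for a user-managed hot backup (the tablespace name and file paths are illustrative; the datafile copy itself is done with OS tools):

alter tablespace users begin backup;
-- copy the tablespace's datafiles at the OS level, e.g. cp /u01/oradata/users01.dbf /backup/
alter tablespace users end backup;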

TCPWrapper: logging

In hosts.deny, add the entry below to enable TCP Wrapper logging and email the admin:

ALL:All: spawn (echo '%h-%d' | mailx -s'%h-%d rejected on fcasun2' -r jamesp@synnex.com jamesp@synnex.com) &

Perl: automate sftp using perl or expect

#!/usr/bin/perl -w

use Net::SFTP;
use strict;

my $host = "host.ssh.com";
my %args = (
user => 'your_user_name',
password => 'your_password',
debug => 'true'
);

my $sftp = Net::SFTP->new($host, %args);
$sftp->get("/home/user/something.txt", "/home/user/hey.txt");
$sftp->put("bar", "baz");




#!/usr/local/bin/expect

spawn sftp -b cmdFile user@yourserver.com
expect "password:"
send "shhh!\n";
interact

This will spawn sftp in batch mode and pass the password shhh! to the program. sftp will then execute all the commands in cmdFile:

cmdFile
lcd /home/ftp/test
cd /home/ftp/somedir
mput *.dat

lcd /home/recieving
cd /home/someotherdir
mget *.dat


Enable tcpwrapper on solaris10

- On Solaris 10, ssh, sendmail and rpcbind are TCP Wrapper enabled by default.
If there are no hosts.allow/hosts.deny files, access is granted to everyone by default.

- To enable TCP Wrappers support for inetd-based services, you can simply use the following commands:
# inetadm -M tcp_wrappers=true
# svcadm refresh inetd

# inetadm -l telnet | grep tcp_wrappers

- you can also use svccfg to enable inetd based services.
# svccfg -s inetd setprop defaults/tcp_wrappers=true
# svcadm refresh inetd


Wednesday, November 02, 2005

Korn Shell Global functions

- create file get_time in directory ~/kfunctions
function get_time {
TIME=$(date '+%m/%d/%y')
printf "$TIME\n"
}

- create .kshrc file
export FPATH=/export/home/jamesp/kfunctions
autoload get_time

- add entry to .profile
export ENV=~/.kshrc

- use global functions
#!/bin/ksh
SYSTIME=$(get_time)
echo "$SYSTIME\n"

Perl: creating modules

- Perl is distributed with a program called h2xs; it is used to generate skeleton files for a new module.

- h2xs -AXc -n NewModule

- edit NewModule.pm

- run perl Makefile.PL
make
make install

- use NewModule
NewModule->functions

The Changes file is where you might keep track of changes you make to your module as you write new versions.

MANIFEST contains a list of files in this directory. If you add new files to the directory, you should also add them to the MANIFEST. The MANIFEST is used to create a tarball of your module for distribution, and it's also checked when people unpack the tarball and install the module

Makefile.PL is a Perl program used to create a Unix Makefile. You'll use this Makefile to test and install your module.

NewModule.pm is your module. You'll write the code here in the next step.

Tuesday, November 01, 2005

Perl DBI and Loop

DBI and loops
DBI statements are often used within loops that store or fetch data from a database. You might insert the contents of a text file into a database line by line, or read and parse multiple lines of output from a database query. I often see code like the following from new DBI programmers:

foreach $line (<FILE>) {
chomp $line;
($alpha, $beta, $charlie, $delta) = split(/,/,$line);
$sql = qq{
insert into $table (col_a, col_b, col_c, col_d)
values($alpha,$beta,$charlie,$delta)
};
$dbh = DBI->connect($dsn,'login','password');
$sth = $dbh->prepare($sql);
$sth->execute;
$dbh->disconnect;
}
This code works, but not very efficiently. With the above code it took an hour to insert 200,000 rows into a MySQL database -- and MySQL is pretty fast! It might take half a day with Oracle.
There are lots of improvements we can make. The first is just common sense: never put anything in a loop that doesn't absolutely have to be there. For each of our 200,000 insertions, we connect to the database, prepare our SQL statement, execute it, and disconnect from the database. However, we only need to connect and disconnect once. We simply have to move our connect() and disconnect() outside the loop to drastically improve our performance:

$dbh = DBI->connect($dsn,'login','password');
foreach $line (<FILE>) {
chomp $line;
($alpha, $beta, $charlie, $delta) = split(/,/,$line);
$sql = qq{
insert into $table (col_a, col_b, col_c, col_d)
values($alpha,$beta,$charlie,$delta)
};
$sth = $dbh->prepare($sql);
$sth->execute;
}
$dbh->disconnect;
This reduces our run time to half an hour, doubling our speed.
Placeholders
DBI allows us to further improve our code by using placeholders. We now connect and disconnect the database only once, but we still prepare and execute our SQL statement 200,000 times. The statement we prepare doesn't change much; all that changes is the insertion value. We can create a generic statement that we prepare only once, outside the loop. With this generic statement we can specify the insertion values at execution time. DBI uses a ? to define a placeholder, so our new SQL statement looks like this:

$sql = qq{
insert into $table (col_a, col_b, col_c, col_d)
values(?, ?, ?, ?)
};
The statement is prepared the same as before:

$sth = $dbh->prepare($sql);
Now we can fill in each ? value at execution time. There are two ways to do this. First, we can use the bind_param() subroutine to indicate what values to use:

$sql = qq{
insert into $table (col_a, col_b, col_c, col_d)
values(?, ?, ?, ?)
};
$sth = $dbh->prepare($sql);
foreach $line (<FILE>) {
chomp $line;
($alpha, $beta, $charlie, $delta) = split(/,/,$line);
$sth->bind_param(1, $alpha);
$sth->bind_param(2, $beta);
$sth->bind_param(3, $charlie);
$sth->bind_param(4, $delta);
$sth->execute;
}
The bind_param() subroutine allows a data type as a third argument. Normally, this is used to tell the driver whether the placeholder is a number or a string:

$sth->bind_param(1, $alpha, SQL_INTEGER);

or:

$sth->bind_param(2, $beta, SQL_VARCHAR);

The second way to specify values at execution time is to pass them directly as arguments to the execute statement:

$sth->execute($alpha, $beta, $charlie, $delta);

Using this method our code simplifies to:

$dbh = DBI->connect($dsn,'login','password');
$sql = qq{
insert into $table (col_a, col_b, col_c, col_d)
values(?, ?, ?, ?)
};
$sth = $dbh->prepare($sql);
foreach $line (<FILE>) {
chomp $line;
($alpha, $beta, $charlie, $delta) = split(/,/,$line);
$sth->execute($alpha, $beta, $charlie, $delta);
}
$dbh->disconnect;
Our code now takes twenty minutes to run. We achieved this improvement despite the fact that the MySQL database driver only emulates placeholders. Had this been a database that directly supports placeholders, such as Oracle, the improvement would have been even more dramatic. Placeholders are an easy way to improve your DBI coding, but there are a few gotchas that beginners need to avoid. First, notice that placeholders don't have quotes. This is correct:

$sql = qq{
insert into $table (col_a, col_b, col_c, col_d)
values(?, ?, ?, ?)
};
This is not correct:

$sql = qq{
insert into $table (col_a, col_b, col_c, col_d)
values('?', '?', '?', '?')
};
Placeholders cannot be used for column names or table names. This won't work:

$sql = qq{
select alpha, ? from table
};
Nor will this:

$sql = qq{
select alpha, beta from ?
};
Or even this:

$sql = qq{
select alpha, beta from table where ? > 200
};
However, you can use Perl variables for column and table names:

$sql = qq{
select alpha, beta from $table where $col > 200
};
You'll have to re-prepare this statement whenever you want to change the values of $table or $col.
Fetches
We've dealt mostly with inserting data into a database, so now we'll look at the ways DBI provides to get data out. Most of those ways follow the same basic format:

$dbh = DBI->connect($dsn, 'login', 'passwd');
$sql = qq{
select * from $table
};
$sth = $dbh->prepare($sql);
$sth->execute;
while (@row = $sth->fetchrow_array) {
print "@row\n";
}
$dbh->disconnect;
The most efficient of these is fetchrow_arrayref():
while ($row = $sth->fetchrow_arrayref) {
print "@$row\n";
}
There's also fetchrow_hashref(), which stores the keys and values of the database into a hash:

while ($ref = $sth->fetchrow_hashref) {
foreach $key (keys %{$ref}) {
print "$ref->{$key}, ";
}
print "\n";
}
Bind Columns
Database fetches can be made even more efficient and elegant with the use of bind columns. Bind columns allow you to bind a variable to each column of a table, so that whenever a fetch is performed the bound variables are automatically updated:

$sth->bind_columns(\%attr, @references_to_variables_to_bind);

We can use it like this:

$sql = qq{
select alpha, beta, delta from table
};
$sth = $dbh->prepare($sql);
$sth->execute;
my ($alpha, $beta, $delta);
$sth->bind_columns(undef, \$alpha, \$beta, \$delta);
while ($sth->fetch) {
print "$alpha - $beta - $delta\n";
}
Here, the fetch() subroutine is synonymous with fetchrow_arrayref(). This subroutine, combined with bind_columns(), is the fastest way to fetch data with DBI.
Error Checking
There's a big problem with all of the code snippets you've seen in this article: we never check the return values of DBI methods. We simply assume that every DBI call was successful. The obvious solution is to add an or die clause after every DBI method call:

$dbh = DBI->connect($dsn,'login','passwd')
    or die $DBI::errstr;
$sth = $dbh->prepare($sql) or die $dbh->errstr;
$sth->execute or die $sth->errstr;
However, there is a cleaner solution. Each DBI handle has a boolean attribute called RaiseError. If we set this to true, any errors triggered by that handle cause the script to die with an appropriate error message, saving us from having to check every DBI call we make. RaiseError can be set via the connect statement:

$dbh = DBI->connect($dsn,'login','passwd',
{ RaiseError => 1 });
We can also set it after a connection is already made:

$dbh->{RaiseError} = 1;
Transactions
I've found that many administrators shy away from using transactions with DBI. For those unfamiliar, transactions provide a way to group multiple SQL statements together so that database changes are only made if every statement is successful. For example, a transaction for updating a customer order database might include statements to update inventory, update accounts payable, and update shipping manifests. All updates must succeed; otherwise, we don't want to perform any of them. If only some of the updates succeeded, we might bill a customer for an order not shipped, or ship an order to a customer without billing them.
By default, DBI commits each statement as it's performed. To use transactions, we need to disable this behavior with the AutoCommit variable:

$dbh = DBI->connect($dsn,'login','passwd', {
RaiseError => 1,
AutoCommit => 0,
});
We can also just say:

$dbh->{AutoCommit} = 0;
With AutoCommit off, changes are revocable until commit() is called. Uncommitted statements can be undone with rollback().
Next, we need a way to group our statements that lets us test whether any failed. The easiest and most robust way to do this is to set RaiseError and wrap the statements in an eval block:

$dbh->{RaiseError} = 1;
eval {
$sth_1->execute; # Update Inventory
$sth_2->execute; # Update Accounts Payable
$sth_3->execute; # Update Shipping
};
if ($@) {
$dbh->rollback;
} else {
$dbh->commit;
}
If any statement in the eval block fails, $@ is set and we revoke all three statements.
Don't forget to commit your statements. If you do forget, what happens next depends on the database. Some databases automatically commit them; others roll them back.
References
The material for this article came from personal experience, a thousand or so readings of the DBI documentation and FAQ, and a generous review by DBI guru Tim Bunce. You can find the DBI module and its related documentation at http://www.symbolstone.org/technology/perl/DBI.
__END__