Tuesday, May 23, 2006

Commonly used vi commands

To move through a file (for example, to reach a log entry near the end of the file):


^f move forward one screen

^b move backward one screen

^d move down (forward) one half screen

^u move up (back) one half screen

* u UNDO WHATEVER YOU JUST DID; a simple toggle

* a append text after cursor, until Esc is hit
* i insert text before cursor, until Esc is hit

* r replace single character under cursor (no Esc needed)

R replace characters, starting with current cursor position, until Esc is hit

* x delete single character under cursor

Nx delete N characters, starting with character under cursor
* dd delete entire current line

Ndd or dNd delete N lines, beginning with the current line;
e.g., 5dd deletes 5 lines


yy copy (yank) the current line into the buffer

Nyy or yNy copy (yank) the next N lines, including the current line, into the buffer

p put (paste) the line(s) in the buffer into the text after the current line


/string search forward for occurrence of string in text

?string search backward for occurrence of string in text

n move to next occurrence of search string

N move to next occurrence of search string in opposite direction

MOVING

:# move to line #

:$ move to last line of file

Monday, May 15, 2006

Why Use an Embedded Database?

Whenever applications provide their own database management, those databases are called embedded. In a sense, they can be seen as being embedded in the application, as well as on the local system.

At one time, databases of this kind were the norm. Consider, for example, that COBOL has built-in primitives for creating and maintaining embedded databases. These are true primitive keywords of the language; they are not even function calls. Today, FoxPro and similar languages come closest to this model. In the heyday of COBOL and xPro languages, client-server relational database management systems were not common. As a result, these languages often created local data stores, which were used by the immediate application. Later on, data records would be exported to a central database as needed.

Embedded DB in Applications
Suppose you were designing a utility like Microsoft Outlook, which provides a GUI front end for entry of appointments, contacts, and individual e-mails. How would you handle the data management of these records? Likely, you would want a database to do this for you, but you'd be in a bit of a jam deciding how to choose one. The first problem is that you cannot rely on the customer having any particular database to serve as a back end to your software. You could compensate for this by providing a generic database front end (say, with ODBC, JDBC, or the like) and then have the customer map it to the DBMS in use at their site. But if the customer is a consumer running your productivity suite on an occasionally connected notebook, what are you going to do? The solution is: you provide your own DBMS in the form of an embedded database. Later, if this embedded database has to sync with another (such as Outlook syncing with Microsoft Exchange or with a handheld device), you rely on application-level data import and export functions to do so.
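
As a rough illustration of the idea, here is a minimal sketch using SQLite through Python's standard sqlite3 module. SQLite is my own choice of embedded database here, not one the article names, and the contacts table and the helper names (open_store, add_contact, export_contacts) are hypothetical. The point is only that the application carries its own data store and handles export itself.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the application's private data store.

    ":memory:" keeps the example self-contained; a real application
    would pass a file path inside the user's profile directory.
    """
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS contacts ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, email TEXT)"
    )
    return conn


def add_contact(conn, name, email):
    """Insert one contact and return its row id."""
    with conn:  # commits on success, rolls back on an exception
        cur = conn.execute(
            "INSERT INTO contacts (name, email) VALUES (?, ?)",
            (name, email),
        )
    return cur.lastrowid


def export_contacts(conn):
    """Application-level export: plain records another system could import."""
    rows = conn.execute("SELECT name, email FROM contacts ORDER BY id")
    return [{"name": name, "email": email} for name, email in rows]


if __name__ == "__main__":
    store = open_store()
    add_contact(store, "Ada", "ada@example.com")
    print(export_contacts(store))
```

No database server is installed or configured on the customer's machine; the whole DBMS lives inside the application process.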

Consumer-oriented software is a clear, definable market for embedded databases. Candidate applications include accounting software, portfolio tracking, and so on. Today, each of these applications comes with its own embedded data store. The best embedded databases have several traits in common: they don't consume a lot of resources, they're fast, and they are utterly reliable—three qualities that consumer products must deliver in today's highly competitive markets.

ref:
http://www.devx.com/IBMDB2/Article/27622

Saturday, May 06, 2006

Spam prevention mechanism

One of the best-known spam prevention mechanisms is Bayesian filtering.
Bayesian filtering can classify almost any kind of data, but it is most widely applied to email, to decide whether a mail is spam.

The algorithm is based on the classic Bayes' theorem.
Probability that A occurs given that B has occurred = probability that B occurs given that A has occurred x probability that A occurs, divided by the probability that B occurs

P(A|B) = P(B|A) x P(A) / P(B)


In the email context, this means classifying a mail as spam based on the words in it.

Probability that a mail is spam given these words = probability that these words occur in a spam mail x probability that a given mail is spam, normalised by the probability that these words occur in any mail.
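
To make the formula concrete, here is a small worked calculation. All the numbers are made up for illustration, and the function and variable names are my own. The denominator P(words) is expanded over the two cases, spam and not spam.

```python
def posterior(p_words_given_spam, p_spam, p_words):
    """P(spam | words) = P(words | spam) * P(spam) / P(words)."""
    return p_words_given_spam * p_spam / p_words


# Hypothetical inputs, chosen only for illustration.
p_spam = 0.4               # prior: 40% of this user's mail is spam
p_words_given_spam = 0.5   # the words appear in half of all spam mails
p_words_given_ham = 0.05   # ...and in 5% of legitimate mails

# Total probability that the words appear in any mail (the denominator).
p_words = p_words_given_spam * p_spam + p_words_given_ham * (1 - p_spam)

p_spam_given_words = posterior(p_words_given_spam, p_spam, p_words)

if __name__ == "__main__":
    print(f"P(words)        = {p_words:.2f}")             # 0.23
    print(f"P(spam | words) = {p_spam_given_words:.2f}")  # 0.87
```

With these inputs, seeing the words raises the estimate that the mail is spam from the prior of 0.4 to about 0.87.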

The probability that a given mail is spam is a user-specific factor: if the user rejects a lot of mail as spam, either because he is picky or because he really gets a lot of spam, this value will be high for that user.

Conversely, suppose the user marks very few mails as spam, but everything he has marked as spam contains these words. That alone does not mean a mail containing them is spam, because the same words can also appear in mail that is not spam. The probability of these words occurring in any mail (the denominator) accounts for this: words that are common to all mails are not given high weight in rejecting a mail as spam.
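
The effect of the denominator can be seen in a toy single-word version of the filter. This is only a sketch under simplifying assumptions: the four training mails are invented, each word is scored on its own, and both classes are assumed to contain at least one mail. A real filter combines many words and smooths the counts.

```python
from collections import Counter


def train(messages):
    """Count, per class, how many mails contain each word.

    messages is an iterable of (text, is_spam) pairs.
    """
    counts = {True: Counter(), False: Counter()}
    totals = {True: 0, False: 0}
    for text, is_spam in messages:
        totals[is_spam] += 1
        # set() so a word is counted once per mail, not once per occurrence
        counts[is_spam].update(set(text.lower().split()))
    return counts, totals


def p_spam_given_word(word, counts, totals):
    """P(spam | word) = P(word | spam) * P(spam) / P(word)."""
    n = totals[True] + totals[False]
    p_spam = totals[True] / n
    p_word_given_spam = counts[True][word] / totals[True]
    p_word = (counts[True][word] + counts[False][word]) / n
    if p_word == 0:
        return p_spam  # unseen word: no evidence, fall back to the prior
    return p_word_given_spam * p_spam / p_word


# Invented training data: two spam mails, two legitimate mails.
EXAMPLE_MAILS = [
    ("buy the pills now", True),
    ("the pills are cheap", True),
    ("see the report", False),
    ("lunch at the office", False),
]

counts, totals = train(EXAMPLE_MAILS)

if __name__ == "__main__":
    for word in ("the", "pills"):
        print(word, p_spam_given_word(word, counts, totals))
```

Both "the" and "pills" appear in every spam mail, but "the" also appears in every legitimate mail, so its score stays at the prior of 0.5, while "pills" scores 1.0.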