Contents

Articles
    Startingkdbplus/introduction
    Startingkdbplus/qlanguage
    Startingkdbplus/ipc
    Startingkdbplus/tables
    Startingkdbplus/hdb
    Startingkdbplus/rdb

References
    Article Sources and Contributors
    Image Sources, Licenses and Contributors

Article Licenses
    License

Startingkdbplus/introduction

1.1 Overview
This is a quick start guide to kdb+ from Kx Systems, aimed primarily at those learning independently. It covers system installation, the q environment, q ipc, kdb+ tables and typical databases, and where to find more material. After completing this you should be able to follow the Borror tutorials Q for Mortals and Kdb+ For Mortals, and the wiki Reference and Cookbooks pages. One caution: you can learn kdb+ reasonably well by independent study, but for serious evaluation of the product you need the help of a kdb+ consultant. This is because kdb+ is typically used for very demanding applications that require experience to set up properly. Contact Kx Systems or one of its partners for help with such evaluations.

1.2 Kdb+
The kdb+ system is both a database and a programming language:

- kdb+, the database (k database plus)
- q, the programming language for working with kdb+

Both kdb+ and q are written in the k programming language. You do not need to know k to work with kdb+, but will occasionally see references to it. For example, q is defined in the distributed script q.k.

1.3 Resources
Kx wiki
The Kx wiki is the best resource for learning kdb+, and includes:

- Jeff Borror's tutorials Q for Mortals and Kdb+ For Mortals
- a cookbook of common tasks
- a reference on the built-in functions
- an svn repository with user and Kx contributed code

Kx Html Pages
Some older, but still useful, html pages are at kx.com/documentation.php [1]. See in particular Dennis Shasha's Kdb+ Database and Language Primer [2].

Other Web Pages


The Knowledge_Base Kdb [3] page has a good overview.

Discussion groups
- The main discussion forum is the k4 listbox [4]. This is available only to licensed customers; please use a work email address to apply for access.
- The Kdb+ Personal Developers [5] forum is an open Google discussion group for users of the trial system.

Additional Files
The kx.com/q [6] directory has various supporting files, for example the script sp.q referenced in this guide (which is also included with the trial system). These files are also copied to the svn repository, so for example, the sp.q script can also be found at kdb+.

1.4 Install Trial System


If you do not already have access to a licensed copy of kdb+, then get the 32-bit trial version from kx.com/Developers [7]. This is limited to a 32-bit address space, a 2 hour timeout and expiry every 90 days. Otherwise, it is a complete system and ideal for learning kdb+. When unzipping the install, take care to retain the folder structure.

1.5 Directory Layout


Kdb+ can be installed anywhere, but typically into a q subdirectory. For example:

in Linux/Mac:

    ~/q         / main q directory (under home)
    ~/q/l32     / location of l32 executable

in Windows:

    c:\q        / main q directory
    c:\q\w32    / location of w32 executable

If you need to install q elsewhere, set the environment variable QHOME to point to the new directory. If QHOME is not defined, kdb+ defaults to $HOME/q for unix-based systems, and c:\q for Windows. To run the system, see instructions in the next section, 2. Q Language.
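As a quick check of which home directory applies, the following sketch can be entered once q is running. It relies only on the built-in getenv; the names qhome and msg are illustrative and not part of the guide.

```q
/ sketch: report whether QHOME is set for this session
qhome:getenv `QHOME                  / empty string if QHOME is not set
msg:$[count qhome;
  "QHOME is set to ",qhome;
  "QHOME is not set, so the default directory is used"]
show msg
```

If the message reports that QHOME is not set, q is using the default location described above.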

1.6 Example Files


Two sets of scripts are referenced in this guide:

1. The trial system is distributed with the following example scripts in the main directory:

   - sp.q - the Suppliers and Parts sample database
   - trade.q - a stock trades sample database

   If you do not have these scripts, get them from kx.com/q [6] and save in your q directory.

2. Other examples are in the svn start directory. To install, download start.zip and unzip in the q directory, creating directory q/start.

1.7 GUI
kdb+ has only a console interface, but there are some GUIs:

- The most popular is Charlie Skelton's studio for kdb+, a cross-platform execution environment. It is worth having this available even if you use another interface.
- First Derivatives [8] offer their clients the qIDE development system.
- Q and K Development Tools [9] has an eclipse plugin.
- Q Insight Pad [10] is an IDE for Windows.
- Qconsole is an IDE using GTK.

References
[1] http://kx.com/documentation.php
[2] http://www.kx.com/q/d/primer.htm
[3] http://www.thalesians.com/finance/index.php/Knowledge_Base/Databases/Kdb
[4] http://www.listbox.com/subscribe/?listname=k4
[5] http://groups.google.com/group/personal-kdbplus
[6] http://www.kx.com/q
[7] http://kx.com/Developers/software.php
[8] http://www.firstderivatives.com
[9] http://www.qkdt.org
[10] http://www.qinsightpad.com

Startingkdbplus/qlanguage
2.1 Overview
Q is the programming system for working with kdb+. This corresponds to SQL for traditional databases, but unlike SQL, q is a powerful programming language in its own right.

Q is an interpreted language. Q expressions can be entered and executed in the q console, or loaded from a q script, which is a text file with extension .q.

You need at least some familiarity with q to use kdb+. Try following the examples here in the q console interface. Also, ensure that you have the example files installed.

The following wiki pages will also be useful:

- Function Reference
- Data Types
- System Commands
- Command Line Parameters

2.2 Loading q
You load q by changing to the main q directory, then running the q executable. You should not just click the q executable from the explorer: this will load q, but in the wrong directory. It is best to create a startup script, which might do other preprocessing such as setting environment variables; see examples q.sh and q.bat in the start [1] directory.

In Windows, enter in a command window:

    c:
    cd \q
    w32\q.exe

or create a batch file with the contents below, which allows parameters to be passed to q:

    c:
    cd \q
    w32\q.exe %*

In Linux/Mac, it is usual to call the q executable under rlwrap to support line recall and edit. In the console:

    ..$ cd ~/q
    ~/q$ rlwrap l32/q

The following Linux shell script changes to the q directory, sets the appropriate directory for 32 or 64 bit, then loads q under rlwrap:

    #!/bin/bash
    cd ~/q
    if [ "x86_64" == "$(uname -m)" ]; then p=l64; else p=l32; fi
    rlwrap $p/q "$@"

2.3 First Steps


Once q is loaded, you can enter expressions for execution:

    q)2 + 3
    5
    q)2 + 3 4 7
    5 6 9

You can confirm that you are in the main q directory by calling a directory list command, e.g. in Windows:

    q)\dir *.q
    "sp.q"
    "trade.q"
    ...

in Linux/Mac:

    q)\ls *.q
    "sp.q"
    "trade.q"
    ...

Command line parameters are given here. For example:

    q profile.q -p 5001

- loads script profile.q at startup. This can in turn load other scripts.
- sets listening port to 5001

At any prompt, enter \\ to exit q.

2.4 Console Modes


The usual prompt is q). Sometimes a different prompt is given; you need to understand why this is, and how to return to the standard prompt.

1. If a function is suspended, then the prompt has two or more ")". In this case, enter a single \ to reduce one level of suspension, and repeat until the prompt becomes q). For example:

    q)f:{2+x}       / define function f
    q)f `sym        / function call fails with symbol argument
    {2+x}           / and is left suspended
    'type
    +
    2
    `sym
    q))\            / prompt becomes q)). Enter \ to return to usual prompt
    q)

2. If there is no suspension, then a single \ will toggle q and k modes:

    q)count each (1 2;"abc")    / q expression for length of each list item
    2 3
    q)\                         / toggle to k mode
      #:'(1 2;"abc")            / equivalent k expression
    2 3
      \                         / toggle back to q mode
    q)

3. If you change namespace, then the prompt includes the namespace, see namespace directory. For example:

    q)\d .h         / change to h namespace
    q.h)\d .        / change back to root namespace
    q)

2.5 Error Messages


Error messages are terse. The format is a single quote, followed by error text:

    q)1 2 + 10 20 30    / cannot add 2 numbers to 3 numbers
    'length
    q)2 + "hello"       / cannot add number to character
    'type
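Because an error is signalled as a short text, it can also be caught in code. The sketch below uses the built-in three-argument form of @ (protected evaluation), which calls a handler with the error text instead of suspending; the helper name safe is hypothetical.

```q
f:{2+x}
/ hypothetical helper: apply g to arg, returning a message instead of failing
safe:{[g;arg] @[g; arg; {"error: ",x}]}
safe[f; 3]          / 5
safe[f; `sym]       / "error: type"
```

This keeps the console at the q) prompt rather than leaving a suspended function behind.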

2.6 Introductory Examples


To gain experience with the language, enter the following examples and explain the results. Also experiment with similar expressions. The / marks a comment, which should not be entered.

    q)x:2 5 4 7 5
    q)x
    2 5 4 7 5
    q)count x
    5
    q)8 # x
    2 5 4 7 5 2 5 4
    q)2 3 # x
    2 5 4
    7 5 2
    q)sum x
    23
    q)sums x
    2 7 11 18 23
    q)distinct x
    2 5 4 7
    q)reverse x
    5 7 4 5 2
    q)x within 4 10
    01111b
    q)x where x within 4 10
    5 4 7 5
    q)y:(x;"abc")           / list of lists
    q)y
    2 5 4 7 5
    "abc"
    q)count y
    2
    q)count each y
    5 3

The following has a function definition, where x represents the argument:

    q)f:{2 + 3 * x}
    q)f 5
    17
    q)f til 5
    2 5 8 11 14

Q makes essential use of a symbol datatype:

    q)a:`toronto        / symbol
    q)b:"toronto"       / character string
    q)count a
    1
    q)count b
    7
    q)a="o"
    'type
    q)b="o"
    0101001b
    q)a~b               / a is not the same as b
    0b
    q)a~`$b             / `$b converts b to symbol
    1b
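The conversion works in both directions. As a small sketch using only built-in functions (string, `$, upper), with variable names chosen for illustration:

```q
a:`toronto
b:string a              / "toronto", a list of 7 characters
count b                 / 7
a~`$b                   / 1b, converting back gives the same symbol
syms:`$("ibm";"msft")   / `ibm`msft, each string becomes one symbol
upper b                 / "TORONTO"
```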

2.7 Data Structures


Q basic data structures are atoms (singletons) and lists. Other data structures like dictionaries and tables are built from lists. For example, a simple table is just a list of column names associated with a list of corresponding column values, each of which is a list.

    q)item:`nut                     / atom (singleton)
    q)items:`nut`bolt`cam`cog       / list
    q)sales: 6 8 0 3                / list
    q)prices: 10 20 15 20           / list
    q)(items;sales;prices)          / list of lists
    nut bolt cam cog
    6   8    0   3
    10  20   15  20

    q)dict:`items`sales`prices!(items;sales;prices)     / dictionary
    q)dict
    items | nut bolt cam cog
    sales | 6   8    0   3
    prices| 10  20   15  20
    q)tab:flip dict                 / table
    q)tab
    items sales prices
    ------------------
    nut   6     10
    bolt  8     20
    cam   0     15
    cog   3     20

    q)1!tab                         / keyed table
    items| sales prices
    -----| ------------
    nut  | 6     10
    bolt | 8     20
    cam  | 0     15
    cog  | 3     20

The table created above is an ordinary variable in the q workspace, and could be written to disk. In general, you create kdb+ tables in memory and then write to disk. Since it is a table, you can use SQL-like query expressions on it:

    q)select from tab where prices < 20
    items sales prices
    ------------------
    nut   6     10
    cam   0     15
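Since a table is built from a dictionary of lists, the same indexing applies to both. The following sketch repeats the definitions from this section so that it can be entered on its own:

```q
items:`nut`bolt`cam`cog
sales:6 8 0 3
prices:10 20 15 20
dict:`items`sales`prices!(items;sales;prices)
tab:flip dict
dict`sales          / 6 8 0 3, look up a key in the dictionary
tab`sales           / 6 8 0 3, the same lookup returns a table column
tab 1               / `items`sales`prices!(`bolt;8;20), a row is a dictionary
cols tab            / `items`sales`prices
count tab           / 4, the number of rows
```

Indexing a table by column name gives a list, while indexing by row number gives a dictionary keyed on the column names.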

2.8 Functions, Verbs, Adverbs


Functions take arguments on their right. Verbs take arguments on left and right, as in * (multiplication). Adverbs take function or verb arguments (on their left) and produce derived functions or verbs. In practice, the term function is used for both functions and verbs, except where the distinction is relevant. For example:

    q)sales * prices                    / verb: *
    60 160 0 60
    q)sum sales * prices                / function: sum
    280
    q)sumamt:{sum x*y}                  / define function: sumamt
    q)sumamt[sales;prices]
    280
    q)(sum sales*prices) % sum sales    / calculate weighted average
    16.47059
    q)sales wavg prices                 / built-in verb: wavg
    16.47059
    q)sales , prices                    / verb: , join lists
    6 8 0 3 10 20 15 20
    q)sales ,' prices                   / adverb: ' join lists in pairs
    6 10
    8 20
    0 15
    3 20

Functions can apply to dictionaries and tables:

    q)-2 # tab
    items sales prices
    ------------------
    cam   0     15
    cog   3     20

Functions can be used within queries:

    q)select items,sales,prices,amount:sales*prices from tab
    items sales prices amount
    -------------------------
    nut   6     10     60
    bolt  8     20     160
    cam   0     15     0
    cog   3     20     60
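A few more adverbs are worth trying at this point. The sketch below uses each, over (/), scan (\), each-both (') and each-left (\:) on the sales and prices lists from this section; the lambda {x*y} is only an illustration.

```q
sales:6 8 0 3
prices:10 20 15 20
count each (1 2 3;"ab")     / 3 2, apply count to each item
(+/) sales                  / 17, over: reduce the list with +
(+\) sales                  / 6 14 14 17, scan: running results of the same reduction
{x*y}'[sales;prices]        / 60 160 0 60, each-both: pair items up
1 2 3 +\: 10 20             / (11 21;12 22;13 23), each-left: every left item against the whole right list
```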

2.9 Scripts
A q script is a plain text file with extension .q, which contains q expressions that are executed when loaded. For example, load the script sp.q and display the "s" table that it defines:

    q)\l sp.q       / load script
    q)s             / display table s
    s | name  status city
    --| -------------------
    s1| smith 20     london
    s2| jones 10     paris
    s3| blake 30     paris
    s4| clark 20     london
    s5| adams 30     athens

Within a script, a line that contains a single / starts a comment block. A line with a single \ ends the comment block, or if none, exits the script.

A script can contain multi-line definitions. Any line that is indented is assumed to be a continuation of the previous line. Blank lines, superfluous blanks, and lines that are comments (begin with /) are ignored in determining this. For example, if a script has contents:

    a:1 2
    / this is a comment line
     3
     + 4
    b:"abc"

Then loading this script would define a and b as:

    q)a
    5 6 7           / i.e. 1 2 3 + 4
    q)b
    "abc"

2.10 Q Queries
Q queries are similar to SQL, though often much simpler:

    q)\l sp.q
    q)select from p where weight=17
    p | name  color weight city
    --| ------------------------
    p2| bolt  green 17     paris
    p3| screw blue  17     rome

SQL statements can be entered, if prefixed with s)

    q)s)select * from p where color in (red,green)     / SQL query
    p | name  color weight city
    --| -------------------------
    p1| nut   red   12     london
    p2| bolt  green 17     paris
    p4| screw red   14     london
    p6| cog   red   19     london

The Q equivalent would be:

    q)select from p where color in `red`green

Similarly, compare:

    q)select distinct p,s.city from sp
    s)select distinct sp.p,s.city from sp,s where sp.s=s.s

and

    q)select from sp where s.city=p.city
    s)select sp.s,sp.p,sp.qty from s,p,sp where sp.s=s.s and sp.p=p.p and p.city=s.city

Note that the dot notation in q automatically references the appropriate table.

Q results can have lists in the rows:

    q)select qty by s from sp
    s | qty
    --| -----------------------
    s1| 300 200 400 200 100 400
    s2| 300 400
    s3| ,200
    s4| 100 200 300

ungroup will flatten the result:

    q)ungroup select qty by s from sp
    s  qty
    ------
    s1 300
    s1 200
    s1 400
    s1 200
    ...

Calculations can be performed on the intermediate results:

    q)select countqty:count qty,sumqty:sum qty by p from sp
    p | countqty sumqty
    --| ---------------
    p1| 2        600
    p2| 4        1000
    p3| 1        400
    p4| 2        500
    p5| 2        500
    p6| 1        100
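The query forms above can be tried without sp.q. The following sketch defines a small stand-in table (its name sp2 and its contents are made up for illustration) and shows select with by, exec, and a where clause with two constraints:

```q
/ hypothetical stand-in for the sp table, without foreign keys
sp2:([] s:`s1`s1`s2`s2`s3; p:`p1`p2`p1`p2`p2; qty:300 200 300 400 200)
select total:sum qty by s from sp2      / keyed table, one row per supplier
exec sum qty by s from sp2              / `s1`s2`s3!500 700 200, a dictionary rather than a table
select from sp2 where qty>250, p=`p1    / two rows: s1 p1 300 and s2 p1 300
```

exec is useful when the result is wanted as a list or dictionary for further computation rather than as a table.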

References
[1] http://code.kx.com/wsvn/code/contrib/cburke/start

Startingkdbplus/ipc

3. Q IPC
3.1 Overview
A production kdb+ system may have several q processes, possibly on several machines. These communicate via tcp/ip. Any q process can communicate with any other q process as long as it is accessible on the network and is listening for connections.

- a server process listens for connections and processes any requests
- a client process initiates the connection and sends commands to be executed

Client and server can be on the same machine or on different machines. A process can be both a client and a server.

A communication can be synchronous (wait for a result to be returned) or asynchronous (no wait and no result returned).

3.2 Initialize Server


A q server is initialized by specifying the port to listen on, with either a command line parameter or a session command:

    ..$ q -p 5001       / command line
    q)\p 5001           / session command

3.3 Communication Handle


A communication handle is a symbol that starts with : and has the form:

    `:[server]:port

where the server is optional, and port is a port number. The server need not be given if on the same machine. Examples:

    `::5001                     / server on same machine as client
    `:genie:5001                / server on machine genie
    `:198.168.1.56:5001         / server on given IP address
    `:www.example.com:5001      / server at www.example.com

The function hopen starts a connection, and returns an integer connection handle. This handle is used for all subsequent client requests. For example:

    q)h:hopen `::5001
    q)h "3?20"
    1 12 9
    q)hclose h
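If no server is listening, hopen signals an error. A minimal sketch of a more defensive connection follows, assuming a server may or may not be running on port 5001. It uses the documented two-item form of the hopen argument (handle symbol and timeout in milliseconds) together with protected evaluation; the helper name tryopen is hypothetical.

```q
/ hypothetical helper: returns a connection handle, or a null int if the connection fails
tryopen:{[port]
  @[hopen; (`$"::",string port; 1000); {[e] -1 "connect failed: ",e; 0Ni}]}
h:tryopen 5001
if[not null h; show h "1+1"; hclose h]
```

With a server running, this shows 2 and closes the handle; without one, it prints the failure message and continues.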


3.4 Synchronous/Asynchronous
Where the connection handle is used as defined (it will be a positive integer), the client request is synchronous. In this case, the client waits for the result from the server before continuing execution. The result from the server is the result of the client request.

Where the negative of the connection handle is used, the client request is asynchronous. In this case, the request is sent to the server, but the client does not wait or get a result from the server. This is done when a result is not required by the client.

For example:

    q)h:hopen `::5001
    q)(neg h) "a:3?20"      / send asynchronously, no result
    q)(neg h) "a"           / again no result
    q)h "a"                 / synchronous, with result
    0 17 14

3.5 Message Formats


There are two message formats:

- a string containing any q expression to be executed on the server
- a list (function; arg1; arg2; ...) where the function is to be applied with the given arguments

For example:

    q)h "1 2 3 +/ 10 20"        / send q expression
    31 32 33
    q)h (+/;1 2 3;10 20)        / send function and its arguments
    31 32 33
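Both formats can be explored without a second process. By default a server evaluates an incoming message with value, so in the sketch below value stands in for sending the message over a handle h. The function add is hypothetical, defined here as if it already existed on the server.

```q
add:{x+y}                   / hypothetical function, as if defined on the server
value "add[2;3]"            / string format: 5
value (`add;2;3)            / list format, with the function named by symbol: 5
value ({x*y};4;5)           / list format, with a function supplied by the client: 20
```

The list format avoids building q source text from data, which is usually the safer choice when the arguments come from variables.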

3.6 Http Connections


A q server can also be accessed via http. To try this, run a q server on your machine with port 5001. Then load a web browser and go to http://localhost:5001 [1]. You can now see the names defined in the base context.

References
[1] http://localhost:5001

Startingkdbplus/tables

4.1 Overview
A basic understanding of the internal structure of tables is needed to work with kdb+. The structure is actually quite simple, but very different from conventional databases. This section gives a quick overview, followed by an explanation of the sp.q script, and then a typical table for stock data. After completing this, you should read the page kdbplus database, which has a detailed comparison of kdb+ and conventional RDBMS.

Kdb+ tables are created out of lists. A table with no key columns is essentially a list of column names associated with a list of corresponding column values, each of which is a list. A table with key columns is internally built from a pair of tables: the key columns associated with the non-key columns.

Kdb+ tables are created in-memory, and then written to disk if required. When written to disk, smaller tables can be stored in a single file, while larger tables are usually partitioned in some way. The partitioning can be seen when viewing the file directories, but the table is treated as a single object within a q process.

4.2 Creating Tables


There are two ways of creating a table. One way explicitly associates lists of column names and data; the other uses a q expression that specifies the column names and initial values. The second method also permits each column's datatype to be given, and so is particularly useful when a table is created with no data.

Create table by association:

    q)tab:flip `items`sales`prices!(`nut`bolt`cam`cog;6 8 0 3;10 20 15 20)
    q)tab
    items sales prices
    ------------------
    nut   6     10
    bolt  8     20
    cam   0     15
    cog   3     20

Create table by specifying column names and initial values:

    q)tab2:([]items:`nut`bolt`cam`cog;sales:6 8 0 3;prices:10 20 15 20)
    q)tab~tab2          / tab and tab2 are identical
    1b

The form for the second method, for a table with j primary keys and n columns in total, is:

    t:([c1:v1;...;cj:vj]cj+1:vj+1;...;cn:vn)

Here table t is defined with column names ci and corresponding values vi, for i from 1 to n (so cj+1 is the name of column j+1, the first non-key column). The square brackets are for primary keys, and are required even if there are no primary keys.
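As a small illustration of this form, the sketch below builds a table with one key column and inspects it with built-in functions. The table names and values are made up for the example.

```q
t:([id:`a`b`c] qty:10 20 30; px:1.5 2.5 3.5)    / one key column, two value columns
keys t              / ,`id
cols t              / `id`qty`px
t`b                 / `qty`px!(20;2.5), look up a row by key value
u:0!t               / the same data with no key columns
t~1!u               / 1b, keying the first column again restores t
e:([] sym:`symbol$(); px:`float$())             / empty table with typed columns
count e             / 0
```

meta t and meta e display the column types, which is a convenient way to confirm that an empty table was declared as intended.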


4.3 Suppliers and Parts


The script sp.q defines C.J. Date's Suppliers and Parts database. You can view this script in an editor to see the definitions. Load the script with:

    q)\l sp.q

Table s
Table s has a primary key column, also called s, given as a list of symbols which should be unique. Note that in this example, the name "s" is used both for the table and the primary key column, but this is not required. The remaining columns are of type symbol, integer, symbol.

    s:([s:`s1`s2`s3`s4`s5]
     name:`smith`jones`blake`clark`adams;
     status:20 10 30 20 30;
     city:`london`paris`paris`london`athens)

Display in q:

    q)s
    s | name  status city
    --| -------------------
    s1| smith 20     london
    s2| jones 10     paris
    s3| blake 30     paris
    s4| clark 20     london
    s5| adams 30     athens

Note that the column types are set from the data given. If this were first created as an empty table, say table "t", then the column types could be defined explicitly as follows:

    q)t:([s:`$()]name:`$();status:"I"$();city:`$())

Insert a row:

    q)`t insert (`s1;`smith;20;`london)
    ,0
    q)t
    s | name  status city
    --| -------------------
    s1| smith 20     london

Table p
Table p is created much like table s. As before, the table name and primary key name are both the same:

    p:([p:`p1`p2`p3`p4`p5`p6]
     name:`nut`bolt`screw`screw`cam`cog;
     color:`red`green`blue`red`blue`red;
     weight:12 17 17 14 12 19;
     city:`london`paris`rome`london`paris`london)

Display in q:

    q)p
    p | name  color weight city
    --| -------------------------
    p1| nut   red   12     london
    p2| bolt  green 17     paris
    p3| screw blue  17     rome
    p4| screw red   14     london
    p5| cam   blue  12     paris
    p6| cog   red   19     london

Table sp
Table sp is defined with no primary key. Columns s and p reference tables s and p respectively as foreign keys. The syntax for specifying another table's primary key as a foreign key is:

    `tablename$data

The definition of sp is:

    sp:([]
     s:`s$`s1`s1`s1`s1`s4`s1`s2`s2`s3`s4`s4`s1;
     p:`p$`p1`p2`p3`p4`p5`p6`p1`p2`p2`p2`p4`p5;
     qty:300 200 400 200 100 100 300 400 200 200 300 400)

Display in q:

    q)sp
    s  p  qty
    ---------
    s1 p1 300
    s1 p2 200
    s1 p3 400
    s1 p4 200
    s4 p5 100
    ...
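A foreign key lets a query reach columns of the referenced table with dot notation. The sketch below uses cut-down, hypothetical versions of the two tables (named s2 and sp2 so that they do not overwrite the ones loaded from sp.q):

```q
s2:([s2:`s1`s2`s3] name:`smith`jones`blake; city:`london`paris`paris)
sp2:([] s2:`s2$`s1`s1`s2`s3; qty:300 200 300 200)   / column s2 is a foreign key into table s2
select s2, s2.name, s2.city, qty from sp2           / name and city are fetched through the key
select sum qty by s2.city from sp2                  / london 500, paris 500
```

No join is written out: the foreign key column already records which row of the keyed table each value refers to.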

4.4 Stock Data


The following is a typical layout populated with random data. The definitions are in script trades.q in the start [1] directory. Load as:

    q)\l start/trades.q

A trade table might include: date, time, symbol, price, size, condition code:

    q)trades:([]date:`date$();time:`time$();sym:`symbol$();
        price:`real$();size:`int$(); cond:`char$())
    q)`trades insert (2010.02.21;10:03:54.347;`IBM;20.83e;40000;"N")
    q)`trades insert (2010.02.21;10:04:05.827;`MSFT;88.75e;2000;"B")
    q)trades
    date       time         sym  price size  cond
    ---------------------------------------------
    2010.02.21 10:03:54.347 IBM  20.83 40000 N
    2010.02.21 10:04:05.827 MSFT 88.75 2000  B

The ? verb will generate random data:

    q)syms:`IBM`MSFT`UPS`BAC`AAPL
    q)tpd:100               / trades per day
    q)day:5                 / number of days
    q)cnt:count syms        / number of syms
    q)len:tpd*cnt*day       / total number of trades
    q)date:2010.02.21+len?day
    q)time:"t"$raze (cnt*day)#enlist 09:30:00+15*til tpd
    q)time+:len?1000
    q)sym:len?syms
    q)price:len?100e
    q)size:100*len?1000
    q)cond:len?" ABCDENZ"
    q)trades:0#trades       / empty trades table
    q)`trades insert (date;time;sym;price;size;cond)
    0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 ..
    q)trades:`date`time xasc trades     / sort on time within date
    q)5#trades
    date       time         sym  price    size  cond
    ------------------------------------------------
    2010.02.21 09:30:00.766 UPS  70.38    46900 A
    2010.02.21 09:30:00.801 IBM  89.24799 58600 N
    2010.02.21 09:30:00.942 UPS  38.4812  54600 A
    2010.02.21 09:30:15.116 IBM  25.56824 55700 A
    2010.02.21 09:30:15.224 MSFT 75.97006 23800 E
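Once a trade table exists, typical summaries are one-line queries. The sketch below uses a tiny hand-made table (tr, with made-up values) rather than the random data, so that the results can be checked by hand:

```q
tr:([] sym:`IBM`MSFT`IBM`MSFT; price:20.5 88.5 21.5 89.5; size:100 200 300 200)
/ per symbol: trade count, total volume and volume-weighted average price
r:select n:count i, vol:sum size, vwap:size wavg price by sym from tr
r`IBM               / n is 2, vol is 400, vwap is 21.25
r`MSFT              / n is 2, vol is 400, vwap is 89
```

For IBM the weighted price is (20.5*100 + 21.5*300) % 400, which is 21.25.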

Startingkdbplus/hdb

5.1 Overview
A historical database (hdb) holds data before today, and its tables would be stored on disk as being much too large to fit in memory. Each new day's records would be added to the hdb at the end of day.

Typically, large tables in the hdb (such as daily tick data) are stored splayed, i.e. each column is stored in its own file, see cookbook/splayed tables and kdb+formortals/splayed. Typically also, large tables are stored partitioned by date. Very large databases may be further partitioned into segments, using par.txt.

These storage strategies give best efficiency for searching and retrieval. For example, the database can be written over several drives. Also, partitions can be allocated to slave threads so that queries over a range of dates can be run in parallel. The exact set up would be customized for each installation.

For example, a simple partitioning scheme on a single disk might be as follows. Here, the daily and master tables are small enough to be written to single files, while the trade and quote tables are splayed and partitioned by date:

[Figure: tree.png]
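To make "splayed and partitioned by date" concrete, the following sketch writes one small table as a splayed table inside a date partition. It is an illustration only: the root directory demo_hdb and the table contents are hypothetical, and the buildhdb.q script in the next section may organise its writes differently. The built-in .Q.en enumerates the symbol columns against a sym file in the database root, which is required before a table with symbol columns can be splayed.

```q
/ hypothetical database root `:demo_hdb, created under the current directory
t:([] time:09:30:00.000 09:30:01.000; sym:`IBM`MSFT; price:20.5 88.5)
/ the trailing / in the target path asks set to write one file per column
`:demo_hdb/2010.12.01/trade/ set .Q.en[`:demo_hdb] t
key `:demo_hdb/2010.12.01/trade      / the column files, plus .d holding the column order
```

After writing, loading the root with \l demo_hdb makes trade available with a virtual date column, so that select from trade where date=2010.12.01 reads only that partition.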

5.2 Sample Database


The script buildhdb.q in the start [1] directory will build a sample hdb. This builds a month's random data in directory start/db, and takes a few seconds to run. To do so, load q then:

    q)\l start/buildhdb.q

To load the database, either start q with an argument of the database directory:

    ..$ q start/db

or load the database within a q session:

    q)\l start/db

In q (actual values may vary):

    q)count trade
    342102j
    q)count quote
    1709919j
    q)t:select from trade where date=last date, sym=`IBM
    q)count t
    1041
    q)5#t
    date       time         sym price size stop cond ex
    ---------------------------------------------------
    2010.12.31 09:30:00.055 IBM 55.65 74   0    N    N
    2010.12.31 09:30:00.114 IBM 55.66 72   1    W    N
    2010.12.31 09:30:01.970 IBM 55.56 37   0    T    N
    2010.12.31 09:30:03.091 IBM 55.56 41   1    R    N
    2010.12.31 09:30:06.930 IBM 55.57 89   0    B    N

    q)select count i by date from trade
    date      | x
    ----------| -----
    2010.12.01| 14991
    2010.12.02| 14705
    2010.12.03| 14817
    2010.12.06| 14877
    ...

    q)select[5] cnt:count i,sum size,last price, wprice:size wavg price
        by 5 xbar time.minute from t
    minute| cnt size price wprice
    ------| -----------------------
    09:30 | 42  2138 55.24 55.37768
    09:35 | 22  1329 55.32 55.35988
    09:40 | 23  1279 55.2  55.25091
    09:45 | 16  716  54.99 55.13633
    09:50 | 24  1187 54.82 54.83702

    q)select[-5] open:first price,lo:min price,hi:max price,
        close:last price by 10 xbar time.minute from t
    minute| open  lo    hi    close
    ------| -----------------------
    15:10 | 55.64 55.43 55.64 55.56
    15:20 | 55.56 55.54 55.95 55.95
    15:30 | 55.88 55.61 55.99 55.74
    15:40 | 55.81 55.8  56.18 55.86
    15:50 | 55.84 55.84 56.5  56.38
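The time buckets in the last two queries come from xbar, which rounds each value down to a multiple of its left argument. A short sketch on plain lists and on a small made-up table (tr) shows the effect:

```q
5 xbar 0 3 5 7 11 14                    / 0 0 5 5 10 10
5 xbar 09:31 09:34 09:35 09:39 09:40    / 09:30 09:30 09:35 09:35 09:40
tr:([] time:09:30:02.000 09:33:10.000 09:36:00.000; price:55.1 55.3 55.2)
b:select open:first price, close:last price by 5 xbar time.minute from tr
b                   / bucket 09:30 has open 55.1 and close 55.3, bucket 09:35 has 55.2 and 55.2
```

Grouping by the rounded minute is what turns individual trades into 5 or 10 minute bars.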

5.3 Sample Segmented Database


The buildhdb.q script can be customized to build a segmented database. In practice, database segments should be on separate drives, but for illustration, the segments are here written to a single drive. Both the database root and the location of the database segments need to be specified. For example, edit the first few lines of the script as follows:

    dst:`:start/dbs     / new database root
    dsp:`:/dbss         / database segments directory
    dsx:5               / number of segments
    bgn:2007.01.01      / set 4 years data
    end:2010.12.31

Ensure that the directory given in dsp is created, writable and empty, then load the modified script, which should now take a minute or so. This should write the partitioned data to subdirectories of dsp, and create a par.txt file like:

    /dbss/d0
    /dbss/d1
    /dbss/d2
    /dbss/d3
    /dbss/d4

Restart q, and load the segmented database:

    q)\l start/dbs
    q)(count quote), count trade
    81258538 16248124j
    q)select cnt:count i,sum size,size wavg price from trade
        where date in 2007.11.19+til 5, sym=`IBM
    cnt  size   price
    --------------------
    4213 227283 47.12981

Startingkdbplus/rdb

6.1 Overview
A real-time database (rdb) stores today's data. Typically, it would be stored in memory during the day, and written out to the historical database (hdb) at end of day. Storing the rdb in memory results in extremely fast update and query performance. As a minimum, it is recommended to have RAM of at least 4 times expected data size, so for 5 GB data per day, the rdb machine should have at least 20 GB RAM. In practice, much larger RAM might be used.

6.2 Data Feeds


Data feeds can be any market or other time series data. A feedhandler converts the data stream into a format suitable for writing to kdb+. Feedhandlers are usually written in a compiled language, such as C or C++. In the example described here, the data feed is generated at random by a q process.

6.3 Tickerplant
The data feed could be written directly to the rdb. More often, it is written to a q process called a tickerplant, which may run several actions whenever data is received, for example:

- write all incoming records to a log file
- push all data to the rdb
- push all or subsets of the data to other processes
- run any other q code that should be executed as new data arrives

Other processes would subscribe to a tickerplant to receive new data, and each would specify what data should be sent (all or a selection). The kdb+tick [1] product from Kx is a tickerplant that is recommended for production systems with large volumes of real time data.
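The subscribe-and-push pattern can be sketched in a few lines of q. This is not the kdb+tick implementation and has no logging, filtering or error handling; the names subs, sub, pub and upd are hypothetical. It relies on the built-ins .z.w (the handle of the process that sent the current message) and .z.pc (called when a connection closes).

```q
/ tickerplant side
subs:`int$()                            / handles of subscribed clients
sub:{subs::distinct subs,.z.w; }        / a client calls this to register its own handle
/ push table name t and data x to every subscriber, asynchronously
pub:{[t;x] {[h;t;x] neg[h] (`upd;t;x)}[;t;x] each subs; }
.z.pc:{[h] subs::subs except h; }       / forget a client when its connection closes

/ client side, assuming the tickerplant listens on port 5010:
/   upd:{[t;x] t insert x}              / what to do with each pushed batch
/   h:hopen `::5010
/   h "sub[]"
```

Each subscriber defines its own upd, which is how different processes can build different tables from the same feed.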

6.4 Example
The scripts in start/tick [2] run a simple tickerplant/rdb configuration. Note that they are not suitable for production use (no logging, error handling, end of day roll over etc). The layout is:

                    feed
                     |
                tickerplant
          /    /     |    \    \     \
        rdb  vwap  hlcv   tq  last  show
        /\    /\    /\    /\   /\
            ... client applications ...

Here:

- feed is a demo feedhandler that generates random trades and quotes and sends them to the tickerplant. In practice, this would be replaced by real feedhandlers.
- The tickerplant gets data from feed and pushes it to clients that have subscribed. Once the data is written, it is discarded.
- The rdb, vwap, hlcv, tq and last processes are databases that have subscribed to the tickerplant. Note that these databases can be queried by a client application.
  - rdb has all of today's data
  - vwap has volume weighted averages for selected stocks
  - hlcv has high, low, close, volume for selected stocks
  - tq has a trade and quote table for selected stocks. Each row is a trade joined with the most recent quote.
  - last has the last entries for each stock in the trade and quote tables
- The show process displays the incoming feed for selected stocks.

Note that all the client processes load the same script file cx.q, with a parameter that selects the corresponding code for the process in that file. Alternatively, each process could load its own script file, but since the definitions tend to be very short, it is convenient to use a single script for all. See c.q [3] for more examples (written for kdb+tick).

6.5 Running the Demo


The start/tick [2] scripts run the demo, which should display each q process in a separate window. If necessary, update for the actual directories used.

- In Windows, call start/tick/run.bat.
- In Linux/Gnome, call start/tick/run.sh.
- In any other system, either modify the scripts for your environment or start the processes manually, see next section.

The calls starting each process are essentially:

1. tickerplant - the parameter ticker.q is the script defining the tickerplant, and the port is 5010:

    ..$ q ticker.q -p 5010

2. feed - connects to the tickerplant and sends a new batch every 107 milliseconds:

    ..$ q feed.q localhost:5010 -t 107

3. rdb - the parameter cx.q defines the realtime database and its own listening port (similarly for other databases):

    ..$ q cx.q rdb -p 5011

4. show - the show process, which does not need a port:

    ..$ q cx.q show

6.6 Running Processes Manually


If the run scripts are unsuitable for your system, then you can call each process manually. In each case, open up a new terminal window, change to the q directory and enter the appropriate command. The tickerplant should be started first.

For example on a Mac, for each of the following commands, open a new terminal, change to ~/q, then call the command:

    m32/q start/tick/ticker.q -p 5010
    m32/q start/tick/feed.q localhost:5010 -t 107
    m32/q start/tick/cx.q rdb -p 5011

Refer to run1.sh for the remaining processes.


6.7 Process Examples


Set focus on the last window, and view the trade table. Note that each time the table is viewed, it will be updated with the latest data:

    q)trade
    sym | time         price size stop cond ex
    ----| ------------------------------------
    AIG | 14:26:48.844 27.62 18   0    Z    O
    DELL| 14:26:49.058 11.83 57   0    K    N
    DOW | 14:26:49.058 19.81 69   1    G    O
    ...

Set focus on the vwap window, and view the vwap table. Note that the "price" is actually price*size. This can be updated much more efficiently than storing all prices and sizes:

    q)vwap
    sym | price    size
    ----| -------------
    IBM | 42153.14 998
    MSFT| 51620.66 717

To get the correct weighted average price:

    q)select price%size,size by sym from vwap
    sym | price    size
    ----| --------------
    IBM | 41.74374 31824
    MSFT| 73.38304 28612
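The reason for storing price*size can be seen with two small batches. The sketch below is not the code in cx.q: the batches b1 and b2 and the helper tot are made up, and the running column is named pv here to make clear that it is not a price. Because sums can be added batch by batch, the weighted average over all trades is recovered from the totals alone.

```q
b1:([] sym:`IBM`MSFT`IBM; price:10 20 12f; size:100 50 100)     / first batch of trades
b2:([] sym:`IBM`MSFT; price:11 22f; size:200 50)                / second batch
/ hypothetical helper: per-symbol totals of price*size and size for one batch
tot:{select pv:sum price*size, size:sum size by sym from x}
run:tot[b1]+tot b2                              / keyed tables with the same keys add row by row
res:select sym, avgprice:pv%size from run       / IBM 11, MSFT 21
chk:select avgprice:size wavg price by sym from b1,b2   / the same averages from all trades at once
```

Only two numbers per symbol need to be kept, however many trades arrive.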

References
[1] http://kx.com/kdb+tick.php
[2] http://code.kx.com/wsvn/code/contrib/cburke/start/tick
[3] http://kx.com/q/tick/c.q

Article Sources and Contributors

Startingkdbplus/introduction Source: http://code.kx.com/mediawiki/index.php?oldid=2528 Contributors: Chris Burke, Simon Garland
Startingkdbplus/qlanguage Source: http://code.kx.com/mediawiki/index.php?oldid=2530 Contributors: Chris Burke, Colm Earley
Startingkdbplus/ipc Source: http://code.kx.com/mediawiki/index.php?oldid=2481 Contributors: Chris Burke
Startingkdbplus/tables Source: http://code.kx.com/mediawiki/index.php?oldid=2289 Contributors: Charlie Skelton, Chris Burke
Startingkdbplus/hdb Source: http://code.kx.com/mediawiki/index.php?oldid=2274 Contributors: Chris Burke
Startingkdbplus/rdb Source: http://code.kx.com/mediawiki/index.php?oldid=2285 Contributors: Chris Burke

Image Sources, Licenses and Contributors



File:tree.png Source: http://code.kx.com/mediawiki/index.php?title=File:Tree.png License: unknown Contributors: -

License

terms and conditions TermsAndConditions