
LSORT 4.04 (C) Copyright London Computing, 1983-1999. All rights reserved.

What is LSORT

LSORT is a general purpose sort/merge utility written in Microsoft Visual C++ for Microsoft Windows NT 3.1 and above and for Windows 95. It runs on IBM PCs and compatibles with at least 16MB of RAM and a fixed disk.

LSORT is User Supported Software. If this program proves useful, please make a contribution ($35 suggested) to:

    London Computing
    P.O. Box 696
    Cherry Hill, NJ 08003

Support and the latest versions of LSORT are available at www.londoncomputing.com. You can email support questions to: londoncomputing@abac.com

Anyone sending a contribution will receive a disk containing the source code to LSORTNT as well as a copy of the LSRTNT sort filter. LSRTNT is similar to the SORT filter but works much faster and will sort on multiple fields. Once you have registered, you will be able to write user exits to LSORT using Visual C++ 5.0 or later.

You may make copies of this software and distribute them to other users as long as there is no charge or other consideration and this notice is not removed or bypassed.

LSORTNT will sort MSDOS, Windows NT and OS2 files and dBase II and dBase III databases. (dBase III memo files and FOXPRO memo files are not sorted, but .DBF files will be sorted.) Each file may be sorted using 1 to 32 sort fields.

The file to be sorted may contain fixed length records, variable length records or comma delimited records. Variable length records are records ending with cr/lf. Comma delimited records are variable length records where the fields are also variable length and separated by a comma. Character fields may be enclosed in either single or double quotes.

LSORT will merge up to 5 files using 1 to 32 sort fields. dBase databases may not be merged.

Any field may be sorted in either ascending or descending sequence. LSORT allows three user defined field types: X, Y and Z. You must write your own comparison subroutine to compare user defined fields.
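To make the comma delimited record format described above more concrete, here is a minimal sketch in C of pulling one field out of such a record. The function name get_field and its parameters are hypothetical and are not part of LSORT. The sketch assumes that a quote character only opens a quoted field when it is the first character of that field, and that a delimiter inside quotes does not end the field.

```c
#include <stddef.h>

/* Hypothetical helper, not part of LSORT.
   Copy field number n (counting from 1) of a delimited record into out.
   Returns the field length, or -1 if the record has fewer than n fields.
   The record ends at cr, lf or a 0 byte. */
int get_field(const char *rec, int n, char delim, char *out, size_t outsz)
{
    int field = 1;       /* number of the field currently being scanned */
    int at_start = 1;    /* true while at the first character of a field */
    char quote = 0;      /* active quote character, or 0 if none */
    size_t len = 0;
    const char *p;

    if (n < 1 || outsz == 0)
        return -1;
    for (p = rec; *p != '\0' && *p != '\r' && *p != '\n'; p++) {
        if (quote) {
            if (*p == quote)
                quote = 0;                    /* closing quote */
            else if (field == n && len + 1 < outsz)
                out[len++] = *p;              /* delimiters are data here */
        } else if (at_start && (*p == '"' || *p == '\'')) {
            quote = *p;                       /* opening quote */
        } else if (*p == delim) {
            if (field == n)
                break;                        /* wanted field is complete */
            field++;
            at_start = 1;
            continue;
        } else if (field == n && len + 1 < outsz) {
            out[len++] = *p;
        }
        at_start = 0;
    }
    if (field < n)
        return -1;
    out[len] = '\0';
    return (int)len;
}
```

For a record such as 10,'Smith, John',"NJ" a call with n set to 2 and delim set to ',' copies Smith, John into out, because the comma inside the quotes is treated as data.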
The sort knows about the following field types. The field type code, where listed, is shown at the left.

  B   binary fields (to 127 bytes)
      A binary field is compared left to right based on the value of the code in each byte (0-255). It is useful for comparing strings where binary zeros are embedded and for comparing IBM Mainframe style binary numbers.

      packed decimal fields (1-8 bytes)
      Stored as on IBM Mainframe computers. Each digit position is stored in 4 bits as a binary value between 0 and 9. The digits are stored left to right with the rightmost position containing a sign, 0x0D for negative, 0x0C or 0x0F for positive. A packed decimal field can store between 1 and 15 digits depending on the length of the field. If an invalid sign is specified, the sort won't produce what you would expect. Packed decimal values are only meaningful in fixed length records.

  C   character fields (to 127 bytes)
      Character fields compare up to the first binary zero in the field, following C language conventions.

  U   upper case character fields
      Sort fields are translated to upper case before the compare.

  I   2 byte integers in internal format

  L   4 byte integers in internal format

  F   floating point numbers (IEEE)

  D   double precision floating point (IEEE)

  N   zoned decimal numbers
      Text format numbers; decimals are allowed. LSORT now supports scientific notation as well, using E notation, e.g. .98 == 9.8E-1. xBase N and F fields are sorted as type N.

  T   1 byte logical fields (dBase II or III)

  2   2 position year field
      Used as part of a date to be sorted, i.e. as the yy part of a yymmdd date. If the value of yy is > 30, it is assumed to be a year in 1900. If the value of yy is <= 30, it is assumed to be a year in 2000. This is the rule of 30.

      6-8 position mm/dd/yy date
      The delimiter can be any punctuation character, e.g. period, comma, slash, ... This date field is considered Y2K compliant using the rule of 30 as above. yy, mm and dd may be either 1 or 2 positions long. Leading and trailing spaces are ignored.

      6-8 position dd/mm/yy date
      Delimiter, rule of 30 and field lengths as for mm/dd/yy.

      6-8 position yy/mm/dd date
      Delimiter, rule of 30 and field lengths as for mm/dd/yy.

      6-10 position mm/dd/yyyy date
      The delimiter can be any punctuation character as above. dd and mm may be either 1 or 2 positions long. yyyy may be from 1-4 positions long. Leading and trailing spaces are ignored.

      6-10 position dd/mm/yyyy date
      Delimiter and field lengths as for mm/dd/yyyy.

      6-10 position yyyy/mm/dd date
      Delimiter and field lengths as for mm/dd/yyyy.

  X   User defined field type X

  Y   User defined field type Y

  Z   User defined field type Z
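Two of the encodings in the field type list can be illustrated with a short C sketch: decoding a packed decimal field and expanding a 2 position year with the rule of 30. The function names packed_to_ll and expand_year are hypothetical. This is not LSORT's own code, only a sketch of the formats as described above.

```c
#include <stddef.h>

/* Hypothetical helper, not part of LSORT.
   Decode an IBM style packed decimal field of len bytes (1-8).
   Every nibble holds one digit except the last, which holds the sign:
   0x0D is negative, 0x0C or 0x0F is positive.
   Returns 0 and stores the value in *out, or -1 if a digit or the
   sign nibble is invalid. */
int packed_to_ll(const unsigned char *p, size_t len, long long *out)
{
    long long v = 0;
    unsigned sign;
    size_t i;

    if (len < 1 || len > 8)
        return -1;
    for (i = 0; i < len; i++) {
        unsigned hi = p[i] >> 4;
        unsigned lo = p[i] & 0x0F;

        if (hi > 9)
            return -1;
        v = v * 10 + hi;
        if (i < len - 1) {           /* low nibble of the last byte is the sign */
            if (lo > 9)
                return -1;
            v = v * 10 + lo;
        }
    }
    sign = p[len - 1] & 0x0F;
    if (sign == 0x0D)
        v = -v;
    else if (sign != 0x0C && sign != 0x0F)
        return -1;
    *out = v;
    return 0;
}

/* Rule of 30: yy <= 30 is taken as 20yy, yy > 30 as 19yy. */
int expand_year(int yy)
{
    return yy <= 30 ? 2000 + yy : 1900 + yy;
}
```

A 3 byte field containing 0x12 0x34 0x5D decodes to -12345, which shows how an n byte field holds 2n-1 digits (15 digits for the 8 byte maximum).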

A zoned decimal number is stored as a character string and may contain leading and trailing spaces, a minus sign, a decimal point and digits.

NOTE: zoned decimal numbers and comma delimited files sort very slowly! The only reasonable field types for comma delimited files are C or N. LSORT will accept other field types, but the results are undefined.

Sorting Dates:

Until the Y2K issues became important, it was not necessary for LSORT to deal specifically with date fields. Release 4.02 of LSORT added a Y2K compliant 2 character year field to handle dates beyond the year 2000. The internal representation of date fields is generally application dependent and can be simulated using other field types:

  o Unix style date fields (seconds since 1970) can be sorted as a 4 byte binary integer.

  o xBase style date fields can be sorted as the appropriate sized floating point number.

  o yyyymmdd fields (stored as character strings) can be sorted as an 8 byte character string.

  o mm/dd/yyyy and dd/mm/yyyy date fields can be sorted as 3 character strings where the yyyy field is sorted first, the mm field is sorted next and the dd field is sorted last.

  o dd/mm/yy, mm/dd/yy and yymmdd date fields can be sorted as 3 fields where the yy field is sorted first, the mm field is sorted next and the dd field is sorted last. If all dates are part of the 20th Century (19yy), use a two position character field for the year. If the dates can be in either the 20th or 21st centuries (19yy or 20yy), use the new Y2K compliant 2 position year field instead, which will translate yy values of 30 or less to 20yy dates and yy values over 30 to 19yy dates.

  o dd/mm/yy, mm/dd/yy and yy/mm/dd fields can be sorted using the special field types for these date fields. When the special types are used, the dd, mm and yy fields may be either 1 or 2 positions long and may contain a leading or trailing space. The rule of 30 is used to make the dates Y2K compliant: yy values of 30 or less are changed to 20yy dates and yy values over 30 to 19yy dates.

  o dd/mm/yyyy, mm/dd/yyyy and yyyy/mm/dd fields can be sorted using the special field types for these dates. When the special types are used, the dd and mm fields may be either 1 or 2 positions long and may contain a leading or trailing space. The yyyy field may be 1-4 positions long and may contain leading or trailing spaces.

LSORT can be run as a Windows application or as a command line application. The maximum record length is 4096 bytes. Files will be sorted in memory if possible. Files larger than available memory are sorted in pieces and then merged together.

Running the LSORTNT console application:

SYNTAX:

    LSORTNT [flags] @sort.inp
        All sort specifications are stored in sort.inp.
  or
    LSORTNT [flags] sort specifications
        Takes the specifications given on the command line.

    LSORTNT -R
        Restarts a sort.

Flags:

    -R       -- Restart an existing sort.
    -V       -- Verify that all delimited fields are present.
    -D="x"   -- Use character x as the field delimiter for delimited files.
    -W       -- Display sort statistics at end of sort.
    -Q       -- Quiet mode. Do not display any messages except error messages while running LSORT.
    -B       -- Batch mode. Do not prompt for any information when running LSORT. Send error messages to STDERR or the logfile instead of opening a message box.
    -Uxx     -- Use xx amount of memory for sorting. If xx < 100, it is the percentage of system memory to use. If xx > 100, it is the number of KB to use.
    -Fxx     -- Leave xx amount of memory free for use by the system. If xx < 100, it is the percentage of memory to leave for other users, otherwise it is the number of KB to leave free for other uses.
    -Llogfi  -- logfi is the name of a log file showing LSORT progress. If not present, LSORT.LOG is used.

Sort Specifications:

You must specify either a SORT or MERGE operation. If you ask for a SORT, you may tell LSORTNT to use either a QUICKSORT or HEAPSORT for internal sorting. You will also be asked to specify two devices to hold merge files if any are needed. Merge files may be placed on floppy disk, hard disk or RAM disk. The specified drive must be large enough to hold the entire input file.

If you specify SORT or MERGE, you must enter your input file(s) and output file as well as the definition of the key fields to be used in the comparisons. Fields are specified by their starting position and length. The types of fields have been listed above.

The sort specifications must be entered on the command line or in the redirection file in the order requested by LSORTNT. Each parameter should be separated from the others with one or more spaces. The sort needs the following information in the order shown:

  Type of Sort:
      S -- for QUICKSORT
      H -- for HEAPSORT

  Merge Drive 1:
      You may reply with any drive letter, although it is best to specify a fixed disk (if any).

  Merge Drive 2:
      This should be different from drive 1 if you are using floppy disks, but should be a fixed disk if you have one.

  Name of input file:
      You may specify any name including drive letter and path. Specify :X to use a user specified input routine.

  Name of output file:
      See above. Specify :X to use a user specified output routine.

  File Type (unless you are sorting a dBase file):
      F nnnn -- Fixed length file (all records are the same length); nnnn is the length of each record.
      V      -- Varying length file (records must end with CR LF).
      D      -- Delimited file.

You must then enter field definitions. Each field definition has four parts:

      starting position (from 1) or starting field (delimited files)
      field length (in bytes) (no prompt for delimited files)
      field type (see above list of valid types)
      sort order (A -- Ascending, D -- Descending)

In order to work as efficiently as possible, LSORTNT does not check the starting position of a field against the actual length of a record. If some field starts past the end of a record (e.g.
sort field 1 starts in column 10 but the record is only 8 bytes long), the results will be undefined and most certainly not what you want. Please be careful.

Enter a '0' for the starting position to end the prompt for field definitions.

If you are sorting a dBase file, you may specify a field by name, in which case you will only need to enter the sort order {A|D}. You may also enter starting position, length, type and order as above.

Example 1:

Sort file test.dat on positions 1-5, character, ascending and positions 6-7, binary integer, descending. Use drive C for the work files and put the sorted file in test.srt. Issue the following command:

    LSORTNT S C C test.dat test.srt V 1 5 C A 6 2 H D 0

    S          sort using quicksort
    C          merge drive 1
    C          merge drive 2
    test.dat   input file name
    test.srt   output file name
    V          file type
    1 5 C A    sort field 1 starts at byte 1, is a 5 byte character string, ascending
    6 2 H D    sort field 2 starts at byte 6, is a 2 byte long integer, sorted descending
    0          ends list of sort fields
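The key definition in the example above can be read as a record comparison. The following C sketch shows one way to express it; the function compare_example1 is hypothetical and is not how LSORT is implemented. It assumes every record holds at least 7 bytes and that the integer key is a 2 byte short in the machine's native byte order.

```c
#include <string.h>

/* Hypothetical comparison for the two keys of the example:
   bytes 1-5 as a character field, ascending, then
   bytes 6-7 as a 2 byte integer, descending.
   Returns -1, 0 or 1 in the style expected by qsort(). */
int compare_example1(const void *ra, const void *rb)
{
    const unsigned char *a = (const unsigned char *)ra;
    const unsigned char *b = (const unsigned char *)rb;
    short ia, ib;
    int c;

    /* Key 1: strncmp stops at a binary zero, like a character field. */
    c = strncmp((const char *)a, (const char *)b, 5);
    if (c != 0)
        return c < 0 ? -1 : 1;

    /* Key 2: copy out the integers because a field that starts at
       byte 6 of a record is not guaranteed to be aligned. */
    memcpy(&ia, a + 5, sizeof ia);
    memcpy(&ib, b + 5, sizeof ib);

    /* Descending order: the larger value sorts first. */
    return ia > ib ? -1 : ia < ib ? 1 : 0;
}
```

With fixed 7 byte records held in one array, such a function could be passed directly to the standard qsort() routine. Keys are compared in the order they are defined, and a later key only matters when all earlier keys are equal.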

Merge Specification:

Enter 'M' to indicate the merge operation. You will be asked to enter the number of files to be merged followed by the 1-5 files to be merged. They are entered one at a time. You will be asked to enter a file type, output file and a field list as above.

Example:

Merge files t1.dat, t2.dat and t3.dat on positions 4-7, defined as a character field, ascending.

    LSORTNT M 3 t1.dat t2.dat t3.dat test.mrg V 4 4 c a 0 y y

    M          do a merge
    3          merge 3 files
    t1.dat     input file 1
    t2.dat     input file 2
    t3.dat     input file 3
    test.mrg   output file
    V          file type
    4 4 c a    merge field 1
    0          end of list of merge fields
    y y        response to mount messages

Restarting:

If a sort stops in the middle due to lack of space, or because you pressed ^BREAK, it may be restarted by issuing the LSORTNT -R command, provided the dataset(s), SORTPARM.DAT (and DB3PARM.DAT, for dBase III files only) are still available and all LSMERGE?.DAT files are still available. The sort will be restarted at the beginning of the LSSORT phase (where the input file is read and sorted) or at the beginning of an LSMERGE pass, where several partially sorted files are combined.

User Exits:

You may define your own user exits to read and write data, and you may define your own compare routines for the standard field types or for user defined field types. These routines must be written in Microsoft Visual C++ 5.0 or in any other language that can be linked to Microsoft Visual C++. User Exits are only available to registered users, who will receive the source and object files for LSORT.

User Input: (Available for Sorting Only)

Specify :X as the name of the input file. LSORTNT uses a routine named USERIP to read the records to be sorted. You may write your own version of USERIP and link it with LSORT to create a custom version containing your own input routine. USERIP is used as follows:

    int l, userip();
    char buffer[...];

    l = userip(buffer);

USERIP must return the length of the record read, which must be <= 4096, or -1 for end of file. If you have specified V type files, USERIP must return a string ending with a '\0'. The string length must include the trailing '\0'.

User Output:

Specify :X as the name of the output file. LSORT uses a routine called USEROP to write to the :X file. You may write your own user output routine to write the final sorted or merged output by creating a custom version of USEROP and relinking LSORT to create a custom LSORT. USEROP works as follows:

    int buflen;
    char buffer[...];

    userop(buffer,buflen);   /* userop must write buflen bytes from buffer */
                             /* buflen == 0 means that you want to write a 0 terminated string */
    userop(NULL,-1);         /* userop must perform end of file processing */
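As an illustration of the USERIP and USEROP contracts described above, the following C sketch drives a pair of in-memory stand-ins the way a caller might: read until -1, sort, write each record, then signal end of file. The run_exits driver, the demo_ variables and the MAXRECS limit are all hypothetical. How LSORT itself calls the exits internally is an assumption here, not something this manual states.

```c
#include <stdlib.h>
#include <string.h>

#define MAXREC  4096   /* maximum record length given in this manual */
#define MAXRECS 16     /* hypothetical limit for this sketch only */

static const char *demo_input[] = { "pear\n", "apple\n", "kiwi\n" };
static size_t demo_next = 0;
static char demo_output[256];
static int demo_closed = 0;

/* Stand-in userip for V type records: returns the length including
   the trailing '\0', or -1 at end of file. */
int userip(char *buffer)
{
    if (demo_next >= sizeof demo_input / sizeof demo_input[0])
        return -1;
    strcpy(buffer, demo_input[demo_next++]);
    return (int)strlen(buffer) + 1;
}

/* Stand-in userop: buflen > 0 writes buflen bytes, buflen == 0 writes
   a 0 terminated string, (NULL, -1) means end of file processing. */
int userop(char *buffer, int buflen)
{
    if (buffer == NULL || buflen == -1) {
        demo_closed = 1;
        return 0;
    }
    if (buflen == 0)
        strcat(demo_output, buffer);
    else
        strncat(demo_output, buffer, (size_t)buflen);
    return 0;
}

/* qsort() callback: compare two record pointers as C strings. */
static int by_text(const void *a, const void *b)
{
    return strcmp(*(char * const *)a, *(char * const *)b);
}

/* Hypothetical driver: read every record, sort, write, then close.
   Returns the number of records processed. */
int run_exits(void)
{
    static char pool[MAXRECS][MAXREC];
    char *recs[MAXRECS];
    int n = 0, i;

    while (n < MAXRECS && userip(pool[n]) != -1) {
        recs[n] = pool[n];
        n++;
    }
    qsort(recs, (size_t)n, sizeof recs[0], by_text);
    for (i = 0; i < n; i++)
        userop(recs[i], 0);
    userop(NULL, -1);
    return n;
}
```

The point of the sketch is the calling sequence: userip is called repeatedly until it returns -1, and userop receives one final call with NULL and -1 so that it can close its output.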

Sample versions of userip and userop appear below:

    /* Userip to return a varying length string */
    #include <stdio.h>
    #include <string.h>

    int userip(char *s)
    {
        static char firsttime = 1;
        static FILE *inchan;

        /* s is the record buffer, max length 4k, 4k always available */
        /* this routine must return the length of the string, */
        /* or -1 at end of file */
        /* (note: the length of the string includes the 0 byte at end) */
        if (firsttime) {
            firsttime = 0;
            inchan = fopen("usertest.dat", "r");
        }
        if (inchan != NULL && fgets(s, 4096, inchan))
            return (int)strlen(s) + 1;
        else
            return -1;
    }

    /* Userip to return a fixed length record */
    #define STRLEN 128
    #include <stdio.h>

    int userip(char *s)
    {
        static char firsttime = 1;
        static FILE *inchan;

        /* s is the record buffer, max length 4k, 4k always available */
        /* this routine must return the record length, or -1 at end of file */
        if (firsttime) {
            firsttime = 0;
            inchan = fopen("usertest.dat", "rb");   /* binary: no cr/lf translation */
        }
        if (inchan != NULL && fread(s, 1, STRLEN, inchan) == STRLEN)
            return STRLEN;
        else
            return -1;
    }

    /* Userop to write F type or V type records */
    #include <stdio.h>

    int userop(char *s, int l)
    {
        /* s is string to write, l is length or 0 if 0 terminated or -1 for close */
        static char firsttime = 1;
        static FILE *otchan;

        if (firsttime) {
            firsttime = 0;
            otchan = fopen("usertest.srt", "w");
        }
        if (otchan == NULL)
            return -1;                  /* output file could not be opened */
        if (l == -1 || s == NULL)
            fclose(otchan);
        else if (l)                     /* write an F type record */
            while (l--)
                fputc(*s++, otchan);
        else
            fputs(s, otchan);           /* write a V type record */
        return 0;
    }

User Compare Routines:

You may define up to three user defined field types: X, Y and Z. You must write a compare routine for each field type used. The routine names are:

    sxcmp -- for field type X.
    sycmp -- for field type Y.
    szcmp -- for field type Z.

The compare routines are called with three arguments: the address of the first field, the address of the second field and the field length. The routine must return -1 if field1 < field2, 0 if field1 == field2 and 1 if field1 > field2. Sample routines are shown below:

    int sxcmp(long int *a, long int *b, int l)
    {
        /* this routine compares two long integers */
        /* comparing directly avoids overflow in *a - *b */
        return *a < *b ? -1 : *a == *b ? 0 : 1;
    }

    int sycmp(short *a, short *b, int l)
    {
        /* this routine compares two integers (2 bytes) */
        return *a < *b ? -1 : *a == *b ? 0 : 1;
    }

    int szcmp(float *a, float *b, int l)
    {
        /* this routine compares two floating numbers */
        return *a < *b ? -1 : *a == *b ? 0 : 1;
    }
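When writing a compare routine of your own, two details are worth care: subtracting two integers to get the result can overflow for values far apart, and a field that starts at an arbitrary byte of a record may not be aligned for direct access through an integer pointer. The sketch below is one hedged way to handle both for a 4 byte signed integer field. It uses the name sxcmp and follows the return convention of the sample routines (-1 when the first field is smaller), but the void pointer parameters and the use of int32_t are assumptions, not LSORT's documented interface.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical user compare routine for field type X, treating the
   field as a 4 byte signed integer in native byte order.
   Returns -1, 0 or 1. The length argument is unused here. */
int sxcmp(const void *a, const void *b, int l)
{
    int32_t x, y;

    (void)l;
    /* Copy the bytes out instead of dereferencing an integer pointer,
       since the field may start at any offset within the record. */
    memcpy(&x, a, sizeof x);
    memcpy(&y, b, sizeof y);

    /* Compare directly; x - y could overflow for extreme values. */
    return x < y ? -1 : x > y ? 1 : 0;
}
```

Comparing the smallest possible value against 1 returns -1 with this routine, whereas computing the difference of those two values would not fit in the result type.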
