Assembly Language Extensions of Visual Prolog for Data Mining of Huge Data
by George A. Stathis (ENB Ltd)


Uploaded on authorSTREAM by gstathis

A lecture and presentation at the
"ALC Visual Prolog Programming Conference"

(Faro / Portugal, April 2006)

Shortcuts: Click here for "Date Binaries", here for the ASM Predicate Libraries, here for the Appendices, here for "Algorithm Iphigenia" and here for "Multiple Form Logic" (the theory behind “Algorithm Iphigenia”). (best viewed with FIREFOX browser)
ABSTRACT:
  • Pure Assembly Language is the fastest way to implement Visual Prolog predicates that can achieve intelligent processing of huge data.
  • High-level logic programming power is then combined with extremely low-level speed enhancements.
  • A number of special techniques emerge that combine the best of both worlds: Visual Prolog and Assembler.


INTRODUCTION

One important reason that discourages the use of Prolog (and Visual Prolog) in industry is the inherent difficulty of traditional recursive Prolog programming techniques to deal with huge data adequately. Most Prolog textbooks are prolific with quite elegant list-processing predicates, that are adequate for lists of a few hundred (or a few thousand) elements, at most. When the number of list-elements exceeds a certain limit, most of these ‘elegant’ predicates (even using the best “tail recursion optimization”) tend to cause crashes. Mainstream languages (such as Visual Basic or C++/C) tend to have simple ways of avoiding such problems (the use of iteration instead of recursion, etc.) that reinforce certain common prejudices against the use of Prolog, while underestimating the advantages and the high-level expressiveness of Prolog (beyond doubt in academic circles).

In this paper we present an overview of very specific methodologies that were used in real-world problems, producing quite adequate -or even impressive- results by the combined use of Visual Prolog and Assembly Language.

  • Combining Visual Prolog with Assembly (or with any other programming language) is not at all difficult to implement, from the point of view of linking. Any predicates implemented in another programming language and linked to Visual Prolog appear and behave exactly as if they were implemented in Visual Prolog.

Visual Prolog has been designed with serious optimizations in mind: Whenever possible, it tries to avoid unnecessary computations (that other Prolog compilers perform all the time) if these computations are not required for the operation of each specific (e.g. deterministic) predicate. All linked “foreign language code” works seamlessly inside native Prolog code. As a result, the linking of Assembly language subroutines “as if they are Prolog predicates” is efficient from the outset, and these foreign predicates run as fast as they would in any other (“mainstream”) developer environment.


PREDICATE LIBRARIES

At the time of writing (March 2006) there are more than 1100 Visual Prolog predicates that have been implemented in Assembler, as well as more than 150 predicates implemented as ‘C’ glue-functions. In addition, about 20 new predicates are added to these libraries every month (depending on the practical needs of programming projects). There are families of such predicates dealing with lists, strings and characters, “binaries”, CSV Excel-files (texts delimited by separators), time-series, dates, tree-like data structures and other objects. All these predicates originated from real needs in real software applications. The efficiency and speed of these applications destroys the myth of Prolog being “just an academic language”, while outperforming commercial applications of similar nature developed using mainstream languages by a significant factor, typically of an order of magnitude - and quite often more.
 

WORK ORGANIZATION

This work has been organized on the basis of the fact that Assembly Language code is not easy to modify or debug unless it is properly commented, and it is not useable at all unless the comments include instructions for calling each routine externally as a Prolog predicate.


CONVERSIONS OF STRINGS TO LISTS

Such conversions are typically necessary for 95% of current Prolog projects. The growing need for such predicates led to about a dozen of them implemented in Assembly. Their full global definitions are listed in Appendix A: This library includes customized predicates implementing many such conversions, from strings to lists -and vice versa- in a variety of possible ways.  

When dealing with large texts, our most basic need is to copy them inside main memory for further processing. Unfortunately, it has been verified that -after certain specific list or text size limits- almost any Prolog predicate that uses recursion is useless for simple tasks, like converting texts to lists!

This situation can (of course) be improved: If we don’t want to use Assembly language, it is certainly possible to create optimized Visual Prolog predicates that minimize the use of the stack, so that texts of somewhat larger sizes (than before) can be converted, at somewhat improved speeds. 

Visual Prolog includes implementations of (non-ISO) predicates for string-handling that can access the memory of strings (or lists) directly. Examples of code using these built-in predicates are already given inside Visual Prolog’s documentation for advanced users, so they won’t be repeated here. The important thing, however, is that even such optimized coding, impossible to do in most other Prolog compilers (which lack those extra predicates) and possible only in Visual Prolog, inevitably and very quickly hits against some new limits of speed and memory, that can be shown to be (this time) unsurpassable. 

One of our most highly optimized conversion predicates is ‘strx2list’: It can convert strings (typically large texts, which can be of several megabytes size) to Visual Prolog string-lists. Here is a table of measured execution times, for the optimized predicate “strx2list”, in all its possible variants and memory methods. 

Size

List-Size

(A)

(B)

(C)

(Mb):

elements

stack

alloc.

Heap

alloc.

Pre-

alloc.

1

100,000

0.01

<0.01

<0.01

2

200,000

0.02

0.01

<0.01

3

300,000

0.03

0.01

0.02

4

400,000

0.03

0.03

0.02

5

500,000

Crash!

0.03

0.02

6

600,000

-

0.05

0.03

7

700,000

-

0.06

0.03

8

800,000

-

0.06

0.03

10

1,000,000

-

0.09

0.05

20

2,000,000

-

0.17

0.09

30

3,000,000

-

0.26

0.14

40

4,000,000

-

0.36

0.18

50

5,000,000

-

0.47

0.23

60

6,000,000

-

0.55

0.28

70

7,000,000

-

0.66

0.36

80

8,000,000

-

0.75

0.39

90

9,000,000

-

0.84

0.44

100

10,000,000

-

0.88

0.48

110

11,000,000

-

1.00

0.50

120

12,000,000

-

1.02

0.52

130

13,000,000

-

1.17

0.61

140

14,000,000

-

1.29

0.64

150

15,000,000

-

1.36

0.72

160

16,000,000

-

1.45

0.74

170

17,000,000

-

1.56

0.80

180

18,000,000

-

2.31

1.48

190

19,000,000

-

3.30

2.12

200

20,000,000

-

8.06

5.08

……

….

250

25,000,000

-

20.45

19.58


These timings show an apparently impressive speed: -It is evidently possible to convert a 100Mb text of 10 million lines to a list (of 10 million elements) in much less than 1 second! 

The timings (in seconds) of columns A, B, C refer to literally millions or tens of millions of elements! Alas, it was discovered that (after a few hundred thousand elements) most high-level language code does not work (Prolog, as well as any language). Optimised Prolog code executes at about one order of magnitude (i.e. 10 times) slower than optimised Assembly for the restricted small range of text and list sizes where it does work.

Now, although these results are correct (even at higher data-sizes than those shown in the table) at a turning point of about 250 Mb of text (and 25 million elements) the execution time of the ASM-predicates increases steeply and unpredictably, due to RAM limitations of the specific computer used (P4, 512 Mb).  

Experienced users of Visual Prolog may have noticed that columns A, B, C (of the previous timing-table) carry annotations of the memory allocation method used in each case. And the three possible ways of allocating memory are

A) Visual Prolog’s stack, B) Visual Prolog’s heap, and C) no new allocation (i.e. memory must be already allocated, before the call).

All these three memory allocation methods are possible, using the predicate “strx2list”: It was implemented in two variants, one with two and one with three input-arguments. 

PREDICATES

 strx2list : (string8 STR, char sepc)

-> slist

procedure(i,i) language c

 strx2list : (string8 STR, char sepc,

              binary Buffer)

è      slist procedure(i,i,i)

language c


Our main concern is producing an output-list which is correct and adequate for fast further processing. The inputs are recycled, in a very destructive “procedural” way -that can make a Prolog purist turn his head away in dismay: 

The first variant strx2list/2 allocates memory before returning the output-list, but… only as much memory as strictly necessary. In order to operate well under “extreme conditions”, it does not allocate any memory for the strings inside the list: Instead -like the 2nd predicate- it re-uses the input string in “recycled form” by chopping it up into pieces, while inserting pointers to these pieces inside the list, rather than creating new strings and copying the old.

If you are somewhat familiar with Assembly language, you might see what is going on in detail, by looking into Appendix B , the sourcecode of “strx2list”: The assembly language main loop checks out every byte of the input string for presence of a “separator-character” (arg. 2) while also planting into this position a zero-byte (signifier of the “end of a string” in ‘C’ and Visual Prolog) and then inserts as a list-element only a pointer to the start of each chopped part of the string (rather than a new string). Consequently, we could say that the input-text is “recycled”, to play the part of the output-strings (in the list) as well.

  • This is a raw memory allocation strategy that works very well to economise memory, when processing huge texts and lists in RAM.

As regards the internal structure of a “list” in Visual Prolog, suffice it to say that it consists of 12-byte structures, where the first 4 bytes are indicators of “element-type” (“normal” or end elements), the next 4 bytes are (pointers to) each element itself, and the last 4 bytes are pointers to the “next list-element”. So, instead of allocating small pieces of memory N times (for N list-members), memory can be allocated only once, for an entire list, if the number of list-elements is known: For this reason, the first predicate strx2list/2 has two internal loops: (1) One for finding the number of lines in the text (which is also the number of list-elements) and (2) a second loop that uses this number to create each list-element. Before the second loop begins, a single call of an external memory allocation predicate is enough to reserve memory for the entire list (either in the stack, or in the heap).

The second variant of strx2list/3 was created for those rare cases when we need to call this predicate several times, inside another loop (external to it). In this case, it is not really worth allocating memory during every call. Instead, memory is allocated prior to the outer loop only once, to create a binary buffer which is used by strx2list/3 as a third (bound) argument, a Visual Prolog “binary”, pre-allocated with 12*(N+1) bytes, where N is the pre-calculated number of list-elements. So, the second variant, strx2list/3, uses no memory allocation at all. Instead of this, it re-cycles all the memory of its own inputs. The reason we use a “binary” is that Visual Prolog binaries are equipped with an indicator of size embedded in a memory location before their pointer. This size-indicator can be used later to release the heap-memory -if in the heap- precisely as required, without side-effects that can arise from intermediate processing, e.g. zero-bytes inside a string making it appear shorter -confusingly for de-allocation- etc.

Of course, there are other similar conversion predicates in this family that don’t behave as stingily towards memory, as does “strx2list”: Predicates “str2list” (with no “x” before the 2) and “conclist/2” are routinely used in most ordinary text-to-list /  list-to-text conversions.

  • In Appendix A, a full list of Global Predicate specifications is given for all list-conversion predicates in these libraries, as well as many other list-handling ASM predicates. Some of these predicates are non-deterministic: There are interesting non-deterministic versions of list-membership and other list-operations in this library that cannot be discussed here, due to limitations of space and time.

As a final note -about the predicate strx2list- that may be of general value, if you examine the source code of strx2list, in Appendix B, you may be surprised to realize that it doesn’t call literally any external memory allocation predicates. Instead of this, it uses function-pointers to an external allocation predicate, included in the library’s Global data segment as “_allocfunc”. This was done for flexibility; so that the actual allocation predicate may be changed, at any time. E.g. the test-program, which produced the results for the previous timing-table, used different assignments for “_allocfunc” for each column of the results:

Column A was generated while using (as the actual contents of the location “_allocfunc”) Visual Prolog’s stack-allocation predicate “_MEM_AllocGStack”, and column B was created by using another -similar- predicate, which allocates memory in the heap, instead.

The results (shown in the timing table) should not surprise experienced Visual Prolog users:

Heap memory is much more suitable -than the stack- for storing huge data!

In conclusion, programming techniques such as these may seem “dirty” to “Prolog Purists” (since they are “procedural as hell”) but many years of Visual Prolog programming dictate the necessity of using such “dirty methods” when we are in desperate need to process huge amounts of data in RAM, rather than in hard disks.

Of course, today’s hard disk databases (with SQL, etc) are more efficient than ever before. However, there are important advantages in keeping as much as possible of our data in RAM, when we want to do (e.g.) exhaustive fuzzy string-matching, cross-examining each list-element versus all other list-elements, etc.
 

NO HARD-CODED EXTERNAL CALLS 

Avoiding hard-coded external function-calls was found to be good programming practice.

Today all Assembly language predicates in these libraries use pointers to external calls instead of literal external predicates. So, if (e.g.) a newer version of Visual Prolog (or a new ‘C’ compiler, used for mixed language programming) is installed, it is still possible to use the same libraries, with minor changes in very few pointer-definition files where the names of some external predicates are stored (and are compiled into pointers immediately).  

Alternatively, it is also possible to use initial calls of certain “configuration predicates”, to fill the locations of external call pointers with addresses of external predicates at run-time. 

So, instead of _IO_WrStr (a built-in Visual Prolog/‘C’ predicate that writes strings to the screen or to current output, defined in the file “pdcrunt.h”) many ASM predicates use a call to the function pointer “_wr_str”, located in the Data Segment. This makes it possible to do much more than just writing on the screen:

E.g. if the contents of location _wr_str are changed at run-time to contain the address of hp_wrstr, a new ASM predicate that writes strings to “virtual devices” inside the heap, it becomes possible to accumulate text that may be the result of some processing, writing it in small lumps inside the heap, and in the end to dump it -all at once- to a file, or to the screen, after certain conditions are met. 

Nevertheless, the default-values of all these memory locations are pre-filled with Visual Prolog predicates, and are rarely expected to change: Location “_wr_ch” contains Visual Prolog’s “_IO_WrCh”, location “_wr_int” contains Visual Prolog’s “_IO_WrIntg”, and so on. If any of these need changing, there are appropriate ASM-predicates to do the job.

For example, in order to change at run-time the current string-writing predicate, there is a configuration predicate, “set_wr_str/1” (in Visual Prolog 5.* syntax):

GLOBAL DOMAINS

  STR_i: determ (STRING)

–(i) language c 

GLOBAL PREDICATES

   set_wr_str( STR_i )

       –(i) language c 

Calling “set_wr_str(foo)” where “foo” is any predicate of domain “STR_i”, results in “foo” becoming the current string-writing predicate globally called by every ASM predicate (that does any calls to any external string-writing).

Now, “foo” is not “obliged” to do any literal string-writing; Instead of this, it could very happily be asked to do other things, such as calling “assert” (so that -instead of writing- it asserts facts in a Prolog database, and so on). 

Another common example (part of which was discussed earlier, with “strx2list”) is this: To change at run-time the current global memory allocation predicate used (by “strx2list”, as well as others), there is a global configuration predicate, “set_allocfunc”: 

GLOBAL DOMAINS

allocbin_func

  = determ BINARY (UNSIGNED size)

        -(i) language c

GLOBAL PREDICATES

  set_allocfunc(allocbin_func)

       -(i) language c  

The possibility of changing the currently used memory-allocation predicate (globally) by a program on the fly is also useful when calling these predicates inside a “main program” that is in a language other than Visual Prolog.  

SPECIALISED WRITING PREDICATES 

The ASM-predicate write_slist, which writes a list of strings with separators, belongs to a large family of similar writing predicates that can have their output redirected anywhere by changing the function pointers called by them for writing. The default-values of the pointers (as explained before) are set to Visual Prolog writing predicates, but can be changed to any other predicates at any time. 

The family where “write_slist” belongs also includes predicates that perform sophisticated writing scripts: Some of them can write the contents of Visual Prolog binaries in ways understandable by humans; others may write integer-lists, strings and binaries in ways that are needed by specific software applications, e.g. to create CSV Excel-files or HTML files. 

All these can be redirected for writing to the heap (in “virtual strings”), or for doing other (often unexpected) tasks irrelevant to writing:

Any predicate of similar type to “_IO_WrStr” may be used as a “virtual writing predicate” by inserting its address in location “_wr_str”.

In Appendix C there is a complete listing of all these “special writing predicates”, with some brief comments about their operation.

Appendix C also includes another big family of ASM predicates, which are often useful for experimentations and software development, but are not really so important for most final applications: Random generation predicates.  

CSV-FILE HANDLING PREDICATES 

In Appendix D there are Predicate definitions and descriptions of an Assembly language library to process CSV- (Excel) files, in many possible ways. Some of these ASM predicates analyse CSV files to extract parameters from them (e.g. their number of data-fields); others write them out in various ways; others select, delete or insert columns, rows, or cells, etc.  

However, due to space and time limitations this CSV-file handling library is not included in the main part of this presentation.

USING THE HEAP FOR “STREAMING” 

A common need is to suspend actual writing (to the screen, or to a file), writing everything to a virtual device or “stream”, which can be flushed. So, another family of predicates was created for this; writing strings, numbers and characters to “virtual strings” inside the heap.  

First, the initialisation-predicate “heap_init” should be called, with a single argument: the number of virtual heap strings required.  

This call allocates heap-memory for the array of pointers (and all the “position indicators”) needed by the “virtual strings”, but not for the strings themselves. To allocate heap memory for each individual virtual string, predicate heap_create/1 must be called, once for each “virtual string”: It has only one argument, the required size of every string; It returns (in register EAX) the “integer-handle” for each “virtual string”. This “handle” can then be re-used for “virtual writing”, i.e. adding content to the specific “virtual string” in ways that are similar to writing on the screen, or to a file.  

If we don’t need many such “virtual strings”, or if we want to use one of them most of the time, a “current default string” can be defined by predicate “hpstream/1”, with a single argument: -This virtual string’s “handle”. 

Some Assembly language predicates in this “heap-string library” emulate precisely Visual Prolog’s _IO_WrStr, _IO_WrCh, IO_WrIntg,  etc, but (this time) writing to “virtual strings”, rather than to the screen or to a file. Their “target” is the “current virtual string” (which can be changed by “hpstream/1”). These are predicates that can be inserted as “clones” of standard writing predicates, in the locations invoked by “write_slist” (and many others). 

See Appendix E for a listing of all the global predicate definitions in this specialised heap-writing and “streaming” library.
 

ENHANCED SEARCH PREDICATES 

 

An entirely new family of predicates for searching (inside strings, as well as inside binaries) was implemented in Pure Assembly language. This family of predicates contains many enhancements, including fuzzy search, non-deterministic search, etc), in order to:

 

1)      Start from a pre-specified position (rather than always at the beginning)

2)      Return the “rest” (after the search, as a pointer; not through memory allocation)

3)     Search backwards (used in parsers)

4)      Search non-deterministically (a Prolog–specific feature, useful for Data Mining)

5)     Search case-insensitively, and with Generalized Case-insensitivity  based on Character Table techniques

6)      Do fuzzy searches and inexact pattern- matching Searches (in many ways),

7)      Search optimally, using the Boyer-Moore algorithm (and other types of optimized search)

8)      Do everything for both Strings and Binaries (providing all possible variants of  all these predicates, together with all their features; as a result, e.g. there are 15 different variants of  nd_search” alone, which does non-deterministic search)

9)      Handle all possible types of objects (I.e. be able to find strings or binaries, or characters inside strings or binaries, in all possible type-combinations).

In Appendix F there is a complete listing of all the Global Predicate definitions for this large family of universal search-predicates. 

Of particular interest are non-deterministic and “fuzzy search” predicates in this family.

  1. All non-determinism in these predicates was customized to fit the inner workings of Visual Prolog. It is not at all easy to implement non-determinism in Assembly language, and it is also much harder (or even impossible) to implement it for most other Prolog compilers. At least in Visual Prolog it is guaranteed to remain 100% compatible with non-deterministic code written in Visual Prolog itself, which is nice.
  2. Fuzzy search predicates have been implemented in a variety of ways, some of them customized for applications. The most common method for these is the use of Character Tables, together with repeated calls of the Assembly Language instruction “XLAT”: If the commonest possible erroneous values of every character are already known, then weights can be assigned to these values in a table (with the highest weight assigned to the correct value). Search algorithms can then exploit the sum of  these weights: -As the text is scanned, each character is “translated” to a value in the table, using instruction “XLAT”, this value is accumulated and its accumulated sum is finally compared to a threshold.
  3. Inexact string-matching: Experimentally it was discovered that the combination of the above XLAT- / character table-based  techniques with Assembly-variants of the well-known “Levenstein algorithm” for inexact string-matching, gave the best results, and also the best possible manual or automatic data-specificrefinement of the matching process,

Of course, these simple (sometimes ad-hoc) methods of fuzzy search may seem somewhat childish to bio-computing experts, who know more efficient algorithms for inexact string matching,, used in today’s DNA Research. However, the boosting effects of Assembly Language on an algorithm’s execution speed often compensate for the algorithm's minor imperfections. Besides, it may be worthwhile to rewrite today’s bestbio-computing search algorithms in hand-optimized Assembly , and then perhaps use Visual Prolog for the more intelligent parts of DNA analysis, e.g. using professor David Searle’s recently discovered String Variable Grammar (already written in Prolog) to delve into semantic DNA parsing!

In any case, a combination of character table- based methods with Levenstein’ s algorithm was used in a Visual Prolog ENB application that correctly identified the names of  many hundreds of hydrological stations, matching each station’s true name to many commonly used aliases and sub-station names. The lack of a common convention about hydrological station names -in previous years- had caused a painfully prolific number of aliases, inside paper documents of the raw hydrological data for previous decades. 

In Appendix G there are selected screen-shots from this Visual Prolog Windows program, which was used by ENB Ltd to compile and correct the final and official Nationwide List of Hydrological Stations in Greece.

In Appendix G there are also some source code sample listings, for both fuzzy and non-deterministic search predicates (of global definitions provided in Appendix F).

DATA MINING IN BITS (1): “DATE-BINARIES”

Date-Binariesare data-structures of Dates encoded as bits.

A “Date-Binary” is a Visual Prolog Binary with the following structure:

  1. A four-byte header, which contains (a) the number of Years present inside the Date-Binary, encoded in the first 2 bytes, and (b) the Starting Year, encoded in the next two bytes. Both these parameters have a range of 0 - 65535, more than adequate.
  2. A number of “year-records”, each one containing precisely 12 “month-records”. These month-records are 4-byte (32 bit-) objects, containing days encoded as bits. 

In addition,

  • Date-Binaries contain complete years. Each year has a size of 48 bytes, or 12 double words, each one being a month, where each 31 bits represent the days of the month and 1 bit is available as a “flag-bit”, for use by specialized operations.
  • Date-Binaries have been created to implement a number of statistical, logical and data-mining operations on time-series data, in the fastest and most compact way possible.
  • Date-Binaries can be combined with other forms of data, e.g. hydrological measurements. Such additional data can be numeric arrays, records of various types in SQL databases, GIS maps, EXCEL-file fields/records, etc.
  • If the position of a bit is known, inside a date-binary, then the date represented by this bit can instantly be decoded, while any corresponding record with a  certain calendar-index equal to the bit’s position is also very naturally accessible. 

In short, date-binaries consist of bit-patterns or bit-sets, representing the days of specific months within specific -series of- years in a compact way, allowing fast and easy access. 

Date-binaries are complete and adequate, as specifications of any single Boolean attribute, over calendar-time. This attribute can denote truth or existence of any property, of any type of data in relation to calendar time.  

E.g. date-binaries could be used to analyze the market behavior of customers who did or did not buy a certain product, over time, and who did or did not have an “attribute” that we think is relevant, in the likelihood to buy this product, within given periods of time. In this particular -hypothetical- example, all we need to do, to verify a suspected latent correlation  between buying patterns of customers who do or don’t  have a “specified attribute”, is to use date-binary bit-operations (AND, OR and NOT) at a massive scale, in various ways.

So, Assembly language predicates have been written, that implement a variety of logic and numeric bit-operations on Date-Binaries: Union and Intersection (like Set-operations), deletion, insertion, writing to the screen or to the current output device, bit-operations of statistical nature (or at a slightly higher level) such as counting the number of bits or days inside each month (to produce the density of valid measurements in a period of years), etc. 

Date-binaries were originally created to assist the quest for the longest intervals of valid and contiguous hydrological time-series data, in a large-scale project of hydrological predictions financed by the Greek government.

The problem of this project was that much of the raw data, sent from hydrological stations throughout the country, was partly incomplete and unusable. Soon enough, it was proved to be more tedious to distinguish the good from the bad data, than to actually use it!  

The company's main hydrological software used by ENB's engineers at the final phase became useless if the data fed to it had gaps of erroneous data, inside the raw time-series.

So, a Visual Prolog Windows application was developed, based on date-binaries, to produce reliable data-summaries, readable by humans, for use in important decisions on this project.

To identify the erroneous data-gaps, a five-phase extraction process was devised: 

  1. First, the dates of all the time-series were turned into bits / date-binaries and invalid raw data was excluded.
  2. Then, the date-binaries were AND-ed together, finding common days (or bits)
  3. Then, all valid common days inside each month, or bits in each double word, were counted and compared to a threshold of 25 days per month, producing common valid month – binaries, where each month was encoded as either 1 (valid) or 0.,
  4. Then a special Assembly language predicate was used, to find the sizes of all contiguous time-periods, or consecutive 1’s, inside these common month binaries
  5. Finally, a (Visual Prolog) fail-driven loop was used to scan through the resulting sizes (of contiguous time periods) finding their maximums for all possible pairs of time-series data. These maximums were then plotted in CSV-file format and transferred to the engineers of ENB Ltd. 

The use of date-binaries in this project was a clear success. All known previous methods for achieving similar goals became obsolete.

Visual Prolog program development was also fast: Less than a day, plus about ten working person-days to… invent the “Date-Binaries Library” itself; i.e. the ASM predicates used by the program. This program was so fast that it took longer to save results to a file than to... calculate them, even though the data included several decades and many dozens of stations.  

Admittedly, it would be possible to use any “mainstream language”, such as Visual Basic or C++, to achieve the same goals. However, even forgetting the advantage of a very short development time (using Visual Prolog) the ultra-high execution speed of the program empowered the company’s engineers to experiment widely with many different ranges, settings, thresholds and time-periods, so as to re-adjust and re-define the complete picture of the data. This had a very positive impact on the quality of the decisions taken, about which parts of the raw data to accept. 

  • There are screen shots of this Visual Prolog application of date-binaries, in Appendix H.

We now have good reason to believe, that the software methodology used for assisting this project, can achieve similarly good results in any field where incomplete or erroneous time-series data is an obstacle, which means: every other company working with raw time-series data. However, the feeling of the author is that this is only the beginning. A logical next step is to associate date-binaries with other types of data and rule extraction algorithms, e.g. operating on raw time-series to discover hidden patterns or hidden relations inside the data, at a very high speed. Now, “Inductive Logic Programming” methods are known to demand vast amounts of time to arrive at the simplest results, but if they are assisted or boosted with such low-level innovations (as date-binaries), they are bound to become much faster and much more practical, in business fields that are extremely demanding for instant and powerful results. 

In Appendix H there is also a listing of all Global Predicate definitions in our libraries that deal with Date-Binaries.

In the last section, we will take a brief look into a methodology much more powerful and important than “date-binaries”, but also 100% compatible with them: It's a methodology that can assist serious and extremely reliable data-mining processes at a fraction of the time normally required for them: A logic inference algorithm which (like date-binaries) also uses bits and is also based on low-level logic bit-operations, but it is of more general scope, a part of an “Alternative Logic System” which could be described as a “Reduced Instruction Set Inference Engine for Boolean Logic”.

Appendices A, B, C, D, E, F, G, H, I:

  • A. http://omadeon.com/alc/a.html: In Appendix A there are Global Predicate Definitions for String-to-list / List-to-string Conversions: Such conversions are typically necessary for 95% of current Prolog projects.
  • B.  http://omadeon.com/alc/b.html: Appendix B contains the source code of “strx2list/2” and “strx2list/3”.
  • C.  http://omadeon.com/alc/c.html: In Appendix C there are Global Predicate definitions of “write_slist/2” together with all the other “special writing predicates”, and comments about their operation. Appendix C also includes another family of ASM predicates used in experimentation and development, but not so important for final applications: Random generation predicates.  
  • D.  http://omadeon.com/alc/d.html: In Appendix D there are Global Predicate Definitions and brief descriptions of an Assembly language library to process CSV-(Excel-) files, in many possible ways.
  • E. http://omadeon.com/alc/e.html In Appendix E there are Global Predicate Definitions for the heap-writing / “Streaming” Virtual String Library. 
  • F.  http://omadeon.com/alc/f.html In Appendix F there is a complete listing of all the Global Predicate definitions for a large family of Universal Search Predicates. Of particular interest here are non-deterministic and fuzzy search predicates.
  • G.   http://omadeon.com/alc/g.html In Appendix G there are selected screen-shots from a Visual Prolog Windows program with fuzzy string-matching capabilities, used by ENB Ltd to compile and correct al list of Hydrological Stations in Greece. Also in Appendix G there are source code samples for fuzzy and non-deterministic search predicates.
  • H. http://omadeon.com/alc/h.html In Appendix H there are screen shots of a Visual Prolog ENB Ltd application using date-binaries, which played an important part in identifying the optimally usable data in a large nationwide hydrological project. Also in Appendix H there is a listing of Global Predicate definitions relevant to Date-Binaries. 
  • I. In Appendix I, included here, some sample results from Algorithm Iphigenia (linked to a Visual Prolog test program) are discussed. Recent results from running Algorithm Iphigenia, linked to a Visual Prolog test program are given, with selected screen shots from the program. Also in Appendix I, the Global definitions of all ASM predicates used for implementation of this algorithm is given (in Visual Prolog 6.3 syntax) with some brief references and explanations:

 

Appendix I (iota): ALGORITHM IPHIGENIA 

The “Iphigenia algorithm” is a certain kind of “mathematical curiosity” or invention of the author, dated back to the mid-eighties. Today, after many years -during which the theory of this algorithm was... buried in the closet- there are some fresh results and theorems, proved in 2004 (and more recently). Assembly Language implementations of this algorithm also led to some advances in the theory behind it: “Multiple Form Logic”, about which you can find out more in the site http://multiforms.netfirms.com (a non-commercial research site).

Here is some information about the theory leading to “Algorithm Iphigenia”, as well as a list of all the predicates implementing it, in Visual Prolog 6.3 syntax, towards the end:


Working with BITS, Multiple Form Logic and “ALGORITHM IPHIGENIA”:

Unfortunately, “Multiple Form Logic” is not easy to expound in just a few minutes, and it is also considerably off topic for the concerns of this conference. So this particular section will come to an end with a brief discussion of practical results obtained by using algorithm Iphigenia and also some hints about how this algorithm can be combined with certain other compatible methodologies, implemented in Assembly language, such as date-binaries, in the context of Visual Prolog programming.

  • One of the easiest results to deduce (at a very high speed of execution), using the Iphigenia Algorithm implemented in Assembler, is the fast checking of whether or not a collection of logic facts of the form “A and B and C and... => X or Y or...” when combined with a new question or “logic query”, is “true” or not.

The algorithm operates blindly and with brute force, as a bit-pattern reduction and inference engine, in a way that is “pseudo-parallel”, and it crunches our bit-encoded knowledge base in a fraction of a second, to find the result. 

There only very few “bit reduction rules”, all of which are implemented as bit operations on Visual Prolog binaries. There are a couple of rules for “vertical or horizontal” reduction, a couple of “row elimination rules”, and...
…well, that's it!
 

If our Knowledge Base is encoded as bits, it can also be easily extended to include date-binaries, or any other bit-encoded attributes (of raw unexplored information). Algorithm Iphigenia can then also be used as a powerful data-mining engine, able to deduce new logic patterns and inferences (in bits) that were not known (at the start), provided we make sure that the interface between the “raw data” and our logical “meta-rules” (used for the data mining strategy) is kept transparently clear and well-designed.

Quite often, it is necessary to include external criteria or some types of “thresholds” at an intermediate level, before “raw data” is allowed to pass into the “blind” logic inference phases. E.g. in the previously mentioned hydrological project (at ENB Ltd) where we had used date-binaries, without using algorithm Iphigenia, a threshold of 25 days per month was decided to be the “minimum  number of days for a particular month's data to become acceptable” by subsequent phases.

A common mistake that one can make easily, if one uses this algorithm carelessly is to ask a wrong question, causing an “avalanche” of useless but... correct  inferences, which -even though are (logically) correct- they are also... impossible to digest, as regards their results, unless you possess... supernatural abilities for logical complexity comprehension. The larger our knowledge base, the more logic facts will be deduced massively and blindly (but also correctly) as logic consequences of our query.

If our question is reasonable, then the answer will also be reasonable. If -however- our query is ambiguous, then the answers are not produced by anything... decent, polite and sequential as Prolog-like back-tracking, but -instead- they are hammered out by a brute-force potentially parallel inhuman algorithm of blind reduction at lightning-fast speed, but also hard to stop or control, once it begins!

Here are sample screen shots from a test-program (an EasyWin Visual Prolog executable) linked to the “Iphigenia Inference Engine Library”: 

(1) First example:

ALGORITHM IPHIGENIA TESTING PROGRAM (EasyWin interface).

Select: 1 to load and test, 2 to create new, 0 to exit:

?- 1

IPHIGENIA ALGORITHM Expert-System KBS/memory diagnostics:

***********************************************************************

Number of expert-system RULES = 4.

Terms in the Expert-system 'dictionary' = 6.

Active Bits = 7 (Number of Terms + 1 'flag-bit').

Covered by 1 dword(s), i.e. 4 bytes.

(7 / 32  = 0, 0 * 32 = 0, 1 * 32 = 32)

Size of the main array: (2 x 4 fact-bytes x 4 rules) = 32 bytes.

Total memory allocation (including 'conclusion-line'):

4 bytes x (4+1) x 2 = 40 bytes.

**********************************************************************

Sentences in Logic Array: 4, Dwords per row: 2
 

 0)01100000000000000000000000000000 00010000000000000000000000000000: intelligent(1)&have time(2)=>can learn programming(3)

 1)00010000000000000000000000000000 00001000000000000000000000000000: can learn programming(3)=>can learn Prolog(4)

 2)00001000000000000000000000000000 00000100000000000000000000000000: can learn Prolog(4)=>can learn Logic(5)

 3)00000100000000000000000000000000 00000010000000000000000000000000: can learn Logic(5)=>can learn Iphigenia(6)

 

(*) 00000000000000000000000000000000 00000000000000000000000000000000: VOID


1:Proof 2:Query 3:Edit 4:HRC 5:LogFile 6:List 7:DHC 9:REFRESH 10:Save 0:Exit

?- 2 

Give conclusion_row:

> intelligent&have time=>can learn Iphigenia

[1,2] => [6] 

(*) 01100000000000000000000000000000 00000010000000000000000000000000: intelligent(1)&have time(2)=>can learn Iphigenia(6)

ok(y/n)?- y 

1:Proof 2:Query 3:Edit 4:HRC 5:LogFile 6:List 7:DHC 9:REFRESH 10:Save 0:Exit

?- 1

FINISHED(3) - The proposed conclusion is TRUE (by HRC): 

 (*) 01100000000000000000000000000000 00000010000000000000000000000000: intelligent(1)&have time(2)=>can learn Iphigenia(6) 

1:Proof 2:Query 3:Edit 4:HRC 5:LogFile 6:List 7:DHC 9:REFRESH 10:Save 0:Exit

?- 0

(2) Second example: 

ALGORITHM IPHIGENIA TESTING PROGRAM (EasyWin interface).

Select: 1 to load and test, 2 to create new, 0 to exit:

?- 1 

IPHIGENIA ALGORITHM Expert-System KBS/memory diagnostics:

***********************************************************************

Number of expert-system RULES = 37.

Terms in the Expert-system 'dictionary' = 29.

Active Bits = 30 (Number of Terms + 1 'flag-bit').

Covered by 1 dword(s), i.e. 4 bytes.

(30 / 32  = 0, 0 * 32 = 0, 1 * 32 = 32)

Size of the main array: (2 x 4 fact-bytes x 37 rules) = 296 bytes.

Total memory allocation (including 'conclusion-line'):

4 bytes x (37+1) x 2 = 304 bytes.

***********************************************************************

Sentences in Logic Array: 37, Dwords per row: 2 

  0)01000000000000000000000000000000 00100000000000000000000000000000 : bird(1)=>covered by feathers(2)

  1)01000000000000000000000000000000 00010000000000000000000000000000 : bird(1)=>can fly(3)

  2)01000000000000000000000000000000 00001000000000000000000000000000 : bird(1)=>has eggs(4)

  3)00000100000000000000000000000000 00000010000000000000000000000000 : mammal(5)=>has hair(6)

  4)00000100000000000000000000000000 00000001000000000000000000000000 : mammal(5)=>has live birth(7)

  5)00000000100000000000000000000000 00000000010000000000000000000000 : carnivore(8)=>eats meat(9)

  6)00000000001000000000000000000000 00000000000100000000000000000000 : ungulate(10)=>has hoofs(11)

  7)00000000000010000000000000000000 00000000000001000000000000000000 : cheetah(12)=>tawny color(13)

  8)00000000000010000000000000000000 00000000000000100000000000000000 : cheetah(12)=>has darkspots(14)

  9)00000000000000010000000000000000 00000000000001000000000000000000 : tiger(15)=>tawny color(13)

 10)00000000000000010000000000000000 00000000000000100000000000000000 : tiger(15)=>has darkspots(14)

 11)00000000000000001000000000000000 00000000000001000000000000000000 : giraffe(16)=>tawny color(13)

 12)00000000000000001000000000000000 00000000000000000100000000000000 : giraffe(16)=>long-legged(17)

 13)00000000000000001000000000000000 00000000000000000010000000000000 : giraffe(16)=>long-necked(18)

 14)00000000000000001000000000000000 00000000000000100000000000000000 : giraffe(16)=>has darkspots(14)

 15)00000000000000000001000000000000 00000000000000000000100000000000 : zebra(19)=>white color(20)

 16)00000000000000000001000000000000 00000000000000000000010000000000 : zebra(19)=>has black stripes(21)

 17)00000000000000000000001000000000 00000000000000000100000000000000 : ostrich(22)=>long-legged(17)

 18)00000000000000000000001000000000 00000000000000000000000100000000 : ostrich(22)=>cannot fly(23)

 19)00000000000000000000001000000000 00000000000000000010000000000000 : ostrich(22)=>long-necked(18)

 20)00000000000000000000001000000000 00000000000000000000000010000000 : ostrich(22)=>black color(24)

 21)00000000000000000000001000000000 00000000000000000000000001000000 : ostrich(22)=>has white marks(25)

 22)00000000000000000000000000100000 00000000000000000000000000010000 : penguin(26)=>can swim(27)

 23)00000000000000000000000000100000 00000000000000000000000010000000 : penguin(26)=>black color(24)

 24)00000000000000000000000000100000 00000000000000000000000001000000 : penguin(26)=>has white marks(25)

 25)00000000000000000000000000001000 00000000000000000000100000000000 : albatross(28)=>white color(20)

 26)00000100000000000000000000000000 00000000000000000000000000000100 : mammal(5)=>animal(29)

 27)00000000100000000000000000000000 00000100000000000000000000000000 : carnivore(8)=>mammal(5)

 28)00000000001000000000000000000000 00000100000000000000000000000000 : ungulate(10)=>mammal(5)

 29)01000000000000000000000000000000 00000000000000000000000000000100 : bird(1)=>animal(29)

 30)00000000000000000000001000000000 01000000000000000000000000000000 : ostrich(22)=>bird(1)

 31)00000000000000000000000000100000 01000000000000000000000000000000 : penguin(26)=>bird(1)

 32)00000000000000000000000000001000 01000000000000000000000000000000 : albatross(28)=>bird(1)

 33)00000000000010000000000000000000 00000000100000000000000000000000 : cheetah(12)=>carnivore(8)

 34)00000000000000010000000000000000 00000000100000000000000000000000 : tiger(15)=>carnivore(8)

 35)00000000000000001000000000000000 00000000001000000000000000000000 : giraffe(16)=>ungulate(10)

 36)00000000000000000001000000000000 00000000001000000000000000000000 : zebra(19)=>ungulate(10) 

 (*) 00000000000000010000000000000000 00000000000000000000000000000000: tiger(15)=>
 

1:Proof 2:Query 3:Edit 4:HRC 5:LogFile 6:List 7:DHC 9:REFRESH 10:Save 0:Exit

?- 2 

Give conclusion_row:

> tiger=>animal

[15] => [29]

 (*) 00000000000000010000000000000000 00000000000000000000000000000100: tiger(15)=>animal(29) 

ok(y/n)?- y 

1:Proof 2:Query 3:Edit 4:HRC 5:LogFile 6:List 7:DHC 9:REFRESH 10:Save 0:Exit

?- 1

FINISHED(1) - The proposed conclusion is TRUE (by RVC):

 (*) 00000000000000010000000000000000 00000000000000000000000000000100: tiger(15)=>animal(29) 

Press a key for more proof-cycles (or ESC to stop):

------------------------------------------- (end of sample screen-shots)------------------------------------ 

INTERPRETATION:

  1. The Iphigenia inference engine was first asked if the assertion “intelligent & have time => can learn Iphigenia” can be derived from the specific Knowledge Base. This example is -of course- trivial, since by examining the knowledge-base, anyone can verify that certain statements, when combined with simple deduction, lead immediately to the validity of the assertion.
  2. The second example is also very elementary, but not as trivial: - The knowledge base has been imported from a test-example in many Expert System Textbooks: The “animal expert system”. In this case. The inference engine was asked if the query “tiger=>animal” is valid for the existing knowledge-base. It immediately replied affirmatively. A human verification is also easy in this case: Although the assertion “tiger=>animal” does not exist literally as it stands, inside the knowledge-base, there are other statements inside it that imply the assertion. In particular, the following:

26)00000100000000000000000000000000 00000000000000000000000000000100 : mammal (5)=>animal (29)

27)00000000100000000000000000000000 00000100000000000000000000000000 : carnivore(8)=>mammal(5)

34)00000000000000010000000000000000 00000000100000000000000000000000 : tiger (15)=>carnivore(8)

 

The proposed assertion is the following (bottom bit-row):

(*) 00000000000000010000000000000000 00000000000000000000000000000100: tiger(15)=>animal(29)

 

Interestingly, this was not proved by “following implications” (as a human prover, or a conventional mechanical theorem prover might do). Instead, the assertion was proved by more elementary low-level bit-reductions, in a Logic System that is an alternative to Boolean Logic itself. In this Logic system there are only three axioms (used in the proofs) and no need for “logic implication” as such.

In fact, logic implication can be proved as a theorem in this system, rather than taken for granted as an axiom. A complete proof of this claim is in the web-page:

http://multiforms.netfirms.com/mflogic_proofs.html#theorem_T7 

It seems that the “quantum”, or the “elementary particle” of logic computation, is not “implication”, but a more primitive operation, the act of “drawing a distinction”. From this low-level operation, logic implications arise as consequences. The theory about this is “Multiple Form Logic”, explained in the (strictly non-commercial) research site: http://multiforms.netfirms.com.
 

IPHIGENIA ALGORITHM PREDICATES:

 

% Applies brute-force RVC (Rule of Vertical Cancellation) to an entire Iphigenia array:

  iphig_rvc : (binary Array,binary Conclusion) 

-> integer procedure(i, i) language c as "_iphig_rvc_0".

% Returns 0 if array was PROVED, 1 if array was changed, -1 if no change.

 

% Like the previous, but also produces (as output arg.3) an array of “rows changed”:  

iphig_rvc : (binary Array, binary Conclusion, binary RowsChangedBuffer)

-> integer procedure(i, i, i) language c as "_iphig_rvc_1".

% Returns 0 if array was PROVED, 1 if array was changed, -1 if no change.

 

% Applies RDT (“Rule of Diagonal Transfer”)

% to a single-row Iphigenia array/query-line (only used for testing purposes):

iphig_rdt : (binary SingleRowArray, binary ConclusionRow)

-> integer procedure(i, i) language c as "_iphig_rdt_0".

% Returns 0 if the array was PROVED, 1 if array was changed, -1 if no change.

 

% Applies reduction rule “RDT” to a given row (and query line) of an Iphigenia array:

iphig_rdt : (binary Array, binary ConclusionRow, unsigned RowIndex)

-> integer procedure(i, i, i) language c as "_iphig_rdt_1".

% Returns 0 if the array was PROVED, 1 if array was changed, -1 if no change.

 

% Applies reduction rule “RDT” to all specified rows (ARG 3) of an Iphigenia array:

iphig_rdt_rows: (binary Array, binary ConclusionRow, binary RowIndexesAffected)

-> integer procedure(i, i, i) language c as "_iphig_rdt_rows".

% Returns 0 if the array was PROVED, 1 if array was changed, -1 if no change.

 

% Applies reduction rule “RDT” to ALL the rows of an Iphigenia array:

iphig_rdt_all: (binary Array, binary ConclusionRow)

-> integer procedure(i, i) language c as "_iphig_rdt_all".

% Returns 0 if the array was PROVED; 1 if array was changed; -1 if no change.

 

% Checks if the (32-bit-encoded) input number contains an isolated “1”-bit:

single_onbit: (unsigned Num)

-> integer procedure(i) language as "_single_onbit_0".

 

% Checks if a particular row of an Iphigenia Array contains an isolated “1”-bit:

single_onbit: (binary Array, unsigned NumDwords, unsigned DwPosOut)

-> integer procedure(i, i, o) language c as "_single_onbit_1".

 

% Reverses left-hand & right-sides of a “Query Row” relevant to an Iphigenia array:

iphig_revsides : (binary ConclusionRow,unsigned DwordsInBinary)

-> integer procedure(i, i) language c as "_iphig_revsises".

 

% Sets a doubleword value (32 bits) at a row and column of an Iphigenia array:

iphig_setdword : (binary array,unsigned Nrows,unsigned Row,unsinged Col,unsigned Value)

-> integer procedure(i, i, i, i, i) language c as "_iphig_setdword".

 

% Sets the byte-value  : (all 8 bits) of a row and column in an Iphigenia array:

iphig_setbyte : (binary array,unsigned Nrows,unsigned Row,unsinged Col,byte Value)

-> integer procedure(i, i, i, i, i) language c as "_iphig_setbyte".

 

% Sets the bit-value of a specific row and column, in an Iphigenia array:

iphig_setbit : (binary array,unsigned Nrows,unsigned Row,unsinged Col,byte Value)

-> integer procedure(i, i, i, i, i) language c as "_iphig_setbit_0".

 

% Sets the bit-value of a specific row and column in a SIDE of an Iphigenia array:

iphig_setbit : (binary Array, unsigned Nrows,unsigned Row,unsinged Col,

byte Side, byte Value)

-> integer procedure(i, i, i, i, i, i) language c as "_iphig_setbit_1".

 

% Gets the bit-value of a specific row and column, in an Iphigenia array:

iphig_getbit : (binary Array, unsigned Nrows,unsigned Row,unsinged Col)

-> integer procedure(i, i, i, i) language c as "_iphig_getbit_0".

 

% Gets the bit-value of a specific row and column in a SIDE of an Iphigenia array:

iphig_getbit : (binary Array, unsigned Nrows,unsigned Row,unsinged Col,byte Side)

-> integer procedure(i, i, i, i, i) language c as "_iphig_getbit_0".

 

% Tests if the Conclusion-Row of an Iphigenia Array contains “…&X& => X”:

iphig_xsb : (binary ConclusionRow) -> integer procedure(i) language c as "_iphig_xsb_0".

 

% Tests if any Row of an Iphigenia Array contains “…&X& => X”:

iphig_xsb : (binary Array,unsigned RowWidth)

-> integer procedure(i, i) language c as "_iphig_xsb_1".

 

% Deletes (setting “invalid bit” to 1) all the rows satisfying condition “DHC”:

iphig_dhc : (binary Array, binary ConclusionRow)

-> integer procedure(i, i) language c as "_iphig_dhc".

 

% Writes an Iphigenia array’s “conclusion row” to current output:

iphig_wrbin : (binary ConclusionRow)

-> integer procedure(i) language c as "_iphig_wrbin_0".

 

% Writes an entire Iphigenia array, as well as a “conclusion row” to current output:

iphig_wrbin : (binary Array, unsigned NumDwordsPerRow)

-> integer procedure(i, i) language c as "_iphig_wrbin_1".

 

% Writes a specific Row (arg.3) of an Iphigenia array to current output:

iphig_wrbin : (binary Array, unsigned NumDwordsPerRow, unsigned Row)

-> integer procedure(i, i, i) language c as "_iphig_wrbin_2".

 

% Clears a row in an Iphigenia array by setting the “invalid bit” to 1:

iphig_clear_row : (binary Array, unsigned NDwordsPerRow, unsigned Row)

-> integer procedure(i, i, i) language c as "_iphig_clear_row".

 

% Deletes completely an Iphigenia Array’s row (by setting all bits to 0):

iphig_del_row : (binary Array, unsigned NDwordsPerRow, unsigned Row)

-> integer procedure(i, i, i) language c as "_iphig_del_row".

 

% Inserts (i.e. modifies) a given row (arg.3) in an Iphigenia array:

iphig_ins_row : (binary Array, unsigned NDwordsPerRow, unsigned Row, binary Content)

-> integer procedure(i, i, i, i) language c as "_iphig_ins_row".

NOTES

  •  The basic algorithm was implemented as a mixture of Visual Prolog and Assembly language code, i.e. the main control loop of the algorithm is in Visual Prolog. This was done because interesting new possibilities and enhancements are often discovered, and the main loop needs to be flexible enough to accommodate changes quickly. It is, however, straight-forward to hard-code the whole loop independently of Prolog (or the calling program) in Pure Assembly.
  • Although this system has no variables (in Prolog’s sense) it works well for a wide variety of problems, just like SAT (Satisfiability) methods are today producing increasingly important results, throughout the world, in areas where it was once thought inappropriate to express optimization problems, using (the often underestimated) Zero-order Propositional Calculus.
  • Despite the last comment, strong efforts are made to also implement Prolog unifications with variables. A discussion of this is in http://multiforms.netfirms.com/extending_iphigenia.html., as well as http://multiforms.netfirms.com/mf_unifications.html

MULTIPLE FORM LOGIC -the theory that led to “Algorithm Iphigenia”:

 

Multiple Form Logic is a simple but strange mutation of Propositional Calculus, based solely on operations OR and XOR, and also (which is even more strange) requiring no absolute or pre-defined truth values (like 0 and 1): Instead of 0 or 1, there is a “Universal Object” equivalent to 1, but not exactly the same, ontologically. This “Universal Object” is not given, but instead it is constructed, as the “Union of all possible objects” in this system.

 

Multiple Form Logic is an enhanced descendant of another system of logic, expressed in a book by British author George Spencer-Brown: “Laws of Form”. Although the logic theories of George Spencer-Brown have not been entirely accepted by the international academic community, there are many people in the world working seriously with “Laws of Form”, some of them respected professors of Computer Science and Mathematics: Professor Lou Kauffman (the inventor of “Knot Theory”, from the University of Illinois), Richard Shoup (who also runs the “Laws of form site” www.lawsofform.org), David Keenan, William Bricken, Eddie Oshins, and many others.

 

Since long ago (mid-eighties) the three axioms of Multiple Form Logic were proved to be:

  1. Sufficient to derive the axioms of Propositional Calculus as trivial consequences (which is consistent with George Spencer Brown’s proofs of similar results, in his own system)
  2. Sufficient to derive George Spencer-Brown’s algebraic axioms as theorems in Multiple Form Logic (indicating that Multiple Form Logic is stronger and more fundamental than Brown’s). On-line listings of these proofs are in: http://multiforms.netfirms.com/mflogic_proofs.html#theorem_T4
  3. Sufficient for the author to… waste a few years in weird research that produced no financial (or other) rewards, and no recognition by anyone else (except Professor Cliff Jones, head of Computer Science in the University of Manchester, at the time, and very few others…)
  4. Sufficient to impress some university professors, about the countless shorter logic proofs supplied by this theory, replacing much more tedious proofs in Logic Textbooks
  5. Sufficient for the theory itself to be eventually abandoned (in 1991) after the author had quite enough of it!  

However...

  • in 2003, some new theorems were proved and the theory was slowly resurrected: E.g.
  • Huntington’s formula was derived from the three axioms of Multiple Form Logic, in a remarkably short proof: http://multiforms.netfirms.com/mflogic_proofs.html#theorem_T10
  • In 2004, the axioms of Multiple Form Logic were proved to be generalised and stronger versions of William Bricken’s axioms (for his "boundary algebera" - a similar system, yet another descendant of G.S.Brown’s):  http://multiforms.netfirms.com/more_theorems.html
  • In 2005, “Iphigenia” was reborn, implemented in Assembly Language for modern PC's. It finally reincarnated in April 2006, as a library of predicates for Visual Prolog.
  • In 2007 a mathematician (Mr. Art Collings) proved some theorems, derived certain new results and expressed some intriguing insights about the characterisation of Multiple Form Logic (together with some interesting criticisms of its philosophical basis).
  • (UPDATE) In 2008 references and links to Multiple Form Logic were added (by a group of researchers) to the Wikipedia entry for "Laws of Form (related work)", as well as in the  "links for further reading".