[gecode-users] Problems with some REG and some suggestions

Mon Sep 26 21:42:41 CEST 2011

Hi,

Great that you like Gecode.

Unfortunately there is very little that we can do about 1: the algorithm we
use for turning a regexp into a DFA actually uses that much memory (and it
is not even naïve). Remember: you are creating a DFA with 1000 states
_after_ minimization. The intermediate DFA that is created has millions of
states (any conversion of a regexp directly to a DFA or  via a NFA can show
this behavior as the DFA that is constructed might be exponentially larger
than the regexp/NFA). And you artfully picked a pretty bad example ;-)

Maybe there are better algorithms that combine DFA construction with
minimization to deal with the DFAs but for most cases it works fine for
Gecode...

Is your regexp related to a real example or did you want to torture Gecode
;-)

I never thought about operations on DFAs as I thought one would use regexps
anyway. The only hard part is in fact intersection, the rest is very simple.
What do you need them for?

Best
Christian

--
Christian Schulte, KTH, web.it.kth.se/~cschulte/

-----Original Message-----
From: users-bounces at gecode.org [mailto:users-bounces at gecode.org] On Behalf
Of TeXitoi
Sent: Monday, September 26, 2011 8:32 PM
To: users at gecode.org
Subject: [gecode-users] Problems with some REG and some suggestions

Hi,

First, Thanks for you're very good CP library: I enjoy using it.

I have see 1 "bug" and have identified some missing features.

First, I have poblems when I try to generate REG of the form
  REG r = *a + a(1000,1000);
It takes a lot of RAM.  Here is a main function that show the problem:

#include <gecode/int.hh>
#include <gecode/search.hh>
#include <gecode/minimodel.hh>

using namespace Gecode;
using namespace std;

int main(void)
{
        REG any_l = REG(0) | REG(1);
        cout << "Generating '.{1000}'..." << flush;
        REG bigReg_l = any_l(1000,1000);
        cout << " done." << endl;
        cout << "Generating DFA(.{1000})..." << flush;
        DFA dfa_l(bigReg_l);//OK
        cout << " done." << endl;
        cout << "Generating '.{1000}.{1000}'..." << flush;
        REG veryBigReg_l = bigReg_l + bigReg_l;
        cout << " done." << endl;
        cout << "Generating DFA(.{1000}.{1000})..." << flush;
        dfa_l = DFA(veryBigReg_l);//OK
        cout << " done." << endl;
        cout << "Generating '.{1000}.*'..." << flush;
        veryBigReg_l = bigReg_l + *any_l;
        cout << " done." << endl;
        cout << "Generating DFA(.{1000}.*)..." << flush;
        dfa_l = DFA(veryBigReg_l);//OK
        cout << " done." << endl;
        cout << "Generating '.*.{1000}'..." << flush;
        veryBigReg_l = *any_l + bigReg_l;
        cout << " done." << endl;
        cout << "Generating DFA(.*.{1000})..." << flush;
        dfa_l = DFA(veryBigReg_l);//takes lots of RAM...
        cout << " done." << endl;
}

Then, it could be great to have different IntConLevel for the sequence
constraint.  In my model, the filtering does not help and takes time, so I
post a count constraint for each interval.  It can be useful to have that
with IntConLevel = ICL_VAL. And with IntConLevel = ICL_DOM, the extensional
version (which is exponential).

Finally, DFA operations can be very useful : union, intersection,
concatenation, star and opposite.  I already coded intersection and opposite
for my needs (I can show them if you want, but the implementation is quite
naive).

Again, thanks a lot for Gecode!

-- 
Guillaume Pinot                               http://www.texitoi.eu

« Il semble que la perfection soit atteinte non quand il n'y a plus rien à
ajouter, mais quand il n'y a plus rien à retrancher. »
                      -- Antoine de Saint-Exupéry, Terre des hommes

()  ASCII ribbon campaign      -- Against HTML e-mail
/\  http://www.asciiribbon.org -- Against proprietary attachments

_______________________________________________
Gecode users mailing list
users at gecode.org
https://www.gecode.org/mailman/listinfo/gecode-users