[gecode-users] Problems with some REG and some suggestions
Christian Schulte
cschulte at kth.se
Mon Sep 26 21:42:41 CEST 2011
Hi,
Great that you like Gecode.
Unfortunately there is very little that we can do about 1: the algorithm we
use for turning a regexp into a DFA actually uses that much memory (and it
is not even naïve). Remember: you are creating a DFA with 1000 states
_after_ minimization. The intermediate DFA that is created has millions of
states (any conversion of a regexp directly to a DFA or via a NFA can show
this behavior as the DFA that is constructed might be exponentially larger
than the regexp/NFA). And you artfully picked a pretty bad example ;-)
Maybe there are better algorithms that combine DFA construction with
minimization to deal with the DFAs but for most cases it works fine for
Gecode...
Is your regexp related to a real example or did you want to torture Gecode
;-)
I never thought about operations on DFAs as I thought one would use regexps
anyway. The only hard part is in fact intersection, the rest is very simple.
What do you need them for?
Best
Christian
--
Christian Schulte, KTH, web.it.kth.se/~cschulte/
-----Original Message-----
From: users-bounces at gecode.org [mailto:users-bounces at gecode.org] On Behalf
Of TeXitoi
Sent: Monday, September 26, 2011 8:32 PM
To: users at gecode.org
Subject: [gecode-users] Problems with some REG and some suggestions
Hi,
First, Thanks for you're very good CP library: I enjoy using it.
I have see 1 "bug" and have identified some missing features.
First, I have poblems when I try to generate REG of the form
REG r = *a + a(1000,1000);
It takes a lot of RAM. Here is a main function that show the problem:
#include <gecode/int.hh>
#include <gecode/search.hh>
#include <gecode/minimodel.hh>
using namespace Gecode;
using namespace std;
int main(void)
{
REG any_l = REG(0) | REG(1);
cout << "Generating '.{1000}'..." << flush;
REG bigReg_l = any_l(1000,1000);
cout << " done." << endl;
cout << "Generating DFA(.{1000})..." << flush;
DFA dfa_l(bigReg_l);//OK
cout << " done." << endl;
cout << "Generating '.{1000}.{1000}'..." << flush;
REG veryBigReg_l = bigReg_l + bigReg_l;
cout << " done." << endl;
cout << "Generating DFA(.{1000}.{1000})..." << flush;
dfa_l = DFA(veryBigReg_l);//OK
cout << " done." << endl;
cout << "Generating '.{1000}.*'..." << flush;
veryBigReg_l = bigReg_l + *any_l;
cout << " done." << endl;
cout << "Generating DFA(.{1000}.*)..." << flush;
dfa_l = DFA(veryBigReg_l);//OK
cout << " done." << endl;
cout << "Generating '.*.{1000}'..." << flush;
veryBigReg_l = *any_l + bigReg_l;
cout << " done." << endl;
cout << "Generating DFA(.*.{1000})..." << flush;
dfa_l = DFA(veryBigReg_l);//takes lots of RAM...
cout << " done." << endl;
}
Then, it could be great to have different IntConLevel for the sequence
constraint. In my model, the filtering does not help and takes time, so I
post a count constraint for each interval. It can be useful to have that
with IntConLevel = ICL_VAL. And with IntConLevel = ICL_DOM, the extensional
version (which is exponential).
Finally, DFA operations can be very useful : union, intersection,
concatenation, star and opposite. I already coded intersection and opposite
for my needs (I can show them if you want, but the implementation is quite
naive).
Again, thanks a lot for Gecode!
--
Guillaume Pinot http://www.texitoi.eu
« Il semble que la perfection soit atteinte non quand il n'y a plus rien à
ajouter, mais quand il n'y a plus rien à retrancher. »
-- Antoine de Saint-Exupéry, Terre des hommes
() ASCII ribbon campaign -- Against HTML e-mail
/\ http://www.asciiribbon.org -- Against proprietary attachments
_______________________________________________
Gecode users mailing list
users at gecode.org
https://www.gecode.org/mailman/listinfo/gecode-users
More information about the users
mailing list