10th August 01:02
General Purpose parsing
I'm trying to write a parser in Prolog, and all the tutorials on the net
assume that I'm only interested in a fixed, limited set of tokens:
sentence --> verb, noun.
verb(hammering).
noun(hammer).
noun(nail).
I'd like to pick up tokens that fit a certain regular expression:
[a-zA-Z][a-zA-Z0-9]*
I've tried a number of different things.
First, I tried:
id --> [S], {regmatch("^[a-zA-Z][a-zA-Z0-9]*$", S)}.
| ?- phrase(id, "bob").
no
So then I tried this, thinking that the problem might have to do with
the difference between strings and atoms:
id --> [C], {atom_chars(C,S), regmatch("^[a-zA-Z][a-zA-Z0-9]*$", S)}.
Again:
| ?- phrase(id, "bob").
no
When I trace through, it doesn't even look like regmatch is being called:
| ?- phrase(id, "bob").
1 1 Call: id([98,111,98],[]) ?
2 2 Call: 'C'([98,111,98],_1148,[]) ?
2 2 Fail: 'C'([98,111,98],_1148,[]) ?
1 1 Fail: id([98,111,98],[]) ?
no
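The failing 'C' call in this trace is consistent with how a terminal list behaves in a DCG: a one-element terminal such as [S] consumes exactly one element of the input, and here each element is a single character code (98, 111, 98), so the remainder can never be [] for "bob" and the goal in braces is never reached. A minimal sketch of that behaviour follows; one_code//1 and all_codes//1 are hypothetical names used only for illustration, and atom_codes/2 is used to build the input so the example does not depend on how a given system reads double-quoted text.

```prolog
% A terminal list with one variable matches exactly one input element.
one_code(C) --> [C].

% Collecting the whole input needs recursion over the remaining elements.
all_codes([]) --> [].
all_codes([C|Cs]) --> [C], all_codes(Cs).

% atom_codes/2 builds the same kind of code list that the trace shows.
% ?- atom_codes(b, L), phrase(one_code(C), L).      % succeeds, C = 98
% ?- atom_codes(bob, L), phrase(one_code(C), L).    % fails, [111,98] is left over
% ?- atom_codes(bob, L), phrase(all_codes(Cs), L).  % succeeds, Cs = [98,111,98]
```

Under this reading, a whole-token check such as a regular expression match would have to be applied to the collected list, not to a single element.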
I tried a different tack, which was to manually encode the regular
expression:
id --> [C], {letter([C])}.
id --> [C], restid, {letter([C])}.
restid --> [C], {letter([C])}.
restid --> [C], {num([C])}.
restid --> [C], restid, {letter([C])}.
restid --> [C], restid, {num([C])}.
letter("a").
....
num("9").
This seems to work:
| ?- phrase(id, "bob").
yes
| ?- phrase(id, "bob ").
no
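For comparison, the same pattern [a-zA-Z][a-zA-Z0-9]* can probably be written more compactly by testing code ranges instead of listing every character as a fact. This is only a sketch under the assumption that the input is a list of character codes; ident//1, ident_rest//1, alpha//1 and digit//1 are hypothetical names, not part of the grammar above.

```prolog
% ident(-Atom): matches [a-zA-Z][a-zA-Z0-9]* and returns the token as an atom.
ident(Atom) --> alpha(C), ident_rest(Cs), { atom_codes(Atom, [C|Cs]) }.

% Greedy: try to take another letter or digit before accepting the end.
ident_rest([C|Cs]) --> ( alpha(C) ; digit(C) ), ident_rest(Cs).
ident_rest([]) --> [].

% 0'a is standard notation for the character code of a.
alpha(C) --> [C], { ( C >= 0'a, C =< 0'z ; C >= 0'A, C =< 0'Z ) }.
digit(C) --> [C], { C >= 0'0, C =< 0'9 }.

% ?- atom_codes(bob42, L), phrase(ident(A), L).     % succeeds, A = bob42
% ?- atom_codes('9lives', L), phrase(ident(A), L).  % fails, starts with a digit
% ?- atom_codes('bob ', L), phrase(ident(A), L).    % fails, trailing space
```

Each character is consumed once and checked immediately, so a failed check does not trigger a second pass over the rest of the input.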
But now I have the following problem.
I've added another rule:
bind --> id, ["="], id.
Now when I try this rule out it doesn't work:
| ?- phrase(bind, "a=b").
no
I've pasted the trace for this below; it's really long for some reason.
I'd appreciate any advice on the correct way to parse arbitrary tokens
using regular expressions, and on why I can't get my 'bind' rule to
work.
Thanks,
Reuben Grinberg
| ?- phrase(bind, "a=b").
1 1 Call: bind([97,61,98],[]) ?
2 2 Call: id([97,61,98],_1107) ?
3 3 Call: 'C'([97,61,98],_1773,_1107) ?
3 3 Exit: 'C'([97,61,98],97,[61,98]) ?
4 3 Call: letter([97]) ?
? 4 3 Exit: letter([97]) ?
? 2 2 Exit: id([97,61,98],[61,98]) ?
5 2 Call: 'C'([61,98],[61],_1101) ?
5 2 Fail: 'C'([61,98],[61],_1101) ?
2 2 Redo: id([97,61,98],[61,98]) ?
4 3 Redo: letter([97]) ?
4 3 Fail: letter([97]) ?
6 3 Call: 'C'([97,61,98],_1779,_1780) ?
6 3 Exit: 'C'([97,61,98],97,[61,98]) ?
7 3 Call: restid([61,98],_1107) ?
8 4 Call: 'C'([61,98],_3730,_1107) ?
8 4 Exit: 'C'([61,98],61,[98]) ?
9 4 Call: letter([61]) ?
9 4 Fail: letter([61]) ?
10 4 Call: 'C'([61,98],_3730,_1107) ?
10 4 Exit: 'C'([61,98],61,[98]) ?
11 4 Call: num([61]) ?
11 4 Fail: num([61]) ?
12 4 Call: 'C'([61,98],_3736,_3737) ?
12 4 Exit: 'C'([61,98],61,[98]) ?
13 4 Call: restid([98],_1107) ?
14 5 Call: 'C'([98],_5717,_1107) ?
14 5 Exit: 'C'([98],98,[]) ?
15 5 Call: letter([98]) ?
? 15 5 Exit: letter([98]) ?
? 13 4 Exit: restid([98],[]) ?
16 4 Call: letter([61]) ?
16 4 Fail: letter([61]) ?
13 4 Redo: restid([98],[]) ?
15 5 Redo: letter([98]) ?
15 5 Fail: letter([98]) ?
17 5 Call: 'C'([98],_5717,_1107) ?
17 5 Exit: 'C'([98],98,[]) ?
18 5 Call: num([98]) ?
18 5 Fail: num([98]) ?
19 5 Call: 'C'([98],_5723,_5724) ?
19 5 Exit: 'C'([98],98,[]) ?
20 5 Call: restid([],_1107) ?
21 6 Call: 'C'([],_7704,_1107) ?
21 6 Fail: 'C'([],_7704,_1107) ?
22 6 Call: 'C'([],_7704,_1107) ?
22 6 Fail: 'C'([],_7704,_1107) ?
23 6 Call: 'C'([],_7710,_7711) ?
23 6 Fail: 'C'([],_7710,_7711) ?
24 6 Call: 'C'([],_7710,_7711) ?
24 6 Fail: 'C'([],_7710,_7711) ?
20 5 Fail: restid([],_1107) ?
25 5 Call: 'C'([98],_5723,_5724) ?
25 5 Exit: 'C'([98],98,[]) ?
26 5 Call: restid([],_1107) ?
27 6 Call: 'C'([],_7704,_1107) ?
27 6 Fail: 'C'([],_7704,_1107) ?
28 6 Call: 'C'([],_7704,_1107) ?
28 6 Fail: 'C'([],_7704,_1107) ?
29 6 Call: 'C'([],_7710,_7711) ?
29 6 Fail: 'C'([],_7710,_7711) ?
30 6 Call: 'C'([],_7710,_7711) ?
30 6 Fail: 'C'([],_7710,_7711) ?
26 5 Fail: restid([],_1107) ?
13 4 Fail: restid([98],_1107) ?
31 4 Call: 'C'([61,98],_3736,_3737) ?
31 4 Exit: 'C'([61,98],61,[98]) ?
32 4 Call: restid([98],_1107) ?
33 5 Call: 'C'([98],_5717,_1107) ?
33 5 Exit: 'C'([98],98,[]) ?
34 5 Call: letter([98]) ?
? 34 5 Exit: letter([98]) ?
? 32 4 Exit: restid([98],[]) ?
35 4 Call: num([61]) ?
35 4 Fail: num([61]) ?
32 4 Redo: restid([98],[]) ?
34 5 Redo: letter([98]) ?
34 5 Fail: letter([98]) ?
36 5 Call: 'C'([98],_5717,_1107) ?
36 5 Exit: 'C'([98],98,[]) ?
37 5 Call: num([98]) ?
37 5 Fail: num([98]) ?
38 5 Call: 'C'([98],_5723,_5724) ?
38 5 Exit: 'C'([98],98,[]) ?
39 5 Call: restid([],_1107) ?
40 6 Call: 'C'([],_7704,_1107) ?
40 6 Fail: 'C'([],_7704,_1107) ?
41 6 Call: 'C'([],_7704,_1107) ?
41 6 Fail: 'C'([],_7704,_1107) ?
42 6 Call: 'C'([],_7710,_7711) ?
42 6 Fail: 'C'([],_7710,_7711) ?
43 6 Call: 'C'([],_7710,_7711) ?
43 6 Fail: 'C'([],_7710,_7711) ?
39 5 Fail: restid([],_1107) ?
44 5 Call: 'C'([98],_5723,_5724) ?
44 5 Exit: 'C'([98],98,[]) ?
45 5 Call: restid([],_1107) ?
46 6 Call: 'C'([],_7704,_1107) ?
46 6 Fail: 'C'([],_7704,_1107) ?
47 6 Call: 'C'([],_7704,_1107) ?
47 6 Fail: 'C'([],_7704,_1107) ?
48 6 Call: 'C'([],_7710,_7711) ?
48 6 Fail: 'C'([],_7710,_7711) ?
49 6 Call: 'C'([],_7710,_7711) ?
49 6 Fail: 'C'([],_7710,_7711) ?
45 5 Fail: restid([],_1107) ?
32 4 Fail: restid([98],_1107) ?
7 3 Fail: restid([61,98],_1107) ?
2 2 Fail: id([97,61,98],_1107) ?
1 1 Fail: bind([97,61,98],[]) ?
no
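One reading of the trace: the call 'C'([61,98],[61],_1101) at step 5 compares the next input element, the bare code 61, against the term [61]. That is what ["="] would produce if "=" is itself read as the code list [61], since the terminal then holds a list inside a list. A sketch of a bind rule that matches the code directly is below. The single-letter id//0 is a simplified stand-in to keep the example self-contained, and it is an assumption that the input is a plain code list as the trace shows.

```prolog
% Simplified stand-in for the id rule: a single lowercase letter.
id --> [C], { C >= 0'a, C =< 0'z }.

% ["="] would be a list holding the code list [61]; the input holds the bare code 61.
% [0'=] is a one-element terminal that matches that code directly.
bind --> id, [0'=], id.

% ?- atom_codes('a=b', L), phrase(bind, L).   % succeeds
% ?- atom_codes('a b', L), phrase(bind, L).   % fails
```

The length of the trace also seems to follow from the grammar rather than from the debugger: each restid clause consumes a character and recurses before checking it, so every failed check repeats the work on the rest of the input.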