1 27th September 05:51
stephen leake
OpenToken version 3.1 preview



I've finally made some time to work on the OpenToken release.

I've created a new web page
http://www.stephe-leake.org/ada/opentoken.html

It has a tarball of the version 3.1w ('w' for working) sources. These
are also in the ada-france monotone server.

The web page also has a list of changes. I've merged the version I
used in webcheck (which required significant changes in the HTML
parser), and in my work (which uncovered some other bugs). Both of
those projects are now using this OpenToken package. It includes fixes
for the two Debian bugs against OpenToken.

Please look it over, and let me know if you'd like something changed.

If I get no responses in two weeks, I'll declare this release final,
and just drop the 'w'.

If someone could provide dates for the previous OpenToken releases,
that would be fun.

--
-- Stephe


 


2 3rd October 19:28
adamagica



There is a problem with Bracketed_Comment. If it extends over more
than one line, the token is correctly recognized, but its lexeme
is wrong.

You can use Bracketed_Comment_Test in directory Test to reproduce
the wrong behaviour.


The release dates should be on Ted Dennison's site.

3 13th October 00:51
stephen leake


AdaMagica <christoph.grein@eurocopter.com> writes:

The line feed characters are dropped from the lexeme, on Windows.


I've added a test that shows the problem.

I don't suppose you have an idea of how to fix it?

It will be interesting to figure out how to make that test portable
between Windows and GNU/Linux. The easiest way to identify which line
ending to use that I know of is to look at
GNAT.Directory_Operations.Dir_Separator; it's '\' for CR LF, '/' for
LF. Don't know how to deal with Mac!
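The directory-separator heuristic above can be sketched as follows. This is a minimal illustration in Python rather than Ada, and `line_ending_for` is a hypothetical helper, not part of OpenToken or GNAT; it only mirrors the "'\' means CR LF, '/' means LF" rule described above.

```python
import os


def line_ending_for(dir_separator: str) -> str:
    """Guess the text line ending from the directory separator.

    Hypothetical helper mirroring the heuristic in the post:
    a backslash implies Windows (CR LF), anything else implies LF.
    """
    return "\r\n" if dir_separator == "\\" else "\n"


# Guess for the platform this script runs on.
guessed = line_ending_for(os.sep)
```

Python also exposes the platform convention directly as `os.linesep`, so a real program would not need the heuristic; it is shown only to make the rule concrete.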

The Ada standard intended System.System_Name to deal with this, but in
GNAT that's always SYSTEM_NAME_GNAT, so that's no help.

Anyone have a better idea?


If you mean
http://www.telepath.com/~dennison/Te...OpenToken.html, there
are a couple of dates there, thanks:

8/13/00 - Version 3.0b is now available.
1/27/00 - Version 2.0 is now available.

--
-- Stephe

4 13th October 00:51
adamagica


Also on Linux.


You guessed right: I haven't. I briefly browsed the code, but found no simple solution.


There are other OSes where an end of line is not a character in the
stream. Can OpenToken handle these?
We could do a Get_Line and insert an LF irrespective of what the OS
uses. If a lexeme spanning several lines is then output
(currently only Bracketed_Comment, I think), the output routine would
have to translate this back to the OS's New_Line (this has of course
to be documented in the recognizer).
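A rough sketch of that idea, in Python rather than Ada: read the input line by line, join the lines with LF as the in-memory end-of-line marker regardless of platform, and translate back only when a multi-line lexeme is written out. The function names and `CANONICAL_EOL` are hypothetical, and nothing here is OpenToken's actual API.

```python
import os

CANONICAL_EOL = "\n"  # assumed in-memory end-of-line marker


def feed_lines(text: str) -> str:
    """Split on any line terminator, then re-insert a canonical LF.

    str.splitlines() drops the terminators, much as a Get_Line-style
    read would, so each line gets CANONICAL_EOL appended explicitly.
    """
    return "".join(line + CANONICAL_EOL for line in text.splitlines())


def lexeme_for_output(lexeme: str, newline: str = os.linesep) -> str:
    """Translate canonical LFs in a lexeme back to the target line ending."""
    return lexeme.replace(CANONICAL_EOL, newline)


# A bracketed comment spanning two lines, with Windows line endings.
buffer = feed_lines("/* first\r\nsecond */\r\n")
```

With this arrangement the recognizer only ever sees LF, and the platform's line ending matters at exactly one place, the output routine.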

There is a declaration EOL_Character in package OpenToken.

5 13th October 00:51
dmitry a. kazakov


You could do what I did in the Simple Components for Ada parser. I
decoupled sources from the parser itself. The source is an abstract object
that provides basic operations like "get next line" and "forward to the
next line". The obvious advantage is that you need not care about LF, CR
in the parser, and can use files, streams, strings, GUI text buffers, etc,
as a source to the same parser.
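
The decoupling described above might look roughly like this. It is a sketch in Python, not the Simple Components API: `Source`, `StringSource`, and `count_lines` are names invented for illustration, and the two methods are one reading of "get next line" and "forward to the next line".

```python
from abc import ABC, abstractmethod
from typing import List, Optional


class Source(ABC):
    """Hypothetical abstract source: the parser sees lines, never LF or CR."""

    @abstractmethod
    def get_line(self) -> Optional[str]:
        """Return the current line without its terminator, or None at end."""

    @abstractmethod
    def next_line(self) -> None:
        """Advance to the following line."""


class StringSource(Source):
    """A source backed by an in-memory string."""

    def __init__(self, text: str) -> None:
        self._lines: List[str] = text.splitlines()
        self._index = 0

    def get_line(self) -> Optional[str]:
        if self._index < len(self._lines):
            return self._lines[self._index]
        return None

    def next_line(self) -> None:
        self._index += 1


def count_lines(source: Source) -> int:
    """Stand-in for a parser: it depends only on the Source interface."""
    count = 0
    while source.get_line() is not None:
        count += 1
        source.next_line()
    return count
```

A file-backed or GUI-buffer-backed source would implement the same two methods, and `count_lines` would work unchanged.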

My 2 cents.

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

6 13th October 00:51
vlc


The last time I used a Mac, it used a colon (":") to
separate directories. But that was before Mac OS X; I don't know
whether this changed with the BSD kernel.

7 13th October 00:51
sjw


For the 99.999% of Mac users who are on Mac OS X, treat it as Unix.

8 13th October 00:52
stephen leake


"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:


Yes, OpenToken does this; the sources are called Text_Feeders.

The provided file Text_Feeder uses Ada.Text_IO, so it does "the right
thing" for each operating system.

Which is why the LF is missing from the lexeme; Ada.Text_IO.Get_Line
consumes it.

OpenToken also provides a String Text_Feeder, which of course has no
notion of lines.

--
-- Stephe

9 13th October 00:52
stephen leake


AdaMagica <christoph.grein@eurocopter.com> writes:


The current file Text_Feeder uses Ada.Text_IO, so it should do "the
right thing" for any OS.

That's what the text feeder does now. Actually, it inserts
EOL_Character (see below).

So the LF must be dropped after that; I'll have to look harder.


EOL_Character has a comment saying to change it for your OS; not very
friendly, as it's a constant!

It's used in OpenToken.Recognizer.Character_Set.Standard_Whitespace,
OpenToken.Recognizer.Line_Comment.Analyze, and
OpenToken.Recognizer.String.Analyze.

I'll change the comment to "we use this regardless of OS, since we
need a standard way of representing an end of line in a string
buffer".

--
-- Stephe

10 13th October 00:52
dmitry a. kazakov


What is wrong with that? I do it exactly the same way for text files.

Well, provided there is no need to make it compatible with how the Ada RM
defines "a line" (I am not sure that an "LRM line" is always the same as an
"OS line"). Another issue could be compatibility with a GUI text editor.

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de