1 23rd April 02:13
david frank
PL/I cant parse strings without looping



I just challenged John Smith, who implied in comp.lang.fortran
that C syntax was better than Fortran's for manipulating text
(see the topic "C or Fortran"), by posting the Fortran parsing
problem below and inviting him to translate it to C.

PL/I is also inferior to Fortran syntax for parsing text, because it
lacks char array parsing functions and the ability to address a
caller's string as an array in a parsing function/subroutine.
If someone wants to have a crack at writing the code below in PL/I,
be my guest.

!-----------------
program parse_text
  character(50) :: line = 'remove blanks, reverse string using array syntax'
  call parse(line)
  write (*,*) line ! " xatnysyarragnisugnirtsesrever,sknalbevomer"
end program

! --------------------------------
subroutine parse(string)   ! strip blanks and reverse string
  character(*) :: string   ! via char array pack function
  character,pointer :: a(:)
  integer :: n
  interface
    function S2P(s) result(p)
      character(*),target :: s
      character,pointer :: p(:)
    end function
  end interface

  a => S2P(string)         ! equate array to string
  n = len_trim(string)     ! #chars in string
  a(n:1:-1) = PACK( a(1:n), a(1:n) /= ' ', SPREAD(' ',1,n) )
end subroutine

! -----------------------------
! NOTE: parse declares S2P with one argument, but it is defined here with
! two. This is not standard-conforming and presumably relies on how the
! compiler passes the character length.
function S2P(s,n) result(p)
  integer :: n
  character,target :: s(n)
  character,pointer :: p(:)
  p => s
end function


2 2nd May 11:48
david frank
PL/I cant parse char-by-char



<<<< snip "Fortran or C" message in comp.lang.fortran >>>

well, the basic idea is to run through a file and record any word
fitting a particular criterion, i.e. in html:
<.... "keywords" , contents="contentsA , contentsB, ...">
so here I have to look for "<" and the corresponding ">" because,
in between, is the reserved word giving me the nature of "contentsA"
etc...
I then have to find the starting point of the keywords list, read them
one by one and save them in an array.
Once this is done I have to scan through the entire file, avoiding
everything that's html (so reading only the text) and look for those
words I saved (the keywords here).
This process is to be repeated for all types of text used to qualify a
text file (keywords, title, author, that kind of thing).
So obviously a subroutine is going to be used with criterion-dependent
arguments.


word, giving its position. But this is very simple since I specify the
word I am looking for within the code. Now I wish to save a number of
words whose length and number I do not know, in an array (problem with
fixed size here, isn't there?), knowing only their position via the
fact that they are "close" to some other word (the reserved word) that
I know.

That so far is my problem. I have given _some_ thought to the problem
and started to wonder if f77 was the best language etc ... you know the
rest.

Also, I noticed that an html file done with dreamweaver has a ^M at the
end of line, probably a carriage return, whilst another html file did
not: the code worked fine with the second file but not with the first.

Any idea on this one ?
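
On the ^M question: ^M is how many editors display a carriage return (CR, ASCII 13). A file saved with DOS/Windows line endings ends every line with CR LF, so a program that splits only on LF keeps a trailing CR on each line, and comparisons against expected words can then fail. Below is a minimal sketch of stripping it, written in Python and assuming that CR LF endings are indeed what the Dreamweaver file contains. `normalize_line` is a hypothetical helper name, not something from the thread.

```python
def normalize_line(line):
    """Drop trailing CR and LF characters so both kinds of file compare alike."""
    return line.rstrip("\r\n")


unix_line = "keywords\n"      # line from a file with LF line endings
dos_line = "keywords\r\n"     # same line from a file with CR LF line endings

# Without normalizing, the DOS line still carries the CR after the LF is cut.
print(unix_line.rstrip("\n") == dos_line.rstrip("\n"))
# After normalizing, the two lines compare equal.
print(normalize_line(unix_line) == normalize_line(dos_line))
```

The same idea carries over to any language: remove a trailing CR before comparing or storing a line.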

Thanks contentsB contentsB
<<<<<<<<<<<<<<<<< end message snip >>>>>>>>>>>>>..

Having challenged you to solve my parsing problem without spotting
that you had already posed your own problem, I feel obligated to post
a solution.

I used your message (above), containing 1 occurrence of the string
contentsA, and inserted a couple of contentsB at the end.
My program successfully finds the occurrences of the 2 keywords,
see below:

contentsA
contentsB
1 occurances found for contentsA
2 occurances found for contentsB

! ------------------------
program parse3
  implicit none
  integer :: i, n, nlen, nkeywords = 0, nfinds(99) = 0
  character(30) :: keyword = ' ', keywords(99) = ' '
  character :: ch

  ! form='binary' is a compiler extension; the standard (F2003)
  ! equivalent is access='stream', form='unformatted'
  open (1,file='john.txt',form='binary')

  do                                      ! read file char-by-char
    read (1,end=101) ch
    keyword = keyword(2:10) // ch         ! 10 char sliding window
    if (keyword == 'contents="') then     ! found start of keywords
      keyword = ' ' ; i = 0
      do
        read (1,end=101) ch
        if (ch == ' ') cycle
        if (ch == '"') go to 101          ! end of all keywords (a last keyword
                                          ! with no trailing comma is not saved)
        i = i+1 ; keyword(i:i) = ch
        if (ch /= ',') cycle              ! not end of keyword
        nkeywords = nkeywords +1
        keywords(nkeywords) = keyword(1:i-1) ! save
        write (*,*) keyword(1:i-1)        ! output keyword
        keyword = ' ' ; i = 0             ! reset to accum next keyword
      end do
    end if
  end do
101 continue
  keyword = ' ' ; i = 0
  do
    read (1,end=102) ch
    keyword = keyword(2:30) // ch         ! 30 char text accumulator
    do n = 1,nkeywords
      nlen = len_trim(keywords(n))
      if (keyword(30-nlen+1:30) == keywords(n)(1:nlen)) then ! found text
        nfinds(n) = nfinds(n) +1          ! count occurrences
      end if
    end do
  end do
102 continue

  do n = 1,nkeywords
    write (*,'(I3,2A)') nfinds(n), ' occurances found for ', trim(keywords(n))
  end do
end program
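
For comparison, here is a rough sketch of the same two-step idea in Python: find the `contents="..."` list, then count each keyword in the text that follows it. It is not from any poster in the thread. The function names and the three-line `sample` text (a cut-down stand-in for john.txt) are assumptions made for illustration.

```python
def extract_keywords(text):
    """Return (keywords, rest): the comma-terminated words inside the first
    contents="..." list, and the text from its closing quote onward."""
    marker = 'contents="'
    start = text.find(marker)
    if start < 0:
        return [], text
    start += len(marker)
    end = text.find('"', start)
    if end < 0:
        end = len(text)
    # Blanks are skipped. A word is kept only when a comma ends it, so a
    # final word with no trailing comma is dropped, as in the Fortran above.
    parts = text[start:end].replace(" ", "").split(",")
    keywords = [word for word in parts[:-1] if word]
    return keywords, text[end:]


def count_keywords(text):
    """Count how often each keyword occurs after the keyword list itself."""
    keywords, rest = extract_keywords(text)
    # str.count counts non-overlapping matches; for words that cannot
    # overlap themselves this equals the sliding-window count.
    return {word: rest.count(word) for word in keywords}


sample = (
    '<.... "keywords" , contents="contentsA , contentsB, ...">\n'
    'the reserved word giving me the nature of "contentsA" etc...\n'
    'Thanks contentsB contentsB\n'
)
for word, hits in count_keywords(sample).items():
    print(hits, "occurrences found for", word)
```

On this sample the counts come out as 1 for contentsA and 2 for contentsB, the same figures the Fortran run reports.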
3 2nd May 11:48
heinz wiggeshoff
PL/I cant parse strings without looping


....
Repost this problem to a REXX newsgroup.
They could use the laughs.
4 2nd May 11:48
david frank
PL/I cant parse strings without looping


Be my guest and post this challenge wherever..

Methinks the laffs stop as soon as a non-fortran'er tries to write
code to process a string WITHOUT use of loops..

Since you are apparently knowledgeable in REXX, how about you?
5 2nd May 11:48
bill turner, wb4alm
PL/I cant parse strings without looping


Didn't have to...
....I read it here.

And I'm laughing so hard, I can hardly type.

That guy just never learns that FORTRAN, while useful,
is one of the oldest languages, and therefore is missing
many modern features, because, in the interest of backwards
compatibility, it just can't support many of the newer
concepts.

PL/I, however, was so forward thinking, especially for its time,
that it provided not only the syntactical capabilities for
future requirements, it also provided the functions already
built into the compiler. Way back in 1964.

FORTRAN is still playing catchup.


--

/s/ Bill Turner, wb4alm


******* -..-..-.. *** -..-..-.. *** -..-..-.. *** -. * . *******
*** ATTN: This newsgroup has been harvested for email addresses ***
*** and may still be. Sorry for the inconvenience, but after being ***
*** on the receiving end of thousands of directly addressed spam ***
*** emails containing virus payloads, I am now forced to provide ***
*** some protection for my communication resources. ***
*** Please remove the dashes and abracadabra magic to email me. ***
*** Thank you. /s/ Bill Turner, Wb4alm ***
******* -..- * -... * ....- * .- * .-.. * -- **** .- -.- ******
6 2nd May 11:48
howard hess
PL/I cant parse strings without looping


I was thinking it would fit nicely in comp.lang.perl.

And I was thinking that the FORTRAN example showed looping through the
array ... so the thread subject is wrong.

So is the reference to lack of parsing functions, since PL/I has the
following: native types for fixed and variable-length strings, an
operator for string concatenation ( || ), and a bunch of built-in
string functions:

BIT          Converts a value to bit
BOOL         Performs Boolean operation on 2 bit strings
CENTERLEFT   Returns a string with a value centered (to the left) in it
CENTERRIGHT  Returns a string with a value centered (to the right) in it
CENTRELEFT   Synonym for CENTERLEFT
CENTRERIGHT  Synonym for CENTERRIGHT
CHARACTER    Converts a value to a character string
CHARGRAPHIC  Converts a GRAPHIC string to a mixed character string
COPY         Returns a string consisting of n copies of a string
EDIT         Returns a string consisting of a value converted to a given picture specification
GRAPHIC      Converts a value to graphic
HIGH         Returns a character string consisting of n copies of the highest character in the collating sequence
INDEX        Finds the location of one string within another
LEFT         Returns a string with a value left-justified in it
LENGTH       Returns the current length of a string
LOW          Returns a character string consisting of n copies of the lowest character in the collating sequence
LOWERCASE    Returns a character string with all the characters from A to Z converted to their corresponding lowercase character
MAXLENGTH    Returns the maximum length of a string
MPSTR        Truncates a string at a logical boundary and returns a mixed character string
REPEAT       Returns a string consisting of n+1 copies of a string
REVERSE      Returns a reversed image of a string
RIGHT        Returns a string with a value right-justified in it
SEARCH       Searches for the first occurrence of any one of the elements of a string within another string
SEARCHR      Searches for the first occurrence of any one of the elements of a string within another string but the search starts from the right
SUBSTR       Assigns a substring of a string
TALLY        Returns the number of times one string occurs in another
TRANSLATE    Translates a string based on two translation strings
TRIM         Trims specified characters from the left and right sides of a string
UPPERCASE    Returns a character string with all the characters from a to z converted to their corresponding uppercase character
VERIFY       Searches for first nonoccurrence of any one of the elements of a string within another string
VERIFYR      Searches for first nonoccurrence of any one of the elements of a string within another string but the search starts from the right
WHIGH        Returns a widechar string consisting of n copies of the highest widechar ('ffff'wx)
WIDECHAR     Converts a value to a widechar string
WLOW         Returns a widechar string consisting of n copies of the lowest widechar ('0000'wx)

I usually try to avoid feeding the trolls, but this is, after all,
Thanksgiving week here.
7 2nd May 11:48
heinz wiggeshoff
PL/I cant parse strings without looping


Even an APL\360 person is chuckling now!
8 2nd May 11:48
heinz wiggeshoff
PL/I cant parse strings without looping


Or SNOBOL4. Maybe even FORTH, if I dust off that tome.


Notice that Frankenfortran insisted on a solution using an array of
characters overlaying a character string. This is his way of denying
any solution without ******** array manipulation to satisfy the
requirements, which could lead to weeks of follow-ups if the
troll is fed.

The Frankenfarter still hasn't RTFM about PL/I DEFINED and BASED
overlay, despite all the examples I posted here months ago.


....
Some of these are new to me. COPY? Equals REPEAT?

People in Florida should wear hats outside, even in November.
9 2nd May 11:48
david frank
PL/I cant parse strings without looping


Below is the function my example(s) used WITHOUT loops.
As you can see from Robin's reply in another topic, he didn't have
a PL/I equivalent for it; do you?


a(n:1:-1) = PACK( a, a /= ' ', SPREAD(' ',1,n) )

The above strips blanks and reverses the text in the char array a(1:n).

What's the APL, REXX, PL/I, or ANY language's equivalent?
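
As one possible answer for "any language", here is a minimal sketch of the same strip-and-reverse step in Python, with no explicit loop. It is not from any poster in the thread. It assumes the same 50-character blank-padded buffer as the Fortran example, and `strip_and_reverse` is a hypothetical name.

```python
def strip_and_reverse(buf):
    """Mimic a(n:1:-1) = PACK(a(1:n), a(1:n) /= ' ', SPREAD(' ',1,n))."""
    n = len(buf.rstrip(" "))             # len_trim: length without trailing blanks
    packed = buf[:n].replace(" ", "")    # PACK: keep only the non-blank characters
    packed = packed.ljust(n)             # pad with blanks to length n, like the SPREAD vector
    return packed[::-1] + buf[n:]        # reversed into positions 1..n, the rest unchanged


line = "remove blanks, reverse string using array syntax".ljust(50)
print(repr(strip_and_reverse(line)))
```

As in the Fortran version, the blanks removed from inside the text end up in front of the reversed characters, so the buffer keeps its length of 50.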
10 2nd May 11:49
howard hess
PL/I cant parse char-by-char


Now that we're past the "my language can beat up your language" stage
....

From your description, it's clear that F77 is not the 3GL *I'd* choose
to solve this problem. As far as processing strings, building tree
structures and symbol tables, etc., C, C++ and PL/I do a fine job.

C and C++ have better tools support for building parsers, i.e. things
like lex and yacc. C and C++ also have more sources for pre-built
parsers for popular languages (including HTML).

Java's support for parser-building tools and pre-built parsers for Web
languages is also quite good.

If you're really parsing HTML (or XML), you should be able to reuse a
parser written in C, C++, or Java, extending it with the actions you
need to perform on the parse tree. (Assuming the licensing terms meet
your needs.)

It isn't clear to me whether you're parsing HTML, or that was just the
example you chose.

If it isn't HTML, it isn't clear to me whether you need a full parser,
or just a simpler scanner to chunk and store strings between
delimiters "<" and ">". It also isn't clear to me whether the
delimiters can appear in quoted strings between delimiters, and how
sophisticated the lexical analysis needs to be.
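
The "simpler scanner" option could look roughly like the Python sketch below. It is an illustration only, under assumptions the thread does not settle: the delimiters are a single "<" and ">", a ">" inside double quotes does not close a chunk, and an unterminated chunk is discarded. `chunk_tags` is a hypothetical name.

```python
def chunk_tags(text):
    """Return the strings found between '<' and '>', ignoring '>' inside double quotes."""
    chunks = []
    current = None        # None while outside a chunk, else the characters collected so far
    in_quotes = False
    for ch in text:
        if current is None:
            if ch == "<":             # start of a new chunk
                current = []
                in_quotes = False
        elif ch == '"':
            in_quotes = not in_quotes # toggle quoted state, keep the quote itself
            current.append(ch)
        elif ch == ">" and not in_quotes:
            chunks.append("".join(current))   # end of chunk: store it
            current = None
        else:
            current.append(ch)
    return chunks


print(chunk_tags('<meta name="keywords" content="a > b, c">some text</meta>'))
```

If the delimiters may never appear inside quotes, the `in_quotes` bookkeeping can simply be dropped.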

Without knowing the execution environment, volume/throughput
requirements, etc., it's impossible to make a language recommendation.

If I were developing a prototype to perform the general tokenizing and
parsing of text, I'd use REXX. Other people I know would use perl.
If that prototype performed well enough, I wouldn't bother
reimplementing it in another language.