Mombu the Php Forum

1 4th November 17:14
amit malhotra
Importing Multiple Text Files



Looking for help in doing this:

I have a directory full of text files, each named with an ID and
containing the body of an article.

The names look like 1001_1.txt, 1001_2.txt, 1002_1.txt, 1003_1.txt, and
so on.

Basically, I need a script (PHP or otherwise) that would simply put
the name of the text file (before the underscore) in a column, and the
content of the text file in another column, and if there is any number
after the underscore, to put that in one column as well.

So you would have a table structure looking like this:

1001 | 1 | content of the text file
1001 | 2 | content of the text file
1002 | 1 | content of the text file
1003 | 1 | content of the text file

Any help on how I can accomplish this would be appreciated.

There are about 7000 text files that need to be read and imported into
the database. Each text file contains only an article of some sort, and
1001_1 and 1001_2 would basically contain different bodies of the same
article.

It would be even better if the contents of the _1 and _2 files could be
split into separate columns, e.g.:

1001 | 1 | content | 2 | content | 3 | content | 4 | content
1002 | 1 | content
1003 | 1 | content

(The file names go up to _4 at most, so you could have 1001_1.txt,
1001_2.txt, 1001_3.txt, 1001_4.txt, or only 1002_1.txt and 1003_1.txt.)


 


2 5th November 10:27
amit malhotra



Thanks for your reply, but at this point, there is no database, just
text files with data in them which need to be put into a table.

I'll look into this, thank you for your tips.

Database design doesn't need to be correct. I simply require a table
that will contain the data within the text files. Really, there's
nothing more than ONE table, so there's no fuss for keys, primary,
secondary, relationships, etc. Just one table that contains the text
file name in one column and the content in another. Only caveat is,
text file names are as I described, 1001_1.txt 1001_2.txt 1001_3.txt
(for content related to each other in 3 different paragraphs) and
1002_1.txt, 1003_1.txt, 1004_1.txt, 1004_2.txt, and so forth and so
on.

Regards


 


3 5th November 10:27
jerry stuckle


Database design DOES MATTER! Get it correct before you go any further.
A sloppy design will cause you no end of problems.

This doesn't mean you need 50 tables, etc. But the time spent on
getting it right will be worth the effort.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex@attglobal.net
==================
4 5th November 10:28
geoff berrow


This is where the normalisation comes in. You are suggesting putting
the data into an, as yet, undetermined number of columns. OK, you say
a maximum of 4, but experience tells us to always plan for more.

Why don't you store the data like this?

ID | file_prefix | file_suffix | content | ...(other file data)

It's hard to give script examples without writing it for you.

You can start by looking at opendir and the examples here
http://www.php.net/manual/en/function.opendir.php

Get the contents
http://uk3.php.net/manual/en/functio...t-contents.php

Then you'll have to extract the data from the file name. Chop off the
extension using http://uk.php.net/manual/en/function.substr.php

Finally create an array containing the file elements using
http://uk.php.net/manual/en/function.explode.php

Bang the whole lot into your database and Robert is your mother's
brother.
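
The steps above could be sketched roughly like this. It is only a minimal sketch, not a finished importer: the function names `parse_article_filename()` and `read_articles()` are made up for illustration, and it assumes every file of interest is named like 1001_1.txt and sits directly in one directory.

```php
<?php
// Hypothetical helper: split a name like "1001_2.txt" into prefix and suffix.
// Returns null for anything that does not match the expected pattern.
function parse_article_filename(string $filename): ?array
{
    // Chop off the ".txt" extension with substr().
    if (strtolower(substr($filename, -4)) !== '.txt') {
        return null;
    }
    $base = substr($filename, 0, -4);

    // explode() on the underscore should give exactly [prefix, suffix].
    $parts = explode('_', $base);
    if (count($parts) !== 2 || !ctype_digit($parts[0]) || !ctype_digit($parts[1])) {
        return null;
    }
    // The prefix stays a string so that leading zeros, if any, survive.
    return ['prefix' => $parts[0], 'suffix' => (int) $parts[1]];
}

// Hypothetical helper: read every matching file in $dir into a list of rows.
function read_articles(string $dir): array
{
    $handle = opendir($dir);
    if ($handle === false) {
        throw new RuntimeException("Cannot open directory: $dir");
    }
    $rows = [];
    while (($entry = readdir($handle)) !== false) {
        $parsed = parse_article_filename($entry);
        if ($parsed === null) {
            continue; // skips ".", ".." and any unrelated file
        }
        $content = file_get_contents($dir . DIRECTORY_SEPARATOR . $entry);
        if ($content === false) {
            continue; // unreadable file, leave it out
        }
        $parsed['content'] = $content;
        $rows[] = $parsed;
    }
    closedir($handle);

    // readdir() gives no guaranteed order, so sort by prefix, then suffix.
    usort($rows, function (array $a, array $b): int {
        return [$a['prefix'], $a['suffix']] <=> [$b['prefix'], $b['suffix']];
    });
    return $rows;
}
```

Each row then holds 'prefix', 'suffix' and 'content', which maps directly onto the file_prefix, file_suffix and content columns suggested above.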

You may have to do it in batches. I doubt most normal setups will do
this without timing out. So you'll need to query the database first
to find out where you left off and start adding files from that point.
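
One way the batching could look, as a sketch only. The thread does not name a database, so PDO with the SQLite driver, the table name `articles`, the function name `import_batch()` and the default batch size of 500 are all assumptions made for the example.

```php
<?php
// Hypothetical helper: import at most $batchSize not-yet-imported files per call
// and return how many were added. Assumes PDO + SQLite and a table named
// "articles"; neither is specified in the thread.
function import_batch(PDO $db, string $dir, int $batchSize = 500): int
{
    $db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    $db->exec('CREATE TABLE IF NOT EXISTS articles (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        file_prefix TEXT NOT NULL,
        file_suffix INTEGER NOT NULL,
        content TEXT NOT NULL,
        UNIQUE (file_prefix, file_suffix)
    )');

    // Query the database first to find out which files are already in.
    $done = [];
    foreach ($db->query('SELECT file_prefix, file_suffix FROM articles') as $row) {
        $done[$row['file_prefix'] . '_' . (int) $row['file_suffix']] = true;
    }

    $names = scandir($dir); // returns the entries sorted by name
    if ($names === false) {
        throw new RuntimeException("Cannot read directory: $dir");
    }

    $insert = $db->prepare(
        'INSERT INTO articles (file_prefix, file_suffix, content) VALUES (?, ?, ?)'
    );
    $added = 0;
    $db->beginTransaction();
    foreach ($names as $name) {
        if ($added >= $batchSize) {
            break; // leave the rest for the next run
        }
        if (!preg_match('/^(\d+)_(\d+)\.txt$/', $name, $m)) {
            continue; // not one of the article files
        }
        $key = $m[1] . '_' . (int) $m[2];
        if (isset($done[$key])) {
            continue; // imported in an earlier batch
        }
        $content = file_get_contents($dir . DIRECTORY_SEPARATOR . $name);
        if ($content === false) {
            continue; // unreadable file, leave it out
        }
        $insert->execute([$m[1], (int) $m[2], $content]);
        $done[$key] = true;
        $added++;
    }
    $db->commit();
    return $added;
}
```

Calling it repeatedly until it returns 0 works through the whole directory without any single run having to handle all 7000 files.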

--
Geoff Berrow (Put thecat out to email)
It's only Usenet, no one dies.
My opinions, not the committee's, mine.
Simple RFDs www.ckdog.co.uk/rfdmaker
5 8th November 00:06
amit malhotra


Hi Jerry,

Thanks for replying again. Perhaps I'm looking in the wrong place for
this, as I really do not need a database. I know, very well, from
experience, that Database Design matters. I know that I should have a
normalized database, but I do not want to create a database from
this. I know exactly what type of files I have, I need the data in
those files collected in a tabular format for me to do whatever I want
with that data. Perhaps I should look at PERL to do that instead of
PHP as I believe PERL has a very strong Text I/O functionality. My
knowledge of either is limited, so it might prove to be a learning
experience.

If there is a way for PHP to just make me a table using all those text
files and I can manipulate that data for future use, it would be
perfect as I do prefer to do it in PHP, mainly to learn more of its
capabilities.

Regards,
6 8th November 00:06
amit malhotra


Hi Geoff,

Thank you so much for pointing me to some of the things that can
help in accomplishing this task. It is going to be 4, and the script
will ONLY be used to import the data in the text files I have, that's
it. It is never going to grow, as once the data is in a table, I can
manipulate it as I want.

There's no point in saving it like this. Again, thinking about it,
ideally, it would be

ID | file_prefix | content

For all files that have file_prefix 1001 (whether it's 1001_1, 1001_2,
or 1001_3), the content must be concatenated in the "content" field. But
it doesn't matter, even if it's:

1001 | 01 | content1 | 02 | content2 | 03 | content3

I will be able to manipulate the data easily, as long as it's there in
a table format. In this case, I can always write a query to
concatenate all the data in the fields "content1, content2, content3,
content4" into "content_all" and bam, I have a table with the file
name "1001" and "content_all".
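
For illustration, the concatenation described here could also be done in PHP before anything is stored. This is a sketch under assumptions: the function name `concatenate_by_prefix()`, the row layout with 'prefix', 'suffix' and 'content' keys, and the blank-line separator are not from the thread.

```php
<?php
// Hypothetical helper: merge the content of all files sharing a prefix,
// in suffix order (_1, _2, _3, _4), into one string per prefix.
// $rows is assumed to be a list of ['prefix' => ..., 'suffix' => ..., 'content' => ...].
function concatenate_by_prefix(array $rows, string $separator = "\n\n"): array
{
    // Group the contents by prefix, keyed by suffix.
    $parts = [];
    foreach ($rows as $row) {
        $parts[$row['prefix']][$row['suffix']] = $row['content'];
    }

    $combined = [];
    foreach ($parts as $prefix => $bySuffix) {
        ksort($bySuffix); // make sure _1 comes before _2, and so on
        $combined[$prefix] = implode($separator, $bySuffix);
    }
    // Note: PHP turns numeric string keys such as "1001" into integer keys.
    return $combined;
}
```

The result has one entry per file prefix, which corresponds to the "content_all" idea above.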


Thank you. I will look at these resources to see what I can make out
of them, but I think, I'm leaning towards getting my solution from
PERL instead of PHP as it's probably better to manipulate the text
data for what I need.


Regards,

Amit
7 8th November 00:06
jerry stuckle


Ok, that's a little more clear. When you said placing the data in
columns, I thought you were talking about SQL database columns.

You can also do it quite easily in PHP. The only difference from what
I suggested earlier is that you would generate the HTML for a table row
instead of inserting into a database.
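
As a rough sketch of that idea (the function name `render_table()` and the row layout with 'prefix', 'suffix' and 'content' keys are assumptions for the example):

```php
<?php
// Hypothetical helper: turn a list of rows into an HTML table, one <tr> per file.
// $rows is assumed to be a list of ['prefix' => ..., 'suffix' => ..., 'content' => ...].
function render_table(array $rows): string
{
    $html = "<table>\n";
    foreach ($rows as $row) {
        $html .= sprintf(
            "  <tr><td>%s</td><td>%d</td><td>%s</td></tr>\n",
            htmlspecialchars((string) $row['prefix']),
            $row['suffix'],
            // Escape the article text so stray "<" or "&" cannot break the markup.
            htmlspecialchars((string) $row['content'])
        );
    }
    return $html . "</table>\n";
}
```

The returned string can be echoed to the browser or written to a file.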

The idea would be similar in Perl or any other language. So I guess I
don't understand your question, then. What have you tried so far?

