Mombu the Programming Forum

1 10th November 03:14
seb
compare contents of file with those of a directory



Hi,

I'm struggling with a task that seems perfect for awk, as is usually the
case with BibTeX files. The file has thousands of entries like this:


@ARTICLE{913,
author = {Doe, J.},
title = {title},
journal = {some journal},
year = {1492},
volume = {30},
pages = {40-50},
key = {913},
pdf = {filename.pdf}
}


Many entries already have that last 'pdf' line, but a lot of entries still
need it added. The file names are the base names (without the directory) of
files in a single directory. The idea is therefore to find which files in
that directory of pdf files do not have a 'pdf' line anywhere in the BibTeX
file.

I know how I can get a list of the file names already in the BibTeX file,
but am not sure how to proceed from there. I would highly appreciate some
suggestions on how to approach this. Thanks in advance.


Cheers,

--
Seb

2 10th November 03:14
bob harris

cd /to/directory/interest
ls | awk '
NR == FNR && $1 == "pdf" {
    #
    # reading BibTeX and found a pdf entry
    #
    gsub(/[{}]/,"",$3)   # strip both braces around the file name
    files[$3] = 1        # build hash of pdf entries
}
NR != FNR {
    #
    # reading stdin of directory names
    #
    if ( !($0 in files) ) {
        print            # no BibTeX pdf entry for this file
    }
}
' BibTeX.file - # dash says read stdin _AFTER_ BibTeX

Bob Harris
3 10th November 03:14
ed morton

ITYM:

cd /to/directory/interest
ls | awk '
NR == FNR {
    if ($1 == "pdf") {
        # reading BibTeX and found a pdf entry
        gsub(/[{}]/,"",$3)
        files[$3] = 1 # build hash of pdf entries
    }
    next
}
# Reading stdin of directory names
!($0 in files) # no BibTeX pdf entry for this file
' BibTeX.file - # dash says read stdin _AFTER_ BibTeX

It still won't work for file names that contain newlines, but
hopefully you don't have that situation.

Regards,

Ed.
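
For anyone who wants to try the approach before pointing it at a real bibliography, here is a small worked run on throwaway data. The directory layout and the names refs.bib, a.pdf and b.pdf are made up for illustration; the awk program is the one posted above, and the only change is that the BibTeX file is given by full path because the sketch does its cd inside a subshell.

```shell
# Build a scratch area: two PDFs on disk, one of them already referenced.
tmp=$(mktemp -d)
mkdir "$tmp/pdfs"
touch "$tmp/pdfs/a.pdf" "$tmp/pdfs/b.pdf"

# Sample BibTeX entry whose pdf line points at a.pdf (made-up data).
cat > "$tmp/refs.bib" <<'EOF'
@ARTICLE{913,
author = {Doe, J.},
key = {913},
pdf = {a.pdf}
}
EOF

# First input (refs.bib): remember every pdf name.
# Second input (-, the ls output): print names that were never seen.
missing=$(cd "$tmp/pdfs" && ls | awk '
NR == FNR {
    if ($1 == "pdf") {
        gsub(/[{}]/, "", $3)
        files[$3] = 1
    }
    next
}
!($0 in files)
' "$tmp/refs.bib" -)

printf '%s\n' "$missing"   # prints: b.pdf
rm -rf "$tmp"
```

Only b.pdf is reported, since a.pdf already appears in a pdf line.
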
4 10th November 03:14
seb

Indeed, I don't. This dealt with the task completely. It's amazing how
quickly my awk skills get rusty if I don't use it for a while. Thanks to
both of you for getting me up to speed again.


Cheers,

--
Seb
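
A possible follow-up for files that are less tidy than the sample entry: the scripts in this thread take the file name from the third whitespace-separated field, so a trailing comma after the closing brace, or a space inside the file name, would defeat the lookup. The sketch below is one way to relax that, not something posted by the original authors. The function name list_unreferenced_pdfs is hypothetical, it assumes each pdf field sits on a line of its own, and it swaps the NR == FNR test for FILENAME == ARGV[1] so that an empty BibTeX file does not cause the directory listing to be mistaken for the first input. File names containing newlines are still not handled.

```shell
# list_unreferenced_pdfs BIBFILE PDFDIR   (hypothetical helper)
# Prints the entries of PDFDIR that no pdf line in BIBFILE mentions.
list_unreferenced_pdfs() {
    bib=$1
    dir=$2
    # cd happens in a subshell, so a relative BIBFILE path still resolves.
    ( cd "$dir" && ls ) | awk '
    FILENAME == ARGV[1] {
        # Accept e.g.:  pdf = {some file.pdf},   or   PDF = "x.pdf"
        if (tolower($0) ~ /^[ \t]*pdf[ \t]*=/) {
            name = $0
            sub(/^[^=]*=[ \t]*/, "", name)    # drop everything up to "="
            sub(/[ \t]*,?[ \t]*$/, "", name)  # drop trailing comma and blanks
            gsub(/[{}"]/, "", name)           # drop braces or double quotes
            files[name] = 1
        }
        next
    }
    !($0 in files)    # directory entry with no matching pdf line
    ' "$bib" -
}

# Usage on made-up sample data.
tmp=$(mktemp -d)
mkdir "$tmp/pdfs"
touch "$tmp/pdfs/a.pdf" "$tmp/pdfs/b c.pdf" "$tmp/pdfs/d.pdf"
printf '%s\n' '@ARTICLE{1,' '  pdf = {a.pdf},' '  year = {1492}' '}' \
              '@ARTICLE{2,' '  PDF = "b c.pdf"' '}' > "$tmp/refs.bib"
missing=$(list_unreferenced_pdfs "$tmp/refs.bib" "$tmp/pdfs")
printf '%s\n' "$missing"   # prints: d.pdf
rm -rf "$tmp"
```

Working on the whole line instead of $3 is what lets names with spaces through; the cost is the extra assumption about one field per line.
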
Copyright © 2006 SmartyDevil.com