1 30th October 05:49
scott
Finding duplicates with RegEX



For the sample below, I'm trying to construct an expression that captures the
drive size (second word) and the corresponding serial ID (fourth word). The
expression I have so far is: '\w*(\d{2,3}.\dGB).*(\(.*\))'
It returns only part of the drive size (e.g. 74.4GB instead of 274.4GB).
Can the expression also filter out duplicates?


XMTOMC320 274.4GB 512B/sect (A8188BKE)
XMTOMC320 136.5GB 512B/sect (A818DNME)
XSCHT6146F 68.0GB 520B/sect (3HYX5ERR0000750175VW)
XMTOMC320 488.7GB 512B/sect (A818NGCE)

XMTOMC320 274.4GB 512B/sect (A8188BKE)
XMTOMC320 136.5GB 512B/sect (A818DNME)
XMTOMC320 488.7GB 512B/sect (A818NGCE)
XSCHT6146F 68.0GB 520B/sect (3HYX5ERR0000750175VW)

2 30th October 05:50
alexander mueller



scott wrote:

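Escape the literal dot as \. because an unescaped . matches any character.
The missing digit itself comes from the leading \w*, which also matches digits
and so can swallow the first digit of the size. Match the first word and the
spaces explicitly instead.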
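
To see where the missing digit goes, here is a minimal sketch that runs the original pattern and an anchored one against a single sample line. FirstSize is a made-up helper name used only for this illustration. Run under cscript, it should print 74.4GB for the original pattern and 274.4GB for the anchored one.

```vbscript
' Compare the original pattern with an anchored one on one sample line.
' FirstSize is a hypothetical helper, not part of any library.
Function FirstSize(pattern, text)
    Dim re, matches
    Set re = New RegExp
    re.Pattern = pattern
    Set matches = re.Execute(text)
    If matches.Count > 0 Then
        ' SubMatches(0) is the first capture group: the drive size
        FirstSize = matches(0).SubMatches(0)
    Else
        FirstSize = ""
    End If
End Function

Dim sample
sample = "XMTOMC320 274.4GB 512B/sect (A8188BKE)"

' Original: \w* also matches digits, so it can take the leading "2"
' and still leave two digits for \d{2,3}.
WScript.Echo FirstSize("\w*(\d{2,3}.\dGB).*(\(.*\))", sample)

' Anchored: the first word and the spaces are matched explicitly,
' so the size group has to start at the first digit.
WScript.Echo FirstSize("^\w+ +(\d{2,3}\.\dGB) +.*\((.+)\)$", sample)
```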

Not directly, but you can handle it yourself by letting a
function process the (sub)matches when you call
RegExp.Replace:


Regards,
Alex

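Dim src   ' sample input
Dim dic   ' "size serial" key -> formatted output line
Dim r     ' regular expression object

src = _
    "XMTOMC320 274.4GB 512B/sect (A8188BKE)" & vbCrLf _
    & "XMTOMC320 136.5GB 512B/sect (A818DNME)" & vbCrLf _
    & "XSCHT6146F 68.0GB 520B/sect (3HYX5ERR0000750175VW)" & vbCrLf _
    & vbCrLf _
    & "XMTOMC320 274.4GB 512B/sect (A8188BKE)" & vbCrLf _
    & "XMTOMC320 136.5GB 512B/sect (A818DNME)" & vbCrLf _
    & "XSCHT6146F 68.0GB 520B/sect (3HYX5ERR0000750175VW)" & vbCrLf

Set dic = CreateObject("Scripting.Dictionary")
Set r = New RegExp

r.MultiLine = True   ' let ^ and $ apply to each line
r.Global = True      ' visit every match, not just the first

' Submatch 1: drive size, submatch 2: serial without the parentheses
r.Pattern = "^\w+ +(\d{2,3}\.\dGB) +.*\((.+)\)$"

' Replace is used only to invoke the callback per match; its result is discarded
Call r.Replace(src, GetRef("GetDriveSize"))
WScript.Echo Join(dic.Items, vbCrLf)


' Called once per match: s1 = drive size, s2 = serial
Sub GetDriveSize(match, s1, s2, pos, src)

    Dim key
    key = s1 & " " & s2

    ' Record each size/serial pair only once
    If Not dic.Exists(key) Then
        dic.Add key, "Serial: " & s2 & ", Size: " & s1
    End If

End Sub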
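
If the callback form of Replace feels indirect, the same idea can be sketched line by line with Execute and a Dictionary. This is only an illustration under assumptions: the input is already split into lines (here a hard-coded array cut down from the sample), and the variable names lines, seen, re and m are made up for the example.

```vbscript
' Sketch: filter duplicate size/serial pairs line by line, no Replace callback.
' The variable names and the shortened sample data are illustrative only.
Dim lines, seen, re, m, i, key

lines = Array( _
    "XMTOMC320 274.4GB 512B/sect (A8188BKE)", _
    "XMTOMC320 136.5GB 512B/sect (A818DNME)", _
    "", _
    "XMTOMC320 274.4GB 512B/sect (A8188BKE)", _
    "XMTOMC320 136.5GB 512B/sect (A818DNME)")

Set seen = CreateObject("Scripting.Dictionary")
Set re = New RegExp
re.Pattern = "^\w+ +(\d{2,3}\.\dGB) +.*\((.+)\)$"

For i = 0 To UBound(lines)
    Set m = re.Execute(lines(i))
    ' Blank or non-matching lines give an empty match collection
    If m.Count > 0 Then
        key = m(0).SubMatches(0) & " " & m(0).SubMatches(1)
        ' Exists() is what actually filters the duplicates
        If Not seen.Exists(key) Then
            seen.Add key, "Serial: " & m(0).SubMatches(1) & _
                          ", Size: " & m(0).SubMatches(0)
        End If
    End If
Next

WScript.Echo Join(seen.Items, vbCrLf)
```

Either way the regular expression only extracts the fields; the duplicate filtering is done by the Dictionary.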
3 30th October 05:51
harvey colwell


Why not split each line into an array, then pull out the two bits of
info you want? After that it should be easy to do any other processing you
want.


sTmp = "XMTOMC320 274.4GB 512B/sect (A8188BKE)"

aTmp = Split(sTmp, " ")

MsgBox "Size: " & aTmp(1) & vbCrLf & "SerialNo: " & aTmp(3)
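
Note that aTmp(3) still carries the parentheses. A rough sketch of how the Split approach could be extended to several lines with duplicate filtering follows. It assumes exactly one space between fields, a serial wrapped in parentheses, and that the serial alone identifies a drive. The names aLines, dic, sLine and sSerial are made up for the example.

```vbscript
' Sketch: Split-based parsing with duplicate filtering.
' Assumes single spaces between fields and a serial wrapped in parentheses.
Dim aLines, aTmp, dic, sLine, sSerial

aLines = Array( _
    "XMTOMC320 274.4GB 512B/sect (A8188BKE)", _
    "XSCHT6146F 68.0GB 520B/sect (3HYX5ERR0000750175VW)", _
    "", _
    "XMTOMC320 274.4GB 512B/sect (A8188BKE)")

Set dic = CreateObject("Scripting.Dictionary")

For Each sLine In aLines
    aTmp = Split(sLine, " ")
    ' Skip blank or short lines that do not have four words
    If UBound(aTmp) >= 3 Then
        ' Drop the surrounding parentheses from the fourth word
        sSerial = Mid(aTmp(3), 2, Len(aTmp(3)) - 2)
        ' Keyed on the serial, so a repeated drive is stored only once
        If Not dic.Exists(sSerial) Then
            dic.Add sSerial, aTmp(1)
        End If
    End If
Next

For Each sSerial In dic.Keys
    WScript.Echo "Size: " & dic(sSerial) & ", SerialNo: " & sSerial
Next
```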