Mombu the Microsoft Forum sponsored links

Go Back   Mombu the Microsoft Forum > Microsoft > SQL SERVER (TECHNET) > Converting text column to XML
User Name
Password
REGISTER NOW! Mark Forums Read

sponsored links


Reply
 
1 30th August 22:10
kenj
External User
 
Posts: 1
Default Converting text column to XML


We just completed the conversion of one text column in our database to
the XML datatype (about 15 million rows):
ALTER TABLE xmlTable ALTER COLUMN xmlColumn XML

Our database has gone from 36GB to 54GB with this single change (the
36GB and 54GB both indicate post-shrink utilized space).

I had read in another thread that XML was stored in a more efficient
binary format so, I was rather suprised at the growth of the database.

Is this to be expected or is there something I can configure to reduce
the storage requirement?

We're going to test a text to varchar(max) conversion next, to see how
that stacks up against the XML conversion but would love to just store
XML as XML.

Thanks,

Ken

Assuming I can link directly so the google group aggregator, the thread
with the more efficient reference can be viewed at
http://groups.google.com/group/microsoft.public.sqlserver.xml/browse_thread/thread/6e5858f5cf8f46fc/3da6e8a32f668783?lnk=gst&q=convert+text+to+xml+datatype&rnum=6#3da6e8a32f668783
  Reply With Quote


  sponsored links


2 30th August 22:11
kent tegels
External User
 
Posts: 1
Default Converting text column to XML


Hello KenJ,

I'm not too surprised, actually. The TEXT datatype was storing the XML which
is essentially one byte of storage for one byte of lexical XML. My guess
is that you only stored UTF-8 encoded XML in that type instance. The internal
storage of XML datatype is actually UTF-16 (as SQL Server understands collations,
not encodings). So in effect, you've likely doubled the amount of space consumed
to store the instances. Why not exactly double? SQL Server does do some tokenizing
of entity names, so we do see something less than 2x in most cases.


Expect it, and no, there's only ways to increase the space impact, not really
reduce it.

The critical question comes downt to this: are you validating or other processing
the XML with the SQL context or not? If not, don't use the XML type for exactly
this reason. If SQL Server is just secured storage of the XML for some other
application, consider using varbinary(max) instead so you don't have to worry
about the encoding issues.

Thanks,
Kent Tegels, DevelopMentor
http://staff.develop.com/ktegels/
  Reply With Quote
3 30th August 22:11
dan at pluralsight dot com dan sullivan
External User
 
Posts: 1
Default Converting text column to XML


I have found that it's not obvious what the most compact way to save xml data is. I blogged the tests I did and how did them here:

http://pluralsight.com/blogs/dan/archive/2006/08/06/32814.aspx

Dan AT Pluralsight DOT COM
  Reply With Quote
4 31st August 06:07
kenj
External User
 
Posts: 1
Default Converting text column to XML


Hi Kent,

Great reply from both you and Dan Sullivan. The UTF-16 encoding was
what caught us! We would have really been in trouble had we converted
both TEXT columns; the high-water mark for converting the one column
was about 215GB

Since the SQL 2000 database we're looking to upgrade just uses the text
columns as dumb storage for XML do***ents, we can definitely get by
without using the XML datatype. Assuming nobody is interested in
having the database enforce valid XML (it has to make it through MS
bizTalk before it gets to the DB), I'll take a closer look at
varbinary(max); I had never even considered that one.

FYI: As I mentioned, we did run a varchar(max) conversion and it was
virtually instantaneous. I'm guessing this is handled internally in
almost the exact same manner as the text data type.

Thanks again for the helpful information.

KenJ
  Reply With Quote


  sponsored links


Reply


Thread Tools
Display Modes




Copyright 2006 SmartyDevil.com - Dies Mies Jeschet Boenedoesef Douvema Enitemaus -
666