So I found myself having to im****t a boatload of multimillion-line text
files
into SQL Server. Because of idiosyncrasies in the data, BULK INSERT and
DTS
proved impracticable. Which meant writing a standalone program to do it.
I needed to code a (VB6) class that took care of the idiosyncrasies. The
object has a method called Activate() that opens a recordset which
verifies
the data structure of a destination (ADO) table, and one called PutValues
()
that inserts a single record into the table by calling AddNew, setting
values, and then calling Update.
rs.CursorLocation = adUseClient
rs.Open m_rsname, m_conn, adOpenStatic, adLockBatchOptimistic, adCmdTable
It should be apparent that I'm using batch updates: There's a "driver"
class
that copies rows with PutValues() for 2 seconds, then performs a batch
update, then updats a progress bar on a dialog.
This worked reasonably well, but the Activate() method creates a query to
return all the records in the table. When the table (in a test DB) has
millions of records in it already (and this can't be avoided, several
multimillion-line text files have to be appended into the same table) , it
can take up to several minutes just to open the recordset.
So, I made a second class whose Activate() method opened the recordset
this
way:
rs.Open "SELECT TOP 0 * FROM " + m_rsname, m_conn, adOpenStatic,
adLockReadOnly, adCmdText
The recordset was used only for the table structure validation and the
inserts are done by building INSERT commands as strings, executed against
the
connection object. It mimics "batch updates" by having PutValues() merely
collect the value into a big array of arrays of variants, and the
BatchUpdate() method creates a big string of INSERT commands separated by
semicolons.
Unfortunately, this is much slower than the original class, which used a
recordset. It inserts 45-ish records per tick of the progress bar where
the
original class inserted several hundred per tick.
So it was a failed experiment, and it looks like I'll have to use the
recordset. But I'm wondering if there's a way to open the recordset so
that
I can call AddNew on it, but doesn't query the entire table first.


|