On daily bases I have received files with double quotes comma separated, but in some records I found 2 times double quotes and some time different squence of double quotes placements.
I just want to remove all unnecessary (extra) double quotes (see the following values where i want to remove extra double quotes.
Following code i used is work fine with ""Company D-Val"" but didn't work with ""Recycle" Rubbish, C class", The code is
code:- Do While inFile.AtEndOfStream = False record = InFile.ReadLine() record = Replace(record, """", """") 'replace "" with single space outFile.WriteLine record & vbclrf Loop
<Learn4Deve...@discussions.microsoft.com> wrote: > On daily bases I have received files with double quotes comma separated, but > in some records I found 2 times double quotes and some time different squence > of double quotes placements.
> I just want to remove all unnecessary (extra) double quotes (see the > following values where i want to remove extra double quotes.
It was tricky getting the indexing right, but this seemed to do what you want on the test sample you provided ...
Do Until inFile.AtEndOfStream record = InFile.ReadLine nPos = Instr(record, Chr(34) & Chr(34)) do while nPos > 0 record = Left(record, nPos - 1) & Mid(record, nPos + 1) nPos = Instr(nPos + 1, record, Chr(34)) record = Left(record, nPos - 1) & Mid(record, nPos + 1) nPos = Instr(nPos + 1, record, Chr(34) & Chr(34)) loop outFile.WriteLine record Loop
The approach removes the first of a pair of double quotes and the very next double quote that follows; thereby removing the starting and ending quotes that enclose text, when there are too many. _____________________ Tom Lavedas
> On daily bases I have received files with double quotes comma separated, > but > in some records I found 2 times double quotes and some time different > squence > of double quotes placements.
> I just want to remove all unnecessary (extra) double quotes (see the > following values where i want to remove extra double quotes.
> Following code i used is work fine with ""Company D-Val"" but didn't work > with ""Recycle" Rubbish, C class", The code is
> code:- > Do While inFile.AtEndOfStream = False > record = InFile.ReadLine() > record = Replace(record, """", """") 'replace "" with single space > outFile.WriteLine record & vbclrf > Loop
You could do it like this for every line of text: 1. Replace every "," with your own delimiter. I use chr(0). 2. Remove all remaining double quotes. 3. Replace your own delimiters with ",". 4. Surround the line with double quotes. This code appears to work for your data sample:
Q = """" Z = Chr(0) sDelimiter = Q & "," & Q Set oFSO = CreateObject("Scripting.FileSystemObject") Set oFile = oFSO.OpenTextFile("d:\test.txt") While Not oFile.AtEndOfStream sLine = Replace(oFile.ReadLine, sDelimiter, Z) sLine = Replace(sLine, Q, "") sLine = Q & Replace(sLine, Z, sDelimiter) & Q WScript.Echo sLine Wend
Strange - my initial reply appears to have disappeared into the never-never. I'll try to post it once more:
You could do it like this for every line of text: 1. Replace every "," with your own delimiter. I use chr(0). 2. Remove all remaining double quotes. 3. Replace your own delimiters with ",". 4. Surround the line with double quotes. This code appears to work for your data sample:
Q = """" Z = Chr(0) sDelimiter = Q & "," & Q Set oFSO = CreateObject("Scripting.FileSystemObject") Set oFile = oFSO.OpenTextFile("d:\test.txt") While Not oFile.AtEndOfStream sLine = Replace(oFile.ReadLine, sDelimiter, Z) sLine = Replace(sLine, Q, "") sLine = Q & Replace(sLine, Z, sDelimiter) & Q WScript.Echo sLine Wend
> On daily bases I have received files with double quotes comma separated, > but > in some records I found 2 times double quotes and some time different > squence > of double quotes placements.
> I just want to remove all unnecessary (extra) double quotes (see the > following values where i want to remove extra double quotes.
The rubbish line looks like exactly that. the first paired double quotes are not matched by another such pair. In the context of your data, what does it mean to have a doubled double-quote?
Learn4Develop wrote: > On daily bases I have received files with double quotes comma separated, but > in some records I found 2 times double quotes and some time different squence > of double quotes placements.
> I just want to remove all unnecessary (extra) double quotes (see the > following values where i want to remove extra double quotes.
> Following code i used is work fine with ""Company D-Val"" but didn't work > with ""Recycle" Rubbish, C class", The code is
> code:- > Do While inFile.AtEndOfStream = False > record = InFile.ReadLine() > record = Replace(record, """", """") 'replace "" with single space > outFile.WriteLine record & vbclrf > Loop
record = Replace(record, Chr(34)&Chr(34), Chr(34)) 'replace "" with single "
-- Todd Vargo (Post questions to group only. Remove "z" to email personal messages)
"Tom Lavedas" wrote: > On Nov 16, 6:20 am, Learn4Develop > <Learn4Deve...@discussions.microsoft.com> wrote: > > On daily bases I have received files with double quotes comma separated, but > > in some records I found 2 times double quotes and some time different squence > > of double quotes placements.
> > I just want to remove all unnecessary (extra) double quotes (see the > > following values where i want to remove extra double quotes.
> It was tricky getting the indexing right, but this seemed to do what > you want on the test sample you provided ...
> Do Until inFile.AtEndOfStream > record = InFile.ReadLine > nPos = Instr(record, Chr(34) & Chr(34)) > do while nPos > 0 > record = Left(record, nPos - 1) & Mid(record, nPos + 1) > nPos = Instr(nPos + 1, record, Chr(34)) > record = Left(record, nPos - 1) & Mid(record, nPos + 1) > nPos = Instr(nPos + 1, record, Chr(34) & Chr(34)) > loop > outFile.WriteLine record > Loop
> The approach removes the first of a pair of double quotes and the very > next double quote that follows; thereby removing the starting and > ending quotes that enclose text, when there are too many. > _____________________ > Tom Lavedas > .
Hi Tom Thanks for the code, its working fine with the above criteria, but gives me error if there is Null value between double quotes ("") i.e.
>> On Nov 16, 6:20 am, Learn4Develop >> <Learn4Deve...@discussions.microsoft.com> wrote: >> > On daily bases I have received files with double quotes comma >> > separated, but >> > in some records I found 2 times double quotes and some time different >> > squence >> > of double quotes placements.
>> > I just want to remove all unnecessary (extra) double quotes (see the >> > following values where i want to remove extra double quotes.
>> It was tricky getting the indexing right, but this seemed to do what >> you want on the test sample you provided ...
>> Do Until inFile.AtEndOfStream >> record = InFile.ReadLine >> nPos = Instr(record, Chr(34) & Chr(34)) >> do while nPos > 0 >> record = Left(record, nPos - 1) & Mid(record, nPos + 1) >> nPos = Instr(nPos + 1, record, Chr(34)) >> record = Left(record, nPos - 1) & Mid(record, nPos + 1) >> nPos = Instr(nPos + 1, record, Chr(34) & Chr(34)) >> loop >> outFile.WriteLine record >> Loop
>> The approach removes the first of a pair of double quotes and the very >> next double quote that follows; thereby removing the starting and >> ending quotes that enclose text, when there are too many. >> _____________________ >> Tom Lavedas >> .
> Hi Tom > Thanks for the code, its working fine with the above criteria, but gives > me error if there is Null value between double quotes ("") i.e.
<Learn4Deve...@discussions.microsoft.com> wrote: > "Tom Lavedas" wrote: > > On Nov 16, 6:20 am, Learn4Develop > > <Learn4Deve...@discussions.microsoft.com> wrote: > > > On daily bases I have received files with double quotes comma separated, but > > > in some records I found 2 times double quotes and some time different squence > > > of double quotes placements.
> > > I just want to remove all unnecessary (extra) double quotes (see the > > > following values where i want to remove extra double quotes.
> As i have some columns in CSV file where the value is Null "". How i can > resolve it.
> Thanks again for your great help.
Try this (though Pegusus' approach is also workable) ...
Do Until inFile.AtEndOfStream record = InFile.ReadLine nPos = Instr(record, Chr(34) & Chr(34)) do while nPos > 0 record = Left(record, nPos) & Mid(record, nPos + 2) nPos = Instr(nPos + 1, record, Chr(34)) if nPos > 0 then record = Left(record, nPos - 1) & Mid(record, nPos + 1) nPos = Instr(nPos + 1, record, Chr(34) & Chr(34)) else record = record & Chr(34) end if loop outFile.WriteLine record Loop _____________________ Tom Lavedas
> On daily bases I have received files with double quotes comma separated, but > in some records I found 2 times double quotes and some time different squence > of double quotes placements.
> I just want to remove all unnecessary (extra) double quotes (see the > following values where i want to remove extra double quotes.
If you want to implement the appropriate algorithm, you cannot use a replace-mechanism based on literals, you always have to be aware of the quoting and delimiting context. The right way, j2mc, is to parse the string char by char, count the DQs and check whether a DQ is followed by a delimiter. If the delimiter appears after an even number of DQs it truly terminates the token, if the number is odd it's an "inline-delimiter". A DQ on the other hand is kept if its the first one (obvious, as opener) or one with an even ordinal number followed by the delimiter (terminator).
Christoph
In VBS a function that follows these rules would like this:
'--------------------- Function unquote(record)
Const DQ = """" Const CM = ","
Dim newRecord Dim dqCount Dim char Dim nextChar Dim i Dim keep
If char = DQ Then If dqCount = 0 Then 'beginning of token keep = 1 dqCount = 1
ElseIf (dqCount Mod 2 = 1) And _ (nextChar = CM OR nextChar = "") Then
'end of token marked by ", OR "\r\n keep = 1 dqCount = 0 Else 'inline " keep = 0 dqCount = dqCount + 1 End If Else 'char other then " keep = 1 End If If keep = 1 Then newRecord = newRecord & char End If 'WSH.Echo dqCount, i, keep, char, nextchar, newRecord Next unquote = newRecord End Function '---------------------
To test the results, run:
'--------------------- Option Explicit Dim records records = Array ( _ """0123x"",""""Company D-Val"""",""Class D, sector N"",""DD5894""" _ , """4894D"",""""Recycle"" Rubbish, C class"",""Class D, Sector F, Block N"",""D870GH""" _ , """AB8679"",""""AB Ltd"""",""""Need"" Clean Drive Way, stores"",""GF0347""" _ , """""AB8679"", Test New"",""""""AB Ltd"""""",""""Need"" Clean Drive Way, stores"",""GF0347""" _ )
Dim Record
For Each Record in records WSH.Echo record WSH.Echo unquote(record) WSH.Echo Next
Function unquote(record) ... End Function '---------------------
Learn4Develop wrote: > Hi Tom > Thanks for the code, its working fine with the above criteria, but gives > me error if there is Null value between double quotes ("") i.e.
> As i have some columns in CSV file where the value is Null "". How i can > resolve it.
> Thanks again for your great help.
ISTM, only strings containing a comma should need quoted. But since you say it gives you an error, it begs the questions. What is giving the error? And what is creating these incorrect csv files?
-- Todd Vargo (Post questions to group only. Remove "z" to email personal messages)
> On daily bases I have received files with double quotes comma separated, > but > in some records I found 2 times double quotes and some time different > squence > of double quotes placements.
> I just want to remove all unnecessary (extra) double quotes (see the > following values where i want to remove extra double quotes.
> Following code i used is work fine with ""Company D-Val"" but didn't work > with ""Recycle" Rubbish, C class", The code is
> code:- > Do While inFile.AtEndOfStream = False > record = InFile.ReadLine() > record = Replace(record, """", """") 'replace "" with single space > outFile.WriteLine record & vbclrf > Loop
I agree with Christoph that either the mechanism that creates this non-standard quoting of a CSV file should be made more robust, or a more robust mechanism should be used to read the poorly implimented CSV file. Perhaps a simple VBScript using ADO as the robust file reading mechanism could be used.