SAS Support
Self Service Support
Assisted Support
Premium Support
Enterprise Support
Logging a Call
Service Levels
Hints and Tips
 

A QUESTION FROM A SAS CUSTOMER ON THE CUSTOMER SUPPORT HELPDESK

SAS modules addressed: Base, Data Access

Question - I have been trying to import a pipe delimited file into SAS where some records have carriage returns within a text field - SAS interprets as an end of line and puts the remainder of the record on the next line. Normally I would get round this with a search and replace in notepad or by importing into Access and exporting to SAS however I am dealing with huge datasets (+10m records) and thus this is just not practical. I have previously imported .CSV files containing the same data with no issues - the supplier has changed the feed to pipe delimited and now these issues arise. I have searched the Internet and tried to use _infile_ with compress and also trailing@ and ignoredoseof, all to no avail.

Please can you advise on how to get around this - I have been looking at this for the past two days and I am completely frustrated and want to scream!

From Holly Annand, AXA TECHNOLOGY SERVICES UK LIMITED

Answer - As the embedded blank lines also have the hex characters '0D0A'x unfortunately we can't differentiate between a real end of line and the embedded one. But as we know that there will always be 22 columns or even 21 delimiters, then we can use this important fact to determine the difference between a real end of line and the surplus embedded ones. Here's the code that should do the trick!


filename holly    'd:\temp\helpholly.txt'    ; 

filename hollyout 'd:\temp\helphollyout.txt' ; 

data _null_ ; 

   infile holly    lrecl=300 ; * Check Log to ensure LRECL is not too short ; 

   file   hollyout lrecl=300 ; * Check Log to ensure LRECL is not too short ; 

   retain count 0 ; 

   * Read the current record into the input buffer ; 

   input ; 

   * Write out the record, but hold the output buffer ; 

   put _infile_ @ ; 

   * Determine how many delimiters are in this record, and add it to the total read so far 

     The maths here uses the length of the record - the length of the record with no delimiters ; 

   count + length(_infile_) - length(compress(_infile_,'|')) ; 

   * If the count is 21 then we know we have read all 22 columns, so free the output buffer, and reset count ; 

   if count = 21 then 

      do ; 

         count = 0 ; 

         put ; 

      end ; 

run ; 

Now you can just read the file normally!

Resolution - Thank you, thank you, thank you!

I have tried this on the smaller file and it all works - I'll now run the 6m record file and then I can get on with my day job!

Thank you so much to everyone that has helped solved this problem for me - I really appreciate not only your help but also the rapid response. I was almost banging my head against my desk at one point so I really am grateful to you!


Did you find this page useful?

If you have any comments or questions, feel free to contact us.



0845 402 9907