Excel to CSV with UTF8 encoding for CiviCRM

Working with Excel where there are characters from Russian Cyrillic, Japanese, Polish and Arabic languages to make the data clean, can be an issue. Some non-English characters such as European accent marks will turn into question marks, and many others will lose their accent (e.g. “Tőkés” becomes “T?kes”) (cause: no UTF8 support)

Names beginning with an accented character will lose their first character when imported into CiviCRM (e.g. a sheet from Excel would import “Ždanoka” as “danoka”) (cause: no option to enclose fields in quotes)

Accented Charachters
Accented Characters

Errors, strange results, and data appearing in the wrong place can happen where there are blanks in the last row (cause: unconventional line ending)

Sheets might not import at all in some international versions of Excel, e.g. French (cause: using semicolons or tabs instead of commas as delimiters)

To stop this happening, a simple workaround is to use Google Spreadsheet. Paste (values only if you have complex formulas) or import the sheet then download CSV. I just tried a contact import and it works rather well.

Another alternative is to use a text editor, open your file in it, you can now encapsulate your fields by quotes, the easiest way is to use a text editor that has the option in which you can save it with quotes on all field or an open office suite, I use the LibreOffice Calc Spreadsheet when saving select Edit filter settings

Settings_CPPv

Click Save and a box comes up select Use Text CSV Format

Use_Text_csv

Click the Use Text CSV Format, a box opens select Quote all text cells and click OK – this saves a .csv file that has each cell with Quotes.

quote_text_cells


Posted

in

,

by

Tags:

Comments

Leave a Reply

Your email address will not be published. Required fields are marked *