Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: CSV Format output is unstructured [Powershell] #3620

Open
mubashirali84 opened this issue Mar 28, 2024 · 15 comments
Open

[Bug]: CSV Format output is unstructured [Powershell] #3620

mubashirali84 opened this issue Mar 28, 2024 · 15 comments
Assignees
Labels
status/awaiting-reponse Waiting response from Issue owner

Comments

@mubashirali84
Copy link

Steps to Reproduce

  1. command
    prowler aws --services s3 -M csv
  2. Provider: AWS
  3. Single AWS account
  4. CSV output is attached
    prowler-output.csv

Expected behavior

The HTML output looks fine. I am looking for something similar output in CSV so that it can be actioned in a report.

image

Actual Result with Screenshots or Logs

This how the CSV looks:

image

How did you install Prowler?

From pip package (pip install prowler)

Environment Resource

Prowler is running on Powershell

OS used

Windows/PowerShell

Prowler version

Prowler 3.15.3

Pip version

pip 24.0

Context

No response

@mubashirali84 mubashirali84 added bug status/needs-triage Issue pending triage labels Mar 28, 2024
@christiandavilakoobin
Copy link
Contributor

Try to use ";" when importing the csv to Excel. Excel by default uses the "," symbol.

@mubashirali84
Copy link
Author

Try to use ";" when importing the csv to Excel. Excel by default uses the "," symbol.

Thanks for your response.

My apologies but I couldn't understand. The CSV generated by prowler already has ";" in it, shouldn't it be "," instead so that the excel is able to properly create the columns for CSV. At the moment it seems like like multiple column headers are merged.

For e.g.:

image

Interestingly, the output for same query in html table is properly structured and displayed.

@dstreefkerk
Copy link

dstreefkerk commented Apr 9, 2024

Interestingly, the output for same query in html table is properly structured and displayed.
I agree that it would make more sense to delimit using commas.

In the meantime, a workaround to get the data into Excel is to do the following instead of double-clicking the CSV file:

  1. Start a blank workbook in Excel
  2. Go to the Data ribbon tab
  3. Under Get & Transform Data pick From Text/CSV
  4. Point Excel to your CSV file.
  5. It should detect that it's semicolon-delimited. Click Load

Or do the following in PowerShell to turn it into a comma-delimited file:
Import-Csv -Path "c:\temp\blah_cis_2.1_azure.csv" -Delimiter ';' | Export-Csv -Path c:\temp\prowler-output.csv -NoTypeInformation

@dstreefkerk
Copy link

dstreefkerk commented Apr 9, 2024

Prowler is running on Powershell

This is a red herring, in my experience Prowler doesn't run differently under a PowerShell or regular CMD window. It's still using Python as the interpreter.

The issue here is that the 'CSV' file generated by Prowler isn't actually a Comma Separated Values file, it's semicolon separated.

Perhaps this was a design decision at some point by the Prowler team because the data might contain commas, and it's easier to switch to a different separator than to enclose each data field in quotes?

@mubashirali84
Copy link
Author

Interestingly, the output for same query in html table is properly structured and displayed.
I agree that it would make more sense to delimit using commas.

In the meantime, a workaround to get the data into Excel is to do the following instead of double-clicking the CSV file:

  1. Start a blank workbook in Excel
  2. Go to the Data ribbon tab
  3. Under Get & Transform Data pick From Text/CSV
  4. Point Excel to your CSV file.
  5. It should detect that it's semicolon-delimited. Click Load

Or do the following in PowerShell to turn it into a comma-delimited file: Import-Csv -Path "c:\temp\blah_cis_2.1_azure.csv" -Delimiter ';' | Export-Csv -Path c:\temp\prowler-output.csv -NoTypeInformation

Thanks for that.

Also the workaround that appears to be working for me is use to use semicolon as the separator in CSV. Here's how:

  1. Edit the CSV in a text editor
  2. On top of the CSV, on the very first like type sep=; and SAVE

Re-open the csv in excel and it should be properly structured.

image

image

@dstreefkerk
Copy link

On top of the CSV, on the very first like type sep=; and SAVE

That's really cool, and might be a trivial fix for the Prowler team to implement...

@pedrooot
Copy link
Member

Hi! @mubashirali84 are you still having the same output running prowler aws --services s3 -M csv ?
Waiting for your response, thanks for using Prowler! 🚀

@pedrooot pedrooot added status/awaiting-reponse Waiting response from Issue owner and removed status/needs-triage Issue pending triage labels Apr 11, 2024
@jfagoagas
Copy link
Member

On top of the CSV, on the very first like type sep=; and SAVE

I think this is just an Excel option, not something native in CSV, am I right?

@mubashirali84
Copy link
Author

On top of the CSV, on the very first like type sep=; and SAVE

I think this is just an Excel option, not something native in CSV, am I right?

Its a CSV option. Just edit the CSV in notepad or notepad++ and append seq=; at the top. You can then open that in excel and it will open with proper columns.

Another option is to replace the commas with full-stops because they are used in comments and then replace semi-colons with commas. That will properly delimit the CSV too using commas though the RISK column will be with full-stops instead of commas.

@jfagoagas
Copy link
Member

Its a CSV option. Just edit the CSV in notepad or notepad++ and append seq=; at the top. You can then open that in excel and it will open with proper columns.

We cannot do that since we'll break the current CSV format and automations forcing customers to skip the first line.

Another option is to replace the commas with full-stops because they are used in comments and then replace semi-colons with commas. That will properly delimit the CSV too using commas though the RISK column will be with full-stops instead of commas.

We cannot do that either since we won't be backwards compatible, we need to use the ; as the delimiter since we use , in some columns. I think you can still load the CSV output in excel loading the data from the file.

@jfagoagas jfagoagas removed the bug label Apr 11, 2024
@jfagoagas jfagoagas assigned jfagoagas and unassigned pedrooot Apr 11, 2024
@mubashirali84
Copy link
Author

Its a CSV option. Just edit the CSV in notepad or notepad++ and append seq=; at the top. You can then open that in excel and it will open with proper columns.

We cannot do that since we'll break the current CSV format and automations forcing customers to skip the first line.

Another option is to replace the commas with full-stops because they are used in comments and then replace semi-colons with commas. That will properly delimit the CSV too using commas though the RISK column will be with full-stops instead of commas.

We cannot do that either since we won't be backwards compatible, we need to use the ; as the delimiter since we use , in some columns. I think you can still load the CSV output in excel loading the data from the file.

Well as a workaround for me both options are ok however, the original CSV spat out by prowler should be structured and mustn't require any manual intervention.

Should I be doing something additional while opening in Excel? What's the recommendation?

@jfagoagas
Copy link
Member

Well as a workaround for me both options are ok however, the original CSV spat out by prowler should be structured and mustn't require any manual intervention.

I don't get this, I think this issue is specifically related with how Excel handles CSVs. As you can see in our documentation here the CSV format is structured with a common format for all the providers covered by Prowler.

Should I be doing something additional while opening in Excel? What's the recommendation?

I think for now you should follow the provided workarounds since it's the way to load it based on Microsoft Excel documentation https://support.microsoft.com/en-us/office/import-or-export-text-txt-or-csv-files-5250ac4c-663c-47ce-937b-339e391393ba.

@dstreefkerk
Copy link

I don't get this, I think this issue is specifically related with how Excel handles CSVs

The clue is in the name of the file format: Comma Separated Values.

The problem is in the following change:

Since Prowler v3 the CSV column delimiter is the semicolon (;)

If the data contains commas, Prowler should be encapsulating it in quotes: https://csvloader.com/csv-guide/why-quotation-marks-are-used-in-csv

@mubashirali84
Copy link
Author

Well as a workaround for me both options are ok however, the original CSV spat out by prowler should be structured and mustn't require any manual intervention.

I don't get this, I think this issue is specifically related with how Excel handles CSVs. As you can see in our documentation here the CSV format is structured with a common format for all the providers covered by Prowler.

Should I be doing something additional while opening in Excel? What's the recommendation?

I think for now you should follow the provided workarounds since it's the way to load it based on Microsoft Excel documentation https://support.microsoft.com/en-us/office/import-or-export-text-txt-or-csv-files-5250ac4c-663c-47ce-937b-339e391393ba.

Generally, the primary reason in my view for generating the CSV is to analyse the data and use it in some sort of report. I think CSV among all other formats is the best one to do that and the best way to represent the CSV data is MS Excel.

I think for now you should follow the provided workarounds since it's the way to load it based on Microsoft Excel documentation https://support.microsoft.com/en-us/office/import-or-export-text-txt-or-csv-files-5250ac4c-663c-47ce-937b-339e391393ba.

Like I said, I am doing a workaround by using sep=; on csv and it seems I will have to continue doing it.

@mubashirali84
Copy link
Author

I don't get this, I think this issue is specifically related with how Excel handles CSVs

The clue is in the name of the file format: Comma Separated Values.

The problem is in the following change:

Since Prowler v3 the CSV column delimiter is the semicolon (;)

If the data contains commas, Prowler should be encapsulating it in quotes: https://csvloader.com/csv-guide/why-quotation-marks-are-used-in-csv

Exactly, that OR why use commas in RISK column anyway when it is used as a delimiter in a CSV. Why can't it be replaced by a period or some other character? Use commas where they are actually required i.e., delimiting the columns.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status/awaiting-reponse Waiting response from Issue owner
Projects
None yet
Development

No branches or pull requests

5 participants